Hardware Accelerated Algorithms and Architectures for Various DSP Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (30 June 2023) | Viewed by 5485

Special Issue Editor


Prof. Dr. Sang Yoon Park
Guest Editor
Associate Professor, Department of Electronic Engineering, Myongji University, Yongin 17058, Korea
Interests: digital signal processing; VLSI for signal processing algorithms; system-on-chip design; HW-SW co-design; machine learning; digital communication

Special Issue Information

Dear Colleagues,

Achieving high performance with low power consumption has become a major challenge when designing embedded systems with limited resources. To this end, there is growing demand for accelerated computing, which speeds up processing by offloading parts of the workload to subsystems such as GPGPUs, FPGAs, ASICs, and DSPs. Even if a new algorithm meets its design goals, its usefulness inevitably decreases if it is not hardware friendly. The rapid evolution of high-level synthesis tools, which automatically convert algorithm-level descriptions to register-transfer level, is likewise driven by the growing demand for accelerated computing and shorter time-to-market. Therefore, in the algorithm development stage, not only the ease of hardware implementation but also the optimized hardware architecture should be considered. In addition, appropriate software–hardware partitioning, along with their co-optimization, is essential to maximize performance with limited resources.

The main purpose of this Special Issue is to attract submissions on the development of high-performance signal processing algorithms and optimized hardware architectures required for various signal processing applications, such as artificial intelligence, 5G, autonomous vehicles, cybersecurity, biomedical systems, and video analytics. Optimization at any level of abstraction, from the algorithm through the register-transfer to the circuit level, falls within the scope of this Special Issue.

The topics of interest include, but are not limited to: 

• Models, methods, and architectures for accelerated computing
• Hardware architectures for digital signal processing algorithms
• Hardware design in edge/IoT computing
• Low-power and high-performance design
• Hardware/software co-design
• High-level synthesis
• Reconfigurable system design
• Interaction of CPU and FPGA components
• System-on-chip design
• High-speed algorithms and architectures
• Optimized design of digital communication systems
• Digital system design with reduced memory bandwidth

Prof. Dr. Sang Yoon Park
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (3 papers)


Research

15 pages, 1469 KiB  
Article
CUDA-Optimized GPU Acceleration of 3GPP 3D Channel Model Simulations for 5G Network Planning
by Nasir Ali Shah, Mihai T. Lazarescu, Roberto Quasso and Luciano Lavagno
Electronics 2023, 12(15), 3214; https://doi.org/10.3390/electronics12153214 - 25 Jul 2023
Viewed by 1047
Abstract
The simulation of massive multiple-input multiple-output (MIMO) channel models is becoming increasingly important for testing and validation of fifth-generation new radio (5G NR) wireless networks and beyond. However, simulation performance tends to be limited when modeling a large number of antenna elements combined with a complex and realistic representation of propagation conditions. In this paper, we propose an efficient implementation of a 3rd Generation Partnership Project (3GPP) three-dimensional (3D) channel model, specifically designed for graphics processing unit (GPU) platforms, with the goal of minimizing the computational time required for channel simulation. The channel model is highly parameterized to encompass a wide range of configurations required for real-world optimized 5G NR network deployments. We use several compute unified device architecture (CUDA)-based optimization techniques to exploit the parallelism and memory hierarchy of the GPU. Experimental data show that the developed system achieves an overall speedup of about 240× compared to the original C++ model executed on an Intel processor. Compared to a design previously accelerated on a datacenter-class field programmable gate array (FPGA), the GPU design has a 33.3% higher single-precision performance but a 7.5% higher power consumption. The proposed GPU accelerator can provide fast and accurate channel simulations for 5G NR network planning and optimization.
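To illustrate the kind of parallelism such a channel simulator exploits, the sketch below vectorizes a toy sum-of-rays channel coefficient computation over an (antenna, ray) grid; on a GPU each grid point would map to a CUDA thread. All parameter names, distributions, and values here are illustrative assumptions for a generic geometry-based model, not the authors' 3GPP implementation.

```python
import numpy as np

def channel_coeffs(n_ant, n_rays, t, fc=3.5e9, seed=0):
    """Toy narrowband channel: sum of n_rays plane waves per antenna.

    Illustrative stand-in for one inner loop of a geometry-based 3D
    channel model; the (antenna, ray) grid computed by broadcasting
    below is what a CUDA kernel would distribute across threads.
    """
    rng = np.random.default_rng(seed)
    # Per-ray parameters: angles of arrival, Doppler shifts, powers.
    aoa = rng.uniform(0, 2 * np.pi, n_rays)
    doppler = rng.uniform(-100.0, 100.0, n_rays)   # Hz
    power = rng.exponential(1.0, n_rays)
    power /= power.sum()                            # normalize total power
    # Antenna positions on a half-wavelength uniform linear array.
    lam = 3e8 / fc
    pos = np.arange(n_ant)[:, None] * lam / 2       # shape (n_ant, 1)
    # Phase of each (antenna, ray) pair; broadcasting replaces two loops.
    phase = 2 * np.pi * (pos * np.sin(aoa) / lam + doppler * t)
    # Sum rays coherently per antenna -> one complex coefficient each.
    return (np.sqrt(power) * np.exp(1j * phase)).sum(axis=1)

h = channel_coeffs(n_ant=64, n_rays=20, t=0.001)
print(h.shape)  # → (64,)
```

The same broadcasting structure scales to the full parameter grid (sectors, users, subcarriers); the CUDA techniques the paper describes then decide how that grid is tiled across thread blocks and the GPU memory hierarchy.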

27 pages, 4299 KiB  
Article
Design and FPGA-Based Hardware Implementation of NB-IoT Physical Uplink Shared Channel Transmitter and Physical Downlink Shared Channel Receiver
by Abdallah Abostait, Rania M. Tawfik, M. Saeed Darweesh and Hassan Mostafa
Electronics 2023, 12(9), 1966; https://doi.org/10.3390/electronics12091966 - 24 Apr 2023
Viewed by 2393
Abstract
With the anticipated growth of the internet of things (IoT) market, many low-power wide-area (LPWA) technologies have been introduced to connect a wide range of IoT devices with varying performance requirements. The narrowband internet of things (NB-IoT) is a 3rd Generation Partnership Project (3GPP) standardized LPWA technology that meets most IoT service requirements. In this paper, the design and implementation of the physical uplink transmitting chain as well as the physical downlink receiving chain of the NB-IoT user equipment (UE) are presented. The main blocks of both chains are designed to follow the 3GPP NB-IoT LTE standard Release 14 (Rel-14). The whole design is experimentally implemented on the Virtex 7 (VC709) Connectivity Kit, and all performance metrics are reported. Moreover, an NB-IoT base station is implemented and integrated with the two prototyped UEs to set up a complete NB-IoT system in which data are sent from one UE to another through the base station, using two FPGAs (one implementing the transmitting UE and the other the receiving UE) and three universal software radio peripheral (USRP) B200 devices (serving as the RF front-ends of the transmitting UE, the base station, and the receiving UE, respectively). Experimental results show that the implemented system works as intended: the two NB-IoT UEs communicate through the base station and exchange data correctly.
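To make the structure of such a transmit chain concrete, here is a minimal structural sketch: a payload flows through CRC attachment, channel coding, scrambling, and modulation stages, each of which would become a pipelined hardware block in an FPGA design. The stage bodies are deliberately simplified stand-ins (e.g., repetition instead of the actual 3GPP channel code, a pseudorandom mask instead of the standard scrambling sequence), not the Rel-14-conformant blocks from the paper.

```python
import numpy as np

def crc_attach(bits):
    # Placeholder: a real chain appends a CRC computed over the payload.
    return np.concatenate([bits, np.zeros(16, dtype=int)])

def channel_encode(bits):
    # Rate-1/3 repetition as a stand-in for the standard channel code.
    return np.repeat(bits, 3)

def scramble(bits, seed=0xACE1):
    # Pseudorandom mask as a stand-in for the 3GPP scrambling sequence.
    rng = np.random.default_rng(seed)
    return bits ^ rng.integers(0, 2, bits.size)

def modulate_qpsk(bits):
    # Map bit pairs to unit-energy QPSK symbols.
    b = bits.reshape(-1, 2)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)

def tx_chain(payload):
    """Compose the stages; in an FPGA implementation each stage becomes
    a pipeline block with handshaking between consecutive stages."""
    bits = crc_attach(payload)
    bits = channel_encode(bits)
    bits = scramble(bits)
    return modulate_qpsk(bits)

symbols = tx_chain(np.zeros(100, dtype=int))
print(symbols.shape)  # 100 + 16 CRC bits, ×3 coding, 2 bits/symbol → (174,)
```

The value of this stage-by-stage decomposition is exactly what the paper exploits in hardware: each function maps to an independently verifiable RTL block, and the whole chain streams samples without buffering entire frames.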

17 pages, 2801 KiB  
Article
Multi-Gbps LDPC Decoder on GPU Devices
by Jingxin Dai, Hang Yin, Yansong Lv, Weizhang Xu and Zhanxin Yang
Electronics 2022, 11(21), 3447; https://doi.org/10.3390/electronics11213447 - 25 Oct 2022
Cited by 1 | Viewed by 1372
Abstract
To meet the high throughput requirements of communication systems, the design of high-throughput low-density parity-check (LDPC) decoders has attracted significant attention. This paper proposes a high-throughput GPU-based LDPC decoder aimed at large-scale data processing scenarios, optimizing the decoder from the perspectives of decoding parallelism and the data scheduling strategy. For decoding parallelism, the intra-codeword parallelism is fully exploited by combining the characteristics of the flooding-based decoding algorithm with the GPU programming model, and the inter-codeword parallelism is improved using single-instruction multiple-data (SIMD) instructions. For the data scheduling strategy, the utilization of off-chip memory is optimized to satisfy the demands of large-scale data processing. The experimental results demonstrate that the decoder achieves 10 Gbps throughput by incorporating an early termination mechanism on general-purpose GPU (GPGPU) devices and also achieves high-throughput, high-power-efficiency performance on low-power embedded GPU (EGPU) devices. Compared with the state-of-the-art work, the proposed decoder achieves a 1.787× normalized throughput speedup at the same error-correction performance.
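The flooding schedule that the paper maps to GPU threads can be sketched in a few lines: in each iteration, all variable-to-check messages are computed at once, then all check nodes apply the min-sum update, with early termination when the syndrome is zero. The toy parity-check matrix and LLR values below are illustrative; this NumPy sketch shows the parallel structure, not the authors' CUDA kernels.

```python
import numpy as np

# Toy 3×7 parity-check matrix; a real 5G LDPC matrix is far larger,
# but the flooding schedule is identical.
H = np.array([
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
])

def decode_min_sum(llr, H, n_iter=20):
    """Flooding min-sum decoder: all check nodes update in parallel,
    then all variable nodes, exposing intra-codeword parallelism."""
    m, n = H.shape
    msg = np.zeros((m, n))                  # check->variable messages
    for _ in range(n_iter):
        # Variable->check: total belief minus the incoming edge message.
        v2c = np.where(H, llr + msg.sum(axis=0) - msg, 0.0)
        # Check->variable min-sum update (row by row here for clarity;
        # on a GPU every row runs concurrently).
        for i in range(m):
            idx = np.flatnonzero(H[i])
            vals = v2c[i, idx]
            # Product of the other edges' signs (sign^2 = 1 trick).
            sign = np.prod(np.sign(vals)) * np.sign(vals)
            # Minimum of the other edges' magnitudes on this check.
            mags = np.abs(vals)
            order = np.argsort(mags)
            min1, min2 = mags[order[0]], mags[order[1]]
            other_min = np.where(np.arange(len(idx)) == order[0], min2, min1)
            msg[i, idx] = sign * other_min
        hard = ((llr + msg.sum(axis=0)) < 0).astype(int)
        if not (H @ hard % 2).any():        # early termination on success
            break
    return hard

# All-zero codeword over a noisy channel: positive LLR favors bit 0;
# bit 1 is received in error and should be corrected.
llr = np.array([2.1, -0.4, 1.5, 0.9, 1.8, 1.2, 0.7])
print(decode_min_sum(llr, H))  # → [0 0 0 0 0 0 0]
```

The early-termination check mirrors the mechanism the paper credits for reaching 10 Gbps: well-conditioned codewords converge in a few iterations, freeing the GPU for the next batch.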
