Embedded Systems for Neural Network Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 15 September 2024 | Viewed by 6567

Special Issue Editor


Prof. Dr. Sunggu Lee
Guest Editor
Department of Electrical Engineering, Pohang University of Science and Technology, 77 Cheongam-ro, Nam-gu, Pohang 37673, Republic of Korea
Interests: low-power circuits for deep learning; approximate computing; parallel computing

Special Issue Information

Dear Colleagues,

Neural network models of many kinds are now used for intelligent decision making in a wide range of embedded systems.

Such systems are typically used in mobile environments and require extremely power-efficient circuits with fast inference capabilities.

Examples include speech processing systems in smartphones, headsets, and earbuds; computer vision processing systems in tablets, smartphones, and smart eyeglasses; and real-time video processing systems for autonomous vehicles.

Various models and techniques can be used to enable state-of-the-art artificial intelligence capabilities in such devices, even when used in low-network-bandwidth or unconnected environments.

This Special Issue will investigate the latest state-of-the-art techniques for embedded computing in special-purpose neural network hardware, field-programmable gate array designs, and specially designed computer systems using mass-market general-purpose CPUs.

All contributions investigating any aspects of embedded systems for neural network applications are welcome.

Prof. Dr. Sunggu Lee
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • embedded systems
  • edge computing
  • neural network
  • low-power circuits
  • neural accelerator

Published Papers (5 papers)


Research


23 pages, 2718 KiB  
Article
Voltage Scaled Low Power DNN Accelerator Design on Reconfigurable Platform
by Rourab Paul, Sreetama Sarkar, Suman Sau, Sanghamitra Roy, Koushik Chakraborty and Amlan Chakrabarti
Electronics 2024, 13(8), 1431; https://doi.org/10.3390/electronics13081431 - 10 Apr 2024
Viewed by 787
Abstract
The rapid emergence of Field-Programmable Gate Arrays (FPGAs) has accelerated research on hardware implementations of Deep Neural Networks (DNNs). Among DNN processors, domain-specific architectures such as Google’s Tensor Processing Unit (TPU) have outperformed conventional GPUs (Graphics Processing Units) and CPUs (Central Processing Units). However, implementing low-power TPUs in reconfigurable hardware remains a challenge. Voltage scaling, a popular approach to energy savings, can be difficult in FPGAs, as it may lead to timing failures if not applied appropriately. This work presents an ultra-low-power FPGA implementation of a TPU for edge applications. We divide the systolic array of a TPU into different FPGA partitions based on the minimum slack values of the design paths of its Multiplier Accumulators (MACs). Each partition runs its FPGA cores at a different near-threshold-computing (NTC) biasing voltage. The biasing voltage for each partition is first estimated by the proposed static schemes and then further calibrated by the proposed runtime scheme. To avoid the timing failures caused by NTC operation, MACs with larger minimum slack are placed in lower-voltage partitions, while MACs with smaller minimum slack are placed in higher-voltage partitions. The proposed architecture is implemented both on a commercial platform (Vivado with a Xilinx Artix-7 FPGA) and on the academic VTR platform with 22 nm, 45 nm, and 130 nm FPGAs. Any timing error caused by NTC operation is caught by the Razor flip-flop in each MAC. The proposed voltage-scaled, partitioned systolic array saves 3.1% to 11.6% of dynamic power in the Vivado and VTR tools, respectively, depending on the FPGA technology, partition size, number of partitions, and biasing voltages. The normalized performance and accuracy of benchmark models running on our low-power TPU are competitive with the existing literature.
(This article belongs to the Special Issue Embedded Systems for Neural Network Applications)
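The slack-based placement idea described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' actual design: the MAC names, slack values (in ns), and partition voltages are made-up assumptions.

```python
# Hypothetical sketch of slack-aware MAC-to-voltage-partition assignment:
# MACs with the most timing slack can tolerate the slower logic of a
# near-threshold partition, so they go to the lowest biasing voltage.

def assign_partitions(mac_slacks, voltages):
    """Map each MAC id to a partition biasing voltage, largest-slack
    MACs to the lowest voltages."""
    # Rank MAC ids from largest to smallest minimum slack.
    ranked = sorted(mac_slacks, key=mac_slacks.get, reverse=True)
    # Rank partitions from lowest to highest biasing voltage.
    levels = sorted(voltages)
    per_partition = -(-len(ranked) // len(levels))  # ceiling division
    return {mac: levels[i // per_partition] for i, mac in enumerate(ranked)}

# Illustrative slacks (ns) and partition voltages (V).
slacks = {"mac0": 0.42, "mac1": 0.10, "mac2": 0.35, "mac3": 0.05}
print(assign_partitions(slacks, [0.60, 0.75]))
# → {'mac0': 0.6, 'mac2': 0.6, 'mac1': 0.75, 'mac3': 0.75}
```

In the paper's scheme the per-partition voltage would additionally be calibrated at runtime, with Razor flip-flops catching any residual timing errors.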

20 pages, 3384 KiB  
Article
MCFP-YOLO Animal Species Detector for Embedded Systems
by Mai Ibraheam, Kin Fun Li and Fayez Gebali
Electronics 2023, 12(24), 5044; https://doi.org/10.3390/electronics12245044 - 18 Dec 2023
Viewed by 797
Abstract
Advances in deep learning have led to the development of various animal species detection models suited for different environments. Building on this, our research introduces a detection model that efficiently handles both batch and real-time processing. It achieves this by integrating a motion-based frame selection algorithm and a two-stage pipelining–dataflow hybrid parallel processing approach. These modifications significantly reduced the processing delay and power consumption of the proposed MCFP-YOLO detector, particularly on embedded systems with limited resources, without trading off the accuracy of our animal species detection system. For field applications, the proposed MCFP-YOLO model was deployed and tested on two embedded devices: the RP4B and the Jetson Nano. While the Jetson Nano provided faster processing, the RP4B was selected due to its lower power consumption and a balanced cost–performance ratio, making it particularly suitable for extended use in remote areas.
(This article belongs to the Special Issue Embedded Systems for Neural Network Applications)
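The motion-based frame selection mentioned in the abstract above can be illustrated with simple frame differencing: only frames that change enough relative to the last kept frame are forwarded to the detector. The threshold, the 4-pixel "frames", and the mean-absolute-difference metric here are illustrative assumptions; the paper's actual algorithm may differ.

```python
# Hypothetical sketch of motion-based frame selection for an embedded
# detector: skip near-static frames to save inference time and power.

def select_frames(frames, threshold):
    """Keep a frame when its mean absolute pixel difference from the
    last kept frame exceeds the motion threshold; always keep frame 0."""
    kept, last = [], None
    for idx, frame in enumerate(frames):
        if last is None:
            kept.append(idx)
            last = frame
            continue
        diff = sum(abs(a - b) for a, b in zip(frame, last)) / len(frame)
        if diff > threshold:
            kept.append(idx)
            last = frame
    return kept

# Toy 4-pixel frames: a static scene, then an animal entering at frame 2.
frames = [[10, 10, 10, 10], [11, 10, 10, 10], [80, 90, 10, 10], [82, 91, 10, 10]]
print(select_frames(frames, threshold=5))  # → [0, 2]
```

Only the selected frames would then enter the (more expensive) YOLO inference stage.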

17 pages, 2254 KiB  
Article
An Improved Routing Approach for Enhancing QoS Performance for D2D Communication in B5G Networks
by Valmik Tilwari, Taewon Song and Sangheon Pack
Electronics 2022, 11(24), 4118; https://doi.org/10.3390/electronics11244118 - 10 Dec 2022
Cited by 3 | Viewed by 1182
Abstract
Device-to-device (D2D) communication is one of the most promising technologies in Beyond-Fifth-Generation (B5G) wireless networks. It promises high data rates and ubiquitous coverage with low latency and high energy and spectral efficiency among peer-to-peer users. These advantages allow D2D communication to be fully realized in multi-hop scenarios. However, to implement multi-hop D2D networks effectively, the routing aspect must be thoroughly addressed, since a multi-hop network can perform worse than a conventional mobile system if wrong routing decisions are made without proper mechanisms. Routing in multi-hop networks must therefore consider device mobility, battery level, link quality, and fairness, issues that do not arise in conventional cellular networking. This paper therefore proposes a mobility-, battery-, link-quality-, and contention-window-size-aware routing (MBLCR) approach to boost overall network performance. In addition, a multicriteria decision-making (MCDM) method is applied to the relay devices for optimal path establishment, assigning weights according to the devices' evaluated metrics. Extensive simulation results under various device-speed scenarios show the advantages of MBLCR over conventional algorithms in terms of throughput, packet delivery ratio, latency, and energy efficiency.
(This article belongs to the Special Issue Embedded Systems for Neural Network Applications)
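The multicriteria relay scoring described in the abstract above can be sketched as a weighted sum over the four metrics MBLCR considers. The candidate names, metric values, weights, and the plain weighted-sum aggregation are illustrative stand-ins; the paper's actual MCDM method may weight and combine criteria differently.

```python
# Hypothetical sketch of MCDM-style relay ranking: score each candidate
# relay from mobility, battery, link quality, and contention-window metrics.

def score_relays(candidates, weights):
    """Rank candidate relays by weighted score. Metrics are assumed to be
    pre-normalized to [0, 1] with 'higher is better' (e.g. mobility and
    contention already inverted accordingly)."""
    scores = {
        name: sum(weights[k] * metrics[k] for k in weights)
        for name, metrics in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

candidates = {
    "relayA": {"mobility": 0.9, "battery": 0.4, "link": 0.8, "cw": 0.7},
    "relayB": {"mobility": 0.6, "battery": 0.9, "link": 0.9, "cw": 0.8},
}
weights = {"mobility": 0.3, "battery": 0.3, "link": 0.25, "cw": 0.15}
print(score_relays(candidates, weights))  # → ['relayB', 'relayA']
```

The top-ranked relay at each hop would then be chosen when establishing the multi-hop path.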

16 pages, 657 KiB  
Article
Feasibility Analysis and Implementation of Adaptive Dynamic Reconfiguration of CNN Accelerators
by Ke Han and Yingqi Luo
Electronics 2022, 11(22), 3805; https://doi.org/10.3390/electronics11223805 - 18 Nov 2022
Cited by 1 | Viewed by 1310
Abstract
In multi-tasking scenarios with dynamically changing loads, the parallel computation of convolutional neural networks (CNNs) causes high energy and resource consumption. Another critical problem is that previous neural network hardware accelerators are often limited to fixed scenarios and lack adaptive adjustment. To solve these problems, this paper proposes an adaptive reconfiguration system based on predictions of the algorithm workload. The Deep Learning Processor Unit (DPU) from Xilinx offers excellent performance in accelerating network computation. After summarizing the characteristics of hardware accelerators and gaining an in-depth understanding of the DPU structure, we propose a regression model for CNN runtime prediction and a guidance scheme for adaptive reconfiguration that exploits the DPU's characteristics. Across different DPU sizes, the proposed prediction model achieves an accuracy of 90.7%. With dynamic reconfiguration technology, the proposed strategy enables accurate and fast reconfiguration and, in load-change scenarios, significantly reduces power consumption.
(This article belongs to the Special Issue Embedded Systems for Neural Network Applications)
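The runtime-prediction idea in the abstract above can be sketched as a small regression on a workload feature, with the prediction steering the reconfiguration choice. The single feature (MAC operation count), the toy measurements, the latency threshold, and the DPU size names are illustrative assumptions, not the paper's model.

```python
# Hypothetical sketch of CNN-runtime regression guiding DPU reconfiguration:
# fit runtime vs. workload, predict the next task's runtime, and pick a
# larger DPU configuration when the prediction exceeds a latency budget.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b on one feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Toy profile: (GMACs per inference, measured runtime in ms).
gmacs = [0.5, 1.0, 2.0, 4.0]
runtime_ms = [3.0, 5.0, 9.0, 17.0]
a, b = fit_line(gmacs, runtime_ms)

predicted = a * 3.0 + b                        # runtime for a 3-GMAC workload
dpu = "B4096" if predicted > 10 else "B1024"   # reconfigure if over budget
print(round(predicted, 1), dpu)                # → 13.0 B4096
```

In a real system the reconfiguration decision would also weigh the cost of partial reconfiguration itself against the predicted savings.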

Review


25 pages, 1264 KiB  
Review
Investigation into Perceptual-Aware Optimization for Single-Image Super-Resolution in Embedded Systems
by Khanh Hung Vu, Duc Phuc Nguyen, Duc Dung Nguyen and Hoang-Anh Pham
Electronics 2023, 12(11), 2544; https://doi.org/10.3390/electronics12112544 - 05 Jun 2023
Viewed by 1434
Abstract
Deep learning has been applied to single-image super-resolution (SISR) over the last decade, and these techniques now dominate SISR benchmarks. Nevertheless, most architectural designs demand substantial computational resources, leading to prolonged inference times on embedded systems or rendering deployment infeasible. This paper presents a comprehensive survey of plausible solutions and optimization methods to address this problem. We then propose a pipeline that aggregates these methods to reduce inference time without significantly compromising perceptual quality. We investigate the effectiveness of the proposed method on a lightweight Generative Adversarial Network (GAN)-based perceptual-oriented model as a case study. The experimental results show that our method significantly improves inference time on both a desktop machine and the Jetson Xavier NX, especially for higher-resolution inputs on the latter, making it deployable in practice.
(This article belongs to the Special Issue Embedded Systems for Neural Network Applications)
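Evaluating optimizations like those surveyed above comes down to measuring average inference latency on the target device. The harness below is a generic sketch of such a measurement; the toy nearest-neighbour "model", warmup count, and run count are illustrative stand-ins for a real SISR network and benchmarking protocol.

```python
# Hypothetical sketch of a latency benchmark for an SISR model on an
# embedded target: warm up, then average wall-clock time over several runs.

import time

def benchmark(fn, arg, warmup=2, runs=5):
    """Average latency of fn(arg) in milliseconds, after warmup iterations."""
    for _ in range(warmup):
        fn(arg)
    start = time.perf_counter()
    for _ in range(runs):
        fn(arg)
    return (time.perf_counter() - start) * 1000 / runs

def toy_upscale(pixels):
    # Nearest-neighbour 2x upscale of a flat pixel list, standing in
    # for an SISR model's forward pass.
    return [p for p in pixels for _ in range(2)]

ms = benchmark(toy_upscale, list(range(10000)))
print(f"{ms:.3f} ms per inference")
```

Running the same harness before and after each optimization step (and at several input resolutions) gives the kind of inference-time comparison the paper reports for the desktop and Jetson Xavier NX targets.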
