Energy-Efficient Processors, Systems, and Their Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence Circuits and Systems (AICAS)".

Deadline for manuscript submissions: closed (15 November 2022) | Viewed by 14538

Special Issue Editors


Guest Editor
Institute of Computer Science (ICS), Foundation for Research and Technology—Hellas (FORTH), Vassilika Vouton, GR-70013 Heraklion, Crete, Greece
Interests: computer architecture; high-performance computing; reconfigurable computing; Internet of Things; VLSI

Guest Editor
School of Electrical and Computer Engineering, Technical University of Crete, Akrotiri Campus, 731 00 Chania, Greece
Interests: systems and network security; security policy; privacy; high-speed networks

Guest Editor
School of Electrical and Computer Engineering, Technical University of Crete, Campus Kounoupidiana, 73100 Chania, Crete, Greece
Interests: reconfigurable hardware; high-performance computing; cybersecurity; FPGA AI

Special Issue Information

Dear Colleagues,

Maintaining energy efficiency while sustaining high-performance execution has become a primary concern for most of today’s computing systems, ranging from exascale and cloud systems to IoT and embedded systems at the edge. While such systems differ fundamentally in their performance, cost, and form factor requirements, reducing energy consumption is emerging as their primary common goal for high-performance execution and scalable computing. Toward this end, revolutionary methods are required, with stronger integration among hardware features, system software, and applications. Current approaches to energy-efficient computing, and to scalable, energy-efficient interplay between the application, system software, and hardware layers, rely heavily on:

  • Energy-efficient CPUs, such as ARM and RISC-V CPUs;
  • Application-specific hardware accelerators, such as FPGAs and TPUs;
  • System software optimizations that allow for efficient communication and avoid system contention, such as user-level interprocess communication, user-level interrupts, and lightweight runtime systems;
  • Application optimizations to reduce data movement, i.e., bringing data closer to computational units and avoiding data transfer bottlenecks.

The main focus of this Special Issue will be on the most recent and novel developments in the domain of high-performance computing that maintain a primary focus on energy efficiency. Energy efficiency is the stepping stone upon which whole systems of connected devices can be brought together to give rise to highly complex and intelligent solutions that serve modern societies, from the social to the consumer spheres.

Dr. Iakovos Mavroidis
Prof. Dr. Sotiris Ioannidis
Dr. Konstantinos Georgopoulos
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • energy-efficient computing
  • hardware acceleration
  • low-latency communication
  • HPC
  • reconfigurable computing
  • system software
  • runtime systems

Published Papers (6 papers)


Research

13 pages, 4675 KiB  
Article
A Study on the Design Procedure of Re-Configurable Convolutional Neural Network Engine for FPGA-Based Applications
by Pervesh Kumar, Imran Ali, Dong-Gyun Kim, Sung-June Byun, Dong-Gyu Kim, Young-Gun Pu and Kang-Yoon Lee
Electronics 2022, 11(23), 3883; https://doi.org/10.3390/electronics11233883 - 24 Nov 2022
Cited by 3 | Viewed by 1702
Abstract
Convolutional neural networks (CNNs) have become a primary approach in the field of artificial intelligence (AI), with a wide range of applications. The two computational phases for every neural network are the training phase and the testing phase. Usually, testing is performed on high-processing hardware engines; however, the training part is still a challenge for low-power devices. There are several neural accelerators, such as graphics processing units and field-programmable gate arrays (FPGAs). From the design perspective, an efficient hardware engine at the register-transfer level and efficient CNN modeling at the TensorFlow level are mandatory for any type of application. Hence, we propose a comprehensive, step-by-step design procedure for a re-configurable CNN engine. We used the TensorFlow and Keras libraries for modeling in Python, whereas the register-transfer-level part was performed using Verilog. The proposed idea was synthesized, placed, and routed for 180 nm complementary metal-oxide semiconductor technology using Synopsys Design Compiler tools. The proposed design layout occupies an area of 3.16 × 3.16 mm². A competitive accuracy of approximately 96% was achieved for the Modified National Institute of Standards and Technology (MNIST) and Canadian Institute for Advanced Research (CIFAR-10) datasets.
(This article belongs to the Special Issue Energy-Efficient Processors, Systems, and Their Applications)

15 pages, 300 KiB  
Article
Preconditioned Conjugate Gradient Acceleration on FPGA-Based Platforms
by Pavlos Malakonakis, Giovanni Isotton, Panagiotis Miliadis, Chloe Alverti, Dimitris Theodoropoulos, Dionisios Pnevmatikatos, Aggelos Ioannou, Konstantinos Harteros, Konstantinos Georgopoulos, Ioannis Papaefstathiou and Iakovos Mavroidis
Electronics 2022, 11(19), 3039; https://doi.org/10.3390/electronics11193039 - 24 Sep 2022
Cited by 2 | Viewed by 1351
Abstract
Reconfigurable computing can significantly improve the performance and energy efficiency of many applications. However, FPGA-based chips are evolving rapidly, increasing the difficulty of evaluating the impact of new capabilities such as HBM and high-speed links. In this paper, a real-world application was implemented on different FPGAs in order to better understand the new capabilities of modern FPGAs and how new FPGA technology improves performance and scalability. The application was the preconditioned conjugate gradient (PCG) method, which is used in underground analysis. The implementation was done on four different FPGAs, including an MPSoC, taking into account each platform’s characteristics. The results show that today’s FPGA-based chips offer eight times better performance on a memory-bound problem than 5-year-old FPGAs, as they incorporate HBM and can operate at higher clock frequencies.
(This article belongs to the Special Issue Energy-Efficient Processors, Systems, and Their Applications)

14 pages, 418 KiB  
Article
Improving FPGA Based Impedance Spectroscopy Measurement Equipment by Means of HLS Described Neural Networks to Apply Edge AI
by Jorge Fe, Rafael Gadea-Gironés, Jose M. Monzo, Ángel Tebar-Ruiz and Ricardo Colom-Palero
Electronics 2022, 11(13), 2064; https://doi.org/10.3390/electronics11132064 - 30 Jun 2022
Cited by 2 | Viewed by 2313
Abstract
The application of artificial intelligence (AI) in instruments such as impedance spectroscopy equipment highlights the difficulty of choosing an electronic technology that correctly solves the basic problems of performance, adaptation to the context, flexibility, precision, autonomy, and speed of design. The present work demonstrates that FPGAs, in conjunction with optimized high-level synthesis (HLS), provide an efficient connection between the signals sensed by the instrument and the artificial neural network-based AI computing block that will analyze them. State-of-the-art comparisons and experimental results also demonstrate that our designed and developed architectures offer the best compromise between performance, efficiency, and system costs in terms of artificial neural network implementation. In the present work, computational efficiency above 21 Mps/DSP and power efficiency below 1.24 mW/Mps are achieved. These results are all the more relevant because the system can be implemented on a low-cost FPGA.
(This article belongs to the Special Issue Energy-Efficient Processors, Systems, and Their Applications)

17 pages, 1037 KiB  
Article
The Diversification and Enhancement of an IDS Scheme for the Cybersecurity Needs of Modern Supply Chains
by Dimitris Deyannis, Eva Papadogiannaki, Grigorios Chrysos, Konstantinos Georgopoulos and Sotiris Ioannidis
Electronics 2022, 11(13), 1944; https://doi.org/10.3390/electronics11131944 - 22 Jun 2022
Viewed by 1566
Abstract
Despite the tremendous socioeconomic importance of supply chains (SCs), security officers and operators have no easy, integrated way of protecting their critical and interconnected infrastructures from cyber-attacks. As a result, solutions and methodologies that support the detection of malicious activity on SCs are constantly being researched and proposed. Hence, this work presents the implementation of a low-cost reconfigurable intrusion detection system (IDS), on the edge, that can be easily integrated into SC networks, thereby elevating the featured levels of security. Specifically, the proposed system offers real-time cybersecurity intrusion detection over high-speed networks and services by offloading elements of the security check workloads to dedicated reconfigurable hardware. Our solution uses a novel framework that implements the Aho–Corasick algorithm on the reconfigurable fabric of a multi-processor system-on-chip (MPSoC), which supports parallel matching for multiple network packet patterns. The initial performance evaluation of this proof-of-concept shows that it holds the potential to outperform existing software-based solutions while unburdening SC nodes from demanding cybersecurity check workloads. The performance and efficiency of the proposed system were evaluated using a real-life environment in the context of the European Union’s Horizon 2020 research and innovation program, i.e., CYRENE.
(This article belongs to the Special Issue Energy-Efficient Processors, Systems, and Their Applications)

37 pages, 2473 KiB  
Article
Universal Reconfigurable Hardware Accelerator for Sparse Machine Learning Predictive Models
by Vuk Vranjkovic, Predrag Teodorovic and Rastislav Struharik
Electronics 2022, 11(8), 1178; https://doi.org/10.3390/electronics11081178 - 08 Apr 2022
Cited by 1 | Viewed by 1959
Abstract
This study presents a universal reconfigurable hardware accelerator for efficient processing of sparse decision trees, artificial neural networks and support vector machines. The main idea is to develop a hardware accelerator that will be able to directly process sparse machine learning models, resulting in shorter inference times and lower power consumption compared to existing solutions. To the authors’ best knowledge, this is the first hardware accelerator of this type. Additionally, this is the first accelerator that is capable of processing sparse machine learning models of different types. Besides the hardware accelerator itself, algorithms for induction of sparse decision trees and for pruning of support vector machines and artificial neural networks are presented. Such sparse machine learning classifiers are attractive since they require significantly less memory for storing model parameters. This results in reduced data movement between the accelerator and the DRAM memory, as well as a reduced number of operations required to process input instances, leading to faster and more energy-efficient processing. This could be of significant interest in edge-based applications, with severely constrained memory, computation resources and power consumption. The performance of the algorithms and the developed hardware accelerator is demonstrated using standard benchmark datasets from the UCI Machine Learning Repository database. The results of the experimental study reveal that the proposed algorithms and presented hardware accelerator are superior when compared to some of the existing solutions. Throughput is increased up to 2 times for decision trees, 2.3 times for support vector machines and 38 times for artificial neural networks. When the processing latency is considered, the maximum performance improvement is even higher: up to a 4.4 times reduction for decision trees, an 84.1 times reduction for support vector machines and a 22.2 times reduction for artificial neural networks. Finally, since it is capable of supporting sparse classifiers, the usage of the proposed hardware accelerator leads to a significant reduction in the energy spent on DRAM data transfers: 50.16% for decision trees, 93.65% for support vector machines and as much as 93.75% for artificial neural networks.
(This article belongs to the Special Issue Energy-Efficient Processors, Systems, and Their Applications)

16 pages, 1691 KiB  
Article
FPGA-Based Convolutional Neural Network Accelerator with Resource-Optimized Approximate Multiply-Accumulate Unit
by Mannhee Cho and Youngmin Kim
Electronics 2021, 10(22), 2859; https://doi.org/10.3390/electronics10222859 - 19 Nov 2021
Cited by 14 | Viewed by 4512
Abstract
Convolutional neural networks (CNNs) are widely used in modern applications for their versatility and high classification accuracy. Field-programmable gate arrays (FPGAs) are considered to be suitable platforms for CNNs based on their high performance, rapid development, and reconfigurability. Although many studies have proposed methods for implementing high-performance CNN accelerators on FPGAs using optimized data types and algorithm transformations, accelerators can be optimized further by investigating more efficient uses of FPGA resources. In this paper, we propose an FPGA-based CNN accelerator using multiple approximate accumulation units based on a fixed-point data type. We implemented the LeNet-5 CNN architecture, which performs classification of handwritten digits using the MNIST handwritten digit dataset. The proposed accelerator was implemented using a high-level synthesis tool on a Xilinx FPGA. The proposed accelerator applies an optimized fixed-point data type and loop parallelization to improve performance. Approximate operation units are implemented using FPGA logic resources instead of high-precision digital signal processing (DSP) blocks, which are inefficient for low-precision data. Our accelerator model achieves 66% less memory usage and approximately 50% reduced network latency compared to a floating-point design, and its resource utilization is optimized to use 78% fewer DSP blocks compared to general fixed-point designs.
(This article belongs to the Special Issue Energy-Efficient Processors, Systems, and Their Applications)
