Advances in Programming Parallel and Heterogeneous Computing for Cyber-Physical Systems

A special issue of Journal of Low Power Electronics and Applications (ISSN 2079-9268).

Deadline for manuscript submissions: closed (31 March 2021) | Viewed by 19379

Special Issue Editors


Guest Editor
Department of Electrical, Electronic, and Information Engineering "Guglielmo Marconi", University of Bologna, 33 - 40126 Bologna, Italy
Interests: embedded systems; low power; multicore; SoC; bioinformatics

Guest Editor
Department of Electrical, Electronic, and Information Engineering "Guglielmo Marconi", University of Bologna, 33 - 40126 Bologna, Italy
Interests: microprocessor chips; optimisation problems and software development for bioinformatics algorithms; heterogeneous platforms

Special Issue Information

Dear Colleagues,

In the last decade, the scientific community has taken significant steps in the development of increasingly complex heterogeneous and parallel computing systems for cyber-physical system (CPS) applications. These platforms feature several units dedicated to accelerating specific tasks such as machine learning, (event or streaming) sensor data processing, cyber-physical system control, and cybersecurity. These accelerators need dedicated programming models, which in some cases call for completely different approaches. The picture becomes even more diverse given the increasing interest in beyond-von-Neumann architectures, such as neuromorphic devices that exploit densely interconnected multicores and mixed-signal integrated architectures.

In this context, software tooling is struggling to catch up and to close the so-called software gap by providing suitable programming environments for CPS. This Special Issue invites researchers to share discoveries and knowledge in programming models, compilers, toolchains, and runtime support for next-generation heterogeneous and parallel architectures, with a particular focus on CPS applications and on going beyond traditional computing approaches.

Prof. Dr. Andrea Acquaviva
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Journal of Low Power Electronics and Applications is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Heterogeneous platforms
  • Multicore and many-core
  • Cyber-physical systems
  • High-performance computing
  • Neuromorphic device
  • Artificial Intelligence
  • Deep learning
  • Spiking neural networks
  • Event sensors and actuators
  • Machine learning
  • Memristors
  • Neural networks
  • Parallel computing

Published Papers (5 papers)


Research


13 pages, 522 KiB  
Article
Dynamic Compilation for Transprecision Applications on Heterogeneous Platform
by Julie Dumas, Henri-Pierre Charles, Kévin Mambu and Maha Kooli
J. Low Power Electron. Appl. 2021, 11(3), 28; https://doi.org/10.3390/jlpea11030028 - 29 Jun 2021
Cited by 1 | Viewed by 3075
Abstract
This article describes a software environment called HybroGen, which helps to experiment with binary code generation at run time. As computing architectures are getting more complex, application performance is becoming data-dependent. The proposed experimental platform helps in programming applications that can be reconfigured at run time in order to adapt to a new data environment. The HybroGen platform is adapted to heterogeneous architectures and can generate instructions for different targets. This platform goes further than classical JIT (Just-In-Time) compilation in several directions: the code generator is three orders of magnitude smaller and three orders of magnitude faster than JIT platforms, and it allows code transformations that are impossible in traditional compilation schemes, such as code generation for non-von-Neumann accelerators or dynamic code transformations for transprecision. The latter is illustrated with a code example: the square root computed with Newton’s algorithm. We also illustrate the HybroGen platform with two other examples: a multiplication specialized on a value determined at run time, and a conversion of degrees Celsius to degrees Fahrenheit. This article presents a proof of concept of the HybroGen platform in terms of its functionalities and demonstrates its working status.

14 pages, 620 KiB  
Article
PageRank Implemented with the MPI Paradigm Running on a Many-Core Neuromorphic Platform
by Evelina Forno, Alessandro Salvato, Enrico Macii and Gianvito Urgese
J. Low Power Electron. Appl. 2021, 11(2), 25; https://doi.org/10.3390/jlpea11020025 - 28 May 2021
Cited by 4 | Viewed by 3219
Abstract
SpiNNaker is a neuromorphic hardware platform, especially designed for the simulation of Spiking Neural Networks (SNNs). To this end, the platform features massively parallel computation and an efficient communication infrastructure based on the transmission of small packets. The effectiveness of SpiNNaker in the parallel execution of the PageRank (PR) algorithm has been tested by the realization of a custom SNN implementation. In this work, we propose a PageRank implementation fully realized with the MPI programming paradigm ported to the SpiNNaker platform. We compare the scalability of the proposed program with the equivalent SNN implementation, and we leverage the characteristics of the PageRank algorithm to benchmark our implementation of MPI on SpiNNaker when faced with massive communication requirements. Experimental results show that the algorithm exhibits favorable scaling for a mid-sized execution context, while highlighting that the performance of MPI-PageRank on SpiNNaker is bounded by memory size and speed limitations on the current version of the hardware.

19 pages, 3055 KiB  
Article
Efficient ROS-Compliant CPU-iGPU Communication on Embedded Platforms
by Mirco De Marchi, Francesco Lumpp, Enrico Martini, Michele Boldo, Stefano Aldegheri and Nicola Bombieri
J. Low Power Electron. Appl. 2021, 11(2), 24; https://doi.org/10.3390/jlpea11020024 - 26 May 2021
Cited by 2 | Viewed by 4899
Abstract
Many modern programmable embedded devices contain CPUs and a GPU that share the same system memory on a single die. Such a unified memory architecture (UMA) allows programmers to implement different communication models between the CPU and the integrated GPU (iGPU). Although the simpler model guarantees implicit synchronization at the cost of performance, the more advanced model uses the zero-copy paradigm to eliminate explicit data copying between CPU and iGPU, significantly improving performance and energy savings. Meanwhile, the robot operating system (ROS) has become a de facto reference standard for developing robotic applications. It allows for application reuse and the easy integration of software blocks in complex cyber-physical systems. Although ROS compliance is strongly required for software portability and reuse, it can lead to performance loss and negate the benefits of zero-copy communication. In this article we present efficient techniques to implement CPU–iGPU communication while guaranteeing compliance with the ROS standard. We show how the key features of each communication model are maintained and what overhead ROS compliance introduces.

12 pages, 3442 KiB  
Article
Accelerating Population Count with a Hardware Co-Processor for MicroBlaze
by Iouliia Skliarova
J. Low Power Electron. Appl. 2021, 11(2), 20; https://doi.org/10.3390/jlpea11020020 - 24 Apr 2021
Cited by 6 | Viewed by 3425
Abstract
This paper proposes a Field-Programmable Gate Array (FPGA)-based hardware accelerator for assisting the embedded MicroBlaze soft-core processor in calculating population count. The population count is frequently required in cyber-physical systems and can be applied to large data sets, such as in molecular similarity search in cheminformatics, or in computations performed by binarized neural networks. The MicroBlaze instruction set architecture (ISA) does not support this operation natively, so the count has to be realized either as a sequence of native instructions (in software) or in parallel in a dedicated hardware accelerator. Different hardware accelerator architectures are analyzed and compared to one another and to implementing the population count operation in MicroBlaze. The experimental results with large vector lengths (up to 2^17) demonstrate that the best hardware accelerator with DMA (Direct Memory Access) is ~31 times faster than the best software version running on MicroBlaze. The proposed architectures are scalable and can easily be adjusted to both smaller and bigger input vector lengths. The entire system was implemented and tested on a Nexys-4 prototyping board containing a low-cost/low-power Artix-7 FPGA.
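As a rough illustration of the software option mentioned in this abstract (population count built from ordinary instructions when the ISA has no native one), here is a minimal Python sketch. It is not the paper's MicroBlaze code or its accelerator design; the function names are hypothetical, and inputs are assumed to be non-negative integers.

```python
def popcount_shift(x: int) -> int:
    """Count set bits by testing the lowest bit and shifting right.

    Loops once per bit position up to the highest set bit.
    Assumes x is a non-negative integer.
    """
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count


def popcount_clear_lowest(x: int) -> int:
    """Count set bits by repeatedly clearing the lowest set bit.

    x & (x - 1) removes the lowest set bit, so the loop runs once
    per set bit rather than once per bit position.
    Assumes x is a non-negative integer.
    """
    count = 0
    while x:
        x &= x - 1
        count += 1
    return count


def popcount_vector(words) -> int:
    """Total number of set bits over a sequence of integer words."""
    return sum(popcount_clear_lowest(w) for w in words)
```

Both functions return the same result; the second does less work on sparse inputs. A hardware accelerator such as the one the abstract describes would instead count many bits in parallel.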

Review


26 pages, 1106 KiB  
Review
Deep Learning Approaches to Source Code Analysis for Optimization of Heterogeneous Systems: Recent Results, Challenges and Opportunities
by Francesco Barchi, Emanuele Parisi, Andrea Bartolini and Andrea Acquaviva
J. Low Power Electron. Appl. 2022, 12(3), 37; https://doi.org/10.3390/jlpea12030037 - 05 Jul 2022
Cited by 1 | Viewed by 3846
Abstract
To cope with the increasing complexity of digital systems programming, deep learning techniques have recently been proposed to enhance software deployment by analysing source code for different purposes, ranging from performance and energy improvement to debugging and security assessment. As embedded platforms for cyber-physical systems are characterised by increasing heterogeneity and parallelism, one of the most challenging and specific problems is efficiently allocating computational kernels to available hardware resources. In this field, deep learning applied to source code can be a key enabler to face this complexity. However, due to the rapid development of such techniques, it is not easy to understand which of those are suitable and most promising for this class of systems. For this purpose, we discuss recent developments in deep learning for source code analysis, and focus on techniques for kernel mapping on heterogeneous platforms, highlighting recent results, challenges and opportunities for their applications to cyber-physical systems.
