New Trends in High-Performance Computer Architectures and Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (31 October 2022)

Special Issue Editor


Dr. Antonio F. Díaz
Guest Editor
Department of Computer Architecture and Technology/CITIC, University of Granada, 18071 Granada, Spain
Interests: computer architecture; communication networks; distributed systems; security; neutrino telescopes

Special Issue Information

Dear Colleagues,

HPC is revolutionising our changing world and offers new resources to solve complex problems. This computational power grows steadily thanks to the continuous evolution of advanced hardware platforms, networks, parallelisation, distributed programming, and optimised algorithms. However, taking full advantage of these resources remains a great challenge.

In addition, cloud computing, intensive AI requirements, and new computational models have spurred advanced solutions based on accelerators and specialised designs. A combination of these technologies also requires elements that enrich the overall design to fulfil specific requirements such as reliability, fault tolerance, energy-efficient solutions, secure frameworks, scalability, load balancing, or monitoring.

Thanks to the creative vision of scientists and engineers who stay one step ahead of the latest technologies, it is possible to find new solutions that improve the wellbeing of our society.

This Special Issue aims to gather innovative contributions with new approaches, proposals, techniques, and applications in this field. The topics of interest include but are not limited to:

  • High-performance computer architecture
  • High-performance computer system software
  • Multi-core and multi-threaded architecture methods
  • Applications related to high-performance computing
  • Distributed systems
  • Communication networks
  • Network virtualisation
  • High-performance computing in cloud platforms
  • Data-intensive computing
  • Optimal energy solutions
  • Energy-aware scheduling
  • Accelerators such as GPUs and TPUs
  • I/O subsystems
  • Hardware virtualisation and simulation
  • Hardware security

Dr. Antonio F. Díaz
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • high-performance computing
  • cluster computing
  • simulation tools
  • distributed computing
  • parallelisation techniques
  • computer architecture
  • computer networks
  • GPU
  • energy-aware scheduling
  • HPC-I/O systems
  • SDN
  • IaaS
  • security

Published Papers (4 papers)


Research


18 pages, 2553 KiB  
Article
swAFL: A Library of High-Performance Activation Function for the Sunway Architecture
by Jinchen Xu, Fei Li, Ming Hou and Panjie Wang
Electronics 2022, 11(19), 3141; https://doi.org/10.3390/electronics11193141 - 30 Sep 2022
Abstract
The Sunway supercomputers have recently attracted considerable attention for executing neural networks. Activation functions extend the applicability of neural networks to nonlinear models by introducing nonlinear factors. Despite the numerous AI frameworks that support activation functions, only PyTorch and TensorFlow have been ported to the Sunway platforms. Although these libraries meet the minimum functional requirements to deploy a neural network on the Sunway machines, drawbacks remain, including the limited number of usable functions and unsatisfactory performance. Therefore, two activation function algorithms with different computing accuracies were developed in this study, and an efficient implementation scheme was designed using the platform's single instruction/multiple data (SIMD) extension and multiply–add instructions. Finally, an efficient library, swAFL, composed of 48 function interfaces was designed and implemented on the Sunway platforms. Experimental results indicate that swAFL outperformed PyTorch and TensorFlow by 19.5 and 23 times, respectively, on average.
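swAFL itself is not publicly documented here, but the core trade-off the abstract describes — offering the same activation at two accuracy levels, with the fast variant built from a short multiply–add (Horner) chain, the scalar analogue of the platform's SIMD multiply–add instructions — can be sketched as follows. The function names are illustrative assumptions, not part of swAFL.

```python
import math

def sigmoid_accurate(x):
    # Reference implementation using the library exponential.
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_fast(x):
    # Lower-accuracy variant: exp(-x) approximated by a degree-4
    # Taylor polynomial evaluated as a Horner multiply-add chain,
    # valid on a narrow range around zero.
    t = -x
    e = 1.0 + t * (1.0 + t * (0.5 + t * (1.0 / 6.0 + t * (1.0 / 24.0))))
    return 1.0 / (1.0 + e)
```

On hardware with fused multiply–add, each step of the Horner chain maps to a single instruction, which is what makes this formulation attractive for a vectorised activation library.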

22 pages, 4941 KiB  
Article
The Adaptive Streaming SAR Back-Projection Algorithm Based on Half-Precision in GPU
by Yihao Xu, Zhuo Zhang, Longyong Chen, Zhenhua Li and Ling Yang
Electronics 2022, 11(18), 2807; https://doi.org/10.3390/electronics11182807 - 06 Sep 2022
Abstract
The back-projection (BP) algorithm is completely accurate in its imaging principle, but its computational complexity is extremely high. The single-precision arithmetic used in traditional graphics processing unit (GPU) acceleration schemes has low throughput and large video-memory usage. This study proposes an adaptive asynchronous streaming scheme for the BP algorithm based on half precision and extends it to the fast back-projection (FBP) algorithm. In this scheme, the adaptive loss-factor selection strategy preserves the dynamic range of the data, the asynchronous streaming structure ensures the efficiency of large-scene imaging, and the mixed-precision data processing ensures the imaging quality. The proposed schemes are compared with single-precision BP, FBP, and fast factorized back-projection (FFBP) algorithms on the GPU. The experimental results show that the two half-precision acceleration schemes reduce video-memory usage to 74% and 59% of the single-precision schemes with guaranteed image quality. The efficiency improvements over the corresponding single-precision schemes are almost 1× and 0.5×, and the advantage becomes more pronounced for large computations.
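The adaptive loss-factor idea can be illustrated with a small stand-alone sketch (the names and thresholds here are assumptions for illustration, not the paper's actual implementation): choose a power-of-two scale so the data's peak magnitude sits safely inside the half-precision dynamic range, round-trip through FP16, then undo the scale.

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value

def to_fp16(x):
    # Round-trip a float through IEEE 754 half precision ('e' format).
    return struct.unpack('<e', struct.pack('<e', x))[0]

def adaptive_scale(samples):
    # Pick a power-of-two loss factor so the largest magnitude fits
    # comfortably within the half-precision dynamic range.
    peak = max(abs(s) for s in samples)
    scale = 1.0
    while peak * scale > FP16_MAX / 2:
        scale *= 0.5
    while peak > 0 and peak * scale < FP16_MAX / 4:
        scale *= 2.0
    return scale

samples = [1.5e6, -3.2e5, 7.0e4]
s = adaptive_scale(samples)
half = [to_fp16(v * s) for v in samples]        # store/compute in FP16
recovered = [h / s for h in half]               # undo the loss factor
```

Scaling by a power of two changes only the FP16 exponent, so the mantissa (and hence the relative precision) is untouched, which is why such loss factors preserve dynamic range cheaply.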

24 pages, 962 KiB  
Article
Combining Distributed and Kernel Tracing for Performance Analysis of Cloud Applications
by Loïc Gelle, Naser Ezzati-Jivan and Michel R. Dagenais
Electronics 2021, 10(21), 2610; https://doi.org/10.3390/electronics10212610 - 26 Oct 2021
Cited by 6
Abstract
Distributed tracing allows tracking user requests that span multiple services and machines in a distributed application. However, typical cloud applications rely on abstraction layers that can hide the root cause of latency occurring between processes or in the kernel. Because of their focus on high-level events, existing distributed tracing methodologies can be limited when trying to detect complex contentions and relate them back to the originating requests. Cross-level analyses that include kernel-level events are necessary to debug problems as prevalent as mutex or disk contention; however, such analysis, which associates kernel events with distributed tracing data, is complex and can add considerable overhead. This paper describes a new solution that combines distributed tracing with low-level software tracing to better identify the root cause of latency. We explain how we achieve hybrid trace collection to capture and synchronize both kernel and distributed request events. Then, we present our design and implementation for a critical path analysis. We show that our analysis describes precisely how each request spends its time and what stands in its critical path, while limiting overhead.
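As a toy illustration of the cross-level attribution idea (all names here are hypothetical, not the paper's API): given one request span from the distributed trace and the kernel-level wait intervals that overlap it, each instant of the span can be attributed either to a wait cause or to on-CPU time.

```python
def attribute_latency(span, kernel_waits):
    """Attribute a request span's duration to kernel wait causes.

    span: (start, end) timestamps from the distributed trace.
    kernel_waits: non-overlapping (start, end, cause) intervals
    from the kernel trace (e.g. disk or mutex blocking).
    """
    start, end = span
    breakdown = {}
    covered = 0.0
    for ks, ke, cause in kernel_waits:
        lo, hi = max(ks, start), min(ke, end)
        if lo < hi:  # the wait interval overlaps the span
            breakdown[cause] = breakdown.get(cause, 0.0) + (hi - lo)
            covered += hi - lo
    # Whatever is not blocked in the kernel counts as on-CPU time.
    breakdown["on-cpu"] = (end - start) - covered
    return breakdown

report = attribute_latency((0.0, 10.0),
                           [(2.0, 5.0, "disk"), (6.0, 8.0, "mutex")])
```

A real critical-path analysis must also follow blocking edges across threads and machines; this sketch only shows the single-span attribution step after both traces have been synchronized.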

Review


20 pages, 1373 KiB  
Review
Resistive-RAM-Based In-Memory Computing for Neural Network: A Review
by Weijian Chen, Zhi Qi, Zahid Akhtar and Kamran Siddique
Electronics 2022, 11(22), 3667; https://doi.org/10.3390/electronics11223667 - 09 Nov 2022
Cited by 11
Abstract
Processing-in-memory (PIM) is a promising architecture for designing various types of neural network accelerators, as it ensures computational efficiency when combined with Resistive Random Access Memory (ReRAM). ReRAM has become a promising solution for enhancing computing efficiency due to its crossbar structure. In this paper, ReRAM-based PIM neural network accelerators are reviewed, and the methods and designs of various schemes are discussed. The models and architectures implemented for neural network accelerators are surveyed to identify research trends. Finally, the limitations and challenges of ReRAM in neural networks are also addressed.
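The crossbar structure mentioned above performs a matrix-vector multiplication in a single analog step: voltages applied to the word lines drive currents through the ReRAM cells (Ohm's law), and each bit line sums those currents (Kirchhoff's current law). A minimal numerical sketch of that behaviour, with illustrative names:

```python
def crossbar_mvm(conductances, voltages):
    # conductances[r][c]: conductance of the ReRAM cell at word line r,
    # bit line c (encodes one weight). voltages[r]: input applied to
    # word line r. Each bit-line current is sum over r of
    # voltages[r] * conductances[r][c], so the whole array computes a
    # matrix-vector product in one parallel analog step.
    rows, cols = len(voltages), len(conductances[0])
    return [sum(voltages[r] * conductances[r][c] for r in range(rows))
            for c in range(cols)]

currents = crossbar_mvm([[1.0, 2.0],
                         [3.0, 4.0]], [1.0, 1.0])
```

In a real accelerator the inputs and weights are quantised, and analog-to-digital converters read out the bit-line currents; this sketch only shows the ideal computation the crossbar implements.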
