Heterogeneous Computing Solutions

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 June 2024 | Viewed by 1265

Special Issue Editor

Dr. Pedro Valero-Lara
Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Interests: high performance computing; parallel programming; linear algebra; computational fluid dynamics

Special Issue Information

Dear Colleagues,

Computing technologies remained relatively stable for nearly two decades, but new architectural features, such as specialized hardware, heterogeneous cores, deep memory hierarchies, and near-memory processing, have emerged as possible solutions to concerns about energy efficiency, manufacturability, and cost. We expect this ‘golden age’ of architectural change to lead to extreme heterogeneity, which will have a major impact on software systems and applications. In the upcoming era of exascale computing and extreme heterogeneity, it will be critical to explore new software approaches that allow us to exploit this diverse hardware effectively to advance science. Next-generation systems with heterogeneous elements will also need to accommodate complex workflows, mainly because accelerators now take many forms (no longer just GPUs) and because different parts of an application must be mapped onto the elements best suited to each component.

The goal of this Special Issue is to provide a forum to discuss new and emerging solutions to address these important challenges in the upcoming extreme heterogeneity era. Papers are being sought on many aspects of heterogeneous computing, including (but not limited to):

Heterogeneous programming environments and runtime systems

  • Programming models and systems
  • Parallel resource management on heterogeneous systems
  • Automated parallelization and compiler techniques (autotuning)

Heterogeneous solutions for HPC and scientific applications

  • Parallel and distributed algorithms
  • Parallel libraries and frameworks
  • Parallel processing on heterogeneous systems

Heterogeneous (including non-von Neumann) architectures

  • Power/energy management
  • Heterogeneous architectures for emerging application domains
  • Architecture designs including non-von Neumann architectures, memory and interconnection

Reliability/benchmarking/measurements

  • Debugging, performance tools and techniques
  • Fault tolerance and resilience
  • Application/hardware benchmarks

Dr. Pedro Valero-Lara
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • heterogeneous computing
  • programming
  • scientific applications
  • hardware
  • CPU+GPU

Published Papers (1 paper)

Research

20 pages, 716 KiB  
Article
Cross-Feature Transfer Learning for Efficient Tensor Program Generation
by Gaurav Verma, Siddhisanket Raskar, Murali Emani and Barbara Chapman
Appl. Sci. 2024, 14(2), 513; https://doi.org/10.3390/app14020513 - 06 Jan 2024
Viewed by 750
Abstract
Tuning tensor program generation involves navigating a vast search space to find optimal program transformations and measurements for a program on the target hardware. The complexity of this process is further amplified by the exponential combinations of transformations, especially in heterogeneous environments. This research addresses these challenges by introducing a novel approach that learns the joint neural network and hardware features space, facilitating knowledge transfer to new, unseen target hardware. A comprehensive analysis is conducted on the existing state-of-the-art dataset, TenSet, including a thorough examination of test split strategies and the proposal of methodologies for dataset pruning. Leveraging an attention-inspired technique, we tailor the tuning of tensor programs to embed both neural network and hardware-specific features. Notably, our approach substantially reduces the dataset size by up to 53% compared to the baseline without compromising Pairwise Comparison Accuracy (PCA). Furthermore, our proposed methodology demonstrates competitive or improved mean inference times with only 25–40% of the baseline tuning time across various networks and target hardware. The attention-based tuner can effectively utilize schedules learned from previous hardware program measurements to optimize tensor program tuning on previously unseen hardware, achieving a top-5 accuracy exceeding 90%. This research introduces a significant advancement in autotuning tensor program generation, addressing the complexities associated with heterogeneous environments and showcasing promising results regarding efficiency and accuracy.
(This article belongs to the Special Issue Heterogeneous Computing Solutions)