Article
Peer-Review Record

RLSchert: An HPC Job Scheduler Using Deep Reinforcement Learning and Remaining Time Prediction

Appl. Sci. 2021, 11(20), 9448; https://doi.org/10.3390/app11209448
by Qiqi Wang 1, Hongjie Zhang 1, Cheng Qu 1, Yu Shen 2, Xiaohui Liu 2 and Jing Li 1,2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 9 September 2021 / Revised: 1 October 2021 / Accepted: 7 October 2021 / Published: 12 October 2021
(This article belongs to the Special Issue State-of-the-Art High-Performance Computing and Networking)

Round 1

Reviewer 1 Report

The manuscript proposes an HPC scheduling algorithm based on deep reinforcement learning.

The proposed algorithm, RLSchert, provides better resource management than existing job schedulers.

The manuscript also presents various experiments to validate the method, which is good.

Overall, I recommend accepting this manuscript.

Author Response

Dear Reviewer:
Thank you very much for taking the time to review this manuscript. We really appreciate all your comments, and we thank you for your approval and support of our research.

Reviewer 2 Report

This paper presents an ML-based HPC job scheduler. The paper is well written and easy to follow. I enjoyed reading the paper. 
My primary concern with this paper is the time predictor, which is the crux of this paper. While the paper gives the impression that the time predictor is generic, in reality, it is not. It is heavily tied to one type of application (VASP jobs). Predicting the runtime of a single class of applications is not difficult and is a well-studied problem; hence the prediction itself is not novel. 

The primary requirement of the HPC job scheduler is to schedule applications with different resource requirement characteristics. For a given class of applications, it is easy to identify the relevant input parameters for ML prediction and model the execution time. The authors must address how this will be done for random applications, whose input parameters could even be file paths. The authors must also experiment with a mix of applications in section 4. 

I am good with the rest of the paper. 
 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Even though the authors did not fully resolve my concern, I am fine with the current text and new data.
