Computational Optimizations for Machine Learning

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: closed (30 November 2021) | Viewed by 30416

Printed Edition Available!
A printed edition of this Special Issue is available here.

Special Issue Editor


Prof. Dr. Freddy Gabbay
Guest Editor
Faculty of Engineering, Ruppin Academic Center, Emek Hefer 4025000, Israel
Interests: artificial intelligence; machine learning and deep neural network algorithms; deep compression of machine learning; explainable AI (XAI); high performance computing algorithms and acceleration; computation systems and algorithm modeling and simulations

Special Issue Information

Dear Colleagues,

Over the past decade, machine learning has emerged as an indispensable tool for an enormous number of applications, such as computer vision, medicine, fintech, autonomous systems, speech recognition, traffic management, social media, and many others. Machine learning models provide state-of-the-art, robust accuracy in various applications. The increasing deployment of machine learning algorithms introduces major computational challenges due to the explosive growth in model size and complexity. These challenges are further emphasized by the diversity of hosting platforms, from edge devices and cloud systems to high-performance computing. Given that each platform introduces different computational and cost constraints, computational optimizations that are fine-tuned to the application and platform are crucial. This Special Issue invites novel computational optimizations for machine learning algorithms, including:

  • Supervised, unsupervised, reinforcement, and hybrid machine learning classes.
  • Various types of machine learning algorithms: deep neural networks, convolutional neural networks, GANs, decision trees, linear regression, SVM, K-means clustering, Q-learning, temporal difference, deep adversarial networks, etc.
  • Application-specific machine learning models.
  • Machine learning optimization methods such as pruning, deep compression, and others.

Prof. Dr. Freddy Gabbay
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • deep neural network
  • deep compression
  • machine learning optimizations
  • machine learning under constrained resources

Published Papers (10 papers)

Research

45 pages, 676 KiB  
Article
A Review of the Modification Strategies of the Nature Inspired Algorithms for Feature Selection Problem
by Ruba Abu Khurma, Ibrahim Aljarah, Ahmad Sharieh, Mohamed Abd Elaziz, Robertas Damaševičius and Tomas Krilavičius
Mathematics 2022, 10(3), 464; https://doi.org/10.3390/math10030464 - 31 Jan 2022
Cited by 73 | Viewed by 4929
Abstract
This survey is an effort to provide a research repository and a useful reference to guide researchers planning to develop new Nature-Inspired Algorithms tailored to solve Feature Selection problems (NIAs-FS). We identified and performed a thorough literature review of three main research streams: the feature selection problem; optimization algorithms, particularly meta-heuristic algorithms; and modifications applied to NIAs to tackle the FS problem. We provide a detailed overview of 156 different articles about NIA modifications for tackling FS. We support our discussion with analytical views, visualized statistics, applied examples, and open-source software systems, and we discuss open issues related to FS and NIAs. Finally, the survey summarizes the main foundations of NIAs-FS, with approximately 34 different operators investigated. The most popular operator is chaotic maps. Hybridization is the most widely used modification technique; it takes forms such as integrating an NIA with another NIA or integrating an NIA with a classifier, with the latter being the most widely used. Microarray and medical applications dominate the areas in which modified NIAs-FS are applied. Despite the popularity of NIAs-FS, many areas still need further investigation.
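
To make the dominant hybridization concrete, here is a minimal sketch (not drawn from any surveyed work) of an NIA-FS wrapper: a simple binary genetic algorithm searches feature subsets while a k-nearest-neighbors classifier scores them. The dataset, population size, and crossover/mutation rates are illustrative assumptions.

```python
# Minimal NIA-FS sketch: binary GA with a classifier-in-the-loop fitness.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
n_features = X.shape[1]

def fitness(mask):
    if not mask.any():
        return 0.0
    acc = cross_val_score(KNeighborsClassifier(), X[:, mask], y, cv=3).mean()
    return acc - 0.01 * mask.sum() / n_features  # trade accuracy vs. subset size

pop = rng.random((20, n_features)) < 0.5             # random binary population
for generation in range(15):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-10:]]          # truncation selection
    cut = rng.integers(1, n_features, size=10)       # one-point crossover
    children = np.array([np.r_[parents[i][:c], parents[-i - 1][c:]]
                         for i, c in enumerate(cut)])
    children ^= rng.random(children.shape) < 0.02    # bit-flip mutation
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("selected features:", np.flatnonzero(best))
```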

24 pages, 7281 KiB  
Article
Prediction of Hydraulic Jumps on a Triangular Bed Roughness Using Numerical Modeling and Soft Computing Methods
by Mehdi Dasineh, Amir Ghaderi, Mohammad Bagherzadeh, Mohammad Ahmadi and Alban Kuriqi
Mathematics 2021, 9(23), 3135; https://doi.org/10.3390/math9233135 - 05 Dec 2021
Cited by 19 | Viewed by 2715
Abstract
This study investigates the characteristics of free and submerged hydraulic jumps on triangular bed roughness at various T/I ratios (i.e., height and spacing of the roughness elements) using CFD modeling techniques. The accuracy of the numerical modeling outcomes was checked and compared using artificial intelligence methods, namely Support Vector Machines (SVM), Gene Expression Programming (GEP), and Random Forest (RF). The results of the FLOW-3D® model and experimental data showed that the overall mean relative error is 4.1%, which confirms the numerical model's ability to predict the characteristics of free and submerged jumps. The SVM model, having the minimum Root Mean Square Error (RMSE) and maximum correlation coefficient (R²) compared with the GEP and RF models in the training and testing phases for predicting the sequent depth ratio (y2/y1), submerged depth ratio (y3/y1), tailwater depth ratio (y4/y1), length ratio of jumps (Lj/y2*), and energy dissipation (ΔE/E1), was recognized as the best model. Moreover, the optimal gamma for predicting the length ratio of free jumps (Ljf/y2*) is γ = 10, while for the length ratio of submerged jumps (Ljs/y2*) it is γ = 0.60. Based on sensitivity analysis, the Froude number has the greatest effect on predicting y3/y1 compared with the submergence factor (SF) and T/I; omitting this parameter significantly reduces prediction accuracy. Finally, relationships with good correlation coefficients for the mentioned parameters in free and submerged jumps were presented based on the numerical results.
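
For orientation only, the sketch below shows the kind of SVM setup the paper evaluates: an RBF support-vector regressor with an explicit gamma, scored by RMSE and R². The synthetic data and hyperparameters are stand-ins for the study's hydraulic measurements, not a reproduction of them.

```python
# Illustrative SVR-with-gamma setup scored by the paper's metrics (RMSE, R^2).
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.default_rng(1)
X = rng.uniform([2.0, 0.2], [10.0, 1.0], size=(200, 2))  # e.g., Froude, T/I (assumed)
y = 5.0 * np.tanh(0.4 * X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0, 0.05, 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
model = SVR(kernel="rbf", gamma=10, C=10.0).fit(X_tr, y_tr)  # gamma as tuned knob

pred = model.predict(X_te)
rmse = mean_squared_error(y_te, pred) ** 0.5
print(f"RMSE = {rmse:.3f}, R^2 = {r2_score(y_te, pred):.3f}")
```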

18 pages, 1125 KiB  
Article
Early Prediction of DNN Activation Using Hierarchical Computations
by Bharathwaj Suresh, Kamlesh Pillai, Gurpreet Singh Kalsi, Avishaii Abuhatzera and Sreenivas Subramoney
Mathematics 2021, 9(23), 3130; https://doi.org/10.3390/math9233130 - 04 Dec 2021
Viewed by 2116
Abstract
Deep Neural Networks (DNNs) have set state-of-the-art performance numbers in diverse fields of electronics (computer vision, voice recognition), biology, bioinformatics, etc. However, both learning from data (training) and applying the learned information (inference) require huge computational resources. Approximate computing is a common method to reduce computation cost, but it introduces a loss in task accuracy, which limits its application. Using an inherent property of the Rectified Linear Unit (ReLU), a popular activation function, we propose a mathematical model that performs MAC operations at reduced precision in order to predict negative values early. We also propose a method of hierarchical computation that achieves the same results as IEEE 754 full-precision compute. Applying this method to ResNet50 and VGG16 shows that up to 80% of ReLU zeros (which is 50% of all ReLU outputs) can be predicted and detected early using just 3 of the 23 mantissa bits. This method is equally applicable to other floating-point representations.
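
The core idea lends itself to a compact sketch. Assuming float32 inputs and an illustrative decision margin, the code below truncates values to the top 3 of 23 mantissa bits, runs a cheap reduced-precision MAC, and predicts the ReLU output to be zero when the approximate pre-activation is clearly negative, falling back to full precision otherwise (the hierarchical step).

```python
# Early ReLU-zero prediction via reduced-precision MAC (illustrative margin).
import numpy as np

def truncate_mantissa(x, keep_bits=3):
    """Zero out all but the top `keep_bits` of the 23 float32 mantissa bits."""
    bits = x.astype(np.float32).view(np.uint32)
    mask = np.uint32(0xFFFFFFFF) << np.uint32(23 - keep_bits)
    return (bits & mask).view(np.float32)

def relu_mac(weights, activations, margin=1e-2):
    w_lo, a_lo = truncate_mantissa(weights), truncate_mantissa(activations)
    approx = np.dot(w_lo, a_lo)                      # cheap reduced-precision MAC
    if approx < -margin:                             # confidently negative pre-activation
        return 0.0                                   # ReLU output predicted early
    return max(np.dot(weights, activations), 0.0)    # hierarchical full-precision fallback

rng = np.random.default_rng(0)
w = rng.normal(size=256).astype(np.float32)
a = np.abs(rng.normal(size=256)).astype(np.float32)  # post-ReLU inputs are >= 0
print(relu_mac(w, a))
```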

34 pages, 20038 KiB  
Article
Compression of Neural Networks for Specialized Tasks via Value Locality
by Freddy Gabbay and Gil Shomron
Mathematics 2021, 9(20), 2612; https://doi.org/10.3390/math9202612 - 16 Oct 2021
Cited by 2 | Viewed by 1883
Abstract
Convolutional Neural Networks (CNNs) are broadly used in numerous applications such as computer vision and image classification. Although CNN models deliver state-of-the-art accuracy, they require heavy computational resources that are not always affordable or available on every platform. Constraints on performance, system cost, and energy consumption, such as in edge devices, argue for the optimization of computations in neural networks. Toward this end, we propose herein the value-locality-based compression (VELCRO) algorithm for neural networks. VELCRO is a method to compress general-purpose neural networks that are deployed for a small subset of focused specialized tasks. Although this study focuses on CNNs, VELCRO can be used to compress any deep neural network. VELCRO relies on the property of value locality, which suggests that activation functions exhibit values in close proximity throughout the inference process when the network is used for specialized tasks. VELCRO consists of two stages: a preprocessing stage, which identifies output elements of the activation function with a high degree of value locality, and a compression stage, which replaces these elements with their corresponding arithmetic average values. As a result, VELCRO not only saves the computation of the replaced activations but also avoids processing their corresponding output feature map elements. Unlike common neural network compression algorithms, which require computationally intensive training processes, VELCRO introduces significantly fewer computational requirements. An analysis of our experiments indicates that, when CNNs are used for specialized tasks, they exhibit a high degree of value locality relative to the general-purpose case. In addition, the experimental results show that, without any training process, VELCRO produces a compression-saving ratio in the range of 13.5–30.0% with no degradation in accuracy. Finally, the experimental results indicate that, when VELCRO is used with a relatively low compression target, it significantly improves accuracy, by 2–20%, for specialized CNN tasks.
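
A minimal numpy sketch of the two stages described above follows; the locality metric (a coefficient of variation) and the threshold are illustrative assumptions rather than the paper's exact formulation.

```python
# Two-stage VELCRO-style sketch: find high-value-locality activation elements,
# then substitute their averages so their computation can be skipped.
import numpy as np

def preprocess(calibration_activations, threshold=0.1):
    """Stage 1: find activation elements whose values stay in close
    proximity across the specialized-task calibration set."""
    acts = np.stack(calibration_activations)          # (samples, elements)
    mean, std = acts.mean(axis=0), acts.std(axis=0)
    locality = std / (np.abs(mean) + 1e-8)            # low value => high locality
    frozen = locality < threshold
    return frozen, mean

def compress_apply(activation, frozen, frozen_mean):
    """Stage 2: skip computing frozen elements; substitute their average."""
    out = activation.copy()
    out[frozen] = frozen_mean[frozen]
    return out

rng = np.random.default_rng(0)
calib = [np.maximum(rng.normal(1.0, [0.01, 1.0, 0.02, 0.5], 4), 0)
         for _ in range(100)]                         # toy ReLU outputs
frozen, mu = preprocess(calib)
print("compression-saving ratio:", frozen.mean())     # fraction of skipped elements
print("compressed activation:", compress_apply(calib[0], frozen, mu))
```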

17 pages, 3134 KiB  
Article
Effect of Initial Configuration of Weights on Training and Function of Artificial Neural Networks
by Ricardo J. Jesus, Mário L. Antunes, Rui A. da Costa, Sergey N. Dorogovtsev, José F. F. Mendes and Rui L. Aguiar
Mathematics 2021, 9(18), 2246; https://doi.org/10.3390/math9182246 - 13 Sep 2021
Cited by 7 | Viewed by 1987
Abstract
The function and performance of neural networks are largely determined by the evolution of their weights and biases in the process of training, starting from the initial configuration of these parameters to one of the local minima of the loss function. We perform a quantitative statistical characterization of the deviation of the weights of two-hidden-layer feedforward ReLU networks of various sizes, trained via Stochastic Gradient Descent (SGD), from their initial random configuration. We compare the evolution of the distribution function of this deviation with the evolution of the loss during training. We observed that successful training via SGD leaves the network in the close neighborhood of the initial configuration of its weights. For each initial weight of a link, we measured the distribution function of the deviation from this value after training and found how the moments of this distribution and its peak depend on the initial weight. We explored the evolution of these deviations during training and observed an abrupt increase within the overfitting region. This jump occurs simultaneously with a similarly abrupt increase recorded in the evolution of the loss function. Our results suggest that SGD's ability to efficiently find local minima is restricted to the vicinity of the random initial configuration of weights.
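
The measurement protocol can be sketched compactly. The code below snapshots the initial weights of a small two-hidden-layer ReLU network, trains it with SGD on synthetic data (a stand-in for the paper's setups), and reports statistics of the deviation of each weight from its initial value.

```python
# Sketch of the deviation-from-initialization measurement under SGD training.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))
w0 = [p.detach().clone() for p in net.parameters()]   # initial configuration

X, y = torch.randn(512, 20), torch.randn(512, 1)      # synthetic stand-in data
opt = torch.optim.SGD(net.parameters(), lr=0.05)
for step in range(200):                               # SGD training
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(X), y)
    loss.backward()
    opt.step()

# Deviation of each weight from its initial value after training.
dev = torch.cat([(p.detach() - p0).flatten()
                 for p, p0 in zip(net.parameters(), w0)])
print(f"mean |deviation| = {dev.abs().mean():.4f}, std = {dev.std():.4f}")
```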

24 pages, 3766 KiB  
Article
Genetic and Swarm Algorithms for Optimizing the Control of Building HVAC Systems Using Real Data: A Comparative Study
by Alberto Garces-Jimenez, Jose-Manuel Gomez-Pulido, Nuria Gallego-Salvador and Alvaro-Jose Garcia-Tejedor
Mathematics 2021, 9(18), 2181; https://doi.org/10.3390/math9182181 - 07 Sep 2021
Cited by 6 | Viewed by 2732
Abstract
Buildings consume a considerable amount of electrical energy, with the Heating, Ventilation, and Air Conditioning (HVAC) system being the most demanding. Saving energy while maintaining comfort remains a challenge for scientists, as the two objectives conflict. The control of HVAC systems can be improved by modeling their behavior, which is nonlinear, complex, and dynamic and operates in uncertain contexts. The scientific literature shows that soft computing techniques require fewer computing resources, at the expense of a controlled loss of accuracy. Metaheuristic-search-based algorithms show positive results, although further research will be necessary to resolve new, challenging multi-objective optimization problems. This article compares the performance of selected genetic and swarm-intelligence-based algorithms with the aim of discerning their capabilities in the field of smart buildings. MOGA, NSGA-II/III, OMOPSO, and SMPSO, with Random Search as a benchmark, are compared on hypervolume, generational distance, ε-indicator, and execution time. Real data from the Building Management System of the Teatro Real de Madrid were used to train the data model used for the multi-objective calculations. Beyond analyzing the proposed dynamic optimization algorithms during the transient time of an HVAC system, the study adds to the conventional optimization objectives of comfort and energy efficiency two further objectives, the coefficient of performance and the rate of change in ambient temperature, aiming to extend the equipment lifecycle and minimize the overshooting effect when passing to the steady state. The optimization works impressively well for energy savings, although the results must be balanced against other real considerations, such as realistic constraints on the chillers' operational capacity. The intuitive visualization of the performance of the two families of algorithms in a real multi-HVAC system adds to the novelty of this proposal.
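
As background for readers unfamiliar with the compared algorithms, the sketch below isolates their common core: extracting the Pareto front of non-dominated solutions for two conflicting objectives. The toy energy and comfort functions are illustrative stand-ins for the trained data model.

```python
# Pareto-front extraction for two conflicting HVAC objectives (toy model).
import numpy as np

rng = np.random.default_rng(0)
setpoints = rng.uniform(18.0, 28.0, size=200)            # candidate control values

energy = (28.0 - setpoints) ** 2                          # toy energy cost
discomfort = (setpoints - 22.0) ** 2                      # toy comfort penalty
F = np.column_stack([energy, discomfort])                 # minimize both

def pareto_front(F):
    """Return mask of non-dominated points: no other point is at least as
    good in every objective and strictly better in at least one."""
    dominated = np.zeros(len(F), dtype=bool)
    for i, f in enumerate(F):
        dominated[i] = np.any(np.all(F <= f, axis=1) & np.any(F < f, axis=1))
    return ~dominated

front = pareto_front(F)
print(f"{front.sum()} non-dominated set-points out of {len(F)}")
```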

12 pages, 371 KiB  
Article
NICE: Noise Injection and Clamping Estimation for Neural Network Quantization
by Chaim Baskin, Evgenii Zheltonozhkii, Tal Rozen, Natan Liss, Yoav Chai, Eli Schwartz, Raja Giryes, Alexander M. Bronstein and Avi Mendelson
Mathematics 2021, 9(17), 2144; https://doi.org/10.3390/math9172144 - 02 Sep 2021
Cited by 8 | Viewed by 2644
Abstract
Convolutional Neural Networks (CNNs) are very popular in many fields, including computer vision, speech recognition, natural language processing, etc. Though deep learning leads to groundbreaking performance in those domains, the networks used are very computationally demanding and are far from being able to perform in real-time applications even on a GPU, which is not power efficient and therefore does not suit low-power systems such as mobile devices. To overcome this challenge, some solutions have been proposed for quantizing the weights and activations of these networks, which accelerate the runtime significantly. Yet, this acceleration comes at the cost of a larger error unless spatial adjustments are carried out. The method proposed in this work trains quantized neural networks by noise injection and a learned clamping, which improve accuracy. This leads to state-of-the-art results on various regression and classification tasks, e.g., ImageNet classification with architectures such as ResNet-18/34/50 with as low as 3-bit weights and activations. We implement the proposed solution on an FPGA to demonstrate its applicability for low-power real-time applications. The quantization code will become publicly available upon acceptance.
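
The two ingredients named in the abstract can be sketched as follows, under assumed forms: additive uniform noise that emulates quantization error during training, and a clamp whose bound is a learned parameter. The bit-width and shapes are illustrative; this is not the authors' released code.

```python
# Noise-injection + learned-clamp quantizer sketch (illustrative forms).
import torch
import torch.nn as nn

class NoisyClampQuant(nn.Module):
    def __init__(self, bits=3, init_clamp=4.0):
        super().__init__()
        self.bits = bits
        self.clamp = nn.Parameter(torch.tensor(init_clamp))  # learned clamp bound

    def forward(self, x):
        c = self.clamp.abs()
        step = 2 * c / (2 ** self.bits - 1)                  # quantization step size
        x = torch.clamp(x, -c, c)
        if self.training:                                    # noise injection phase
            return x + (torch.rand_like(x) - 0.5) * step
        return torch.round(x / step) * step                  # hard quantization at inference

q = NoisyClampQuant(bits=3)
x = torch.randn(4, 8) * 3
q.train()
print(q(x)[0])    # noisy surrogate used during training
q.eval()
print(q(x)[0])    # true 3-bit quantization at inference
```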

37 pages, 1011 KiB  
Article
Statistical Machine Learning in Model Predictive Control of Nonlinear Processes
by Zhe Wu, David Rincon, Quanquan Gu and Panagiotis D. Christofides
Mathematics 2021, 9(16), 1912; https://doi.org/10.3390/math9161912 - 11 Aug 2021
Cited by 35 | Viewed by 4801
Abstract
Recurrent neural networks (RNNs) have been widely used to model nonlinear dynamic systems using time-series data. While the training error of neural networks can be rendered sufficiently small in many cases, there is a lack of a general framework to guide the construction of RNN models and determine their generalization accuracy for use in model predictive control systems. In this work, we employ statistical machine learning theory to develop a methodological framework of generalization error bounds for RNNs. The RNN models are then utilized to predict state evolution in model predictive controllers (MPCs), under which closed-loop stability is established in a probabilistic manner. A nonlinear chemical process example is used to investigate the impact of training sample size, RNN depth, width, and input time length on the generalization error, along with analyses of probabilistic closed-loop stability through closed-loop simulations under Lyapunov-based MPC.
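
To illustrate the modeling step, the sketch below trains a small RNN on time-series data from a toy one-state nonlinear system (standing in for the paper's chemical process example) and fits it to predict state evolution as an MPC would; the generalization error bounds themselves are the subject of the paper.

```python
# RNN state-evolution predictor sketch on a toy nonlinear system.
import torch
import torch.nn as nn

torch.manual_seed(0)

def plant(x, u):
    # Toy one-state nonlinear dynamics standing in for the chemical process.
    return 0.9 * x + 0.1 * torch.tanh(u) - 0.05 * x ** 3

# Roll the plant forward under random inputs to build (x_t, u_t) -> x_{t+1} data.
x = torch.zeros(256, 1)
u = torch.randn(256, 20, 1)
steps = []
for t in range(20):
    steps.append(torch.cat([x, u[:, t]], dim=1))
    x = plant(x, u[:, t])
inputs = torch.stack(steps, dim=1)      # (batch, time, [x, u])
targets = inputs[:, 1:, :1]             # next-step states

rnn = nn.RNN(input_size=2, hidden_size=32, batch_first=True)
head = nn.Linear(32, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(head.parameters()), lr=1e-2)
for epoch in range(300):                # fit the RNN state predictor
    h, _ = rnn(inputs[:, :-1])
    loss = nn.functional.mse_loss(head(h), targets)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"one-step prediction MSE: {loss.item():.5f}")
```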

21 pages, 1356 KiB  
Article
AutoNowP: An Approach Using Deep Autoencoders for Precipitation Nowcasting Based on Weather Radar Reflectivity Prediction
by Gabriela Czibula, Andrei Mihai, Alexandra-Ioana Albu, Istvan-Gergely Czibula, Sorin Burcea and Abdelkader Mezghani
Mathematics 2021, 9(14), 1653; https://doi.org/10.3390/math9141653 - 14 Jul 2021
Cited by 4 | Viewed by 2197
Abstract
Short-term quantitative precipitation forecasting is a challenging topic in meteorology, as the number of severe meteorological phenomena is increasing in most regions of the world. Weather radar data are of utmost importance to meteorologists for issuing short-term weather forecasts and warnings of severe weather phenomena. We propose AutoNowP, a binary classification model intended for precipitation nowcasting based on weather radar reflectivity prediction. Specifically, AutoNowP uses two convolutional autoencoders, trained on radar data collected in both stratiform and convective weather conditions, to learn to predict whether the radar reflectivity values will be above or below a certain threshold. AutoNowP is intended as a proof of concept that autoencoders are useful in distinguishing between convective and stratiform precipitation. Real radar data provided by the Romanian National Meteorological Administration and the Norwegian Meteorological Institute are used for evaluating the effectiveness of AutoNowP. Results showed that AutoNowP surpassed other binary classifiers used in the supervised learning literature in terms of probability of detection and negative predictive value, highlighting its predictive performance.
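
A minimal sketch of the building block described above: a convolutional autoencoder maps a radar reflectivity frame to a predicted frame, which is thresholded into the binary nowcast. The architecture, grid size, and the 35 dBZ threshold are illustrative assumptions (training loop omitted).

```python
# Convolutional autoencoder + threshold, the AutoNowP-style building block.
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1))

    def forward(self, x):
        return self.dec(self.enc(x))

model = ConvAE()                          # one such model per weather regime
frame = torch.rand(1, 1, 64, 64) * 60     # toy reflectivity frame in dBZ
predicted = model(frame)                  # predicted reflectivity frame
nowcast = predicted > 35.0                # binary above-threshold nowcast
print(nowcast.float().mean())
```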

30 pages, 1780 KiB  
Article
Adaptive Online Learning for the Autoregressive Integrated Moving Average Models
by Weijia Shao, Lukas Friedemann Radke, Fikret Sivrikaya and Sahin Albayrak
Mathematics 2021, 9(13), 1523; https://doi.org/10.3390/math9131523 - 29 Jun 2021
Cited by 1 | Viewed by 1942
Abstract
This paper addresses the problem of predicting time series data using the autoregressive integrated moving average (ARIMA) model in an online manner. Existing algorithms require model selection, which is time consuming and unsuitable for the setting of online learning. Using adaptive online learning techniques, we develop algorithms for fitting ARIMA models without hyperparameters. The regret analysis and experiments on both synthetic and real-world datasets show that the performance of the proposed algorithms can be guaranteed in both theory and practice.
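
The online setting can be illustrated with a stripped-down variant: an ARIMA(p,1,0)-style predictor fit by online gradient descent on the differenced series, with no offline model selection. The fixed order p and learning rate are illustrative; the paper's algorithms adapt such choices automatically.

```python
# Online AR(p) fitting on a first-differenced series (ARIMA(p,1,0)-style sketch).
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(500)
series = 0.02 * t + np.sin(0.2 * t) + rng.normal(0, 0.1, 500)  # trending toy series

p, lr = 5, 0.05
w = np.zeros(p)                      # AR coefficients, learned online
diff = np.diff(series)               # d = 1 differencing removes the trend
errors = []
for n in range(p, len(diff)):
    window = diff[n - p:n][::-1]     # most recent differences first
    pred = w @ window                # one-step-ahead prediction
    err = pred - diff[n]
    w -= lr * err * window           # online gradient step on squared loss
    errors.append(err ** 2)

print(f"mean squared error, last 100 steps: {np.mean(errors[-100:]):.4f}")
```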
