Deep and Classic Machine Learning in Signal, Image, and Video Analysis

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 30 September 2024 | Viewed by 7730

Special Issue Editor


Guest Editor
Faculty of Electronics and Information Technology, Institute of Control and Computation Engineering, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland
Interests: classification; signal, image and video processing; pattern recognition; computer vision; image processing; feature extraction; machine learning; digital signal processing; signal processing; robotics

Special Issue Information

Dear Colleagues,

The recent dominance of machine learning, and of deep learning techniques in particular, in sensor analysis research has raised questions about where classic, analytic solutions in signal, image, and video processing now stand. Despite its unprecedented success, machine learning has weak points, especially in low- and mid-level sensor analysis: a strong dependence on large data sets, which typically require annotation by human experts, and heavy models with huge numbers of trainable parameters, which typically demand extensive training and processing resources.

In this Special Issue, we are particularly interested in how knowledge-aware preprocessing of training data (i.e., data refinement, ordering, normalization, feature extraction, etc.) can drive lightweight deep neural networks to achieve performance similar to that of existing complex and powerful “heavy” networks.

The application tasks addressed in this Special Issue may include signal analysis and classification, speech and speaker recognition, semantic image segmentation and visual object recognition, object tracking, human action and interaction classification in video and RGB-D sequences, and other problems in sensor analysis.

Through the proposed topic, our Special Issue is expected to contribute to a successful synergy between the highly specialized computational methods long established in signal and image analysis and the deep learning techniques steadily being developed by the machine learning community.
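As a toy illustration of the Issue's theme, the sketch below shows how a hand-crafted, domain-informed feature step (here, spectral band energies plus normalization) can shrink a classifier from 1024 raw inputs to 16 features. Everything here is illustrative and synthetic; it is not drawn from any submitted work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: classify 1-D signals (length 1024) by their dominant frequency.
def make_signal(freq, n=1024):
    t = np.arange(n) / n
    return np.sin(2 * np.pi * freq * t) + 0.1 * rng.standard_normal(n)

X = np.array([make_signal(f) for f in ([5] * 50 + [40] * 50)])
y = np.array([0] * 50 + [1] * 50)

# Knowledge-aware preprocessing: replace the 1024 raw samples with
# 16 spectral band energies (a hand-crafted, domain-informed feature).
def band_energies(x, bands=16):
    spec = np.abs(np.fft.rfft(x)) ** 2
    return np.array([s.sum() for s in np.array_split(spec, bands)])

F = np.array([band_energies(x) for x in X])
F = (F - F.mean(axis=0)) / (F.std(axis=0) + 1e-9)  # normalization

# A "lightweight" model now operates on 16 features instead of 1024 samples:
# a nearest-centroid classifier in feature space.
c0, c1 = F[y == 0].mean(axis=0), F[y == 1].mean(axis=0)
pred = (np.linalg.norm(F - c1, axis=1) < np.linalg.norm(F - c0, axis=1)).astype(int)
accuracy = (pred == y).mean()
```

The point is that the domain knowledge moved into `band_energies` is knowledge the downstream model no longer has to learn from data.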

Prof. Dr. Włodzimierz Kasprzak
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, navigate to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • adaptive data transform
  • audio analysis
  • convolutional neural networks
  • data mining
  • deep neural network learning
  • deep auto-encoder
  • deep adversarial networks
  • federated and decentralized learning
  • graph convolutional neural networks
  • human action and interaction in video
  • human-centered signal processing
  • LSTM
  • machine perception
  • object tracking
  • recurrent neural networks
  • semantic image segmentation
  • sensor signal analysis
  • signal classification
  • speech and speaker recognition
  • video analysis
  • weakly supervised learning

Published Papers (5 papers)


Research

19 pages, 5044 KiB  
Article
Improving the Performance of Automatic Lip-Reading Using Image Conversion Techniques
by Ki-Seung Lee
Electronics 2024, 13(6), 1032; https://doi.org/10.3390/electronics13061032 - 09 Mar 2024
Viewed by 628
Abstract
Variation in lighting conditions is a major cause of performance degradation in pattern recognition when using optical imaging. In this study, infrared (IR) and depth images were considered as possible robust alternatives against variations in illumination, particularly for improving the performance of automatic lip-reading. The variations due to lighting conditions were quantitatively analyzed for optical, IR, and depth images. Then, deep neural network (DNN)-based lip-reading rules were built for each image modality. Speech recognition techniques based on IR or depth imaging require an additional light source that emits light in the IR range, along with a special camera. To mitigate this problem, we propose a method that does not use an IR/depth image directly, but instead estimates these images from the optical RGB image. To this end, a modified U-net was adopted to estimate the IR/depth image from an optical RGB image. The results show that the IR and depth images were rarely affected by the lighting conditions. The recognition rates for the optical, IR, and depth images were 48.29%, 95.76%, and 92.34%, respectively, under various lighting conditions. Using the estimated IR and depth images, the recognition rates were 89.35% and 80.42%, respectively, significantly higher than for the optical RGB images.
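The paper's modality estimation uses a modified U-net; the toy sketch below replaces it with a per-pixel least-squares map on synthetic data, purely to illustrate the image-to-image regression setup. All names and values here are made up and are not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for the paper's modified U-net: learn a per-pixel linear map
# from RGB values to a target modality (here a synthetic "depth" channel).
# The real method is a convolutional encoder-decoder; this only illustrates
# the regression problem being solved.
H, W = 32, 32
rgb = rng.random((H, W, 3))
true_w = np.array([0.2, 0.5, 0.3])
depth = rgb @ true_w + 0.01 * rng.standard_normal((H, W))  # synthetic target

# Flatten pixels into a regression problem: X (N, 3) -> y (N,)
X = rgb.reshape(-1, 3)
y = depth.reshape(-1)
w, *_ = np.linalg.lstsq(X, y, rcond=None)

est = (X @ w).reshape(H, W)                 # estimated "depth" image
rmse = np.sqrt(np.mean((est - depth) ** 2))
```

A U-net replaces the per-pixel linear map with spatially aware features, but the training objective (minimize the pixel-wise reconstruction error of the target modality) has the same shape.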

20 pages, 5451 KiB  
Article
An Aero-Engine Classification Method Based on Fourier Transform Infrared Spectrometer Spectral Feature Vectors
by Shuhan Du, Wei Han, Zhengyang Shi, Yurong Liao and Zhaoming Li
Electronics 2024, 13(5), 915; https://doi.org/10.3390/electronics13050915 - 28 Feb 2024
Viewed by 500
Abstract
To address the classification and identification of aero-engines, this paper uses a telemetry Fourier transform infrared (FT-IR) spectrometer to collect infrared spectra of aero-engine hot jets and proposes a classification method based on spectral feature vectors. First, hot jet infrared spectrum data are acquired and measured, and spectral feature vectors based on CO2 are constructed. The feature vectors are then combined with seven mainstream classification algorithms to train and evaluate the classification model. In the experiment, two Fourier transform infrared spectrometers, the EM27 developed by Bruker and a self-developed telemetry FT-IR spectrometer, were used to remotely measure the hot jets of three aero-engines and obtain infrared spectral data. The data were randomly divided into training and test sets in a ratio of 3:1. Model training on the training set and label prediction on the test set were carried out by combining the spectral feature vectors with the classification algorithms, using accuracy, precision, recall, the confusion matrix, and F1-score as evaluation metrics. The classification accuracy of the method was 98%. This work is relevant to the fault diagnosis of aero-engines and to the classification and recognition of aircraft.
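Greatly simplified, the pipeline above can be sketched with synthetic feature vectors, a random 3:1 split, and a nearest-centroid classifier standing in for the seven algorithms the paper compares. All data, dimensions, and names below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for the per-engine spectral feature vectors:
# 3 engine classes, 40 samples each, 8-dimensional CO2-band features.
n_per_class, dim = 40, 8
centers = rng.normal(0, 5, size=(3, dim))
X = np.vstack([c + rng.normal(0, 1, size=(n_per_class, dim)) for c in centers])
y = np.repeat([0, 1, 2], n_per_class)

# Random 3:1 division into training and test sets, as in the paper.
idx = rng.permutation(len(y))
cut = int(0.75 * len(y))
tr, te = idx[:cut], idx[cut:]

# Nearest-centroid classifier as a stand-in for the compared algorithms.
cents = np.array([X[tr][y[tr] == k].mean(axis=0) for k in range(3)])
pred = np.argmin(((X[te][:, None, :] - cents[None]) ** 2).sum(-1), axis=1)
accuracy = (pred == y[te]).mean()
```

Swapping the centroid rule for SVMs, random forests, or any of the other compared classifiers changes only the model-fitting line; the split and the evaluation metrics stay the same.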

23 pages, 2626 KiB  
Article
Deep Learning of Sensor Data in Cybersecurity of Robotic Systems: Overview and Case Study Results
by Wojciech Szynkiewicz, Ewa Niewiadomska-Szynkiewicz and Kamila Lis
Electronics 2023, 12(19), 4146; https://doi.org/10.3390/electronics12194146 - 05 Oct 2023
Cited by 1 | Viewed by 1059
Abstract
Recent technological advances have enabled the development of sophisticated robotic and sensor systems monitored and controlled by algorithms based on computational intelligence. The deeply intertwined and cooperating devices connected to the Internet and local networks, usually through wireless communication, are increasingly used in systems deployed among people in public spaces. The challenge is to ensure that physical and digital components work together securely, especially as the impact of cyberattacks is significantly increasing. The paper addresses cybersecurity issues of mobile service robots with distributed control architectures. The focus is on automatically detecting anomalous behaviors possibly caused by cyberattacks on onboard and external sensors measuring the robot and environmental parameters. We provide an overview of the methods and techniques for protecting robotic systems. Particular attention is paid to our technique for anomaly detection in a service robot’s operation based on sensor readings and deep recurrent neural networks, assuming that attacks result in the robot behaving inconsistently. The paper presents the architecture of two artificial neural networks, their parameters, and attributes based on which the potential attacks are identified. The solution was validated on the PAL Robotics TIAGo robot operating in the laboratory and replicating a home environment. The results confirm that the proposed system can effectively support the detection of computer threats affecting the sensors’ measurements and, consequently, the functioning of a service robotic system.
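The residual-thresholding idea behind such detectors can be sketched with a linear autoregressive predictor standing in for the paper's deep recurrent networks: learn to predict the next sensor reading from recent ones, then raise an alarm when the prediction error is abnormally large. The data and the "attack" below are synthetic; this is not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic "normal" sensor stream: a slow oscillation plus noise.
t = np.arange(500)
normal = np.sin(0.1 * t) + 0.05 * rng.standard_normal(500)

def ar_fit(x, order=5):
    # Fit a linear predictor of x[i] from the previous `order` readings.
    X = np.stack([x[i:len(x) - order + i] for i in range(order)], axis=1)
    w, *_ = np.linalg.lstsq(X, x[order:], rcond=None)
    return w

def residuals(x, w):
    order = len(w)
    X = np.stack([x[i:len(x) - order + i] for i in range(order)], axis=1)
    return np.abs(X @ w - x[order:])

w = ar_fit(normal)
thresh = residuals(normal, w).max() * 1.5  # margin above normal error

# Inject an "attack": a sudden spike in the sensor stream.
attacked = normal.copy()
attacked[300] += 3.0
alarms = residuals(attacked, w) > thresh
```

A recurrent network replaces the linear predictor with a learned nonlinear one, but the detection logic (large prediction residual implies inconsistent behavior) is the same.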

18 pages, 32352 KiB  
Article
Lightweight Video Super-Resolution for Compressed Video
by Ilhwan Kwon, Jun Li and Mukesh Prasad
Electronics 2023, 12(3), 660; https://doi.org/10.3390/electronics12030660 - 28 Jan 2023
Cited by 1 | Viewed by 2412
Abstract
Video compression technology for Ultra-High Definition (UHD) and 8K UHD video has been established and is being widely adopted by major broadcasting companies and video content providers, allowing them to produce high-quality videos that meet the demands of today’s consumers. However, broadcasting high-resolution video content will not be easy to achieve in the near future due to limited resources in network bandwidth and data storage. An alternative solution is to downsample UHD or 8K video at the transmission side using existing infrastructure, and then to apply Video Super-Resolution (VSR) technology at the receiving end to recover the original quality of the video content. Current deep learning-based VSR methods fail to consider that the video delivered to viewers goes through a compression and decompression process, which can introduce additional distortion and loss of information, so it is crucial to develop VSR methods specifically designed for the compression–decompression pipeline. In general, the side information available in the compressed video is underused by existing VSR models. This research proposes a highly efficient VSR network that makes use of data from the decompressed video such as frame type, Group of Pictures (GOP), macroblock type, and motion vectors. The proposed Convolutional Neural Network (CNN)-based lightweight VSR model is suitable for real-time video services. The performance of the model is extensively evaluated through a series of experiments, demonstrating its effectiveness and applicability in practical scenarios.
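The downsample-then-restore pipeline the abstract describes can be sketched end to end, with a naive nearest-neighbor upsampler as the baseline a learned VSR model would replace, and PSNR as the quality score. The frame is synthetic and the functions are illustrative, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(4)

def downsample2x(img):
    # Average 2x2 blocks: a crude stand-in for the sender-side downscaler.
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2x(img):
    # Nearest-neighbor upsampling: the baseline a VSR network should beat.
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def psnr(ref, est, peak=1.0):
    # Peak signal-to-noise ratio in dB.
    mse = np.mean((ref - est) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

frame = rng.random((64, 64))          # synthetic frame with values in [0, 1]
received = upsample2x(downsample2x(frame))
quality = psnr(frame, received)
```

The paper's contribution plugs in at the `upsample2x` step: a lightweight CNN, fed with codec side information (frame type, GOP, macroblock types, motion vectors), replaces the naive upsampler to raise the PSNR at real-time cost.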

14 pages, 2829 KiB  
Article
A Federated Learning Framework for Breast Cancer Histopathological Image Classification
by Lingxiao Li, Niantao Xie and Sha Yuan
Electronics 2022, 11(22), 3767; https://doi.org/10.3390/electronics11223767 - 16 Nov 2022
Cited by 14 | Viewed by 2446
Abstract
The quantity and diversity of datasets are vital to model training in a variety of medical image diagnosis applications. In practice, however, the required data may not be available at a single institution, owing to patient numbers or pathology types, and sharing patient data is often infeasible under medical data privacy regulations. Keeping private data safe is therefore required and has become an obstacle to fusing multi-party data to train a medical model. To solve these problems, we propose a federated learning framework in which knowledge fusion is achieved by sharing the model parameters of each client through federated training rather than by sharing data. On the breast cancer histopathology dataset BreakHis, our federated learning experiments achieve results similar to the performance of centralized learning, verifying the feasibility and efficiency of the proposed framework.
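The parameter-sharing step at the heart of such a framework can be sketched as a FedAvg-style weighted average of client parameters. This is a generic sketch of the aggregation rule, not the authors' implementation; the client names, sizes, and parameter shapes are made up.

```python
import numpy as np

rng = np.random.default_rng(5)

def fed_avg(client_params, client_sizes):
    # Server-side fusion: average client parameter vectors, weighted by the
    # amount of local data each client trained on. Only parameters are
    # exchanged; the raw patient data never leave the clients.
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    return sum(w * p for w, p in zip(weights, client_params))

# Three "hospitals" with different amounts of local data; each holds a
# locally updated parameter vector for the same model architecture.
params = [rng.standard_normal(10) for _ in range(3)]
sizes = [100, 400, 500]
global_params = fed_avg(params, sizes)
```

In a full training loop, the server broadcasts `global_params` back to the clients, each client performs a few local epochs, and the cycle repeats until convergence.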
