Algorithms for Image Processing and Machine Vision

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Algorithms for Multidisciplinary Applications".

Deadline for manuscript submissions: 30 June 2024 | Viewed by 7876

Special Issue Editor


Dr. Arslan Munir
Guest Editor
Department of Computer Science, Kansas State University, Manhattan, KS 66506, USA
Interests: artificial intelligence; computer vision; parallel computing; embedded systems; secure and trustworthy systems

Special Issue Information

Dear Colleagues,

Modern image processing transforms an image into digital form and uses computing systems to process, manipulate, and/or enhance it through various algorithms. Image processing is also a prerequisite for many machine vision tasks, as it preprocesses images and prepares data in a form suitable for machine vision models. Machine vision generally refers to techniques and algorithms that enable computers and machines to understand and make sense of images: it allows machines to extract latent information from visual data and to mimic the human perception of sight with computational algorithms. Active research is ongoing into novel image processing and machine vision algorithms, including deep learning-based algorithms, that enable new and fascinating applications.
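
As a concrete illustration of the preprocessing role described above, the following minimal sketch (assuming OpenCV and NumPy; the file name and target size are illustrative) converts an image into the normalized array form that machine vision models typically consume:

```python
# A minimal preprocessing sketch, assuming OpenCV (cv2) and NumPy;
# the file name and input size are illustrative, not prescriptive.
import cv2
import numpy as np

def preprocess(path: str, size: tuple[int, int] = (224, 224)) -> np.ndarray:
    """Load an image and prepare it for a downstream vision model."""
    image = cv2.imread(path)                      # BGR uint8 array
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.GaussianBlur(image, (3, 3), 0)    # light denoising
    image = cv2.resize(image, size)               # model input size
    return image.astype(np.float32) / 255.0       # normalize to [0, 1]

batch_ready = preprocess("scene.jpg")
```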

This Special Issue targets algorithms for image processing and machine vision. It invites original research articles and reviews relating to the computing, architecture, algorithms, security, and applications of image processing and machine vision. Topics of interest include, but are not limited to, the following:

  • Image interpretation;
  • Object detection and recognition;
  • Spatial artificial intelligence;
  • Event detection and activity recognition;
  • Image segmentation;
  • Video classification and analysis;
  • Face and gesture recognition;
  • Pose estimation;
  • Computational photography;
  • Image security;
  • Vision hardware and/or software architectures;
  • Image/vision acceleration techniques;
  • Monitoring and surveillance;
  • Situational awareness.

Dr. Arslan Munir
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image processing
  • machine vision
  • image fusion
  • vision algorithms
  • deep learning
  • stereo vision
  • activity recognition
  • image/video analysis
  • image encryption algorithms
  • computational photography
  • vision hardware/software
  • monitoring and surveillance
  • biometrics
  • robotics
  • augmented reality

Published Papers (5 papers)


Research

16 pages, 808 KiB  
Article
GDUI: Guided Diffusion Model for Unlabeled Images
by Xuanyuan Xie and Jieyu Zhao
Algorithms 2024, 17(3), 125; https://doi.org/10.3390/a17030125 - 18 Mar 2024
Viewed by 826
Abstract
Diffusion models have made progress in image synthesis, especially conditional image synthesis. However, this improvement is highly dependent on large annotated datasets. To tackle this challenge, we present the Guided Diffusion model for Unlabeled Images (GDUI) framework. It utilizes the inherent feature similarity and semantic differences in the data, as well as the downstream transferability of Contrastive Language-Image Pretraining (CLIP), to guide the diffusion model in generating high-quality images. We design two semantic-aware algorithms, the pseudo-label-matching algorithm and the label-matching refinement algorithm, to match clustering results with the true semantic information and provide more accurate guidance for the diffusion model. First, GDUI encodes each image into a semantically meaningful latent vector through clustering. Then, pseudo-label matching aligns the clusters with the true semantic information of the images. Finally, the label-matching refinement algorithm adjusts irrelevant semantic information in the data, improving the quality of guided image generation. Our experiments on labeled datasets show that GDUI outperforms diffusion models without any guidance and significantly narrows the gap to models guided by ground-truth labels.
(This article belongs to the Special Issue Algorithms for Image Processing and Machine Vision)
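
To make the pseudo-label-matching step concrete, here is a minimal sketch, assuming NumPy and SciPy, of one standard way to align cluster IDs with zero-shot CLIP labels via the Hungarian algorithm; the function and variable names are illustrative, and the paper's actual algorithm may differ in detail.

```python
# A hedged sketch of pseudo-label matching: align k-means cluster IDs with
# CLIP zero-shot labels by maximizing agreement (Hungarian algorithm).
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_pseudo_labels(cluster_ids: np.ndarray,
                        clip_labels: np.ndarray,
                        n_classes: int) -> np.ndarray:
    """Return a mapping from cluster ID -> semantic class ID."""
    # Co-occurrence matrix: how often cluster c coincides with class k.
    agreement = np.zeros((n_classes, n_classes), dtype=np.int64)
    for c, k in zip(cluster_ids, clip_labels):
        agreement[c, k] += 1
    # Hungarian algorithm maximizes total agreement (negate for min-cost).
    rows, cols = linear_sum_assignment(-agreement)
    mapping = np.empty(n_classes, dtype=np.int64)
    mapping[rows] = cols
    return mapping

# Illustrative usage: map each cluster to a semantic class to guide diffusion.
# pseudo_classes = match_pseudo_labels(kmeans.labels_, clip_zero_shot_ids, 10)
```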

17 pages, 25202 KiB  
Article
Denoising Diffusion Models on Model-Based Latent Space
by Carmelo Scribano, Danilo Pezzi, Giorgia Franchini and Marco Prato
Algorithms 2023, 16(11), 501; https://doi.org/10.3390/a16110501 - 28 Oct 2023
Viewed by 1540
Abstract
With recent advancements in diffusion generative models, it has been shown that defining the generative process in the latent space of a powerful pretrained autoencoder can offer substantial advantages. This approach, by abstracting away imperceptible image details and introducing substantial spatial compression, renders the learning of the generative process more manageable while significantly reducing computational and memory demands. In this work, we propose to replace autoencoder coding with a model-based coding scheme based on traditional lossy image compression techniques; this choice not only further diminishes computational expense but also allows us to probe the boundaries of latent-space image generation. Our work culminates in a valuable approximation for training continuous diffusion models within a discrete space, accompanied by enhancements to the generative model for categorical values. Beyond the good results obtained for the problem at hand, we believe that the proposed approach holds promise for enhancing the adaptability of generative diffusion models across diverse data types beyond imagery.
(This article belongs to the Special Issue Algorithms for Image Processing and Machine Vision)
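
The model-based coding idea can be illustrated with a blockwise DCT plus uniform quantization, in the spirit of JPEG; the sketch below (assuming SciPy, a grayscale image, and dimensions divisible by the block size) is an illustrative stand-in rather than the paper's exact codec. A diffusion model would then be trained on the quantized coefficients instead of autoencoder latents.

```python
# An illustrative model-based codec: blockwise DCT + uniform quantization.
# Assumes image height and width are divisible by the block size.
import numpy as np
from scipy.fft import dctn, idctn

def encode(img: np.ndarray, block: int = 8, q: float = 10.0) -> np.ndarray:
    """Grayscale image -> quantized DCT coefficients (the 'latent')."""
    h, w = img.shape
    latent = np.zeros((h, w), dtype=np.int32)
    for i in range(0, h, block):
        for j in range(0, w, block):
            coeffs = dctn(img[i:i+block, j:j+block], norm="ortho")
            latent[i:i+block, j:j+block] = np.round(coeffs / q)
    return latent

def decode(latent: np.ndarray, block: int = 8, q: float = 10.0) -> np.ndarray:
    """Inverse of encode(): dequantize and apply the inverse DCT."""
    h, w = latent.shape
    img = np.zeros((h, w), dtype=np.float64)
    for i in range(0, h, block):
        for j in range(0, w, block):
            img[i:i+block, j:j+block] = idctn(
                latent[i:i+block, j:j+block] * q, norm="ortho")
    return img
```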

21 pages, 4300 KiB  
Article
Indoor Scene Recognition: An Attention-Based Approach Using Feature Selection-Based Transfer Learning and Deep Liquid State Machine
by Ranjini Surendran, Ines Chihi, J. Anitha and D. Jude Hemanth
Algorithms 2023, 16(9), 430; https://doi.org/10.3390/a16090430 - 8 Sep 2023
Cited by 1 | Viewed by 1444
Abstract
Scene understanding is one of the most challenging areas of research in robotics and computer vision, and recognising indoor scenes is an application within it that has gained attention in recent years. Recent developments in deep learning and transfer learning have attracted huge attention in addressing this challenging area. In our work, we propose a fine-tuned deep transfer learning approach using DenseNet201 for feature extraction and a deep Liquid State Machine model as the classifier in order to recognise and understand indoor scenes. We include fuzzy colour stacking techniques, colour-based segmentation, and an adaptive World Cup optimisation algorithm to improve the performance of our deep model. The proposed model is intended to assist visually impaired and blind people in navigating indoor environments and integrating into their day-to-day activities. Our proposed work was implemented on the NYU depth dataset and attained an accuracy of 96% for classifying indoor scenes.
(This article belongs to the Special Issue Algorithms for Image Processing and Machine Vision)
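
A minimal sketch of the feature-extraction stage is shown below, assuming PyTorch and torchvision; since a deep Liquid State Machine is not available in standard libraries, a linear head stands in for the paper's classifier, and the class count is illustrative.

```python
# A hedged sketch of DenseNet201 feature extraction; the linear classifier
# is a stand-in for the paper's deep Liquid State Machine.
import torch
import torch.nn.functional as F
from torchvision import models

backbone = models.densenet201(weights=models.DenseNet201_Weights.DEFAULT)
backbone.eval()

@torch.no_grad()
def extract_features(batch: torch.Tensor) -> torch.Tensor:
    """(N, 3, 224, 224) normalized images -> (N, 1920) feature vectors."""
    fmap = backbone.features(batch)              # (N, 1920, 7, 7)
    fmap = F.relu(fmap)
    return torch.flatten(F.adaptive_avg_pool2d(fmap, 1), 1)

classifier = torch.nn.Linear(1920, 67)           # illustrative class count
logits = classifier(extract_features(torch.randn(2, 3, 224, 224)))
```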

22 pages, 7154 KiB  
Article
A Comprehensive Analysis of Real-Time Car Safety Belt Detection Using the YOLOv7 Algorithm
by Lwando Nkuzo, Malusi Sibiya and Elisha Didam Markus
Algorithms 2023, 16(9), 400; https://doi.org/10.3390/a16090400 - 23 Aug 2023
Viewed by 2185
Abstract
Using a safety belt is crucial for preventing severe injuries and fatalities during vehicle accidents. In this paper, we propose a real-time vehicle occupant safety belt detection system based on the YOLOv7 (You Only Look Once version seven) object detection algorithm. The proposed approach automatically detects whether the occupants of a vehicle have buckled their safety belts as soon as they are detected within the vehicle. A dataset for this purpose was collected and annotated for validation and testing. By leveraging the efficiency and accuracy of YOLOv7, we achieve near-instantaneous analysis of video streams, making our system suitable for deployment in various surveillance and automotive safety applications. This paper outlines a comprehensive methodology for training the YOLOv7 model, using the labelImg tool to annotate a dataset of images showing vehicle occupants. It also discusses the challenges of detecting seat belts and evaluates the system's performance on a real-world dataset. The evaluation focuses on distinguishing the status of a safety belt between two classes: "buckled" and "unbuckled". The results demonstrate a high level of accuracy, with a mean average precision (mAP) of 99.6% and an F1 score of 98%, indicating the system's effectiveness in identifying safety belt status.
(This article belongs to the Special Issue Algorithms for Image Processing and Machine Vision)
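
The deployment side of such a system might look like the following hedged sketch, which runs an exported YOLOv7 ONNX model over a video stream with OpenCV's DNN module; the weights file, input size, and class order are assumptions, not the authors' released artifacts.

```python
# A hedged deployment sketch: exported YOLOv7 ONNX model + OpenCV DNN.
# The weights file name, video source, and class order are assumptions.
import cv2
import numpy as np

CLASSES = ["buckled", "unbuckled"]                     # assumed class order
net = cv2.dnn.readNetFromONNX("seatbelt_yolov7.onnx")  # hypothetical export

cap = cv2.VideoCapture("cabin_camera.mp4")             # or a live camera index
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (640, 640), swapRB=True)
    net.setInput(blob)
    detections = net.forward()                         # raw YOLO output tensor
    # ... decode boxes, apply confidence threshold and NMS, draw labels ...
cap.release()
```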

22 pages, 2661 KiB  
Article
Human Action Representation Learning Using an Attention-Driven Residual 3DCNN Network
by Hayat Ullah and Arslan Munir
Algorithms 2023, 16(8), 369; https://doi.org/10.3390/a16080369 - 31 Jul 2023
Viewed by 1010
Abstract
The recognition of human activities using vision-based techniques has become a crucial research field in video analytics. Over the last decade, there have been numerous advancements in deep learning algorithms aimed at accurately detecting complex human actions in video streams. While these algorithms have demonstrated impressive performance in activity recognition, they often exhibit a bias towards either model performance or computational efficiency. This biased trade-off between robustness and efficiency poses challenges when addressing complex human activity recognition problems. To address this issue, this paper presents a computationally efficient yet robust approach, exploiting saliency-aware spatial and temporal features for human action recognition in videos. To achieve effective representation of human actions, we propose an efficient approach called the dual-attentional Residual 3D Convolutional Neural Network (DA-R3DCNN). Our proposed method utilizes a unified channel-spatial attention mechanism, allowing it to efficiently extract significant human-centric features from video frames. By combining dual channel-spatial attention layers with residual 3D convolution layers, the network becomes more discerning in capturing spatial receptive fields containing objects within the feature maps. To assess the effectiveness and robustness of our proposed method, we have conducted extensive experiments on four well-established benchmark datasets for human action recognition. The quantitative results obtained validate the efficiency of our method, showcasing significant improvements in accuracy of up to 11% as compared to state-of-the-art human action recognition methods. Additionally, our evaluation of inference time reveals that the proposed method achieves up to a 74× improvement in frames per second (FPS) compared to existing approaches, thus showing the suitability and effectiveness of the proposed DA-R3DCNN for real-time human activity recognition.
(This article belongs to the Special Issue Algorithms for Image Processing and Machine Vision)
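
The combination of channel-spatial attention with residual 3D convolutions can be sketched in PyTorch as follows; layer sizes and the exact attention design are assumptions rather than the authors' published architecture.

```python
# An illustrative residual 3D convolution block with channel and spatial
# attention, in the spirit of DA-R3DCNN; dimensions are assumptions.
import torch
import torch.nn as nn

class AttentiveRes3DBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(channels, channels, 3, padding=1),
            nn.BatchNorm3d(channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, 3, padding=1),
            nn.BatchNorm3d(channels),
        )
        # Channel attention: squeeze-and-excitation over (T, H, W).
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),
            nn.Conv3d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: a single-channel saliency map.
        self.spatial = nn.Sequential(nn.Conv3d(channels, 1, 7, padding=3),
                                     nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.conv(x)
        y = y * self.channel(y)      # reweight feature channels
        y = y * self.spatial(y)      # emphasize human-centric regions
        return torch.relu(x + y)     # residual connection

# Input layout: (batch, channels, frames, height, width).
out = AttentiveRes3DBlock(64)(torch.randn(1, 64, 16, 56, 56))
```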
