Artificial Intelligence-Based Image Processing and Computer Vision

A special issue of AI (ISSN 2673-2688). This special issue belongs to the section "AI Systems: Theory and Applications".

Deadline for manuscript submissions: 31 July 2024 | Viewed by 6929

Special Issue Editor


Dr. Arslan Munir
Guest Editor
Department of Computer Science, Kansas State University, Manhattan, KS 66506, USA
Interests: artificial intelligence; computer vision; parallel computing; embedded systems; secure and trustworthy systems

Special Issue Information

Dear Colleagues,

Modern image processing converts an image into digital form and uses computing systems to process, manipulate, and/or enhance it through various algorithms. Image processing is also a prerequisite for many computer vision tasks, as it preprocesses images and prepares data in a form suitable for computer vision models. Computer vision generally refers to techniques that enable computers to understand and make sense of images: it allows machines to extract latent information from visual data and to mimic human visual perception with computational algorithms. Active research is under way on novel image processing and computer vision algorithms, including those based on artificial intelligence (AI) and, in particular, deep learning, to enable new applications. Advances in AI-based image processing and computer vision have already enabled many applications, such as autonomous vehicles, unmanned aerial vehicles, computational photography, augmented reality, surveillance, optical character recognition, machine inspection, autonomous package delivery, photogrammetry, biometrics, computer-aided inspection of medical images, and remote patient monitoring. Image processing and computer vision are applied in domains including healthcare, transportation, retail, agriculture, business, manufacturing, construction, space, and military.

This Special Issue will explore algorithms and applications of image processing and computer vision inspired by AI. For this Special Issue, we welcome the submission of original research articles and reviews that relate to computing, architecture, algorithms, security, and applications of image processing and computer vision. Topics of interest include but are not limited to the following:

  • Image interpretation
  • Object detection and recognition
  • Spatial artificial intelligence
  • Event detection and activity recognition
  • Image segmentation
  • Video classification and analysis
  • Face and gesture recognition
  • Pose estimation
  • Computational photography
  • Image security
  • Vision hardware and/or software architectures
  • Image/vision acceleration techniques
  • Monitoring and surveillance
  • Situational awareness.

Dr. Arslan Munir
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. AI is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image processing
  • computer vision
  • image fusion
  • vision algorithms
  • deep learning
  • stereo vision
  • activity recognition
  • image/video analysis
  • image encryption
  • computational photography
  • vision hardware/software
  • monitoring and surveillance
  • biometrics
  • robotics
  • augmented reality

Published Papers (4 papers)


Research


20 pages, 6807 KiB  
Article
Single Image Super Resolution Using Deep Residual Learning
by Moiz Hassan, Kandasamy Illanko and Xavier N. Fernando
AI 2024, 5(1), 426-445; https://doi.org/10.3390/ai5010021 - 21 Mar 2024
Viewed by 1246
Abstract
Single Image Super Resolution (SISR) is an intriguing research topic in computer vision where the goal is to create high-resolution images from low-resolution ones using innovative techniques. SISR has numerous applications in fields such as medical/satellite imaging, remote target identification and autonomous vehicles. Compared to traditional interpolation-based approaches, deep learning techniques have recently gained attention in SISR due to their superior performance and computational efficiency. This article proposes an autoencoder-based deep learning model for SISR. The down-sampling part of the autoencoder mainly uses 3 by 3 convolution and has no subsampling layers. The up-sampling part uses transpose convolution and residual connections from the down-sampling part. The model is trained using a subset of the VILRC ImageNet database as well as the RealSR database. Quantitative metrics such as PSNR and SSIM are found to be as high as 76.06 and 0.93 in our testing. We also used qualitative measures such as perceptual quality.
(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)
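
The abstract above reports PSNR as one of its quantitative metrics. As a rough illustration of what that metric measures, here is a minimal sketch of the standard PSNR formula for two small grayscale images stored as nested lists. The function name `psnr`, the 8-bit peak value of 255, and the sample pixel values are assumptions made for this example; it is not the authors' evaluation code.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale images.

    Images are nested lists of pixel intensities; max_value is the assumed
    peak intensity (255 for 8-bit images).
    """
    if len(reference) != len(estimate) or any(
        len(r) != len(e) for r, e in zip(reference, estimate)
    ):
        raise ValueError("images must have the same shape")

    pixel_count = sum(len(row) for row in reference)
    squared_error = sum(
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    )
    mse = squared_error / pixel_count
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 images: a reference and a slightly different reconstruction.
reference = [[52, 55], [61, 59]]
estimate = [[54, 55], [61, 57]]
value = psnr(reference, estimate)
print(f"PSNR: {value:.2f} dB")
```

A higher value means the reconstruction is closer to the reference; identical images give an infinite PSNR, which the sketch returns as `math.inf`.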

31 pages, 4849 KiB  
Article
MultiWave-Net: An Optimized Spatiotemporal Network for Abnormal Action Recognition Using Wavelet-Based Channel Augmentation
by Ramez M. Elmasry, Mohamed A. Abd El Ghany, Mohammed A.-M. Salem and Omar M. Fahmy
AI 2024, 5(1), 259-289; https://doi.org/10.3390/ai5010014 - 24 Jan 2024
Viewed by 1069
Abstract
Human behavior is regarded as one of the most complex notions present nowadays, due to the large magnitude of possibilities. These behaviors and actions can be distinguished as normal and abnormal. However, abnormal behavior is a vast spectrum, so in this work, abnormal behavior is regarded as human aggression or in another context when car accidents occur on the road. As this behavior can negatively affect the surrounding traffic participants, such as vehicles and other pedestrians, it is crucial to monitor such behavior. Given the current prevalent spread of cameras everywhere with different types, they can be used to classify and monitor such behavior. Accordingly, this work proposes a new optimized model based on a novel integrated wavelet-based channel augmentation unit for classifying human behavior in various scenes, having a total number of trainable parameters of 5.3 million with an average inference time of 0.09 s. The model has been trained and evaluated on four public datasets: Real Live Violence Situations (RLVS), Highway Incident Detection (HWID), Movie Fights, and Hockey Fights. The proposed technique achieved accuracies in the range of 92% to 99.5% across the used benchmark datasets. Comprehensive analysis and comparisons between different versions of the model and the state-of-the-art have been performed to confirm the model’s performance in terms of accuracy and efficiency. The proposed model has higher accuracy by an average of 4.97%, and higher efficiency by reducing the number of parameters by around 139.1 million compared to other models trained and tested on the same benchmark datasets.
(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)
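
The abstract above mentions a wavelet-based channel augmentation unit without describing its internals here. To give a sense of the general idea of deriving extra channels from a wavelet decomposition, the following is a minimal sketch of a single-level 2D Haar transform whose four subbands are stacked as channels. The Haar wavelet, the function name `haar_2d`, and the sample image are assumptions for illustration; this is not the MultiWave-Net unit itself.

```python
def haar_2d(image):
    """Single-level orthonormal 2D Haar transform of a grayscale image.

    The image is a nested list with even height and width. Returns four
    half-resolution subbands: approximation, column differences,
    row differences, and diagonal detail.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")

    approx, col_diff, row_diff, diag = [], [], [], []
    for i in range(0, rows, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for j in range(0, cols, 2):
            # The four pixels of one 2x2 block.
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            a_row.append((a + b + c + d) / 2.0)  # local average (low-pass)
            c_row.append((a - b + c - d) / 2.0)  # left column minus right column
            r_row.append((a + b - c - d) / 2.0)  # top row minus bottom row
            d_row.append((a - b - c + d) / 2.0)  # diagonal detail
        approx.append(a_row)
        col_diff.append(c_row)
        row_diff.append(r_row)
        diag.append(d_row)
    return approx, col_diff, row_diff, diag


# Hypothetical 4x4 frame; the four 2x2 subbands become four channels.
frame = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
channels = list(haar_2d(frame))
print(len(channels), "channels of size",
      len(channels[0]), "x", len(channels[0][0]))
```

Because the transform is orthonormal, the sum of squared coefficients across the four subbands equals the sum of squared pixel values, so no information is lost by working with the subbands instead of the raw frame.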

23 pages, 9079 KiB  
Article
Deep Learning Performance Characterization on GPUs for Various Quantization Frameworks
by Muhammad Ali Shafique, Arslan Munir and Joonho Kong
AI 2023, 4(4), 926-948; https://doi.org/10.3390/ai4040047 - 18 Oct 2023
Viewed by 2013
Abstract
Deep learning is employed in many applications, such as computer vision, natural language processing, robotics, and recommender systems. Large and complex neural networks lead to high accuracy; however, they adversely affect many aspects of deep learning performance, such as training time, latency, throughput, energy consumption, and memory usage in the training and inference stages. To solve these challenges, various optimization techniques and frameworks have been developed for the efficient performance of deep learning models in the training and inference stages. Although optimization techniques such as quantization have been studied thoroughly in the past, less work has been done to study the performance of frameworks that provide quantization techniques. In this paper, we have used different performance metrics to study the performance of various quantization frameworks, including TensorFlow automatic mixed precision and TensorRT. These performance metrics include training time and memory utilization in the training stage along with latency and throughput for graphics processing units (GPUs) in the inference stage. We have applied the automatic mixed precision (AMP) technique during the training stage using the TensorFlow framework, while for inference we have utilized the TensorRT framework for the post-training quantization technique using the TensorFlow TensorRT (TF-TRT) application programming interface (API). We performed model profiling for different deep learning models, datasets, image sizes, and batch sizes for both the training and inference stages, the results of which can help developers and researchers to devise and deploy efficient deep learning models for GPUs.
(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)
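
The abstract above studies frameworks that apply post-training quantization. As a rough illustration of what quantization does to a tensor, here is a minimal sketch of symmetric per-tensor 8-bit quantization in plain Python. The function names, the scaling rule based on the maximum absolute value, and the sample weights are assumptions for this example; frameworks such as TensorRT choose scales through their own calibration procedures, which this sketch does not reproduce.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127].

    The scale maps the largest absolute value onto 127 (an assumed,
    simple rule; real frameworks typically calibrate the scale).
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map quantized integers back to approximate float values."""
    return [q * scale for q in quantized]


# Hypothetical weight values.
weights = [0.02, -0.5, 0.31, 1.27, -1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print("quantized:", q)
print("largest round-trip error:", max_error)
```

The round-trip error of each value is bounded by half the scale, which is the accuracy cost traded for storing and computing with 8-bit integers instead of 32-bit floats.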

Review


21 pages, 683 KiB  
Review
Few-Shot Fine-Grained Image Classification: A Comprehensive Review
by Jie Ren, Changmiao Li, Yaohui An, Weichuan Zhang and Changming Sun
AI 2024, 5(1), 405-425; https://doi.org/10.3390/ai5010020 - 06 Mar 2024
Viewed by 1324
Abstract
Few-shot fine-grained image classification (FSFGIC) methods refer to the classification of images (e.g., birds, flowers, and airplanes) belonging to different subclasses of the same species by a small number of labeled samples. Through feature representation learning, FSFGIC methods can make better use of limited sample information, learn more discriminative feature representations, greatly improve the classification accuracy and generalization ability, and thus achieve better results in FSFGIC tasks. In this paper, starting from the definition of FSFGIC, a taxonomy of feature representation learning for FSFGIC is proposed. According to this taxonomy, we discuss key issues on FSFGIC (including data augmentation, local and/or global deep feature representation learning, class representation learning, and task-specific feature representation learning). In addition, the existing popular datasets, current challenges and future development trends of feature representation learning on FSFGIC are also described.
(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)
