Deep Learning in Computer Vision: Theory and Applications

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: 30 April 2024 | Viewed by 10292

Special Issue Editors


Prof. Dr. Mahmood Al-khassaweneh
Guest Editor
Engineering, Computing and Mathematical Sciences Department, Lewis University, Romeoville, IL, USA
Interests: machine learning; community structure; graph theory; computer vision; signal processing; data analysis

Dr. Ali Al Bataineh
Guest Editor
Department of Electrical and Computer Engineering, Norwich University, Northfield, VT, USA
Interests: machine learning; deep learning; computer vision; natural language processing; optimization

Prof. Dr. Raymond P. Klump
Guest Editor
Engineering, Computing and Mathematical Sciences Department, Lewis University, Romeoville, IL, USA
Interests: electric power systems; stability of nonlinear networks; numerical methods; programming languages; scientific visualization; object-oriented theory and design; security of information systems

Dr. Esraa Al-sharoa
Guest Editor
1. Electrical Engineering Department, Jordan University of Science and Technology, Ar-Ramtha, Jordan
2. Electrical and Computer Engineering Department, Michigan State University, East Lansing, MI, USA
Interests: community structure; signal processing; community detection; graph theory; principal component analysis

Special Issue Information

Dear Colleagues,

Computer vision applications are now present in almost every aspect of our lives, including medicine, drones, object detection and classification, robotics, self-driving cars, search engines, and many others. Many algorithms and methods have been proposed in the field of computer vision; however, most of them do not perform well on complicated tasks. As a result, in recent years researchers have turned to deep learning to develop methods and models for a variety of fields, including computer vision. Deep learning is preferred over other machine learning algorithms in computer vision applications for several reasons: (1) deep learning can extract features automatically; (2) deep learning models are reusable and generalizable to other applications; and (3) deep learning models outperform other models, particularly in more challenging applications.

The aim of this Special Issue is to present recent advances in the use of deep learning in the field of computer vision. Scholars from various disciplines are encouraged to submit manuscripts to this Special Issue to share their findings.

Prof. Dr. Mahmood Al-khassaweneh
Dr. Ali Al Bataineh
Prof. Dr. Raymond P. Klump
Dr. Esraa Al-sharoa
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • artificial intelligence
  • machine learning
  • computer vision
  • object detection
  • face recognition
  • neural networks
  • convolutional and recurrent neural networks
  • image classification

Published Papers (6 papers)


Research

27 pages, 3333 KiB  
Article
Artistic Style Recognition: Combining Deep and Shallow Neural Networks for Painting Classification
by Saqib Imran, Rizwan Ali Naqvi, Muhammad Sajid, Tauqeer Safdar Malik, Saif Ullah, Syed Atif Moqurrab and Dong Keon Yon
Mathematics 2023, 11(22), 4564; https://doi.org/10.3390/math11224564 - 07 Nov 2023
Viewed by 1540
Abstract
This study’s main goal is to create a useful software application for finding and classifying fine art images in museums and art galleries. As art collections are digitized, there is an increasing need for tools that can swiftly analyze and organize them by artistic style. To increase the accuracy of style categorization, the proposed technique involves two phases. In the first phase, the input image is split into five sub-patches, and a DCNN trained specifically for this task classifies each patch individually. The second phase is a decision-making module based on a shallow neural network, trained on the probability vectors produced by the first-phase classifier; it combines the results from the five patches to deduce the final style classification for the input image. A key advantage of this approach is that it operates on probability vectors rather than images, and the second phase is trained separately from the first, which helps compensate for errors made in the first phase and improves the accuracy of the final classification. To evaluate the proposed method, six pre-trained CNN models, namely AlexNet, VGG-16, VGG-19, GoogLeNet, ResNet-50, and InceptionV3, were employed as first-phase classifiers, with a shallow neural network as the second-phase classifier. Experimental trials were conducted on four representative art datasets: the Australian Native Art dataset, the WikiArt dataset, ILSVRC, and Pandora 18k. The findings show that the proposed strategy clearly surpasses existing methods in style categorization accuracy and precision. Overall, the study contributes to efficient software systems for analyzing and categorizing fine art images, making them more accessible to the general public through digital platforms. Using pre-trained models, we attained an accuracy of 90.7%; with fine-tuning and transfer learning, the model reached a higher accuracy of 96.5%.
(This article belongs to the Special Issue Deep Learning in Computer Vision: Theory and Applications)
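As a rough illustration of the two-phase idea described in the abstract above (not the authors' code), the sketch below splits an image into five patches, collects per-patch class-probability vectors from a pre-trained first-phase CNN, and feeds their concatenation to a shallow decision network. The patch layout, the 18-class assumption (as in Pandora 18k), and all layer sizes are assumptions.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_STYLES = 18          # assumed number of style classes (e.g., Pandora 18k)
PATCH = 224              # input size expected by the first-phase CNN

def five_patches(img):
    """Return four corner crops plus a centre crop, resized to PATCH x PATCH."""
    h, w = img.shape[:2]
    half_h, half_w = h // 2, w // 2
    crops = [img[:half_h, :half_w], img[:half_h, half_w:],
             img[half_h:, :half_w], img[half_h:, half_w:],
             img[h // 4:h // 4 + half_h, w // 4:w // 4 + half_w]]
    return np.stack([tf.image.resize(c, (PATCH, PATCH)).numpy() for c in crops])

# First phase: a pre-trained backbone (ResNet-50 here) with a NUM_STYLES-way softmax head.
backbone = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                          pooling="avg", input_shape=(PATCH, PATCH, 3))
first_phase = models.Sequential([backbone, layers.Dense(NUM_STYLES, activation="softmax")])

# Second phase: a shallow network over the five concatenated probability vectors.
second_phase = models.Sequential([
    layers.Input(shape=(5 * NUM_STYLES,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_STYLES, activation="softmax"),
])

def predict_style(img):
    patches = tf.keras.applications.resnet50.preprocess_input(five_patches(img))
    probs = first_phase.predict(patches, verbose=0)              # (5, NUM_STYLES)
    return second_phase.predict(probs.reshape(1, -1), verbose=0)  # fused prediction
```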

13 pages, 2138 KiB  
Article
A Comparative Evaluation of Self-Attention Mechanism with ConvLSTM Model for Global Aerosol Time Series Forecasting
by Dušan S. Radivojević, Ivan M. Lazović, Nikola S. Mirkov, Uzahir R. Ramadani and Dušan P. Nikezić
Mathematics 2023, 11(7), 1744; https://doi.org/10.3390/math11071744 - 05 Apr 2023
Cited by 2 | Viewed by 1290
Abstract
The attention mechanism in natural language processing and the self-attention mechanism in vision transformers have improved many deep learning models. Here, the self-attention mechanism was combined with a previously developed ConvLSTM sequence-to-one model in order to make a comparative evaluation with statistical testing. First, a new ConvLSTM sequence-to-one model with a self-attention mechanism was developed, and then the self-attention layer was removed to enable the comparison. Hyperparameters were optimized by grid search for integer- and string-type parameters and by particle swarm optimization for float-type parameters. A cross-validation technique with a predefined train-validation-test ratio was used for more reliable model evaluation. Both models, with and without the self-attention layer, passed the defined evaluation criteria, meaning that they are able to generate the image of global aerosol thickness and to capture patterns of change in the time domain. The model obtained by an ablation study on the self-attention layer achieved better Root Mean Square Error and Euclidean Distance than the developed ConvLSTM-SA model. As part of the statistical testing, a Kruskal–Wallis H test was performed, since the data were found not to follow a normal distribution; the results showed that both models, with and without the SA layer, predict images whose pixel-level patterns are similar to the original dataset. However, the model without the SA layer was more similar to the original dataset, especially in the time domain at the pixel level. Based on the comparative evaluation with statistical testing, it was concluded that the developed ConvLSTM model predicts better without an SA layer.
(This article belongs to the Special Issue Deep Learning in Computer Vision: Theory and Applications)
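For readers unfamiliar with this architecture class, below is a minimal Keras sketch of a ConvLSTM sequence-to-one model with a self-attention layer over the time axis. It only illustrates the general pattern, not the authors' model; the frame dimensions, filter counts, head counts, and output activation are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Assumed dimensions: 8 past frames of a 64x128 single-channel aerosol map.
T, H, W, C = 8, 64, 128, 1
inputs = layers.Input(shape=(T, H, W, C))

# ConvLSTM encodes the spatio-temporal sequence; the time axis is kept so
# that self-attention can re-weight individual time steps.
x = layers.ConvLSTM2D(16, kernel_size=3, padding="same", return_sequences=True)(inputs)

# Treat each encoded frame as one token and apply self-attention over time.
tokens = layers.Reshape((T, H * W * 16))(x)
attended = layers.MultiHeadAttention(num_heads=2, key_dim=32)(tokens, tokens)
x = layers.Reshape((T, H, W, 16))(attended)

# Sequence-to-one: a final ConvLSTM collapses the time axis, and a 1x1
# convolution maps the result to a single predicted aerosol-thickness frame.
x = layers.ConvLSTM2D(16, kernel_size=3, padding="same", return_sequences=False)(x)
outputs = layers.Conv2D(1, kernel_size=1, activation="sigmoid")(x)

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse",
              metrics=[tf.keras.metrics.RootMeanSquaredError()])
```

Removing the self-attention block (the Reshape/MultiHeadAttention/Reshape lines) yields the ablated variant that the abstract compares against.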

13 pages, 18485 KiB  
Article
Hybrid Attention Asynchronous Cascade Network for Salient Object Detection
by Haiyan Yang, Yongxin Chen, Rui Chen and Shuning Liu
Mathematics 2023, 11(6), 1389; https://doi.org/10.3390/math11061389 - 13 Mar 2023
Viewed by 935
Abstract
The highlighted area or object is referred to as the salient region or salient object. For salient object detection, the main challenges remain the clarity of the boundary of the salient object and the accuracy of its localization against complex backgrounds, such as noise and occlusion. To remedy these issues, an asynchronous cascade saliency detection algorithm based on a deep network is proposed, embedded in an encoder–decoder architecture. Moreover, a lightweight hybrid attention module is designed to obtain explicit boundaries of salient regions. To effectively improve the localization of salient objects, this paper adopts a bi-directional asynchronous cascade fusion strategy, which generates prediction maps with higher accuracy. Experimental results on five benchmark datasets show that the proposed network, HACNet, is on a par with the state of the art on image saliency datasets.
(This article belongs to the Special Issue Deep Learning in Computer Vision: Theory and Applications)
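The "lightweight hybrid attention module" is not specified in the abstract; as a point of reference, the sketch below shows a common hybrid (channel-then-spatial) attention block in Keras, in the spirit of CBAM. It is a generic illustration, not the module used in HACNet, and the reduction ratio and kernel size are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def hybrid_attention(x, reduction=8):
    """Generic channel-then-spatial attention block (CBAM-style)."""
    c = x.shape[-1]

    # Channel attention: squeeze the spatial dimensions, re-weight channels.
    w = layers.GlobalAveragePooling2D()(x)
    w = layers.Dense(c // reduction, activation="relu")(w)
    w = layers.Dense(c, activation="sigmoid")(w)
    x = layers.Multiply()([x, layers.Reshape((1, 1, c))(w)])

    # Spatial attention: pool across channels and learn a 2-D saliency mask.
    avg_pool = layers.Lambda(lambda t: tf.reduce_mean(t, axis=-1, keepdims=True))(x)
    max_pool = layers.Lambda(lambda t: tf.reduce_max(t, axis=-1, keepdims=True))(x)
    mask = layers.Conv2D(1, kernel_size=7, padding="same", activation="sigmoid")(
        layers.Concatenate()([avg_pool, max_pool]))
    return layers.Multiply()([x, mask])

# Usage inside an encoder-decoder: refine a feature map before fusion.
feat = layers.Input(shape=(64, 64, 32))
refined = hybrid_attention(feat)
model = tf.keras.Model(feat, refined)
```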

15 pages, 2186 KiB  
Article
Application of Optimized ORB Algorithm in Design AR Augmented Reality Technology Based on Visualization
by Hai’an Yan, Jian Wang and Peng Zhang
Mathematics 2023, 11(6), 1278; https://doi.org/10.3390/math11061278 - 07 Mar 2023
Cited by 2 | Viewed by 1176
Abstract
Media digitization and digital art are now far more capable than in earlier applications. Drawing on these advanced information display methods and technologies, this paper proposes a digital museum built by integrating digital media art with AR technology, which helps to analyze and address the problems of ecological imbalance and single-function systems in current museums. Based on the principles of augmented reality technology, the museum guide system is optimized. In the system evaluation experiment, cultural relics made of six kinds of materials are first used as target images for feature extraction and recognition. The recognition performance of three feature algorithms is compared: Binary Robust Invariant Scalable Keypoints (BRISK), Oriented FAST and Rotated BRIEF (ORB), and Accelerated-KAZE (AKAZE). Among them, the ORB algorithm is superior to the others in feature richness and recognition speed but inferior to the other two in recognition accuracy. Therefore, this paper optimizes the ORB algorithm based on its characteristics: ORB must calculate the orientation of the feature points before constructing the feature descriptor. After parameter optimization, the improved ORB algorithm not only retains its advantages in feature richness and recognition time but also improves recognition accuracy to 98.3%, which is 16% higher than the traditional ORB algorithm. The application prospects of AR technology in digital media design are therefore very promising.
(This article belongs to the Special Issue Deep Learning in Computer Vision: Theory and Applications)
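As background on the ORB pipeline discussed above, here is a minimal OpenCV sketch of ORB feature detection and Hamming-distance matching between a reference image and a camera frame. The file paths are placeholders, and the parameter values are illustrative, not those tuned in the paper.

```python
import cv2

# Load a reference image of the exhibit and a camera frame (paths are placeholders).
target = cv2.imread("exhibit_target.jpg", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("camera_frame.jpg", cv2.IMREAD_GRAYSCALE)

# ORB detects oriented FAST keypoints and computes rotated BRIEF descriptors.
# nfeatures, scaleFactor, and nlevels are the kind of parameters one would tune.
orb = cv2.ORB_create(nfeatures=1000, scaleFactor=1.2, nlevels=8)
kp1, des1 = orb.detectAndCompute(target, None)
kp2, des2 = orb.detectAndCompute(frame, None)

# Binary descriptors are matched with Hamming distance; cross-checking filters
# out asymmetric matches, and sorting keeps the most reliable ones first.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

print(f"{len(matches)} matches; best distance = {matches[0].distance}")
```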

17 pages, 774 KiB  
Article
Automated CNN Architectural Design: A Simple and Efficient Methodology for Computer Vision Tasks
by Ali Al Bataineh, Devinder Kaur, Mahmood Al-khassaweneh and Esraa Al-sharoa
Mathematics 2023, 11(5), 1141; https://doi.org/10.3390/math11051141 - 24 Feb 2023
Cited by 5 | Viewed by 2531
Abstract
Convolutional neural networks (CNNs) have transformed the field of computer vision by enabling the automatic extraction of features, obviating the need for manual feature engineering. Despite their success, identifying an optimal architecture for a particular task can be a time-consuming and challenging process due to the vast space of possible network designs. To address this, we propose a novel neural architecture search (NAS) framework that utilizes the clonal selection algorithm (CSA) to automatically design high-quality CNN architectures for image classification problems. Our approach uses an integer vector representation to encode CNN architectures and hyperparameters, combined with a truncated Gaussian mutation scheme that enables efficient exploration of the search space. We evaluated the proposed method on six challenging EMNIST benchmark datasets for handwritten character recognition, and our results demonstrate that it outperforms nearly all existing approaches. In addition, our approach achieves state-of-the-art performance with fewer trainable parameters than other methods, making it low-cost, simple, and reusable across multiple datasets.
(This article belongs to the Special Issue Deep Learning in Computer Vision: Theory and Applications)
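To make the encoding idea above concrete, here is a small NumPy sketch of an integer-vector genome with a truncated Gaussian mutation, in the spirit of the clonal selection search described. The gene layout, ranges, and mutation scale are invented for illustration and are not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical genome: each position encodes one architectural choice as an
# integer drawn from its own closed range.
GENE_RANGES = np.array([[1, 5],    # number of convolutional blocks
                        [4, 8],    # filters per block = 2**gene
                        [3, 7],    # square kernel size
                        [5, 10],   # dense units = 2**gene
                        [0, 5]])   # dropout rate = gene / 10

def random_genome():
    return rng.integers(GENE_RANGES[:, 0], GENE_RANGES[:, 1] + 1)

def truncated_gaussian_mutation(genome, sigma=1.0):
    """Perturb each gene with Gaussian noise, round to an integer, and
    truncate the result back into the gene's valid range."""
    noisy = genome + rng.normal(0.0, sigma, size=genome.shape)
    return np.clip(np.rint(noisy), GENE_RANGES[:, 0], GENE_RANGES[:, 1]).astype(int)

parent = random_genome()
clones = [truncated_gaussian_mutation(parent) for _ in range(5)]
print(parent, clones[0])
```

In a clonal-selection loop, the best genomes would be cloned, mutated this way, decoded into CNNs, and re-ranked by validation accuracy.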

22 pages, 5291 KiB  
Article
Super-Resolution Reconstruction-Based Plant Image Classification Using Thermal and Visible-Light Images
by Ganbayar Batchuluun, Se Hyun Nam, Chanhum Park and Kang Ryoung Park
Mathematics 2023, 11(1), 76; https://doi.org/10.3390/math11010076 - 25 Dec 2022
Cited by 4 | Viewed by 1324
Abstract
Few studies have been conducted on thermal plant images. This is because of the difficulty of extracting and analyzing color-related patterns and features from plant images obtained with a thermal camera, which does not provide color information; in addition, thermal cameras are sensitive to the surrounding temperature and humidity. However, a thermal camera enables the extraction of patterns that are invisible to the eye by providing external and internal heat information. Therefore, this study proposes a novel plant classification method based on both thermal and visible-light plant images to exploit the strengths of both types of camera. To the best of our knowledge, this study is the first to perform super-resolution reconstruction using visible-light and thermal plant images. Furthermore, a method to improve classification performance through generative adversarial network (GAN)-based super-resolution reconstruction is proposed. In experiments on a self-collected dataset of thermal and visible-light images, our method shows higher accuracy than state-of-the-art methods.
(This article belongs to the Special Issue Deep Learning in Computer Vision: Theory and Applications)
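As a point of reference for the GAN-based super-resolution step mentioned above, the sketch below shows a minimal Keras generator that upscales a low-resolution image by 4x with sub-pixel (depth-to-space) layers, SRGAN-style. The discriminator and losses are omitted, and all layer sizes are assumptions that do not reflect the authors' network.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def sr_generator(scale=4, channels=3, filters=32, num_blocks=4):
    """Tiny super-resolution generator: residual conv blocks + sub-pixel upsampling."""
    lr = layers.Input(shape=(None, None, channels))
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(lr)
    skip = x

    # Simple residual blocks extract features at low resolution.
    for _ in range(num_blocks):
        r = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        r = layers.Conv2D(filters, 3, padding="same")(r)
        x = layers.Add()([x, r])
    x = layers.Add()([x, skip])

    # Sub-pixel upsampling: each depth_to_space step doubles the resolution.
    for _ in range(scale // 2):
        x = layers.Conv2D(filters * 4, 3, padding="same")(x)
        x = layers.Lambda(lambda t: tf.nn.depth_to_space(t, 2))(x)

    sr = layers.Conv2D(channels, 3, padding="same", activation="sigmoid")(x)
    return Model(lr, sr)

generator = sr_generator()
# In a GAN setting this generator would be trained against a discriminator
# using an adversarial loss plus a pixel/content reconstruction loss.
```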
