Deep Learning and Machine Learning in Image Processing and Pattern Recognition

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 30 July 2024

Special Issue Editors


Prof. Dr. Haitao Zhao
Guest Editor
Automation Department, School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Interests: neural networks; machine learning; information fusion; deep learning

Dr. Meng Wang
Guest Editor
Automation Department, School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Interests: control theory; fuzzy systems; complex systems; robot control systems

Special Issue Information

Dear Colleagues,

With the swift advancement of science and technology, pattern recognition and image processing have grown in importance within the field of artificial intelligence, and the area has developed rapidly in recent years thanks to the growing use of machine learning and deep learning. The goal of this Special Issue is to examine the most recent developments in, and potential directions for, machine learning and deep learning in pattern recognition and image processing.

Prof. Dr. Haitao Zhao
Dr. Meng Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image processing
  • machine learning
  • pattern recognition
  • neural network
  • artificial intelligence

Published Papers (5 papers)


Research

19 pages, 8343 KiB  
Article
Precision-Boosted Forest Fire Target Detection via Enhanced YOLOv8 Model
by Zhaoxu Yang, Yifan Shao, Ye Wei and Jun Li
Appl. Sci. 2024, 14(6), 2413; https://doi.org/10.3390/app14062413 - 13 Mar 2024
Abstract
Forest fires present a significant challenge to ecosystems, particularly due to factors like tree cover that complicate fire detection tasks. While fire detection technologies, like YOLO, are widely used in forest protection, capturing diverse and complex flame features remains challenging. Therefore, we propose an enhanced YOLOv8 multiscale forest fire detection method. This involves adjusting the network structure and integrating Deformable Convolution and SCConv modules to better adapt to forest fire complexities. Additionally, we introduce the Coordinate Attention mechanism in the Detection module to more effectively capture feature information and enhance model accuracy. We adopt the WIoU v3 loss function and implement a dynamically non-monotonic mechanism to optimize gradient allocation strategies. Our experimental results demonstrate that our model achieves an mAP of 90.02%, approximately 5.9% higher than the baseline YOLOv8 network. This method significantly improves forest fire detection accuracy, reduces false positive rates, and demonstrates excellent applicability in real forest fire scenarios.
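
To make the attention component concrete, the following is a minimal PyTorch sketch of a coordinate attention block of the kind integrated into the detection module; the reduction ratio, activation, and layer widths are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Minimal coordinate attention block (illustrative sketch)."""
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # pool over width  -> (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # pool over height -> (B, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        x_h = self.pool_h(x)                        # (B, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)    # (B, C, W, 1)
        y = torch.cat([x_h, x_w], dim=2)            # joint encoding along both axes
        y = self.act(self.bn1(self.conv1(y)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                       # (B, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))   # (B, C, 1, W)
        return x * a_h * a_w

x = torch.randn(2, 64, 40, 40)
print(CoordinateAttention(64)(x).shape)  # torch.Size([2, 64, 40, 40])
```

The block replaces a single global pooling with two directional poolings, so the attention weights keep positional information along height and width, which is what makes this kind of module attractive for localizing small or partially occluded flame regions.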

23 pages, 4888 KiB  
Article
SFFNet: Staged Feature Fusion Network of Connecting Convolutional Neural Networks and Graph Convolutional Neural Networks for Hyperspectral Image Classification
by Hao Li, Xiaorui Xiong, Chaoxian Liu, Yong Ma, Shan Zeng and Yaqin Li
Appl. Sci. 2024, 14(6), 2327; https://doi.org/10.3390/app14062327 - 10 Mar 2024
Abstract
The immense representation power of deep learning frameworks has kept them in the spotlight in hyperspectral image (HSI) classification. Graph Convolutional Neural Networks (GCNs) can be used to compensate for the lack of spatial information in Convolutional Neural Networks (CNNs). However, most GCNs construct graph data structures based on pixel points, which requires building neighborhood matrices over all of the data. Meanwhile, the usual GCN practice of constructing similarity relations from spatial structure is not fully suited to HSIs. To make the network more compatible with HSIs, we propose a staged feature fusion model called SFFNet, a neural network framework connecting CNN and GCN models. The CNN performs the first stage of feature extraction, assisted by adding neighboring features and overcoming the defects of local convolution; then, the GCN performs the second stage for classification, and the graph data structure is constructed based on spectral similarity, optimizing the original connectivity relationships. In addition, the framework enables the batch training of the GCN by using the extracted spectral features as nodes, which greatly reduces the hardware requirements. The experimental results on three publicly available benchmark hyperspectral datasets show that our proposed framework outperforms other relevant deep learning models, with an overall classification accuracy of over 97%.
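
To picture the two-stage design, here is a highly simplified PyTorch sketch: a small CNN extracts per-pixel spectral features, and a graph-convolution head classifies them over a graph built from spectral-feature similarity rather than spatial adjacency. The layer widths, similarity threshold, and band/class counts are illustrative assumptions, not the SFFNet architecture itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNEncoder(nn.Module):
    """Stage 1: per-pixel spectral feature extraction from small HSI patches."""
    def __init__(self, bands: int, feat_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(bands, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, feat_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
    def forward(self, patches):              # (N, bands, p, p)
        return self.net(patches).flatten(1)  # (N, feat_dim)

class GCNHead(nn.Module):
    """Stage 2: graph convolution over a batch of extracted feature nodes."""
    def __init__(self, feat_dim: int, n_classes: int):
        super().__init__()
        self.w1 = nn.Linear(feat_dim, feat_dim)
        self.w2 = nn.Linear(feat_dim, n_classes)
    def forward(self, feats):
        # Graph built from spectral-feature similarity (self-loops included,
        # since cosine similarity of a vector with itself is 1).
        sim = F.cosine_similarity(feats.unsqueeze(1), feats.unsqueeze(0), dim=-1)
        adj = (sim > 0.5).float()
        d_inv_sqrt = adj.sum(1).clamp(min=1e-6).pow(-0.5)
        a_hat = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]  # normalized adjacency
        h = F.relu(a_hat @ self.w1(feats))
        return a_hat @ self.w2(h)  # class logits per node

patches = torch.randn(32, 103, 9, 9)   # a mini-batch of 32 pixels, 103 bands, 9x9 neighborhood
logits = GCNHead(64, 9)(CNNEncoder(103)(patches))
print(logits.shape)  # torch.Size([32, 9])
```

Because each node is an already-extracted feature vector rather than a raw pixel of the full image graph, the GCN stage can be trained in mini-batches, which is what keeps the hardware requirements modest.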

18 pages, 7048 KiB  
Article
U-ETMVSNet: Uncertainty-Epipolar Transformer Multi-View Stereo Network for Object Stereo Reconstruction
by Ning Zhao, Heng Wang, Quanlong Cui and Lan Wu
Appl. Sci. 2024, 14(6), 2223; https://doi.org/10.3390/app14062223 - 7 Mar 2024
Abstract
The Multi-View Stereo model (MVS), which utilizes 2D images from multiple perspectives for 3D reconstruction, is a crucial technique in the field of 3D vision. To address the poor correlation between 2D features and 3D space in existing MVS models, as well as the high sampling rate required for static sampling, we propose U-ETMVSNet in this paper. Initially, we employ an integrated epipolar transformer module (ET) to establish 3D spatial correlations along epipolar lines, thereby enhancing the reliability of aggregated cost volumes. Subsequently, we devise a sampling module based on probability volume uncertainty to dynamically adjust the depth sampling range for the next stage. Finally, we utilize a multi-stage joint learning method based on multi-depth value classification to evaluate and optimize the model. Experimental results demonstrate that on the DTU dataset, our method achieves a relative performance improvement of 27.01% and 11.27% in terms of completeness error and overall error, respectively, compared to CasMVSNet, even at lower depth sampling rates. Moreover, our method exhibits excellent performance with a score of 58.60 on the Tanks & Temples dataset, highlighting its robustness and generalization capability.
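
The uncertainty-driven sampling step can be illustrated with a short sketch: from a probability volume over depth hypotheses, compute the per-pixel expected depth and its standard deviation, then use them to bound the next, finer sampling stage. The tensor shapes and the width factor k below are illustrative assumptions, not the paper's exact module.

```python
import torch

def next_depth_range(prob_volume: torch.Tensor, depth_values: torch.Tensor, k: float = 1.5):
    """Shrink the per-pixel depth search range using probability-volume uncertainty.

    prob_volume:  (B, D, H, W) softmax probabilities over D depth hypotheses
    depth_values: (D,) depth hypotheses of the current stage
    Returns per-pixel (min, max) bounds for the next, finer sampling stage.
    """
    d = depth_values.view(1, -1, 1, 1)
    mean = (prob_volume * d).sum(dim=1)                            # expected depth, (B, H, W)
    var = (prob_volume * (d - mean.unsqueeze(1)) ** 2).sum(dim=1)  # spread of the distribution
    std = var.clamp(min=1e-8).sqrt()                               # per-pixel uncertainty
    return mean - k * std, mean + k * std                          # narrower range where confident

probs = torch.softmax(torch.randn(1, 48, 64, 80), dim=1)   # dummy probability volume
depths = torch.linspace(425.0, 935.0, 48)                   # dummy hypothesis range
lo, hi = next_depth_range(probs, depths)
print(lo.shape, hi.shape)  # torch.Size([1, 64, 80]) torch.Size([1, 64, 80])
```

Confident pixels get a tight interval and therefore need few hypotheses at the next stage, which is how the overall depth sampling rate can be reduced without sacrificing completeness.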

20 pages, 7918 KiB  
Article
Lightweight Non-Destructive Detection of Diseased Apples Based on Structural Re-Parameterization Technique
by Bo Han, Ziao Lu, Luan Dong and Jingjing Zhang
Appl. Sci. 2024, 14(5), 1907; https://doi.org/10.3390/app14051907 - 26 Feb 2024
Abstract
This study addresses the challenges in the non-destructive detection of diseased apples, specifically the high complexity and poor real-time performance of the classification model for detecting diseased fruits in apple grading. Research is conducted on a lightweight model for apple defect recognition, and an improved VEW-YOLOv8n method is proposed. The backbone network incorporates a lightweight, re-parameterized VanillaC2f module, reducing both complexity and the number of parameters, and it employs an extended activation function to enhance the model's nonlinear expression capability. In the neck network, an Efficient-Neck lightweight structure, developed using the lightweight modules and augmented with a channel shuffling strategy, decreases the computational load while ensuring comprehensive feature information fusion. The model's robustness and generalization ability are further enhanced by employing the WIoU bounding box loss function, evaluating the quality of anchor frames using outlier metrics, and incorporating a dynamically updated gradient gain assignment strategy. Experimental results indicate that the improved model surpasses the YOLOv8n model, achieving a 2.7% increase in average accuracy, a 24.3% reduction in parameters, a 28.0% decrease in computational volume, and an 8.5% improvement in inference speed. This technology offers a novel, effective method for the non-destructive detection of diseased fruits in apple grading procedures.
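
The core of structural re-parameterization, training with auxiliary structure and then collapsing it into a plain convolution for deployment, can be sketched with the standard conv-BN fusion below; this is a generic illustration of the technique, not the VanillaC2f module itself.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold a BatchNorm layer into the preceding convolution for inference.

    Training keeps the extra structure (here just BN); deployment uses one plain conv.
    """
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      stride=conv.stride, padding=conv.padding, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)        # per-channel BN scale
    fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
    bias = conv.bias if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.copy_((bias - bn.running_mean) * scale + bn.bias)
    return fused

conv = nn.Conv2d(16, 32, 3, padding=1, bias=False)
bn = nn.BatchNorm2d(32).eval()            # inference mode: use running statistics
x = torch.randn(1, 16, 8, 8)
y_train_time = bn(conv(x))
y_deployed = fuse_conv_bn(conv, bn)(x)
print(torch.allclose(y_train_time, y_deployed, atol=1e-5))  # True
```

At deployment every such training-time branch collapses into a single convolution, which is where the reported parameter, computation, and latency savings come from.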

20 pages, 4385 KiB  
Article
A Multi-Task Learning and Knowledge Selection Strategy for Environment-Induced Color-Distorted Image Restoration
by Yuan Ding and Kaijun Wu
Appl. Sci. 2024, 14(5), 1836; https://doi.org/10.3390/app14051836 - 23 Feb 2024
Abstract
Existing methods for restoring color-distorted images in specific environments typically focus on a single type of distortion, making it challenging to generalize them across different kinds of color-distorted images. Leveraging the intrinsic connections between different types of color-distorted images and coordinating their interactions during model training can simultaneously enhance generalization, mitigate overfitting and underfitting during data fitting, and consequently boost performance. In this paper, our approach primarily addresses three distinct types of color-distorted images, namely dust-laden images, hazy images, and underwater images. By thoroughly exploiting the unique characteristics and interrelationships of these types, we achieve multi-task processing. Within this endeavor, identifying appropriate correlations is pivotal. To this end, we propose a knowledge selection and allocation strategy that optimally distributes the features and correlations acquired by the network from the images to different tasks, enabling a more refined task differentiation. Moreover, given the difficulty of obtaining paired datasets, we employ unsupervised learning techniques and introduce novel Transformer blocks, feedforward networks, and hybrid modules to enhance context relevance. Through extensive experimentation, we demonstrate that our proposed method significantly enhances the performance of color-distorted image restoration.
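
A minimal way to picture the knowledge selection and allocation idea is a shared encoder whose features are routed through task-specific channel gates, one per distortion type, as in the PyTorch sketch below; the module sizes and gating form are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Features shared across all restoration tasks."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
        )
    def forward(self, x):
        return self.body(x)

class TaskGate(nn.Module):
    """Per-task 'knowledge selection': channel-wise gates over shared features."""
    def __init__(self, ch: int):
        super().__init__()
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(ch, ch, 1), nn.Sigmoid())
        self.decoder = nn.Conv2d(ch, 3, 3, padding=1)
    def forward(self, feats):
        return self.decoder(feats * self.gate(feats))  # select, then restore

tasks = ["dust", "haze", "underwater"]
encoder = SharedEncoder()
heads = nn.ModuleDict({t: TaskGate(32) for t in tasks})

x = torch.randn(2, 3, 64, 64)           # a batch of distorted inputs
restored = heads["haze"](encoder(x))    # route through the haze-specific gate
print(restored.shape)                   # torch.Size([2, 3, 64, 64])
```

Sharing the encoder lets the tasks reinforce each other, while the per-task gates decide which of the shared features each distortion type actually uses.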
