Deep Learning Architectures for Computer Vision

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 September 2023) | Viewed by 13979

Special Issue Editor


Guest Editor
Department of Informatics and Telecommunications, University of Thessaly, 15341 Lamia, Greece
Interests: semantic multimedia analysis; indexing and retrieval; low-level feature extraction and modeling; visual context modeling; multimedia content representation; neural networks; intelligent systems; biomedical image analysis; social generated data analysis and the Internet of Things

Special Issue Information

Dear Colleagues,

During the last several years, deep learning neural network architectures have been shown to outperform traditional machine learning approaches in a plethora of tasks. One of the most important research areas to benefit from these architectures is computer vision. With the continuous growth in the size and number of visual datasets, deep models can be trained and applied to a wide range of tasks, in some cases even surpassing human-level performance.

The goal of this Special Issue is to highlight the latest developments in the broader area of deep learning, focusing on computer vision applications that are based on deep learning neural network architectures such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and generative adversarial networks (GANs).

Topics of interest to this Special Issue include but are not limited to:

  • Scene understanding;
  • 3D visual perception;
  • Human analysis and modeling;
  • Feature learning and representation;
  • Image/video understanding;
  • Video summarization;
  • Remote sensing image analysis;
  • Object detection and tracking;
  • Image processing;
  • Image segmentation;
  • Medical image/video analysis;
  • Affective computing;
  • Industrial applications.

Dr. Evaggelos Spyrou
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website; registered users can proceed directly to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • computer vision
  • scene understanding
  • visual perception
  • human analysis and modeling
  • image/video processing and analysis
  • object detection/tracking

Published Papers (8 papers)


Research

20 pages, 16585 KiB  
Article
An Improved YOLO Model for UAV Fuzzy Small Target Image Detection
by Yanlong Chang, Dong Li, Yunlong Gao, Yun Su and Xiaoqiang Jia
Appl. Sci. 2023, 13(9), 5409; https://doi.org/10.3390/app13095409 - 26 Apr 2023
Cited by 7 | Viewed by 1637
Abstract
High-altitude UAV photography presents several challenges, including blurry images, low image resolution, and small targets, which can cause low detection performance of existing object detection algorithms. Therefore, this study proposes an improved small-object detection algorithm based on the YOLOv5s computer vision model. First, the original convolution in the network framework was replaced with the SPD-Convolution module to eliminate the impact of pooling operations on feature information and to enhance the model’s capability to extract features from low-resolution and small targets. Second, a coordinate attention mechanism was added after the convolution operation to improve model detection accuracy with small targets under image blurring. Third, the nearest-neighbor interpolation in the original network upsampling was replaced with transposed convolution to increase the receptive field range of the neck and reduce detail loss. Finally, the CIoU loss function was replaced with the Alpha-IoU loss function to solve the problem of the slow convergence of gradients during training on small target images. Using the images of Artemisia salina, taken in Hunshandake sandy land in China, as a dataset, the experimental results demonstrated that the proposed algorithm provides significantly improved results (average precision = 80.17%, accuracy = 73.45%, and recall rate = 76.97%, i.e., improvements of 14.96%, 6.24%, and 7.21%, respectively, compared with the original model) and also outperforms other detection algorithms. The detection of small objects and blurry images has been significantly improved.
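As a minimal sketch of the Alpha-IoU idea described above (replacing the loss 1 − IoU with 1 − IoU^α), here is a plain-Python illustration; the (x1, y1, x2, y2) box format and the α value are assumptions for illustration, not the authors' exact implementation:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def alpha_iou_loss(box_a, box_b, alpha=3.0):
    """Power generalization of the IoU loss: 1 - IoU**alpha.
    alpha=1 recovers the plain IoU loss; alpha > 1 up-weights the
    gradient contribution of boxes that already overlap well."""
    return 1.0 - iou(box_a, box_b) ** alpha
```

Because IoU lies in [0, 1], raising it to a power α > 1 makes the loss steeper for high-overlap boxes, which is the convergence effect the abstract refers to.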
(This article belongs to the Special Issue Deep Learning Architectures for Computer Vision)

22 pages, 4130 KiB  
Article
Multiplicative Vector Fusion Model for Detecting Deepfake News in Social Media
by Yalamanchili Salini and Jonnadula Harikiran
Appl. Sci. 2023, 13(7), 4207; https://doi.org/10.3390/app13074207 - 26 Mar 2023
Cited by 3 | Viewed by 1225
Abstract
In the digital age, social media platforms are becoming vital tools for generating and detecting deepfake news due to the rapid dissemination of information. Unfortunately, fake news is now being produced at an accelerating rate, which raises substantial challenges: detecting fake news early, coping with the lack of labelled data available for training, and identifying fake news instances that have not yet been discovered. Identifying false news requires an in-depth understanding of authors, entities, and the connections between words in a long text. Many deep learning (DL) techniques have proven ineffective at addressing these issues with lengthy texts. This paper proposes a TL-MVF model based on transfer learning for detecting and generating deepfake news in social media. To generate the sentences, the T5, or Text-to-Text Transfer Transformer model, was employed for data cleaning and feature extraction. In the next step, we designed an optimal hyperparameter RoBERTa model for effectively detecting fake and real news. Finally, we propose a multiplicative vector fusion model for classifying fake news from real news efficiently. A real-time and benchmarked dataset was used to test and validate the proposed TL-MVF model. The TL-MVF model was evaluated using F-score, accuracy, precision, recall, and AUC as performance measures. The proposed TL-MVF performed better than existing benchmarks.
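As a toy illustration of the multiplicative-fusion step, assume two equal-length feature vectors (in the paper these would come from the transformer branches) fused by element-wise product and passed to a logistic scorer; the scorer and its weights below are hypothetical:

```python
import math

def multiplicative_fusion(vec_a, vec_b):
    """Fuse two equal-length feature vectors by element-wise multiplication."""
    assert len(vec_a) == len(vec_b), "feature vectors must align"
    return [a * b for a, b in zip(vec_a, vec_b)]

def fake_news_score(fused, weights, bias=0.0):
    """Logistic score over the fused vector: probability the item is fake."""
    z = sum(f * w for f, w in zip(fused, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

Multiplicative fusion keeps only features that are jointly active in both representations, which is one motivation for preferring a product over concatenation.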

14 pages, 2533 KiB  
Article
Lightweight Micro-Expression Recognition on Composite Database
by Nur Aishah Ab Razak and Shahnorbanun Sahran
Appl. Sci. 2023, 13(3), 1846; https://doi.org/10.3390/app13031846 - 31 Jan 2023
Viewed by 1297
Abstract
The potential of leveraging micro-expression in various areas such as security, health care and education has intensified interest in this area. Unlike facial expression, micro-expression is subtle and occurs rapidly, making it difficult to perceive. Micro-expression recognition (MER) on a composite dataset following the Micro-Expression Grand Challenge 2019 protocol is an ongoing research area, with challenges stemming from the demographic variety of the samples as well as the small and imbalanced dataset. However, most MER approaches today are complex and require computationally expensive pre-processing, yet deliver only average performance. This work demonstrates how transfer learning from a larger and more varied macro-expression database (FER 2013) in a lightweight deep learning network, before fine-tuning on the composite dataset, can achieve high MER performance using only static images as input. The imbalanced dataset problem is redefined as an algorithm tuning problem instead of a data engineering and generation problem, to lighten the pre-processing steps. The proposed MER model is developed from a truncated EfficientNet-B0 model consisting of 15 layers with only 867k parameters. A simple algorithm tuning that manipulates the loss function to place more importance on minority classes is suggested to deal with the imbalanced dataset. Experimental results using Leave-One-Subject-Out cross-validation on the composite dataset show a substantial performance increase compared to the state-of-the-art models.
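The "algorithm tuning" idea above, up-weighting minority classes in the loss rather than resampling the data, can be sketched with inverse-frequency class weights; the paper's exact weighting scheme is not given here, so this is one common choice:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class loss weights inversely proportional to class frequency,
    normalized so that the weights sum to the number of classes."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    raw = {cls: n / cnt for cls, cnt in counts.items()}  # rarer class -> larger weight
    total = sum(raw.values())
    return {cls: k * w / total for cls, w in raw.items()}
```

These weights would then scale each sample's cross-entropy term, so errors on rare classes cost more during training.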

18 pages, 4569 KiB  
Article
Unsupervised Semantic Segmentation Inpainting Network Using a Generative Adversarial Network with Preprocessing
by Woo-Jin Ahn, Dong-Won Kim, Tae-Koo Kang, Dong-Sung Pae and Myo-Taeg Lim
Appl. Sci. 2023, 13(2), 781; https://doi.org/10.3390/app13020781 - 05 Jan 2023
Viewed by 1766
Abstract
The generative adversarial neural network has produced remarkable results in the image generation area. However, applying it to a semantic segmentation inpainting task is unstable due to the difference in data distribution. To solve this problem, we propose an unsupervised semantic segmentation inpainting method using an adversarial deep neural network with a newly introduced preprocessing method and loss function. To stabilize the adversarial training for semantic segmentation inpainting, we match the probability distribution of the segmentation maps with the developed preprocessing method. In addition, a new cross-entropy total variation loss for the probability map is introduced to improve segmentation inpainting by smoothing the segmentation map. The experimental results demonstrate the proposed algorithm’s effectiveness on both synthetic and real datasets.
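The smoothing effect of a total-variation term on a probability map can be illustrated with a minimal anisotropic TV computation; the paper combines this with a cross-entropy term, which is omitted in this sketch:

```python
def total_variation(prob_map):
    """Anisotropic total variation of a 2-D probability map: the sum of
    absolute differences between horizontal and vertical neighbours.
    Smooth maps score low; noisy, fragmented maps score high."""
    h, w = len(prob_map), len(prob_map[0])
    tv = 0.0
    for i in range(h):
        for j in range(w):
            if j + 1 < w:
                tv += abs(prob_map[i][j] - prob_map[i][j + 1])
            if i + 1 < h:
                tv += abs(prob_map[i][j] - prob_map[i + 1][j])
    return tv
```

Minimizing such a term penalizes isolated mislabelled pixels inside otherwise uniform segments, which is why it acts as a smoother.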

26 pages, 3258 KiB  
Article
Depth-Adaptive Deep Neural Network Based on Learning Layer Relevance Weights
by Arwa Alturki, Ouiem Bchir and Mohamed Maher Ben Ismail
Appl. Sci. 2023, 13(1), 398; https://doi.org/10.3390/app13010398 - 28 Dec 2022
Viewed by 1935
Abstract
In this paper, we propose two novel Adaptive Neural Network Approaches (ANNAs), which are intended to automatically learn the optimal network depth. In particular, the proposed class-independent and class-dependent ANNAs address two main challenges faced by typical deep learning paradigms. Namely, they overcome the problems of setting the optimal network depth and improving the model interpretability. Specifically, ANNA approaches simultaneously train the network model, learn the network depth in an unsupervised manner, and assign fuzzy relevance weights to each network layer to better decipher the model behavior. In addition, two novel cost functions were designed in order to optimize the layer fuzzy relevance weights along with the model hyper-parameters. The proposed ANNA approaches were assessed using standard benchmarking datasets and performance measures. The experiments proved their effectiveness compared to typical deep learning approaches, which rely on empirical tuning and scaling of the network depth. Moreover, the experimental findings demonstrated the ability of the proposed class-independent and class-dependent ANNAs to decrease the network complexity and build lightweight models for less overfitting risk and better generalization.
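One way to picture layer relevance weights is to normalize per-layer relevance scores with a softmax and count the layers whose weight exceeds a pruning threshold; this is a simplification assumed for illustration, not the fuzzy weighting actually learned by ANNA:

```python
import math

def softmax(scores):
    """Normalize raw scores into weights that sum to one (numerically stable)."""
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def effective_depth(relevance_scores, threshold=0.05):
    """Number of layers whose normalized relevance weight exceeds a threshold;
    low-relevance layers are candidates for pruning, shrinking the network."""
    weights = softmax(relevance_scores)
    return sum(1 for w in weights if w > threshold)
```

Under this reading, "learning the depth" amounts to driving some layers' relevance toward zero so the effective depth is smaller than the declared depth.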

36 pages, 11108 KiB  
Article
On the Relative Impact of Optimizers on Convolutional Neural Networks with Varying Depth and Width for Image Classification
by Eustace M. Dogo, Oluwatobi J. Afolabi and Bhekisipho Twala
Appl. Sci. 2022, 12(23), 11976; https://doi.org/10.3390/app122311976 - 23 Nov 2022
Cited by 6 | Viewed by 2834
Abstract
The continued increase in computing resources is one key factor that is allowing deep learning researchers to scale, design and train new and complex convolutional neural network (CNN) architectures in terms of varying width, depth, or both width and depth to improve performance for a variety of problems. The contributions of this study include an uncovering of how different optimization algorithms impact CNN architectural setups with variations in width, depth, and both width/depth. Specifically in this study, three different CNN architectural setups in combination with nine different optimization algorithms—namely SGD vanilla, with momentum, and with Nesterov momentum, RMSProp, ADAM, ADAGrad, ADADelta, ADAMax, and NADAM—are trained and evaluated using three publicly available benchmark image classification datasets. Through extensive experimentation, we analyze the output predictions of the different optimizers with the CNN architectures using accuracy, convergence speed, and loss function as performance metrics. Findings based on the overall results obtained across the three image classification datasets show that ADAM and NADAM achieved superior performances with wider and deeper/wider setups, respectively, while ADADelta was the worst performer, especially with the deeper CNN architectural setup.
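The difference between vanilla SGD, momentum, and Nesterov momentum mentioned above comes down to how the update step is formed; a single-parameter sketch, using a PyTorch-style formulation assumed here for illustration:

```python
def sgd_step(w, v, g, lr=0.1, mu=0.9, nesterov=False):
    """One SGD update for scalar weight w with velocity v and gradient g.
    mu=0 gives vanilla SGD; nesterov=True adds the look-ahead correction."""
    v = mu * v + g                        # velocity accumulates past gradients
    step = g + mu * v if nesterov else v  # Nesterov steps along gradient + look-ahead
    return w - lr * step, v
```

Adaptive methods such as ADAM and RMSProp additionally rescale the step per parameter by a running estimate of gradient magnitude, which is what the study contrasts against these momentum variants.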

14 pages, 721 KiB  
Article
Extending Partial Domain Adaptation Algorithms to the Open-Set Setting
by George Pikramenos, Evaggelos Spyrou and Stavros J. Perantonis
Appl. Sci. 2022, 12(19), 10052; https://doi.org/10.3390/app121910052 - 06 Oct 2022
Viewed by 920
Abstract
Partial domain adaptation (PDA) is a framework for mitigating the covariate shift problem when target labels are contained in source labels. For this task, adversarial neural network (ANN) methods proposed in the literature have been proven to be flexible and effective. In this work, we adapt such methods to tackle the more general problem of open-set domain adaptation (OSDA), which further allows the existence of target instances with labels outside the source labels. The aim in OSDA is to mitigate the covariate shift problem and to identify target instances with labels outside the source label space. We show that the effectiveness of ANN methods utilized in the PDA setting is hindered by outlier target instances, and we propose an adaptation for effective OSDA.
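A common baseline for flagging target instances outside the source label space is to threshold the classifier's top confidence; the paper's actual mechanism is adversarial, so the rule below is only an illustrative sketch under that assumption:

```python
def open_set_predict(probs, threshold=0.5):
    """Label a target instance with the most likely source class, or flag it
    as 'unknown' when the top class probability falls below the threshold."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else "unknown"
```

The OSDA difficulty the abstract points to is precisely that such outlier ("unknown") instances, if not rejected, corrupt the adversarial alignment between source and target.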

Review

18 pages, 7191 KiB  
Review
A Deep Learning Review of ResNet Architecture for Lung Disease Identification in CXR Image
by Syifa Auliyah Hasanah, Anindya Apriliyanti Pravitasari, Atje Setiawan Abdullah, Intan Nurma Yulita and Mohammad Hamid Asnawi
Appl. Sci. 2023, 13(24), 13111; https://doi.org/10.3390/app132413111 - 08 Dec 2023
Cited by 3 | Viewed by 1521
Abstract
The lungs are two of the most crucial organs in the human body because they are connected to the respiratory and circulatory systems. Lung cancer, COVID-19, and pneumonia are just a few of the many severe diseases that threaten them. Patients undergo X-ray examinations to evaluate the health of their lungs, and a radiologist must interpret the results. The rapid advancement of technology today can help people in many different ways; one use of deep learning in the health industry is the detection of diseases, which can decrease the amount of money, time, and energy needed while increasing effectiveness and efficiency. Although other methods exist, this research uses only the convolutional neural network (CNN) method, with three architectures, namely ResNet-50, ResNet-101, and ResNet-152, to aid radiologists in identifying lung diseases in patients. The 21,885 images that make up the dataset for this study are split into four groups: COVID-19, pneumonia, lung opacity, and normal. According to the experimental results, all three architectures achieve fairly high evaluation scores: the ResNet-50, ResNet-101, and ResNet-152 architectures achieve F1 scores of 91%, 93%, and 94%, respectively. Therefore, the ResNet-152 architecture, which has better performance values than the other two designs in this study, is recommended for categorizing the lung diseases experienced by patients.
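For reference, the F1 scores reported above combine precision and recall as a harmonic mean; from confusion counts the computation is:

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, from confusion counts
    (true positives, false positives, false negatives)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because it is a harmonic mean, F1 is dragged down by whichever of precision or recall is worse, which makes it a stricter summary than accuracy on imbalanced medical datasets like this one.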
