Advanced Convolutional Neural Network (CNN) Technology in Object Detection and Data Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 June 2024

Special Issue Editor


Dr. Shiyang Yan
Guest Editor
INRIA (Institut National de Recherche en Informatique et en Automatique), Le Chesnay, France
Interests: attention mechanism; reinforcement learning; generative learning; causal inference

Special Issue Information

Dear Colleagues,

Convolutional neural networks (CNNs) and related deep neural networks have seen great success in machine learning and computer vision. Advanced CNNs, such as Fast R-CNN and Faster R-CNN, have achieved breakthrough performance in object detection. Recently, transformer models have been widely applied to classification, object detection, and multimodal machine learning tasks.

To further boost the research and application of advanced deep neural networks in computer vision, this Special Issue aims to collect advanced deep neural networks and algorithms in the field of computer vision and related areas. We encourage the submission of research papers on topics including, but not restricted to, object detection, image segmentation, and classification.

Dr. Shiyang Yan
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • convolutional neural network
  • computer vision
  • object detection
  • image segmentation
  • image classification

Published Papers (4 papers)


Research

18 pages, 3884 KiB  
Article
A Method for Underwater Biological Detection Based on Improved YOLOXs
by Heng Wang, Pu Zhang, Mengnan You and Xinyuan You
Appl. Sci. 2024, 14(8), 3196; https://doi.org/10.3390/app14083196 - 10 Apr 2024
Abstract
This article proposes a lightweight underwater biological target detection network based on an improved YOLOXs, addressing the challenges of complex and dynamic underwater environments, limited memory in underwater devices, and constrained computational capabilities. First, in the backbone network, GhostConv and GhostBottleneck are introduced to replace standard convolutions and the Bottleneck1 structure in CSPBottleneck_1, significantly reducing the model's parameter count and computational load and facilitating the construction of a lightweight network. Next, in the feature fusion network, a Contextual Transformer block replaces the 3 × 3 convolution in CSPBottleneck_2. This enhances self-attention learning by leveraging the rich context between input keys, improving the model's representational capacity. Finally, the localization loss function Focal_EIoU Loss replaces IoU Loss, enhancing the model's robustness and generalization ability and leading to faster and more accurate convergence during training. Experimental results demonstrate that, compared to the YOLOXs model, the proposed YOLOXs-GCE achieves a 1.1% improvement in mAP while reducing parameters by 24.47%, computational load by 26.39%, and model size by 23.87%. This effectively enhances the detection performance of the model, making it suitable for complex and dynamic underwater environments as well as underwater devices with limited memory, and meets the requirements of underwater target detection tasks.
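To make the GhostConv idea concrete, the following is a minimal PyTorch sketch of a Ghost convolution in the spirit of GhostNet (Han et al., 2020), which this paper uses to replace standard convolutions. The module name, activation choice, and kernel sizes are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Generate half the output channels with a regular convolution and the
    other half with a cheap depthwise 'ghost' operation, then concatenate.
    Assumes out_channels is even (illustrative simplification)."""
    def __init__(self, in_channels, out_channels, kernel_size=1, stride=1):
        super().__init__()
        hidden = out_channels // 2
        # Primary convolution: produces the intrinsic feature maps.
        self.primary = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size, stride,
                      kernel_size // 2, bias=False),
            nn.BatchNorm2d(hidden),
            nn.SiLU(),
        )
        # Cheap operation: a depthwise convolution over the intrinsic maps.
        self.cheap = nn.Sequential(
            nn.Conv2d(hidden, hidden, 5, 1, 2, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.SiLU(),
        )

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```

The parameter saving comes from the depthwise branch: it touches each intrinsic channel independently, so roughly half the output channels cost almost nothing compared to a full convolution.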

27 pages, 28358 KiB  
Article
Fast Coherent Video Style Transfer via Flow Errors Reduction
by Li Wang, Xiaosong Yang and Jianjun Zhang
Appl. Sci. 2024, 14(6), 2630; https://doi.org/10.3390/app14062630 - 21 Mar 2024
Abstract
For video style transfer, naively applying still-image techniques to process a video frame by frame independently often causes flickering artefacts. Some works adopt optical flow in the design of a temporal constraint loss to secure temporal consistency. However, these works still suffer from incoherence (including ghosting artefacts) where large motions or occlusions occur, as optical flow fails to detect the boundaries of objects accurately. To address this problem, we propose a novel framework consisting of two stages: (1) creating new initialization images with the proposed mask techniques, which significantly reduce the flow errors; (2) processing these initialized images iteratively with the proposed losses to obtain stylized videos free of artefacts, which also increases the speed from over 3 min per frame to less than 2 s per frame for gradient-based optimization methods. Specifically, we propose a multi-scale mask fusion scheme to reduce untraceable flow errors and obtain an incremental mask to reduce ghosting artefacts. In addition, a multi-frame mask fusion scheme is designed to reduce traceable flow errors. Among the proposed losses, the Sharpness Losses deal with potential image blurriness artefacts over long-range frames, and the Coherent Losses enforce temporal consistency at both the multi-frame RGB level and the feature level. Overall, our approach produces stable video stylization outputs even in large-motion or occlusion scenarios. Experiments demonstrate that the proposed method outperforms state-of-the-art video style transfer methods qualitatively and quantitatively on the MPI Sintel dataset.
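To illustrate the temporal-consistency idea, here is a minimal PyTorch sketch of an optical-flow-based coherent loss of the kind such frameworks build on: the previous stylized frame is warped into the current one, and differences are penalized only where the flow is traceable. The `warp` and `coherent_loss` helpers and the mask convention (1 = reliable flow, 0 = occlusion or untraceable motion) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def warp(frame, flow):
    """Backward-warp `frame` (N,C,H,W) with optical flow (N,2,H,W)."""
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h, device=frame.device),
                            torch.arange(w, device=frame.device),
                            indexing="ij")
    # Absolute sampling coordinates after displacement by the flow.
    x = xs.unsqueeze(0) + flow[:, 0]
    y = ys.unsqueeze(0) + flow[:, 1]
    # Normalize to [-1, 1] as required by grid_sample.
    grid = torch.stack((2 * x / (w - 1) - 1, 2 * y / (h - 1) - 1), dim=-1)
    return F.grid_sample(frame, grid, align_corners=True)

def coherent_loss(stylized_t, stylized_prev, flow, mask):
    """MSE between the current stylized frame and the warped previous one,
    restricted by `mask` (1 where the flow is reliable, 0 at occlusions)."""
    warped = warp(stylized_prev, flow)
    return (mask * (stylized_t - warped) ** 2).mean()
```

Masking is what keeps ghosting in check: at occlusion boundaries the warped frame carries stale content, so those pixels are simply excluded from the penalty.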

23 pages, 7009 KiB  
Article
PGDS-YOLOv8s: An Improved YOLOv8s Model for Object Detection in Fisheye Images
by Degang Yang, Jie Zhou, Tingting Song, Xin Zhang and Yingze Song
Appl. Sci. 2024, 14(1), 44; https://doi.org/10.3390/app14010044 - 20 Dec 2023
Abstract
Recently, object detection has become a research hotspot in computer vision, yet most detectors work on regular images with small viewing angles. To obtain a field of view without blind spots, fisheye cameras with wide viewing angles, including those mounted on unmanned aerial vehicles, have come into use. However, distorted and discontinuous objects appear in the captured fisheye images due to the unique viewing angle of fisheye cameras, which poses a significant challenge to existing object detectors. To solve this problem, this paper proposes a PGDS-YOLOv8s model for detecting distorted and discontinuous objects in fisheye images. First, two novel downsampling modules are proposed. Among them, the Max Pooling and Ghost's Downsampling (MPGD) module effectively extracts the essential feature information of distorted and discontinuous objects, while the Average Pooling and Ghost's Downsampling (APGD) module acquires rich global features and reduces the feature loss of distorted and discontinuous objects. In addition, the proposed C2fs module uses Squeeze-and-Excitation (SE) blocks to model the interdependence of the channels and acquire richer gradient-flow information about the features, providing a better understanding of the contextual information in fisheye images. Subsequently, an SE block is added after the Spatial Pyramid Pooling Fast (SPPF) module, improving the model's ability to capture features of distorted, discontinuous objects. Moreover, the UAV-360 dataset is created for object detection in fisheye images. Finally, experiments show that the proposed PGDS-YOLOv8s model improves mAP@0.5 by 19.8% and mAP@0.5:0.95 by 27.5% on the VOC-360 dataset compared to the original YOLOv8s model. In addition, the improved model achieves 89.0% mAP@0.5 and 60.5% mAP@0.5:0.95 on the UAV-360 dataset. Furthermore, on the MS-COCO 2017 dataset, the PGDS-YOLOv8s model improves AP by 1.4%, AP50 by 1.7%, and AP75 by 1.2% compared with the original YOLOv8s model.
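Since both the C2fs module and the post-SPPF insertion rest on the Squeeze-and-Excitation block, a minimal PyTorch sketch of a standard SE block (Hu et al., 2018) may help; the reduction ratio and activation choices here are illustrative assumptions, not the paper's exact settings.

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Channel attention: squeeze spatial information into a per-channel
    descriptor, then excite (reweight) each channel with learned gates."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global spatial context
        self.fc = nn.Sequential(             # excite: channel-wise gating
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        n, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(n, c)).view(n, c, 1, 1)
        return x * w  # reweight each channel's feature map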

13 pages, 2895 KiB  
Article
Improved Lightweight Multi-Target Recognition Model for Live Streaming Scenes
by Zongwei Li, Kai Qiao, Jianing Chen, Zhenyu Li and Yanhui Zhang
Appl. Sci. 2023, 13(18), 10170; https://doi.org/10.3390/app131810170 - 10 Sep 2023
Abstract
Nowadays, the commercial potential of live e-commerce is being continuously explored, and machine vision algorithms are gradually attracting the attention of marketers and researchers. During live streaming, the visuals can be effectively captured by algorithms, thereby providing additional data support. Considering the diversity of live streaming devices, this paper proposes an extremely lightweight, high-precision model to meet the different requirements of live streaming scenarios. Building upon YOLOv5s, we incorporate the MobileNetV3 module and the CA attention mechanism to optimize the model. Furthermore, we construct a multi-object dataset specific to live streaming scenarios, including anchor facial expressions and commodities. A series of experiments demonstrates that our model achieves a 0.4% improvement in accuracy over the original model while shrinking to 10.52% of its original weight.
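Taking CA here to mean coordinate attention (Hou et al., 2021), a common choice in lightweight YOLOv5 variants, the following is a minimal PyTorch sketch of the mechanism; the reduction ratio and activation are illustrative assumptions rather than the authors' exact configuration.

```python
import torch
import torch.nn as nn

class CoordAtt(nn.Module):
    """Factorize channel attention into two 1-D encodings along H and W,
    so the gates retain positional information in each direction."""
    def __init__(self, channels, reduction=32):
        super().__init__()
        hidden = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # -> (N,C,H,1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # -> (N,C,1,W)
        self.conv1 = nn.Conv2d(channels, hidden, 1)
        self.bn = nn.BatchNorm2d(hidden)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(hidden, channels, 1)
        self.conv_w = nn.Conv2d(hidden, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        # Direction-aware pooling, concatenated along the spatial axis.
        xh = self.pool_h(x)                      # (N,C,H,1)
        xw = self.pool_w(x).permute(0, 1, 3, 2)  # (N,C,W,1)
        y = self.act(self.bn(self.conv1(torch.cat([xh, xw], dim=2))))
        yh, yw = torch.split(y, [h, w], dim=2)
        ah = torch.sigmoid(self.conv_h(yh))                      # (N,C,H,1)
        aw = torch.sigmoid(self.conv_w(yw.permute(0, 1, 3, 2)))  # (N,C,1,W)
        return x * ah * aw  # broadcast gates over both spatial axes
```

Compared with SE-style attention, the two directional gates cost little yet preserve where along each axis a response came from, which suits detection on resource-constrained streaming devices.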
