Big Data Analysis and Management Based on Deep Learning

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 June 2024 | Viewed by 19991

Special Issue Editors


Guest Editor
School of Automation, Nanjing University of Information Science and Technology, 219 Ningliu Rd., Nanjing 210044, China
Interests: deep learning; remote sensing image analysis; change detection; semantic analysis; image segmentation

Guest Editor
Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China
Interests: machine learning; image analysis; intelligent robot

Special Issue Information

Dear Colleagues,

As the information society develops, data are growing ever larger in scale, and heterogeneous information has expanded significantly to include cross-media content such as video, images, remote sensing, audio, and text. The emergence of increasingly complex big data poses new challenges to current big data analysis technology. Thanks to its multilayer nonlinear structure, the deep learning model has a strong feature learning ability, which provides an effective way to address these challenges. Deep learning shows unique advantages in data-driven representation learning tasks such as speech recognition, target detection, image classification, and machine translation.

Therefore, this Special Issue aims to collect original research and review articles that emphasize the important role of deep learning in big data analysis. It calls for state-of-the-art research on the theory, algorithms, modeling, systems, and applications of deep learning-based big data analysis, and showcases the latest efforts of researchers in the field.

Prof. Dr. Min Xia
Dr. Kai Hu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning methods for analysis of big data
  • deep learning methods for the analysis of multisource time-series data
  • deep learning methods for semantic information extraction from complex image and video data
  • model acceleration for deep learning of big data
  • theory and novel application scenarios of cross-media big data analysis
  • investigation of the latest progress in this field

Published Papers (12 papers)


Research

19 pages, 2531 KiB  
Article
BEAC-Net: Boundary-Enhanced Adaptive Context Network for Optic Disk and Optic Cup Segmentation
by Lincen Jiang, Xiaoyu Tang, Shuai You, Shangdong Liu and Yimu Ji
Appl. Sci. 2023, 13(18), 10244; https://doi.org/10.3390/app131810244 - 12 Sep 2023
Viewed by 659
Abstract
Accurately segmenting the optic disk (OD) and optic cup (OC) on retinal fundus images is important for treating glaucoma. With the development of deep learning, some CNN-based methods have been implemented to segment the OD and OC, but it remains difficult to accurately segment OD and OC boundaries affected by blood vessels and lesion areas. To this end, we propose a novel boundary-enhanced adaptive context network (BEAC-Net) for OD and OC segmentation. First, a newly designed efficient boundary pixel attention (EBPA) module enhances pixel-by-pixel feature capture to collect the boundary contextual information of the OD and OC in the horizontal and vertical directions. Second, because background noise makes segmenting boundary pixels difficult, an adaptive context module (ACM) is designed that simultaneously learns local-range and long-range information to capture richer context. Finally, BEAC-Net adaptively integrates the feature maps from different levels using an attentional feature fusion (AFF) module. In addition, we provide a high-quality retinal fundus image dataset named the 66 Vision-Tech dataset, which advances research on glaucoma diagnosis. We used the proposed BEAC-Net to perform extensive experiments on the RIM-ONE-v3, DRISHTI-GS, and 66 Vision-Tech datasets. In particular, BEAC-Net achieved a Dice coefficient of 0.8267 and an IoU of 0.8138 for OD segmentation and a Dice coefficient of 0.8057 and an IoU of 0.7858 for OC segmentation on the 66 Vision-Tech dataset, state-of-the-art segmentation results. Full article
(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)
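The Dice and IoU figures quoted above are standard overlap metrics between binary masks. As a rough illustration (ours, not the authors' code), they can be computed as:

```python
import numpy as np

def dice_iou(pred, gt):
    """Dice coefficient and IoU between two binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum())
    iou = inter / union
    return dice, iou

# toy masks: predicted vs. ground truth
pred = np.array([[1, 1, 0], [1, 0, 0]])
gt = np.array([[1, 1, 0], [0, 0, 1]])
dice, iou = dice_iou(pred, gt)  # dice = 2/3, iou = 1/2
```

Dice weights the intersection against the two mask sizes, while IoU divides by the union, so Dice is always at least as large as IoU on the same pair of masks.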

17 pages, 7526 KiB  
Article
Semantic-Aligned Cross-Modal Visual Grounding Network with Transformers
by Qianjun Zhang and Jin Yuan
Appl. Sci. 2023, 13(9), 5649; https://doi.org/10.3390/app13095649 - 04 May 2023
Cited by 1 | Viewed by 1579
Abstract
Multi-modal deep learning methods have achieved great improvements in visual grounding; their objective is to localize text-specified objects in images. Most of the existing methods can localize and classify objects with significant appearance differences but suffer from the misclassification problem for extremely similar objects, due to inadequate exploration of multi-modal features. To address this problem, we propose a novel semantic-aligned cross-modal visual grounding network with transformers (SAC-VGNet). SAC-VGNet integrates visual and textual features with semantic alignment to highlight important feature cues for capturing tiny differences between similar objects. Technically, SAC-VGNet incorporates a multi-modal fusion module to effectively fuse visual and textual descriptions. It also introduces contrastive learning to align linguistic and visual features on the text-to-pixel level, enabling the capture of subtle differences between objects. The overall architecture is end-to-end without the need for extra parameter settings. To evaluate our approach, we manually annotate text descriptions for images in two fine-grained visual grounding datasets. The experimental results demonstrate that SAC-VGNet significantly improves performance in fine-grained visual grounding. Full article
(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)
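Text-to-pixel contrastive alignment can be pictured as scoring each pixel embedding against the sentence embedding and supervising those scores with the object mask. A toy numpy sketch under that reading; the function name and the sigmoid/BCE form are our illustrative assumptions, not SAC-VGNet's exact loss:

```python
import numpy as np

def text_pixel_contrastive_loss(text_emb, pixel_emb, pos_mask, tau=0.07):
    """Toy text-to-pixel alignment: pull pixels inside the referred
    object toward the text embedding, push the rest away."""
    t = text_emb / np.linalg.norm(text_emb)
    p = pixel_emb / np.linalg.norm(pixel_emb, axis=1, keepdims=True)
    logits = p @ t / tau  # temperature-scaled cosine similarity per pixel
    prob = 1 / (1 + np.exp(-logits))
    eps = 1e-9
    # binary cross-entropy against the object mask
    loss = -(pos_mask * np.log(prob + eps)
             + (1 - pos_mask) * np.log(1 - prob + eps))
    return loss.mean()

t = np.array([1.0, 0.0])                 # sentence embedding
pix = np.array([[1.0, 0.0], [0.0, 1.0]]) # two pixel embeddings
mask = np.array([1.0, 0.0])              # first pixel is in the object
good = text_pixel_contrastive_loss(t, pix, mask)
bad = text_pixel_contrastive_loss(t, pix, 1 - mask)
```

When the mask agrees with the similarities (`good`) the loss is lower than when it contradicts them (`bad`), which is the alignment pressure the paper exploits.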

17 pages, 2297 KiB  
Article
End-to-End: A Simple Template for the Long-Tailed-Recognition of Transmission Line Clamps via a Vision-Language Model
by Fei Yan, Hui Zhang, Yaogen Li, Yongjia Yang and Yinping Liu
Appl. Sci. 2023, 13(5), 3287; https://doi.org/10.3390/app13053287 - 04 Mar 2023
Viewed by 1194
Abstract
Raw image classification datasets generally follow a long-tailed distribution in the real world. Standard classification algorithms face a substantial issue because many labels relate to only a few categories, and the model learning process tends toward the dominant labels under the influence of the loss function. Existing systems typically use two stages to improve performance: pretraining on the initial imbalanced dataset and fine-tuning on a balanced dataset via re-sampling or logit adjustment. These have achieved promising results. However, their limited self-supervised information makes it challenging to transfer such systems to other vision tasks, such as detection and segmentation. Using large-scale contrastive visual-language pretraining, the OpenAI team discovered a novel visual recognition method. We provide a simple one-stage model called the text-to-image network (TIN) for long-tailed recognition (LTR) based on the similarities between textual and visual features. The TIN has the following advantages over existing techniques: (1) our model incorporates textual and visual semantic information; (2) this end-to-end strategy achieves good results with fewer image samples and no secondary training; (3) by using seesaw loss, we further reduce the loss gap between the head and tail categories. These adjustments encourage large relative magnitudes between the logits of rare and dominant labels. TIN was compared extensively against a large number of advanced models on ImageNet-LT, the largest long-tailed public dataset, and achieved a state-of-the-art single-stage result of 72.8% Top-1 accuracy. Full article
(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)
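Seesaw loss reduces the penalty that head-class samples impose on tail classes by shrinking the negative logits of rarer classes. A simplified single-sample sketch, keeping only the frequency-based mitigation factor (the published seesaw loss also has a compensation factor we omit here):

```python
import numpy as np

def seesaw_ce(logits, label, class_counts, p=0.8):
    """Simplified seesaw cross-entropy for one sample.
    Negative classes rarer than the true class get their logits
    down-weighted, shrinking the gradient that suppresses tail classes."""
    counts = np.asarray(class_counts, dtype=float)
    n_i = counts[label]
    # mitigation factor: < 1 when class j is rarer than true class i
    m = np.where(counts < n_i, (counts / n_i) ** p, 1.0)
    m[label] = 1.0
    z = np.asarray(logits, dtype=float) + np.log(m)  # scale in logit space
    z = z - z.max()
    probs = np.exp(z) / np.exp(z).sum()
    return -np.log(probs[label])

# head-class sample: the rare negative class is mitigated,
# so the loss is below the plain cross-entropy ln(2)
loss_tail_aware = seesaw_ce(np.zeros(2), 0, [100, 1])
```

With balanced class counts the mitigation factors are all 1 and the loss reduces to ordinary softmax cross-entropy.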

13 pages, 544 KiB  
Article
MFAMNet: Multi-Scale Feature Attention Mixture Network for Short-Term Load Forecasting
by Shengchun Yang, Kedong Zhu, Feng Li, Liguo Weng and Liangcheng Cheng
Appl. Sci. 2023, 13(5), 2998; https://doi.org/10.3390/app13052998 - 26 Feb 2023
Viewed by 991
Abstract
Short-term load forecasting is an important prerequisite for smart grid control. Current methods are mainly based on convolutional neural network (CNN) or long short-term memory (LSTM) models. For multi-factor input sequences, existing methods cannot obtain the multi-scale features of the time series or the important parameters of the multiple factors, resulting in low accuracy and robustness. To address these problems, a multi-scale feature attention hybrid network is proposed, which uses an LSTM to extract the time correlation of the sequence and a multi-scale CNN to automatically extract the multi-scale features of the load. This work integrates the features by constructing a circular network. In the proposed model, a two-branch attention mechanism is further constructed to capture the important parameters of different influencing factors, improving the model's robustness and enabling the network to obtain effective features where the load curve changes. Comparative experiments on two open test sets show that the proposed multi-scale feature attention mixture network achieves accurate short-term load forecasting and is superior to existing methods. Full article
(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)

21 pages, 5532 KiB  
Article
JAUNet: A U-Shape Network with Jump Attention for Semantic Segmentation of Road Scenes
by Zhiyong Fan, Kailai Liu, Jianmin Hou, Fei Yan and Qiang Zang
Appl. Sci. 2023, 13(3), 1493; https://doi.org/10.3390/app13031493 - 23 Jan 2023
Cited by 2 | Viewed by 1414
Abstract
The task of complex scene semantic segmentation is to classify and label the scene image pixel by pixel. The complex image information in autonomous driving scenes, with its many kinds of targets and varied scene changes, makes the segmentation task more difficult, and FCN-based networks cannot restore the image information well. In contrast, encoder-decoder network structures such as SegNet and UNet use skip connections and other methods to restore image information, but their extraction of shallow details is simple and unfocused. In this paper, we propose a U-shaped convolutional neural network with a jump attention mechanism, an improved encoder-decoder structure that performs semantic segmentation through four convolutional downsamplings and four transposed-convolution upsamplings. A jump attention module added in the upsampling process selectively extracts contextual information from high-dimensional features to guide low-dimensional features, improves the fusion of deep and shallow features, and ensures consistent prediction for pixels of the same class. On the CamVid and Cityscapes datasets, the model reaches mIoU scores of 66.3% and 69.1%, respectively. Compared with other mainstream semantic segmentation algorithms, this method is competitive in terms of segmentation performance and model size. Full article
(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)
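The mIoU numbers reported for CamVid and Cityscapes are per-class IoUs averaged over the classes present. A minimal sketch of the standard computation from a confusion matrix (ours, not the authors' evaluation code):

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """mIoU: build a confusion matrix over pixels, take per-class
    IoU = TP / (TP + FP + FN), average over classes that occur."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, g in zip(pred.ravel(), gt.ravel()):
        cm[g, p] += 1
    inter = np.diag(cm).astype(float)
    union = cm.sum(0) + cm.sum(1) - np.diag(cm)
    valid = union > 0  # skip classes absent from both pred and gt
    return (inter[valid] / union[valid]).mean()

a = np.array([0, 1, 1])
b = np.array([0, 1, 0])
miou = mean_iou(a, b, 2)  # class 0: 1/2, class 1: 1/2 -> 0.5
```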

14 pages, 4284 KiB  
Article
Improving Semi-Supervised Image Classification by Assigning Different Weights to Correctly and Incorrectly Classified Samples
by Xu Zhang, Huan Zhang, Xinyue Zhang, Cheng Zhen, Tianguo Yuan and Jiande Wu
Appl. Sci. 2022, 12(23), 11915; https://doi.org/10.3390/app122311915 - 22 Nov 2022
Viewed by 1095
Abstract
Semi-supervised deep learning, which aims to effectively use unlabeled data to help learn sample features from labeled data, is a recent hot topic. To effectively use unlabeled data, a new semi-supervised learning model based on a consistency strategy is proposed. In the supervised part with labeled samples, an image generation model first generates artificial images to complement the limited number of labeled samples. Secondly, the sample label mapping, as a "benchmark", is compared to the corresponding sample features in the network as an additional loss complementing the original supervisory loss, aiming to better correct the model parameters. Finally, the original supervised loss is changed so that the network parameters are determined by the characteristics of each correctly classified sample. In the unsupervised part, the unsupervised loss is altered so that the model does not "treat all samples equally" and can focus more on the characteristics of misclassified samples. A total of 40 labeled samples from the CIFAR-10 and SVHN datasets were used to train the semi-supervised model, achieving accuracies of 93.25% and 96.83%, respectively, demonstrating the effectiveness of the proposed semi-supervised model. Full article
(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)
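The core idea of weighting correctly and incorrectly classified samples differently can be sketched with a focal-style per-sample weight. This is our illustrative stand-in, not the paper's exact weighting scheme:

```python
import numpy as np

def weighted_ce(probs, labels):
    """Cross-entropy where each sample is weighted by (1 - p_true):
    confidently correct samples contribute little, misclassified or
    uncertain samples dominate the loss."""
    p_true = probs[np.arange(len(labels)), labels]
    weights = 1.0 - p_true  # small for confident-correct samples
    return (weights * -np.log(p_true + 1e-9)).mean()

confident = np.array([[0.9, 0.1]])  # model already right and sure
unsure = np.array([[0.5, 0.5]])     # model on the fence
lbl = np.array([0])
```

The weighted loss on the confident batch is far below the loss on the uncertain batch, which is exactly the "do not treat all samples equally" behavior described above.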

19 pages, 1257 KiB  
Article
A Federated Incremental Learning Algorithm Based on Dual Attention Mechanism
by Kai Hu, Meixia Lu, Yaogen Li, Sheng Gong, Jiasheng Wu, Fenghua Zhou, Shanshan Jiang and Yi Yang
Appl. Sci. 2022, 12(19), 10025; https://doi.org/10.3390/app121910025 - 06 Oct 2022
Cited by 1 | Viewed by 2079
Abstract
Federated incremental learning best suits the changing needs of common Federated Learning (FL) tasks. In this area, large-sample clients dramatically influence the final model training results, and the unbalanced features of clients are challenging to capture. In this paper, a federated incremental learning framework is designed. Firstly, part of the data is preprocessed to obtain the initial global model. Secondly, to help the global model obtain the importance of the features across each client's whole sample, and to enhance its ability to capture critical feature information, a channel attention neural network model is designed on the client side, and a federated aggregation algorithm based on the feature attention mechanism is designed on the server side. Experiments on the standard datasets CIFAR10 and CIFAR100 show that the proposed algorithm achieves good accuracy while realizing incremental learning. Full article
(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)
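Server-side federated aggregation generally reduces to a weighted average of client parameters. A minimal sketch, where the per-client coefficients could come from sample counts or, as in this paper, from feature-attention scores (the scoring itself is not reproduced here):

```python
import numpy as np

def federated_aggregate(client_weights, client_scores):
    """Aggregate client parameter vectors with normalized per-client
    coefficients instead of a plain mean (FedAvg-style)."""
    coef = np.asarray(client_scores, dtype=float)
    coef = coef / coef.sum()  # normalize to a convex combination
    return sum(c * w for c, w in zip(coef, client_weights))

# two clients; the second has three times the weight (e.g. more samples)
agg = federated_aggregate(
    [np.array([1.0, 1.0]), np.array([3.0, 3.0])],
    [1, 3],
)
```

With scores (1, 3) the result is 0.25 of the first client plus 0.75 of the second.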

14 pages, 1419 KiB  
Article
Real-Time Motion Detection Network Based on Single Linear Bottleneck and Pooling Compensation
by Huayang Cheng, Yunchao Ding and Lu Yang
Appl. Sci. 2022, 12(17), 8645; https://doi.org/10.3390/app12178645 - 29 Aug 2022
Cited by 1 | Viewed by 1279
Abstract
Motion (change) detection is a basic preprocessing step in video processing with many application scenarios. One challenge is that deep learning-based methods require high computation power to improve their accuracy. In this paper, we introduce a novel semantic segmentation and lightweight network for motion detection, called Real-time Motion Detection Network Based on Single Linear Bottleneck and Pooling Compensation (MDNet-LBPC). In the feature extraction stage, the most computationally expensive CNN block is replaced with our single linear bottleneck operator to reduce the computational cost. During the decoder stage, our pooling compensation mechanism supplements useful motion detection information. To the best of our knowledge, this is the first work to use a lightweight operator to solve the motion detection task. We show that the acceleration performance of the single linear bottleneck is 5% higher than that of the linear bottleneck, making it more suitable for improving the efficiency of model inference. On the CDNet2014 dataset, MDNet-LBPC increases the frames per second (FPS) metric by 123 compared to the suboptimal method FgSegNet_v2, ranking first in inference speed. Meanwhile, MDNet-LBPC achieves 95.74% accuracy, which is comparable to state-of-the-art methods. Full article
(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)
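The paper's single linear bottleneck is not spelled out here, but the general economics of lightweight operators can be seen by comparing the parameter count of a standard convolution with a depthwise-separable one (an illustrative stand-in, not the authors' exact block):

```python
def conv_params(c_in, c_out, k):
    """Parameter counts (ignoring biases) for a standard k x k conv
    versus a depthwise-separable replacement."""
    standard = c_in * c_out * k * k
    separable = c_in * k * k + c_in * c_out  # depthwise + 1x1 pointwise
    return standard, separable

# e.g. 64 -> 128 channels with a 3x3 kernel
std, sep = conv_params(64, 128, 3)  # 73728 vs. 8768 parameters
```

Swaps of this kind are what let a detector keep accuracy while cutting cost enough for real-time inference.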

20 pages, 6328 KiB  
Article
Application of Low-Altitude UAV Remote Sensing Image Object Detection Based on Improved YOLOv5
by Ziran Li, Akio Namiki, Satoshi Suzuki, Qi Wang, Tianyi Zhang and Wei Wang
Appl. Sci. 2022, 12(16), 8314; https://doi.org/10.3390/app12168314 - 19 Aug 2022
Cited by 15 | Viewed by 2538
Abstract
With the development of science and technology, traditional industrial structures are constantly being upgraded. As far as drones are concerned, an increasing number of researchers are using reinforcement learning or deep learning to make drones more intelligent. There are currently many algorithms for object detection, and although many models achieve high detection accuracy, they have many parameters and high complexity, making them unable to perform real-time detection. It is therefore particularly important to design a lightweight object detection algorithm able to meet the needs of real-time detection using UAVs. In response to these problems, this paper establishes a dataset of six animals in grassland, captured from different angles and during different time periods in drone remote sensing images, and designs a lightweight object detector on the basis of the YOLOv5s network model. First, Squeeze-and-Excitation networks are introduced to improve the expressiveness of the network model. Second, the convolutional layer of branch 2 in the BottleNeckCSP structure is deleted, and 3/4 of its input channels are directly merged with the results of branch 1 processing, which reduces the number of model parameters. Next, a 3 × 3 max-pooling layer is added to the SPP module of the network model to improve the receptive field. Finally, the trained model is deployed on an NVIDIA TX2 processor for real-time object detection. After testing, the optimized YOLOv5 grassland animal detection model was able to effectively identify the six grassland animals. Compared with the YOLOv3, EfficientDet-D0, YOLOv4 and YOLOv5s network models, the mAP_0.5 value improved by 0.186, 0.03, 0.007 and 0.011, respectively, and the mAP_0.5:0.95 value improved by 0.216, 0.066, 0.034 and 0.051, respectively, with an average detection speed of 26 fps. The experimental results show that the grassland animal detection model based on the YOLOv5 network has high detection accuracy, good robustness, and fast calculation speed across different time periods and viewing angles. Full article
(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)
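The mAP_0.5 and mAP_0.5:0.95 metrics above are built on box-level IoU: a detection counts as correct when its IoU with a ground-truth box exceeds the threshold. A minimal sketch of that overlap computation (ours, not the authors' code):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

# two 2x2 boxes overlapping in a 1x1 corner: IoU = 1 / 7
iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
```

mAP_0.5 uses a single 0.5 threshold, while mAP_0.5:0.95 averages precision over IoU thresholds from 0.5 to 0.95.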

16 pages, 1956 KiB  
Article
Combating Label Noise in Image Data Using MultiNET Flexible Confident Learning
by Adam Popowicz, Krystian Radlak, Slawomir Lasota, Karolina Szczepankiewicz and Michal Szczepankiewicz
Appl. Sci. 2022, 12(14), 6842; https://doi.org/10.3390/app12146842 - 06 Jul 2022
Cited by 2 | Viewed by 1836
Abstract
Deep neural networks (DNNs) have been used successfully for many image classification problems. One of the most important factors that determines the final efficiency of a DNN is the correct construction of the training set. Erroneously labeled training images can degrade the final accuracy and additionally lead to unpredictable model behavior, reducing reliability. In this paper, we propose MultiNET, a novel method for the automatic detection of noisy labels within image datasets. MultiNET is an adaptation of the current state-of-the-art confident learning method. In contrast to the original, our method aggregates the outputs of multiple DNNs and allows for the adjustment of detection sensitivity. We conduct an exhaustive evaluation, incorporating four widely used datasets (CIFAR10, CIFAR100, MNIST, and GTSRB), eight state-of-the-art DNN architectures, and a variety of noise scenarios. Our results demonstrate that MultiNET significantly outperforms the confident learning method. Full article
(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)
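The aggregation idea behind MultiNET can be pictured as averaging the held-out predictions of several networks and flagging samples whose assigned label looks implausible under that consensus. A toy sketch in that spirit (the threshold and function name are our assumptions; confident learning proper uses calibrated class thresholds rather than a single cutoff):

```python
import numpy as np

def flag_noisy_labels(model_probs, labels, threshold=0.3):
    """Flag likely label noise: average predicted class probabilities
    across models, then flag samples whose given label receives low
    averaged probability. `threshold` acts as detection sensitivity."""
    mean_probs = np.mean(model_probs, axis=0)  # (n_samples, n_classes)
    p_label = mean_probs[np.arange(len(labels)), labels]
    return np.where(p_label < threshold)[0]

# two models, two samples, two classes; both models doubt sample 1's label
probs = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.8, 0.2], [0.1, 0.9]]])
labels = np.array([0, 0])
suspects = flag_noisy_labels(probs, labels)  # -> sample index 1
```

Raising `threshold` flags more samples (higher recall, lower precision), which is the sensitivity adjustment the abstract refers to.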

21 pages, 1829 KiB  
Article
Human Action Recognition Based on Improved Two-Stream Convolution Network
by Zhongwen Wang, Haozhu Lu, Junlan Jin and Kai Hu
Appl. Sci. 2022, 12(12), 5784; https://doi.org/10.3390/app12125784 - 07 Jun 2022
Cited by 12 | Viewed by 2207
Abstract
The two-stream convolution network (2SCN) is a classical method of action recognition, capable of extracting action information from two dimensions: spatial and temporal streams. However, its spatial stream extracts motion features through single-frame recognition, leaving room for improvement in perceiving appearance coherence features. In this paper, the classical two-stream convolution network structure is modified by utilizing the strong mining capability of the bidirectional gated recurrent unit (BiGRU) to allow the neural network to extract the appearance coherence features of actions. In addition, this paper introduces an attention mechanism (SimAM) based on neuroscience theory, which improves the accuracy and stability of the neural network. Experiments show that the proposed method (BS-2SCN, BiGRU-SimAM two-stream convolution network) has high accuracy: accuracy is improved by 2.6% on the UCF101 dataset and 11.7% on the HMDB51 dataset. Full article
(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)
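SimAM is attractive precisely because it is parameter-free: each neuron's weight comes from a closed-form energy function over its feature map. A minimal single-channel sketch following the published SimAM formulation (our implementation, not the paper's code):

```python
import numpy as np

def simam(x, lam=1e-4):
    """Parameter-free SimAM attention over one feature map (H, W):
    neurons that deviate more from the channel mean get lower energy,
    hence higher attention weight."""
    n = x.size - 1
    d = (x - x.mean()) ** 2          # squared deviation per neuron
    v = d.sum() / n                  # channel variance estimate
    e_inv = d / (4 * (v + lam)) + 0.5  # inverse neuron energy
    return x * (1 / (1 + np.exp(-e_inv)))  # sigmoid-gated features

x = np.array([[1.0, 2.0], [3.0, 10.0]])  # one distinctive activation
out = simam(x)
```

The distinctive activation (10.0) keeps the largest fraction of its value, which is the "important neurons stand out" behavior motivated by the neuroscience account.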

20 pages, 11090 KiB  
Article
Feature Residual Analysis Network for Building Extraction from Remote Sensing Images
by Yuqi Miao, Shanshan Jiang, Yiming Xu and Dongjie Wang
Appl. Sci. 2022, 12(10), 5095; https://doi.org/10.3390/app12105095 - 18 May 2022
Cited by 3 | Viewed by 1401
Abstract
Building extraction from remote sensing images is very important for urban planning. In the field of deep learning, more complex convolution operations and larger network models are usually used to extract more detailed building features, resulting in low efficiency of automatic extraction, and existing networks struggle to balance extraction accuracy against speed. Considering both segmentation accuracy and speed, a Feature Residual Analysis Network (FRA-Net) is proposed to realize fast and accurate building extraction. The whole network includes two stages: encoding and decoding. In the encoding stage, a Separable Residual Module (SRM) is designed to extract building features from remote sensing images while avoiding large convolution kernels, reducing the complexity of the model. In the decoding stage, the SRM is used for information decoding, and a multi-feature attention module is constructed to enhance the effective information. The experimental results on the LandCover and Massachusetts Buildings datasets show that the reasoning speed is greatly improved without reducing segmentation accuracy. Full article
(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)
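A separable residual block of the kind described can be pictured as a cheap depthwise filter per channel, a pointwise channel mix, and a skip connection. A 1-D toy sketch; the shapes, names, and activation are our illustrative assumptions, not the paper's exact SRM:

```python
import numpy as np

def separable_residual_block(x, depthwise_k, pointwise_w):
    """Sketch of a separable residual block on a (channels, length)
    input: per-channel depthwise filtering, a pointwise (1x1) channel
    mix, then a residual add and ReLU."""
    # depthwise: one small kernel per channel
    dw = np.stack([np.convolve(x[c], depthwise_k[c], mode="same")
                   for c in range(x.shape[0])])
    pw = pointwise_w @ dw              # pointwise channel mixing
    return np.maximum(pw + x, 0.0)     # residual add + ReLU

x = np.ones((2, 4))
k = np.array([[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]])  # identity depthwise kernels
out = separable_residual_block(x, k, np.eye(2))   # identity path -> x + x
```

Because both sub-operations are cheap, stacking such blocks keeps inference fast, which is the accuracy/speed trade the paper targets.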
