Special Issue "Advances in Artificial Intelligence: Data, Methods and Interdisciplinary Applications"

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: 31 August 2023 | Viewed by 18292

Special Issue Editors

1. College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
2. Interdisciplinary Research Center for Artificial Intelligence, Beijing University of Chemical Technology, Beijing 100029, China
Interests: image processing; artificial intelligence; remote sensing; high performance computing
School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
Interests: multi-label learning; partial-label learning; multi-view clustering; multi-label image classification
College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
Interests: image processing; artificial intelligence; remote sensing; calibration; signal processing
School of Software, Beihang University, Beijing 100191, China
Interests: computer vision; pattern recognition; biometrics; metric learning; face recognition

Special Issue Information

Dear Colleagues,

In recent years, artificial intelligence (AI) and big data methods have become widely used toolkits in many scientific fields, such as mathematics, physics, chemistry, biology, and security. AI technology can be integrated with knowledge from almost any discipline, opening a new era of interdisciplinary research. Because AI is a data-driven methodology, data, methods, and applications constitute its most essential research elements and are worthy of in-depth study and discussion.

The aim of this Special Issue is to publish original research articles covering advances in the data, models, theory, and applications of artificial intelligence, especially its interdisciplinary applications, e.g., AI plus mathematics, AI plus physics, or AI plus chemistry. The topics of interest include but are not limited to the following:

  • Analysis of data distribution problems;
  • Analysis of data label problems;
  • Analysis of data quality problems;
  • Deep learning and machine learning;
  • Explainable artificial intelligence;
  • Applications of image analysis;
  • Applications of AI and mathematics;
  • Applications of AI and chemistry;
  • Applications of AI and biology;
  • Other AI interdisciplinary research.

Prof. Dr. Fan Zhang
Prof. Dr. Songhe Feng
Prof. Dr. Yongsheng Zhou
Dr. Junlin Hu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2100 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • long-tailed data
  • noisy labeled data
  • deep learning
  • machine learning
  • causal learning
  • image interpretation
  • data inversion
  • natural language processing
  • pattern recognition
  • computer vision

Published Papers (18 papers)

Research

Article
Dynamic Learning Rate of Template Update for Visual Target Tracking
Mathematics 2023, 11(9), 1988; https://doi.org/10.3390/math11091988 - 23 Apr 2023
Viewed by 379
Abstract
Trackers based on the discriminative correlation filter (DCF) have achieved remarkable performance in visual target tracking in recent years. Since targets are usually affected by factors such as deformation, rotation, and motion blur, trackers have to update their templates online. The purpose of the template update is to adapt to target changes, the magnitude of which is closely related to the motion state of the target. The learning rate of the template update indicates the weight of the historical samples, and its value is fixed in most existing trackers, which can decrease the precision of the tracker or make it unstable. In this study, a new dynamic learning rate method for template update is proposed for visual target tracking. The motion state of the target is defined by the difference in target center position between frames. The learning rate is then adjusted dynamically according to the motion state of the target instead of being held at a fixed value, which can achieve better performance. Experiments on the popular datasets OTB100 and UAV123 show that, with the proposed dynamic learning rate for template update, DCF-based trackers achieve improved tracking accuracy and better tracking stability in scenarios such as fast movement and motion blur.
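The abstract does not give the exact mapping from motion state to learning rate, so the following is only a minimal Python sketch of the general idea. It assumes a linear interpolation update of the template and a clipped linear mapping in which a larger center displacement yields a larger learning rate; the function names, the direction of that mapping, and the constants `lr_min`, `lr_max`, and `max_disp` are hypothetical and are not taken from the paper.

```python
import math


def motion_state(prev_center, curr_center):
    """Euclidean displacement of the target center between two frames."""
    dx = curr_center[0] - prev_center[0]
    dy = curr_center[1] - prev_center[1]
    return math.hypot(dx, dy)


def dynamic_learning_rate(displacement, lr_min=0.005, lr_max=0.05, max_disp=20.0):
    """Map displacement to a learning rate (hypothetical clipped linear rule)."""
    ratio = min(max(displacement / max_disp, 0.0), 1.0)
    return lr_min + (lr_max - lr_min) * ratio


def update_template(template, sample, lr):
    """Blend the historical template with the new sample using weight lr."""
    return [(1.0 - lr) * t + lr * s for t, s in zip(template, sample)]


# Usage: a target that moved 5 pixels gets a rate between lr_min and lr_max.
disp = motion_state((10.0, 10.0), (13.0, 14.0))
lr = dynamic_learning_rate(disp)
new_template = update_template([1.0, 0.0], [0.0, 1.0], lr)
```

With a fixed rate, `lr` would be a constant; here it is recomputed every frame from the observed displacement.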

Article
Attention and Pixel Matching in RGB-T Object Tracking
Mathematics 2023, 11(7), 1646; https://doi.org/10.3390/math11071646 - 29 Mar 2023
Viewed by 472
Abstract
Visual object tracking using visible light images and thermal infrared images, named RGB-T tracking, has recently attracted increasing attention in the tracking community. Deep neural network-based methods, which have become the most popular RGB-T trackers, still have to balance robustness against computation speed. A novel tracker with a Siamese architecture is proposed to obtain accurate object locations and meet real-time requirements. Firstly, a multi-modal weight penalty module is designed to assign different weights to the RGB and thermal infrared features. Secondly, a new pixel matching module is proposed to calculate the similarity between each pixel of the search features and the template features, which avoids bringing in the excessive background information introduced by the regular cross-correlation operation. Finally, an improved anchor-free bounding box prediction network is put forward to further reduce the interference of background information. The experimental results on the standard RGB-T tracking benchmark datasets show that the proposed method achieves better precision and success rate at a speed of over 34 frames per second, which satisfies the real-time tracking requirement.

Article
PACR: Pixel Attention in Classification and Regression for Visual Object Tracking
Mathematics 2023, 11(6), 1406; https://doi.org/10.3390/math11061406 - 14 Mar 2023
Viewed by 459
Abstract
Anchor-free trackers have achieved remarkable performance in single visual object tracking in recent years. Most anchor-free trackers treat the rectangular fields close to the target center as positive samples in the training phase, while they always use the maximum of the corresponding map to determine the location of the target in the tracking phase. This makes the tracker inconsistent between the training and tracking phases. To solve this problem, we propose a pixel-attention module (PAM), which ensures the consistency of the training and tracking phases through a self-attention module. Moreover, we put forward a new refined branch, named the Acc branch, to inherit the benefit of the PAM. The score of the Acc branch can tune the classification and regression of the tracking target more precisely. We conduct extensive experiments on challenging benchmarks such as VOT2020, UAV123, DTB70, OTB100, and the large-scale benchmark LaSOT. Compared with other anchor-free trackers, our tracker achieves excellent performance on small-scale datasets. On UAV benchmarks such as UAV123 and DTB70, the precision of our tracker increases by 4.3% and 1.8%, respectively, compared with the SOTA anchor-free trackers.

Article
A High-Precision Two-Stage Legal Judgment Summarization
Mathematics 2023, 11(6), 1320; https://doi.org/10.3390/math11061320 - 09 Mar 2023
Viewed by 498
Abstract
Legal judgments are generally very long, and relevant information is often scattered throughout the text. To summarize a legal judgment, it is crucial to capture important, relevant information comprehensively from a lengthy text. Existing abstractive-summarization models based on pre-trained language models restrict the length of the input text. Another concern is that the generated summaries are not well integrated with the technical terms and specific topics of the legal judgment. In this paper, we used raw legal judgments as information of different granularities and proposed a two-stage text-summarization model to handle these different granularities of information. Specifically, we treated the legal judgments as a sequence of sentences and selected key sentence sets from the full texts as an input corpus for summary generation. In addition, we extracted keywords related to technical terms and specific topics in the legal texts and introduced them into the summary-generation model as an attention mechanism. The experimental results on the CAIL2020 and the LCRD datasets showed that our model achieved an overall 0.19–0.41 improvement in its ROUGE score compared to the baseline models. Further analysis also showed that our method could comprehensively capture essential and relevant information from lengthy legal texts and generate better legal judgment summaries.

Article
A Mental Workload Classification Method Based on GCN Modified by Squeeze-and-Excitation Residual
Mathematics 2023, 11(5), 1189; https://doi.org/10.3390/math11051189 - 28 Feb 2023
Viewed by 487
Abstract
In some complex labor production and human–machine interactions, such as subway driving, the level of mental workload (MW) of operators is monitored at all times to ensure the efficient and rapid completion of work, the personal safety of staff, and the integrity of operating equipment. In existing machine learning-based MW classification methods, the association information between neurons in different regions is rarely considered. To solve this problem, a graph convolution network based on the squeeze-and-excitation (SE) block is proposed. For a raw electroencephalogram (EEG) signal, principal component analysis (PCA) dimensionality reduction is carried out. After that, combined with the spatial distribution of the brain electrodes, the dimensionality-reduced data are converted to graph-structured data carrying association information between neurons in different regions. In addition, we use a graph convolution neural network (GCN) modified by an SE residual to obtain the final classification results. Here, the SE block is introduced to adaptively recalibrate channel-wise feature responses by explicitly modelling interdependencies between channels, and the residual connection eases the training of the network. To evaluate the performance of the proposed method, we carry out experiments using the raw EEG signals of 10 healthy subjects, which are collected using the MATB-II platform based on multi-task aerial context manipulation. The experimental results verify the structural reasonableness and superior performance of the proposed method. In short, the proposed GCN modified by the SE residual is a workable approach to mental workload classification.

Article
Improving the Performance of RODNet for MMW Radar Target Detection in Dense Pedestrian Scene
Mathematics 2023, 11(2), 361; https://doi.org/10.3390/math11020361 - 10 Jan 2023
Cited by 1 | Viewed by 732
Abstract
In the field of autonomous driving, millimeter-wave (MMW) radar is often used as a supplementary sensor to other types of sensors, such as optical sensors, in severe weather conditions to provide target-detection services. RODNet (A Real-Time Radar Object-Detection Network) is one of the most widely used MMW radar range–azimuth (RA) image sequence target-detection algorithms based on Convolutional Neural Networks (CNNs). However, RODNet adopts an object-location similarity (OLS) detection method that is independent of the number of targets to obtain the final target detections from the predicted confidence map. Therefore, it performs poorly in terms of the missed detection ratio in dense pedestrian scenes. Based on an analysis of the distribution characteristics of the predicted confidence map, we propose a new generative model-based target-location detection algorithm to improve the performance of RODNet in dense pedestrian scenes. The confidence value and space distribution predicted by RODNet are analyzed in this paper. The analysis shows that the space distribution is more robust than the value distribution for clustering. This is useful in selecting a clustering method to estimate the clustering centers of multiple targets at close range under the effects of distributed targets, radar measurement variance, and multipath scattering. Another key idea of this algorithm is the derivation of a Gaussian Mixture Model with target number (GMM-TN) for generating the likelihood probability distributions of different target number assumptions. Furthermore, a minimum Kullback–Leibler (KL) divergence target number estimation scheme is proposed that combines K-means clustering with the GMM-TN model. Using the CRUW dataset, a target-detection experiment on a dense pedestrian scene is carried out, and the confidence distribution under typical hidden variable conditions is analyzed. The effectiveness of the improved algorithm is verified: the Average Precision (AP) is improved by 29% and the Average Recall (AR) is improved by 36%.

Article
High-Cardinality Categorical Attributes and Credit Card Fraud Detection
Mathematics 2022, 10(20), 3808; https://doi.org/10.3390/math10203808 - 15 Oct 2022
Cited by 2 | Viewed by 843
Abstract
Credit card transactions may contain categorical attributes with large domains, involving up to hundreds of possible values, also known as high-cardinality attributes. The inclusion of such attributes makes analysis harder, leading to poorer generalization and higher resource usage. A common practice is, therefore, to ignore such attributes by removing them, at the cost of wasting the information they provide. In contrast, this paper reports our findings on the positive impacts of using high-cardinality attributes in credit card fraud detection. We present a new algorithm for domain reduction that preserves fraud-detection capabilities. Experiments applying a deep feedforward neural network to real datasets from a major Brazilian financial institution have shown that, when measured by the F-1 metric, the inclusion of such attributes does improve fraud-detection quality. As a main contribution, the proposed algorithm was able to reduce attribute cardinality, improving the training times of a model while preserving its predictive capabilities.

Article
Masked Autoencoder for Pre-Training on 3D Point Cloud Object Detection
Mathematics 2022, 10(19), 3549; https://doi.org/10.3390/math10193549 - 28 Sep 2022
Viewed by 1517
Abstract
In autonomous driving, the 3D LiDAR (Light Detection and Ranging) point cloud data of the target are missing due to long distance and occlusion, which makes object detection more difficult. This paper proposes the Point Cloud Masked Autoencoder (PCMAE), which can provide pre-training for most voxel-based point cloud object detection algorithms. PCMAE improves the feature representation ability of the 3D backbone for long-distance and occluded objects through self-supervised learning. First, a point cloud masking strategy for autonomous driving scenes, named PC-Mask, is proposed. It is used to simulate the loss of point cloud data due to occlusion and distance in autonomous driving scenarios. Then, a symmetrical encoder–decoder architecture is designed for pre-training. The encoder is used to extract the high-level features of the point cloud after PC-Mask, and the decoder is used to reconstruct the complete point cloud. Finally, the pre-training method proposed in this paper is applied to the SECOND (Sparsely Embedded Convolutional Detection) and Part-A2-Net (Part-aware and Aggregate Neural Network) object detection algorithms. The experimental results show that our method can speed up model convergence and improve detection accuracy, especially for long-distance and occluded objects.

Article
Smoothed Quantile Regression with Factor-Augmented Regularized Variable Selection for High Correlated Data
Mathematics 2022, 10(16), 2935; https://doi.org/10.3390/math10162935 - 15 Aug 2022
Viewed by 695
Abstract
This paper studies variable selection for data sets that have heavy-tailed distributions and high correlations within blocks of covariates. Motivated by econometric and financial studies, we consider using quantile regression to model heavy-tailed data. Considering the case where the covariates are high dimensional and there are high correlations within blocks, we use a latent factor model to reduce the correlations between the covariates and use conquer to obtain the estimators of the quantile regression coefficients, and we propose a consistent strategy named factor-augmented regularized variable selection for quantile regression (Farvsqr). By principal component analysis, we obtain the latent factors and idiosyncratic components; we then use both as predictors instead of the highly correlated covariates. Farvsqr transforms the problem from variable selection with highly correlated covariates to that with weakly correlated ones for quantile regression. Variable selection consistency is obtained under mild conditions. A simulation study and a real data application demonstrate that our method performs better than the common regularized M-estimation LASSO.
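For readers unfamiliar with quantile regression, the standard check (pinball) loss that quantile estimators minimize can be sketched as follows. This shows only the textbook, unsmoothed loss; it is not the smoothed conquer estimator or the Farvsqr procedure from the paper, and the function names are illustrative.

```python
def check_loss(residual, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def mean_check_loss(y_true, y_pred, tau):
    """Average check loss of predictions at quantile level tau."""
    residuals = [y - p for y, p in zip(y_true, y_pred)]
    return sum(check_loss(r, tau) for r in residuals) / len(residuals)


# Usage: at tau = 0.9, under-prediction (positive residual) costs more
# than over-prediction of the same size.
under = check_loss(2.0, 0.9)
over = check_loss(-2.0, 0.9)
```

Because the loss grows only linearly in the residual, it is less sensitive to heavy tails than squared error, which is the motivation the abstract gives for using quantile regression.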

Article
Prediction Method of Human Fatigue in an Artificial Atmospheric Environment Based on Dynamic Bayesian Network
Mathematics 2022, 10(15), 2778; https://doi.org/10.3390/math10152778 - 05 Aug 2022
Viewed by 713
Abstract
A state of fatigue usually slows the reactions of the human body and mind. It is an important factor causing a significant decline in the working ability of workers, an increase in error rate, and even major accidents. It would have an even more negative impact in an artificial atmospheric environment. The effective prediction of fatigue can help improve working efficiency and reduce the occurrence of accidents. In this paper, a prediction method for human fatigue in an artificial atmospheric environment was established based on a dynamic Bayesian network, combining as many as eight input parameters covering the causes and effects of human fatigue in order to achieve a relatively comprehensive and accurate prediction. This fatigue prediction method was checked against experimental results. The results indicate that the established prediction method could provide a relatively reliable way to predict a worker's fatigue state in an artificial atmospheric working environment.

Article
Cloudformer V2: Set Prior Prediction and Binary Mask Weighted Network for Cloud Detection
Mathematics 2022, 10(15), 2710; https://doi.org/10.3390/math10152710 - 31 Jul 2022
Cited by 2 | Viewed by 784
Abstract
Cloud detection is an essential step in optical remote sensing data processing. With the development of deep learning technology, cloud detection methods have made remarkable progress. In particular, researchers have started to introduce the Transformer into cloud detection tasks due to its excellent performance in image semantic segmentation. However, current Transformer-based methods suffer from training difficulty and low detection accuracy for small clouds. To solve these problems, this paper proposes Cloudformer V2, based on the previously proposed Cloudformer. To address the training difficulty, Cloudformer V2 uses a Set Attention Block to extract intermediate features as a Set Prior Prediction that participates in supervision, which enables the model to converge faster. For the detection of small clouds, Cloudformer V2 decodes the features with a multi-scale Transformer decoder, which uses multi-resolution features to improve the modeling accuracy. In addition, a binary mask weighted loss function (BW Loss) is designed to construct weights by counting pixels classified as clouds, thus guiding the network to focus on the features of small clouds and improving the overall detection accuracy. Cloudformer V2 is evaluated on a dataset from the GF-1 satellite and shows excellent performance.

Article
Sensitive Channel Selection for Mental Workload Classification
Mathematics 2022, 10(13), 2266; https://doi.org/10.3390/math10132266 - 29 Jun 2022
Viewed by 962
Abstract
Mental workload (MW) assessment has been widely studied in various human–machine interaction tasks. Existing research on MW classification mostly uses non-invasive electroencephalography (EEG) caps to collect EEG signals and identify MW levels. However, the activation region of the brain stimulated by MW tasks is not the same for every subject, so it may be inappropriate to use EEG signals from all electrode channels to identify MW. In this paper, an EEG rhythm energy heatmap is first established to visually show how the energy of four EEG rhythms changes with time, EEG channel, and MW level. It can be concluded from the presented heatmaps that this trend varies with subjects, rhythms, and channels. Based on this analysis, a double threshold method is proposed to select sensitive channels for MW assessment. The EEG signals of the personalized selected channels, named positive sensitive channels (PSCs) and negative sensitive channels (NSCs), are used for MW classification using the Support Vector Machine (SVM) algorithm. The results show that the selection of personalized sensitive channels generally contributes to improving the performance of MW classification.

Article
A Novel Effective Vehicle Detection Method Based on Swin Transformer in Hazy Scenes
Mathematics 2022, 10(13), 2199; https://doi.org/10.3390/math10132199 - 23 Jun 2022
Cited by 5 | Viewed by 1304
Abstract
The ability of intelligent vehicles to perceive the environment accurately in bad weather is an important research topic in many practical applications, such as smart cities and unmanned driving. In order to improve vehicle environment perception in real hazy scenes, we propose an effective detection algorithm based on the Swin Transformer for hazy vehicle detection. This algorithm has two aspects. First, to address the difficulty of extracting features from hazy images with poor visibility, a dehazing network is designed to obtain high-quality haze-free output through encoding and decoding using Swin Transformer blocks. Second, to address the difficulty of vehicle detection in hazy images, a new end-to-end vehicle detection model for hazy days is constructed by fusing the dehazing module and the Swin Transformer detection module. In the training stage, the self-made dataset Haze-Car is used, and the haze detection model parameters are initialized from the dehazing model and Swin-T through transfer learning. The final haze detection model is then obtained by fine-tuning. Through the joint learning of dehazing and object detection and comparative experiments on the self-made real hazy image dataset, it can be seen that the detection performance of the model in real-world scenes is improved by 12.5%.

Article
Transmission Line Object Detection Method Based on Label Adaptive Allocation
Mathematics 2022, 10(12), 2150; https://doi.org/10.3390/math10122150 - 20 Jun 2022
Cited by 1 | Viewed by 1120
Abstract
Inspection of the integrity of components and connecting parts is an important task for maintaining the safe and stable operation of transmission lines. Because the scale difference of the auxiliary components in a connecting part is large and the background environment of the object is complex, a one-stage object detection method based on enhanced real feature information and label adaptive allocation is proposed in this study. Based on the anchor-free detection algorithm FCOS, this method is optimized by expanding the real feature information of the adjacent feature layer fusion and the semantic information of the deep feature layer, as well as adaptively assigning labels through the idea of pixel-by-pixel detection. In addition, the grading ring images in the original data are sliced to increase the proportion of bolts in the dataset, which makes the appearance features of small objects clearer and reduces the difficulty of detection. Experimental results show that this method can eliminate the background interference in the GT (ground truth) as much as possible in the object detection process and improve the detection accuracy for objects with a narrow shape and small size. The evaluation index AP (average precision) increased by 4.1%. This further improvement in detection accuracy lays a foundation for the realization of efficient real-time patrol inspection.

Article
HA-RoadFormer: Hybrid Attention Transformer with Multi-Branch for Large-Scale High-Resolution Dense Road Segmentation
Mathematics 2022, 10(11), 1915; https://doi.org/10.3390/math10111915 - 02 Jun 2022
Cited by 3 | Viewed by 1034
Abstract
Road segmentation is one of the essential tasks in remote sensing. Large-scale high-resolution remote sensing images have larger pixel sizes than natural images, while existing Transformer-based models incur a high computational cost due to their quadratic complexity, leading to longer model training and inference times. Inspired by the long text Transformer model, this paper proposes a novel hybrid attention mechanism to improve the inference speed of the model. By calculating only several diagonals and random blocks of the attention matrix, hybrid attention achieves linear time complexity in the length of the token sequence. Using the superposition of adjacent and random attention, hybrid attention introduces an inductive bias similar to that of convolutional neural networks (CNNs) and retains the ability to capture long-distance dependence. In addition, dense road segmentation results for remote sensing images still suffer from insufficient continuity. Multi-scale feature representation is an effective means of addressing this in CNN-based networks. Inspired by this, we propose a multi-scale patch embedding module, which divides images into patches of different scales to obtain coarse-to-fine feature representations. Experiments on the Massachusetts dataset show that the proposed HA-RoadFormer effectively preserves the integrity of the road segmentation results, achieving a higher road segmentation Intersection over Union (IoU) of 67.36% than other state-of-the-art (SOTA) methods. At the same time, the inference speed is also greatly improved compared with other Transformer-based models.
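To illustrate why attending only to a few diagonals plus some random positions gives cost linear in the sequence length, here is a minimal Python sketch of such a sparsity pattern. It is a simplification under stated assumptions: the paper describes random blocks, whereas this sketch samples individual random key positions per query, and the function name and the parameters `band`, `n_random`, and `seed` are hypothetical.

```python
import random


def hybrid_attention_mask(n, band=1, n_random=2, seed=0):
    """Boolean n x n mask: True where query i may attend to key j."""
    rng = random.Random(seed)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        # Adjacent attention: the main diagonal and `band` diagonals each side.
        for j in range(max(0, i - band), min(n, i + band + 1)):
            mask[i][j] = True
        # Random attention: a few keys sampled anywhere in the sequence.
        for j in rng.sample(range(n), min(n_random, n)):
            mask[i][j] = True
    return mask


def attended_pairs(mask):
    """Number of query-key pairs that would actually be computed."""
    return sum(sum(row) for row in mask)


# Usage: each query attends to at most 2 * band + 1 + n_random keys,
# so the total grows linearly with n instead of quadratically.
mask = hybrid_attention_mask(100)
pairs = attended_pairs(mask)
```

The banded part plays the role of the local, convolution-like bias mentioned in the abstract, while the random part keeps a path to distant tokens.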

Article
Building Damage Assessment Based on Siamese Hierarchical Transformer Framework
Mathematics 2022, 10(11), 1898; https://doi.org/10.3390/math10111898 - 01 Jun 2022
Cited by 3 | Viewed by 1394
Abstract
The rapid and accurate damage assessment of buildings plays a critical role in disaster response. Based on pairs of pre- and post-disaster remote sensing images, effective building damage level assessment can be conducted. However, most existing methods are based on convolutional neural networks (CNNs), which have limited ability to learn global context. An attention mechanism helps ameliorate this problem, and hierarchical Transformers, with their strong global modeling capability, show great potential in the remote sensing field. In this paper, we propose a novel two-stage damage assessment framework called SDAFormer, which embeds a symmetric hierarchical Transformer into a siamese U-Net-like network. In the first stage, the pre-disaster image is fed into a segmentation network for building localization. In the second stage, a two-branch damage classification network is established based on weights shared from the first stage. Then, pre- and post-disaster images are delivered to the network separately for damage assessment. Moreover, a spatial fusion module is designed to improve feature representation capability by building pixel-level correlation, which establishes spatial information in Swin Transformer blocks. The proposed framework achieves significant improvement on xBD, a large-scale building damage assessment dataset.

Article
Mental Workload Classification Method Based on EEG Cross-Session Subspace Alignment
Mathematics 2022, 10(11), 1875; https://doi.org/10.3390/math10111875 - 30 May 2022
Cited by 4 | Viewed by 1132
Abstract
Electroencephalogram (EEG) signals are sensitive to the level of Mental Workload (MW). However, the random non-stationarity of EEG signals leads to low accuracy and poor generalization ability in cross-session MW classification. To address the differing marginal distributions of EEG signals across time periods, an MW classification method based on EEG Cross-Session Subspace Alignment (CSSA) is presented to identify the level of MW induced in visual manipulation tasks. The Independent Component Analysis (ICA) method is used to obtain the Independent Components (ICs) of labeled and unlabeled EEG signals. The energy features of the ICs are extracted as the source and target domains, respectively. The marginal distributions of the source subspace base vectors are aligned with those of the target subspace base vectors through a linear mapping. The Kullback–Leibler (KL) divergences between the two domains are calculated to select the transformed source subspace base vectors that are approximately similar to the target. The energy features in the selected vectors are used to train a new Support Vector Machine (SVM) classifier. The resulting classifier performs MW classification on cross-session EEG signals with good accuracy.
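
For orientation, the linear alignment and the KL divergence mentioned in this abstract can be written in their standard textbook forms, shown below. The paper's exact formulation may differ; the symbols X_S, X_T, M, P, and Q are notation assumed here, not taken from the abstract.

```latex
% Assumed notation: X_S, X_T \in \mathbb{R}^{D \times d} hold orthonormal
% basis vectors of the source and target subspaces as columns.
\[
M^{*} = \arg\min_{M} \lVert X_S M - X_T \rVert_F^{2} = X_S^{\top} X_T,
\qquad
X_a = X_S M^{*}
\]
% KL divergence between discrete distributions P (source) and Q (target).
\[
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{i} P(i) \log \frac{P(i)}{Q(i)}
\]
```

Here X_a is the source basis after alignment; a smaller KL divergence indicates that an aligned source vector is closer in distribution to the target.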

Article
Multi-View Cosine Similarity Learning with Application to Face Verification
Mathematics 2022, 10(11), 1800; https://doi.org/10.3390/math10111800 - 25 May 2022
Cited by 3 | Viewed by 2034
Abstract
An instance can be easily depicted from different views in pattern recognition, and it is desirable to exploit the information of these views to complement each other. However, most metric learning and similarity learning methods developed over the past two decades target single-view feature representations and are not suited to handling multi-view data directly. In this paper, we propose a multi-view cosine similarity learning (MVCSL) approach to efficiently utilize multi-view data and apply it to face verification. The proposed MVCSL method leverages both the common information of multi-view data and the private information of each view: it jointly learns a cosine similarity for each view in the transformed subspace and integrates the cosine similarities of all the views in a unified framework. Specifically, MVCSL enforces the constraint that the joint cosine similarity of positive pairs is greater than that of negative pairs. Experiments on fine-grained face verification and kinship verification tasks demonstrate the superiority of our MVCSL approach.
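
The idea of a joint cosine similarity over several views can be sketched in a few lines of Python. This is an illustration only, not the MVCSL algorithm itself: the projection matrices are taken as given rather than learned, the views are combined by an unweighted mean, and all function names and the margin parameter are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def project(matrix, x):
    """Apply a linear transform (list of rows) to vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


def joint_cosine(views_a, views_b, projections):
    """Mean of per-view cosine similarities in each transformed subspace.

    Assumption: views are weighted equally; the paper may combine them
    differently.
    """
    sims = [cosine(project(W, a), project(W, b))
            for a, b, W in zip(views_a, views_b, projections)]
    return sum(sims) / len(sims)


def satisfies_constraint(pos_sim, neg_sim, margin=0.0):
    """Check that a positive pair scores above a negative pair."""
    return pos_sim > neg_sim + margin
```

In a learning setting, the projection matrices would be optimized so that satisfies_constraint holds for as many positive/negative pair combinations as possible.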

Planned Papers

The list below contains only planned manuscripts. Some of these manuscripts have not yet been received by the Editorial Office. Papers submitted to MDPI journals are subject to peer review.

Title: A Zoom-net-Based Deep Learning Detector for Ship Detection
Authors: Yongsheng Zhou 1, Hanchao Liu 1, Fei Ma 1,* and Fan Zhang 1
Affiliation: 1 College of Information Science and Technology, Beijing University of Chemical Technology

Title: An edge computing based digital twin on spaceborne SAR imaging and image interpretation
Authors: Fei Ma 1, Jie Bao 1, Xiaokun Sun 1,*, Yating Zhou 1 and Fan Zhang 1
Affiliation: 1 College of Information Science and Technology, Beijing University of Chemical Technology
