New Trends in Computer Vision, Deep Learning and Artificial Intelligence

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: 30 April 2024 | Viewed by 5309

Special Issue Editors


Dr. Xiaojiang Peng
Guest Editor
College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China
Interests: computer vision; deep learning

Prof. Dr. Linlin Shen
Guest Editor
School of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Interests: medical image analysis; deep learning

Prof. Dr. Yang You
Guest Editor
School of Computing, National University of Singapore, Singapore 119077, Singapore
Interests: machine learning; high-performance computing; parallel and distributed systems; AI applications

Special Issue Information

Dear Colleagues,

In the past decade, deep learning algorithms have come to dominate speech, computer vision, and natural language processing, and AI applications are everywhere in our daily lives. Given enough labeled data, a well-trained AI system can perform much better than humans on easy, repetitive, or determinate tasks such as image recognition, face recognition, and translation. Identifying ways to extend AI capabilities to tasks with limited data and to other, more complex tasks will be particularly important for the next decade.

The purpose of this Special Issue is to gather a collection of articles reflecting new trends in computer vision, deep learning, and artificial intelligence. Topics include but are not limited to the following:

  1. Deep learning with limited data;
  2. Unsupervised deep learning technology;
  3. Deep learning for 3D vision;
  4. Neural rendering and its applications;
  5. Deep learning for efficient detection and segmentation;
  6. Deep learning for video understanding;
  7. Deep learning for language-vision tasks;
  8. Deep learning for visual affective computing;
  9. Deep learning for medical image analysis;
  10. Deep learning training acceleration;
  11. Big AI models;
  12. Industrial AI applications.

Dr. Xiaojiang Peng
Prof. Dr. Linlin Shen
Prof. Dr. Yang You
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • computer vision
  • video understanding
  • 3D vision
  • neural rendering
  • visual affective computing

Published Papers (5 papers)


Research

18 pages, 1556 KiB  
Article
Boundary-Match U-Shaped Temporal Convolutional Network for Vulgar Action Segmentation
by Zhengwei Shen, Ran Xu, Yongquan Zhang, Feiwei Qin, Ruiquan Ge, Changmiao Wang and Masahiro Toyoura
Mathematics 2024, 12(6), 899; https://doi.org/10.3390/math12060899 - 18 Mar 2024
Viewed by 446
Abstract
The advent of deep learning has provided solutions to many challenges posed by the Internet. However, efficient localization and recognition of vulgar segments within videos remain formidable tasks. This difficulty arises from the blurring of spatial features in vulgar actions, which can render them indistinguishable from general actions. Furthermore, issues of boundary ambiguity and over-segmentation complicate the segmentation of vulgar actions. To address these issues, we present the Boundary-Match U-shaped Temporal Convolutional Network (BMUTCN), a novel approach for the segmentation of vulgar actions. The BMUTCN employs a U-shaped architecture within an encoder–decoder temporal convolutional network to bolster feature recognition by leveraging the context of the video. Additionally, we introduce a boundary-match map that fuses action boundary information with greater precision for frames that exhibit ambiguous boundaries. Moreover, we propose an adaptive internal block suppression technique, which substantially mitigates over-segmentation errors while preserving accuracy. Our methodology, tested across several public datasets as well as a bespoke vulgar dataset, has demonstrated state-of-the-art performance on the latter.
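The temporal-convolution building block that such segmentation networks stack can be sketched as follows (a generic dilated 1-D convolution in NumPy, for illustration only; it is not the authors' implementation):

```python
import numpy as np

def dilated_conv1d(x, w, dilation=1):
    """One temporal-convolution layer over per-frame features.
    x: (T, c_in) frame features; w: (k, c_in, c_out) kernel.
    Symmetric zero padding keeps the output length at T, so layers
    with growing dilation see ever-wider temporal context."""
    T, k = x.shape[0], w.shape[0]
    pad = dilation * (k - 1) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros((T, w.shape[2]))
    for t in range(T):
        for i in range(k):
            out[t] += xp[t + i * dilation] @ w[i]
    return out
```

Stacking such layers with increasing dilation (1, 2, 4, ...) inside an encoder–decoder gives each frame the long-range context needed to separate visually similar actions.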

16 pages, 708 KiB  
Article
Leveraging Chain-of-Thought to Enhance Stance Detection with Prompt-Tuning
by Daijun Ding, Xianghua Fu, Xiaojiang Peng, Xiaomao Fan, Hu Huang and Bowen Zhang
Mathematics 2024, 12(4), 568; https://doi.org/10.3390/math12040568 - 13 Feb 2024
Viewed by 719
Abstract
Investigating public attitudes towards social media is crucial for opinion mining systems to gain valuable insights. Stance detection, which aims to discern the attitude expressed in an opinionated text towards a specific target, is a fundamental task in opinion mining. Conventional approaches mainly focus on sentence-level classification techniques. Recent research has shown that the integration of background knowledge can significantly improve stance detection performance. Despite the significant improvement achieved by knowledge-enhanced methods, applying these techniques in real-world scenarios remains challenging for several reasons. Firstly, existing methods often require the use of complex attention mechanisms to filter out noise and extract relevant background knowledge, which involves significant annotation efforts. Secondly, knowledge fusion mechanisms typically rely on fine-tuning, which can introduce a gap between the pre-training phase of pre-trained language models (PLMs) and the downstream stance detection tasks, leading to poor prediction accuracy from the PLMs. To address these limitations, we propose a novel prompt-based stance detection method that leverages the knowledge acquired using the chain-of-thought method, which we refer to as PSDCOT. The proposed approach consists of two stages. The first stage is knowledge extraction, where instruction questions are constructed to elicit background knowledge from a VLPLM. The second stage is the multi-prompt learning network (M-PLN) for knowledge fusion, which fuses the background knowledge into the prompt-learning framework. We evaluated the performance of PSDCOT on publicly available benchmark datasets to assess its effectiveness in improving stance detection performance. The results demonstrate that the proposed method achieves state-of-the-art results in in-domain, cross-target, and zero-shot learning settings.
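The two-stage prompting pipeline described above can be sketched in a few lines (the templates, function names, and example text are illustrative assumptions, not the paper's actual prompts):

```python
def build_knowledge_prompt(text: str, target: str) -> str:
    """Stage 1: an instruction question eliciting background knowledge
    from a language model via chain-of-thought."""
    return (
        f"Text: {text}\n"
        f"What background knowledge is needed to judge the attitude "
        f"expressed toward '{target}'? Think step by step."
    )

def build_stance_prompt(text: str, target: str, knowledge: str) -> str:
    """Stage 2: fuse the elicited knowledge into a cloze-style prompt
    whose [MASK] slot a PLM fills with a stance verbalizer."""
    return (
        f"Knowledge: {knowledge}\n"
        f"Text: {text}\n"
        f"The attitude toward {target} is [MASK]."
    )

p1 = build_knowledge_prompt("Wind farms ruin the landscape.", "renewable energy")
p2 = build_stance_prompt("Wind farms ruin the landscape.", "renewable energy",
                         "Wind farms are a form of renewable energy.")
```

Because the downstream task is phrased as masked-token prediction rather than a new classification head, it stays close to the PLM's pre-training objective, which is the gap the abstract describes.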

17 pages, 2693 KiB  
Article
SVSeq2Seq: An Efficient Computational Method for State Vectors in Sequence-to-Sequence Architecture Forecasting
by Guoqiang Sun, Xiaoyan Qi, Qiang Zhao, Wei Wang and Yujun Li
Mathematics 2024, 12(2), 265; https://doi.org/10.3390/math12020265 - 13 Jan 2024
Viewed by 554
Abstract
This study proposes an efficient method for computing State Vectors in Sequence-to-Sequence (SVSeq2Seq) architecture to improve the performance of sequence data forecasting, which associates each element with other elements instead of relying only on nearby elements. First, the dependency between two elements is adaptively captured by calculating the relative importance between hidden layers. Second, tensor train decomposition is used to address the curse of dimensionality. Third, we further select seven instantiated baseline models for data prediction and compare them with our proposed model on six real-world datasets. The results show that the Mean Square Error (MSE) and Mean Absolute Error (MAE) of our SVSeq2Seq model exhibit significant advantages over the other seven baseline models in predicting three of the datasets, i.e., weather, electricity, and PEMS, with MSE/MAE values as low as 0.259/0.260, 0.186/0.285 and 0.113/0.222, respectively. Furthermore, the ablation study demonstrates that the SVSeq2Seq model possesses distinct advantages in sequential forecasting tasks: replacing SVSeq2Seq with LPRcode and NMTcode increased the MSE by factors of 18.05 and 10.11 and the MAE by factors of 16.54 and 9.8, respectively. In comparative experiments with support vector machines (SVM) and random forest (RF), the performance of the SVSeq2Seq model is improved by 56.88 times on the weather dataset and 73.78 times on the electricity dataset under the MSE metric. The above experimental results demonstrate both the exceptional rationality and versatility of the SVSeq2Seq model for data forecasting.
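Tensor-train decomposition, used here against the curse of dimensionality, factorizes an order-d tensor into d small three-way cores by sequential SVDs. A minimal NumPy sketch of the standard TT-SVD procedure (not the authors' code) looks like this:

```python
import numpy as np

def tt_decompose(tensor, max_rank):
    """Sequential-SVD tensor-train factorization: an order-d tensor
    becomes d cores of shape (r_{k-1}, n_k, r_k), with r_0 = r_d = 1."""
    shape = tensor.shape
    d = len(shape)
    cores, rank = [], 1
    mat = tensor.reshape(rank * shape[0], -1)
    for k in range(d - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))                     # truncate to max_rank
        cores.append(u[:, :r].reshape(rank, shape[k], r))
        rank = r
        mat = (np.diag(s[:r]) @ vt[:r]).reshape(rank * shape[k + 1], -1)
    cores.append(mat.reshape(rank, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the core chain back into a dense tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out[0, ..., 0]                             # drop the boundary ranks
```

Storage drops from the product of all mode sizes to a sum of small core sizes, which is the point of using it inside a forecasting architecture.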

14 pages, 960 KiB  
Article
OL-JCMSR: A Joint Coding Monitoring Strategy Recommendation Model Based on Operation Log
by Guoqiang Sun, Peng Xu, Man Guo, Hao Sun, Zhaochen Du, Yujun Li and Bin Zhou
Mathematics 2022, 10(13), 2292; https://doi.org/10.3390/math10132292 - 30 Jun 2022
Cited by 1 | Viewed by 980
Abstract
A surveillance system with more than hundreds of cameras and much fewer monitors strongly relies on manual scheduling and inspections from monitoring personnel. A monitoring method which improves the surveillance performance by analyzing and learning from a large amount of manual operation logs is proposed in this paper. Compared to fixed rules or existing computer-vision methods, the proposed method can more effectively learn from the operators' behaviors and incorporate their intentions into the monitoring strategy. To the best of our knowledge, this method is the first to apply a monitoring-strategy recommendation model containing a global encoder and a local encoder in monitoring systems. The local encoder can adaptively select important items in the operating sequence to capture the main purpose of the operator, while the global encoder is used to summarize the behavior of the entire sequence. Two experiments are conducted on two datasets. Compared with att-RNN and att-GRU, the joint coding model in experiment 1 improves the Recall@20 by 9.4% and 4.6%, respectively, and improves the MRR@20 by 5.49% and 3.86%, respectively. In experiment 2, compared with att-RNN and att-GRU, the joint coding model improves by 11.8% and 6.2% on Recall@20, and improves by 7.02% and 5.16% on MRR@20, respectively. The results illustrate the effectiveness of our model in monitoring systems.
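The global/local encoder combination described above resembles attention-based session recommendation; a hedged NumPy sketch of the idea follows (all shapes, weights, and names are illustrative stand-ins, not the paper's model):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def joint_encode(hidden, W):
    """hidden: (T, d) encoder states for one operation-log sequence.
    Global encoder: the final state summarizes the whole sequence.
    Local encoder: attention over all states, conditioned on the
    final state, picks out the operations that reveal intent."""
    c_global = hidden[-1]
    alpha = softmax(hidden @ W @ c_global)   # (T,) attention weights
    c_local = alpha @ hidden                 # (d,) weighted summary
    return np.concatenate([c_global, c_local])

T, d, n_cameras = 6, 8, 20
hidden = rng.normal(size=(T, d))
W = rng.normal(size=(d, d))
item_emb = rng.normal(size=(n_cameras, 2 * d))
scores = item_emb @ joint_encode(hidden, W)  # one score per camera
```

Ranking cameras by `scores` would yield the recommended monitoring targets for the current operator context.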

20 pages, 3586 KiB  
Article
A Joint Learning Model to Extract Entities and Relations for Chinese Literature Based on Self-Attention
by Li-Xin Liang, Lin Lin, E Lin, Wu-Shao Wen and Guo-Yan Huang
Mathematics 2022, 10(13), 2216; https://doi.org/10.3390/math10132216 - 24 Jun 2022
Cited by 3 | Viewed by 1318
Abstract
Extracting structured information from massive and heterogeneous text is a hot research topic in the field of natural language processing. It includes two key technologies: named entity recognition (NER) and relation extraction (RE). However, previous NER models give little consideration to the influence of mutual attention between words in the text on the prediction of entity labels, and there has been little research on how to more fully extract sentence information for relational classification. In addition, previous research treats NER and RE as a pipeline of two separated tasks, which neglects the connection between them, and is mainly focused on the English corpus. In this paper, based on the self-attention mechanism, bidirectional long short-term memory (BiLSTM) neural network and conditional random field (CRF) model, we put forth a Chinese NER method based on BiLSTM-Self-Attention-CRF and a RE method based on BiLSTM-Multilevel-Attention in the field of Chinese literature. In particular, considering the relationship between these two tasks in terms of word vector and context feature representation in the neural network model, we put forth a joint learning method for NER and RE tasks based on the same underlying module, which jointly updates the parameters of the shared module during the training of these two tasks. For performance evaluation, we make use of the largest Chinese dataset containing these two tasks. Experimental results show that the proposed independently trained NER and RE models achieve better performance than all previous methods, and our joint NER-RE training model outperforms the independently trained NER and RE models.
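The joint-learning idea, a shared underlying module whose parameters both tasks update, can be sketched with stand-in components (this NumPy sketch substitutes a single self-attention layer for the paper's BiLSTM encoder; all names and shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Shared underlying module: word embeddings plus a context encoder.
vocab, d, n_ner_tags, n_rel_types = 100, 16, 7, 9
shared_emb = rng.normal(size=(vocab, d))

def shared_encode(token_ids):
    """Stand-in for the shared encoder: both tasks read the same
    embeddings and context features, so gradients from either task
    would update this module during joint training."""
    h = shared_emb[token_ids]                       # (T, d)
    att = h @ h.T / np.sqrt(d)                      # self-attention scores
    w = np.exp(att - att.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ h                                    # contextualized (T, d)

ner_head = rng.normal(size=(d, n_ner_tags))         # per-token tag logits
rel_head = rng.normal(size=(2 * d, n_rel_types))    # entity-pair logits

tokens = rng.integers(0, vocab, size=12)
h = shared_encode(tokens)
ner_logits = h @ ner_head                           # NER task head
pair = np.concatenate([h[2], h[8]])                 # a candidate entity pair
rel_logits = pair @ rel_head                        # RE task head
```

The two task heads are separate, but both backpropagate into `shared_emb` and the encoder, which is the parameter-sharing scheme the abstract describes.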
