Deep Learning for Human-Centric Computer Vision

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: closed (25 January 2023) | Viewed by 11431

Special Issue Editors

1. Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China
2. Center of Materials Science and Optoelectronics Engineering School of Integrated Circuits, University of Chinese Academy of Sciences, Beijing 100049, China
Interests: pattern recognition; image classification; neural network; convolutional network; computer vision; object detection

Guest Editor
School of Artificial Intelligence and Computer Sciences, Jiangnan University, Wuxi 214122, China
Interests: pattern recognition; computational intelligence and its applications in intelligent healthcare

Guest Editor
1. Graduate School, Northern Arizona University, Flagstaff, AZ 86011, USA
2. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
Interests: neural networks; forecast modeling; deep learning

Special Issue Information

Dear Colleagues,

Deep learning has been widely used in a massive range of applications, such as natural language processing, computer vision, decision making, and data security. Among these, the field of computer vision has recently seen rapid development, accompanied by challenging tasks such as person re-identification, lip language recognition, makeup transfer, and face image editing. Most of these tasks belong to the category of human-centric computer vision, which has become an important research topic in academia. Several studies proposing deep learning-based approaches have achieved promising results on human-centric computer vision problems.

This Special Issue aims to curate original research and review articles in the fields of deep learning, image processing, and computer vision. Researchers from academia and practitioners from industry are invited to submit their innovative research on technical challenges and state-of-the-art findings related to human-centric computer vision. This Special Issue provides an opportunity to discuss current trends, challenges, and state-of-the-art solutions to the various problems in human-centric computer vision.

Topics:

  1. Face recognition;
  2. Expression recognition and affective computing;
  3. Face image editing and makeup transfer;
  4. Finger vein recognition;
  5. Gait recognition;
  6. Iris recognition;
  7. Human pose estimation;
  8. Pedestrian detection and tracking;
  9. Person re-identification;
  10. Gesture recognition;
  11. Lip language recognition;
  12. 3D vision applications for faces or human bodies.

Dr. Xin Ning
Prof. Dr. Yizhang Jiang
Dr. Weiwei Cai
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • computer vision

Published Papers (6 papers)


Research

11 pages, 2307 KiB  
Article
Liver CT Image Recognition Method Based on Capsule Network
by Qifan Wang, Aibin Chen and Yongfei Xue
Information 2023, 14(3), 183; https://doi.org/10.3390/info14030183 - 15 Mar 2023
Cited by 2 | Viewed by 1273
Abstract
The automatic recognition of CT (Computed Tomography) images of liver cancer is important for the diagnosis and treatment of early liver cancer. However, problems such as a single model structure and the loss of pooling-layer information arise when a traditional convolutional neural network is used to recognize liver cancer CT images. Therefore, this paper proposes an efficient method for liver CT image recognition based on the capsule network (CapsNet). Firstly, the liver CT images are preprocessed; during denoising, the traditional non-local means (NLM) denoising algorithm is optimized with a superpixel segmentation algorithm to better preserve image edge information. CapsNet is then used to recognize the liver CT images. The experimental results show that the average recognition rate of liver CT images reaches 92.9% with CapsNet, which is 5.3% higher than the traditional CNN model, indicating that CapsNet has better recognition accuracy for liver CT images.
(This article belongs to the Special Issue Deep Learning for Human-Centric Computer Vision)
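The abstract does not give implementation details of the authors' network; as a minimal sketch, the characteristic "squash" non-linearity from the standard CapsNet formulation, which any capsule network builds on, keeps each capsule's output vector pointing in the same direction while compressing its length into (0, 1) so that length can encode the probability that an entity (here, a liver lesion class) is present:

```python
import math

def squash(v):
    """CapsNet squash non-linearity: rescales a capsule's output vector so
    its length lies in (0, 1) while preserving its direction."""
    norm_sq = sum(x * x for x in v)
    norm = math.sqrt(norm_sq)
    if norm == 0.0:
        return [0.0] * len(v)
    scale = norm_sq / (1.0 + norm_sq) / norm  # ||v||^2 / (1 + ||v||^2) * 1/||v||
    return [scale * x for x in v]
```

Long vectors are squashed to a length just under 1, short vectors to a length near 0, which is what lets capsule lengths act as class probabilities without discarding pose information the way pooling does.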

14 pages, 12390 KiB  
Article
Research on Pedestrian Detection Based on the Multi-Scale and Feature-Enhancement Model
by Rui Li and Yaxin Zu
Information 2023, 14(2), 123; https://doi.org/10.3390/info14020123 - 14 Feb 2023
Cited by 2 | Viewed by 1905
Abstract
Pedestrian detection is one of the critical tasks in computer vision; however, it can be compromised by problems such as the varying scales of pedestrian features and cluttered backgrounds, which easily cause a loss of accuracy. Therefore, we propose a pedestrian detection method based on the FCOS network. Firstly, we designed a feature enhancement module to ensure that effective high-level semantics are obtained while preserving the detailed features of pedestrians. Secondly, we defined a key-center region judgment to reduce the interference of background information in pedestrian feature extraction. In testing on the Caltech pedestrian dataset, the AP value improved from 87.36% to 94.16%. The results of the comparison experiments illustrate that the proposed model can significantly increase accuracy.
(This article belongs to the Special Issue Deep Learning for Human-Centric Computer Vision)
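The paper's key-center region judgment is not detailed in the abstract; as a reference point, the standard FCOS center-ness score it modifies assigns each location a score, given its distances to the four edges of the ground-truth box, that decays toward the box boundary so that low-quality peripheral predictions are down-weighted:

```python
import math

def centerness(l, t, r, b):
    """FCOS center-ness for a location, given its distances to the
    left/top/right/bottom edges of its ground-truth box.
    Equals 1.0 at the box center and falls toward 0 at the edges."""
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))
```

At test time FCOS multiplies this score into the classification confidence before non-maximum suppression, which suppresses boxes predicted far from object centers; a "key-center region" criterion plausibly tightens which locations count as positives in the same spirit.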

13 pages, 1154 KiB  
Article
Masked Face Recognition System Based on Attention Mechanism
by Yuming Wang, Yu Li and Hua Zou
Information 2023, 14(2), 87; https://doi.org/10.3390/info14020087 - 2 Feb 2023
Cited by 7 | Viewed by 2338
Abstract
With the continuous development of deep learning, the field of face recognition has also developed rapidly. However, with the spread of COVID-19, recognizing faces wearing masks has become a practical problem. When a face wears a mask, the mask obscures most of the facial features, so a general face recognition model captures only part of the facial information. Existing face recognition models are therefore usually ineffective at recognizing masked faces. This article addresses this problem and proposes an improvement of FaceNet. We use ConvNeXt-T as the backbone of the network model and add the ECA (Efficient Channel Attention) mechanism. This enhances feature extraction from the unobscured part of the face to obtain more useful information, while avoiding dimensionality reduction and without increasing model complexity. We design new face recognition models by investigating the effects of different attention mechanisms on masked face recognition models and the effects of different dataset ratios on the experimental results. In addition, we construct a large dataset of faces wearing masks so that the model can be trained efficiently and quickly. In experiments, our model achieved 99.76% accuracy on real faces wearing masks, and a combined accuracy of 99.48% in extreme conditions such as excessively high or poor contrast and brightness.
(This article belongs to the Special Issue Deep Learning for Human-Centric Computer Vision)
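The ECA mechanism the abstract names works by pooling each channel to one descriptor and applying a small shared 1D convolution across channels, with no dimensionality reduction. A plain-Python sketch (the conv weights are fixed here for illustration; in practice they are learned, and the kernel size is chosen adaptively from the channel count):

```python
import math

def eca(feature_maps, kernel_size=3):
    """Efficient Channel Attention sketch: global average pooling per
    channel, a shared 1D convolution across the channel descriptors,
    a sigmoid gate, then channel-wise rescaling of the input.
    feature_maps: list of C channels, each a 2D list (H x W)."""
    # 1. Global average pooling -> one descriptor per channel.
    desc = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_maps]
    c, k = len(desc), kernel_size
    pad = k // 2
    padded = [0.0] * pad + desc + [0.0] * pad
    w = [1.0 / k] * k  # shared conv weights (illustrative; learned in practice)
    # 2. 1D conv across channels captures local cross-channel interaction.
    conv = [sum(w[j] * padded[i + j] for j in range(k)) for i in range(c)]
    # 3. Sigmoid gate, then rescale each channel by its attention weight.
    gates = [1.0 / (1.0 + math.exp(-x)) for x in conv]
    return [[[g * v for v in row] for row in ch]
            for g, ch in zip(gates, feature_maps)]
```

Because the gate is computed without the squeeze-and-excitation bottleneck, ECA avoids dimensionality reduction while adding only k weights per layer, matching the abstract's claim of no added model complexity.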

15 pages, 3500 KiB  
Article
Basketball Action Recognition Method of Deep Neural Network Based on Dynamic Residual Attention Mechanism
by Jiongen Xiao, Wenchun Tian and Liping Ding
Information 2023, 14(1), 13; https://doi.org/10.3390/info14010013 - 27 Dec 2022
Cited by 4 | Viewed by 2264
Abstract
The features extracted by the original C3D (Convolutional 3D) convolutional neural network are insufficient, and it is difficult to focus on keyframes, which leads to low accuracy in recognizing basketball players' actions. Hence, a basketball action recognition method using a deep neural network based on a dynamic residual attention mechanism is proposed. Firstly, the traditional C3D is improved into a dynamic residual convolutional network to extract sufficient feature information. Secondly, the extracted feature information is filtered by the improved attention mechanism to obtain the key video frames. Finally, the algorithm is compared with the traditional C3D to demonstrate its advancement and applicability. Experimental results show that this method can effectively recognize basketball postures, with an average posture recognition accuracy of more than 97%.
(This article belongs to the Special Issue Deep Learning for Human-Centric Computer Vision)
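The dynamic residual attention mechanism itself is not specified in the abstract; as an illustrative sketch of the generic idea of emphasizing keyframes, per-frame relevance scores (however they are produced) can be turned into softmax weights and used to pool frame features into a clip-level representation, so that high-scoring frames dominate. All names below are hypothetical:

```python
import math

def attend_frames(frame_features, scores):
    """Attention pooling over a clip: softmax the per-frame relevance
    scores, then take the weighted sum of frame feature vectors so that
    high-scoring (key) frames dominate the pooled representation."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frame_features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, frame_features))
              for d in range(dim)]
    return pooled, weights
```

With equal scores this reduces to mean pooling; as one frame's score grows, the pooled vector converges to that frame's features, which is the behavior a keyframe-selection mechanism exploits.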

16 pages, 4333 KiB  
Article
Hybrid No-Reference Quality Assessment for Surveillance Images
by Zhongchang Ye, Xin Ye and Zhonghua Zhao
Information 2022, 13(12), 588; https://doi.org/10.3390/info13120588 - 16 Dec 2022
Cited by 2 | Viewed by 1376
Abstract
Intelligent video surveillance (IVS) technology is widely used in various security systems. However, quality degradation in surveillance images (SIs) may affect performance on vision-based tasks, making it difficult for the IVS system to extract valid information from SIs. In this paper, we propose a hybrid no-reference image quality assessment (NR IQA) model for SIs that can help to identify undesired distortions and provide useful guidelines for IVS technology. Specifically, we first extract two main types of quality-aware features: low-level visual features related to various distortions, and high-level semantic information extracted by a state-of-the-art (SOTA) vision transformer backbone. Then, we fuse these two kinds of features into a final quality-aware feature vector, which is mapped to a quality index through a feature regression module. Our experimental results on two surveillance content quality databases demonstrate that the proposed model outperforms SOTA NR IQA metrics.
(This article belongs to the Special Issue Deep Learning for Human-Centric Computer Vision)
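The fusion and regression steps described above reduce, in their simplest form, to concatenating the two feature vectors and mapping the result to a scalar score. A minimal sketch with a linear regression head (the weights would be learned from a quality database; the paper's actual regression module may be deeper):

```python
def fuse_and_regress(low_level, semantic, weights, bias=0.0):
    """Fuse two quality-aware feature vectors by concatenation, then map
    the fused vector to a scalar quality index with a linear head."""
    fused = list(low_level) + list(semantic)  # simple feature-level fusion
    assert len(weights) == len(fused), "one weight per fused feature"
    return bias + sum(w * x for w, x in zip(weights, fused))
```

Concatenation keeps the distortion-sensitive low-level statistics and the content-aware transformer features as separate coordinates, leaving it to the regression head to learn how much each contributes to perceived quality.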

11 pages, 1074 KiB  
Article
Research on High-Frequency Information-Transmission Method of Smart Grid Based on CNN-LSTM Model
by Xin Chen
Information 2022, 13(8), 375; https://doi.org/10.3390/info13080375 - 5 Aug 2022
Cited by 2 | Viewed by 1283
Abstract
To solve the problem of the slow transmission rate of high-frequency information in smart grids and improve the efficiency of information transmission, a high-frequency information-transmission method for smart grids based on the CNN-LSTM model is proposed. It effectively combines the strength of the CNN algorithm in extracting high-frequency information features with the ability of the LSTM algorithm to learn the global features of high-frequency information. Meanwhile, the client buffer is partitioned using a VLAN area division method, which prevents the buffer from growing too large due to line congestion. An intelligent control module is adopted in place of the traditional control concept. In addition, a neural network optimization control module is used for intelligent control, which ensures the feedback speed of the control terminal and avoids the growth of the buffer area caused by feedback time differences. The experimental results show that, with the proposed method, the total efficiency of single-channel transmission reaches 96% with a transmission rate of 46 bit/s, and the total efficiency of multiplex transmission reaches 89% with a transmission rate of 75 bit/s. This verifies that the proposed method achieves a fast transmission rate and high efficiency.
(This article belongs to the Special Issue Deep Learning for Human-Centric Computer Vision)
