Deep Vision Algorithms and Applications

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (31 December 2023) | Viewed by 10106

Special Issue Editors

Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, Uttarakhand 247667, India
Interests: computer vision; deep learning; pattern recognition

Special Issue Information

Dear Colleagues,

We are pleased to invite you to contribute to this Special Issue on intelligent vision systems and their various applications. Building an intelligent vision system inevitably requires sophisticated artificial intelligence (AI) technology, at the core of which lies deep learning based on deep neural networks and their training mechanisms. Deep learning-based approaches have solved many problems across a wide range of applications, especially in classification and recognition. Despite the rapid development of these approaches, further progress in visual feature extraction and pattern mining remains important for sustaining this momentum.

This Special Issue aims to publish scientific papers on big vision data analysis and vision data-based artificial intelligence technologies. Its scope ranges from visual feature extraction, visual pattern recognition, deep learning structures, and learning methods to their application in various fields.

For this Special Issue, both original research articles and reviews are welcome. Specific areas of research interest include (but are not limited to) the following:

  • Big vision data analysis
  • Vision data mining and pattern recognition
  • Visual feature extraction
  • Intelligent vision systems
  • Deep learning structures and optimization mechanisms
  • Lightweight deep vision structures
  • Various applications of deep vision schemes
  • Novel data fusion schemes with vision data
  • Vision sensor networks and distributed processing
  • Deep neural networks for multimedia data processing
  • Human-centric vision systems and technologies

We look forward to receiving your valuable contributions.

Prof. Dr. Byung-Gyu Kim
Prof. Dr. Partha Pratim Roy
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • big vision data analysis
  • deep vision algorithm
  • visual data mining
  • pattern recognition
  • intelligent media
  • visual feature extraction

Published Papers (5 papers)

Research

19 pages, 1938 KiB  
Article
Semi-Supervised Drivable Road Segmentation with Expanded Feature Cross-Consistency
by Shangchen Ma and Chunlin Song
Appl. Sci. 2023, 13(21), 12036; https://doi.org/10.3390/app132112036 - 04 Nov 2023
Cited by 1 | Viewed by 678
Abstract
Drivable road segmentation aims to sense the surrounding environment in order to keep vehicles within safe road boundaries, which is fundamental in Advanced Driver-Assistance Systems (ADASs). Existing deep learning-based supervised methods achieve good performance in this field when large amounts of human-labeled training data are available. However, collecting sufficient finely annotated data is extremely time-consuming and expensive. To fill this gap, in this paper we propose a general yet effective semi-supervised method for drivable road segmentation with lower labeled-data dependency, high accuracy, and high real-time performance. Specifically, a main encoder and a main decoder are trained in supervised mode on labeled data, generating pseudo-labels for unsupervised training. We then set up auxiliary encoders and auxiliary decoders in our model that yield feature representations and predictions from unlabeled data subjected to different carefully designed perturbations. The auxiliary encoders and decoders leverage the information in unlabeled data by enforcing consistency between the predictions of the main modules and the perturbed predictions from the auxiliary modules. Experimental results on two public datasets (Cityscapes and CamVid) verify that, using only 40% labeled data, our algorithm almost reaches the performance of a fully supervised method trained on 100% labeled data while maintaining a high FPS for drivable road segmentation. In addition, our semi-supervised algorithm can readily be generalized to any model with an encoder–decoder structure.
(This article belongs to the Special Issue Deep Vision Algorithms and Applications)
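
As a rough editorial illustration of the cross-consistency idea summarized above, the sketch below shows how such a consistency term might be computed in a PyTorch-style training step. The module names, the perturbation function, and the unweighted loss sum are assumptions for illustration only, not the authors' implementation.

    # Illustrative only: consistency training between a main encoder/decoder and
    # auxiliary decoders fed with perturbed features (names are hypothetical).
    import torch
    import torch.nn.functional as F

    def training_step(main_enc, main_dec, aux_decs, labeled, labels, unlabeled, perturb):
        # Supervised branch: main modules learn from labeled images.
        sup_logits = main_dec(main_enc(labeled))            # (N, C, H, W)
        sup_loss = F.cross_entropy(sup_logits, labels)      # labels: (N, H, W)

        # Main prediction on unlabeled data serves as the consistency target.
        with torch.no_grad():
            target = main_dec(main_enc(unlabeled)).softmax(dim=1)

        # Each auxiliary decoder sees differently perturbed features and is pushed
        # to agree with the main prediction, so unlabeled data contribute gradients.
        feats = main_enc(unlabeled)
        cons_loss = sum(
            F.mse_loss(aux(perturb(feats)).softmax(dim=1), target) for aux in aux_decs
        ) / len(aux_decs)

        return sup_loss + cons_loss

In practice, the consistency term is typically weighted and ramped up over training; that detail is omitted here for brevity.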

13 pages, 1642 KiB  
Article
Gesture Recognition and Hand Tracking for Anti-Counterfeit Palmvein Recognition
by Jiawei Xu, Lu Leng and Byung-Gyu Kim
Appl. Sci. 2023, 13(21), 11795; https://doi.org/10.3390/app132111795 - 28 Oct 2023
Viewed by 806
Abstract
At present, COVID-19 poses a serious threat to global human health. Hand vein features captured in infrared environments have many advantages, including non-contact acquisition, security, and privacy, which can remarkably reduce the risk of COVID-19 transmission. Therefore, this paper builds an interactive system that recognizes hand gestures and tracks hands for palmvein recognition in infrared environments. The gesture contours are extracted and input into an improved convolutional neural network for gesture recognition, and the hand is tracked based on key point detection. Because the hand gesture commands are randomly generated and the hand vein features are extracted in the infrared environment, the anti-counterfeiting performance is markedly improved. In addition, hand tracking is conducted after gesture recognition, which prevents the hand from leaving the camera's field of view and thus ensures that the hand used for palmvein recognition is the same hand used during gesture recognition. The experimental results show that the proposed gesture recognition method performs satisfactorily on our dataset and that the hand tracking method is robust.
(This article belongs to the Special Issue Deep Vision Algorithms and Applications)
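
To make the described pipeline (gesture contour extraction, CNN-based gesture recognition, then key-point-based hand tracking) easier to picture, here is a minimal sketch. The camera, classifier, and key point detector objects are hypothetical placeholders, not the authors' system.

    # Illustrative pipeline sketch; gesture_cnn and keypoint_detector are placeholders.
    import cv2
    import numpy as np

    def extract_gesture_contour(ir_frame: np.ndarray):
        # ir_frame: 8-bit grayscale infrared image; keep the largest bright region.
        _, mask = cv2.threshold(ir_frame, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        return max(contours, key=cv2.contourArea) if contours else None

    def verify_hand(camera, gesture_cnn, keypoint_detector, expected_gesture):
        # 1) The user must perform the randomly issued gesture command.
        contour = extract_gesture_contour(camera.read())
        if contour is None or gesture_cnn.predict(contour) != expected_gesture:
            return False
        # 2) Track the hand via key points until palmvein capture finishes,
        #    so the verified hand never leaves the camera view.
        while not camera.capture_finished():
            if keypoint_detector.detect(camera.read()) is None:
                return False
        return True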

30 pages, 50057 KiB  
Article
Personalized Advertising Design Based on Automatic Analysis of an Individual’s Appearance
by Marco A. Moreno-Armendáriz, Hiram Calvo, José Faustinos and Carlos A. Duchanoy
Appl. Sci. 2023, 13(17), 9765; https://doi.org/10.3390/app13179765 - 29 Aug 2023
Cited by 1 | Viewed by 1212
Abstract
Market segmentation is a crucial marketing strategy that involves identifying and defining distinct groups of buyers in order to target a company’s marketing efforts effectively. To achieve this, data can be used to estimate consumer preferences and behavior. Visual elements in advertising, such as color and shape, can effectively communicate the product or service being promoted and influence consumer perceptions of its quality. Similarly, a person’s outward appearance plays a pivotal role in nonverbal communication, significantly impacting human social interactions and providing insights into individuals’ emotional states. In this study, we introduce a deep learning model capable of predicting one of the styles in the seven universal styles model. By employing various advanced deep learning techniques, our models automatically extract features from full-body images, enabling the identification of style-defining traits in the subjects’ clothing. Among the models proposed, the XCEPTION-based approach achieved the top accuracy of 98.27%, highlighting its efficacy in accurately predicting styles. Furthermore, we developed a personalized ad generator that achieved an acceptance rate of 80.56% among surveyed users, demonstrating the power of data-driven approaches in generating engaging and relevant content. Overall, using data to estimate consumer preferences and style traits proved effective in enhancing marketing strategies, as evidenced by the success of our deep learning models and personalized ad generator. By leveraging data-driven insights, businesses can create targeted and compelling marketing campaigns, thereby increasing their success in reaching and resonating with their desired audience.
(This article belongs to the Special Issue Deep Vision Algorithms and Applications)
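
For readers who want a concrete starting point, the following is a minimal transfer-learning sketch of a seven-class style classifier built on an Xception backbone with Keras; the input size, frozen base, and training settings are illustrative assumptions, not the configuration reported in the paper.

    # Illustrative seven-class style classifier on an Xception backbone (Keras).
    import tensorflow as tf

    def build_style_classifier(num_styles: int = 7) -> tf.keras.Model:
        base = tf.keras.applications.Xception(
            include_top=False, weights="imagenet", input_shape=(299, 299, 3)
        )
        base.trainable = False  # freeze ImageNet features for the first training stage

        inputs = tf.keras.Input(shape=(299, 299, 3))
        x = tf.keras.applications.xception.preprocess_input(inputs)
        x = base(x, training=False)
        x = tf.keras.layers.GlobalAveragePooling2D()(x)
        outputs = tf.keras.layers.Dense(num_styles, activation="softmax")(x)

        model = tf.keras.Model(inputs, outputs)
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        return model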

17 pages, 3269 KiB  
Article
Predictive Distillation Method of Anchor-Free Object Detection Model for Continual Learning
by Sumyung Gang, Daewon Chung and Joonjae Lee
Appl. Sci. 2022, 12(13), 6419; https://doi.org/10.3390/app12136419 - 24 Jun 2022
Viewed by 1205
Abstract
Continual learning (CL) is becoming increasingly important, not only because of the storage demands of the ever-increasing amount of data being generated, but also because of associated copyright problems. In this study, we propose ground truth’ (GT’), a combination of the ground truth (GT) and the predictions of the previously trained model, called the teacher model, obtained by applying the knowledge distillation (KD) technique to an anchor-free object detection model. Among all the objects predicted by the teacher model, those whose prediction scores exceed a threshold are distilled into the currently trained model, called the student model. To avoid interference with the learning of new classes, the IoU is computed between every GT object and the predicted objects. In the continual learning scenario, even when the reuse of past data is limited, the proposed model minimizes catastrophic forgetting and enables learning of newly added classes, provided that sufficient new data are available. The proposed model was trained on PASCAL VOC 2007 + 2012 and tested on PASCAL VOC 2007, outperforming the compared algorithm by 9.6 percentage points in mAP and 13.7 percentage points in F1i in the 19 + 1 scenario. The 15 + 5 scenario also showed better results than the compared algorithm, by 1.6 percentage points in mAP and 0.9 percentage points in F1i, and the 10 + 10 scenario likewise outperformed the alternatives, by 0.9 percentage points in mAP and 0.6 percentage points in F1i.
(This article belongs to the Special Issue Deep Vision Algorithms and Applications)
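
The GT’ construction described above can be pictured roughly as follows: confident teacher detections are merged with the new ground truth unless they overlap it too strongly. The thresholds and tensor layout in this sketch are assumptions, not the authors' exact procedure.

    # Illustrative GT' assembly for distillation-based continual detection.
    import torch
    from torchvision.ops import box_iou

    def build_gt_prime(gt_boxes, gt_labels, t_boxes, t_scores, t_labels,
                       score_thr: float = 0.5, iou_thr: float = 0.5):
        # Keep only confident teacher detections of previously learned classes.
        keep = t_scores > score_thr
        t_boxes, t_labels = t_boxes[keep], t_labels[keep]

        # Drop teacher boxes that overlap the new ground truth, so they do not
        # interfere with learning the newly added classes.
        if len(gt_boxes) > 0 and len(t_boxes) > 0:
            overlap = box_iou(t_boxes, gt_boxes).max(dim=1).values
            t_boxes, t_labels = t_boxes[overlap < iou_thr], t_labels[overlap < iou_thr]

        if len(t_boxes) == 0:
            return gt_boxes, gt_labels
        return torch.cat([gt_boxes, t_boxes]), torch.cat([gt_labels, t_labels])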

25 pages, 8623 KiB  
Article
A Framework for Pedestrian Attribute Recognition Using Deep Learning
by Saadman Sakib, Kaushik Deb, Pranab Kumar Dhar and Oh-Jin Kwon
Appl. Sci. 2022, 12(2), 622; https://doi.org/10.3390/app12020622 - 10 Jan 2022
Cited by 5 | Viewed by 3357
Abstract
The pedestrian attribute recognition task is becoming increasingly popular because of its significant role in surveillance scenarios. With recent technological advances, deep learning has come to the forefront of computer vision. Previous works have applied deep learning in different ways to recognize pedestrian attributes; the results are satisfactory, but there is still scope for improvement. Transfer learning is becoming more popular because of its ability to reduce computation cost and to cope with data scarcity. This paper proposes a framework that can work in surveillance scenarios to recognize pedestrian attributes. The Mask R-CNN object detector extracts the pedestrians. Additionally, we applied transfer learning techniques to different CNN architectures, i.e., Inception ResNet v2, Xception, ResNet 101 v2, and ResNet 152 v2. The main contribution of this paper is fine-tuning the ResNet 152 v2 architecture by freezing different sets of layers: the last 4, 8, 12, 14, or 20 layers, none, or all. Moreover, a data balancing technique, i.e., oversampling, is applied to resolve the class imbalance problem of the dataset, and the usefulness of this technique is analyzed in this paper. Our proposed framework outperforms state-of-the-art methods, providing 93.41% mA and 89.24% mA on the RAP v2 and PARSE100K datasets, respectively.
(This article belongs to the Special Issue Deep Vision Algorithms and Applications)
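
As a hedged sketch of the layer-freezing experiment described above, the snippet below builds a multi-label attribute head on a ResNet 152 v2 backbone in Keras and freezes the last N layers of the backbone; the input size, optimizer, and default N are illustrative assumptions rather than the paper's settings.

    # Illustrative ResNet152V2 fine-tuning with the last N backbone layers frozen.
    import tensorflow as tf

    def build_attribute_model(num_attributes: int, freeze_last_n: int = 14) -> tf.keras.Model:
        base = tf.keras.applications.ResNet152V2(
            include_top=False, weights="imagenet", pooling="avg"
        )
        base.trainable = True
        if freeze_last_n > 0:
            # Variants explored in the paper: last 4, 8, 12, 14, 20 layers, none, or all.
            for layer in base.layers[-freeze_last_n:]:
                layer.trainable = False

        inputs = tf.keras.Input(shape=(224, 224, 3))
        x = tf.keras.applications.resnet_v2.preprocess_input(inputs)
        x = base(x)
        # Multi-label head: one sigmoid output per pedestrian attribute.
        outputs = tf.keras.layers.Dense(num_attributes, activation="sigmoid")(x)

        model = tf.keras.Model(inputs, outputs)
        model.compile(optimizer="adam", loss="binary_crossentropy")
        return model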
