Recent Advances in Virtual Reality and Computer Vision Based on Deep Learning

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 August 2024 | Viewed by 8491

Special Issue Editors


Guest Editor
Department of Computer Systems and Software Engineering, School of Computer Engineering, National Distance Education University (UNED), 28040 Madrid, Spain
Interests: computer vision; pattern recognition; artificial intelligence; image processing; robotics

Guest Editor
Department of Systems Engineering and Automatics, University of Valladolid, 47002 Valladolid, Spain
Interests: indoor positioning; WPS; RGB cameras; WiFi; fingerprint map; trajectory; IPS; computer vision

Special Issue Information

Dear Colleagues,

Deep learning is a powerful branch of machine learning, itself framed within artificial intelligence (AI), and a promising field that continues to grow, in part thanks to technological advances that allow the processing of enormous amounts of data with complex structures. Its fullest realization is found in neural networks, specifically those classified as deep. AI has multiple and diverse practical applications and is a very active area of research, in which many intelligent systems have flourished that facilitate and automate routine work.

Deep learning has proven its usefulness in many disciplines, including computer vision, virtual reality, voice and audio processing, natural language processing, robotics, bioinformatics, video games, search engines, and finance, among others, all within the general field of artificial intelligence. This Special Issue focuses on computer vision, an area in which deep learning models play a very important role, as well as on virtual reality. However, given the essentially transversal and multidisciplinary nature of computer vision, it frequently intersects with other areas and disciplines, which broadens the range of researchers and scholars who may find this Issue of interest.

The following is a list of the main topics covered by this Special Issue, as they relate to computer vision based on deep learning models:

  • Voice and audio processing;
  • Natural language processing;
  • Robotics;
  • Bioinformatics;
  • Video games;
  • Search engines;
  • Economy and finance.

The Special Issue will not be limited to these topics. Papers must present innovative methods and approaches, or novel applications of existing tools.

Dr. Pedro Javier Herrera Caro
Dr. Jaime Duque-Domingo
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning 
  • artificial intelligence 
  • neural networks 
  • computer vision 
  • applications

Published Papers (4 papers)


Research


22 pages, 3869 KiB  
Article
Enhancing Signer-Independent Recognition of Isolated Sign Language through Advanced Deep Learning Techniques and Feature Fusion
by Ali Akdag and Omer Kaan Baykan
Electronics 2024, 13(7), 1188; https://doi.org/10.3390/electronics13071188 - 24 Mar 2024
Viewed by 559
Abstract
Sign Language Recognition (SLR) systems are crucial bridges facilitating communication between deaf or hard-of-hearing individuals and the hearing world. Existing SLR technologies, while advancing, often grapple with challenges such as accurately capturing the dynamic and complex nature of sign language, which includes both manual and non-manual elements like facial expressions and body movements. These systems sometimes fall short in environments with different backgrounds or lighting conditions, hindering their practical applicability and robustness. This study introduces an innovative approach to isolated sign language word recognition using a novel deep learning model that combines the strengths of both residual three-dimensional (R3D) and temporally separated (R(2+1)D) convolutional blocks. The R3(2+1)D-SLR network model demonstrates a superior ability to capture the intricate spatial and temporal features crucial for accurate sign recognition. Our system combines data from the signer’s body, hands, and face, extracted using the R3(2+1)D-SLR model, and employs a Support Vector Machine (SVM) for classification. It demonstrates remarkable improvements in accuracy and robustness across various backgrounds by utilizing pose data over RGB data. With this pose-based approach, our proposed system achieved 94.52% and 98.53% test accuracy in signer-independent evaluations on the BosphorusSign22k-general and LSA64 datasets.

24 pages, 4112 KiB  
Article
Enhancing Human Action Recognition with 3D Skeleton Data: A Comprehensive Study of Deep Learning and Data Augmentation
by Chu Xin, Seokhwan Kim, Yongjoo Cho and Kyoung Shin Park
Electronics 2024, 13(4), 747; https://doi.org/10.3390/electronics13040747 - 13 Feb 2024
Viewed by 829
Abstract
Human Action Recognition (HAR) is an important field that identifies human behavior through sensor data. Three-dimensional human skeleton data extracted from the Kinect depth sensor have emerged as a powerful alternative to mitigate the effects of lighting and occlusion of traditional 2D RGB or grayscale image-based HAR. Data augmentation is a key technique to enhance model generalization and robustness in deep learning while suppressing overfitting to training data. In this paper, we conduct a comprehensive study of various data augmentation techniques specific to skeletal data, which aim to improve the accuracy of deep learning models. These augmentation methods include spatial augmentation, which generates augmented samples from the original 3D skeleton sequence, and temporal augmentation, which is designed to capture subtle temporal changes in motion. The evaluation covers two publicly available datasets and a proprietary dataset and employs three neural network models. The results highlight the impact of temporal augmentation on model performance on the skeleton datasets, while exhibiting the nuanced impact of spatial augmentation. The findings underscore the importance of tailoring augmentation strategies to specific dataset characteristics and actions, providing novel perspectives for model selection in skeleton-based human action recognition tasks.

19 pages, 6188 KiB  
Article
Development of a Human–Robot Interface for Cobot Trajectory Planning Using Mixed Reality
by Raúl Calderón-Sesmero, Jaime Duque-Domingo, Jaime Gómez-García-Bermejo and Eduardo Zalama
Electronics 2024, 13(3), 571; https://doi.org/10.3390/electronics13030571 - 31 Jan 2024
Viewed by 742
Abstract
The growing demand for projects with collaborative robots, known as “cobots”, underlines the need to efficiently address the execution of tasks with speed and flexibility, without neglecting safety in human–robot interaction. In general terms, this practice requires knowledge of robotics programming and skill in the use of hardware. The proposed solution consists of a mixed reality (MR) application integrated into a mixed reality head-mounted device (HMD) that accelerates the process of programming the complex manoeuvres of a cobot. This advancement is achieved through voice and gesture recognition, in addition to the use of digital panels. This allows any user, regardless of his or her robotics experience, to work more efficiently. The Robot Operating System (ROS) platform monitors the cobot and manages the transfer of data between the two. The system uses QR (Quick Response) codes to establish a precise frame of reference. This solution has proven its applicability in industrial processes, by automating manoeuvres and receiving positive feedback from users who have evaluated its performance. This solution promises to revolutionize the programming and operation of cobots, and pave the way for efficient and accessible collaborative robotics.

Review


18 pages, 595 KiB  
Review
Sign Language Translation: A Survey of Approaches and Techniques
by Zeyu Liang, Huailing Li and Jianping Chai
Electronics 2023, 12(12), 2678; https://doi.org/10.3390/electronics12122678 - 15 Jun 2023
Cited by 3 | Viewed by 5644
Abstract
Sign language is the main communication way for deaf and hard-of-hearing (i.e., DHH) people, which is unfamiliar to most non-deaf and hard-of-hearing (non-DHH) people. To break down the communication barriers between DHH and non-DHH people and to better promote communication among DHH individuals, we have summarized the research progress on sign language translation. We provide the necessary background on sign language translation and introduce its four subtasks (i.e., sign2gloss2text, sign2text, sign2(gloss+text), and gloss2text). We distill the basic mode of sign language translation (SLT) and introduce the transformer-based framework of SLT. We analyze the main challenges of SLT and propose possible directions for its development.
