
Robust Multimodal Sensing for Automated Driving Systems

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Vehicular Sensing".

Deadline for manuscript submissions: 30 June 2024 | Viewed by 17762

Special Issue Editors


Prof. Dr. Jaka Sodnik
Guest Editor
Laboratory of Information Technologies, Faculty of Electrical Engineering, University of Ljubljana, Tržaška cesta 25, 1000 Ljubljana, Slovenia
Interests: human–machine interaction; driving behavior; automated driving; driver monitoring; driving style; driving comfort

Dr. Nikolas Thomopoulos
Guest Editor
Department of Tourism and Transport, School of Hospitality and Tourism, University of Surrey, Stag Hill, Guildford GU2 7XH, UK
Interests: automated driving; connected transport; leisure travel; passenger comfort; user perspective; wider impacts; WISE-ACT

Dr. Ignacio Alvarez
Guest Editor
Autonomous Driving Research, Intel Labs, Intel Corporation, Beaverton, OR 97007, USA
Interests: automated driving; intelligent transportation; vehicle safety; connected vehicles; artificial intelligence; user experience

Special Issue Information

Dear Colleagues,

Connected and automated driving technologies have the potential to revolutionize transportation by extending mobility services to a wider population and by improving safety and traffic efficiency. Automated driving technology is expected to reduce the number of accidents caused by human error and avert deadly crashes, ensure mobility for all, including older and impaired individuals, allow the human driver to perform alternative (secondary) tasks, increase traffic flow efficiency, reduce fuel consumption, and lower emissions.

Driven by these goals, vehicle automation is growing rapidly, progressively taking over the monitoring of the surroundings and vehicle control tasks from human drivers in a quest towards full autonomy. Connected and automated vehicles are equipped with multimodal sensors that allow continuous perception and monitoring of driving tasks, assisting drivers at the lower SAE levels of automation or fully taking over the driving task under full SAE automation. Numerous sensors, both inside and outside vehicles, allow the detection and identification of oncoming obstacles, the determination of their velocity, and the prediction of their future behaviour to avoid potential collisions. Each sensor has its own strengths and weaknesses in terms of range, accuracy, energy consumption, and sensitivity to external conditions such as weather and light. Automated vehicles therefore usually rely on a mix of signals to improve operational reliability and robustness under the dynamic external conditions of real-world deployments. Generally, external AV sensors can be divided into two major groups: active and passive. Active sensors emit a signal (electromagnetic or optical) into the external environment and analyse its reflection (e.g., radar, lidar), whereas passive sensors only record information arriving from the environment (e.g., cameras). Additionally, there have been advances in intelligent transportation infrastructure to monitor road users, perform predictive analytics, and facilitate collaborative perception services and remote vehicle control.
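
To make the complementary-sensor argument concrete, the following minimal Python sketch fuses a range estimate from an active sensor (radar) with one from a passive sensor (camera) by inverse-variance weighting; the sensor names and noise figures are illustrative assumptions, not values from any specific vehicle platform.

```python
import numpy as np

def fuse_ranges(range_radar, var_radar, range_camera, var_camera):
    """Combine two independent range measurements by inverse-variance weighting."""
    w_radar = 1.0 / var_radar
    w_camera = 1.0 / var_camera
    fused_range = (w_radar * range_radar + w_camera * range_camera) / (w_radar + w_camera)
    fused_var = 1.0 / (w_radar + w_camera)
    return fused_range, fused_var

# Example: in fog or low light the camera-derived range degrades (larger variance),
# so the fused estimate automatically leans on the radar measurement.
r, v = fuse_ranges(range_radar=42.3, var_radar=0.25, range_camera=45.1, var_camera=4.0)
print(f"fused range: {r:.2f} m (variance {v:.3f})")
```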

The increasing commercial availability of conditional automation (SAE Level 3) and the arrival of robotaxi services (SAE Level 4) have also resulted in an increase in in-cabin sensors dedicated to monitoring driver and passenger behaviours. Multimodal in-cabin monitoring systems are crucial enablers for successfully managing automated vehicle operations. These systems enable the detection of the driver's or passengers' physiological state and activity to assess their readiness to take over control of the vehicle if required, as well as to monitor their safety. Driver monitoring solutions provide information on occupants' fatigue, distraction, discomfort, and stress. Furthermore, they can help to verify that automation is used properly by evaluating engagement in the driving-monitoring task or the inherent risk of non-driving tasks.

This Special Issue aims to collect original theoretical or empirical articles on different sensing technologies, solutions, and applications for automated vehicles. Potential topics of interest include, but are not limited to, the following:

  • External sensing technologies:
    • Detection and ranging technologies: radar, lidar, sonar, cameras;
    • Localization and mapping: GPS and HD maps;
    • Object detection, classification, and scene segmentation algorithms;
    • Object tracking and prediction algorithms;
    • Data annotation;
    • External HMI;
    • ICT infrastructure;
  • Internal sensing technologies:
    • Driver monitoring systems, including related usability and acceptance challenges, e.g., privacy;
    • Detection of the driver's physiological states: fatigue, discomfort, sickness, including wearable technology;
    • Driver fitness/risk assessment for conditional automation;
    • User experience improvements through sensors;
  • Sensor fusion and dependability:
    • Dependable sensor systems;
    • Multimodal sensor fusion algorithms;
    • Improvements in training, evaluation, and validation of robust perception systems.

Prof. Dr. Jaka Sodnik
Dr. Nikolas Thomopoulos
Dr. Ignacio Alvarez
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Automated and autonomous driving
  • Vehicle sensing technologies
  • Sensor fusion
  • Object detection and identification
  • Driver monitoring

Published Papers (8 papers)


Research

21 pages, 24373 KiB  
Article
PTA-Det: Point Transformer Associating Point Cloud and Image for 3D Object Detection
by Rui Wan, Tianyun Zhao and Wei Zhao
Sensors 2023, 23(6), 3229; https://doi.org/10.3390/s23063229 - 17 Mar 2023
Cited by 3 | Viewed by 1717
Abstract
In autonomous driving, 3D object detection based on multi-modal data has become an indispensable perceptual approach when facing complex environments around the vehicle. During multi-modal detection, LiDAR and a camera are simultaneously applied for capturing and modeling. However, due to the intrinsic discrepancies between LiDAR points and camera images, the fusion of the data for object detection encounters a series of problems, which results in most multi-modal detection methods performing worse than LiDAR-only methods. In this investigation, we propose a method named PTA-Det to improve the performance of multi-modal detection. Accompanying PTA-Det, a Pseudo Point Cloud Generation Network is proposed, which can represent the textural and semantic features of keypoints in the image by pseudo points. Thereafter, through a transformer-based Point Fusion Transition (PFT) module, the features of LiDAR points and pseudo points from an image can be deeply fused under a unified point-based form. The combination of these modules can overcome the main obstacle of cross-modal feature fusion and achieve a complementary and discriminative representation for proposal generation. Extensive experiments on the KITTI dataset support the effectiveness of PTA-Det, achieving a mAP (mean average precision) of 77.88% on the car category with relatively few LiDAR input points. Full article
(This article belongs to the Special Issue Robust Multimodal Sensing for Automated Driving Systems)
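
As a rough illustration of transformer-style fusion between LiDAR point features and image-derived pseudo-point features, cross-attention can let each 3D point query camera-based cues. This is an editorial sketch under assumed shapes and module names, not the authors' PFT implementation.

```python
import torch
import torch.nn as nn

class CrossModalPointFusion(nn.Module):
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, lidar_feats, pseudo_feats):
        # LiDAR point features query the image-derived pseudo-point features,
        # so each 3D point is enriched with texture/semantic cues from the camera.
        fused, _ = self.attn(query=lidar_feats, key=pseudo_feats, value=pseudo_feats)
        return self.norm(lidar_feats + fused)  # residual connection

# Toy usage: batch of 2 scenes, 1024 LiDAR points and 256 pseudo points, 64-dim features.
lidar = torch.randn(2, 1024, 64)
pseudo = torch.randn(2, 256, 64)
print(CrossModalPointFusion()(lidar, pseudo).shape)  # torch.Size([2, 1024, 64])
```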

17 pages, 4969 KiB  
Article
Guided Depth Completion with Instance Segmentation Fusion in Autonomous Driving Applications
by Mohammad Z. El-Yabroudi, Ikhlas Abdel-Qader, Bradley J. Bazuin, Osama Abudayyeh and Rakan C. Chabaan
Sensors 2022, 22(24), 9578; https://doi.org/10.3390/s22249578 - 07 Dec 2022
Viewed by 1565
Abstract
Pixel-level depth information is crucial to many applications, such as autonomous driving, robotics navigation, 3D scene reconstruction, and augmented reality. However, depth information, which is usually acquired by sensors such as LiDAR, is sparse. Depth completion is a process that predicts missing pixels’ depth information from a set of sparse depth measurements. Most of the ongoing research applies deep neural networks on the entire sparse depth map and camera scene without utilizing any information about the available objects, which results in more complex and resource-demanding networks. In this work, we propose to use image instance segmentation to detect objects of interest with pixel-level locations, along with sparse depth data, to support depth completion. The framework utilizes a two-branch encoder–decoder deep neural network. It fuses information about scene available objects, such as objects’ type and pixel-level location, LiDAR, and RGB camera, to predict dense accurate depth maps. Experimental results on the KITTI dataset showed faster training and improved prediction accuracy. The proposed method reaches a convergence state faster and surpasses the baseline model in all evaluation metrics. Full article
(This article belongs to the Special Issue Robust Multimodal Sensing for Automated Driving Systems)
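
The sparse input that such depth-completion networks densify is typically obtained by projecting LiDAR points into the camera image. The sketch below shows that projection step with assumed, KITTI-like intrinsics; it is illustrative only and not the authors' pipeline.

```python
import numpy as np

def sparse_depth_map(points_cam, K, h, w):
    """points_cam: (N, 3) points in the camera frame (x right, y down, z forward)."""
    depth = np.zeros((h, w), dtype=np.float32)           # 0 marks missing depth
    pts = points_cam[points_cam[:, 2] > 0.1]             # keep points in front of the camera
    uv = (K @ pts.T).T                                    # project to homogeneous pixel coordinates
    uv = uv[:, :2] / uv[:, 2:3]
    u, v = np.round(uv[:, 0]).astype(int), np.round(uv[:, 1]).astype(int)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    depth[v[valid], u[valid]] = pts[valid, 2]             # store z as the depth value
    return depth

K = np.array([[721.5, 0.0, 609.6], [0.0, 721.5, 172.9], [0.0, 0.0, 1.0]])  # assumed intrinsics
pts = np.random.uniform([-10, -2, 1], [10, 2, 80], size=(5000, 3))
print(np.count_nonzero(sparse_depth_map(pts, K, 375, 1242)), "pixels with depth")
```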

13 pages, 3256 KiB  
Article
Temporal Dashboard Gaze Variance (TDGV) Changes for Measuring Cognitive Distraction While Driving
by Cyril Marx, Elem Güzel Kalayci and Peter Moertl
Sensors 2022, 22(23), 9556; https://doi.org/10.3390/s22239556 - 06 Dec 2022
Cited by 1 | Viewed by 1460
Abstract
A difficult challenge for today’s driver monitoring systems is the detection of cognitive distraction. The present research presents the development of a theory-driven approach for cognitive distraction detection during manual driving based on temporal control theories. It is based solely on changes in the temporal variance of driving-relevant gaze behavior, such as gazes onto the dashboard (TDGV). Validation of the detection method happened in a field and in a simulator study by letting participants drive, alternating with and without a secondary task inducing external cognitive distraction (auditory continuous performance task). The general accuracy of the distraction detection method varies between 68% and 81% based on the quality of an individual prerecorded baseline measurement. As a theory-driven system, it represents not only a step towards a sophisticated cognitive distraction detection method, but also explains that changes in temporal dashboard gaze variance (TDGV) are a useful behavioral indicator for detecting cognitive distraction. Full article
(This article belongs to the Special Issue Robust Multimodal Sensing for Automated Driving Systems)
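
The core idea, measuring how the temporal variance of dashboard gazes deviates from a personal baseline, can be sketched as follows; the interval statistic and threshold are assumptions for illustration, not the validated TDGV procedure.

```python
import numpy as np

def gaze_interval_variance(dashboard_gaze_times_s):
    """Variance of the time gaps (s) between consecutive gazes onto the dashboard."""
    intervals = np.diff(np.sort(np.asarray(dashboard_gaze_times_s, dtype=float)))
    return float(np.var(intervals))

def distracted(current_var, baseline_var, ratio=2.0):
    # Heuristic: flag cognitive distraction when the temporal variance deviates
    # strongly from the driver's own prerecorded baseline.
    return current_var > ratio * baseline_var or current_var < baseline_var / ratio

baseline = gaze_interval_variance([1.2, 3.1, 5.0, 7.2, 9.1, 11.3])   # regular dashboard checks
current = gaze_interval_variance([0.5, 1.0, 6.5, 7.0, 13.8, 14.2])   # irregular checks under load
print(distracted(current, baseline))
```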

18 pages, 2595 KiB  
Article
Electrogastrogram-Derived Features for Automated Sickness Detection in Driving Simulator
by Grega Jakus, Jaka Sodnik and Nadica Miljković
Sensors 2022, 22(22), 8616; https://doi.org/10.3390/s22228616 - 08 Nov 2022
Cited by 2 | Viewed by 1548
Abstract
The rapid development of driving simulators for the evaluation of automated driving experience is constrained by the simulator sickness-related nausea. The electrogastrogram (EGG)-based approach may be promising for immediate, objective, and quantitative nausea assessment. Given the relatively high EGG sensitivity to noises associated with the relatively low amplitude and frequency spans, we introduce an automated procedure comprising statistical analysis and machine learning techniques for EGG-based nausea detection in relation to the noise contamination during automated driving simulation. We calculate the root mean square of EGG amplitude, median and dominant frequencies, magnitude of Power Spectral Density (PSD) at dominant frequency, crest factor of PSD, and spectral variation distribution along with newly introduced parameters: sample and spectral entropy, autocorrelation zero-crossing, and parameters derived from the Poincaré diagram of consecutive EGG samples. Results showed outstanding robustness of sample entropy with moderate robustness of autocorrelation zero-crossing, dominant frequency, and its median. Machine learning reached an accuracy of 88.2% and revealed sample entropy as one of the most relevant and robust parameters, while linear analysis highlighted spectral entropy, spectral variation distribution, and crest factor of PSD. This study clearly indicates the need for customized feature selection in noisy environments, as well as a complementary approach comprising machine learning and statistical analysis for efficient nausea detection. Full article
(This article belongs to the Special Issue Robust Multimodal Sensing for Automated Driving Systems)
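
A few of the named EGG features (RMS amplitude, dominant frequency, and the crest factor of the PSD) can be computed with standard tools, as in the sketch below; the sampling rate and synthetic signal are assumptions, not the study's recordings.

```python
import numpy as np
from scipy.signal import welch

fs = 2.0                                    # EGG is a low-frequency signal; 2 Hz sampling assumed
t = np.arange(0, 600, 1 / fs)
egg = np.sin(2 * np.pi * 0.05 * t) + 0.3 * np.random.randn(t.size)   # ~3 cycles-per-minute component

rms = np.sqrt(np.mean(egg ** 2))            # root mean square of EGG amplitude
freqs, psd = welch(egg, fs=fs, nperseg=256)
dominant_freq = freqs[np.argmax(psd)]       # frequency with the largest spectral power
crest_factor = psd.max() / np.mean(psd)     # peakedness of the power spectral density

print(f"RMS: {rms:.3f}, dominant frequency: {dominant_freq * 60:.1f} cpm, crest factor: {crest_factor:.1f}")
```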

10 pages, 1361 KiB  
Article
Lightweight Depth Completion Network with Local Similarity-Preserving Knowledge Distillation
by Yongseop Jeong, Jinsun Park, Donghyeon Cho, Yoonjin Hwang, Seibum B. Choi and In So Kweon
Sensors 2022, 22(19), 7388; https://doi.org/10.3390/s22197388 - 28 Sep 2022
Cited by 4 | Viewed by 1644
Abstract
Depth perception capability is one of the essential requirements for various autonomous driving platforms. However, accurate depth estimation in a real-world setting is still a challenging problem due to high computational costs. In this paper, we propose a lightweight depth completion network for depth perception in real-world environments. To effectively transfer a teacher’s knowledge, useful for the depth completion, we introduce local similarity-preserving knowledge distillation (LSPKD), which allows similarities between local neighbors to be transferred during the distillation. With our LSPKD, a lightweight student network is precisely guided by a heavy teacher network, regardless of the density of the ground-truth data. Experimental results demonstrate that our method is effective to reduce computational costs during both training and inference stages while achieving superior performance over other lightweight networks. Full article
(This article belongs to the Special Issue Robust Multimodal Sensing for Automated Driving Systems)
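
The general flavour of similarity-preserving distillation, matching pairwise similarities between student and teacher features rather than the features themselves, can be sketched as follows; this is an illustrative loss under assumed shapes, not the authors' LSPKD formulation.

```python
import torch
import torch.nn.functional as F

def similarity_matrix(feats):
    """feats: (N, C) feature vectors -> (N, N) cosine-similarity matrix."""
    feats = F.normalize(feats, dim=1)
    return feats @ feats.t()

def similarity_preserving_loss(student_feats, teacher_feats):
    # Match the structure of relations between samples rather than raw features,
    # which lets a narrow student mimic a much wider teacher.
    return F.mse_loss(similarity_matrix(student_feats), similarity_matrix(teacher_feats))

student = torch.randn(16, 64, requires_grad=True)    # lightweight student features
teacher = torch.randn(16, 256)                       # heavy teacher features (different width is fine)
loss = similarity_preserving_loss(student, teacher)
loss.backward()
print(loss.item())
```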

22 pages, 49989 KiB  
Article
Enhanced Perception for Autonomous Driving Using Semantic and Geometric Data Fusion
by Horatiu Florea, Andra Petrovai, Ion Giosan, Florin Oniga, Robert Varga and Sergiu Nedevschi
Sensors 2022, 22(13), 5061; https://doi.org/10.3390/s22135061 - 05 Jul 2022
Cited by 11 | Viewed by 3463
Abstract
Environment perception remains one of the key tasks in autonomous driving for which solutions have yet to reach maturity. Multi-modal approaches benefit from the complementary physical properties specific to each sensor technology used, boosting overall performance. The added complexity brought on by data fusion processes is not trivial to solve, with design decisions heavily influencing the balance between quality and latency of the results. In this paper we present our novel real-time, 360° enhanced perception component based on low-level fusion between geometry provided by the LiDAR-based 3D point clouds and semantic scene information obtained from multiple RGB cameras of multiple types. This multi-modal, multi-sensor scheme enables better range coverage, improved detection and classification quality, and increased robustness. Semantic, instance and panoptic segmentations of 2D data are computed using efficient deep-learning-based algorithms, while 3D point clouds are segmented using a fast, traditional voxel-based solution. Finally, the fusion obtained through point-to-image projection yields a semantically enhanced 3D point cloud that allows enhanced perception through 3D detection refinement and 3D object classification. The planning and control systems of the vehicle receive the individual sensors' perception together with the enhanced one, as well as the semantically enhanced 3D points. The developed perception solutions are successfully integrated into an autonomous vehicle software stack as part of the UP-Drive project. Full article
(This article belongs to the Special Issue Robust Multimodal Sensing for Automated Driving Systems)
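
The low-level fusion step, attaching semantic labels from a 2D segmentation mask to projected LiDAR points to obtain a semantically enhanced point cloud, can be illustrated as follows; class IDs, shapes, and pixel coordinates are assumptions for the example.

```python
import numpy as np

def attach_semantics(points_xyz, pixel_uv, seg_mask):
    """Return an (N, 4) array of [x, y, z, class_id]; points outside the image get class -1."""
    h, w = seg_mask.shape
    u, v = pixel_uv[:, 0], pixel_uv[:, 1]
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    labels = np.full(points_xyz.shape[0], -1, dtype=np.int32)
    labels[inside] = seg_mask[v[inside], u[inside]]     # look up the semantic class at each pixel
    return np.column_stack([points_xyz, labels])

seg = np.zeros((375, 1242), dtype=np.int32)
seg[200:300, 500:700] = 13                              # assumed class ID 13 = "car"
pts = np.array([[5.0, 1.0, 20.0], [-3.0, 0.5, 12.0]])   # 3D points already projected to pixels below
uv = np.array([[600, 250], [100, 50]])
print(attach_semantics(pts, uv, seg))
```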

24 pages, 1306 KiB  
Article
Probabilistic Traffic Motion Labeling for Multi-Modal Vehicle Route Prediction
by Alberto Flores Fernández, Jonas Wurst, Eduardo Sánchez Morales, Michael Botsch, Christian Facchi and Andrés García Higuera
Sensors 2022, 22(12), 4498; https://doi.org/10.3390/s22124498 - 14 Jun 2022
Cited by 3 | Viewed by 1644
Abstract
The prediction of the motion of traffic participants is a crucial aspect for the research and development of Automated Driving Systems (ADSs). Recent approaches are based on multi-modal motion prediction, which requires the assignment of a probability score to each of the multiple predicted motion hypotheses. However, there is a lack of ground truth for this probability score in the existing datasets. This implies that current Machine Learning (ML) models evaluate the multiple predictions by comparing them with the single real trajectory labeled in the dataset. In this work, a novel data-based method named Probabilistic Traffic Motion Labeling (PROMOTING) is introduced in order to (a) generate probable future routes and (b) estimate their probabilities. PROMOTING is presented with the focus on urban intersections. The generation of probable future routes is (a) based on a real traffic dataset and consists of two steps: first, a clustering of intersections with similar road topology, and second, a clustering of similar routes that are driven in each cluster from the first step. The estimation of the route probabilities is (b) based on a frequentist approach that considers how traffic participants will move in the future given their motion history. PROMOTING is evaluated with the publicly available Lyft database. The results show that PROMOTING is an appropriate approach to estimate the probabilities of the future motion of traffic participants in urban intersections. In this regard, PROMOTING can be used as a labeling approach for the generation of a labeled dataset that provides a probability score for probable future routes. Such a labeled dataset currently does not exist and would be highly valuable for ML approaches with the task of multi-modal motion prediction. The code is made open source. Full article
(This article belongs to the Special Issue Robust Multimodal Sensing for Automated Driving Systems)
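
The frequentist element, estimating route probabilities from how often vehicles with a similar motion history took each clustered route, reduces to counting and normalising, as in this toy sketch; the route labels and history buckets are assumptions, not the PROMOTING pipeline.

```python
from collections import Counter

observations = [                      # (motion_history_bucket, route_taken), assumed labels
    ("approach_from_south_slow", "turn_left"),
    ("approach_from_south_slow", "go_straight"),
    ("approach_from_south_slow", "turn_left"),
    ("approach_from_south_fast", "go_straight"),
]

def route_probabilities(history_bucket):
    """Relative frequency of each route among vehicles with the same motion history."""
    counts = Counter(route for hist, route in observations if hist == history_bucket)
    total = sum(counts.values())
    return {route: n / total for route, n in counts.items()}

print(route_probabilities("approach_from_south_slow"))  # roughly {'turn_left': 0.67, 'go_straight': 0.33}
```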

26 pages, 2110 KiB  
Article
Challenges of Large-Scale Multi-Camera Datasets for Driver Monitoring Systems
by Juan Diego Ortega, Paola Natalia Cañas, Marcos Nieto, Oihana Otaegui and Luis Salgado
Sensors 2022, 22(7), 2554; https://doi.org/10.3390/s22072554 - 26 Mar 2022
Cited by 3 | Viewed by 3262
Abstract
Tremendous advances in advanced driver assistance systems (ADAS) have been possible thanks to the emergence of deep neural networks (DNN) and Big Data (BD) technologies. Huge volumes of data can be managed and consumed as training material to create DNN models which feed functions such as lane keeping systems (LKS), automated emergency braking (AEB), lane change assistance (LCA), etc. In the ADAS/AD domain, these advances are only possible thanks to the creation and publication of large and complex datasets, which can be used by the scientific community to benchmark and leverage research and development activities. In particular, multi-modal datasets have the potential to feed DNNs that fuse information from different sensors or input modalities, producing optimised models that exploit modality redundancy, correlation, complementariness and association. Creating such datasets poses a scientific and engineering challenge. The BD dimensions to cover are volume (large datasets), variety (wide range of scenarios and context), veracity (data labels are verified), visualization (data can be interpreted) and value (data is useful). In this paper, we explore the requirements and technical approach to build a multi-sensor, multi-modal dataset for video-based applications in the ADAS/AD domain. The Driver Monitoring Dataset (DMD) was created and partially released to foster research and development on driver monitoring systems (DMS), as it is a particular sub-case which receives less attention than exterior perception. Details on the preparation, construction, post-processing, labelling and publication of the dataset are presented in this paper, along with the announcement of a subsequent release of DMD material publicly available for the community. Full article
(This article belongs to the Special Issue Robust Multimodal Sensing for Automated Driving Systems)
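
For flavour, a synchronized multi-camera annotation record in such a dataset might look like the following sketch; the field names and label values are editorial assumptions, not the actual DMD schema.

```python
import json

record = {
    "timestamp_us": 1618321512000000,
    "streams": {                                # synchronized views of the same instant
        "face_camera": "face/000123.png",
        "body_camera": "body/000123.png",
        "hands_camera": "hands/000123.png",
    },
    "labels": {
        "driver_action": "texting",             # non-driving-related task
        "gaze_zone": "center_console",
        "hands_on_wheel": False,
        "verified_by_second_annotator": True,   # supports the "veracity" dimension
    },
}
print(json.dumps(record, indent=2))
```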