Multi-Modal Data Sensing and Processing

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: 5 June 2024 | Viewed by 10237

Special Issue Editors


Prof. Dr. Chang Tang
Guest Editor
School of Computer Science, China University of Geosciences, Wuhan 430074, China
Interests: information fusion; multi-view clustering; multiple kernel clustering; multi-modality fusion

Dr. Yugen Yi
Guest Editor
School of Software, Jiangxi Normal University, Nanchang 330022, China
Interests: artificial intelligence; computer vision; machine learning; medical imaging processing

Dr. Sen Xiang
Guest Editor
School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China
Interests: RGB-D data acquisition and processing; 3D vision; structured light 3D reconstruction

Special Issue Information

Dear Colleagues,

With advances in sensing technology, increasing amounts of multi-modal data can be easily collected from different sources, describing objects and scenes from different aspects. For example, in visual data processing, various sensors capture data from multiple domains or modalities, such as RGB and depth data, during video/image acquisition. In medical data analysis, magnetic resonance imaging (MRI), positron emission tomography (PET), and single-nucleotide polymorphisms (SNPs) are often combined to support reliable decisions. For the classification of glioblastoma multiforme subtypes, three different modalities, namely miRNA expression (miRNA), mRNA expression (mRNA), and DNA methylation (DNA), are commonly used. In multi-modal data, different modalities often contain both complementary and consensus information. Fully exploiting multi-modality information is essential for enhancing specific tasks, which has given rise to many multi-modality learning methods.
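As a concrete (and deliberately simplified) illustration of how complementary information from two modalities can be combined, the sketch below contrasts early fusion (feature concatenation) with late fusion (averaging per-modality decisions). All feature names and data are placeholders, not a method drawn from this Special Issue.

```python
# Minimal sketch: early vs. late fusion of two modalities (placeholder data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
rgb = rng.normal(size=(n, 64))     # hypothetical RGB features
depth = rng.normal(size=(n, 32))   # hypothetical depth features
y = rng.integers(0, 2, size=n)

# Early fusion: concatenate modality features and train one classifier.
early = LogisticRegression(max_iter=1000).fit(np.hstack([rgb, depth]), y)

# Late fusion: one classifier per modality; average their class probabilities.
clf_rgb = LogisticRegression(max_iter=1000).fit(rgb, y)
clf_depth = LogisticRegression(max_iter=1000).fit(depth, y)
late_probs = 0.5 * (clf_rgb.predict_proba(rgb) + clf_depth.predict_proba(depth))
late_pred = late_probs.argmax(axis=1)
```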

Over the past few decades, various models and networks have been proposed for multi-modality learning and have achieved great success; nevertheless, many issues remain unsolved and need further investigation. For example, missing or noisy values often occur in multi-modal data due to sensor failures, and the big data era also raises scalability problems when handling large-scale multi-modal data. It is therefore important to regularly bring together high-quality research and innovative work covering multi-modal data sensing, processing, fusion, and multi-modality learning models.
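To make the missing-value issue concrete, one simple mitigation is decision-level fusion that masks out unobserved modalities when averaging confidences; the sketch below shows this idea. The function and its arguments are illustrative assumptions, not taken from any particular paper.

```python
# Hedged sketch: late fusion that tolerates missing modalities.
import numpy as np

def masked_late_fusion(prob_list, observed):
    """Average per-modality class probabilities, skipping missing modalities.

    prob_list: list of (n_samples, n_classes) arrays, one per modality
    observed:  (n_samples, n_modalities) boolean mask, True where observed
    """
    probs = np.stack(prob_list, axis=1)      # (n_samples, n_modalities, n_classes)
    w = observed[:, :, None].astype(float)   # zero weight for missing modalities
    return (probs * w).sum(axis=1) / np.clip(w.sum(axis=1), 1e-9, None)
```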

This Special Issue aims to provide such a platform, fully within the scope of this journal.

Prof. Dr. Chang Tang
Dr. Yugen Yi
Dr. Sen Xiang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (5 papers)

Research

14 pages, 4558 KiB  
Article
Dual-Signal Feature Spaces Map Protein Subcellular Locations Based on Immunohistochemistry Image and Protein Sequence
by Kai Zou, Simeng Wang, Ziqian Wang, Hongliang Zou and Fan Yang
Sensors 2023, 23(22), 9014; https://doi.org/10.3390/s23229014 - 07 Nov 2023
Cited by 1 | Viewed by 1004
Abstract
Proteins are among the primary biochemical macromolecular regulators in the compartmental cellular structure, and the subcellular locations of proteins can therefore provide information on the function of subcellular structures and physiological environments. Recently, data-driven systems have been developed to predict the subcellular location of proteins based on protein sequences, immunohistochemistry (IHC) images, or immunofluorescence (IF) images. However, the fusion of multiple protein signals has received little attention. In this study, we developed a dual-signal computational protocol that incorporates IHC images alongside protein sequences to learn protein subcellular localization. The protocol comprises three major steps: first, a benchmark database was constructed of 281 proteins, selected from 4722 proteins in the Human Protein Atlas (HPA) and Swiss-Prot databases and covering the endoplasmic reticulum (ER), Golgi apparatus, cytosol, and nucleoplasm; second, discriminative feature operators were employed to quantify the image-sequence samples, which comprise IHC images and protein sequences; finally, the feature subspaces of the different protein signals were used to construct multiple sub-classifiers via dimensionality reduction and binary relevance (BR), and the confidence scores of these sub-classifiers were combined by a centralized voting mechanism at the decision layer to determine the subcellular location. The experimental results indicated that the dual-signal model embedding IHC images and protein sequences outperformed the single-signal models, with accuracy, precision, and recall of 75.41%, 80.38%, and 74.38%, respectively. These results are encouraging for further research on protein subcellular location prediction under multi-signal fusion.
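The decision-level pipeline sketched in the abstract (per-signal dimensionality reduction, binary relevance, and a combined vote) can be roughed out as follows. Feature extraction and the paper's exact voting rule are not reproduced; all data and dimensions below are placeholders.

```python
# Hedged sketch of a dual-signal, binary-relevance pipeline (placeholder data).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(1)
n, n_locations = 281, 4                        # ER, Golgi, cytosol, nucleoplasm
img = rng.normal(size=(n, 500))                # placeholder IHC image features
seq = rng.normal(size=(n, 400))                # placeholder sequence features
Y = rng.integers(0, 2, size=(n, n_locations))  # multi-label location targets

scores = []
for X in (img, seq):
    Xr = PCA(n_components=50).fit_transform(X)  # per-signal dimensionality reduction
    br = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(Xr, Y)
    scores.append(br.decision_function(Xr))     # one confidence per location
# Decision layer: average the two signals' confidences and threshold at 0.
pred = (np.mean(scores, axis=0) > 0).astype(int)
```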

15 pages, 4788 KiB  
Article
Deep-Learning-Aided Evaluation of Spondylolysis Imaged with Ultrashort Echo Time Magnetic Resonance Imaging
by Suraj Achar, Dosik Hwang, Tim Finkenstaedt, Vadim Malis and Won C. Bae
Sensors 2023, 23(18), 8001; https://doi.org/10.3390/s23188001 - 21 Sep 2023
Viewed by 1054
Abstract
Isthmic spondylolysis results in a fracture of the pars interarticularis of the lumbar spine, found in as many as half of adolescent athletes with persistent low back pain. While computed tomography (CT) is the gold standard for the diagnosis of spondylolysis, the use of ionizing radiation near reproductive organs in young subjects is undesirable. While magnetic resonance imaging (MRI) is preferable, it has lower sensitivity for detecting the condition. Recently, it has been shown that ultrashort echo time (UTE) MRI can provide markedly improved bone contrast compared to conventional MRI. To take UTE MRI further, we developed supervised deep learning tools to generate (1) CT-like images and (2) saliency maps of fracture probability from UTE MRI, using ex vivo preparations of cadaveric spines. We further compared quantitative metrics of contrast-to-noise ratio (CNR), mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM) between UTE MRI (intensity-inverted to resemble CT) and CT, and between the CT-like images and CT. Qualitative results demonstrated the feasibility of generating CT-like images from UTE MRI that are easier to interpret for bone fractures thanks to improved image contrast and CNR. Quantitatively, the mean CNR of bone against defect-filled tissue was 35, 97, and 146 for UTE MRI, CT-like, and CT images, respectively, significantly higher for CT-like than for UTE MRI images. For the image similarity metrics using the CT image as the reference, CT-like images provided a significantly lower mean MSE (0.038 vs. 0.0528), higher mean PSNR (28.6 vs. 16.5), and higher SSIM (0.73 vs. 0.68) than UTE MRI images. Additionally, the saliency maps enabled quick detection of the probable pars fracture location by providing visual cues to the reader. This proof-of-concept study is limited to data from ex vivo samples, and additional work in human subjects with spondylolysis would be necessary to refine the models for clinical use. Nonetheless, this study shows that UTE MRI and deep learning tools could be highly useful for the evaluation of isthmic spondylolysis.
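For readers who want to reproduce the comparison metrics reported above, the sketch below computes MSE, PSNR, and SSIM with scikit-image, plus a simple CNR between two region masks. The images and masks are synthetic stand-ins, not the study's data.

```python
# Hedged sketch of the image comparison metrics (synthetic stand-in images).
import numpy as np
from skimage.metrics import (mean_squared_error, peak_signal_noise_ratio,
                             structural_similarity)

rng = np.random.default_rng(2)
ct = rng.random((128, 128))                                      # reference (CT)
ct_like = np.clip(ct + 0.05 * rng.normal(size=ct.shape), 0, 1)   # generated image

mse = mean_squared_error(ct, ct_like)
psnr = peak_signal_noise_ratio(ct, ct_like, data_range=1.0)
ssim = structural_similarity(ct, ct_like, data_range=1.0)

def cnr(img, roi_a, roi_b):
    """Contrast-to-noise ratio between two boolean region masks."""
    return abs(img[roi_a].mean() - img[roi_b].mean()) / img[roi_b].std()

bone, tissue = ct_like > 0.7, ct_like < 0.3  # placeholder region masks
print(mse, psnr, ssim, cnr(ct_like, bone, tissue))
```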

16 pages, 3488 KiB  
Article
Graph Sampling-Based Multi-Stream Enhancement Network for Visible-Infrared Person Re-Identification
by Jinhua Jiang, Junjie Xiao, Renlin Wang, Tiansong Li, Wenfeng Zhang, Ruisheng Ran and Sen Xiang
Sensors 2023, 23(18), 7948; https://doi.org/10.3390/s23187948 - 18 Sep 2023
Viewed by 840
Abstract
With the increasing demand for person re-identification (Re-ID), all-day retrieval has become an inevitable trend, and single-modal Re-ID is no longer sufficient to meet this requirement, making multi-modal data crucial in Re-ID. Consequently, the Visible-Infrared Person Re-Identification (VI Re-ID) task has been proposed, which aims to match pairs of person images from the visible and infrared modalities. The significant discrepancy between the two modalities poses a major challenge. Existing VI Re-ID methods focus on cross-modal feature learning and modal transformation to alleviate this discrepancy but overlook person contour information. Contours are modality-invariant, which is vital for learning effective identity representations and for cross-modal matching. In addition, due to the low intra-modal diversity of the visible modality, it is difficult to distinguish the boundaries between some hard samples. To address these issues, we propose the Graph Sampling-based Multi-stream Enhancement Network (GSMEN). First, the Contour Expansion Module (CEM) incorporates a person's contour information into the original samples, further reducing the modality discrepancy and improving matching stability between image pairs of different modalities. Additionally, to better distinguish cross-modal hard sample pairs during training, an innovative Cross-modality Graph Sampler (CGS) is designed for sample selection before training. The CGS computes the feature distance between samples from different modalities and groups similar samples into the same batch, effectively exploring the boundary relationships between hard classes in the cross-modal setting. Experiments conducted on the SYSU-MM01 and RegDB datasets demonstrate the superiority of the proposed method. Specifically, in the VIS→IR task, the results on the RegDB dataset reach 93.69% Rank-1 and 92.56% mAP.
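The intuition behind the Cross-modality Graph Sampler can be sketched as distance-aware batch construction: each visible anchor is batched with its nearest infrared samples, which are the likely hard pairs. This is a rough illustration over random features, not the paper's exact sampler.

```python
# Hedged sketch: cross-modality, distance-aware batch sampling (random data).
import numpy as np

def cross_modal_batches(vis_feats, ir_feats, k=4, seed=0):
    """Yield (visible anchor index, indices of its k nearest infrared samples)."""
    # Pairwise Euclidean distances between modalities, shape (n_vis, n_ir).
    d = np.linalg.norm(vis_feats[:, None, :] - ir_feats[None, :, :], axis=-1)
    for i in np.random.default_rng(seed).permutation(len(vis_feats)):
        yield i, np.argsort(d[i])[:k]

vis = np.random.default_rng(3).normal(size=(32, 128))
ir = np.random.default_rng(4).normal(size=(40, 128))
for anchor, hard_ir in cross_modal_batches(vis, ir):
    pass  # assemble a mini-batch from `anchor` and its hard infrared neighbors
```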

19 pages, 9285 KiB  
Article
CLIP-Based Adaptive Graph Attention Network for Large-Scale Unsupervised Multi-Modal Hashing Retrieval
by Yewen Li, Mingyuan Ge, Mingyong Li, Tiansong Li and Sen Xiang
Sensors 2023, 23(7), 3439; https://doi.org/10.3390/s23073439 - 24 Mar 2023
Cited by 1 | Viewed by 2032
Abstract
With the proliferation of multi-modal data generated by various sensors, unsupervised multi-modal hashing retrieval has been extensively studied due to its advantages in storage, retrieval efficiency, and label independence. However, two obstacles remain for existing unsupervised methods: (1) they cannot fully capture the complementary and co-occurrence information of multi-modal data, leading to inaccurate similarity measures; and (2) they suffer from unbalanced multi-modal learning and from corruption of the data's semantic structure when hash codes are binarized. To address these obstacles, we devise an effective CLIP-based Adaptive Graph Attention Network (CAGAN) for large-scale unsupervised multi-modal hashing retrieval. First, we use the multi-modal model CLIP to extract fine-grained semantic features, mine similarity information from different perspectives of the multi-modal data, and perform similarity fusion and enhancement. In addition, we propose an adaptive graph attention network to assist the learning of hash codes, which uses an attention mechanism to learn adaptive graph similarity across modalities. It further aggregates the intrinsic neighborhood information of neighboring data nodes through a graph convolutional network to generate more discriminative hash codes. Finally, we employ an iterative approximate optimization strategy to mitigate the information loss in the binarization process. Extensive experiments on three benchmark datasets demonstrate that the proposed method significantly outperforms several representative hashing methods in unsupervised multi-modal retrieval tasks.
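The retrieval-side mechanics the abstract alludes to, sign-based binarization followed by Hamming-distance ranking, fit in a few lines; the CLIP encoder, graph attention network, and iterative optimization are outside this sketch.

```python
# Hedged sketch: hash-code binarization and Hamming-distance retrieval.
import numpy as np

rng = np.random.default_rng(5)
db_codes = np.sign(rng.normal(size=(1000, 64)))  # binarized database hash codes
query = np.sign(rng.normal(size=64))             # one binarized query code

# For {-1, +1} codes, Hamming distance = (n_bits - dot product) / 2.
hamming = (db_codes.shape[1] - db_codes @ query) / 2
ranked = np.argsort(hamming)                     # closest database items first
print(ranked[:10])
```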

Review

42 pages, 12665 KiB  
Review
Computer Vision Technology for Monitoring of Indoor and Outdoor Environments and HVAC Equipment: A Review
by Bin Yang, Shuang Yang, Xin Zhu, Min Qi, He Li, Zhihan Lv, Xiaogang Cheng and Faming Wang
Sensors 2023, 23(13), 6186; https://doi.org/10.3390/s23136186 - 06 Jul 2023
Cited by 4 | Viewed by 3599
Abstract
Artificial intelligence technologies such as computer vision (CV), machine learning, the Internet of Things (IoT), and robotics have advanced rapidly in recent years. These technologies provide non-contact measurements in three areas: indoor environmental monitoring, outdoor environmental monitoring, and equipment monitoring. This paper summarizes the specific applications of non-contact measurement based on infrared and visible images in the areas of personnel skin temperature, position and posture, the urban physical environment, building construction safety, and equipment operation status. The challenges and opportunities associated with the application of CV technology are also anticipated.
