Advanced Computer Vision Systems 2023

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: closed (31 December 2023) | Viewed by 4652

Special Issue Editors


Prof. Dr. Markus Vincze
Guest Editor
Automation and Control Institute, Vienna University of Technology, Gusshausstrasse 27-29 / E376, 1040 Vienna, Austria
Interests: robot vision; service robots; object detection; scene understanding; robots at home

Dr. Jean-Baptiste Weibel
Guest Editor
Automation and Control Institute, Technische Universität Wien, Vienna, Austria
Interests: computer vision; machine learning; robotics

Special Issue Information

Dear Colleagues,

Computer vision has made remarkable progress in recent years, driven by advances in algorithms, deep learning models, and powerful computing resources. As a result, computer vision systems can now perform tasks that were once considered challenging or impossible, such as object recognition, tracking, segmentation, detection, and classification, with high accuracy and speed. Moreover, computer vision has found practical applications in many domains, including healthcare, transportation, surveillance, robotics, entertainment, and education. This Special Issue primarily addresses issues that arise in the design and deployment of comprehensive computer vision systems.

Its scope includes, but is not limited to, the following topics:

  • Building vision systems: paradigms, architectures, integration and control;
  • Vision system applications: systems deployed in real/realistic scenarios;
  • Robot vision;
  • Real-time vision systems;
  • Mobile and wearable vision systems;
  • Hardware-implemented vision systems;
  • Vision for the real world: robustness, learning, adaptability, self-assessment and failure recovery;
  • Vision for autonomous vehicles;
  • Vision for healthcare and rehabilitation applications;
  • Vision for surveillance and security applications;
  • Vision for virtual and augmented reality (VR and AR) applications;
  • Vision for industrial automation in Factories of the Future (FoFs);
  • Cognitive vision systems;
  • Human–computer interaction: monitoring, supervised learning and scene interpretation;
  • Human–robot collaboration: gesture recognition and scene understanding;
  • Performance evaluation: benchmarks, methods and metrics.

Prof. Dr. Markus Vincze
Dr. Jean-Baptiste Weibel
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (4 papers)


Research

16 pages, 4562 KiB  
Article
Enhancing Query Formulation for Universal Image Segmentation
by Yipeng Qu and Joohee Kim
Sensors 2024, 24(6), 1879; https://doi.org/10.3390/s24061879 - 14 Mar 2024
Viewed by 547
Abstract
Recent advancements in image segmentation have been notably driven by Vision Transformers. These transformer-based models offer a single, versatile network structure capable of handling a variety of segmentation tasks. Despite their effectiveness, the pursuit of enhanced capabilities often leads to more intricate architectures and greater computational demands. OneFormer has responded to these challenges by introducing a query-text contrastive learning strategy that is active only during training. However, this approach has not completely addressed the inefficiency issues in text generation and the contrastive loss computation. To solve these problems, we introduce the Efficient Query Optimizer (EQO), an approach that efficiently utilizes multi-modal data to refine query optimization in image segmentation. Our strategy significantly reduces the complexity of parameters and computations by distilling inter-class and inter-task information from an image into a single template sentence. Furthermore, we propose a novel attention-based contrastive loss, designed to facilitate a one-to-many matching mechanism in the loss computation, which helps object queries learn more robust representations. Beyond merely reducing complexity, our model demonstrates superior performance compared to OneFormer across all three segmentation tasks using the Swin-T backbone. Our evaluations on the ADE20K dataset reveal that our model outperforms OneFormer in multiple metrics: by 0.2% in mean Intersection over Union (mIoU), 0.6% in Average Precision (AP), and 0.8% in Panoptic Quality (PQ). These results highlight the efficacy of our model in advancing the field of image segmentation.
(This article belongs to the Special Issue Advanced Computer Vision Systems 2023)

17 pages, 2419 KiB  
Article
Text-Guided Image Editing Based on Post Score for Gaining Attention on Social Media
by Yuto Watanabe, Ren Togo, Keisuke Maeda, Takahiro Ogawa and Miki Haseyama
Sensors 2024, 24(3), 921; https://doi.org/10.3390/s24030921 - 31 Jan 2024
Viewed by 2083
Abstract
Text-guided image editing has been highlighted in the fields of computer vision and natural language processing in recent years. The approach takes an image and a text prompt as input and aims to edit the image in accordance with the text prompt while preserving text-unrelated regions. The results of text-guided image editing differ depending on how the text prompt is phrased, even when its meaning is the same, and it is up to the user to decide which result best matches the intended use of the edited image. This paper assumes a situation in which edited images are posted to social media and proposes a novel text-guided image editing method to help the edited images gain attention from a greater audience. In the proposed method, we apply a pre-trained text-guided image editing method and obtain multiple edited images from multiple text prompts generated by a large language model. The proposed method leverages a novel model that predicts post scores representing engagement rates and selects, among these edited images, the one image that will gain the most attention from the audience on social media. Subject experiments on a dataset of real Instagram posts demonstrate that the edited images of the proposed method accurately reflect the content of the text prompts and make a more positive impression on the social media audience than those of previous text-guided image editing methods.
(This article belongs to the Special Issue Advanced Computer Vision Systems 2023)

18 pages, 2218 KiB  
Article
Manipulation Direction: Evaluating Text-Guided Image Manipulation Based on Similarity between Changes in Image and Text Modalities
by Yuto Watanabe, Ren Togo, Keisuke Maeda, Takahiro Ogawa and Miki Haseyama
Sensors 2023, 23(22), 9287; https://doi.org/10.3390/s23229287 - 20 Nov 2023
Viewed by 758
Abstract
At present, text-guided image manipulation is a notable subject of study in the vision and language field. Given an image and text as inputs, these methods aim to manipulate the image according to the text while preserving text-irrelevant regions. Although there has been extensive research to improve the versatility and performance of text-guided image manipulation, research on its performance evaluation is inadequate. This study proposes Manipulation Direction (MD), a logical and robust metric that evaluates the performance of text-guided image manipulation by focusing on changes between the image and text modalities. Specifically, we define MD as the consistency of the changes in images and texts occurring before and after manipulation. By using MD to evaluate text-guided image manipulation, we can comprehensively assess how an image has changed as a result of the manipulation and whether this change agrees with the text. Extensive experiments on Multi-Modal-CelebA-HQ and Caltech-UCSD Birds confirmed that our calculated MD scores correlate with subjective scores for the manipulated images more strongly than the existing metrics do.
(This article belongs to the Special Issue Advanced Computer Vision Systems 2023)

17 pages, 5391 KiB  
Article
Point CNN: 3D Face Recognition with Local Feature Descriptor and Feature Enhancement Mechanism
by Qi Wang, Hang Lei and Weizhong Qian
Sensors 2023, 23(18), 7715; https://doi.org/10.3390/s23187715 - 6 Sep 2023
Cited by 1 | Viewed by 895
Abstract
Three-dimensional face recognition is an important part of the field of computer vision. Point clouds are widely used in 3D vision due to their simple mathematical representation. However, the unordered nature of the points makes it difficult to define ordered indexes for them in convolutional neural networks. In addition, point clouds lack detailed textures, which makes facial features easily affected by changes in expression or head pose. To solve the above problems, this paper constructs a new face recognition network consisting of two main parts. The first part is a novel operator based on a local feature descriptor that realizes fine-grained feature extraction and the permutation invariance of point clouds. The second part is a feature enhancement mechanism that enhances the discrimination of facial features. To verify the performance of our method, we conducted experiments on three public datasets: CASIA-3D, Bosphorus, and Lock3Dface. The results show that the accuracy of our method is improved by 0.7%, 0.4%, and 0.8%, respectively, compared with the latest methods on these three datasets.
(This article belongs to the Special Issue Advanced Computer Vision Systems 2023)
