Applications of Computer Vision

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (20 March 2022) | Viewed by 38531

Special Issue Editor


Dr. Eva Cernadas
Guest Editor
Centro Singular de Investigación en Tecnoloxías Intelixentes (CITIUS, Research Center of Intelligent Systems), University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
Interests: image segmentation; texture analysis; classification; regression; pattern recognition; applications of computer vision

Special Issue Information

Dear Colleagues,

Computer vision (CV) techniques are widely used by practicing engineers to solve a whole range of real-world vision problems. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible, covering practical applications of CV methods in all branches of science and engineering. Submitted papers should report a novel aspect of CV use in a real-world engineering application and should be validated using data sets. There is no restriction on the length of papers. Electronic files and software covering the full details of the calculation or experimental procedure, if unable to be published in the normal way, can be deposited as supplementary electronic material.

Focal points of the Special Issue include but are not limited to innovative applications of:

  • Medical and biological imaging
  • Industrial inspection
  • Robotics
  • Photo and video interpretation
  • Image retrieval
  • Video analysis and annotation
  • Multimedia
  • Sensors and more.

Dr. Eva Cernadas
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image and video segmentation
  • image classification
  • video analysis
  • pattern recognition
  • image and video understanding

Published Papers (10 papers)


Research

14 pages, 5886 KiB  
Article
A Saliency Prediction Model Based on Re-Parameterization and Channel Attention Mechanism
by Fei Yan, Zhiliang Wang, Siyu Qi and Ruoxiu Xiao
Electronics 2022, 11(8), 1180; https://doi.org/10.3390/electronics11081180 - 08 Apr 2022
Cited by 2 | Viewed by 1537
Abstract
Deep saliency models can effectively imitate the attention mechanism of human vision, and they perform considerably better than classical models that rely on handcrafted features. However, deep models also require higher-level information, such as context or emotional content, to further approach human performance. Therefore, this study proposes a multilevel saliency prediction network that aims to use a combination of spatial and channel information to find possible high-level features, further improving the performance of a saliency model. Firstly, we use a VGG style network with an identity block as the primary network architecture. With the help of re-parameterization, we can obtain rich features similar to multiscale networks and effectively reduce computational cost. Secondly, a subnetwork with a channel attention mechanism is designed to find potential saliency regions and possible high-level semantic information in an image. Finally, image spatial features and a channel enhancement vector are combined after quantization to improve the overall performance of the model. Compared with classical models and other deep models, our model exhibits superior overall performance. Full article
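The channel attention mechanism described in this abstract can be illustrated with a minimal squeeze-and-excitation-style sketch in NumPy. This is not the authors' implementation; the function name, weight shapes, and reduction ratio are illustrative assumptions.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation-style channel attention.

    features: (C, H, W) feature maps; w1, w2: learned projection matrices.
    Returns the feature maps reweighted by per-channel importance.
    """
    c = features.shape[0]
    # Squeeze: global average pooling collapses each channel to one scalar.
    squeezed = features.reshape(c, -1).mean(axis=1)          # (C,)
    # Excitation: bottleneck MLP followed by a sigmoid gate.
    hidden = np.maximum(0.0, w1 @ squeezed)                  # ReLU, (C//r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))              # sigmoid, (C,)
    # Scale: broadcast the per-channel weights over H and W.
    return features * gate[:, None, None]

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
y = channel_attention(x, w1, w2)
```

A learned gate in (0, 1) rescales each channel, letting the network emphasize the feature maps that carry potentially salient, higher-level content.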
(This article belongs to the Special Issue Applications of Computer Vision)

15 pages, 2150 KiB  
Article
3DPlanNet: Generating 3D Models from 2D Floor Plan Images Using Ensemble Methods
by Sungsoo Park and Hyeoncheol Kim
Electronics 2021, 10(22), 2729; https://doi.org/10.3390/electronics10222729 - 09 Nov 2021
Cited by 10 | Viewed by 11649
Abstract
Research on converting 2D raster drawings into 3D vector data has a long history in the field of pattern recognition. Prior to the achievement of machine learning, existing studies were based on heuristics and rules. In recent years, there have been several studies employing deep learning, but a great effort was required to secure a large amount of data for learning. In this study, to overcome these limitations, we used 3DPlanNet Ensemble methods incorporating rule-based heuristic methods to learn with only a small amount of data (30 floor plan images). Experimentally, this method produced a wall accuracy of more than 95% and an object accuracy similar to that of a previous study using a large amount of learning data. In addition, 2D drawings without dimension information were converted into ground truth sizes with an accuracy of 97% or more, and structural data in the form of 3D models in which layers were divided for each object, such as walls, doors, windows, and rooms, were created. Using the 3DPlanNet Ensemble proposed in this study, we generated 110,000 3D vector data with a wall accuracy of 95% or more from 2D raster drawings end to end. Full article

14 pages, 64236 KiB  
Article
Automatic Unsupervised Fabric Defect Detection Based on Self-Feature Comparison
by Zhengrui Peng, Xinyi Gong, Bengang Wei, Xiangyi Xu and Shixiong Meng
Electronics 2021, 10(21), 2652; https://doi.org/10.3390/electronics10212652 - 29 Oct 2021
Cited by 7 | Viewed by 2235
Abstract
Due to the huge demand for textile production in China, fabric defect detection is particularly attractive. At present, an increasing number of supervised deep-learning methods are being applied in surface defect detection. However, the annotation of datasets in industrial settings often depends on professional inspectors. Moreover, the methods based on supervised learning require a lot of annotation, which consumes a great deal of time and costs. In this paper, an approach based on self-feature comparison (SFC) was employed that accurately located and segmented fabric texture images to find anomalies with unsupervised learning. The SFC architecture contained the self-feature reconstruction module and the self-feature distillation. Accurate fiber anomaly location and segmentation were generated based on these two modules. Compared with the traditional methods that operate in image space, the comparison of feature space can better locate the anomalies of fiber texture surfaces. Evaluations were performed on the three publicly available databases. The results indicated that our method performed well compared with other methods, and had excellent defect detection ability in the collected textile images. In addition, the visual results showed that our results can be used as a pixel-level candidate label. Full article
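The feature-space comparison idea behind unsupervised defect localization can be sketched with a toy example: hand-crafted patch statistics stand in for the learned features of the paper's SFC modules, and the anomaly score is simply the distance to a defect-free reference. All names and the (mean, std) descriptor are illustrative assumptions.

```python
import numpy as np

def patch_features(img, k):
    """Split a grayscale image into non-overlapping k x k patches and
    describe each patch by (mean, std) -- a stand-in for learned features."""
    h, w = img.shape
    feats = np.empty((h // k, w // k, 2))
    for i in range(h // k):
        for j in range(w // k):
            p = img[i * k:(i + 1) * k, j * k:(j + 1) * k]
            feats[i, j] = (p.mean(), p.std())
    return feats

def anomaly_map(test_img, ref_feats, k):
    """Score each patch by its feature-space distance to the reference."""
    f = patch_features(test_img, k)
    return np.linalg.norm(f - ref_feats, axis=-1)

# Defect-free reference texture vs. a test image with a bright blob "defect".
rng = np.random.default_rng(1)
ref = rng.normal(0.5, 0.05, (32, 32))
ref_feats = patch_features(ref, 8)
test = rng.normal(0.5, 0.05, (32, 32))
test[8:16, 16:24] += 1.0                 # inject an anomaly in patch (1, 2)
amap = anomaly_map(test, ref_feats, 8)
```

Comparing statistics in feature space rather than raw pixel values is what lets such methods tolerate the normal variation of a fabric texture while still flagging true anomalies.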

15 pages, 9030 KiB  
Article
STERapp: Semiautomatic Software for Stereological Analysis. Application in the Estimation of Fish Fecundity
by Almoutaz Mbaidin, Sonia Rábade-Uberos, Rosario Dominguez-Petit, Andrés Villaverde, María Encarnación Gónzalez-Rufino, Arno Formella, Manuel Fernández-Delgado and Eva Cernadas
Electronics 2021, 10(12), 1432; https://doi.org/10.3390/electronics10121432 - 15 Jun 2021
Cited by 5 | Viewed by 2947
Abstract
Stereology is the tridimensional interpretation of bidimensional sections of a structure, widely used in fields such as mineralogy, medicine, and biology. This paper proposes a general software tool for stereological analysis, called STERapp, with a friendly graphical interface that enables expert supervision. It includes a module to estimate fish fecundity (the number of mature oocytes in the ovary), which has been used by experts in fish biology in two Spanish marine research centers since 2020 to estimate the fecundity of five fish species with different reproductive strategies and oocyte characteristics. This module encloses advanced computer vision and machine learning techniques to automatically recognize and classify the cells in histological images of fish gonads. The automatic recognition algorithm achieved a sensitivity of 55.6%, a specificity of 64.8%, and an average precision of 43.1%. The accuracies achieved for oocyte classification were 84.5% for the maturity stages and 78.5% for the classification regarding presence/absence of the nucleus. This facilitates the analysis and saves experts' time. The System Usability Scale (SUS) questionnaire reported a mean score of 81.9, which means that the system was perceived as good to excellent for carrying out stereological analysis for the estimation of fish fecundity. Full article

22 pages, 5989 KiB  
Article
Effects of Enhancement on Deep Learning Based Hepatic Vessel Segmentation
by Shanmugapriya Survarachakan, Egidijius Pelanis, Zohaib Amjad Khan, Rahul Prasanna Kumar, Bjørn Edwin and Frank Lindseth
Electronics 2021, 10(10), 1165; https://doi.org/10.3390/electronics10101165 - 13 May 2021
Cited by 18 | Viewed by 3765
Abstract
Colorectal cancer (CRC) is the third most common type of cancer, with the liver being the most common site of cancer spread. A precise understanding of patient liver anatomy and pathology, as well as surgical planning based on it, plays a critical role in the treatment process. In some cases, surgeons request a 3D reconstruction, which requires a thorough analysis of the available images to convert them into 3D models of the relevant objects through a segmentation process. Liver vessel segmentation is challenging due to the large variations in the size and direction of the vessel structures as well as difficult contrast conditions. In recent years, deep learning-based methods have outperformed conventional image analysis methods in the field of medical imaging. Although Convolutional Neural Networks (CNNs) have proven effective for medical image segmentation, the way the image data are handled and the preprocessing techniques applied play an important role in segmentation. Our work focuses on combining different vesselness enhancement filters and preprocessing methods to enhance the hepatic vessels prior to segmentation. In the first experiment, the effect of enhancement using individual vesselness filters was studied. In the second, the effect of gamma correction on vesselness filters was studied. Lastly, the effect of fused vesselness filters over individual filters was studied. The methods were evaluated on clinical CT data.
The quantitative analysis of the results in terms of different evaluation metrics can be summed up as follows: (i) each of the filtered methods shows an improvement compared to the unenhanced images, with the best mean DICE score of 0.800 versus 0.740 for unenhanced; (ii) gamma correction provides a statistically significant improvement in the performance of each filter, improving the mean DICE by around 2%; (iii) both the fused filtered images and the fused segmentations give the best results (mean DICE scores of 0.818 and 0.830, respectively), a statistically significant improvement over the individual filters with and without gamma correction. The results have been further verified by qualitative analysis, showing the importance of the proposed fused filter and segmentation approaches. Full article
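The gamma-correction preprocessing studied in the second experiment can be sketched as follows. This is the generic power-law formulation; the gamma value and the toy image are illustrative, not taken from the paper.

```python
import numpy as np

def gamma_correct(img, gamma):
    """Apply power-law gamma correction to an intensity image in [0, 1].

    gamma < 1 brightens dark structures (e.g., faint vessels in CT),
    increasing their contrast against a dark background; gamma > 1 darkens.
    """
    img = np.clip(img, 0.0, 1.0)
    return img ** gamma

# A faint "vessel" row (intensity 0.2) on a dark background (intensity 0.05).
img = np.full((8, 8), 0.05)
img[4, :] = 0.2
corrected = gamma_correct(img, 0.5)
```

After correction the vessel-to-background intensity gap widens (about 0.22 versus the original 0.15), which is the kind of enhancement that can help a vesselness filter respond more strongly to faint vessels.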

14 pages, 12920 KiB  
Article
Real-Time On-Board Deep Learning Fault Detection for Autonomous UAV Inspections
by Naeem Ayoub and Peter Schneider-Kamp
Electronics 2021, 10(9), 1091; https://doi.org/10.3390/electronics10091091 - 05 May 2021
Cited by 26 | Viewed by 3664
Abstract
Inspection of high-voltage power lines using unmanned aerial vehicles is an emerging technological alternative to traditional methods. In the Drones4Energy project, we work toward building an autonomous vision-based beyond-visual-line-of-sight (BVLOS) power line inspection system. In this paper, we present a deep learning-based autonomous vision system to detect faults in power line components. We trained a YOLOv4-tiny architecture-based deep neural network, as it showed prominent results for detecting components with high accuracy. For running such deep learning models in a real-time environment, different single-board devices such as the Raspberry Pi 4, Nvidia Jetson Nano, Nvidia Jetson TX2, and Nvidia Jetson AGX Xavier were used for the experimental evaluation. Our experimental results demonstrated that the proposed approach can be effective and efficient for fully automatic real-time on-board visual power line inspection. Full article

19 pages, 8020 KiB  
Article
Research on the Cascade Vehicle Detection Method Based on CNN
by Jianjun Hu, Yuqi Sun and Songsong Xiong
Electronics 2021, 10(4), 481; https://doi.org/10.3390/electronics10040481 - 18 Feb 2021
Cited by 10 | Viewed by 2713
Abstract
This paper introduces an adaptive method for detecting front vehicles under complex weather conditions. In the field of vehicle detection from images captured by cameras installed in vehicles, backgrounds with complicated weather, such as rainy and snowy days, increase the difficulty of target detection. In order to improve the accuracy and robustness of vehicle detection in front of driverless cars, a cascade vehicle detection method combining multifeature fusion and a convolutional neural network (CNN) is proposed in this paper. Firstly, local binary pattern (LBP), Haar-like, and histogram of oriented gradients (HOG) features of the front vehicle are extracted; then, principal component analysis (PCA) dimension reduction and serial fusion are applied to the extracted features. The fused features are fed to a support vector machine (SVM) classifier for preliminary screening, and a CNN model then validates the filtered candidates in a cascade. Finally, an integrated data set extracted from BDD, Udacity, and other data sets is utilized to test the proposed method. The recall rate is 98.69%, better than that of traditional feature algorithms, and a recall rate of 97.32% in a complex driving environment indicates that the algorithm possesses good robustness. Full article
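The serial-fusion and PCA steps of such a pipeline can be sketched as follows, with random vectors standing in for the actual LBP, Haar-like, and HOG descriptors. The dimensions and the number of retained components are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project row-vector samples X onto their top principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

rng = np.random.default_rng(2)
n = 50
# Stand-ins for per-window LBP, Haar-like, and HOG descriptors.
lbp = rng.random((n, 59))
haar = rng.random((n, 40))
hog = rng.random((n, 36))
fused = np.hstack([lbp, haar, hog])      # serial fusion: one 135-D vector
reduced = pca_reduce(fused, 20)          # PCA keeps the top 20 directions
```

The reduced vectors would then be scored by an SVM for preliminary screening, with a CNN verifying the surviving candidate windows, as the abstract describes.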

20 pages, 10202 KiB  
Article
Image Distortion and Rectification Calibration Algorithms and Validation Technique for a Stereo Camera
by Jonguk Kim, Hyansu Bae and Suk Gyu Lee
Electronics 2021, 10(3), 339; https://doi.org/10.3390/electronics10030339 - 01 Feb 2021
Cited by 3 | Viewed by 3577
Abstract
This paper focuses on the calibration problem using stereo camera images. Advanced vehicle systems such as smart cars and mobile robots require accurate and reliable vision in order to detect obstacles and special markers around them. Such vehicles can be equipped with sensors and cameras together or separately. In this study, we propose new methodologies for stereo camera calibration based on distortion correction and image rectification. Once the calibration is complete, validation of the corrections is presented, followed by an evaluation of the calibration process. The validation step is usually not considered jointly with calibration in other studies, although validation techniques are widely used in the calibration procedures of camera manufacturers during mass production. Here, we aim to present a single process for the calibration and validation of stereo cameras. The experimental results show disparity maps in comparison with another study and demonstrate that the proposed calibration methods can be efficient. Full article
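The distortion-correction part of stereo calibration is commonly based on the polynomial radial model. The sketch below uses that generic model (not necessarily the paper's exact formulation): it applies radial distortion to normalized image coordinates and inverts it by fixed-point iteration.

```python
import numpy as np

def radial_distort(xy, k1, k2):
    """Apply the polynomial radial distortion model to normalized image
    coordinates: x_d = x * (1 + k1*r^2 + k2*r^4)."""
    r2 = np.sum(xy ** 2, axis=-1, keepdims=True)
    return xy * (1.0 + k1 * r2 + k2 * r2 ** 2)

def undistort(xy_d, k1, k2, iters=20):
    """Invert the distortion by fixed-point iteration: repeatedly divide
    the distorted point by the distortion factor of the current estimate."""
    xy = xy_d.copy()
    for _ in range(iters):
        r2 = np.sum(xy ** 2, axis=-1, keepdims=True)
        xy = xy_d / (1.0 + k1 * r2 + k2 * r2 ** 2)
    return xy

pts = np.array([[0.3, -0.2], [0.1, 0.4], [-0.25, -0.35]])
distorted = radial_distort(pts, k1=-0.2, k2=0.05)
recovered = undistort(distorted, k1=-0.2, k2=0.05)
```

A validation step of the kind the paper argues for could, for example, round-trip a grid of points through distortion and undistortion and check the residual error against a tolerance.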

20 pages, 10652 KiB  
Article
An Efficient Point-Matching Method Based on Multiple Geometrical Hypotheses
by Miguel Carrasco, Domingo Mery, Andrés Concha, Ramiro Velázquez, Roberto De Fazio and Paolo Visconti
Electronics 2021, 10(3), 246; https://doi.org/10.3390/electronics10030246 - 22 Jan 2021
Cited by 1 | Viewed by 2451
Abstract
Point matching in multiple images is an open problem in computer vision because of the numerous geometric transformations and photometric conditions that a pixel or point might exhibit in the set of images. Over the last two decades, different techniques have been proposed to address this problem. The most relevant are those that explore the analysis of invariant features. Nonetheless, their main limitation is that invariant analysis all alone cannot reduce false alarms. This paper introduces an efficient point-matching method for two and three views, based on the combined use of two techniques: (1) the correspondence analysis extracted from the similarity of invariant features and (2) the integration of multiple partial solutions obtained from 2D and 3D geometry. The main strength and novelty of this method is the determination of the point-to-point geometric correspondence through the intersection of multiple geometrical hypotheses weighted by the maximum likelihood estimation sample consensus (MLESAC) algorithm. The proposal not only extends the methods based on invariant descriptors but also generalizes the correspondence problem to a perspective projection model in multiple views. The developed method has been evaluated on three types of image sequences: outdoor, indoor, and industrial. Our developed strategy discards most of the wrong matches and achieves remarkable F-scores of 97%, 87%, and 97% for the outdoor, indoor, and industrial sequences, respectively. Full article

22 pages, 17886 KiB  
Article
Adaptive View Sampling for Efficient Synthesis of 3D View Using Calibrated Array Cameras
by Geonwoo Kim and Deokwoo Lee
Electronics 2021, 10(1), 82; https://doi.org/10.3390/electronics10010082 - 04 Jan 2021
Cited by 2 | Viewed by 2077
Abstract
Recovery of three-dimensional (3D) coordinates using a set of images with texture mapping to generate a 3D mesh has been of great interest in computer graphics and 3D imaging applications. This work aims to propose an approach to adaptive view selection (AVS) that determines the optimal number of images to generate the synthesis result using the 3D mesh and textures in terms of computational complexity and image quality (peak signal-to-noise ratio (PSNR)). All 25 images were acquired by a set of cameras in a 5×5 array structure, and rectification had already been performed. To generate the mesh, depth map extraction was carried out by calculating the disparity between the matched feature points. Synthesis was performed by fully exploiting the content included in the images followed by texture mapping. Both the 2D colored images and grey-scale depth images were synthesized based on the geometric relationship between the images, and to this end, three-dimensional synthesis was performed with a smaller number of images, which was less than 25. This work determines the optimal number of images that sufficiently provides a reliable 3D extended view by generating a mesh and image textures. The optimal number of images contributes to an efficient system for 3D view generation that reduces the computational complexity while preserving the quality of the result in terms of the PSNR. To substantiate the proposed approach, experimental results are provided. Full article
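The PSNR criterion used here to trade off synthesis quality against the number of sampled views can be computed with the standard definition below; the test images are illustrative, not the paper's data.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return np.inf                      # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(4)
ref = rng.integers(0, 256, (64, 64)).astype(float)
noisy = ref + rng.normal(0, 5.0, ref.shape)          # mild distortion
very_noisy = ref + rng.normal(0, 25.0, ref.shape)    # strong distortion
```

An adaptive view-selection loop of the kind described above could keep dropping input views while the PSNR of the synthesized result stays above a chosen threshold.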
