Application of Machine Vision and Deep Learning Technology

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 June 2024 | Viewed by 22401

Special Issue Editors


Guest Editor
School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an 710049, China
Interests: machine vision; optical engineering; deep learning

Guest Editor
Department of Electronic Information Engineering, Zhengzhou University, Zhengzhou 450000, China
Interests: image processing; visual inspection; photoelectric measurement

Guest Editor
School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an 710049, China
Interests: photoelectric measurement; optical information processing; optical precision instrument; digital image processing; manufacturing systems; quality engineering

Special Issue Information

Dear Colleagues,

Machine vision is a rapidly developing branch of artificial intelligence. Machine vision technology aims to use machines, rather than human eyes, to measure and assess objects. In recent years, with the continuous development of deep learning, its end-to-end learning concept and outstanding data analysis ability have helped machine vision achieve higher accuracy in image classification, target recognition, and semantic segmentation, increasing its use in security, driverless cars, smart homes, medical imaging, and other fields. This Special Issue discusses recent efforts and advances in machine vision and deep learning.

Dr. Junhui Huang
Dr. Qi Xue
Prof. Dr. Zhao Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine vision
  • deep learning
  • 3D sensing
  • optical imaging and measurement
  • super-resolution imaging
  • image processing
  • artificial intelligence and photonic neural network
  • target recognition
  • neural network and optimization
  • semantic segmentation and understanding
  • automatic optical inspection, industrial product testing, driverless cars, character recognition, tracking and positioning, etc.
  • hardware, algorithms, and techniques relating to machine vision

Published Papers (14 papers)


Research


14 pages, 2302 KiB  
Article
A Multi-Scale Attention Fusion Network for Retinal Vessel Segmentation
by Shubin Wang, Yuanyuan Chen and Zhang Yi
Appl. Sci. 2024, 14(7), 2955; https://doi.org/10.3390/app14072955 - 31 Mar 2024
Viewed by 447
Abstract
The structure and function of retinal vessels play a crucial role in diagnosing and treating various ocular and systemic diseases. Therefore, the accurate segmentation of retinal vessels is of paramount importance to assist clinical diagnosis. U-Net has been highly praised for its outstanding performance in the field of medical image segmentation. However, as network depth increases, multiple pooling operations may lead to the loss of crucial information. Additionally, the insufficient processing of local context features caused by skip connections can affect the accurate segmentation of retinal vessels. To address these problems, we proposed a novel model for retinal vessel segmentation. The proposed model is based on the U-Net architecture, with two blocks, an MsFE block and an MsAF block, added between the encoder and decoder at each layer of the U-Net backbone. The MsFE block extracts low-level features from different scales, while the MsAF block performs feature fusion across various scales. Finally, the output of the MsAF block replaces the skip connection in the U-Net backbone. Experimental evaluations on the DRIVE, CHASE_DB1, and STARE datasets demonstrated that MsAF-UNet exhibited excellent segmentation performance compared with state-of-the-art methods.
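
A minimal PyTorch sketch of multi-scale feature extraction and attention-based fusion blocks in the spirit of the MsFE/MsAF design described above; the abstract does not give the exact layer configuration, so the branch structure, attention form, and channel sizes below are assumptions.

```python
# Hypothetical sketch of MsFE-like multi-scale extraction and MsAF-like fusion;
# the paper's exact block designs are not reproduced here.
import torch
import torch.nn as nn

class MsFE(nn.Module):
    """Extract features at several scales with parallel dilated convolutions."""
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )

    def forward(self, x):
        return [torch.relu(b(x)) for b in self.branches]

class MsAF(nn.Module):
    """Fuse multi-scale features with channel attention; the output replaces the skip connection."""
    def __init__(self, channels, n_scales=3):
        super().__init__()
        self.project = nn.Conv2d(channels * n_scales, channels, 1)
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, scales):
        fused = self.project(torch.cat(scales, dim=1))
        return fused * self.attn(fused)

# Toy usage: a 64-channel encoder feature map at one U-Net level.
x = torch.randn(1, 64, 128, 128)
skip = MsAF(64)(MsFE(64)(x))
print(skip.shape)  # torch.Size([1, 64, 128, 128])
```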

23 pages, 21756 KiB  
Article
Segmenting Urban Scene Imagery in Real Time Using an Efficient UNet-like Transformer
by Haiqing Xu, Mingyang Yu, Fangliang Zhou and Hongling Yin
Appl. Sci. 2024, 14(5), 1986; https://doi.org/10.3390/app14051986 - 28 Feb 2024
Viewed by 488
Abstract
Semantic segmentation of high-resolution remote sensing urban images is widely used in many fields, such as environmental protection, urban management, and sustainable development. For many years, convolutional neural networks (CNNs) have been the prevalent method in the field, but convolution operations are deficient in modeling global information due to their local nature. In recent years, Transformer-based methods have demonstrated their advantages in many domains, such as semantic segmentation, instance segmentation, and object detection, owing to their powerful ability to model global information. Despite these advantages, Transformer-based architectures tend to incur significant computational costs, limiting their real-time application potential. To address this problem, we propose a U-shaped network with a Transformer as the decoder and a CNN as the encoder to segment remote sensing urban scene images. For efficient segmentation, we design a window-based, multi-head, focused linear self-attention (WMFSA) mechanism and further propose a global–local information modeling module (GLIM), which captures both global and local contexts through a dual-branch structure. Experiments on four challenging datasets demonstrate that our model not only achieves higher segmentation accuracy than other methods but also obtains competitive speeds, enhancing its real-time application potential. Specifically, the mIoU of our method is 68.2% and 52.8% on the UAVid and LoveDA datasets, respectively, while the speed is 114 FPS with a 1024 × 1024 input on a single 3090 GPU.
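
To make the efficiency idea concrete, here is a simplified sketch of window-based linear self-attention in the spirit of the WMFSA mechanism described above; the paper's actual focusing function, multi-head layout, and window size are not reproduced, so the positive kernel feature map and single-head structure below are assumptions.

```python
# Simplified window-based linear attention: attention is computed inside
# non-overlapping windows with linear (kernelized) attention, which is
# O(N * C^2) per window instead of O(N^2 * C).
import torch
import torch.nn as nn

class WindowLinearAttention(nn.Module):
    def __init__(self, dim, window=8):
        super().__init__()
        self.window = window
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):            # x: (B, H, W, C); H, W divisible by window
        B, H, W, C = x.shape
        w = self.window
        # Partition into non-overlapping windows -> (num_windows*B, w*w, C)
        x = x.view(B, H // w, w, W // w, w, C).permute(0, 1, 3, 2, 4, 5)
        x = x.reshape(-1, w * w, C)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k = torch.relu(q) + 1e-6, torch.relu(k) + 1e-6   # positive kernel feature map
        kv = torch.einsum("bnc,bnd->bcd", k, v)             # aggregate keys and values
        z = 1.0 / torch.einsum("bnc,bc->bn", q, k.sum(dim=1))
        out = torch.einsum("bnc,bcd,bn->bnd", q, kv, z)
        out = self.proj(out)
        # Reverse the window partition
        out = out.view(B, H // w, W // w, w, w, C).permute(0, 1, 3, 2, 4, 5)
        return out.reshape(B, H, W, C)

feats = torch.randn(1, 64, 64, 96)
print(WindowLinearAttention(96)(feats).shape)  # torch.Size([1, 64, 64, 96])
```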

12 pages, 22729 KiB  
Article
nmODE-Unet: A Novel Network for Semantic Segmentation of Medical Images
by Shubin Wang, Yuanyuan Chen and Zhang Yi
Appl. Sci. 2024, 14(1), 411; https://doi.org/10.3390/app14010411 - 02 Jan 2024
Viewed by 880
Abstract
Diabetic retinopathy is a prevalent eye disease that poses a potential risk of blindness. Nevertheless, due to the small size of diabetic retinopathy lesions and the high inter-class similarity in location, color, and shape among different lesions, the segmentation task is highly challenging. To address these issues, we proposed a novel framework named nmODE-Unet, which is based on the nmODE (neural memory Ordinary Differential Equation) block and the U-net backbone. In nmODE-Unet, shallow features serve as input to the nmODE block, and the output of the nmODE block is fused with the corresponding deep features. Extensive experiments were conducted on the IDRiD, e_ophtha, and LGG segmentation datasets, and the results demonstrate that, in comparison to other competing models, nmODE-Unet achieves superior performance.
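
A minimal sketch of an nmODE-style block integrated with fixed-step Euler; the dynamics below (dy/dt = -y + sin²(y + γ(x))) follow the published nmODE formulation as commonly stated, and the learnable input mapping, step count, and additive fusion with deep features are assumptions rather than the paper's exact design.

```python
# Toy nmODE-style block: the memory state y is driven by an external input
# gamma(x) derived from shallow features and integrated with explicit Euler.
import torch
import torch.nn as nn

class nmODEBlock(nn.Module):
    def __init__(self, channels, steps=10, dt=0.1):
        super().__init__()
        self.gamma = nn.Conv2d(channels, channels, 3, padding=1)  # learnable input mapping (assumed)
        self.steps, self.dt = steps, dt

    def forward(self, x):
        g = self.gamma(x)                  # external input to the memory neurons
        y = torch.zeros_like(g)            # memory state starts at zero
        for _ in range(self.steps):        # explicit Euler integration of the ODE
            y = y + self.dt * (-y + torch.sin(y + g) ** 2)
        return y

shallow = torch.randn(1, 32, 64, 64)       # shallow encoder features
deep = torch.randn(1, 32, 64, 64)          # corresponding deep decoder features
fused = deep + nmODEBlock(32)(shallow)     # fusion by addition (assumed)
print(fused.shape)
```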

12 pages, 2615 KiB  
Article
Automatic Recognition of Blood Cell Images with Dense Distributions Based on a Faster Region-Based Convolutional Neural Network
by Yun Liu, Yumeng Liu, Menglu Chen, Haoxing Xue, Xiaoqiang Wu, Linqi Shui, Junhong Xing, Xian Wang, Hequn Li and Mingxing Jiao
Appl. Sci. 2023, 13(22), 12412; https://doi.org/10.3390/app132212412 - 16 Nov 2023
Viewed by 629
Abstract
In modern clinical medicine, important information about red blood cells, such as their shape and number, is used to detect blood diseases. However, the automatic recognition of single cells and adherent cells in densely distributed medical scenes remains difficult, both for traditional detection algorithms with lower recognition rates and for conventional networks with weaker feature extraction capabilities. In this paper, an automatic recognition method for adherent blood cells with dense distribution is proposed. Based on Faster R-CNN, the balanced feature pyramid structure, deformable convolution network, and efficient pyramid split attention mechanism are adopted to automatically recognize blood cells under conditions of dense distribution, extrusion deformation, adhesion, and overlap. In addition, the RoI Align algorithm for regions of interest also contributes to improving the accuracy of the recognition results. The experimental results show that the mean average precision of cell detection is 0.895, which is 24.5% higher than that of the original network model. Compared with one-stage mainstream networks, the presented network has a stronger feature extraction capability. The proposed method is suitable for identifying single cells and adherent cells with dense distribution in actual medical scenes.
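
For illustration, the following is a minimal sketch of the RoI Align pooling step used in such two-stage detectors, via torchvision; the feature stride, box coordinates, and output size are toy values, not the paper's settings.

```python
# RoI Align pools a fixed-size feature tensor for each candidate cell box,
# avoiding the quantization error of RoI Pooling.
import torch
from torchvision.ops import roi_align

features = torch.randn(1, 256, 50, 50)                 # backbone feature map (stride 16 assumed)
boxes = torch.tensor([[0., 120., 140., 260., 300.],    # [batch_idx, x1, y1, x2, y2] in image coords
                      [0., 300., 310., 420., 450.]])
pooled = roi_align(features, boxes, output_size=(7, 7),
                   spatial_scale=1.0 / 16, sampling_ratio=2, aligned=True)
print(pooled.shape)   # torch.Size([2, 256, 7, 7]) -> fed to the detection head
```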

16 pages, 3003 KiB  
Article
A Visual-Based Approach for Driver’s Environment Perception and Quantification in Different Weather Conditions
by Longxi Luo, Minghao Liu, Jiahao Mei, Yu Chen and Luzheng Bi
Appl. Sci. 2023, 13(22), 12176; https://doi.org/10.3390/app132212176 - 09 Nov 2023
Cited by 1 | Viewed by 773
Abstract
The decision-making behavior of drivers during the driving process is influenced by various factors, including road conditions, traffic situations, and weather conditions. However, our understanding and quantification of the driving environment are still very limited, which not only increases driving risk but also hinders the deployment of autonomous vehicles. To address this issue, this study attempts to transform drivers' visual perception into machine vision perception. Specifically, the study provides a detailed decomposition of the elements constituting weather and proposes three environmental quantification indicators: visibility brightness, visibility clarity, and visibility obstruction rate. These indicators help to describe and quantify the driving environment more accurately. Based on these indicators, a visual-based environmental quantification method is further proposed to better understand and interpret the driving environment. Additionally, based on drivers' visual perception, this study extensively analyzes the impact of environmental factors on driver behavior. A cognitive assessment model is established to evaluate drivers' cognitive abilities in different environments. The effectiveness and accuracy of the model are validated through driving simulation experiments, thereby establishing a communication bridge between the driving environment and driver behavior. These results enable us to better understand the decision-making behavior of drivers in specific environments and provide a reference for the development of intelligent driving technology.
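
An illustrative sketch of how the three quantification indicators could be computed from a dashcam frame; the paper's exact definitions are not given in the abstract, so brightness, clarity, and obstruction rate are approximated here with common image statistics (mean luminance, Laplacian variance, and a dark-pixel fraction), which are assumptions.

```python
# Assumed formulations of the three visibility indicators, for illustration only.
import cv2
import numpy as np

def visibility_indicators(image_bgr, obstruction_thresh=40):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    brightness = float(gray.mean()) / 255.0                 # mean luminance, normalised
    clarity = float(cv2.Laplacian(gray, cv2.CV_64F).var())  # sharpness proxy
    # Fraction of the scene too dark/low-contrast to resolve (assumed definition)
    obstruction_rate = float((gray < obstruction_thresh).mean())
    return brightness, clarity, obstruction_rate

frame = (np.random.rand(480, 640, 3) * 255).astype(np.uint8)  # stand-in for a dashcam frame
print(visibility_indicators(frame))
```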

19 pages, 2158 KiB  
Article
U2-Net: A Very-Deep Convolutional Neural Network for Detecting Distracted Drivers
by Nawaf O. Alsrehin, Mohit Gupta, Izzat Alsmadi and Saif Addeen Alrababah
Appl. Sci. 2023, 13(21), 11898; https://doi.org/10.3390/app132111898 - 31 Oct 2023
Cited by 1 | Viewed by 1186
Abstract
In recent years, the number of deaths and injuries resulting from traffic accidents has been increasing dramatically all over the world due to distracted drivers. Thus, a key element in developing intelligent vehicles and safe roads is monitoring driver behaviors. In this paper, we modify and extend the U-net convolutional neural network so that it provides deeper layers to represent image features and yields more precise classification results. This extension forms the basis of a very deep convolutional neural network, called U2-net, for detecting distracted drivers. The U2-net model has two paths (contracting and expanding) in addition to a fully connected dense layer. The contracting path extracts the context around objects to provide better object representation, while the symmetric expanding path enables precise localization. The motivation behind this model is that precise object features yield better object representation and classification. We used two public datasets, MI-AUC and State Farm, to evaluate U2-net in detecting distracted driving. The accuracy of U2-net on MI-AUC and State Farm is 98.34% and 99.64%, respectively. These evaluation results show higher accuracy than that achieved by many other state-of-the-art methods.
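
A heavily reduced sketch of a U-shaped classifier of the kind described above: a contracting path, an expanding path with a skip connection, and a dense classification head (e.g., the ten State Farm driver-behavior classes). The depth and layer sizes are illustrative, not the paper's.

```python
# Toy U-shaped encoder/decoder followed by a fully connected classification head.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU())

class TinyUClassifier(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.enc1, self.enc2 = conv_block(3, 32), conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)       # takes upsampled + skip features
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(32, n_classes))

    def forward(self, x):
        e1 = self.enc1(x)                    # contracting path
        e2 = self.enc2(self.pool(e1))
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))  # expanding path with skip
        return self.head(d1)                 # logits over driver-behavior classes

logits = TinyUClassifier()(torch.randn(2, 3, 224, 224))
print(logits.shape)   # torch.Size([2, 10])
```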

17 pages, 4439 KiB  
Article
Exploring the ViDiDetect Tool for Automated Defect Detection in Manufacturing with Machine Vision
by Mateusz Dziubek, Jacek Rysiński and Daniel Jancarczyk
Appl. Sci. 2023, 13(19), 11098; https://doi.org/10.3390/app131911098 - 09 Oct 2023
Viewed by 1119
Abstract
Automated monitoring of cutting tool wear is of paramount importance in the manufacturing industry, as it directly impacts production efficiency and product quality. Traditional manual inspection methods are time-consuming and prone to human error, necessitating the adoption of more advanced techniques. This study explores the application of ViDiDetect, a deep learning-based defect detection solution, in the context of machine vision for assessing cutting tool wear. By capturing high-resolution images of machining tools and analyzing wear patterns, machine vision systems offer a non-contact and non-destructive approach to tool wear assessment, enabling continuous monitoring without disrupting the machining process. In this research, a smart camera and an illuminator were used to capture images of a car suspension knuckle's machined surface, with a focus on detecting burrs, chips, and tool wear. The study also employed a mask to narrow the region of interest and enhance classification accuracy. This investigation demonstrates the potential of machine vision and ViDiDetect in automating cutting tool wear assessment, ultimately enhancing the efficiency of manufacturing processes and product quality. The project is at the implementation stage in an automotive production plant in southern Poland.

15 pages, 1482 KiB  
Article
LightSeg: Local Spatial Perception Convolution for Real-Time Semantic Segmentation
by Xiaochun Lei, Jiaming Liang, Zhaoting Gong and Zetao Jiang
Appl. Sci. 2023, 13(14), 8130; https://doi.org/10.3390/app13148130 - 12 Jul 2023
Cited by 1 | Viewed by 1018
Abstract
Semantic segmentation is increasingly being applied on mobile devices due to advancements in mobile chipsets, particularly in low-power scenarios. However, the lightweight design of mobile devices imposes limitations on the receptive field, which is crucial for dense prediction problems. Existing approaches have attempted to balance lightweight design and high accuracy by downsampling features in the backbone. However, this downsampling may result in the loss of local detail at each network stage. To address this challenge, this paper presents LightSeg, a compact and efficient convolutional neural network (CNN) for real-time applications, built on our proposed local spatial perception convolution (LSPConv). The effectiveness of our architecture is demonstrated on the Cityscapes dataset. The results show that our model achieves an impressive balance between accuracy and inference speed. Specifically, LightSeg, which does not rely on ImageNet pretraining, achieves an mIoU of 76.1 at a speed of 61 FPS on the Cityscapes validation set, utilizing an RTX 2080Ti GPU with mixed precision. Additionally, it achieves a speed of 115.7 FPS on the Jetson NX with int8 precision.
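
The exact structure of LSPConv is not given in the abstract; the sketch below is a generic lightweight block in the same spirit (depthwise convolutions at several dilation rates to widen the receptive field cheaply), offered as an assumption rather than the paper's design.

```python
# Generic lightweight large-receptive-field block: parallel depthwise dilated
# convolutions summed with a residual path and mixed by a pointwise convolution.
import torch
import torch.nn as nn

class LightweightLocalPerception(nn.Module):
    def __init__(self, channels, dilations=(1, 2, 3)):
        super().__init__()
        self.dw = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d, groups=channels)
            for d in dilations                      # depthwise branches: cheap, wider context
        )
        self.pw = nn.Conv2d(channels, channels, 1)  # pointwise channel mixing

    def forward(self, x):
        return torch.relu(self.pw(sum(dw(x) for dw in self.dw) + x))

x = torch.randn(1, 48, 128, 256)
print(LightweightLocalPerception(48)(x).shape)  # torch.Size([1, 48, 128, 256])
```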

17 pages, 5162 KiB  
Article
Semantic Segmentation of Packaged and Unpackaged Fresh-Cut Apples Using Deep Learning
by Udith Krishnan Vadakkum Vadukkal, Michela Palumbo and Giovanni Attolico
Appl. Sci. 2023, 13(12), 6969; https://doi.org/10.3390/app13126969 - 09 Jun 2023
Cited by 2 | Viewed by 1550
Abstract
Computer vision systems are often used in industrial quality control to offer fast, objective, non-destructive, and contactless evaluation of fruit. The senescence of fresh-cut apples is strongly related to the browning of the pulp rather than to the properties of the peel. This work addresses the identification and selection of pulp in images of fresh-cut apples, both packaged and unpackaged; this is a critical step towards a computer vision system able to evaluate their quality and internal properties. A DeepLabV3+-based convolutional neural network (CNN) model has been developed for this semantic segmentation task. It has proved to be robust with respect to the similarity of colours between the peel and pulp. Its ability to separate the pulp from the peel and background has been verified on four varieties of apples: Granny Smith (greenish peel), Golden (yellowish peel), Fuji, and Pink Lady (reddish peel). The semantic segmentation achieved an accuracy greater than 99% on all these varieties. The developed approach was able to isolate regions significantly affected by the browning process on both packaged and unpackaged pieces; colour analysis of these regions will then be used to evaluate the internal quality and senescence of packaged and unpackaged products.
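
One way to set up a DeepLabV3+ model for the segmentation task the abstract implies, here via the segmentation_models_pytorch package; the backbone choice and the three-class label set (pulp, peel, background) are assumptions, and the paper's actual training setup may differ.

```python
# Minimal DeepLabV3+ setup sketch for pulp/peel/background segmentation.
import torch
import segmentation_models_pytorch as smp

model = smp.DeepLabV3Plus(
    encoder_name="resnet34",        # assumed backbone
    encoder_weights="imagenet",
    in_channels=3,
    classes=3,                      # pulp, peel, background (assumed label set)
)
x = torch.randn(2, 3, 512, 512)     # batch of fresh-cut apple images
masks = model(x).argmax(dim=1)      # per-pixel class prediction
print(masks.shape)                  # torch.Size([2, 512, 512])
```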

20 pages, 6660 KiB  
Article
Real-Time Defect Detection for Metal Components: A Fusion of Enhanced Canny–Devernay and YOLOv6 Algorithms
by Hongjun Wang, Xiujin Xu, Yuping Liu, Deda Lu, Bingqiang Liang and Yunchao Tang
Appl. Sci. 2023, 13(12), 6898; https://doi.org/10.3390/app13126898 - 07 Jun 2023
Cited by 2 | Viewed by 1941
Abstract
Due to the presence of numerous surface defects, the inadequate contrast between defective and non-defective regions, and the resemblance between noise and subtle defects, edge detection poses a significant challenge in dimensional error detection, leading to increased dimensional measurement inaccuracies. These issues are major bottlenecks in the automatic detection of high-precision metal parts. To address these challenges, this research proposes a combined approach that uses the YOLOv6 deep learning network for the rapid and accurate detection of surface flaws on metal lock body parts, together with an enhanced Canny–Devernay sub-pixel edge detection algorithm to determine the size of the lock core bead hole. The methodology is as follows: the dataset for surface defect detection is annotated using the labeling software labelImg and subsequently used to train the YOLOv6 model and obtain the model weights. For size measurement, the region of interest (ROI) corresponding to the lock cylinder bead hole is first extracted. Subsequently, Gaussian filtering is applied to the ROI, followed by sub-pixel edge detection using the improved Canny–Devernay algorithm. Finally, the edges are fitted using the least squares method to determine the radius of the fitted circle, and the measured value is obtained through size conversion. In the experiments, the YOLOv6 method identifies surface defects in the lock body workpiece with a mean Average Precision (mAP) of 0.911, and the size of the lock core bead hole is measured using the upgraded Canny–Devernay sub-pixel edge detection technique with an average error of less than 0.03 mm. The findings demonstrate a practical method for applying machine vision to the automatic detection of metal parts, developed by exploring identification methods and size measurement techniques for common defects found in metal parts. Consequently, the study establishes a valuable framework for effectively utilizing machine vision in metal parts inspection and defect detection.
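
The final measurement step fits a circle to the sub-pixel edge points by least squares; a minimal sketch of the standard algebraic (Kåsa) fit is shown below, with a made-up pixel-to-millimetre scale factor standing in for the system calibration.

```python
# Least-squares circle fit: solve x^2 + y^2 = 2*cx*x + 2*cy*y + c linearly,
# then recover the radius from r^2 = c + cx^2 + cy^2.
import numpy as np

def fit_circle_least_squares(pts):
    """pts: (N, 2) array of edge points; returns centre (cx, cy) and radius."""
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x ** 2 + y ** 2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    r = np.sqrt(c + cx ** 2 + cy ** 2)
    return cx, cy, r

# Synthetic edge points on a circle of radius 85 px with sub-pixel noise.
theta = np.linspace(0, 2 * np.pi, 200)
pts = np.column_stack([120 + 85 * np.cos(theta), 140 + 85 * np.sin(theta)])
pts += np.random.normal(0, 0.05, pts.shape)

cx, cy, r_px = fit_circle_least_squares(pts)
mm_per_px = 0.02                       # hypothetical calibration factor
print(f"bead-hole radius ~ {r_px * mm_per_px:.3f} mm")
```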

14 pages, 4371 KiB  
Article
Lossless Compression of Large Aperture Static Imaging Spectrometer Data
by Lu Yu, Hongbo Li, Jing Li and Wei Li
Appl. Sci. 2023, 13(9), 5632; https://doi.org/10.3390/app13095632 - 03 May 2023
Viewed by 941
Abstract
The large-aperture static imaging spectrometer (LASIS) is an interference spectrometer with high device stability, high throughput, a wide spectral range, and high spectral resolution. One frame of the original data cube acquired by the LASIS shows the image superimposed with interference fringes, which is distinctly different from traditional hyperspectral images. For compression of this new type of data, a lossless compression scheme is presented that combines a novel data rearrangement method with the lossless multispectral and hyperspectral image compression standard CCSDS-123. In the rearrangement approach, the LASIS data cube is rearranged so that the interference information overlapped on the image can be separated, and the results are then processed using the CCSDS-123 standard. Several experiments are conducted to investigate the performance of the rearrangement method and examine the impact of different CCSDS-123 parameter settings for LASIS data. The experimental results indicate that the proposed scheme provides a 32.9% higher compression ratio than traditional rearrangement methods. Moreover, an adequate parameter combination for this compression scheme for LASIS is presented, yielding a 19.6% improvement over the default settings suggested by the standard.
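
A toy illustration of the kind of rearrangement such a scheme relies on: in a push-broom LASIS the scene advances across the fringe pattern frame by frame, so the interferogram of a ground line is scattered along a diagonal of the (frame, row) plane, and collecting those diagonals separates the interference axis from the spatial image before encoding. The one-row-per-frame geometry below is an assumption for illustration, not the paper's rearrangement.

```python
# Assumed push-broom geometry: ground line i appears in frame i+opd at row opd.
import numpy as np

def rearrange_lasis(cube):
    """cube: (n_frames, n_rows, n_cols) raw frames -> (n_lines, n_rows, n_cols),
    where axis 1 of the output is the optical-path-difference (OPD) axis."""
    n_frames, n_rows, n_cols = cube.shape
    n_lines = n_frames - n_rows + 1          # ground lines fully sampled in OPD
    out = np.empty((n_lines, n_rows, n_cols), dtype=cube.dtype)
    for opd in range(n_rows):
        out[:, opd, :] = cube[opd:opd + n_lines, opd, :]
    return out

raw = np.random.randint(0, 4096, size=(300, 64, 256), dtype=np.uint16)
print(rearrange_lasis(raw).shape)            # (237, 64, 256)
```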

18 pages, 13737 KiB  
Article
TSDNet: A New Multiscale Texture Surface Defect Detection Model
by Min Dong, Dezhen Li, Kaixiang Li and Junpeng Xu
Appl. Sci. 2023, 13(5), 3289; https://doi.org/10.3390/app13053289 - 04 Mar 2023
Viewed by 1635
Abstract
Industrial defect detection methods based on deep learning can reduce the cost of traditional manual quality inspection, improve the accuracy and efficiency of detection, and are widely used in industrial fields. Traditional computer vision defect detection methods rely on hand-crafted features and require a large amount of defect data, which limits their applicability. This paper proposes TSDNet, a texture surface defect detection method based on convolutional neural networks and wavelet analysis. The approach combines wavelet analysis with patch extraction, which can detect and locate many defects against complex texture backgrounds; a patch extraction method based on random windows is proposed, which can quickly and effectively extract defective patches; and a judgment strategy based on a sliding window is proposed to improve the robustness of the CNN (see the sketch below). Our method achieves excellent detection accuracy on DAGM 2007, a micro-surface defect database and KolektorSDD dataset, and can locate defects accurately. The results show that, against complex texture backgrounds, the method obtains high defect detection accuracy with only a small amount of training data and can accurately locate defect positions.
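
A compact sketch of the two sampling strategies mentioned above: random windows to harvest training patches, and a sliding window whose per-patch scores are aggregated at test time. The patch size, stride, and max-score voting rule are illustrative choices, not the paper's exact settings.

```python
# Random-window patch extraction for training and sliding-window scoring for inference.
import numpy as np

def random_patches(image, n_patches=16, size=64, rng=None):
    rng = rng or np.random.default_rng()
    H, W = image.shape[:2]
    for _ in range(n_patches):
        y, x = rng.integers(0, H - size), rng.integers(0, W - size)
        yield image[y:y + size, x:x + size]

def sliding_window_defect_score(image, score_fn, size=64, stride=32):
    """Aggregate per-patch defect scores; flag a defect if any window is confident."""
    H, W = image.shape[:2]
    scores = [score_fn(image[y:y + size, x:x + size])
              for y in range(0, H - size + 1, stride)
              for x in range(0, W - size + 1, stride)]
    return max(scores)

texture = np.random.rand(512, 512).astype(np.float32)
dummy_cnn = lambda patch: float(patch.mean())        # stand-in for the trained CNN
print(len(list(random_patches(texture))), sliding_window_defect_score(texture, dummy_cnn))
```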

20 pages, 8014 KiB  
Article
Research on Automatic Error Data Recognition Method for Structured Light System Based on Residual Neural Network
by Aozhuo Ding, Qi Xue, Xulong Ding, Xiaohong Sun, Xiaonan Yang and Huiying Ye
Appl. Sci. 2023, 13(5), 2920; https://doi.org/10.3390/app13052920 - 24 Feb 2023
Viewed by 1122
Abstract
In a structured light system, the positioning accuracy of the stripe is one of the determinants of measurement accuracy. However, the quality of the structured light stripe is reduced by noise, object shape, color, and other factors. The positioning accuracy of low-quality stripe centers decreases, and large errors are introduced into the measurement results, which previously could only be recognized by a human. To address this problem, this paper proposes a method to identify data with relatively large errors in 3D measurement results by evaluating the quality of the grayscale distribution of the stripes. In this method, undegraded and degraded stripe images are captured. Then, a residual neural network is trained on the grayscale distributions of the two types of stripes. The captured stripes are classified by the trained model. Finally, the data corresponding to degraded stripes, and thus to large errors, can be identified according to the classification results. The experiments show that the algorithm proposed in this paper can effectively and automatically identify data with large errors.
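
A minimal sketch of the classification step: a small residual network that labels the grey-level distribution of a stripe cross-section as undegraded or degraded, so that measurements from degraded stripes can be flagged. The 1-D profile input, profile length, and network depth are assumptions for illustration, not the paper's configuration.

```python
# Tiny 1-D residual classifier over stripe grey-level cross-sections.
import torch
import torch.nn as nn

class ResBlock1d(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(nn.Conv1d(ch, ch, 3, padding=1), nn.ReLU(),
                                  nn.Conv1d(ch, ch, 3, padding=1))

    def forward(self, x):
        return torch.relu(x + self.body(x))   # residual connection

classifier = nn.Sequential(
    nn.Conv1d(1, 16, 3, padding=1), nn.ReLU(),
    ResBlock1d(16), ResBlock1d(16),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(16, 2),                      # undegraded vs. degraded stripe
)
profiles = torch.randn(8, 1, 64)           # batch of grey-level cross-sections
print(classifier(profiles).shape)          # torch.Size([8, 2])
```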

Review


17 pages, 1044 KiB  
Review
Quick Overview of Face Swap Deep Fakes
by Tomasz Walczyna and Zbigniew Piotrowski
Appl. Sci. 2023, 13(11), 6711; https://doi.org/10.3390/app13116711 - 31 May 2023
Cited by 2 | Viewed by 7128
Abstract
Deep fake generation and detection technologies have developed rapidly in recent years, with researchers in the two fields constantly trying to outpace each other's achievements. These works use, among other methods, autoencoders and generative adversarial networks to create fake content that resists detection by algorithms or the human eye. Among the ever-increasing number of emerging works, a few can be singled out that contribute significantly to the field through their solutions and detection robustness. Despite the advancement of generative algorithms, much in the field is still open for further research. This paper briefly introduces the fundamentals of some of the latest Face Swap Deep Fake algorithms.
