Perceptual Deep Learning in Image Processing and Computer Vision

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Internet of Things".

Deadline for manuscript submissions: closed (28 February 2021) | Viewed by 43661

Special Issue Editors


Guest Editor
Department of Computer Science and Information Engineering, National Ilan University, Yilan City 260, Taiwan
Interests: DSP IC design; computer vision; image processing; cognitive learning

Guest Editor
Department of Information Engineering, University of Padua, Via Gradenigo 6, 35131 Padova, Italy
Interests: deep learning (ensembles of deep learners, transfer learning); computer vision (general-purpose image classifiers, medical image classification, texture descriptors); biometrics systems (fingerprint classification and recognition, signature verification, face recognition)

Guest Editor
Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei 106335, Taiwan
Interests: image processing; computer vision; biometrics

Special Issue Information

Dear Colleagues,

AI has garnered significant interest among academic researchers and industrial practitioners for its recent technological breakthroughs in the form of deep learning architectures. Consequently, research directions in image processing and computer vision are shifting from statistical methods to deep neural network algorithms. The goal of this Special Issue is to invite frontier research that tackles important and challenging issues in deep learning for image processing, computer vision, and perceptual understanding. Both theoretical studies and practical applications are welcome for submission. Topics for this Special Issue include, but are not restricted to, the following fields:

  1. Deep learning algorithms:
    Image/video understanding and recognition;
    Biometrics and spoof detection;
    Object detection and real-time tracking;
    Human perceptual cognition and preference;
  2. Deep learning architectural implementations:
    AI computing and hardware design;
    Self-adapting software engineering;
  3. Deep learning applications:
    Medical imaging and healthcare;
    Environmental and earth science;
    Autonomous manufacturing and smart factory;
    Smart city;
    Audio classification;
    Computational ecology.

Prof. Dr. Chih-Hsien Hsia
Prof. Dr. Loris Nanni
Prof. Dr. Jing-Ming Guo
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning algorithms
  • deep learning architectural implementations
  • deep learning applications

Published Papers (12 papers)

Research

15 pages, 3457 KiB  
Article
Design and Implementation of an Atmospheric Anion Monitoring System Based on Beidou Positioning
by Jinhu Wang, Binze Xie, Jiahan Cai, Yuhao Wang, Jiang Chen and Muhammad Ilyas Abro
Sensors 2021, 21(18), 6174; https://doi.org/10.3390/s21186174 - 15 Sep 2021
Cited by 2 | Viewed by 2074
Abstract
Atmospheric oxygen anions play an important role in medical health, clinical medicine, environmental health, and the ecological environment. Therefore, the concentration of atmospheric anions is an important index for measuring air quality. This paper proposes a monitoring system for atmospheric oxygen anions based on Beidou positioning and unmanned vehicles. This approach combines Beidou positioning technology, 4G pass-through, the unmanned capacitance suction method, electromagnetic field theory, and atmospheric detection technology. The proposed instrument can monitor the overall negative oxygen ion concentration, temperature, and humidity in a certain region over time and provide data visualization for the concentration of negative oxygen ions.
(This article belongs to the Special Issue Perceptual Deep Learning in Image Processing and Computer Vision)

20 pages, 3023 KiB  
Article
A Classification and Prediction Hybrid Model Construction with the IQPSO-SVM Algorithm for Atrial Fibrillation Arrhythmia
by Liang-Hung Wang, Ze-Hong Yan, Yi-Ting Yang, Jun-Ying Chen, Tao Yang, I-Chun Kuo, Patricia Angela R. Abu, Pao-Cheng Huang, Chiung-An Chen and Shih-Lun Chen
Sensors 2021, 21(15), 5222; https://doi.org/10.3390/s21155222 - 01 Aug 2021
Cited by 11 | Viewed by 2661
Abstract
Atrial fibrillation (AF) is the most common cardiovascular disease (CVD), and most existing algorithms are designed for either the diagnosis (i.e., feature classification) or the prediction of AF. Artificial intelligence (AI) algorithms can integrate the diagnosis of the AF electrocardiogram (ECG) with the prediction of whether AF will occur in the future. In this paper, we utilized the MIT-BIH AF Database (AFDB), which is composed of data from normal people and patients with AF and onset characteristics, and the AFPDB database (i.e., PAF Prediction Challenge Database), which consists of data from patients with paroxysmal AF (PAF; the records contain the ECG preceding an episode of PAF) and subjects who do not have documented AF. We extracted the respective characteristics of the databases and used them in modeling diagnosis and prediction. In terms of model construction, we regarded diagnosis and prediction as two classification problems, adopted the traditional support vector machine (SVM) algorithm, and combined the two models. The improved quantum particle swarm optimization support vector machine (IQPSO-SVM) algorithm was used to speed up training. During verification, the clinical FZU-FPH database created by Fuzhou University and Fujian Provincial Hospital was used for hybrid model testing. The data were obtained from the hospital's Holter monitors and encrypted. We also propose an algorithm for transforming the PDF ECG waveform images of hospital examination reports into digital data. For the diagnosis and prediction models trained on the training sets of the AFDB and AFPDB databases, the sensitivity, specificity, and accuracy were 99.2% and 99.2%, 99.2% and 93.3%, and 91.7% and 92.5% on the respective test sets. Moreover, the sensitivity, specificity, and accuracy were 94.2%, 79.7%, and 87.0%, respectively, when tested on the FZU-FPH database with 138 ECG samples and two labels. The composite classification and prediction model using a new waterfall ensemble method had a total accuracy of approximately 91% on the FZU-FPH test set of 80 samples with 120 ECG segments and three labels.
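The swarm-optimization idea behind IQPSO can be illustrated with a plain (non-quantum) particle swarm searching a one-dimensional hyperparameter space. This is a generic sketch, not the paper's IQPSO variant; the objective below is a stand-in for a cross-validation error, and all names and constants are illustrative.

```python
import random

def pso_minimize(f, lo, hi, n_particles=20, iters=60, seed=0):
    """Minimize f on [lo, hi] with a basic particle swarm.

    Each particle remembers its personal best, and the swarm shares
    a global best that pulls particles toward promising regions.
    """
    rng = random.Random(seed)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = list(pos)
    pbest_val = [f(x) for x in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g], pbest_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5        # inertia / personal / social weights
    for _ in range(iters):
        for i in range(n_particles):
            vel[i] = (w * vel[i]
                      + c1 * rng.random() * (pbest[i] - pos[i])
                      + c2 * rng.random() * (gbest - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))   # clamp to bounds
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i], val
    return gbest, gbest_val

# Stand-in objective for an SVM's cross-validation error as a function
# of one hyperparameter (e.g., log10 of the penalty C); a real run
# would train and score an SVM here instead.
best_c, best_err = pso_minimize(lambda x: (x - 1.2) ** 2 + 0.05, -3.0, 3.0)
```

The quantum-behaved variant changes how particle positions are sampled around the attractors, but the personal-best/global-best bookkeeping is the same.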

19 pages, 4042 KiB  
Article
Caries and Restoration Detection Using Bitewing Film Based on Transfer Learning with CNNs
by Yi-Cheng Mao, Tsung-Yi Chen, He-Sheng Chou, Szu-Yin Lin, Sheng-Yu Liu, Yu-An Chen, Yu-Lin Liu, Chiung-An Chen, Yen-Cheng Huang, Shih-Lun Chen, Chun-Wei Li, Patricia Angela R. Abu and Wei-Yuan Chiang
Sensors 2021, 21(13), 4613; https://doi.org/10.3390/s21134613 - 05 Jul 2021
Cited by 30 | Viewed by 4961
Abstract
Caries is a dental disease caused by bacterial infection. If the cause of the caries is detected early, the treatment will be relatively easy, which in turn prevents the caries from spreading. The current common procedure is for dentists to first perform a radiographic examination on the patient and mark the lesions manually. However, judging and marking lesions requires professional experience and is very time-consuming and repetitive. Taking advantage of the rapid development of artificial intelligence imaging research and technical methods will help dentists make accurate markings, improve medical treatment, and shorten the judgment time of professionals. In addition to using Gaussian high-pass filtering and Otsu's threshold for image enhancement, this research addresses the failure of the original cropping technique to extract certain individual teeth, and it proposes a caries and lesion area analysis model based on convolutional neural networks (CNNs), which can identify caries and restorations in bitewing images. Moreover, it provides dentists with more accurate, objective judgment data, supporting automatic diagnosis and treatment planning as a technology for assisting precision medicine. A standardized database established through a defined set of steps is also proposed in this study. Three main steps generate the image of a single tooth from a bitewing image, which increases the accuracy of the analysis model: (1) preprocessing of the dental image to obtain a high-quality binarization, (2) a dental image cropping procedure to obtain individually separated tooth samples, and (3) a dental image masking step which masks fine broken teeth in the sample and enhances the quality of the training data. Among four common neural networks, namely AlexNet, GoogleNet, VGG19, and ResNet50, experimental results show that the proposed AlexNet model achieves an accuracy as high as 95.56% for restoration judgments and 90.30% for caries judgments. These promising results point to the possibility of an automatic judgment method for bitewing films.
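The binarization step above relies on Otsu's method, which can be sketched from scratch. The preceding Gaussian high-pass filtering is omitted, and the histogram search below is a generic illustration rather than the authors' exact implementation.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level maximizing between-class variance (Otsu)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w_bg = 0          # background pixel count so far
    sum_bg = 0        # background intensity sum so far
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # weighted between-class variance; its maximizer is the threshold
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Two well-separated intensity clusters: the threshold lands between them.
dark, bright = [10, 12, 14, 11], [200, 205, 210, 198]
t = otsu_threshold(dark + bright)
```

Pixels at or below the returned level are treated as one class (e.g., background) when producing the binary image.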

25 pages, 2837 KiB  
Article
Ball-Catching System Using Image Processing and an Omni-Directional Wheeled Mobile Robot
by Sho-Tsung Kao and Ming-Tzu Ho
Sensors 2021, 21(9), 3208; https://doi.org/10.3390/s21093208 - 05 May 2021
Cited by 12 | Viewed by 3894
Abstract
The ball-catching system examined in this research, composed of an omni-directional wheeled mobile robot and an image processing system that included a dynamic stereo vision camera and a static camera, was used to capture a thrown ball. The thrown ball was tracked by the dynamic stereo vision camera, and the omni-directional wheeled mobile robot was navigated using the static camera. A Kalman filter combined with deep learning was used to decrease the visual measurement noise and to estimate the ball's position and velocity. The ball's future trajectory and landing point were predicted from the estimated position and velocity. Feedback linearization was used to linearize the omni-directional wheeled mobile robot model and was then combined with a proportional-integral-derivative (PID) controller. The visual tracking algorithm was first simulated numerically, and then the performance of the designed system was verified experimentally. We verified that the designed system was able to precisely catch a thrown ball.
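The filter-then-extrapolate step can be illustrated with a minimal one-dimensional constant-velocity Kalman filter. The paper's filter is augmented with deep learning and operates on 3-D stereo measurements; this sketch only shows the core smoothing and prediction idea, and all parameters are illustrative.

```python
def kalman_cv(zs, dt, q=1e-3, r=0.25):
    """Constant-velocity Kalman filter over 1-D position readings.

    Returns the final filtered position and velocity, from which a
    future position (e.g., a landing point coordinate) can be
    extrapolated.
    """
    x, v = zs[0], 0.0                    # state: position, velocity
    p00, p01, p11 = 1.0, 0.0, 1.0        # symmetric state covariance
    for z in zs[1:]:
        # predict: x <- x + v*dt, P <- F P F^T + Q
        x += v * dt
        p00 += dt * (2.0 * p01 + dt * p11) + q
        p01 += dt * p11
        p11 += q
        # update with position measurement z (H = [1, 0])
        s = p00 + r                      # innovation variance
        k0, k1 = p00 / s, p01 / s        # Kalman gain
        resid = z - x
        x += k0 * resid
        v += k1 * resid
        p00, p01, p11 = (1 - k0) * p00, (1 - k0) * p01, p11 - k1 * p01
    return x, v

dt = 0.1
track = [2.0 + 3.0 * dt * k for k in range(50)]   # noiseless test track
pos, vel = kalman_cv(track, dt)
future = pos + vel * 1.0                 # extrapolate 1 s ahead
```

A real ball tracker would run one such filter per axis (with gravity added to the vertical model) and intersect the extrapolated trajectory with the ground plane to get the landing point.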

23 pages, 4520 KiB  
Article
A Novel Focal Phi Loss for Power Line Segmentation with Auxiliary Classifier U-Net
by Rabeea Jaffari, Manzoor Ahmed Hashmani and Constantino Carlos Reyes-Aldasoro
Sensors 2021, 21(8), 2803; https://doi.org/10.3390/s21082803 - 16 Apr 2021
Cited by 22 | Viewed by 3268
Abstract
The segmentation of power lines (PLs) from aerial images is a crucial task for the safe navigation of unmanned aerial vehicles (UAVs) operating at low altitudes. Despite the advances in deep learning-based approaches for PL segmentation, these models are still vulnerable to the class imbalance present in the data. The PLs occupy only a minimal portion (1–5%) of the aerial images as compared to the background region (95–99%). Generally, this class imbalance problem is addressed via the use of PL-specific detectors in conjunction with the popular class balanced cross entropy (BBCE) loss function. However, these PL-specific detectors do not work outside their application areas, and a BBCE loss requires hyperparameter tuning for class-wise weights, which is not trivial. Moreover, the BBCE loss results in low dice scores and precision values and thus fails to achieve an optimal trade-off between dice scores, model accuracy, and precision–recall values. In this work, we propose a generalized focal loss function based on the Matthews correlation coefficient (MCC), or the Phi coefficient, to address the class imbalance problem in PL segmentation while utilizing a generic deep segmentation architecture. We evaluate our loss function on a vanilla U-Net model improved with an additional convolutional auxiliary classifier head (ACU-Net) for better learning and faster model convergence. The evaluation on two PL datasets, namely the Mendeley Power Line Dataset and the Power Line Dataset of Urban Scenes (PLDU), where PLs occupy around 1% and 2% of the aerial image area, respectively, reveals that our proposed loss function outperforms the popular BBCE loss by 16% in PL dice scores on both datasets, by 19% in precision and false detection rate (FDR) values for the Mendeley PL dataset, and by 15% in precision and FDR values for the PLDU, with a minor degradation in the accuracy and recall values. Moreover, our proposed ACU-Net outperforms the baseline vanilla U-Net by 1–10% on the characteristic evaluation parameters for both PL datasets. Thus, our proposed loss function with ACU-Net achieves an optimal trade-off for the characteristic evaluation parameters without any bells and whistles. Our code is available at GitHub.
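The Matthews correlation coefficient at the heart of the proposed loss can be sketched in a few lines: with soft probabilities in place of hard labels, the confusion-matrix counts become differentiable, so 1 − MCC can serve as an imbalance-robust segmentation loss. This is a generic illustration without the focal weighting the paper adds, and the function name is an assumption.

```python
import math

def soft_mcc(y_true, y_prob):
    """MCC computed from soft confusion counts over flattened masks."""
    tp = sum(p * t for p, t in zip(y_prob, y_true))
    fp = sum(p * (1 - t) for p, t in zip(y_prob, y_true))
    fn = sum((1 - p) * t for p, t in zip(y_prob, y_true))
    tn = sum((1 - p) * (1 - t) for p, t in zip(y_prob, y_true))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# A heavily imbalanced mask, as with power-line pixels: MCC stays
# informative where plain accuracy would look deceptively high.
y_true = [1, 0, 0, 0, 0, 0, 0, 0]
perfect = soft_mcc(y_true, [1.0, 0, 0, 0, 0, 0, 0, 0])
loss = 1.0 - soft_mcc(y_true, [0.9, 0.1, 0, 0, 0, 0, 0, 0])
```

Because every pixel contributes to all four counts, minimizing 1 − MCC balances false positives and false negatives without per-class weight tuning.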

24 pages, 9709 KiB  
Article
Develop an Adaptive Real-Time Indoor Intrusion Detection System Based on Empirical Analysis of OFDM Subcarriers
by Wei Zhuang, Yixian Shen, Lu Li, Chunming Gao and Dong Dai
Sensors 2021, 21(7), 2287; https://doi.org/10.3390/s21072287 - 25 Mar 2021
Cited by 9 | Viewed by 2596
Abstract
Device-free passive intrusion detection is a promising technology for determining whether moving subjects are present without deploying any specific sensors or devices in the area of interest. With the rapid development of wireless technology, multi-input multi-output (MIMO) and orthogonal frequency-division multiplexing (OFDM), which were originally exploited to improve the stability and bandwidth of Wi-Fi communication, can now support extensive applications such as indoor intrusion detection, patient monitoring, and healthcare monitoring for the elderly. At present, most research uses channel state information (CSI) in the IEEE 802.11n standard to analyze signals and select features. However, there are very limited studies on intrusion detection in real home environments that consider scenarios including different motion speeds, different numbers of intruders, varying device locations, and whether occupants are asleep at home. In this paper, we propose an adaptive real-time indoor intrusion detection system using subcarrier correlation-based features, exploiting the narrow frequency spacing of adjacent subcarriers. We propose a link-pair selection algorithm for choosing an optimal link pair as a baseline for subsequent CSI processing. We prototype our system on commercial Wi-Fi devices and compare its overall performance with those of state-of-the-art approaches. The experimental results demonstrate that our system achieves impressive performance regardless of intruders' motion speeds, the number of intruders, non-line-of-sight conditions, and sleeping occupant conditions.
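The subcarrier-correlation feature idea can be sketched as follows: adjacent OFDM subcarriers are narrowly spaced, so their CSI amplitude series stay highly correlated in a static room, and human motion disturbs that correlation. The data layout below (a list of per-subcarrier amplitude time series) is a simplified assumption, not the paper's exact format.

```python
def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb) if sa and sb else 0.0

def adjacent_subcarrier_corr(csi):
    """Correlation of each adjacent subcarrier pair over a time window."""
    return [pearson(csi[k], csi[k + 1]) for k in range(len(csi) - 1)]

# Static environment: subcarriers fade together, correlations near 1.
static = [[1.0, 1.1, 0.9, 1.0, 1.05],
          [2.0, 2.1, 1.9, 2.0, 2.05],
          [1.5, 1.6, 1.4, 1.5, 1.55]]
feats = adjacent_subcarrier_corr(static)
```

A detector would then threshold (or classify) windows whose correlation features drop noticeably below the static baseline.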

18 pages, 8235 KiB  
Article
Three-Dimensional Reconstruction with a Laser Line Based on Image In-Painting and Multi-Spectral Photometric Stereo
by Liang Lu, Hongbao Zhu, Junyu Dong, Yakun Ju and Huiyu Zhou
Sensors 2021, 21(6), 2131; https://doi.org/10.3390/s21062131 - 18 Mar 2021
Cited by 2 | Viewed by 2722
Abstract
This paper presents a multi-spectral photometric stereo (MPS) method based on image in-painting, which can reconstruct shape from a multi-spectral image containing a laser line. One of the difficulties in multi-spectral photometric stereo is extracting the laser line, because the illumination required for MPS, e.g., red, green, and blue light, may pollute the laser color. Unlike previous methods, this work improves the network proposed by Isola to obtain a Generative Adversarial Network based on image in-painting, which separates a multi-spectral image containing a laser line into a clean laser image and an uncorrupted multi-spectral image without the laser line. These results were then substituted into the method proposed by Fan to obtain high-precision 3D reconstruction results. To make the proposed method applicable to real-world objects, a rendered image dataset obtained using the rendering models in ShapeNet was used for training the network. Evaluation on rendered images and real-world images shows the superiority of the proposed approach over several previous methods.

16 pages, 3204 KiB  
Article
A High-Accuracy and Power-Efficient Self-Optimizing Wireless Water Level Monitoring IoT Device for Smart City
by Tsun-Kuang Chi, Hsiao-Chi Chen, Shih-Lun Chen and Patricia Angela R. Abu
Sensors 2021, 21(6), 1936; https://doi.org/10.3390/s21061936 - 10 Mar 2021
Cited by 5 | Viewed by 2241
Abstract
In this paper, a novel self-optimizing water level monitoring methodology is proposed for smart city applications. Considering system maintenance, power efficiency and accuracy are important for Internet of Things (IoT) devices and systems. A multi-step measurement mechanism and a power self-charging process are proposed in this study to improve device efficiency for water level monitoring applications. The proposed methodology improved accuracy by 0.16–0.39% by moving the sensor to estimate the distance relative to different locations. Additional power is generated while executing a multi-step measurement, and the power self-optimizing process dynamically adjusts the settings to balance the charging and discharging currents; in a stable charging simulation, the battery level can remain above 50%. These methodologies were successfully implemented using an embedded control device, an ultrasonic sensor module, a LoRa transmission module, and a stepper motor. According to the experimental results, the proposed multi-step methodology offers high accuracy and efficient power consumption for water level monitoring applications.

18 pages, 4697 KiB  
Article
Intelligent Brushing Monitoring Using a Smart Toothbrush with Recurrent Probabilistic Neural Network
by Ching-Han Chen, Chien-Chun Wang and Yan-Zhen Chen
Sensors 2021, 21(4), 1238; https://doi.org/10.3390/s21041238 - 10 Feb 2021
Cited by 12 | Viewed by 7595
Abstract
Smart toothbrushes equipped with inertial sensors are emerging as high-tech oral health products in personalized health care. The real-time signal processing of nine-axis inertial sensing and toothbrush posture recognition requires substantial computational resources. This paper proposes a recurrent probabilistic neural network (RPNN) for toothbrush posture recognition that requires few computational resources while achieving high recognition accuracy and efficiency. The RPNN model is trained to recognize toothbrush posture and brushing position and then monitors the correctness and completeness of the Bass brushing technique. In our experiments, the recognition accuracy of the RPNN is 99.08%, which is 16.2% higher than that of a Convolutional Neural Network (CNN) and 21.21% higher than that of a Long Short-Term Memory (LSTM) model. The model greatly reduces the computing power required of hardware devices, so our system can run directly on smartphones.

17 pages, 1729 KiB  
Article
SoC FPGA Accelerated Sub-Optimized Binary Fully Convolutional Neural Network for Robotic Floor Region Segmentation
by Chi-Chia Sun, Afaroj Ahamad and Pin-He Liu
Sensors 2020, 20(21), 6133; https://doi.org/10.3390/s20216133 - 28 Oct 2020
Cited by 2 | Viewed by 2337
Abstract
In this article, a new Binary Fully Convolutional Neural Network (B-FCN) based on Taguchi-method sub-optimization is proposed for the segmentation of robotic floor regions; it can precisely distinguish floor regions in complex indoor environments. This methodology is well suited to robot vision on an embedded platform, and the segmentation accuracy reaches 84.80% on average. A total of 6000 training samples were used to improve the accuracy and reach convergence. To achieve real-time computation, a PYNQ FPGA platform with heterogeneous computing acceleration was used to accelerate the proposed B-FCN architecture. Overall, robots benefit from the better navigation and route planning enabled by our approach. The FPGA synthesis of our binarization method indicates an efficient reduction in BRAM size to 0.5–1%, and the GOPS/W is sufficiently high. Notably, the proposed faster architecture is ideal for low-power embedded devices that need to solve the shortest path problem, path searching, and motion planning.

25 pages, 7481 KiB  
Article
An Adaptive Deep Learning Framework for Dynamic Image Classification in the Internet of Things Environment
by Syed Muslim Jameel, Manzoor Ahmed Hashmani, Mobashar Rehman and Arif Budiman
Sensors 2020, 20(20), 5811; https://doi.org/10.3390/s20205811 - 14 Oct 2020
Cited by 15 | Viewed by 4196
Abstract
In the modern era of digitization, analysis in the Internet of Things (IoT) environment demands a brisk amalgamation of domains such as high-dimensional (image) data sensing technologies, robust internet connections (4G or 5G), and dynamic (adaptive) deep learning approaches. This is required for a broad range of indispensable intelligent applications, like intelligent healthcare systems. Dynamic image classification, which may take place during analysis in the IoT environment, is one of the major areas of concern for researchers. Dynamic image classification is associated with several temporal data perturbations (such as novel class arrival and the class evolution issue) which cause massive classification deterioration in deployed classification models and render them ineffective. Therefore, this study addresses such temporal inconsistencies and proposes an adapted deep learning framework (an ameliorated adaptive convolutional neural network (CNN) ensemble framework) which handles novel class arrival and the class evolution issue during dynamic image classification. The proposed framework is an improved version of a previous adaptive CNN ensemble with additional online training (OT) and online classifier update (OCU) modules. The OT module is a clustering-based approach which uses the Euclidean distance and the silhouette method to determine potential new classes, whereas the OCU updates the weights of the existing instances of the ensemble with newly arrived samples. The proposed framework showed the desired classification improvement under non-stationary scenarios for the benchmark (CIFAR10) and real (ISIC 2019: skin disease) data streams. The proposed framework also outperformed state-of-the-art shallow learning and deep learning models. The results show the effectiveness of the proposed framework and prove its ability to adapt to new concept changes during dynamic image classification. In future work, the authors aim to develop an IoT-enabled adaptive intelligent dermoscopy device (for dermatologists); therefore, further improvements in classification accuracy (for the real dataset) are the future concern of this study.
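The OT module's decision criterion can be sketched with a from-scratch silhouette coefficient over Euclidean distances: a high mean silhouette for a candidate grouping suggests the incoming samples form a well-separated (potentially new) class. The function below is a generic illustration, not the authors' implementation.

```python
def silhouette(points, labels):
    """Mean silhouette coefficient under Euclidean distance.

    a(i): mean distance to points in the same cluster;
    b(i): mean distance to the nearest other cluster;
    s(i) = (b - a) / max(a, b), so values near 1 indicate
    well-separated clusters.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        same = [dist(p, q)
                for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        a = sum(same) / len(same) if same else 0.0
        b = min(
            sum(dist(p, q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in set(labels) if other != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Two tight, well-separated 2-D clusters score close to 1,
# supporting the hypothesis that the second group is a new class.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labs = [0, 0, 0, 1, 1, 1]
score = silhouette(pts, labs)
```

In the streaming setting, such a score would be computed on clustered feature embeddings of newly arrived images before deciding whether to spawn a new class.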

17 pages, 5838 KiB  
Article
Deep Learning-Based Violin Bowing Action Recognition
by Shih-Wei Sun, Bao-Yun Liu and Pao-Chi Chang
Sensors 2020, 20(20), 5732; https://doi.org/10.3390/s20205732 - 09 Oct 2020
Cited by 7 | Viewed by 2918
Abstract
We propose a violin bowing action recognition system that can accurately recognize distinct bowing actions in classical violin performance. The system recognizes bowing actions by analyzing signals from a depth camera and from inertial sensors worn by a violinist. The contribution of this study is threefold: (1) a dataset of violin bowing actions was constructed from data captured by a depth camera and multiple inertial sensors; (2) data augmentation was achieved for depth-frame data through rotation in three-dimensional world coordinates and for inertial sensing data through yaw, pitch, and roll angle transformations; and (3) bowing action classifiers were trained using different modalities, to compensate for the strengths and weaknesses of each modality, based on deep learning methods with a decision-level fusion process. In experiments, both the large external motions and the subtle local motions produced by violin bow manipulations were accurately recognized by the proposed system (average accuracy > 80%).
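The yaw/pitch/roll augmentation for inertial data amounts to multiplying each 3-D sample by a rotation matrix, simulating the same bowing motion recorded by a differently oriented sensor. The sketch below uses axis conventions and a rotation order chosen as illustrative assumptions rather than taken from the paper.

```python
import math

def rotate_ypr(point, yaw=0.0, pitch=0.0, roll=0.0):
    """Rotate a 3-D point by roll (X), then pitch (Y), then yaw (Z),
    all angles in radians."""
    x, y, z = point
    cr, sr = math.cos(roll), math.sin(roll)      # roll about X
    y, z = cr * y - sr * z, sr * y + cr * z
    cp, sp = math.cos(pitch), math.sin(pitch)    # pitch about Y
    x, z = cp * x + sp * z, -sp * x + cp * z
    cw, sw = math.cos(yaw), math.sin(yaw)        # yaw about Z
    x, y = cw * x - sw * y, sw * x + cw * y
    return (x, y, z)

# A 90-degree yaw sends the X axis onto the Y axis.
vx = rotate_ypr((1.0, 0.0, 0.0), yaw=math.pi / 2)
```

Applying the same transform to every accelerometer/gyroscope sample in a recording yields an augmented copy of that bowing sequence.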
