Advanced Machine Learning and Scene Understanding in Images and Data

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 March 2023) | Viewed by 5318

Special Issue Editors


Prof. Dr. Pietro Zanuttigh
Guest Editor
Multimedia Technology and Telecommunications Lab, University of Padova, 35131 Padova PD, Italy
Interests: computer vision; semantic segmentation; transfer learning; 3D data acquisition and processing; time-of-flight sensors

Dr. Stefano Ghidoni
Guest Editor
Department of Information Engineering, IAS-Lab (Intelligent Autonomous Systems Lab), University of Padova, 35131 Padova PD, Italy
Interests: computer vision; deep learning for semantic segmentation and scene understanding; people detection and re-identification; industrial vision systems

Special Issue Information

Dear Colleagues,

Scene understanding from visual data is a key enabling technology for many applications, including autonomous driving, robotic motion and path planning, industrial automation, and video surveillance. The recent introduction of deep learning techniques has brought impressive performance improvements to these challenging tasks, even though the need for large amounts of training data remains a critical issue. This Special Issue welcomes novel research presenting effective strategies for scene understanding from both images and 3D data. Possible applications include segmentation, semantic analysis, and the detection or recognition of objects and people, among many others. Papers focusing on novel segmentation strategies, together with machine learning techniques for semantic segmentation and, more generally, scene understanding from visual data, are welcome. Covered topics also include techniques exploiting 3D information for the aforementioned applications, both in the form of depth data and of point clouds. Finally, possible submissions also include approaches addressing the critical issue of acquiring training data, including transfer learning, reinforcement learning, domain adaptation, and incremental learning strategies for scene understanding.

Prof. Dr. Pietro Zanuttigh
Dr. Stefano Ghidoni
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website; once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers are published continuously in the journal (as soon as accepted) and listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Machine learning
  • Semantic segmentation
  • Software engineering
  • Image and 3D data segmentation
  • Deep learning for scene understanding
  • Transfer learning
  • Reinforcement learning
  • Domain adaptation
  • Point cloud segmentation
  • Depth data analysis
  • Incremental learning
  • 3D scene understanding
  • Robotic applications of scene understanding and human–robot cooperation
  • Scene understanding for autonomous driving
  • Scene understanding for drone applications

Published Papers (3 papers)


Research

14 pages, 7557 KiB  
Article
A Soybean Classification Method Based on Data Balance and Deep Learning
by Ning Zhang, Enxu Zhang and Fei Li
Appl. Sci. 2023, 13(11), 6425; https://doi.org/10.3390/app13116425 - 24 May 2023
Viewed by 1040
Abstract
Soybean is a food crop with clear economic benefits, and whether plants are damaged directly affects their survival and nutritional value. In machine learning, unbalanced data, in which one class contains far more samples than another, are a major factor limiting performance: they bias the classification results towards the majority class and thus reduce classification accuracy. This paper therefore investigates the effectiveness of data-balancing methods for convolutional neural networks, using two balancing strategies: expanding the data set through over-sampling, and using a loss function with assignable class weights. To verify the effectiveness of these methods, four networks are introduced as control experiments. The experimental results show that the new loss function effectively improves classification accuracy and learning ability, with the DenseNet network reaching a classification accuracy of 98.48%, whereas the data-augmentation method greatly reduces classification accuracy. With the binary classification method and data-augmented data sets, an excessive number of convolution layers reduces classification accuracy, and a small number of convolution layers suffices for classification. The experiments verify that a neural network with a small number of convolution layers can improve classification accuracy by 1.52% when using the data-augmentation-based balancing method.
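For readers unfamiliar with the two balancing strategies mentioned in the abstract, the sketch below shows both in minimal PyTorch form. It is an illustration, not the authors' implementation: the 9:1 imbalance ratio and the weight values are assumptions chosen for the example.

```python
import torch
import torch.nn as nn
from torch.utils.data import WeightedRandomSampler

# Strategy 1: a loss function with assignable class weights.
# Assume "undamaged" soybeans outnumber "damaged" ones roughly 9:1
# (illustrative ratio); up-weighting the rare class counters the bias
# toward the majority class.
class_weights = torch.tensor([1.0, 9.0])  # [undamaged, damaged]
criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(4, 2)            # dummy model output: batch of 4, 2 classes
targets = torch.tensor([0, 1, 0, 0])  # ground-truth class indices
loss = criterion(logits, targets)

# Strategy 2: over-sampling the minority class when building batches,
# so each batch is drawn roughly balanced from the unbalanced data set.
sample_weights = [9.0 if t == 1 else 1.0 for t in targets.tolist()]
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(sample_weights),
                                replacement=True)
```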

17 pages, 2645 KiB  
Article
Comparing OBIA-Generated Labels and Manually Annotated Labels for Semantic Segmentation in Extracting Refugee-Dwelling Footprints
by Yunya Gao, Stefan Lang, Dirk Tiede, Getachew Workineh Gella and Lorenz Wendt
Appl. Sci. 2022, 12(21), 11226; https://doi.org/10.3390/app122111226 - 5 Nov 2022
Cited by 1 | Viewed by 1667
Abstract
Refugee-dwelling footprints derived from satellite imagery are beneficial for humanitarian operations. Recently, deep learning approaches have attracted much attention in this domain. However, most refugees are hosted by low- and middle-income countries, where accurate label data are often unavailable. The Object-Based Image Analysis (OBIA) approach has been widely applied to this task for humanitarian operations over the last decade; however, the footprints are usually produced urgently and thus contain delineation errors. Thus far, no research has discussed whether footprints generated by the OBIA approach (OBIA labels) can replace manually annotated labels (Manual labels) for this task. This research compares the performance of OBIA labels and Manual labels for semantic segmentation under multiple strategies. The results reveal that OBIA labels can produce IoU values greater than 0.5, which is sufficient for applicable results in humanitarian operations. Most falsely predicted pixels stem from the boundaries of built-up structures, occlusion by trees, and structures with complicated ontology. In addition, we found that using a small number of Manual labels to fine-tune models initially trained with OBIA labels can outperform models trained with purely Manual labels. These findings show the high value of OBIA labels for deep-learning-based refugee-dwelling extraction in future humanitarian operations.
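The IoU threshold cited in the abstract is the standard intersection-over-union metric for segmentation masks. As a hedged illustration (not the authors' evaluation code), a minimal NumPy sketch for binary dwelling-footprint masks:

```python
import numpy as np

def binary_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over Union between two binary masks of equal shape."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(intersection) / float(union) if union > 0 else 1.0

# Toy 4x4 masks: a predicted vs. a ground-truth dwelling footprint.
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
gt = np.array([[1, 1, 1, 0],
               [1, 1, 1, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 0]])
print(binary_iou(pred, gt))  # 4/6 ≈ 0.67, above the 0.5 applicability threshold
```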

11 pages, 9315 KiB  
Article
Throwaway Shadows Using Parallel Encoders Generative Adversarial Network
by Kamran Javed, Nizam Ud Din, Ghulam Hussain and Tahir Farooq
Appl. Sci. 2022, 12(2), 824; https://doi.org/10.3390/app12020824 - 14 Jan 2022
Cited by 2 | Viewed by 1804
Abstract
Face photographs taken on a bright sunny day or in floodlight contain unnecessary shadows of objects on the face. Most previous works deal with removing shadows from scene images and struggle to do so for facial images. Faces have a complex semantic structure, which makes shadow removal challenging. The aim of this research is to remove the shadow of an object in facial images. We propose a novel generative adversarial network (GAN)-based image-to-image translation approach for shadow removal in face images. The first stage of our model automatically produces a binary segmentation mask for the shadow region. The second stage, a GAN-based network, then removes the object shadow and synthesizes the affected region. The generator network of our GAN has two parallel encoders: one a standard convolution path and the other a partial convolution path. We find that this combination in the generator not only learns an incorporated semantic structure but also disentangles visual discrepancy problems under the shadow area. In addition to the GAN loss, we exploit a low-level L1 loss, a structural-level SSIM loss, and a perceptual loss from a pre-trained loss network for better texture and perceptual quality. Since there is no paired dataset for the shadow removal problem, we created a synthetic shadow dataset to train our network in a supervised manner. The proposed approach effectively removes shadows from real and synthetic test samples while retaining complex facial semantics. Experimental evaluations consistently show the advantages of the proposed method over several representative state-of-the-art approaches.
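The composite generator objective described above (adversarial + L1 + SSIM + perceptual terms) follows a common pattern in GAN-based image restoration. The sketch below is an illustrative PyTorch approximation, not the authors' implementation: the loss weights `w` are assumptions, and `ssim_fn` stands in for an external SSIM implementation (core PyTorch does not ship one).

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

# Frozen VGG-16 feature extractor as the pre-trained "loss network".
vgg_features = vgg16(weights="IMAGENET1K_V1").features[:16].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)

def perceptual_loss(fake: torch.Tensor, real: torch.Tensor) -> torch.Tensor:
    # L1 distance between frozen VGG feature maps of output and target.
    return F.l1_loss(vgg_features(fake), vgg_features(real))

def generator_loss(d_fake_logits, fake, real, ssim_fn,
                   w=(1.0, 100.0, 1.0, 1.0)):  # assumed weights, not the paper's
    # Adversarial term: push the discriminator to rate the output as "real".
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))
    l1 = F.l1_loss(fake, real)        # low-level reconstruction term
    ssim = 1.0 - ssim_fn(fake, real)  # ssim_fn: external SSIM implementation
    perc = perceptual_loss(fake, real)
    return w[0] * adv + w[1] * l1 + w[2] * ssim + w[3] * perc
```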
