Digital Image Processing: Advanced Technologies and Applications

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 March 2024) | Viewed by 38113

Special Issue Editors


Dr. Zahid Mehmood Jehangiri
Guest Editor
Department of Electrical and Computer Engineering, COMSATS University Islamabad, Abbottabad, Pakistan
Interests: artificial intelligence; machine learning; computer vision; digital image processing

Dr. Mohsin Shahzad
Guest Editor
Department of Electrical and Computer Engineering, COMSATS University Islamabad, Abbottabad, Pakistan
Interests: machine learning applications in power systems

Dr. Uzair Khan
Guest Editor
Department of Electrical and Computer Engineering, COMSATS University Islamabad, Abbottabad, Pakistan
Interests: target tracking and information fusion

Special Issue Information

Dear Colleagues,

Throughout the 21st century, the human demand for information has grown daily, and the choice of an electronic imaging device depends on its application. With rapid technological development and the widespread use of mobile devices and social media, people are constantly exposed to enormous amounts of information, including digital images and videos; every minute, vast quantities of digital content are uploaded to the internet. Digital imaging has therefore taken on a substantial role in many scientific endeavors, for instance, image enhancement, restoration, and object recognition. The colors and contrast of many real-life images degrade abruptly due to factors such as insufficient lighting, excessive light absorption, scattering, and the limitations of the imaging devices themselves. Hardware restrictions of image- or video-capturing devices likewise affect imaging quality. Moreover, the selective absorption and scattering of light tend to cause color deviations in many real-life images, resulting in blur and poor contrast. Furthermore, in various situations, digital images are distorted, which degrades the visual experience for human viewers; adverse weather conditions such as rain, snow, fog, or cloud cover, for example, produce blurry images along with color distortions. Although imaging equipment with better embedded hardware can improve image quality to a certain extent, its adaptability is often poor, and the quality of the acquired images remains unsatisfactory.

This Special Issue will focus on digital imaging strategies, with the aim of processing, analyzing, and investigating images and the latest methods for handling them. Manuscripts are invited in these and related multidisciplinary areas. Interest in computer vision, machine learning, and deep learning has grown significantly in recent years; manuscripts that explore the utility of such tools for advanced imaging technologies and applications are therefore also encouraged.

In this Special Issue, we invite authors to submit original research papers, reviews, and viewpoint articles related to recent advances in the applications and technologies of imaging and signal analysis, at all levels. We are open to papers addressing a diverse range of topics, from foundational issues to novel algorithms that provide state-of-the-art solutions and technological systems for practical, feasible applications.

The Special Issue on “Digital Image Processing: Advanced Technologies and Applications” covers emerging trends in image theory, original algorithms, and novel architectures for capturing, forming, and displaying digital images, their subsequent processing, communication, and analysis, and video and multidimensional systems and signals across a wide diversity of applications. Topics of interest include, but are not limited to, the keywords listed below.

Dr. Zahid Mehmood Jehangiri
Dr. Mohsin Shahzad
Dr. Uzair Khan
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website and then proceeding to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • classification
  • computer vision
  • convolutional neural networks
  • deep learning
  • generative adversarial networks
  • image analysis
  • image enhancement
  • hyperspectral and infrared imaging
  • object detection
  • object recognition
  • segmentation
  • texture analysis
  • tracking
  • seismic inversion
  • cloud segmentation
  • satellite imaging

Published Papers (13 papers)


Research


33 pages, 9169 KiB  
Article
Dhad—A Children’s Handwritten Arabic Characters Dataset for Automated Recognition
by Sarab AlMuhaideb, Najwa Altwaijry, Ahad D. AlGhamdy, Daad AlKhulaiwi, Raghad AlHassan, Haya AlOmran and Aliyah M. AlSalem
Appl. Sci. 2024, 14(6), 2332; https://doi.org/10.3390/app14062332 - 10 Mar 2024
Viewed by 450
Abstract
This study delves into the intricate realm of recognizing handwritten Arabic characters, specifically targeting children’s script. Given the inherent complexities of the Arabic script, encompassing semi-cursive styles, distinct character forms based on position, and the inclusion of diacritical marks, the domain demands specialized attention. While prior research has largely concentrated on adult handwriting, the spotlight here is on children’s handwritten Arabic characters, an area marked by its distinct challenges, such as variations in writing quality and increased distortions. To this end, we introduce a novel dataset, “Dhad”, refined for enhanced quality and quantity. Our investigation employs a tri-fold experimental approach, encompassing the exploration of pre-trained deep learning models (i.e., MobileNet, ResNet50, and DenseNet121), custom-designed Convolutional Neural Network (CNN) architecture, and traditional classifiers (i.e., Support Vector Machine (SVM), Random Forest (RF), and Multilayer Perceptron (MLP)), leveraging deep visual features. The results illuminate the efficacy of fine-tuned pre-existing models, the potential of custom CNN designs, and the intricacies associated with disjointed classification paradigms. The pre-trained model MobileNet achieved the best test accuracy of 93.59% on the Dhad dataset. Additionally, as a conceptual proposal, we introduce the idea of a computer application designed specifically for children aged 7–12, aimed at improving Arabic handwriting skills. Our concluding reflections emphasize the need for nuanced dataset curation, advanced model architectures, and cohesive training strategies to navigate the multifaceted challenges of Arabic character recognition.
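
As a concrete illustration of the fine-tuning strategy reported here, a minimal PyTorch/torchvision sketch of adapting an ImageNet-pretrained MobileNet to a folder-per-class character dataset follows. The dataset path, the 28-class label set, and the hyperparameters are illustrative assumptions, not values taken from the paper.

    import torch
    import torch.nn as nn
    from torchvision import datasets, models, transforms

    NUM_CLASSES = 28  # assumption: one class per base Arabic character

    # Load an ImageNet-pretrained MobileNetV2 and replace its classifier head.
    model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
    model.classifier[1] = nn.Linear(model.last_channel, NUM_CLASSES)

    tfm = transforms.Compose([
        transforms.Grayscale(num_output_channels=3),  # handwriting scans are grayscale
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ])
    train_set = datasets.ImageFolder("data/dhad_train", transform=tfm)  # placeholder path
    loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for epoch in range(5):  # illustrative epoch count
        for images, labels in loader:
            opt.zero_grad()
            loss_fn(model(images), labels).backward()
            opt.step()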

21 pages, 5126 KiB  
Article
Comprehensive Analysis of Mammography Images Using Multi-Branch Attention Convolutional Neural Network
by Ebtihal Al-Mansour, Muhammad Hussain, Hatim A. Aboalsamh and Saad A. Al-Ahmadi
Appl. Sci. 2023, 13(24), 12995; https://doi.org/10.3390/app132412995 - 05 Dec 2023
Viewed by 768
Abstract
Breast cancer profoundly affects women’s lives; its early diagnosis and treatment increase patient survival chances. Mammography is a common screening method for breast cancer, and many methods have been proposed for automatic diagnosis. However, most of them focus on single-label classification and do not provide a comprehensive analysis concerning density, abnormality, and severity levels. We propose a method based on the multi-label classification of two-view mammography images to comprehensively diagnose a patient’s condition. It leverages the correlation between density type, lesion type, and states of lesions, as radiologists usually do. It simultaneously classifies mammograms into the corresponding density, abnormality type, and severity level. It takes two-view mammograms (with craniocaudal and mediolateral oblique views) as input, analyzes them using ConvNeXt and the channel attention mechanism, and integrates the information from the two views. Finally, the fused information is passed to task-specific multi-branches, which learn task-specific representations and predict the relevant state. The system was trained, validated, and tested using two public domain benchmark datasets, INbreast and the Curated Breast Imaging Subset of DDSM (CBIS-DDSM), and achieved state-of-the-art results. The proposed computer-aided diagnosis (CAD) system provides a holistic observation of a patient’s condition. It gives radiologists a comprehensive analysis of the mammograms to prepare a full report of the patient’s condition, thereby increasing diagnostic precision.
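
To make the two-view, multi-branch idea concrete, here is a hedged PyTorch sketch of one plausible realization: a shared ConvNeXt backbone per view, a simple squeeze-and-excitation style channel attention, concatenation of the two view embeddings, and three task heads. The head sizes (4 density classes, 3 abnormality types, 2 severity levels) are assumptions, not the paper's exact configuration.

    import torch
    import torch.nn as nn
    from torchvision import models

    class TwoViewMultiTask(nn.Module):
        def __init__(self):
            super().__init__()
            self.backbone = models.convnext_tiny(weights="DEFAULT").features
            c = 768  # final channel count of convnext_tiny
            # Squeeze-and-excitation style channel attention (an assumption
            # standing in for the paper's channel attention mechanism).
            self.se = nn.Sequential(
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(c, c // 16), nn.ReLU(),
                nn.Linear(c // 16, c), nn.Sigmoid(),
            )
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.head_density = nn.Linear(2 * c, 4)      # e.g., density A-D
            self.head_abnormality = nn.Linear(2 * c, 3)  # e.g., mass/calcification/none
            self.head_severity = nn.Linear(2 * c, 2)     # e.g., benign/malignant

        def encode(self, x):
            f = self.backbone(x)                        # (B, 768, H, W)
            w = self.se(f).unsqueeze(-1).unsqueeze(-1)  # per-channel weights
            return self.pool(f * w).flatten(1)          # (B, 768)

        def forward(self, cc_view, mlo_view):
            z = torch.cat([self.encode(cc_view), self.encode(mlo_view)], dim=1)
            return (self.head_density(z),
                    self.head_abnormality(z),
                    self.head_severity(z))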

18 pages, 3442 KiB  
Article
Multi-Pedestrian Tracking Based on KC-YOLO Detection and Identity Validity Discrimination Module
by Jingwen Li, Wei Wu, Dan Zhang, Dayong Fan, Jianwu Jiang, Yanling Lu, Ertao Gao and Tao Yue
Appl. Sci. 2023, 13(22), 12228; https://doi.org/10.3390/app132212228 - 10 Nov 2023
Cited by 2 | Viewed by 895
Abstract
Multiple-object tracking (MOT) is a fundamental task in computer vision and is widely applied across various domains. However, its algorithms remain somewhat immature in practical applications. To address the challenges presented by complex scenarios featuring instances of missed detections, false alarms, and frequent target switching leading to tracking failures, we propose an approach to multi-object tracking utilizing KC-YOLO detection and an identity validity discrimination module. We have constructed the KC-YOLO detection model as the detector for the tracking task, optimized the selection of detection frames, and implemented adaptive feature refinement to effectively address issues such as incomplete pedestrian features caused by occlusion. Furthermore, we have introduced an identity validity discrimination module in the data association component of the tracker. This module leverages the occlusion ratio coefficient, denoted by “k”, to assess the validity of pedestrian identities in low-scoring detection frames following cascade matching. This approach not only enhances pedestrian tracking accuracy but also ensures the integrity of pedestrian identities. In experiments on the MOT16, MOT17, and MOT20 datasets, MOTA reached 75.9%, 78.5%, and 70.1%, and IDF1 reached 74.8%, 77.8%, and 72.4%. The experimental results demonstrate the superiority of the methodology. This research outcome has potential applications in security monitoring, including public safety and fire prevention, for tracking critical targets.
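
The abstract does not define the occlusion ratio coefficient “k” precisely, so the sketch below is one plausible reading: a low-scoring detection keeps its identity only when its box is sufficiently covered by higher-confidence boxes, i.e., when occlusion plausibly explains the low score. The thresholds are illustrative assumptions.

    def overlap_fraction(box, others):
        # Largest fraction of `box` covered by any single box in `others`
        # (a simplifying stand-in for the paper's occlusion ratio "k").
        x1, y1, x2, y2 = box
        area = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        if area == 0.0:
            return 0.0
        best = 0.0
        for ox1, oy1, ox2, oy2 in others:
            iw = max(0.0, min(x2, ox2) - max(x1, ox1))
            ih = max(0.0, min(y2, oy2) - max(y1, oy1))
            best = max(best, iw * ih / area)
        return best

    def identity_is_valid(box, score, high_conf_boxes, score_thr=0.5, k_thr=0.6):
        # High-scoring detections always keep their identity; low-scoring
        # ones do so only if heavily occluded by confident detections.
        return score >= score_thr or overlap_fraction(box, high_conf_boxes) >= k_thr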

19 pages, 1710 KiB  
Article
Bidirectional-Feature-Learning-Based Adversarial Domain Adaptation with Generative Network
by Chansu Han, Hyunseung Choo and Jongpil Jeong
Appl. Sci. 2023, 13(21), 11825; https://doi.org/10.3390/app132111825 - 29 Oct 2023
Viewed by 860
Abstract
Studying domain adaptation is a recent research trend. Generally, many generative models perform well on training data from a specific domain, but their ability to generalize to other domains can be limited. A growing body of research has therefore utilized domain adaptation techniques to address the vulnerability of generative models to input from other domains. In this paper, we focus on generative models and representation learning. Generative models have received a lot of attention for their ability to generate various types of data, such as images, music, and text; studies utilizing generative adversarial networks (GANs) and autoencoder structures have drawn particular interest. We address the domain adaptation problem by reconstructing real image data using an autoencoder structure; the reconstructed images, regarded as a type of noisy image data, are used as input data. Reconstructing data by extracting features and selectively transforming them to reduce differences in characteristics between domains constitutes representation learning. Building on these trends, this paper proposes a novel methodology that combines bidirectional feature learning and generative networks to approach the domain adaptation problem, improving adaptability by accurately simulating the real data distribution. The experimental results show that the proposed model outperforms the traditional DANN and ADDA, demonstrating that combining bidirectional feature learning and generative networks is an effective solution in the field of domain adaptation. Through various experiments and evaluations on representative generative models and domain adaptation techniques, we verify that the proposed approach improves data and domain robustness, and we hope to contribute to the development of domain-adaptive models that are robust across domains.
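
The general recipe the abstract describes (reconstruct images with an autoencoder while adversarially aligning the two domains' features) can be sketched as follows. This is a generic adversarial-autoencoder training step under stated assumptions (flattened 28×28 inputs, small MLP networks, an illustrative loss weight), not the authors' exact model.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    enc = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 64))
    dec = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784))
    disc = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))

    opt_ae = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
    bce = nn.BCEWithLogitsLoss()

    def train_step(x_src, x_tgt, lam=0.1):
        z_src, z_tgt = enc(x_src), enc(x_tgt)
        # 1) Discriminator learns to tell source (1) from target (0) features.
        d_loss = (bce(disc(z_src.detach()), torch.ones(len(x_src), 1)) +
                  bce(disc(z_tgt.detach()), torch.zeros(len(x_tgt), 1)))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()
        # 2) Autoencoder reconstructs both domains while making target
        #    features indistinguishable from source features.
        rec = F.mse_loss(dec(z_src), x_src) + F.mse_loss(dec(z_tgt), x_tgt)
        adv = bce(disc(z_tgt), torch.ones(len(x_tgt), 1))
        ae_loss = rec + lam * adv
        opt_ae.zero_grad(); ae_loss.backward(); opt_ae.step()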

17 pages, 4743 KiB  
Article
Automatic Fruits Freshness Classification Using CNN and Transfer Learning
by Umer Amin, Muhammad Imran Shahzad, Aamir Shahzad, Mohsin Shahzad, Uzair Khan and Zahid Mahmood
Appl. Sci. 2023, 13(14), 8087; https://doi.org/10.3390/app13148087 - 11 Jul 2023
Cited by 3 | Viewed by 2729
Abstract
Fruit freshness categorization is crucial in the agriculture industry. A system that precisely assesses fruit freshness is required to save the labor costs associated with discarding rotten fruit during the manufacturing stage. Deep convolutional neural networks (DCNNs), a subset of modern machine learning techniques, have been used to classify images with success, and many modified CNN designs have recently added progressively more layers to achieve better classification accuracy. This study proposes an efficient and accurate fruit freshness classification method comprising several interconnected steps. After the fruit data are gathered, they are preprocessed using color uniforming, image resizing, augmentation, and image labelling. The AlexNet model is then loaded, with eight layers comprising five convolutional layers and three fully connected layers, and transfer learning and fine tuning of the CNN are performed. In the final stage, a softmax classifier produces the final classification. Detailed simulations are performed on three publicly available datasets. Our proposed model achieved highly favorable results, with accuracies of 98.2%, 99.8%, and 99.3% on the aforesaid datasets, respectively. In addition, the developed method is computationally efficient, taking 8 ms on average to yield the final classification result.
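
A minimal sketch of the transfer-learning step, assuming torchvision's ImageNet-pretrained AlexNet (which has the same five convolutional and three fully connected layers the paper describes); the six-class head (fresh/rotten for three fruit types) is an assumption for illustration.

    import torch.nn as nn
    from torchvision import models

    model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
    NUM_CLASSES = 6  # assumption: fresh/rotten x three fruit types
    model.classifier[6] = nn.Linear(4096, NUM_CLASSES)  # new output layer
    # Freeze the convolutional features and fine-tune only the classifier;
    # the softmax stage is applied implicitly by CrossEntropyLoss during training.
    for p in model.features.parameters():
        p.requires_grad = False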

18 pages, 11033 KiB  
Article
PhotoMatch: An Open-Source Tool for Multi-View and Multi-Modal Feature-Based Image Matching
by Esteban Ruiz de Oña, Inés Barbero-García, Diego González-Aguilera, Fabio Remondino, Pablo Rodríguez-Gonzálvez and David Hernández-López
Appl. Sci. 2023, 13(9), 5467; https://doi.org/10.3390/app13095467 - 27 Apr 2023
Cited by 1 | Viewed by 2575
Abstract
The accurate and reliable extraction and matching of distinctive features (keypoints) in multi-view and multi-modal datasets is still an open research topic in the photogrammetric and computer vision communities. One of the main challenges is selecting the method best suited to a specific application. This encouraged us to develop an educational tool that brings together different hand-crafted and learning-based feature-extraction methods. This article presents PhotoMatch, a didactical, open-source tool for multi-view and multi-modal feature-based image matching. The software includes a wide range of state-of-the-art methodologies for preprocessing, feature extraction and matching, including deep learning detectors and descriptors. It also provides tools for a detailed assessment and comparison of the different approaches, allowing the user to select the best combination of methods for each specific multi-view and multi-modal dataset. The first version of the tool received an award from the ISPRS (ISPRS Scientific Initiatives, 2019). A set of thirteen case studies, including six multi-view and six multi-modal image datasets, is processed by following different methodologies, and the results provided by the software are analysed to show the capabilities of the tool. The PhotoMatch installer and the source code are freely available.
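
As a taste of the kind of hand-crafted pipeline PhotoMatch wraps, a minimal OpenCV baseline (SIFT keypoints, brute-force matching, Lowe's ratio test) follows; the image file names are placeholders.

    import cv2

    img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)

    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    # Lowe's ratio test: keep a match only if it is clearly better than
    # the second-best candidate.
    good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
            if m.distance < 0.75 * n.distance]
    print(f"{len(good)} putative matches")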

29 pages, 7559 KiB  
Article
Towards Automatic License Plate Recognition in Challenging Conditions
by Fahd Sultan, Khurram Khan, Yasir Ali Shah, Mohsin Shahzad, Uzair Khan and Zahid Mahmood
Appl. Sci. 2023, 13(6), 3956; https://doi.org/10.3390/app13063956 - 20 Mar 2023
Cited by 6 | Viewed by 4131
Abstract
License plate recognition (LPR) is an integral part of current intelligent systems that are developed to locate and identify various objects. Unfortunately, LPR is a challenging task due to various factors, such as the numerous shapes and designs of LPs, non-adherence to standard LP templates, irregular outlines, angle variations, and occlusion. These factors drastically influence the LP appearance and significantly challenge the detection and recognition abilities of state-of-the-art algorithms. However, recent trends in the development of machine learning algorithms have yielded encouraging solutions. This paper presents a novel LPR method to address the aforesaid issues. The proposed method is composed of three distinct but interconnected steps. First, a vehicle that appears in an input image is detected using the Faster RCNN. Next, the LP area is located within the detected vehicle by using morphological operations. Finally, license plate recognition is accomplished using a deep learning network. Detailed simulations performed on the PKU, AOLP, and CCPD databases indicate that our developed approach produces mean license plate recognition accuracies of 99%, 96.0231%, and 98.7000% on the aforesaid databases.
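
The middle stage (locating the LP inside the detected vehicle with morphological operations) is not spelled out in the abstract, so the sketch below shows one common morphological recipe: a black-hat transform to emphasize dark characters on a light plate, Otsu thresholding, and an aspect-ratio test on the resulting contours. Kernel sizes and ratio bounds are illustrative assumptions.

    import cv2

    vehicle = cv2.imread("vehicle_crop.png")  # placeholder: detector output
    gray = cv2.cvtColor(vehicle, cv2.COLOR_BGR2GRAY)

    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25, 7))  # plate-shaped
    blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, kernel)
    _, binary = cv2.threshold(blackhat, 0, 255,
                              cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    binary = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if 2.0 < w / float(h) < 6.0:  # typical plate aspect ratio
            plate = vehicle[y:y + h, x:x + w]  # candidate for recognition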

29 pages, 4916 KiB  
Article
A Fast and Accurate Real-Time Vehicle Detection Method Using Deep Learning for Unconstrained Environments
by Annam Farid, Farhan Hussain, Khurram Khan, Mohsin Shahzad, Uzair Khan and Zahid Mahmood
Appl. Sci. 2023, 13(5), 3059; https://doi.org/10.3390/app13053059 - 27 Feb 2023
Cited by 22 | Viewed by 6016
Abstract
Deep learning-based classification and detection algorithms have emerged as a powerful tool for vehicle detection in intelligent transportation systems. The limited number of high-quality labeled training samples makes single vehicle detection methods incapable of achieving acceptable accuracy in road vehicle detection. This paper presents the detection and classification of vehicles on publicly available datasets by utilizing the YOLO-v5 architecture, applying transfer learning by fine-tuning the weights of the pre-trained YOLO-v5 network. To employ transfer learning, the authors collected extensive datasets of images and videos of congested traffic patterns. These datasets were made more comprehensive by covering various attributes, for instance, high- and low-density traffic patterns, occlusions, and different weather circumstances, and all of the gathered data were manually annotated. Ultimately, the improved YOLO-v5 structure adapts to difficult traffic patterns. By fine-tuning the pre-trained network on our datasets, our proposed YOLO-v5 exceeds several traditional vehicle detection methods in terms of detection accuracy and execution time. Detailed simulations performed on the PKU, COCO, and DAWN datasets demonstrate the effectiveness of the proposed method in various challenging situations.
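
For orientation, loading a pre-trained YOLOv5 model from the public ultralytics/yolov5 hub and running inference takes only a few lines; fine-tuning on a custom annotated dataset is done with the repository's train.py script. The image name and dataset YAML below are placeholders, not the authors' files.

    import torch

    model = torch.hub.load("ultralytics/yolov5", "yolov5s")  # pre-trained weights
    results = model("traffic_scene.jpg")  # single image, list, or video frame
    results.print()  # per-class detections and inference speed

    # Fine-tuning on a custom dataset (shell command, run inside the repo):
    #   python train.py --data vehicles.yaml --weights yolov5s.pt \
    #       --img 640 --epochs 100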

15 pages, 6195 KiB  
Article
Recognition and Classification of Handwritten Urdu Numerals Using Deep Learning Techniques
by Aamna Bhatti, Ameera Arif, Waqar Khalid, Baber Khan, Ahmad Ali, Shehzad Khalid and Atiq ur Rehman
Appl. Sci. 2023, 13(3), 1624; https://doi.org/10.3390/app13031624 - 27 Jan 2023
Cited by 5 | Viewed by 2336
Abstract
Urdu is a complex language, as it is an amalgam of many South Asian and East Asian languages; hence, its character recognition is a huge and difficult task. It is a bidirectional language, with its numerals written from left to right while its script is written in the opposite direction, which induces complexities in the recognition process. This paper presents the recognition and classification of a novel Urdu numeral dataset using convolutional neural networks (CNNs) and their variants. We propose a custom CNN model to extract features, which are classified by a softmax activation function and a support vector machine (SVM) classifier. We compare it with GoogLeNet and the residual network (ResNet) in terms of performance. Our proposed CNN gives an accuracy of 98.41% with the softmax classifier and 99.0% with the SVM classifier; we achieve accuracies of 95.61% with GoogLeNet and 96.4% with ResNet. Moreover, we develop datasets of handwritten Urdu numbers and numbers on Pakistani currency to incorporate real-life problems. Our models achieve the best accuracies compared to previous models in the literature for optical character recognition (OCR).
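
The two-stage variant (CNN features fed to an SVM) can be sketched as follows; the layer sizes are illustrative, not the paper's exact architecture. The convolutional trunk is first trained end-to-end with a softmax head, then reused as a fixed feature extractor for the SVM.

    import torch.nn as nn
    from sklearn.svm import SVC

    # Small convolutional trunk used as the feature extractor (assumed sizes).
    features = nn.Sequential(
        nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(),
    )

    # After training, embed the images and fit the SVM on the deep features
    # (train_images / train_labels are placeholders):
    #   X = features(train_images).detach().numpy()
    #   svm = SVC(kernel="rbf").fit(X, train_labels)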

23 pages, 37343 KiB  
Article
Coarse-to-Fine Structure-Aware Artistic Style Transfer
by Kunxiao Liu, Guowu Yuan, Hao Wu and Wenhua Qian
Appl. Sci. 2023, 13(2), 952; https://doi.org/10.3390/app13020952 - 10 Jan 2023
Viewed by 1665
Abstract
Artistic style transfer aims to use a style image and a content image to synthesize a target image that retains the same artistic expression as the style image while preserving the basic content of the content image. Many recently proposed style transfer methods have a common problem; that is, they simply transfer the texture and color of the style image to the global structure of the content image. As a result, the content image has a local structure that is not similar to the local structure of the style image. In this paper, we present an effective method that can be used to transfer style patterns while fusing the local style structure to the local content structure. In our method, different levels of coarse stylized features are first reconstructed at low resolution using a coarse network, in which style color distribution is roughly transferred, and the content structure is combined with the style structure. Then, the reconstructed features and the content features are adopted to synthesize high-quality structure-aware stylized images with high resolution using a fine network with three structural selective fusion (SSF) modules. The effectiveness of our method is demonstrated through the generation of appealing high-quality stylization results and a comparison with some state-of-the-art style transfer methods.
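
The abstract does not detail the SSF modules, so the following PyTorch sketch shows one plausible reading of "structural selective fusion": a learned gate decides, per spatial location and channel, how much stylized structure to blend into the content features. This is an assumption, not the paper's exact design.

    import torch
    import torch.nn as nn

    class SSF(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.gate = nn.Sequential(
                nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
                nn.Sigmoid(),
            )

        def forward(self, content_feat, stylized_feat):
            # Gate in [0, 1] selects between stylized and content structure.
            g = self.gate(torch.cat([content_feat, stylized_feat], dim=1))
            return g * stylized_feat + (1.0 - g) * content_feat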

Review


27 pages, 2712 KiB  
Review
A Survey of OCR in Arabic Language: Applications, Techniques, and Challenges
by Safiullah Faizullah, Muhammad Sohaib Ayub, Sajid Hussain and Muhammad Asad Khan
Appl. Sci. 2023, 13(7), 4584; https://doi.org/10.3390/app13074584 - 04 Apr 2023
Cited by 13 | Viewed by 6848
Abstract
Optical character recognition (OCR) is the process of extracting handwritten or printed text from a scanned or printed image and converting it to a machine-readable form for further data processing, such as searching or editing. Automatic text extraction using OCR helps to digitize documents for improved productivity and accessibility and for the preservation of historical documents. This paper provides a survey of the current state-of-the-art applications, techniques, and challenges in Arabic OCR. We present the existing methods for each step of the complete OCR process to identify the best-performing approaches for improved results. The review follows the keyword-search method for selecting articles related to Arabic OCR, including the backward and forward citations of each article. In addition to state-of-the-art techniques, this paper identifies research gaps and presents future directions for Arabic OCR.
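
As a minimal end-to-end example of the OCR process the survey covers, Tesseract's Arabic model can convert a scanned page to machine-readable text in a few lines (the Tesseract engine and its "ara" language pack must be installed; the file name is a placeholder).

    import pytesseract
    from PIL import Image

    page = Image.open("arabic_page.png")                  # scanned document image
    text = pytesseract.image_to_string(page, lang="ara")  # Arabic language pack
    print(text)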

24 pages, 3860 KiB  
Review
On the Use of Deep Learning for Video Classification
by Atiq ur Rehman, Samir Brahim Belhaouari, Md Alamgir Kabir and Adnan Khan
Appl. Sci. 2023, 13(3), 2007; https://doi.org/10.3390/app13032007 - 03 Feb 2023
Cited by 10 | Viewed by 4815
Abstract
The video classification task has seen significant success in recent years. Specifically, the topic has gained more attention after the emergence of deep learning models as a successful tool for automatically classifying videos. In recognition of the importance of the video classification task, and to summarize the success of deep learning models for this task, this paper presents a comprehensive and concise review of the topic. Several reviews and survey papers related to video classification exist in the scientific literature; however, they do not include the most recent state-of-the-art works and have other limitations. To provide an updated and concise review, this paper highlights the key findings based on the existing deep learning models, discussed in a way that provides future research directions. The review mainly focuses on the types of network architecture used, the evaluation criteria used to measure success, and the datasets used. To make the review self-contained, the emergence of deep learning methods for automatic video classification and the state-of-the-art deep learning methods are explained and summarized, and clear insight into both the newly developed deep learning architectures and the traditional approaches is provided. The critical challenges, based on the benchmarks, are highlighted for evaluating the technical progress of these methods. The paper also summarizes the benchmark datasets and the performance evaluation metrics for video classification. Based on this compact, complete, and concise review, the paper proposes new research directions to solve the challenging video classification problem.

Other


14 pages, 7075 KiB  
Technical Note
Quality Analysis of Unmanned Aerial Vehicle Images Using a Resolution Target
by Jin-Hyo Kim and Sang-Min Sung
Appl. Sci. 2024, 14(5), 2154; https://doi.org/10.3390/app14052154 - 04 Mar 2024
Viewed by 480
Abstract
Unmanned aerial vehicle (UAV) photogrammetry is an emerging means of rapidly acquiring high-precision spatial information and data because it is cost-effective and highly efficient. However, securing uniform quality in the results of UAV photogrammetry is difficult due to the use of low-cost navigation devices and non-surveying cameras and to rapid changes in shooting position depending on the aircraft's behavior. In addition, no specific procedures or guidelines exist for performing quantitative quality tests or certification of UAV images. Existing test tools for UAV image quality assessment rely only on the ground sample distance (GSD), often resulting in lower image quality than that of manned aircraft images. In this study, we performed a modulation transfer function (MTF) analysis using a slanted-edge target, together with a GSD analysis, to confirm the necessity of MTF analysis in UAV image quality assessment. Furthermore, we analyzed the impact of flight height and the mounted sensors on image quality at different study sites.
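
The GSD criterion that existing test tools rely on reduces to a one-line formula, GSD = pixel pitch × flight height / focal length; a hedged sketch with placeholder sensor values (not those of the study) follows.

    def ground_sample_distance(pixel_pitch_m, focal_length_m, height_m):
        # Ground footprint of one pixel, in meters, for a nadir-looking camera.
        return pixel_pitch_m * height_m / focal_length_m

    gsd = ground_sample_distance(pixel_pitch_m=2.4e-6,   # 2.4 um pixel (assumed)
                                 focal_length_m=8.8e-3,  # 8.8 mm lens (assumed)
                                 height_m=100.0)         # flight height
    print(f"GSD = {gsd * 100:.1f} cm/pixel")  # about 2.7 cm at 100 m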
