Signal, Image and Video Processing: Development and Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Circuit and Signal Processing".

Deadline for manuscript submissions: 30 September 2024 | Viewed by 5582

Special Issue Editors


Guest Editor
Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin 150001, China
Interests: machine vision; deep learning; artificial intelligence; data-based industrial modeling and measurement

Guest Editor
School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510641, China
Interests: computer vision; pattern recognition; machine learning

Guest Editor
College of Electrical Engineering and Control Science, Nanjing Tech University, Nanjing 211800, China
Interests: image processing; machine vision; deep learning; artificial intelligence

Special Issue Information

Dear Colleagues,

Modern complex systems are expected to exhibit human-like intelligence, and vision is an indispensable means by which such systems achieve automation and intelligence. Vision carries rich information about a system and its operating environment in the form of signals, images and videos; through information mining, valuable knowledge can be extracted to support the automation and intelligence of complex systems. Owing to the complexity of these systems, the variability and harshness of operating environments, and the diversity of perception tasks, visual perception faces many challenges in data processing, model development, knowledge representation, and beyond. These challenges prompt academic researchers and engineering practitioners to make sustained efforts to advance visual perception technology and its application in various fields.

This Special Issue aims to narrow the gap between the growing practical demand for visual perception and the current state of visual perception technology. It focuses on, but is not limited to, advanced machine-learning- and deep-learning-based signal-, image-, and video-processing technologies. Related topics include data acquisition, data quality enhancement, segmentation, representation and description, feature matching, and motion tracking; technologies related to large-scale/lightweight deep neural networks, domain adaptation, and transfer learning with industrial applications are especially welcome. This Special Issue provides a platform for researchers and practitioners to present original and innovative results on new models, methods, and engineering solutions.

Prof. Dr. Xianqiang Yang
Prof. Dr. Changxing Ding
Dr. Zhihao Zhang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine-learning- and deep-learning-based signal, image, and video processing
  • signal, image, and video acquisition
  • data transform and filtering
  • data quality enhancement
  • image and video segmentation
  • image and video representation and description
  • feature extraction, compression, description, and matching based on image and video
  • detection, recognition, classification, and measurement based on image and video
  • motion analysis and object tracking based on image and video
  • large-scale deep neural network development and applications
  • lightweight deep neural network development and applications in embedded devices
  • domain adaptation and transfer learning
  • image- and video-processing-based industrial applications

Published Papers (6 papers)


Research

18 pages, 7120 KiB  
Article
Enhancing Crowd-Sourced Video Sharing through P2P-Assisted HTTP Video Streaming
by Jieran Geng and Satoshi Fujita
Electronics 2024, 13(7), 1270; https://doi.org/10.3390/electronics13071270 - 29 Mar 2024
Viewed by 398
Abstract
This paper introduces a decentralized architecture designed for the sharing and distribution of user-generated video streams. The proposed system employs HTTP Live Streaming (HLS) as the delivery method for these video streams. In the architecture, a creator who captures a video stream using a smartphone camera subsequently transcodes it into a sequence of video chunks called HLS segments. These chunks are then stored in a distributed manner across the worker network, which forms the core of the proposed architecture. Although a coordinator bootstraps the worker network, the selection of worker nodes for storing generated video chunks and the autonomous load balancing among worker nodes are conducted in a decentralized fashion, eliminating the need for central servers. The worker network is implemented using the Golang-based IPFS (InterPlanetary File System) client, called kubo, leveraging essential IPFS functionalities such as node identification through Kademlia-DHT and message exchange using Bitswap. Beyond merely delivering stored video streams, the worker network can also amalgamate multiple streams to create a new composite stream. This bundling of multiple video streams into a unified video stream is executed on the worker nodes, making effective use of the FFmpeg library. To enhance download efficiency, parallel downloading with multiple threads is employed for retrieving the video stream from the worker network to the requester, thereby reducing download time. Experiments conducted on the prototype system indicate that, in terms of the transmission time of requested video streams, the proposed system holds a significant advantage over a server-based system using AWS; this advantage is particularly evident for low-resolution video streams and becomes more pronounced as the stream length increases. Furthermore, the system demonstrates a clear advantage in scenarios characterized by a substantial volume of viewing requests. Full article
(This article belongs to the Special Issue Signal, Image and Video Processing: Development and Applications)
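The multi-threaded segment retrieval described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `fetch_segment` is a hypothetical stand-in for the actual IPFS/kubo block fetch, and the thread-pool fan-out shows only the ordering-preserving reassembly of HLS segments.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for retrieving one HLS segment from a worker
# node (in the paper this would be an IPFS/kubo fetch over Bitswap).
def fetch_segment(index: int) -> bytes:
    return f"segment-{index}".encode()

def parallel_download(num_segments: int, workers: int = 4) -> bytes:
    """Fetch all HLS segments concurrently, then concatenate them
    in playback order to reconstruct the stream."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the joined result is
        # already in correct playback order.
        chunks = list(pool.map(fetch_segment, range(num_segments)))
    return b"".join(chunks)

stream = parallel_download(3)
```

Because segments are fixed-duration and independently addressable, each thread can fetch from a different worker node, which is what reduces the end-to-end download time in the reported experiments.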

24 pages, 5497 KiB  
Article
Fast Decision-Tree-Based Series Partitioning and Mode Prediction Termination Algorithm for H.266/VVC
by Ye Li, Zhihao He and Qiuwen Zhang
Electronics 2024, 13(7), 1250; https://doi.org/10.3390/electronics13071250 - 27 Mar 2024
Viewed by 509
Abstract
With the advancement of network technology, multimedia videos have emerged as a crucial channel through which individuals access external information, owing to their realistic and intuitive effects. For high-frame-rate and high-dynamic-range videos, the coding efficiency of high-efficiency video coding (HEVC) falls short of the storage and transmission demands of the video content. Therefore, versatile video coding (VVC) introduces a nested quadtree plus multi-type tree (QTMT) partitioning structure on top of the HEVC standard, while also expanding the intra-prediction modes from 35 to 67. While the new technology introduced by VVC enhances compression performance, it also introduces higher computational complexity. To enhance coding efficiency and reduce computational complexity, this paper explores two key aspects: coding unit (CU) partition decision-making and intra-frame mode selection. Firstly, to address the flexible partitioning structure of QTMT, we propose a decision-tree-based series partitioning decision algorithm. By concatenating the quadtree (QT) partition decision with the multi-type tree (MT) partition decision, a strategy is implemented to determine whether to skip the MT decision based on texture characteristics. If the MT partition decision is taken, four decision tree classifiers are used to judge the different partition types. Secondly, for intra-frame mode selection, this paper proposes an ensemble-learning-based mode prediction termination algorithm. By reordering the complete candidate modes and assessing prediction accuracy, redundant candidate modes are terminated early. Experimental results show that, compared with the VVC test model (VTM), the proposed algorithm achieves an average time saving of 54.74%, while the BDBR increases by only 1.61%. Full article
(This article belongs to the Special Issue Signal, Image and Video Processing: Development and Applications)
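The texture-based early-skip idea in the abstract can be illustrated with a toy sketch: a homogeneous block (low pixel variance) is unlikely to benefit from multi-type tree (MT) splitting, so the MT decision can be skipped. The variance threshold here is a hypothetical stand-in for the paper's trained decision-tree classifiers.

```python
# Toy sketch of a texture-based MT-skip decision; the real algorithm
# uses trained decision trees over several texture features, not a
# single hand-picked variance threshold.

def block_variance(block):
    """Pixel variance of a 2-D block given as a list of rows."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)

def skip_mt_decision(block, threshold=25.0):
    """Return True when the block is homogeneous enough that the
    multi-type tree partition decision can be skipped."""
    return block_variance(block) < threshold

flat = [[128] * 8 for _ in range(8)]            # homogeneous block
edge = [[0] * 4 + [255] * 4 for _ in range(8)]  # strong vertical edge
```

A flat block is classified as skippable, while a block containing a strong edge proceeds to the full MT decision, where (per the paper) four classifiers then choose among the partition types.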

34 pages, 39340 KiB  
Article
An RGB Pseudo-Colorization Method for Filtering of Multi-Source Graphical Data
by Ireneusz Kubiak and Artur Przybysz
Electronics 2023, 12(22), 4583; https://doi.org/10.3390/electronics12224583 - 09 Nov 2023
Cited by 1 | Viewed by 838
Abstract
Artificial colorization (pseudo-colorization) is a commonly used method to improve the readability of images obtained from sources (sensors) that do not reflect the original color of the object of observation (e.g., X-ray). It is designed to draw the observer's attention to the important details of the analyzed image (e.g., disease changes in medical imaging). Analogous needs occur in the process of assessing the emission security (EMSEC) of imaging devices used to process classified information, which is performed on the basis of the analysis of images reproduced from compromising emanations related to the operation of these devices. The presence of many graphic elements in an image may reduce the level of perception of the information contained in it. Such images may be very noisy or contain overlapping graphic symbols originating from devices processing graphic information that operate in close proximity to each other. Various measures enabling data filtration at different stages of processing, e.g., the use of a directional antenna, frequency filtering, point filtering or contextual contrast modification, do not always prove effective. One solution to the filtration problem is the pseudo-colorization of the image. However, image colorization based on the typical "Hot", "Radar" or "Cold" color palettes does not meet the requirements for filtering graphic data from many sources. It is necessary to use a filter that allows a sharp cut-off of graphic data at the border between the background and the graphic symbol. For the pseudo-colorization process itself, an exponential function transforming the amplitudes of image pixels from the gray color space to the RGB color space is sufficient. However, the smooth transition of this function from zero to values greater than zero results in a low efficiency of filtering graphic data from noise.
In this article, a method of filtering an image based on the pseudo-colorization of its content, i.e., mapping the compromising emanation signal level to the RGB values of image pixel color components, is proposed, with a quadratic function as the transformation function. The higher effectiveness of the method based on the quadratic function (compared to the exponential function) was demonstrated by conducting tests on many images, some of which are presented in this article. The proposed solution is a universal approach and can be used in various fields related to image analysis and filtration. Its universality stems from the possibility of changing the function parameters that control its position on the value axis from 0 to 255, its width, and its minimum and maximum value for each RGB channel. Full article
(This article belongs to the Special Issue Signal, Image and Video Processing: Development and Applications)

13 pages, 2189 KiB  
Article
Resource-Efficient Optimization for FPGA-Based Convolution Accelerator
by Yanhua Ma, Qican Xu and Zerui Song
Electronics 2023, 12(20), 4333; https://doi.org/10.3390/electronics12204333 - 19 Oct 2023
Viewed by 870
Abstract
Convolution is one of the most essential operations in FPGA-based hardware accelerators. However, existing designs often neglect the inherent architecture of the FPGA, which poses a severe challenge to hardware resources. Even though some previous works have proposed approximate multipliers or convolution acceleration algorithms to deal with this issue, the inevitable accuracy loss and resource occupation easily lead to performance degradation. To this end, we first propose two kinds of resource-efficient, optimized, accurate multipliers based on LUTs or carry chains. Then, targeting FPGA-based platforms, a generic multiply–accumulate structure is constructed by directly accumulating the partial products produced by our optimized radix-4 Booth multipliers, without intermediate multiplication and addition results. Experimental results demonstrate that the proposed multiplier achieves up to a 51% look-up-table (LUT) reduction compared to the Vivado area-optimized multiplier IP. Furthermore, the convolutional processing unit using the proposed structure achieves a 36% LUT reduction compared to existing methods. As case studies, the proposed method is applied to the DCT transform, LeNet, and MobileNet-V3, achieving hardware resource savings without loss of accuracy. Full article
(This article belongs to the Special Issue Signal, Image and Video Processing: Development and Applications)
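The core idea of accumulating Booth partial products directly, rather than computing each product first, can be shown behaviorally. This is a software sketch of standard radix-4 Booth recoding for two's-complement operands, not the paper's carry-chain hardware design.

```python
def booth4_digits(b, width=8):
    """Radix-4 Booth recoding of a `width`-bit two's-complement
    multiplier into digits in {-2, -1, 0, 1, 2}, so that
    b == sum(d_i * 4**i)."""
    bits = b & ((1 << width) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit b_{-1} = 0
    for i in range(0, width, 2):
        b0 = (bits >> i) & 1
        b1 = (bits >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits

def booth4_mac(a, b, acc=0, width=8):
    """Multiply-accumulate that folds each Booth partial product
    straight into the accumulator, with no intermediate full
    multiplication result (behavioral model of the paper's idea)."""
    for i, d in enumerate(booth4_digits(b, width)):
        acc += d * a * (4 ** i)
    return acc
```

Each digit needs only a shift/negate of `a` (0, ±a, ±2a), which is why radix-4 Booth halves the number of partial products and maps cheaply onto FPGA carry chains.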

15 pages, 16797 KiB  
Article
An Unsupervised Fundus Image Enhancement Method with Multi-Scale Transformer and Unreferenced Loss
by Yanzhe Hu, Yu Li, Hua Zou and Xuedong Zhang
Electronics 2023, 12(13), 2941; https://doi.org/10.3390/electronics12132941 - 04 Jul 2023
Viewed by 885
Abstract
Color fundus images are now widely used in computer-aided analysis systems for ophthalmic diseases. However, fundus imaging can be affected by human, environmental, and equipment factors, which may result in low-quality images, and such low-quality fundus images interfere with computer-aided diagnosis. Existing methods for enhancing low-quality fundus images focus more on the overall visualization of the image than on sufficiently capturing pathological and structural features at the finer scales of the fundus image. In this paper, we design an unsupervised method that integrates a multi-scale feature fusion transformer and an unreferenced loss function. To counter the loss of microscale features caused by unpaired training, we construct the Global Feature Extraction Module (GFEM), a combination of convolution blocks and residual Swin Transformer modules, to extract feature information at different levels while reducing computational costs. To mitigate the blurring of image details caused by deep unsupervised networks, we define unreferenced loss functions that improve the model's ability to suppress edge sharpness degradation. In addition, since uneven light distribution can also affect image quality, we use an a priori luminance-based attention mechanism to correct uneven illumination in low-quality images. On the public dataset, we achieve an improvement of 0.88 dB in PSNR and 0.024 in SSIM compared to the state-of-the-art methods. Experimental results show that our method outperforms other deep learning methods in terms of vascular continuity and preservation of fine pathological features. Such a framework may have potential medical applications. Full article
(This article belongs to the Special Issue Signal, Image and Video Processing: Development and Applications)
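The reported gains are measured in PSNR, a standard full-reference quality metric; for readers unfamiliar with it, a minimal computation is sketched below (the paper's evaluation pipeline and datasets are not reproduced here, only the metric's definition).

```python
import math

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio (in dB) between two equal-size
    images given as flat lists of pixel values:
    PSNR = 10 * log10(peak^2 / MSE)."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

A 0.88 dB PSNR improvement, as reported in the abstract, corresponds to a roughly 18% reduction in mean squared error relative to the baseline, since PSNR is logarithmic in MSE.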

17 pages, 1080 KiB  
Article
Hardware Architecture for Realtime HEVC Intra Prediction
by Duc Khai Lam, Pham The Anh Nguyen and Tuan Anh Tran
Electronics 2023, 12(7), 1705; https://doi.org/10.3390/electronics12071705 - 04 Apr 2023
Viewed by 1492
Abstract
Researchers have in recent times achieved excellent compression efficiency by implementing more complicated compression algorithms, driven by the rapid development of video compression. As a result, the video compression standard High-Efficiency Video Coding (HEVC) provides high-quality video output while requiring less bandwidth. However, the intra-prediction technique in HEVC entails significant processing complexity. This research provides a fully pipelined hardware architecture capable of real-time compression to minimize computing complexity. All prediction unit sizes of 4×4, 8×8, 16×16, and 32×32, and all planar, angular, and DC modes are supported by the proposed solution. Synthesis results mapped to a Xilinx Virtex 7 reveal that the solution delivers real-time output at 210 frames per second (FPS) at 1920×1080 (Full High Definition, FHD) resolution, or 52 FPS at 3840×2160 (4K) resolution, while operating at a maximum frequency of 232 MHz. Full article
(This article belongs to the Special Issue Signal, Image and Video Processing: Development and Applications)
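Of the supported modes, DC prediction is the simplest and illustrates what the hardware must compute per prediction unit. The sketch below follows the standard HEVC DC rule (every predicted sample is the rounded mean of the N top and N left reference samples); HEVC's additional boundary smoothing for small luma blocks is omitted for brevity, and this is a behavioral model, not the paper's pipelined architecture.

```python
def dc_prediction(top, left):
    """DC intra prediction for an N x N prediction unit.
    top, left: the N reconstructed reference samples above and to
    the left of the block. Returns the predicted block as rows.
    For power-of-two N, n.bit_length() == log2(N) + 1, which is the
    shift needed to divide the sum of 2N samples with rounding."""
    n = len(top)
    dc = (sum(top) + sum(left) + n) >> n.bit_length()
    return [[dc] * n for _ in range(n)]
```

In hardware, this reduces to an adder tree over the 2N reference samples plus a shift, which is one reason DC mode costs far less than the 33 angular modes and pipelines easily.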
