Machine Learning for Multimedia Communications

A topical collection in Sensors (ISSN 1424-8220). This collection belongs to the section "Communications".

Viewed by 21711

Editors


Collection Editor
University of Essex, Colchester, UK
Interests: machine learning for communications; multimedia communications; network coding; information-centric networking; joint source and channel coding; signal processing and sensor networks

Collection Editor
University of Essex, Colchester, UK
Interests: deep neural networks for joint source-channel coding; machine learning; network coding; wireless edge caching; multimedia communications

Topical Collection Information

Dear Colleagues,

Despite recent advances in 5G-and-beyond systems and multimedia coding techniques, demand continues to grow for the ubiquitous delivery of high-quality multimedia data, ranging from high-resolution video to immersive applications such as AR/VR/MR. This demand poses significant challenges for existing multimedia coding techniques and communication platforms, which struggle to meet stringent requirements for low latency, high bandwidth, and ultra-reliability. Machine learning has recently attracted significant attention from the multimedia community as a key enabler for designing and building more reliable, efficient, and scalable multimedia communication systems. This Topical Collection will publish the latest research and findings in machine learning-enabled multimedia coding and communication systems for improved resilience, efficient coding, and reduced latency.

Topics of interest include but are not limited to the following:

  • Machine learning for image/video communications
  • Machine learning for immersive communications
  • Machine learning for resource allocation in multimedia communications
  • Rate control for machine learning-based video coding
  • Machine learning for image/video coding
  • Machine learning for network orchestration in multimedia applications
  • Machine learning-based multimedia quality assessment
  • Machine learning for multimedia-enabled IoT
  • Machine learning-assisted cloud/edge/fog management for multimedia applications

Dr. Nikolaos Thomos
Dr. Eirina Bourtsoulatze
Collection Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the collection website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (7 papers)

2022


20 pages, 4131 KiB  
Article
Reinforcement Learning-Based Adaptive Streaming Scheme with Edge Computing Assistance
by Minsu Kim and Kwangsue Chung
Sensors 2022, 22(6), 2171; https://doi.org/10.3390/s22062171 - 10 Mar 2022
Cited by 4 | Viewed by 2503
Abstract
Dynamic Adaptive Streaming over HTTP (DASH) is a promising scheme for improving the Quality of Experience (QoE) of users in video streaming. However, the existing schemes do not perform coordination among clients and depend on fixed heuristics. In this paper, we propose an adaptive streaming scheme with reinforcement learning in edge computing environments. The proposed scheme improves the overall QoE of clients and QoE fairness among clients based on a state-of-the-art reinforcement learning algorithm. Edge computing assistance plays a role in providing client-side observations to the mobile edge, making agents utilize this information when generating a policy for multi-client adaptive streaming. We evaluated the proposed scheme through simulation-based experiments under various network conditions. The experimental results show that the proposed scheme achieves better performance than the existing schemes.

31 pages, 2552 KiB  
Review
Machine Learning for Multimedia Communications
by Nikolaos Thomos, Thomas Maugey and Laura Toni
Sensors 2022, 22(3), 819; https://doi.org/10.3390/s22030819 - 21 Jan 2022
Cited by 4 | Viewed by 3448
Abstract
Machine learning is revolutionizing the way multimedia information is processed and transmitted to users. After intensive and powerful training, some impressive efficiency/accuracy improvements have been made all over the transmission pipeline. For example, the high model capacity of the learning-based architectures enables us to accurately model the image and video behavior such that tremendous compression gains can be achieved. Similarly, error concealment, streaming strategy or even user perception modeling have widely benefited from the recent learning-oriented developments. However, learning-based algorithms often imply drastic changes to the way data are represented or consumed, meaning that the overall pipeline can be affected even though a subpart of it is optimized. In this paper, we review the recent major advances that have been proposed all across the transmission chain, and we discuss their potential impact and the research challenges that they raise.

2021


24 pages, 16451 KiB  
Article
A Flexible Coding Scheme Based on Block Krylov Subspace Approximation for Light Field Displays with Stacked Multiplicative Layers
by Joshitha Ravishankar, Mansi Sharma and Pradeep Gopalakrishnan
Sensors 2021, 21(13), 4574; https://doi.org/10.3390/s21134574 - 04 Jul 2021
Cited by 11 | Viewed by 3283
Abstract
To create a realistic 3D perception on glasses-free displays, it is critical to support continuous motion parallax, greater depths of field, and wider fields of view. A new type of Layered or Tensor light field 3D display has attracted greater attention these days. Using only a few light-attenuating pixelized layers (e.g., LCD panels), it supports many views from different viewing directions that can be displayed simultaneously with a high resolution. This paper presents a novel flexible scheme for efficient layer-based representation and lossy compression of light fields on layered displays. The proposed scheme learns stacked multiplicative layers optimized using a convolutional neural network (CNN). The intrinsic redundancy in light field data is efficiently removed by analyzing the hidden low-rank structure of multiplicative layers on a Krylov subspace. Factorization derived from Block Krylov singular value decomposition (BK-SVD) exploits the spatial correlation in layer patterns for multiplicative layers with varying low ranks. Further, encoding with HEVC eliminates inter-frame and intra-frame redundancies in the low-rank approximated representation of layers and improves the compression efficiency. The scheme is flexible to realize multiple bitrates at the decoder by adjusting the ranks of BK-SVD representation and HEVC quantization. Thus, it would complement the generality and flexibility of a data-driven CNN-based method for coding with multiple bitrates within a single training framework for practical display applications. Extensive experiments demonstrate that the proposed coding scheme achieves substantial bitrate savings compared with pseudo-sequence-based light field compression approaches and state-of-the-art JPEG and HEVC coders.

42 pages, 1125 KiB  
Article
Uplink vs. Downlink: Machine Learning-Based Quality Prediction for HTTP Adaptive Video Streaming
by Frank Loh, Fabian Poignée, Florian Wamser, Ferdinand Leidinger and Tobias Hoßfeld
Sensors 2021, 21(12), 4172; https://doi.org/10.3390/s21124172 - 17 Jun 2021
Cited by 13 | Viewed by 3616
Abstract
Streaming video is responsible for the bulk of Internet traffic these days. For this reason, Internet providers and network operators try to make predictions and assessments about the streaming quality for an end user. Current monitoring solutions are based on a variety of different machine learning approaches. The challenge for providers and operators nowadays is that existing approaches require large amounts of data. In this work, the most relevant quality of experience metrics, i.e., the initial playback delay, the video streaming quality, video quality changes, and video rebuffering events, are examined using a voluminous data set of more than 13,000 YouTube video streaming runs that were collected with the native YouTube mobile app. Three Machine Learning models are developed and compared to estimate playback behavior based on uplink request information. The main focus has been on developing a lightweight approach using as few features and as little data as possible, while maintaining state-of-the-art performance.

21 pages, 20292 KiB  
Article
Salient Region Guided Blind Image Sharpness Assessment
by Siqi Liu, Shaode Yu, Yanming Zhao, Zhulin Tao, Hang Yu and Libiao Jin
Sensors 2021, 21(12), 3963; https://doi.org/10.3390/s21123963 - 08 Jun 2021
Cited by 1 | Viewed by 1974
Abstract
Salient regions provide important cues for scene understanding to the human vision system. However, whether the detected salient regions are helpful in image blur estimation is unknown. In this study, a salient region guided blind image sharpness assessment (BISA) framework is proposed, and the effect of the detected salient regions on the BISA performance is investigated. Specifically, three salient region detection (SRD) methods and ten BISA models are jointly explored, during which the output saliency maps from SRD methods are re-organized as the input of BISA models. Consequently, the change in BISA metric values can be quantified and then directly related to the difference in BISA model inputs. Finally, experiments are conducted on three Gaussian blurring image databases, and the BISA prediction performance is evaluated. The comparison results indicate that salient region input can help achieve a close and sometimes superior performance to a BISA model over the whole image input. When using the center region input as the baseline, the detected salient regions from the saliency optimization from robust background detection (SORBD) method lead to consistently better score prediction, regardless of the BISA model. Based on the proposed hybrid framework, this study reveals that saliency detection benefits image blur estimation, while how to properly incorporate SRD methods and BISA models to improve the score prediction will be explored in our future work.

24 pages, 7929 KiB  
Article
Synchronization of Acoustic Signals for Steganographic Transmission
by Jarosław Wojtuń and Zbigniew Piotrowski
Sensors 2021, 21(10), 3379; https://doi.org/10.3390/s21103379 - 12 May 2021
Cited by 3 | Viewed by 2134
Abstract
Steganography is a technique that makes it possible to hide additional information (payload) in the original signal (cover work). This paper focuses on hiding information in a speech signal. One of the major problems with steganographic systems is ensuring synchronization. The paper presents four new and effective mechanisms that allow achievement of synchronization on the receiving side. Three of the developed methods of synchronization operate directly on the acoustic signal, while the fourth method works in the higher layer, analyzing the structure of the decoded steganographic data stream. The results of the research concerning both the evaluation of signal quality and the effectiveness of synchronization are presented. The signal quality was assessed based on both objective and subjective methods. The conducted research confirmed the effectiveness of the developed methods of synchronization during the transmission of steganographic data in the VHF radio link and in the VoIP channel.

14 pages, 3468 KiB  
Article
A New Cache Update Scheme Using Reinforcement Learning for Coded Video Streaming Systems
by Yu-Sin Kim, Jeong-Min Lee, Jong-Yeol Ryu and Tae-Won Ban
Sensors 2021, 21(8), 2867; https://doi.org/10.3390/s21082867 - 19 Apr 2021
Cited by 3 | Viewed by 2062
Abstract
As the demand for video streaming has been rapidly increasing recently, new technologies for improving the efficiency of video streaming have attracted much attention. In this paper, we thus investigate how to improve the efficiency of video streaming by using clients’ cache storage considering exclusive OR (XOR) coding-based video streaming where multiple different video contents can be simultaneously transmitted in one transmission as long as prerequisite conditions are satisfied, and the efficiency of video streaming can be thus significantly enhanced. We also propose a new cache update scheme using reinforcement learning. The proposed scheme uses a K-actor-critic (K-AC) network that can mitigate the disadvantage of actor-critic networks by yielding K candidate outputs and by selecting the final output with the highest value out of the K candidates. The K-AC exists in each client, and each client can train it by using only locally available information without any feedback or signaling so that the proposed cache update scheme is a completely decentralized scheme. The performance of the proposed cache update scheme was analyzed in terms of the average number of transmissions for XOR coding-based video streaming and was compared to that of conventional cache update schemes. Our numerical results show that the proposed cache update scheme can reduce the number of transmissions up to 24% when the number of videos is 100, the number of clients is 50, and the cache size is 5.
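
As a rough illustration of the XOR coding idea mentioned in this abstract, the minimal Python sketch below shows how one coded transmission can serve two clients at once. It assumes a toy case that is not taken from the paper: each client has already cached the video the other client requests, which is one common form of the "prerequisite conditions". All names (`xor_bytes`, `video_a`, `video_b`) are hypothetical, and the sketch does not model the K-AC cache update scheme itself.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Return the bytewise XOR of two equal-length chunks."""
    if len(a) != len(b):
        raise ValueError("chunks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical toy content: two equal-length video chunks.
video_a = b"AAAA-chunk"
video_b = b"BBBB-chunk"

# Assumed setup: client 1 requests video A and has video B in its cache;
# client 2 requests video B and has video A in its cache.
coded = xor_bytes(video_a, video_b)  # a single transmission for both clients

# Each client XORs the coded chunk with its cached chunk to recover its request.
decoded_by_client1 = xor_bytes(coded, video_b)
decoded_by_client2 = xor_bytes(coded, video_a)
```

In this toy case one transmission replaces two, which suggests why the contents of each client's cache affect the number of transmissions required.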