Advances Techniques in Computer Vision and Multimedia

A special issue of Future Internet (ISSN 1999-5903). This special issue belongs to the section "Big Data and Augmented Intelligence".

Deadline for manuscript submissions: closed (30 April 2023) | Viewed by 12538

Special Issue Editor


Prof. Dr. Yang Wang
Guest Editor
School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
Interests: pattern recognition; machine learning; multimedia computing

Special Issue Information

Dear Colleagues,

With the popularization of Artificial Intelligence (AI) technology, computer vision has experienced significant advancements and great success in areas closely connected to human society, e.g., autonomous driving, virtual reality, mixed reality, and healthcare. As a research field, computer vision aims to enable computer systems to automatically see, recognize, and understand the visual world by simulating the mechanisms of human vision.

Multimedia has also changed our lifestyles and is becoming an indispensable part of daily life. This research field mainly concerns emerging computing methods for handling the various media (pictures, text, audio, video, etc.) generated by ubiquitous multimedia sensors and infrastructure, including the retrieval of multimedia data, the analysis of multimedia content, deep learning-based methodologies, and practical multimedia applications.

Many researchers have devoted themselves to exploring emerging topics in computer vision and multimedia, e.g., adversarial learning for multimedia, multimodal sentiment analysis, and explainable AI, and advanced techniques in these areas continue to emerge. This Special Issue provides an excellent opportunity to share a timely collection of research updates and will benefit researchers and practitioners engaged in computer vision, media computing, machine learning, and related fields.

Prof. Dr. Yang Wang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Future Internet is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • motion and tracking
  • image and video retrieval
  • detection and localization
  • scene analysis and understanding
  • multimedia systems
  • multimedia for society and health
  • multimedia applications and services
  • multimedia security and content protection
  • multimedia communications, networking, and mobility

Published Papers (5 papers)


Editorial

2 pages, 146 KiB  
Editorial
Advances Techniques in Computer Vision and Multimedia
by Yang Wang
Future Internet 2023, 15(9), 294; https://doi.org/10.3390/fi15090294 - 01 Sep 2023
Viewed by 813
Abstract
Computer vision, which aims to enable computer systems to automatically see, recognize, and understand the visual world by simulating the mechanisms of human vision, has experienced significant advancements and great success in areas closely related to human society [...]
(This article belongs to the Special Issue Advances Techniques in Computer Vision and Multimedia)

Research

17 pages, 7576 KiB  
Article
Banging Interaction: A Ubimus-Design Strategy for the Musical Internet
by Damián Keller, Azeema Yaseen, Joseph Timoney, Sutirtha Chakraborty and Victor Lazzarini
Future Internet 2023, 15(4), 125; https://doi.org/10.3390/fi15040125 - 27 Mar 2023
Cited by 2 | Viewed by 1247
Abstract
We introduce a new perspective for musical interaction tailored to a specific class of sonic resources: impact sounds. Our work is informed by the field of ubiquitous music (ubimus) and engages with the demands of artistic practices. Through a series of deployments of a low-cost and highly flexible network-based prototype, the Dynamic Drum Collective, we exemplify the limitations and specific contributions of banging interaction. Three components of this new design strategy—adaptive interaction, mid-air techniques and timbre-led design—target the development of creative-action metaphors that make use of resources available in everyday settings. The techniques involving the use of sonic gridworks yielded positive outcomes. The subjects tended to choose sonic materials that—when combined with their actions on the prototype—approached a full rendition of the proposed soundtrack. The results of the study highlighted the subjects' reliance on visual feedback as a non-exclusive strategy to handle both temporal organization and collaboration. The results show a methodological shift from device-centric and instrument-centric methods to designs that target the dynamic relational properties of ubimus ecosystems.
(This article belongs to the Special Issue Advances Techniques in Computer Vision and Multimedia)

14 pages, 7919 KiB  
Article
Neural Network-Based Price Tag Data Analysis
by Pavel Laptev, Sergey Litovkin, Sergey Davydenko, Anton Konev, Evgeny Kostyuchenko and Alexander Shelupanov
Future Internet 2022, 14(3), 88; https://doi.org/10.3390/fi14030088 - 13 Mar 2022
Cited by 3 | Viewed by 3417
Abstract
This paper compares neural networks, specifically Unet, MobileNetV2, VGG16, and YOLOv4-tiny, for image segmentation as part of a study aimed at finding an optimal solution for price tag data analysis. The networks were trained on a custom dataset collected by the authors. Additionally, the paper covers an automatic image text recognition approach using the EasyOCR API. The research revealed that the optimal network for segmentation is YOLOv4-tiny, with a cross-validation accuracy of 96.92%; the measured EasyOCR accuracy was 95.22%.
(This article belongs to the Special Issue Advances Techniques in Computer Vision and Multimedia)
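
As a concrete illustration of the recognition stage mentioned in the abstract, here is a minimal sketch using the EasyOCR API (a real, pip-installable library). The file name price_tag.jpg is a placeholder, and the segmentation stage (YOLOv4-tiny in the paper) is omitted; this is an editor's sketch, not the authors' pipeline.

```python
# Minimal EasyOCR usage sketch (not the authors' pipeline).
# Assumption: 'price_tag.jpg' is a placeholder image path.
import easyocr

reader = easyocr.Reader(['en'])             # load the English recognition model
results = reader.readtext('price_tag.jpg')  # list of (bbox, text, confidence)

for bbox, text, confidence in results:
    print(f'{text!r} (confidence: {confidence:.2f})')
```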

23 pages, 3828 KiB  
Article
DA-GAN: Dual Attention Generative Adversarial Network for Cross-Modal Retrieval
by Liewu Cai, Lei Zhu, Hongyan Zhang and Xinghui Zhu
Future Internet 2022, 14(2), 43; https://doi.org/10.3390/fi14020043 - 27 Jan 2022
Cited by 6 | Viewed by 3102
Abstract
Cross-modal retrieval aims to search samples of one modality via queries from other modalities and is an active topic in the multimedia community. However, two main challenges, i.e., the heterogeneity gap and semantic interaction across different modalities, have not yet been solved effectively. Reducing the heterogeneity gap improves cross-modal similarity measurement, while modeling cross-modal semantic interaction captures semantic correlations more accurately. To this end, this paper presents a novel end-to-end framework called the Dual Attention Generative Adversarial Network (DA-GAN), an adversarial semantic representation model with a dual attention mechanism, i.e., intra-modal attention and inter-modal attention. Intra-modal attention focuses on the important semantic features within a modality, while inter-modal attention explores the semantic interaction between different modalities to represent high-level semantic correlations more precisely. A dual adversarial learning strategy is designed to generate modality-invariant representations, which reduces cross-modal heterogeneity efficiently. Experiments on three commonly used benchmarks show that DA-GAN outperforms its competitors.
(This article belongs to the Special Issue Advances Techniques in Computer Vision and Multimedia)
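
To make the dual attention idea concrete, here is a generic PyTorch sketch of an intra-modal (self-)attention block. This is an editor's illustration, not the authors' DA-GAN implementation; the single-head design and all dimensions are illustrative assumptions.

```python
# Generic single-head intra-modal (self-)attention sketch in PyTorch.
# Not the authors' DA-GAN code; shapes and dimensions are assumptions.
import math
import torch
import torch.nn as nn

class IntraModalAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        self.scale = math.sqrt(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim) features from a single modality
        q, k, v = self.query(x), self.key(x), self.value(x)
        weights = torch.softmax(q @ k.transpose(-2, -1) / self.scale, dim=-1)
        return weights @ v  # attended features, same shape as x

# Inter-modal attention follows the same pattern, with queries drawn from
# one modality and keys/values from the other.
image_features = torch.randn(8, 49, 512)  # e.g., 49 region features per image
attended = IntraModalAttention(512)(image_features)
```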

17 pages, 391 KiB  
Article
Graph Representation-Based Deep Multi-View Semantic Similarity Learning Model for Recommendation
by Jiagang Song, Jiayu Song, Xinpan Yuan, Xiao He and Xinghui Zhu
Future Internet 2022, 14(2), 32; https://doi.org/10.3390/fi14020032 - 19 Jan 2022
Cited by 6 | Viewed by 2653
Abstract
With the rapid development of Internet technology, mining and analyzing massive amounts of network information to provide users with accurate and fast recommendations has become a challenging topic of joint research in industry and academia in recent years. One of the most widely used social network recommendation methods is collaborative filtering. However, traditional social network-based collaborative filtering algorithms suffer from problems such as low recommendation performance and cold start due to high data sparsity and uneven distribution. In addition, these algorithms do not effectively consider the implicit trust relationships between users. To this end, this paper proposes a collaborative filtering recommendation algorithm based on GraphSAGE (GraphSAGE-CF). The algorithm first uses GraphSAGE to learn low-dimensional feature representations of the global and local structures of user nodes in social networks and then computes the implicit trust relationships between users from these representations. Finally, the ratings of users and their implicitly trusted neighbors on related items are combined to predict users' ratings of target items. Experimental results on four open standard datasets show that the proposed GraphSAGE-CF algorithm outperforms existing algorithms in terms of RMSE and MAE.
(This article belongs to the Special Issue Advances Techniques in Computer Vision and Multimedia)
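
For readers unfamiliar with GraphSAGE, here is a minimal sketch of the mean-aggregation layer such a recommender builds on. This is a generic illustration, not the authors' GraphSAGE-CF code; the implicit-trust computation and rating prediction stages are omitted.

```python
# Generic GraphSAGE mean-aggregation layer sketch (not the authors' code).
import torch
import torch.nn as nn

class SAGEMeanLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        # GraphSAGE concatenates a node's own features with the mean of its
        # neighbors' features before a learned linear projection.
        self.linear = nn.Linear(2 * in_dim, out_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (num_nodes, in_dim) user features; adj: (num_nodes, num_nodes)
        # binary adjacency matrix of the social network.
        degree = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neighbor_mean = (adj @ x) / degree
        h = torch.cat([x, neighbor_mean], dim=1)
        return torch.relu(self.linear(h))

# Toy usage: 5 users with 16-dim features embedded into 8 dims.
x = torch.randn(5, 16)
adj = (torch.rand(5, 5) > 0.5).float()
embeddings = SAGEMeanLayer(16, 8)(x, adj)
```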
