Topic Editors

Dr. Shunli Zhang
School of Software Engineering, Beijing Jiaotong University, Beijing, China
Dr. Xin Yu
School of Computer Science, University of Technology Sydney, Australia
Prof. Dr. Kaihua Zhang
School of Information and Control, Nanjing University of Information Science and Technology, Nanjing, China
Dr. Yang Yang
School of Information Science and Engineering, Shandong University, Qingdao, China

Visual Object Tracking: Challenges and Applications

Abstract submission deadline: closed (31 August 2023)
Manuscript submission deadline: 31 October 2023

Topic Information

Dear Colleagues,

Visual tracking aims to locate a target specified in the initial frame of a video, and it has many real-world applications such as video surveillance, augmented reality, and behavior analysis. Despite numerous efforts, it remains a challenging task due to factors such as deformation, illumination change, rotation, and occlusion, to name a few. This Topic promotes scientific dialogue on novel methodological approaches and research in these areas. Our interest spans the entire end-to-end spectrum of visual object tracking research, from motion estimation, appearance representation, strategic frameworks, models, and best practices to research aimed at radical innovation. The topics of interest include, but are not limited to, the following indicative list:

  • Enabling technologies for visual object tracking research:
    • Machine learning;
    • Neural networks;
    • Image processing;
    • Bot technology;
    • AI agents;
    • Reinforcement learning;
    • Edge computing;
  • Methodologies, frameworks, and models for artificial intelligence and visual object tracking research:
    • For innovations in business, research, academia, industry, and technology;
    • For theoretical foundations and contributions to the body of knowledge of visual object tracking;
  • Best practices and use cases;
  • Outcomes of R&D projects;
  • Industry-government collaboration;
  • Security and privacy issues;
  • Ethics of visual object tracking and AI;
  • Social impact of AI.

Dr. Shunli Zhang
Dr. Xin Yu
Prof. Dr. Kaihua Zhang
Dr. Yang Yang
Topic Editors

Keywords

  • artificial intelligence
  • computer vision
  • visual object tracking
  • reinforcement learning
  • deep learning
  • feature extraction
  • trajectory prediction

Participating Journals

Journal (abbreviation)          Impact Factor   CiteScore   Launched   First Decision (median)   APC
Applied Sciences (applsci)      2.7             4.5         2011       15.8 days                 CHF 2300
Electronics (electronics)       2.9             4.7         2012       15.8 days                 CHF 2200
Journal of Imaging (jimaging)   3.2             4.4         2015       21.9 days                 CHF 1600
Sensors (sensors)               3.9             6.8         2001       16.4 days                 CHF 2600
Signals (signals)               –               –           2020       43.6 days                 CHF 1000

Preprints.org is a platform dedicated to making early versions of research outputs permanently available and citable. MDPI journals allow posting on preprint servers such as Preprints.org prior to publication. For more details about preprints, please visit https://www.preprints.org.

Published Papers (4 papers)

Article
SiamUT: Siamese Unsymmetrical Transformer-like Tracking
Electronics 2023, 12(14), 3133; https://doi.org/10.3390/electronics12143133 - 19 Jul 2023
Abstract
Siamese networks have proven suitable for many computer vision tasks, including single object tracking. These trackers leverage the Siamese structure to benefit from feature cross-correlation, which measures the similarity between a target template and the corresponding search region. However, the linear nature of the correlation operation leads to the loss of important semantic information and may result in suboptimal performance when faced with complex background interference or significant object deformations. In this paper, we introduce the Transformer structure, which has been successful in vision tasks, to enhance the Siamese network's performance in challenging conditions. By incorporating self-attention and cross-attention mechanisms, we modify the original Transformer into an asymmetrical version that can focus on different regions of the feature map. This transformer-like fusion network enables more efficient and effective fusion procedures. Additionally, we introduce a two-layer output structure with decoupled prediction heads, improved loss functions, and window-penalty post-processing. This design enhances the performance of both the classification and the regression branches. Extensive experiments conducted on large public datasets such as LaSOT, GOT-10k, and TrackingNet demonstrate that our proposed SiamUT tracker achieves state-of-the-art precision on most benchmark datasets.
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
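As a rough illustration of the asymmetric attention fusion this abstract describes, the sketch below wires standard PyTorch multi-head attention so that only the search branch carries self-attention, while cross-attention lets the search tokens query the template. This is a minimal sketch under our own assumptions, not the authors' SiamUT code; the module name and dimensions are hypothetical.

```python
import torch
import torch.nn as nn

class AsymmetricFusion(nn.Module):
    """Sketch of a transformer-like fusion block: the search-region tokens
    carry self-attention and then cross-attend to the template tokens,
    making the two branches unsymmetrical (hypothetical module, for
    illustration only)."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, search, template):
        # search: (B, Ns, C) tokens from the search region
        # template: (B, Nt, C) tokens from the target template
        s, _ = self.self_attn(search, search, search)
        search = self.norm1(search + s)
        # cross-attention: search tokens query the template features
        c, _ = self.cross_attn(search, template, template)
        search = self.norm2(search + c)
        return search + self.ffn(search)
```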

Article
Motion Vector Extrapolation for Video Object Detection
J. Imaging 2023, 9(7), 132; https://doi.org/10.3390/jimaging9070132 - 29 Jun 2023
Abstract
Despite the continued successes of computationally efficient deep neural network architectures for video object detection, performance continually runs into the great trilemma of speed versus accuracy versus computational resources (pick two). Current attempts to exploit temporal information in video data to overcome this trilemma are bottlenecked by the state of the art in object detection models. This work presents motion vector extrapolation (MOVEX), a technique that performs video object detection by running off-the-shelf object detectors in parallel with existing optical-flow-based motion estimation techniques. This work demonstrates that the approach significantly reduces the baseline latency of any given object detector without sacrificing accuracy. Further latency reductions, up to 24 times lower than the original latency, can be achieved with minimal accuracy loss. MOVEX enables low-latency video object detection on common CPU-based systems, thus allowing for high-performance video object detection beyond the domain of GPU computing.
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
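The keyframe-plus-flow idea in the abstract can be sketched as follows: run the full detector only every N frames, and in between shift the last boxes by the mean optical flow inside each box. This sketch uses dense Farneback flow from OpenCV as a stand-in motion estimator and a hypothetical `detect` callable; it illustrates the general technique, not the MOVEX implementation (which also runs the two paths in parallel).

```python
import cv2

def track_with_flow(frames, detect, keyframe_interval=5):
    """Sketch: full detection on keyframes, flow-based box extrapolation
    in between. `detect(frame) -> list of (x, y, w, h)` stands for any
    off-the-shelf detector (hypothetical callable)."""
    boxes, prev_gray = [], None
    for i, frame in enumerate(frames):
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if i % keyframe_interval == 0 or prev_gray is None:
            boxes = detect(frame)                     # expensive path
        else:
            flow = cv2.calcOpticalFlowFarneback(
                prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
            shifted = []
            for (x, y, w, h) in boxes:                # cheap path
                y0, x0 = int(max(0, y)), int(max(0, x))
                region = flow[y0:int(y + h), x0:int(x + w)]
                dx, dy = (region.reshape(-1, 2).mean(axis=0)
                          if region.size else (0.0, 0.0))
                shifted.append((x + dx, y + dy, w, h))
            boxes = shifted
        prev_gray = gray
        yield i, boxes
```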

Article
A Target Re-Identification Method Based on Shot Boundary Object Detection for Single Object Tracking
Appl. Sci. 2023, 13(11), 6422; https://doi.org/10.3390/app13116422 - 24 May 2023
Cited by 1
Abstract
With the advantages of a simple model structure and a good performance-speed balance, Transformer-based single object tracking (SOT) models have become a hot topic in the object tracking field. However, tracking errors caused by the target leaving the shot, i.e., the target going out of view, occur in videos more often than one might expect. To address this issue, we propose a target re-identification method for SOT called TRTrack. First, we build a bipartite matching model of candidate tracklets and neighbor tracklets, optimized with the Hopcroft–Karp algorithm, which is used for preliminary tracking and for judging whether the target has left the shot. It achieves 76.3% mAO on the Generic Object Tracking-10k (GOT-10k) benchmark. Then, we introduce the alpha-IoU loss function into YOLOv5-DeepSORT to detect shot-boundary objects, attaining 38.62% mAP75:95 on Microsoft Common Objects in Context 2017 (MS COCO 2017). Finally, we design a backtracking identification module in TRTrack to re-identify the target. Experimental results confirm the effectiveness of our method, which is superior to most state-of-the-art models.
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
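For the bipartite matching step, the sketch below builds a graph with an edge wherever a candidate tracklet box overlaps a neighbor tracklet box above an IoU threshold, then solves it with the Hopcroft–Karp implementation in networkx. The data layout (`candidates` and `neighbors` as id-to-box dicts) and the threshold are assumptions for illustration, not the paper's formulation.

```python
import networkx as nx

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def match_tracklets(candidates, neighbors, iou_thresh=0.3):
    """Sketch: maximum bipartite matching of candidate tracklets to
    neighbor tracklets via Hopcroft-Karp (assumed id-to-box dicts)."""
    g = nx.Graph()
    cand_nodes = [("c", k) for k in candidates]
    g.add_nodes_from(cand_nodes, bipartite=0)
    g.add_nodes_from((("n", k) for k in neighbors), bipartite=1)
    for ck, cbox in candidates.items():
        for nk, nbox in neighbors.items():
            if iou(cbox, nbox) >= iou_thresh:
                g.add_edge(("c", ck), ("n", nk))
    matching = nx.bipartite.hopcroft_karp_matching(g, top_nodes=cand_nodes)
    # keep only the candidate-to-neighbor direction of the matching dict
    return {c[1]: n[1] for c, n in matching.items() if c[0] == "c"}
```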

Article
Global Context Attention for Robust Visual Tracking
Sensors 2023, 23(5), 2695; https://doi.org/10.3390/s23052695 - 01 Mar 2023
Abstract
Although recent Siamese-network-based visual tracking methods show high performance on numerous large-scale visual tracking benchmarks, distractor objects with appearances similar to the target object remain a persistent challenge. To address this issue, we propose a novel global context attention module for visual tracking, which extracts and summarizes holistic global scene information to modulate the target embedding for improved discriminability and robustness. Our global context attention module receives a global feature correlation map to elicit the contextual information from a given scene and generates channel and spatial attention weights to modulate the target embedding to focus on the relevant feature channels and spatial parts of the target object. Our proposed tracking algorithm is tested on large-scale visual tracking datasets, where it shows improved performance over the baseline tracking algorithm while achieving competitive performance at real-time speed. Additional ablation experiments validate the effectiveness of the proposed module, with our tracking algorithm showing improvements on various challenging attributes of visual tracking.
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
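A minimal sketch of the channel-plus-spatial modulation the abstract describes, assuming the correlation map and target embedding share a spatial size: global-average-pool the correlation map into channel weights, squeeze it into a spatial map, and multiply both into the target embedding. This illustrates the general idea, not the paper's exact module.

```python
import torch
import torch.nn as nn

class GlobalContextAttention(nn.Module):
    """Sketch: derive channel and spatial attention weights from a global
    correlation map and use them to modulate the target embedding
    (hypothetical module, for illustration only)."""
    def __init__(self, channels=256):
        super().__init__()
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, channels // 4), nn.ReLU(),
            nn.Linear(channels // 4, channels), nn.Sigmoid())
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, target_embed, corr_map):
        # target_embed, corr_map: (B, C, H, W), assumed same spatial size
        b, c, _, _ = corr_map.shape
        # channel weights from globally pooled correlation responses
        ch_w = self.channel_fc(corr_map.mean(dim=(2, 3))).view(b, c, 1, 1)
        # spatial weights highlighting relevant parts of the scene
        sp_w = self.spatial_conv(corr_map)            # (B, 1, H, W)
        return target_embed * ch_w * sp_w
```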
