Topic Editors

Prof. Dr. Junxing Zheng
School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan, China
Dr. Peng Cao
Associate Professor, Faculty of Architecture, Civil and Transportation Engineering, Beijing University of Technology, Beijing 100084, China

3D Computer Vision and Smart Building and City, 2nd Volume

Abstract submission deadline
31 October 2024
Manuscript submission deadline
31 December 2024
Viewed by
9896

Topic Information

Dear Colleagues,

This Topic is a continuation of the previous successful Topic, "3D Computer Vision and Smart Building and City" (https://www.mdpi.com/topics/3D_BIM). Three-dimensional computer vision is an interdisciplinary subject involving computer vision, computer graphics, artificial intelligence and other fields. Its main contents include 3D perception, 3D understanding and 3D modeling. In recent years, 3D computer vision technology has developed rapidly and has been widely used in unmanned aerial vehicles, robots, autonomous driving, AR, VR and other fields. Smart buildings and cities use various information technologies or innovative concepts to connect various systems and services so as to improve the efficiency of resource utilization, optimize management and services and improve quality of life. Smart buildings and cities can involve frontier techniques such as 3D computer vision for building information models, digital twins, city information models, and simultaneous localization and mapping (SLAM) robots. The application of 3D computer vision in smart buildings and cities is a valuable research direction, but it still faces many major challenges. This Topic focuses on the theory and technology of 3D computer vision in smart buildings and cities. We welcome papers that provide innovative technologies, theories or case studies in the relevant field.

Prof. Dr. Junxing Zheng
Dr. Peng Cao
Topic Editors

Keywords

  • smart buildings and cities
  • 3D computer vision
  • SLAM
  • building information model
  • city information model
  • robots

Participating Journals

Journal Name                                    Impact Factor  CiteScore  Launched Year  First Decision (median)  APC
Buildings                                       3.8            3.1        2011           14.6 Days                CHF 2600
Drones                                          4.8            6.1        2017           17.9 Days                CHF 2600
Energies                                        3.2            5.5        2008           16.1 Days                CHF 2600
Sensors                                         3.9            6.8        2001           17 Days                  CHF 2600
Sustainability                                  3.9            5.8        2009           18.8 Days                CHF 2400
ISPRS International Journal of Geo-Information  3.4            6.2        2012           35.5 Days                CHF 1700

Preprints.org is a multidisciplinary platform that provides a preprint service dedicated to sharing your research from the start and empowering your research journey.

MDPI Topics is cooperating with Preprints.org and has built a direct connection between MDPI journals and Preprints.org. Authors are encouraged to enjoy the benefits by posting a preprint at Preprints.org prior to publication:

  1. Immediately share your ideas ahead of publication and establish your research priority;
  2. Protect your idea from being stolen with this time-stamped preprint article;
  3. Enhance the exposure and impact of your research;
  4. Receive feedback from your peers in advance;
  5. Have it indexed in Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit and Europe PMC.

Published Papers (9 papers)

20 pages, 13136 KiB  
Article
DSOMF: A Dynamic Environment Simultaneous Localization and Mapping Technique Based on Machine Learning
by Shengzhe Yue, Zhengjie Wang and Xiaoning Zhang
Sensors 2024, 24(10), 3063; https://doi.org/10.3390/s24103063 - 11 May 2024
Viewed by 336
Abstract
To address the reduced localization accuracy and incomplete map construction exhibited by classical semantic simultaneous localization and mapping (SLAM) algorithms in dynamic environments, this study introduces a dynamic scene SLAM technique that builds upon direct sparse odometry (DSO) and incorporates instance segmentation and video completion algorithms. While prioritizing the algorithm’s real-time performance, we leverage the rapid matching capabilities of DSO to link identical dynamic objects in consecutive frames. This association is achieved by merging semantic and geometric data, and the inclusion of semantic probability enhances the matching accuracy during image tracking. Furthermore, we incorporate a loop closure module based on video inpainting algorithms into our mapping thread. This allows our algorithm to rely on the completed static background for loop closure detection, further enhancing its localization accuracy. The efficacy of this approach is validated using the TUM and KITTI public datasets and an unmanned platform experiment. Experimental results show that, in various dynamic scenes, our method achieves an improvement exceeding 85% in terms of localization accuracy compared with the DSO system.

15 pages, 2894 KiB  
Article
Phase Error Reduction for a Structured-Light 3D System Based on a Texture-Modulated Reprojection Method
by Chenbo Shi, Zheng Qin, Xiaowei Hu, Changsheng Zhu, Yuanzheng Mo, Zelong Li, Shaojia Yan, Yue Yu, Xiangteng Zang and Chun Zhang
Sensors 2024, 24(7), 2075; https://doi.org/10.3390/s24072075 - 24 Mar 2024
Viewed by 676
Abstract
Fringe projection profilometry (FPP), with benefits such as high precision and a large depth of field, is a popular 3D optical measurement method widely used in precision reconstruction scenarios. However, the pixel brightness at reflective edges does not satisfy the conditions of the ideal pixel-wise phase-shifting model due to the influence of scene texture and system defocus, resulting in severe phase errors. To address this problem, we theoretically analyze the non-pixel-wise phase propagation model for texture edges and propose a reprojection strategy based on scene texture modulation. The strategy first obtains the reprojection weight mask by projecting typical FPP patterns and calculating the scene texture reflection ratio, then reprojects stripe patterns modulated by the weight mask to eliminate texture edge effects, and finally fuses coarse and refined phase maps to generate an accurate phase map. We validated the proposed method on various texture scenes, including a smooth plane, depth surface, and curved surface. Experimental results show that the root mean square error (RMSE) of the phase at the texture edge decreased by 53.32%, proving the effectiveness of the reprojection strategy in eliminating depth errors at texture edges.
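For readers unfamiliar with the "ideal pixel-wise phase-shifting model" mentioned in this abstract, the sketch below shows the standard textbook N-step phase-shifting formula for a single pixel. It is not the reprojection method of the paper. The intensity model I_n = A + B*cos(phi + 2*pi*n/N) and the function names `wrapped_phase` and `simulate_pixel` are assumptions made for illustration.

```python
import math


def wrapped_phase(intensities):
    """Recover the wrapped phase at one pixel from N equally shifted samples.

    Assumed ideal model (not taken from the paper):
        I_n = A + B * cos(phi + 2*pi*n/N),  n = 0..N-1,  N >= 3
    """
    n_steps = len(intensities)
    if n_steps < 3:
        raise ValueError("need at least three phase shifts")
    shifts = [2.0 * math.pi * n / n_steps for n in range(n_steps)]
    sin_sum = sum(i * math.sin(d) for i, d in zip(intensities, shifts))
    cos_sum = sum(i * math.cos(d) for i, d in zip(intensities, shifts))
    # Under the model above, sin_sum = -(N/2)*B*sin(phi) and
    # cos_sum = (N/2)*B*cos(phi), so atan2 returns phi in (-pi, pi].
    return math.atan2(-sin_sum, cos_sum)


def simulate_pixel(phi, offset=0.5, modulation=0.4, n_steps=4):
    """Generate ideal fringe samples for a known phase (illustrative values)."""
    return [
        offset + modulation * math.cos(phi + 2.0 * math.pi * n / n_steps)
        for n in range(n_steps)
    ]


if __name__ == "__main__":
    samples = simulate_pixel(1.0)
    print(wrapped_phase(samples))
```

When a pixel mixes light from both sides of a texture edge, its samples no longer follow this single-cosine model, which is the kind of phase error the abstract describes.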

21 pages, 25891 KiB  
Article
An Improved TransMVSNet Algorithm for Three-Dimensional Reconstruction in the Unmanned Aerial Vehicle Remote Sensing Domain
by Jiawei Teng, Haijiang Sun, Peixun Liu and Shan Jiang
Sensors 2024, 24(7), 2064; https://doi.org/10.3390/s24072064 - 23 Mar 2024
Viewed by 601
Abstract
It is important to achieve the 3D reconstruction of UAV remote sensing images in deep learning-based multi-view stereo (MVS) vision. The lack of obvious texture features and detailed edges in UAV remote sensing images leads to inaccurate feature point matching or depth estimation. To address this problem, this study improves the TransMVSNet algorithm in the field of 3D reconstruction by optimizing its feature extraction network and its cost volume depth prediction network. The improvement is mainly achieved by extracting features with the Asymptotic Feature Pyramid Network (AFPN), assigning weights to different levels of features through the ASFF module to increase the importance of key levels, and using a UNet-structured network combined with an attention mechanism to predict the depth information, which also extracts the key area information. It aims to improve the performance and accuracy of the TransMVSNet algorithm’s 3D reconstruction of UAV remote sensing images. In this work, we have performed comparative experiments and quantitative evaluation with other algorithms on the DTU dataset as well as on a large UAV remote sensing image dataset. Extensive experiments show that our improved TransMVSNet algorithm has better performance and robustness, providing a valuable reference for research and application in the field of 3D reconstruction of UAV remote sensing images.

32 pages, 8391 KiB  
Article
Model-Based 3D Gaze Estimation Using a TOF Camera
by Kuanxin Shen, Yingshun Li, Zhannan Guo, Jintao Gao and Yingjian Wu
Sensors 2024, 24(4), 1070; https://doi.org/10.3390/s24041070 - 6 Feb 2024
Viewed by 934
Abstract
Among the numerous gaze-estimation methods currently available, appearance-based methods predominantly use RGB images as input and employ convolutional neural networks (CNNs) to process facial images and regress gaze angles or gaze points. Model-based methods require high-resolution images to obtain a clear eyeball geometric model. These methods face significant challenges in outdoor environments and practical application scenarios. This paper proposes a model-based gaze-estimation algorithm using a low-resolution 3D time-of-flight (TOF) camera. This study uses infrared images instead of RGB images as input to overcome the impact of varying illumination intensity in the environment on gaze estimation. We utilized a trained YOLOv8 neural network model to detect eye landmarks in captured facial images. Combined with the depth map from the TOF camera, we calculated the 3D coordinates of the canthus points of a single eye of the subject. Based on this, we fitted a 3D geometric model of the eyeball to determine the subject’s gaze angle. Experimental validation showed that our method achieved a root mean square error of 6.03° and 4.83° in the horizontal and vertical directions, respectively, for the detection of the subject’s gaze angle. We also tested the proposed method in a real car driving environment, achieving stable driver gaze detection at various locations inside the car, such as the dashboard, driver mirror, and the in-vehicle screen.
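The step of turning a detected 2D landmark plus a depth value into 3D coordinates can be sketched as follows. This is a generic pinhole back-projection, not the paper's implementation. The function name `backproject`, the intrinsic parameters (`fx`, `fy`, `cx`, `cy`) and the sample numbers are all hypothetical, and lens distortion is ignored.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Map a pixel (u, v) with depth along the optical axis to camera-frame XYZ.

    Assumes an undistorted pinhole camera; fx, fy are focal lengths in pixels
    and (cx, cy) is the principal point. All values here are hypothetical.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


if __name__ == "__main__":
    # Hypothetical landmark pixel and depth reading (metres) for one canthus.
    point = backproject(420, 240, 0.5, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    print(point)
```

In practice the landmark pixel would come from the detector and the depth from the TOF depth map at that pixel. A low-resolution sensor makes the depth lookup the main source of uncertainty.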

22 pages, 7968 KiB  
Article
Ship-Fire Net: An Improved YOLOv8 Algorithm for Ship Fire Detection
by Ziyang Zhang, Lingye Tan and Robert Lee Kong Tiong
Sensors 2024, 24(3), 727; https://doi.org/10.3390/s24030727 - 23 Jan 2024
Viewed by 1628
Abstract
Ship fires may cause significant structural damage and large economic losses. Hence, the prompt identification of fires is essential in order to enable rapid reactions and effective mitigation strategies. However, conventional detection systems exhibit limited efficacy and accuracy in detecting targets, which has been mostly attributed to limitations imposed by distance constraints and the motion of ships. Although the development of deep learning algorithms provides a potential solution, the computational complexity of ship fire detection algorithms poses significant challenges. To solve this, this paper proposes a lightweight ship fire detection algorithm based on YOLOv8n. Initially, a dataset, including more than 4000 unduplicated images and their labels, is established before training. In order to ensure the performance of algorithms, both fires inside ship rooms and fires on board are considered. Then, after tests, YOLOv8n is selected as the model with the best performance and fastest speed from among several advanced object detection algorithms. GhostnetV2-C2F is then inserted in the backbone of the algorithm for long-range attention with inexpensive operation. In addition, spatial and channel reconstruction convolution (SCConv) is used to reduce redundant features with significantly lower complexity and computational costs for real-time ship fire detection. For the neck part, omni-dimensional dynamic convolution is used for the multi-dimensional attention mechanism, which also lowers the number of parameters. After these improvements, a lighter and more accurate YOLOv8n algorithm, called Ship-Fire Net, is proposed. The proposed method exceeds 0.93 in both precision and recall for fire and smoke detection in ships. In addition, the mAP@0.5 reaches about 0.9. Besides the improvement in accuracy, Ship-Fire Net also has fewer parameters and lower FLOPs compared to the original, which accelerates its detection speed. The FPS of Ship-Fire Net also reaches 286, which is helpful for real-time ship fire monitoring.

18 pages, 4863 KiB  
Article
Research on Pedestrian Crossing Decision Models and Predictions Based on Machine Learning
by Jun Cai, Mengjia Wang and Yishuang Wu
Sensors 2024, 24(1), 258; https://doi.org/10.3390/s24010258 - 1 Jan 2024
Cited by 1 | Viewed by 1565
Abstract
Systematically and comprehensively enhancing road traffic safety using artificial intelligence (AI) is of paramount importance, and it is gradually becoming a crucial framework in smart cities. Within this context of heightened attention, we propose to utilize machine learning (ML) to improve pedestrian crossing predictions in intelligent transportation systems, where the crossing process is vital to pedestrian crossing behavior. Compared with traditional analytical models, the application of OpenCV image recognition and machine learning methods can analyze the mechanisms of pedestrian crossing behaviors with greater accuracy, thereby more precisely judging and simulating pedestrian crossing violations. Authentic pedestrian crossing behavior data were extracted from signalized intersection scenarios in Chinese cities, and several machine learning models, including decision trees, multilayer perceptrons, Bayesian algorithms, and support vector machines, were trained and tested. In comparing the various models, the results indicate that the support vector machine (SVM) model exhibited the best accuracy in predicting pedestrian crossing probabilities and speeds, and it can be applied in pedestrian crossing prediction and traffic simulation systems in intelligent transportation.

19 pages, 5724 KiB  
Article
Image-Enhanced U-Net: Optimizing Defect Detection in Window Frames for Construction Quality Inspection
by Jorge Vasquez, Tomotake Furuhata and Kenji Shimada
Buildings 2024, 14(1), 3; https://doi.org/10.3390/buildings14010003 - 19 Dec 2023
Viewed by 1105
Abstract
Ensuring the structural integrity of window frames and detecting subtle defects, such as dents and scratches, is crucial for maintaining product quality. Traditional machine vision systems face challenges in defect identification, especially with reflective materials and varied environments. Modern machine and deep learning (DL) systems hold promise for post-installation inspections but face limitations due to data scarcity and environmental variability. Our study introduces an innovative approach to enhance DL-based defect detection, even with limited data. We present a comprehensive window frame defect detection framework incorporating optimized image enhancement, data augmentation, and a core U-Net model. We constructed five datasets using cell phones and the Spot Robot for autonomous inspection, evaluating our approach across various scenarios and lighting conditions in real-world window frame inspections. Our results demonstrate significant performance improvements over the standard U-Net model, with a notable 7.43% increase in the F1 score and 15.1% in IoU. Our approach enhances defect detection capabilities, even in challenging real-world conditions. To enhance the generalizability of this study, it would be advantageous to apply its methodology across a broader range of diverse construction sites.
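The two metrics quoted in this abstract, F1 score and IoU, can both be computed from pixel-wise counts on binary segmentation masks. The sketch below uses the standard definitions. The helper names and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the paper.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives and false negatives pixel-wise.

    pred and truth are equally sized nested lists of 0/1 values.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); returns 1.0 if both masks are empty (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def iou(tp, fp, fn):
    """IoU = TP / (TP + FP + FN); returns 1.0 if both masks are empty (assumed convention)."""
    denom = tp + fp + fn
    return tp / denom if denom else 1.0


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    truth = [[1, 0, 0], [0, 1, 1]]
    counts = confusion_counts(pred, truth)
    print(f1_score(*counts), iou(*counts))
```

Because IoU has no factor of two on the true positives, it is never larger than F1 for the same mask pair, so the two numbers are related but not interchangeable when comparing models.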

16 pages, 5787 KiB  
Article
The Spatio-Temporal Patterns of Regional Development in Shandong Province of China from 2012 to 2021 Based on Nighttime Light Remote Sensing
by Hongli Zhang, Quanzhou Yu, Yujie Liu, Jie Jiang, Junjie Chen and Ruyun Liu
Sensors 2023, 23(21), 8728; https://doi.org/10.3390/s23218728 - 26 Oct 2023
Cited by 1 | Viewed by 1274
Abstract
Shandong is a major coastal economic province in the east of China, and clarifying the temporal and spatial patterns of its regional development in recent years is of great significance for supporting regional high-quality development. Nightlight remote sensing data can reveal the spatio-temporal patterns of social and economic activities on a fine pixel scale. Based on monthly nighttime light remote sensing data and social statistics, we studied the nighttime light patterns in Shandong Province over the last 10 years at three spatial scales (three geographical regions, different cities and different counties) by using the methods of trend analysis, stability analysis and correlation analysis. The results show that: (1) The nighttime light pattern was generally consistent with the spatial pattern of construction land. The nighttime light intensity of most urban, built-up areas showed an increasing trend, while the old urban areas of Qingdao and Yantai showed a weakening trend. (2) At the geographical unit scale, the total nighttime light in south-central Shandong was significantly higher than that in eastern and northwest Shandong, while the nighttime light growth rate in northwest Shandong was the highest. At the urban scale, Liaocheng had the highest nighttime light growth rate. At the county scale, the nighttime light growth rate of counties with a better economy was lower, while that of counties with a less developed economy was higher. (3) The nighttime light growth was significantly correlated with Gross Domestic Product (GDP) and population growth, indicating that regional economic development and population growth were the main causes of nighttime light change.
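The abstract names trend analysis, stability analysis and correlation analysis without giving formulas. One common way to realize them for a nighttime light time series is sketched below: an ordinary least-squares slope for the trend, the coefficient of variation for stability, and Pearson's r for correlation. These specific choices, the function names and the sample values are assumptions, not the authors' stated method.

```python
import math


def ols_slope(values):
    """Least-squares slope of values against time steps 0, 1, 2, ... (assumed trend measure)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (assumed stability measure)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return math.sqrt(variance) / mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    x_dev = math.sqrt(sum((x - x_mean) ** 2 for x in xs))
    y_dev = math.sqrt(sum((y - y_mean) ** 2 for y in ys))
    return cov / (x_dev * y_dev)


if __name__ == "__main__":
    # Hypothetical yearly light totals and GDP figures for one county.
    light = [10.0, 12.5, 13.0, 15.5, 18.0]
    gdp = [100.0, 118.0, 125.0, 150.0, 171.0]
    print(ols_slope(light), coefficient_of_variation(light), pearson_r(light, gdp))
```

Applied per pixel or per administrative unit, the slope indicates whether light is brightening or dimming, while the correlation relates light growth to statistics such as GDP or population.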

17 pages, 45348 KiB  
Article
Enhanced 3D Pose Estimation in Multi-Person, Multi-View Scenarios through Unsupervised Domain Adaptation with Dropout Discriminator
by Junli Deng, Haoyuan Yao and Ping Shi
Sensors 2023, 23(20), 8406; https://doi.org/10.3390/s23208406 - 12 Oct 2023
Viewed by 970
Abstract
Data-driven pose estimation methods often assume equal distributions between training and test data. However, in reality, this assumption does not always hold true, leading to significant performance degradation due to distribution mismatches. In this study, our objective is to enhance the cross-domain robustness of multi-view, multi-person 3D pose estimation. We tackle the domain shift challenge through three key approaches: (1) A domain adaptation component is introduced to improve estimation accuracy for specific target domains. (2) By incorporating a dropout mechanism, we train a more reliable model tailored to the target domain. (3) Transferable Parameter Learning is employed to retain crucial parameters for learning domain-invariant data. The foundation for these approaches lies in the H-divergence theory and the lottery ticket hypothesis, which are realized through adversarial training by learning domain classifiers. Our proposed methodology is evaluated using three datasets: Panoptic, Shelf, and Campus, allowing us to assess its efficacy in addressing domain shifts in multi-view, multi-person pose estimation. Both qualitative and quantitative experiments demonstrate that our algorithm performs well in two different domain shift scenarios.