Review

A Review of Electric UAV Visual Detection and Navigation Technologies for Emergency Rescue Missions

by Peng Tang 1, Jiyun Li 1 and Hongqiang Sun 2,*
1 School of Transportation Science and Engineering, Beihang University, Beijing 100083, China
2 School of Emergency Equipment, North China Institute of Science and Technology, Langfang 065201, China
* Author to whom correspondence should be addressed.
Sustainability 2024, 16(5), 2105; https://doi.org/10.3390/su16052105
Submission received: 6 November 2023 / Revised: 29 February 2024 / Accepted: 1 March 2024 / Published: 3 March 2024

Abstract

Sudden disasters often result in significant losses of human lives and property, and emergency rescue is a necessary response to disasters. In recent years, with the development of electric unmanned aerial vehicles (UAVs) and artificial intelligence technology, the combination of these technologies has gradually been applied to emergency rescue missions. However, in the face of the complex working conditions of emergency rescue missions, the application of electric UAV visual detection still faces great challenges, particularly the lack of GPS positioning signals in enclosed emergency rescue environments, the avoidance of unforeseen obstacles, and autonomous planning and search flights. Although the combination of visual detection and visual navigation technology shows great potential and added value in the context of emergency rescue, at present it remains in the research and experimental stages. Consequently, this paper summarizes and discusses the current status and development of visual detection and navigation technologies for electric UAVs, as well as issues related to emergency rescue applications, with a view to accelerating the research and application of these technologies in emergency rescue missions. In this study, we first summarize the classification of typical disasters, analyze the application and configurations of sample UAVs in typical disasters with a high frequency of occurrence, refine the key electric UAV technologies in emergency rescue missions, and highlight the value of exploring electric UAV visual detection and navigation technologies. Subsequently, current research on electric UAV visual detection and navigation technology is analyzed and its application in emergency rescue missions is discussed. Finally, this paper presents the problems faced in the application of electric UAV visual detection and navigation technology in urban emergency rescue environments and offers insights into future research directions.

1. Introduction

With the acceleration of global economic development and urbanization, the number of large cities with populations of 10 million or more is gradually increasing. In large cities, the population and public service facilities are relatively dense, which, on the one hand, supports the economic development of the city; on the other hand, in the face of sudden disasters, urban transportation, municipal services, public safety, and other systems exhibit a certain degree of vulnerability [1]. Sudden disasters often cause significant loss of human life and property. Therefore, rescue organizations and regulatory authorities at all levels urgently need means and tools to identify emergency rescue problems in time and improve sustainable emergency rescue capabilities [2].
Electric unmanned aerial vehicles (UAVs) are powered, controllable, and reusable vehicles that can carry important equipment to complete designated tasks. They have the advantages of strong real-time operation, minimal influence from the external environment, flexible deployment, and low cost of use and maintenance [3]. Consequently, electric UAVs are increasingly used in civil and military fields and are widely used in emergency rescue missions. In 2008, a massive earthquake occurred in Wenchuan, China, and electric UAVs played an important role in the rescue of the disaster area. Accurately positioned in the air for aerial photography, they took only about 20 min to obtain an effect map of the earthquake, making it easier for those in charge of the earthquake relief command to make decisions [4]. In 2011, a mudslide occurred in Guangxi, China, and the CK-GY04 UAV low-altitude aerial survey system was used in the rescue process. The UAV carried out about 1.5 h of flight monitoring tasks, covering an aerial photography area of 9000 km2 and taking a total of 748 photographs [5]. In 2015, a hazardous-chemical explosion occurred in Tianjin Port, China. The central area of the accident was highly dangerous, making decision-making and rescue operations challenging. However, UAVs could fly directly into the core explosion zone. They were used to obtain a 360-degree panoramic view, perform more than 80 flight reconnaissance missions, and transmit images of the core explosion area for more than 750 min. This provided a large amount of detailed dynamic image information for the rescue command and played a crucial role in the rescue operation [6].
Electric UAVs are widely used in urban emergency rescue operations. In 2015, Malaysia suffered severe flooding, and the country’s government used electric UAVs to accurately locate shelters and deliver relief supplies after the disaster [7]. In 2020, Norway experienced the largest landslide in the country’s history, with a collapsed area of approximately 2 km2 accompanied by low temperatures and gusty winds. The rescue team deployed multiple DJI Matrice 300 RTK UAVs, combining air and ground resources and using infrared thermal imaging to locate and rescue survivors, providing more than 270 h of aerial coverage and making this the largest drone rescue operation ever conducted in Europe [8].
Currently, traditional search and rescue methods include search and rescue dogs, optical sensors, and radar life detectors [9]. Compared with traditional search and rescue methods, electric UAVs used in emergency rescue missions have the following advantages.
(1)
Electric UAVs can be integrated with a wide range of equipment
Electric UAVs can be fitted with various types of equipment. When fitted with cameras, the situation at the scene of an accident can be transmitted to the command department through remote image transmission technology, which can then be used to analyze and make decisions regarding the emergency handling of disaster risks. If the camera is equipped with an infrared thermal imaging function, it can enhance the UAV’s perception of night vision and infrared radiation. The addition of lighting equipment can provide long-duration and wide-range illumination at night to improve night rescue efficiency.
(2)
Low manufacturing costs and low maintenance costs for electric UAVs
As electric UAVs do not require a pilot, the cockpit and associated environmental controls and life-saving equipment can be eliminated, thus reducing the weight and cost of the aircraft, in addition to significant savings in operator training.
(3)
Simple takeoff and landing and flexible operation of electric UAVs
Electric UAVs are small and lightweight, meaning they can take off and land on the roofs of general office buildings and flat land after the area has been leveled and the necessary recovery equipment has been installed. The maneuvering of an electric UAV is simple, with route setting and flight attitude control being computer-based.
The application of electric UAVs in emergency rescue missions still has certain limitations, especially in more complex environmental conditions. Electric UAVs must carry out rapid, high-precision target detection and identification while often flying far beyond the visual range of ground operators, which increases their dependence on onboard detection and navigation equipment. As a result, integrated visual detection and navigation technologies for electric UAVs in emergency rescue missions are becoming a research hotspot.

2. Classification of Typical Disasters and Analysis of the Application of UAV Key Emergency Rescue Technologies

Disasters occur with a certain degree of randomness, and the classification of typical disasters facilitates the study of the background environment for the emergency rescue of typical disasters. In addition, statistical analyses of the electric UAVs and key technologies applied in emergency rescue missions can provide a preliminary understanding of the current status of electric UAV rescue technology. This section focuses on the following aspects.

2.1. Analysis of Typical Disasters

Typical disasters can be classified into natural and man-made disasters based on their causes. Natural disasters include those in the aerosphere, hydrosphere, geosphere, biosphere, and cosmosphere. Major aerosphere disasters include drought, heavy rain, continuous rain, acid rain, hail, snow, frost, fog, gale, dry and hot winds, tornadoes, tropical cyclones, dust storms, thunderstorms, and fires. Major hydrosphere disasters include floods, storm surges, sea surges, sea ice, tsunamis, red tides, and saltwater intrusion. Major geosphere disasters include earthquakes, avalanches, landslides, mudslides, collapse, subsidence, wet subsidence, soil erosion, desertification, and salinization. Major biosphere disasters include pests, rodents, exotic organisms, and plagues. Major cosmosphere disasters include meteorite strikes, comet collisions, solar super-flares, supernova outbursts, and magnetic storms. Man-made disasters include incidents involving industrial mines, engineering, transportation, electromagnetic networks, and social disasters [10].
Statistics on major natural disasters (including floods, droughts, storms, wildfires, and other meteorological hazards) and their global impacts for the period 2013–2022 are shown in Figure 1, Figure 2 and Figure 3. They have been derived from the Global Disaster Data Platform, created by the China Ministry of Emergency Management, Ministry of Education, Institute of Disaster Reduction and Emergency Management, and the China Disaster Prevention and Control Association [11].
Statistics on the frequency of major man-made disasters and the number of people affected by them globally for the period 2013–2022, as shown in Figure 4 and Figure 5, were obtained from the International Emergency Disasters Database (EM-DAT). This database was created in 1988 by the Centre for Research on the Epidemiology of Disasters (CRED) with the support of the World Health Organization and is maintained by CRED [12].
Based on these statistics, the frequency of natural and man-made disasters has remained relatively stable over the past decade, with the annual frequency of typical disasters falling within the same range and a comparable order of magnitude (on the order of 100 events). The typical disasters encountered by cities are mainly natural. The annual number of people affected by natural disasters exceeds 10 million, the number of deaths exceeds 10,000, and the annual economic losses caused by natural disasters exceed USD 100 million. Accordingly, the occurrence of typical urban disasters can be considered a non-negligible problem in terms of the annual statistics, the consistently high frequency of occurrence, and the huge amount of damage caused.

2.2. Analysis of Electric UAV Emergency Response and Key Technology Applications

The use of electric UAVs for emergency rescue has become increasingly widespread. Recently, representative Chinese UAV companies have compiled vehicle application data on global rescue missions, including the types of electric UAVs applied and the equipment on board, as shown in Table 1 and Table 2, Figure 6 [13].
The statistics reveal that the sample company’s electric UAVs have been used in rescue applications across all continents over the past decade. Moreover, the number of cases in which electric UAVs have been used in rescue operations in the past five years has increased significantly compared with previous years. According to the statistics from the sample company, the main application scenarios for electric UAVs in the context of emergency rescue include flood, fire, earthquake, and geological disaster rescues.
According to the sample company’s statistics listed in Table 2, visual detection equipment has become one of the most common types of onboard equipment used in emergency rescue electric UAVs. For different emergency rescue scenarios, in addition to equipment carried for scenario-specific needs (e.g., communication relay equipment carried for communication relay tasks in geological disaster rescue), visual sensing equipment can quickly perform disaster detection, detect trapped people, and synchronize location information with ground rescue personnel, guiding them to carry out rescue tasks efficiently. In conclusion, the application of electric UAV visual detection and navigation technology in emergency rescue missions has research value.

3. Electric UAV Visual Detection Technology

Visual detection techniques recognize a target object in an acquired image [14]. The core task of visual target detection is to locate target instances from numerous predefined categories in a natural image; that is, to determine the location, size, and category of the target. Visual target detection is often challenging because differences in the shape characteristics of the recognized targets (e.g., attitude and shape), environmental factors (e.g., lighting), and equipment performance (e.g., resolution) all introduce interference.
At present, visual detection technology mainly falls into two categories: traditional target detection and deep learning-based target detection [15].

3.1. Traditional Target Detection Algorithms

Traditional target detection algorithms were first proposed by Viola and Jones in 2001. They proposed a fast face detection method based on Haar features and the AdaBoost algorithm, which was very successful in the field of face detection and became a milestone in the field of target detection [16]. Traditional target detection methods usually involve the following steps: (1) preprocessing of the input image, including denoising, gray-scale normalization, and image enhancement; (2) construction of candidate regions, in which the image is divided into multiple subregions by means of sliding windows, image segmentation, or selective search; (3) extraction of features from the subregions, which can be local features, global features, or combinations of the two. A flowchart of the traditional target detection method is provided in Figure 7.
Common feature description operators include the histogram of oriented gradients (HOG) [17,18,19], Haar-like features [20,21], the scale-invariant feature transform (SIFT) [22], and speeded-up robust features (SURF) [23]. A comparison of their advantages and disadvantages is presented in Table 3. The extracted features are matched with the features of known targets to obtain candidate targets, which are then screened using non-maximum suppression and other methods to exclude targets that do not meet the necessary conditions. The remaining candidate targets are localized (their location, size, and attitude are determined) and then recognized (their categories and attributes are determined) [24,25].
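To make this classical pipeline concrete, the following is a minimal sketch using OpenCV's built-in HOG descriptor with its pretrained pedestrian SVM; the input file name and score threshold are placeholders, and the sketch is illustrative rather than a reproduction of any of the cited systems.

```python
# Minimal sketch of a classical HOG + SVM detection pipeline with OpenCV.
import cv2
import numpy as np

# OpenCV ships a linear SVM trained on HOG features for pedestrian detection.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

image = cv2.imread("aerial_frame.jpg")            # hypothetical input frame
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Sliding-window detection over an image pyramid; returns boxes and SVM scores.
boxes, weights = hog.detectMultiScale(gray, winStride=(8, 8), padding=(8, 8), scale=1.05)

# Keep higher-scoring windows (a crude stand-in for non-maximum suppression).
for (x, y, w, h), score in zip(boxes, np.asarray(weights).ravel()):
    if score > 0.5:
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("detections.jpg", image)
```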
However, traditional target detection algorithms have certain limitations. Since their region selection is not targeted, a large number of redundant windows are generated, resulting in long runtimes. In addition, traditional target detection algorithms have low generalization ability, and their detection accuracy often fails to meet requirements because the features must be designed manually. Moreover, traditional target detection algorithms are sensitive to factors such as lighting and occlusion and are less robust, making it difficult to address target detection tasks in complex contexts.

3.2. Deep Learning-Based Target Detection Algorithm

Deep learning-based target detection is based on convolutional neural networks (CNN) [26,27], and the representative algorithms include Region-CNN (R-CNN) [28,29], You Only Look Once (YOLO) [30,31], Single Shot Multibox Detector (SSD) [32], Faster R-CNN [33], and Mask R-CNN [34].
Compared to traditional target detection algorithms, deep learning-based target detection algorithms can autonomously learn the feature representations and detection models required for the target detection task from a large amount of raw data points without manually designing the features. This makes them robust to changes in scale, attitude, and illumination. Moreover, deep learning models can learn high-level abstract features from large amounts of data, capture complex semantic information about the target, and have a higher accuracy and better generalization capabilities.
Deep learning-based target detection usually follows one of two common frameworks, the one-stage and two-stage frameworks; the two-stage framework was first proposed in the R-CNN family of models. Two-stage methods divide the target detection task into two phases: the first phase generates candidate regions from the input image, while the second phase performs feature extraction and classification on the candidate regions to determine whether they contain the target object. The advantage of the two-stage framework is its high detection accuracy. However, it is not well suited for real-time detection because it must extract and classify features for each candidate region, making it computationally intensive and slow. In contrast, the one-stage framework is simpler and faster, directly predicting all possible target regions in the image, and requires only one forward computation to complete the detection task. Therefore, most current mainstream target detection models adopt the one-stage framework, including YOLO and SSD. These models usually contain a deep convolutional neural network for feature extraction, an object classifier for determining the target class, and an object localizer for locating the target position, which allows for faster detection and better detection accuracy [35].
Depending on the nature and requirements of the detection task, deep learning applied to visual detection can be formulated as a regression problem or a classification problem [36]. In the regression formulation, the target detection model adopts the one-stage framework and is designed to directly regress the position or bounding-box coordinates of the target; classical examples include CNN-based models such as YOLO and SSD. The classification formulation covers both one-stage and two-stage models. One-stage classification models are trained to recognize object categories in images; classical classification backbones include CNN-based models such as AlexNet [37], the Visual Geometry Group network (VGG) [38], and the Residual Network (ResNet) [39]. Two-stage models generate candidate regions in the first stage and perform classification and bounding-box regression on these candidate regions in the second stage; typical models include Faster R-CNN and Mask R-CNN. A comparison of the performance of common deep learning target detection models is provided in Table 4.
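As an illustration of how such detectors are typically used, the following minimal sketch runs a pretrained two-stage detector (Faster R-CNN from torchvision, assuming a recent torchvision release) on a single image; the file name and score threshold are placeholders rather than settings from the cited works.

```python
# Minimal sketch: inference with a pretrained two-stage detector (Faster R-CNN).
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Pretrained Faster R-CNN with a ResNet-50 FPN backbone (COCO classes).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = Image.open("uav_frame.jpg").convert("RGB")   # hypothetical UAV frame
batch = [to_tensor(image)]                           # list of CHW tensors in [0, 1]

with torch.no_grad():
    outputs = model(batch)                           # one dict per input image

# Keep detections above a confidence threshold.
keep = outputs[0]["scores"] > 0.7
boxes = outputs[0]["boxes"][keep]     # (N, 4) boxes in xyxy pixel coordinates
labels = outputs[0]["labels"][keep]   # COCO category indices (1 = person)
print(boxes, labels)
```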

3.3. Visual Detection Technology on UAVs

Research combining visual detection with aerial photography from drones started early. In 1997, Olson et al. of Cornell University proposed automatic target recognition based on matching oriented edge pixels, which performs model-based tracking in quasi-real time (MBTIQR) and can be used to detect and track multiple targets moving arbitrarily. The practical test results are shown in Figure 8 [40].
In 2006, the U.S. Department of Defense developed a system called COCOA, a UAV-based video surveillance system capable of automated vehicle and pedestrian detection and tracking. It can be adapted to different sensor resolutions and can track targets as small as 100 pixels. Figure 9 shows the effectiveness of this system for vehicle tracking [41].
In 2007, Dobrokhodov et al. of the U.S. Naval Postgraduate School developed a rapid flight test prototype system for small UAVs that included a computer vision-based dynamic target tracking system, combining target state estimation with the UAV control system so that the speed and direction of the tracked target could be estimated in real time during tracking [42].
In 2017, Dong and Zou of the Nanjing Institute of Electronic Engineering proposed a UAV visual detection method that combined foreground detection and online feature classification. This method performs feature classification on foreground detection results to improve its UAV detection. Feature classification based on the edge strength and localization of the neighboring regions of the foreground detection results helps distinguish potential UAV targets from a dynamic background [43].
In 2018, Panyu Shao of Zhejiang University combined a visual background extraction algorithm and a three-frame differencing method with a multi-scale kernel correlation filter to achieve high-precision detection and tracking of moving targets. The approach detects and tracks small targets efficiently over a long effective distance, and the system performs well in real time [44].
While the improvement of the algorithms themselves remains a focus, the configuration of the experimental environment for actual deployment and the computational speed of the application are also important research concerns. In 2021, Pengfei Zhang from Xidian University conducted UAV target detection and tracking experiments on pedestrians and vehicles using a traditional model-free multi-target tracking algorithm and a multi-target tracking algorithm based on YOLOv4. The experimental results showed that, compared with the model-free multi-target tracking algorithm, the YOLOv4-based multi-target tracking algorithm better handled the disappearance of old targets and the emergence of new targets [45]. In 2023, Sun Bei et al. used segmenting objects by locations (SOLO) instance segmentation as the base network and applied inter-cross-attention-weighted fusion to the multi-scale features of the original network’s feature pyramid network (FPN) in order to enhance the feature preservation and semantic characterization of weakly textured targets for the real-time detection of camouflaged targets [46]. Yang from Jiangxi University of Science and Technology proposed a target detection algorithm based on the attention mechanism, using the YOLOX-Nano algorithm as the base algorithm, and conducted UAV visual detection and tracking experiments. The experiments used TensorRT, NVIDIA’s deep learning inference optimizer, which accelerates neural network computation, with a graphics memory footprint half that of the commonly used 32-bit floating-point (FP32) precision model [47]. The experimental configurations and the speeds at which the algorithms ran in these studies are summarized in Table 5.
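The memory saving reported for reduced-precision inference can be illustrated with a generic sketch. This uses PyTorch weight casting on an arbitrary ResNet-50 backbone rather than the TensorRT workflow of [47], so it only conveys the general idea that 16-bit weights occupy roughly half the memory of 32-bit weights.

```python
# Illustrative only: casting a backbone to half precision roughly halves the
# memory occupied by its weights (requires a recent torchvision release).
import torch
import torchvision

model_fp32 = torchvision.models.resnet50(weights=None)

def param_bytes(model: torch.nn.Module) -> int:
    """Total memory occupied by model parameters, in bytes."""
    return sum(p.numel() * p.element_size() for p in model.parameters())

fp32_bytes = param_bytes(model_fp32)
model_fp16 = model_fp32.half()          # cast parameters to 16-bit floats in place
fp16_bytes = param_bytes(model_fp16)

print(f"FP32 weights: {fp32_bytes / 1e6:.1f} MB")
print(f"FP16 weights: {fp16_bytes / 1e6:.1f} MB")  # roughly half of FP32
```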

4. Electric UAV Visual Navigation Technology

4.1. Electric UAV Navigation Technology

Electric UAV navigation technology uses various types of sensors mounted on the UAV’s fuselage to obtain information about the surrounding environment; the UAV also perceives its own state in real time so that the onboard embedded computing equipment controlling the airframe can execute the specified commands. The sensors carried by an aircraft can be classified into LiDAR, GPS, inertial, and visual navigation systems. The advantages of LiDAR include strong resistance to interference, high accuracy, and relatively simple calculations; however, its use is limited by the radar detection range. GPS navigation has the advantages of high real-time performance, wide coverage, and high accuracy and is commonly used for outdoor navigation; however, it is limited by geographic region, and in some mid-latitude areas and regions with weak signals there is an increased risk of signal loss and electromagnetic interference. Inertial navigation systems have the advantages of working independently, having a wide range of applications, and being highly robust; however, positioning errors accumulate continuously over time, and external information cannot be estimated [48]. Visual navigation mainly refers to the measurement of various navigation parameters from images of the ground captured by visual imaging equipment (visible light, infrared, SAR, etc.) onboard a vehicle. It has moderate accuracy, low cost, and a certain degree of resistance to electromagnetic interference [49]. Visual sensors can not only achieve effective localization and map building, but also perform target detection and extract useful semantic information about the environment, which makes them suitable for electric UAV guidance in emergency search and rescue scenarios.

4.2. Electric UAV Visual Navigation Technology

A schematic representation of UAV visual navigation is provided in Figure 10. A UAV visual navigation system usually consists of three modules: (i) an image acquisition module, (ii) a visual detection module, and (iii) a navigation control module. The image acquisition module acquires map-related information during the flight mission through the UAV’s onboard camera to provide data for subsequent visual navigation line detection. The visual detection module processes the images or videos acquired by the image acquisition module. The navigation control module analyzes the operating state of the UAV in real time, combines the pre-calculated navigation data to derive the desired trajectory command of the UAV, and continuously adjusts it according to the actual situation to generate the corresponding attitude control command [50].
Visual navigation and localization on UAVs comprise relative and absolute vision methods. Relative vision methods estimate the motion of the UAV relative to its starting position by analyzing the motion information between successive image frames. The main methods are visual odometry (VO) [51] and simultaneous localization and mapping (SLAM) [52,53,54]. VO estimates the UAV pose by analyzing successive images, inferring changes in pose from the differences between the current and previous frames. SLAM is built around the concept of place recognition and map building: a vehicle is placed into an unknown environment at an unknown location and determines its position on the map in real time using the platform’s own sensors and its initial position. Compared to visual SLAM, visual odometry does not need to maintain a global map, which reduces computation time and allows high real-time performance and position estimation accuracy. However, for visual odometry to obtain high-precision position information, the input images need sufficient environmental light, a high proportion of static objects in the scene, obvious image texture features, and a sufficient number of repeated features between consecutive frames. As a result, VO alone is not the best choice for UAV visual navigation.
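A minimal frame-to-frame visual odometry sketch, assuming OpenCV and a calibrated pinhole camera, is given below; the intrinsic matrix and image files are placeholders, and a real system would add keyframing, local mapping, and scale handling.

```python
# Minimal monocular visual odometry step: ORB features, essential-matrix
# estimation with RANSAC, and relative pose recovery between two frames.
import cv2
import numpy as np

K = np.array([[800.0, 0.0, 320.0],      # assumed pinhole intrinsics
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(prev, None)
kp2, des2 = orb.detectAndCompute(curr, None)

# Brute-force Hamming matching of binary ORB descriptors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Essential matrix, then decomposition into rotation R and unit translation t.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

print("Relative rotation:\n", R)
print("Translation direction (scale is unobservable with one camera):", t.ravel())
```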
The classical SLAM framework consists of the following modules: sensor information reading, local map construction and motion estimation (the front end), back-end nonlinear optimization, loop closure detection, and map building, as shown in Figure 11.
Sensor information reading in visual SLAM involves reading and preprocessing camera image information. According to the vision sensors used, visual SLAM can be classified into three categories: (i) monocular SLAM, using a single camera as the only external sensor; (ii) stereo SLAM, using multiple cameras as sensors; and (iii) depth-image (RGB-D) SLAM, based on a monocular camera combined with an infrared depth sensor. Monocular SLAM is simple, inexpensive, small, and lightweight; however, because it lacks depth information, it has lower accuracy for 3D modeling of the environment and is susceptible to changes in illumination and a lack of texture. Stereo SLAM offers better accuracy and stability than monocular SLAM and can provide depth information, which contributes to more accurate 3D reconstruction; however, its configuration and calibration are more complex, and its depth range is limited by the stereo baseline. RGB-D SLAM can obtain image and depth information at the same time, simplifying 3D reconstruction; however, it has a limited depth range, is affected by infrared interference, and its algorithms consume a large amount of power.
Obtaining local maps and achieving motion estimation means using the image information obtained from the camera, and the correlations between images, to recover the 3D motion of the camera and the appearance of the local map, as well as updating the map to reflect changes in the environment based on new observations. This includes adding new feature points and adjusting the positions of features on the map.
Back-end nonlinear optimization addresses a much larger-scale optimization problem, considering the optimal trajectory and map over long periods of time.
Loop closure detection optimizes the consistency of the entire trajectory by detecting closed loops (i.e., the drone passing through the same location) along the camera trajectory. This helps reduce cumulative errors and improves the consistency and accuracy of the map.
The maps created define the spatial layout of the environment in advance, supporting the UAV’s wayfinding and motion planning capabilities.
The problem of error accumulation is the main drawback of the relative vision approach, since it relies on a continuous sequence of images. Absolute vision methods use pre-collected data, such as geo-tagged, ortho-rectified satellite images, to localize non-geo-tagged UAV images. They have high application value in long-duration flights because they do not rely on information from historical frames, which fundamentally avoids cumulative errors. Scene matching is a common absolute vision method that matches pre-stored aerial imagery in the UAV’s onboard system against image information captured by the UAV in real-time flight for localization. By matching the aerial image, used as a reference image, with the real-time picture from the UAV, fluctuations in navigation accuracy due to input uncertainty are reduced. Moreover, since the method locates the UAV by means of a reference image with geographic information, it only needs to ensure that the matching accuracy between the real-time image and the reference image is high enough to achieve high navigation and positioning accuracy [55].
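A minimal scene-matching sketch, assuming OpenCV's normalized cross-correlation template matching and a hypothetical geotransform for the reference map, is shown below; a fielded system would use more robust, viewpoint-invariant matching.

```python
# Minimal scene matching against a georeferenced reference image via
# normalized cross-correlation; file names and the geotransform are placeholders.
import cv2

reference = cv2.imread("geo_reference_map.png", cv2.IMREAD_GRAYSCALE)  # pre-stored map
live = cv2.imread("uav_live_view.png", cv2.IMREAD_GRAYSCALE)           # onboard frame
# Note: the live frame must be smaller than the reference map for matchTemplate.

# Slide the live frame over the reference and score similarity at every offset.
scores = cv2.matchTemplate(reference, live, cv2.TM_CCOEFF_NORMED)
_, best_score, _, best_xy = cv2.minMaxLoc(scores)

# Pixel offset of the best match; converting it to geographic coordinates needs
# the reference image's (assumed) origin and ground sampling distance.
col, row = best_xy
origin_east, origin_north, gsd = 500000.0, 4300000.0, 0.5   # hypothetical values
east = origin_east + col * gsd
north = origin_north - row * gsd

print(f"Match score {best_score:.2f} at pixel ({col}, {row}) -> ({east:.1f} E, {north:.1f} N)")
```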
A comparison of the advantages and disadvantages of relative and absolute visual localization is provided in Table 6.

4.3. Current Status of Research on Electric UAV Visual Navigation Technology

Research on UAV visual navigation has been a topic of interest in recent years. Oh et al. [56] and McGuire et al. [57] investigated the use of optical flow field methods for UAV visual navigation in 2004 and 2017, respectively. Navigational localization using sequential images is also an important direction that has been focused on by Metni [58], Guo [59], and Wang et al. [60]. Conte et al. [61] delved into navigation methods using visual odometry and picture matching. UAV visual navigation techniques incorporating inertial measurement unit (IMU) data are also one of the hotspots of research, which have been investigated in [62,63,64,65].
The current research direction in electric UAV visual navigation mainly includes the following three aspects.
(1)
Integrating other navigation technologies
In addition to visual sensors, electric UAVs are often equipped with inertial navigation systems, and visual information can compensate for the cumulative errors of inertial navigation devices. The real-time information provided by vision can be fused with inertial navigation information to improve navigation accuracy. Conte and Doherty combined inertial navigation devices, image-sequence motion estimation, and view-matching methods to design an electric UAV visual navigation system and conducted experimental tests on an RMAX helicopter in real scenarios [66,67]. A minimal sketch of this type of visual and inertial fusion is provided at the end of this subsection.
(2)
Integration with deep learning
As deep learning for image processing has matured, it has been increasingly applied in the field of visual SLAM. At present, its application in this field has two main directions: replacing a processing module in the SLAM algorithm framework with a neural network, and using deep learning to train semantic map labels, which helps electric UAVs better interpret image information semantically. In 2013, Sermanet et al. proposed the OverFeat target detection model, which uses a neural network in place of the sliding-window-based target localization of traditional detection methods in the last convolutional layer. It is an integrated method capable of detecting, recognizing, and localizing targets [68].
(3)
Mitigating feature dependence
Dependence on scene features is a major limitation of electric UAV visual navigation methods. When environmental features are insufficient or the image is blurred because of UAV motion, the UAV obtains too little information about the environment, and the visual navigation information can even play a counterproductive role, consuming a large amount of the electric UAV’s computing power and endurance. Direct methods, that is, methods that operate directly on pixels, are mostly used to reduce the dependence on features. Triggs et al. investigated the theory and methods of bundle adjustment, which tracks only sparse feature points directly and achieves high efficiency. In addition, semi-dense or dense maps can be constructed directly as inputs for subsequent processing, rather than providing only the sparse landmark points of the feature-point method [69]. Robertson utilized edge image information, with the majority of semi-dense regions located at object edges, and avoided the use of point features, which carry a low level of image information [70].
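Returning to the fusion direction in point (1), the following is a minimal sketch of a complementary filter that blends an IMU-propagated position with intermittent visual position fixes; all signals, gains, and rates here are synthetic illustrations rather than values from the cited systems.

```python
# Toy complementary filter: inertial prediction corrected by occasional visual fixes.
import numpy as np

ALPHA = 0.98   # weight given to the inertial prediction (assumed tuning value)
DT = 0.01      # IMU sample period in seconds

def fuse_step(pos, vel, accel, visual_pos=None):
    """One filter step: integrate the IMU, then correct with vision if available."""
    vel = vel + accel * DT                 # integrate acceleration to velocity
    pred = pos + vel * DT                  # inertial position prediction
    if visual_pos is None:                 # no visual fix this step
        return pred, vel
    fused = ALPHA * pred + (1.0 - ALPHA) * visual_pos
    return fused, vel

# Toy run: constant acceleration along x, with a noisy visual fix every 10 IMU samples.
pos, vel = np.zeros(3), np.zeros(3)
accel = np.array([0.1, 0.0, 0.0])
for k in range(100):
    visual = pos + np.random.normal(0.0, 0.05, 3) if k % 10 == 0 else None
    pos, vel = fuse_step(pos, vel, accel, visual)
print("Fused position after 1 s:", pos)
```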

5. Application of Visual Detection and Navigation Technology in Electric UAV Search and Rescue

The combined application of visual detection and navigation technologies in electric UAV emergency rescue missions can effectively improve the efficiency of UAV search and rescue. Visual detection technology can help electric UAVs quickly distinguish between search and rescue targets, whereas visual navigation technology can use the collected image information to autonomously plan rescue routes. The combined application can realize efficient and autonomous emergency rescue. There have been research cases in which visual detection technology has been comprehensively applied to electric UAV search and rescue.
(1)
Wilderness search and rescue (SAR) scenarios
In wilderness SAR scenarios, several teams have investigated the application of visually detected navigation UAVs for wilderness search and rescue (WiSAR). Goodrich’s team introduced and demonstrated a camera-equipped micro-UAV (mini-UAV) in WiSAR. Researchers used the technology to successfully locate a simulated missing person in a wilderness area [71]. Thereafter, Pelosi et al. investigated the potential for improvement of their search paths [72]; McConkey et al. employed YOLOv3 and Open Computer Vision (OpenCV) for potential target detection in WiSAR [73]; Broyles et al. introduced the WiSAR dataset for supporting their vision-based algorithms [74].
In other wilderness SAR studies, in 2017, engineers at the University of South Australia and Middle Technical University in Baghdad designed an electric UAV-mounted life-detector system, as shown in Figure 12. The system remotely monitors vital signs using a computer vision system that can distinguish survivors from deceased persons at a distance of four to eight meters. As long as the upper half of the human body is visible, the camera can capture the minute movements of the chest that are indicative of heartbeat and respiratory rate. The system applies to a wide range of catastrophic accident scenarios, including emergency search and rescue for earthquakes, floods, nuclear accidents, chemical explosions, biological attacks, mass shootings, combat searches, and aviation accidents [75].
Shao et al. designed a 24 h infrared visual field search and rescue electric UAV to address the poor environmental conditions in the wilderness that make search and rescue difficult. The UAV first acquires samples of the search and rescue area through an onboard infrared thermal camera, uses the histogram of oriented gradients (HOG) for feature extraction, employs a support vector machine (SVM) to classify and identify the HOG features of the image, and then uses complementary filtering to fuse GPS, barometer, and inertial navigation data to determine the location of the person in distress. Practical simulation tests have also been conducted; the results show that the designed SAR UAV has a high recognition rate and positioning accuracy for human target detection and could complete wilderness search and rescue tasks in a variety of real wilderness scenarios [76].
(2)
Maritime search and rescue scenarios
For maritime SAR scenarios, Dinnbier et al. proposed a method that combines color analysis and frequency pattern recognition using an inexpensive electro-optical (EO) camera, which helps reduce the cost of maritime UAV visual search and rescue [77].
Liu et al. invented an intelligent assisted search and rescue system for people in distress in water based on visual perception and computation. The system includes a multirotor electric UAV flight platform, a base-integrated processing platform, and an unmanned boat rescue platform. The UAV carries a video acquisition module that collects video images of people in distress in the target area and transmits them to the base-integrated processing platform, which processes the collected images using the dark channel prior method and deep learning for the autonomous identification and detection of people in the water [78]. Li et al. focus on active vision for UAV search, analyzing how flight altitude affects the accuracy of object detection algorithms [79].
(3)
Mountain search and rescue scenarios
In mountain rescue scenarios, Marušić et al. focused on how to use aerial imagery for mountain rescue [80], and Russkin’s team built a multi-sensor UAV system for mountain snow rescue in which visual sensors are an important component [81].
(4)
Combining deep learning for visual detection
The application of visual detection combined with deep learning is one of the latest research directions for rescue UAVs. Castellano et al. presented a new dataset dedicated to UAV search and rescue operations using computer vision, in addition to providing baseline results obtained on this dataset using the state-of-the-art TinyYOLOv3 object detector [82]. Dong’s team similarly used YOLOv3 to train a survivor detection model for UAVs [83]. Trong et al. proposed an autonomous detection and control algorithm for vertical takeoff and landing UAVs combined with the YOLOv7 algorithm to detect victims at sea in a rescue scenario under low light intensity. The simulation results showed that when the effects of low light intensity were mitigated through deep learning-based image enhancement, the number of sea victims detected increased by 62.6% compared to the original data [84].
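As a rough illustration of this enhancement-then-detection idea, the sketch below substitutes classical CLAHE contrast enhancement for the deep learning enhancement in [84] and an off-the-shelf detector (the ultralytics YOLO package, assumed to be installed with pretrained weights) for the victim-detection network; the input file is a placeholder.

```python
# Illustrative pipeline only: classical contrast enhancement followed by a
# generic pretrained detector, standing in for the methods described in [84].
import cv2
from ultralytics import YOLO

frame = cv2.imread("sea_rescue_lowlight.jpg")          # hypothetical night-time frame

# Enhance luminance only: convert to LAB and apply CLAHE to the L channel.
lab = cv2.cvtColor(frame, cv2.COLOR_BGR2LAB)
l, a, b = cv2.split(lab)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)

# Detect people in the enhanced frame with a pretrained general-purpose model.
model = YOLO("yolov8n.pt")
results = model(enhanced, conf=0.25)
for box in results[0].boxes:
    if int(box.cls) == 0:                              # COCO class 0 is "person"
        print("Possible person at", box.xyxy.tolist())
```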

6. New Challenges

Electric UAVs encounter certain challenges when applying visual target detection and localization techniques in complex emergency rescue environments, particularly in urban emergency search and rescue missions. Owing to the complexity of urban environments, visual detection techniques often face interference from the surroundings, such as densely packed buildings and post-disaster rubble, in which the detected target may blend into the background or become obscured and therefore difficult to identify. In addition, lighting significantly affects the detector’s ability to identify a target, and emergency search operations often span long periods, making it difficult to ensure suitable lighting conditions throughout the operation. The diversity of targets also affects the accuracy of visual detection: for individuals within the same category, the detector needs a sufficiently large sample size to learn the category so that individuals with varying appearance characteristics are still grouped correctly. Whether detection models can be quickly trained for different search and rescue environments and deliver accurate target detection determines the efficiency and success of urban emergency search and rescue with electric UAVs.
On the other hand, several urgent technical problems regarding visual navigation technology are known, mainly involving aspects of the hardware and software:
(1)
Hardware
(a)
Limited carrying capacity of UAVs.
Due to the small size of a UAV and the limited lift it can generate, its carrying capacity is relatively limited, and excessive loads affect the stability of the UAV’s flight attitude as well as its endurance. If the visual navigation function of the UAV is to be realized, a corresponding onboard processor must also be carried, which inevitably aggravates the load pressure on the UAV.
(b)
Limited memory of the UAV’s onboard processor
Vision-based UAV navigation requires processing the images captured by the camera, using video streams or pictures obtained in real time as the key visual input signals. For vision-based navigation systems, view matching also requires pre-stored reference maps, which take up memory space in the onboard system. Consequently, too little onboard memory results in fewer reference images being available, which in turn affects navigation performance.
(c)
Limited computational performance of UAV onboard systems
Currently, the performance of UAV onboard systems is generally lower than that of an ordinary desktop computer, which increases the time needed by navigation algorithms running on the onboard equipment. If the navigation algorithm is instead run on the ground, with the ground station and the UAV’s visual equipment exchanging information, another potential problem is time-consuming and unstable information transmission.
(2)
Software
(a)
Visual navigation algorithms are greatly affected by environmental factors
UAV visual navigation involves collecting environmental images to extract features and identify detection targets. Different weather conditions, changes in light, and the obstruction of surrounding terrain objects and other environmental factors affect the quality of image acquisition and the accuracy of the visual detection and navigation algorithm.
(b)
Problems of positioning accuracy and real-time performance of UAV visual navigation algorithms
In feature-based UAV navigation systems, the quality of images acquired in real time varies. Since image processing algorithms operate on the pixels of the images, image quality directly affects the performance of the processing algorithms, which ultimately affects the accuracy of the navigation algorithms. Moreover, the maximum flight speed of UAVs has increased greatly, and during rapid flight the real-time requirements for navigation are higher; as a result, the system must obtain accurate aircraft position information in a shorter period of time.

7. Conclusions

Electric UAVs have significant advantages, including their small size, light weight, low energy consumption, wide field of view, ease of use, and flexible deployment. They are capable of performing emergency rescue tasks under various scenarios and have been gradually adopted for disaster relief in various countries in recent years. Visual detection recognizes the target information autonomously online, and visual navigation technology can provide positioning information indoors or in areas with weak signals. Combining the two methods with electric UAVs can further enhance the efficiency of rescue missions. With the development of visual detection and navigation technologies, their combination with UAVs will surely become an important research and application direction for urban emergency rescue in the future.

Author Contributions

Conceptualization, P.T. and H.S.; formal analysis, P.T. and J.L.; investigation, J.L.; resources, H.S.; data curation, J.L.; writing—original draft preparation, J.L.; writing—review and editing, P.T. and H.S.; visualization, H.S.; supervision, P.T. and H.S.; project administration, P.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this article can be found at: https://www.gddat.cn/newGlobalWeb/#/DisasBrowse (accessed on 27 June 2023); https://www.emdat.be/ (accessed on 27 June 2023); https://enterprise.dji.com/drone-rescue-map/#map1694488139399 (accessed on 13 September 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chu, Z. Application of UAV in Urban Emergency Command Information System. Geomat. Technol. Equip. 2012, 14, 61–63. [Google Scholar] [CrossRef]
  2. Qin, W.; Lin, X.; Zhao, W.; Liu, X. Design of AI-technology-based Decision Support System for UAN-based Urban Emergency Rescue. Chin. Med. Equip. J. 2019, 40, 38–43. [Google Scholar] [CrossRef]
  3. Ding, Z. Task Allocation Technology of Unmanned Aerial Vehicles for Emergency Relief in an Urban Terrain. Master’s Thesis, Nanjing University of Aeronautics and Astronautics, Nanjing, China, 2016. [Google Scholar]
  4. Zhou, J.; Gong, J.; Wang, T.; Wang, D.; Yang, L.; Zhao, X.; Hong, Y.; Zhao, Z. Study on UAV Remote Sensing Image Acquiring and Visualization Management System for the Area Affected by 5.12 Wenchuan Earthquake. J. Remote Sens. 2008, 12, 877–884. [Google Scholar]
  5. Cao, F. Application of Drones in Urban Emergency Command Information Systems. Sci. Technol. Innov. 2016, 3, 128. [Google Scholar] [CrossRef]
  6. Tencent: Another Drone Seen Taking off at Tianjin Explosion Site. Available online: https://finance.qq.com/original/zibenlun/drone.html (accessed on 9 October 2023).
  7. Hashim, A.S.; Mohamad Tamizi, M.S. Development of Drone for Search and Rescue Operation in Malaysia Flood Disaster. Int. J. Eng. Technol. 2018, 7, 9–12. [Google Scholar] [CrossRef]
  8. NetEase: DJI M300 RTK Leads Largest Drone Rescue Mission in European History. Available online: https://www.163.com/dy/article/GAK4CAK305149OCK.html (accessed on 9 October 2023).
  9. Li, Z.; Hong, Z.; Sun, C. Introduction to the Commonly Used Search and Rescue Means and Characteristics. In Proceedings of the 2012 China Fire Protection Association Science and Technology Annual Conference, Guangzhou, China, 18–19 September 2012. [Google Scholar]
  10. Sun, D.; Sun, J.; Cao, J.; Meng, Y.; Zhang, Y. Study on Expert System of Urban Typical Disaster Emergency Handling Based on Case Reasoning. J. Saf. Sci. Technol. 2012, 19, 55–60. [Google Scholar] [CrossRef]
  11. Global Disaster Data Platform. Available online: https://www.gddat.cn/ (accessed on 27 June 2023).
  12. EM-DAT The International Disaster Database. Available online: https://www.emdat.be/ (accessed on 27 June 2023).
  13. DJI Drone Rescue Map. Available online: https://enterprise.dji.com/drone-rescue-map/#map1694488139399 (accessed on 13 September 2023).
  14. Liu, W. Vision-Based UAV Detection and Tracking Technology Research. Master’s Thesis, University of Electronic Science and Technology of China, Chengdu, China, 2022. [Google Scholar]
  15. Xu, Y. Research on the Rapid Detection Technology of Vision-Oriented UAV Aerial Targets. Master’s Thesis, Dalian Maritime University, Dalian, China, 2018. [Google Scholar] [CrossRef]
  16. Freund, Y.; Schapire, R.E. A Desicion-Theoretic Generalization of On-Line Learning and An Application to Boosting. In Proceedings of the II European Conference on Computational Learning Theory, Barcelona, Spain, 13–15 March 1995. [Google Scholar] [CrossRef]
  17. Felzenszwalb, P.F.; Girshick, R.B.; McAllester, D.; Ramanan, D. Object Detection with Discriminatively Trained Part-Based Models. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 1627–1645. [Google Scholar] [CrossRef]
  18. Zhang, S.; Wang, X. Human Detection and Object Tracking Based on Histograms of Oriented Gradients. In Proceedings of the 2013 Ninth International Conference on Natural Computation (ICNC), Shenyang, China, 23–25 July 2013. [Google Scholar] [CrossRef]
  19. Surasak, T.; Takahiro, I.; Cheng, C.-H.; Wang, C.-E.; Sheng, P.-Y. Histogram of Oriented Gradients for Human Detection in Video. In Proceedings of the 2018 5th International Conference on Business and Industrial Research (ICBIR), Bangkok, Thailand, 17–18 May 2018. [Google Scholar] [CrossRef]
  20. Papageorgiou, C.P.; Oren, M.; Poggio, T. A General Framework for Object Detection. In Proceedings of the Sixth International Conference on Computer Vision, Bombay, India, 7 January 1998; Volume 1, pp. 555–562. [Google Scholar] [CrossRef]
  21. Lienhart, R.; Maydt, J. An Extended Set of Haar-like Features for Rapid Object Detection. In Proceedings of the International Conference on Image Processing, Rochester, NY, USA, 22–25 September 2002. [Google Scholar] [CrossRef]
  22. Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. Int. J. Comput. 2004, 60, 91–110. [Google Scholar] [CrossRef]
  23. Herbert, B.; Andreas, E.; Tinne, T.; Luc, V.G. Speeded-Up Robust Features (SURF). Comput. Vis. Image Underst. 2008, 110, 346–359. [Google Scholar] [CrossRef]
  24. Liu, Y. Research of Vision-Based UAV Target Detection and Tracking and Its Autoland Technique. Master’s Thesis, Huazhong University of Science and Technology, Wuhan, China, 2019. [Google Scholar]
  25. Yao, Q.; Hu, X.; Lei, H. Application of Deep Convolutional Neural Network in Object Detection. Comput. Eng. Appl. 2018, 54, 1–9. [Google Scholar] [CrossRef]
  26. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
  27. Gao, Z.; Wang, L.; Zhou, L.; Zhang, J. HEp-2 Cell Image Classification with Deep Convolutional Neural Networks. IEEE J. Biomed. Health Inform. 2016, 21, 416–428. [Google Scholar] [CrossRef]
  28. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014. [Google Scholar] [CrossRef]
  29. Liu, L.; Ouyang, W.; Wang, X.; Fieguth, P.; Chen, J.; Liu, X.; Pietikäinen, M. Deep Learning for Generic Object Detection: A Survey. Int. J. Comput. Vis. 2020, 128, 261–318. [Google Scholar] [CrossRef]
  30. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar] [CrossRef]
  31. Farhadi, A.; Redmon, J. YOLO9000: Better, Faster, Stronger. arXiv 2016. [Google Scholar] [CrossRef]
  32. Solunke, B.R.; Gengaje, S.R. A Review on Traditional and Deep Learning based Object Detection Methods. In Proceedings of the 2023 International Conference on Emerging Smart Computing and Informatics (ESCI), Pune, India, 1–3 March 2023. [Google Scholar] [CrossRef]
  33. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef] [PubMed]
  34. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017. [Google Scholar] [CrossRef]
  35. Kumar, G.; Bhatia, P.K. A Detailed Review of Feature Extraction in Image Processing Systems. In Proceedings of the 2014 Fourth International Conference on Advanced Computing & Communication Technologies, Rohtak, India, 8–9 February 2014. [Google Scholar] [CrossRef]
  36. Zhao, Z.-Q.; Zheng, P.; Xu, S.-T.; Wu, X. Object Detection With Deep Learning: A Review. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 3212–3232. [Google Scholar] [CrossRef] [PubMed]
  37. Jiao, J.; Zhang, F.; Zhang, L. Remote Sensing Estimation of Rape Planting Area Based on Improved AlexNet Model. Comput. Meas. Control. 2018, 26, 186–189. [Google Scholar] [CrossRef]
  38. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2015. [Google Scholar] [CrossRef]
  39. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar] [CrossRef]
  40. Olson, C.F.; Huttenlocher, D. Automatic Target Recognition by Matching Oriented Edge Pixels. IEEE Trans. Image Process. 1997, 6, 103–113. [Google Scholar] [CrossRef]
  41. Ali, S.; Shah, M. COCOA—Tracking in Aerial Imagery. In Proceedings of the Airborne Intelligence, Surveillance, Reconnaissance(ISR) Systems and Applications III, Orlando, FL, USA, 5 May 2006. [Google Scholar] [CrossRef]
  42. Dobrokhodov, V.; Yakimenko, O.; Jones, K.; Kaminer, I.; Bourakov, E.; Kitsios, I.; Lizarraga, M. New Generation of Rapid Flight Test Prototyping System for Small Unmanned Air Vehicles. In Proceedings of the AIAA Modeling and Simulation Technologies Conference and Exhibit, Hilton Head, SC, USA, 20–23 August 2007. [Google Scholar] [CrossRef]
  43. Dong, Q.; Zou, Q. Visual UAV Detection Method with Online Feature Classification. In Proceedings of the 2017 IEEE 2nd Information Technology, Networking, Electronic and Automation Control Conference, Chengdu, China, 15–17 December 2017. [Google Scholar]
  44. Shao, P. Design and Implementation of Vision Based Drone Intrusion Detection and Tracking System. Master’s Thesis, Zhejiang University, Hangzhou, China, 2018. [Google Scholar]
  45. Zhang, P. Visual Object Detection and Tracking Technology of UAV. Master’s Thesis, Xidian University, Xi’an, China, 2021. [Google Scholar]
  46. Sun, B.; Dang, Z.; Wu, P.; Yuan, S.; Guo, R. Multi Scale Cross Attention Improved Method of Single Unmanned Aerial Vehicle for Ground Camouflage Target Detection and Localization. Chin. J. Sci. Instrum. 2023, 44, 54–65. [Google Scholar] [CrossRef]
  47. Yang, H. Research on Target Detection and Tracking Technology Based on UAV Perspective. Master’s Thesis, Jiangxi University of Science and Technology, Ganzhou, China, 2023. [Google Scholar]
  48. Li, C. Small Fixed-Wing UAV Vision Navigation Algorithm Research. Master’s Thesis, Shanghai Jiaotong University, Shanghai, China, 2020. [Google Scholar]
  49. Chen, M. Research on Technology and System of Vision-based UAV Autonomous Landing Navigation. Master’s Thesis, Nanjing University of Aeronautics and Astronautics, Nanjing, China, 2017. [Google Scholar]
  50. Tang, F. Research on Visual Navigation Technology of Agricultural UAV Based on ROS. Master’s Thesis, Qilu University of Technology, Jinan, China, 2023. [Google Scholar]
  51. Meng, Z.; Kong, X.; Meng, L.; Tomiyama, H. Stereo Vision-Based Depth Estimation, 1st ed.; Springer: Singapore, 2020; pp. 1209–1216. [Google Scholar]
  52. Tateno, K.; Tombari, F.; Laina, I.; Navab, N. CNN-SLAM: Real-time Dense Monocular SLAM with Learned Depth Prediction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar] [CrossRef]
  53. Pumarola, A.; Vakhitov, A.; Agudo, A.; Sanfeliu, A. PL-SLAM: Real-time Monocular Visual SLAM with Points and Lines. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–1 June 2017. [Google Scholar] [CrossRef]
  54. Wei, W.; Lu, L.; Jin, G.; Tan, L.; Chen, D. A review of monocular SLAM-based UAV visual navigation research. Aerosp. Technol. 2020, 24, 17–22. [Google Scholar] [CrossRef]
  55. Zhang, H. Research on Feature Matching Algorithm in UAV Visual Navigation. Master’s Thesis, Huazhong University of Science and Technology, Wuhan, China, 2021. [Google Scholar]
  56. Oh, P.Y.; Green, W.E.; Barrows, G. Neural Nets and Optic Flow for Autonomous MAV Navigation. In Proceedings of the 2004 International Mechanical Engineering Congress and Exposition, Anaheim, CA, USA, 13–19 November 2004. [Google Scholar] [CrossRef]
  57. McGuire, K.; de Croon, G.; De Wagter, C.; Tuyls, K.; Kappen, H. Efficient Optical Flow and Stereo Vision for Velocity Estimation and Obstacle Avoidance on an Autonomous Pocket Drone. IEEE Robot. Autom. Lett. 2017, 2, 1070–1076. [Google Scholar] [CrossRef]
  58. Metni, N.; Hamel, T.; Derkx, F. Visual Tracking Control of Aerial Robotic Systems with Adaptive Depth Estimation. In Proceedings of the 44th IEEE Conference on Decision and Control, Seville, Spain, 15 December 2005. [Google Scholar] [CrossRef]
  59. Guo, D.; Zhong, M.; Ji, H.; Liu, Y.; Yang, R. A Hybrid Feature Model and Deep Learning Based Fault Diagnosis for Unmanned Aerial Vehicle Sensors. Neurocomputing 2018, 319, 155–163. [Google Scholar] [CrossRef]
  60. Wang, L.; Bi, S.; Lu, X.; Gu, Y.; Zhai, C. Deformation Measurement of High-speed Rotating Drone Blades Based on Digital Image Correlation Combined with Ring Projection Transform and Orientation Codes. Measurement 2019, 148, 106–121. [Google Scholar] [CrossRef]
  61. Conte, G.; Doherty, P. An Integrated UAV Navigation System Based on Aerial Image Matching. In Proceedings of the 2008 IEEE Aerospace Conference, Big Sky, MT, USA, 1–8 March 2008. [Google Scholar] [CrossRef]
  62. Stowers, J.; Bainbridge-Smith, A.; Hayes, M.; Mills, S. Optical Flow for Heading Estimation of a Quadrotor Helicopter. Int. J. Micro Air Veh. 2009, 1, 229–239. [Google Scholar] [CrossRef]
  63. Li, Y.; Wang, Y.; Luo, H.; Chen, Y.; Jiang, Y. Landmark Recognition for UAV Autonomous Landing Based on Vision. Appl. Res. Comput. 2012, 29, 2780–2783. [Google Scholar]
  64. Xu, Y.; Pan, L.; Du, C.; Li, J.; Jing, N.; Wu, J. Vision-based UAVs Aerial Image Localization: A Survey. In Proceedings of the 2nd ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, Seattle, WA, USA, 6 November 2018. [Google Scholar]
  65. Wang, R.; Wan, W.; Wang, Y.; Di, K. A New RGB-D SLAM Method with Moving Object Detection for Dynamic Indoor Scenes. Remote Sens. 2019, 11, 1143–1150. [Google Scholar] [CrossRef]
  66. Conte, G.; Doherty, P. Vision-Based Unmanned Aerial Vehicle Navigation Using Geo-Referenced Information. EURASIP J. Adv. Signal Process. 2009, 9, 387308. [Google Scholar] [CrossRef]
  67. Conte, G.; Doherty, P. A Visual Navigation System for UAS Based on Geo-Referenced Imagery. In Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Zurich, Switzerland, 14–16 September 2011. [Google Scholar] [CrossRef]
  68. Sermanet, P.; Eigen, D.; Zhang, X.; Mathieu, M.; Fergus, R.; LeCun, Y. OverFeat: Integrated Recognition, Localization and Detection Using Convolutional Networks. arXiv 2013. [Google Scholar] [CrossRef]
  69. Dalal, N.; Triggs, B. Histograms of Oriented Gradients for Human Detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision & Pattern Recognition, San Diego, CA, USA, 20–25 June 2005. [Google Scholar] [CrossRef]
  70. Gonzalez, R.C.; Woods, R.E. Digital Image Processing, 2nd ed.; Publishing House of Electronics Industry: Beijing, China, 2006; Volume 1, pp. 567–634. [Google Scholar]
  71. Goodrich, M.A.; Lin, L.; Morse, B.S. Using Camera-equipped Mini-UAVS to Support Collaborative Wilderness Search and Rescue Teams. In Proceedings of the 2012 International Conference on Collaboration Technologies and Systems (CTS), Denver, CO, USA, 21–25 May 2012. [Google Scholar] [CrossRef]
  72. Pelosi, M.; Brown, M.S. Improved Search Paths for Camera-equipped UAVS in Wilderness Search and Rescue. In Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence (SSCI), Honolulu, HI, USA, 27 November–1 December 2017. [Google Scholar] [CrossRef]
  73. McConkey, J.; Liu, Y. Semi-Autonomous Control of Drones/UAVs for Wilderness Search and Rescue. In Proceedings of the 2023 8th International Conference on Automation, Control and Robotics Engineering (CACRE), Hong Kong, China, 13–15 July 2023. [Google Scholar] [CrossRef]
  74. Broyles, D.; Hayner, C.R.; Leung, K. WiSARD: A Labeled Visual and Thermal Image Dataset for Wilderness Search and Rescue. In Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, 23–27 October 2022. [Google Scholar] [CrossRef]
  75. Al-Naji, A.; Perera, A.G.; Mohammed, S.L.; Chahl, J. Life Signs Detector Using a Drone in Disaster Zones. Remote Sens. 2019, 11, 2441. [Google Scholar] [CrossRef]
  76. Shao, Y.; Zhang, D.; Chu, H.; Chang, Z.; Zhan, H.; Rao, Y. Design and implementation of outdoor search and rescue UAV based on infrared vision. Transducer Microsyst. Technol. 2019, 38, 104–106, 109. [Google Scholar] [CrossRef]
  77. Dinnbier, N.M.; Thueux, Y.; Savvaris, A.; Tsourdos, A. Target Detection Using Gaussian Mixture Models and Fourier Transforms for UAV Maritime Search and Rescue. In Proceedings of the 2017 International Conference on Unmanned Aircraft Systems (ICUAS), Miami, FL, USA, 4–6 November 2017. [Google Scholar] [CrossRef]
  78. Liu, W.; Zhang, R.; Sun, R.; Wu, Z.; He, J.; Ma, Q. Intelligent Auxiliary Search and Rescue System for People in Distress on Water Based on Visual Perception and Calculation. Patent No. CN112016373A, 1 December 2020. [Google Scholar]
  79. Li, Q.; Taipalmaa, J.; Queralta, J.P. Towards Active Vision with UAVs in Marine Search and Rescue: Analyzing Human Detection at Variable Altitudes. In Proceedings of the 2020 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), Abu Dhabi, United Arab Emirates, 4–6 November 2020. [Google Scholar] [CrossRef]
  80. Marušić, Ž.; Zelenika, D.; Marušić, T.; Gotovac, S. Visual Search on Aerial Imagery as Support for Finding Lost Persons. In Proceedings of the 2019 8th Mediterranean Conference on Embedded Computing (MECO), Budva, Montenegro, 10–14 June 2019. [Google Scholar] [CrossRef]
  81. Russkin, A.; Alekhin, M.; Iskhakova, A. Functional Requirements Synthesis in Creation of Modular UAV Multisensory System Payload for Mountain Snow Search and Rescue Missions. In Proceedings of the 2021 International Siberian Conference on Control and Communications (SIBCON), Kazan, Russia, 13–15 May 2021. [Google Scholar] [CrossRef]
  82. Castellano, G.; Castiello, C.; Mencar, C.; Vessio, G. Preliminary Evaluation of TinyYOLO on a New Dataset for Search-and-Rescue with Drones. In Proceedings of the 2020 7th International Conference on Soft Computing & Machine Intelligence (ISCMI), Stockholm, Sweden, 14–15 November 2020. [Google Scholar] [CrossRef]
  83. Dong, J.; Ota, K.; Dong, M. UAV-Based Real-Time Survivor Detection System in Post-Disaster Search and Rescue Operations. IEEE J. Miniaturization Air Space Syst. 2021, 2, 209–219. [Google Scholar] [CrossRef]
  84. Trong, T.D.; Khai, V.D.; Duy, T.B.; Van, M.V. A Scheme of Autonomous Victim Search at Sea Based on Deep Learning Technique Using Cooperative Networked UAVs. In Proceedings of the 2023 12th International Conference on Control, Automation and Information Sciences (ICCAIS), Hanoi, Vietnam, 27–29 November 2023. [Google Scholar] [CrossRef]
Figure 1. Global statistics on the frequency of major natural disasters, 2013–2022.
Figure 2. Global statistics on the number of people affected/dead and missing owing to major natural disasters, 2013–2022.
Figure 3. Global statistics on direct economic losses from the impact of major natural disasters, 2013–2022. Note: Data on direct economic losses in 2022 and direct economic losses as a percentage of GDP for 2019–2022 are not available.
Figure 4. Global frequency statistics for major man-made disasters, 2013–2022.
Figure 5. Global statistics on the number of people affected/fatalities by major man-made disasters, 2013–2022.
Figure 6. Yearly counts of electric UAV deployments in global rescue operations, as reported by the sample company.
Figure 7. Flowchart of traditional target detection methods.
Figure 8. MBTIQR detection tracking effect [40].
Figure 9. COCOA UAV tracking of ground vehicles [41]. (a) Frame 515, (b) Frame 570, (c) Frame 624, (d) Frame 665.
Figure 10. UAV visual navigation system.
Figure 11. Classical SLAM framework.
Figure 12. Electric UAV cameras detect signs of life [75].
Table 1. Sample company electric UAV global rescue statistics.
District | Years of Statistics | Number of Rescues
Asia | 2015–2023 | 78
Oceania | 2018–2023 | 12
Africa | 2016–2023 | 7
Europe | 2017–2023 | 179
North America | 2013–2023 | 303
Latin America | 2018–2023 | 21
Table 2. Sample company statistics on emergency rescue electric UAV models and carrying equipment.
Application Scenario | Common Application Models | Loaded Device | Advantages of Electric UAVs
Flood rescue | DJI Mavic 3T, Matrice 30 Series, Matrice 350 RTK | Vision + infrared sensors, visible-light cameras, infrared cameras, laser rangefinders, searchlights, loudspeakers, etc. | Can assess the disaster situation intuitively and quickly and support visualized command
Fire rescue | DJI Mavic 3T, Matrice 30T, Matrice 350 RTK | Vision + infrared sensors, visible-light cameras, infrared cameras, laser rangefinders, etc. | Fire information can be labelled quickly
Geological disaster rescue | DJI Mavic 3T, Matrice 350 RTK | Vision + infrared sensors, visible-light cameras, infrared cameras, payload droppers, searchlights, loudspeakers, communication relay equipment, etc. | Can conduct aerial reconnaissance and disaster assessment, collect disaster data, drop supplies, provide lighting, and serve as a temporary communications relay
Table 3. Comparison of the advantages and disadvantages of common feature description operators.
Feature Description Operator | Advantages | Disadvantages
HOG | Captures local shape information; robust to illumination changes; compresses high-dimensional gradient information into lower-dimensional feature vectors, reducing computational complexity; unaffected by changes in target geometry; pairs naturally with support vector machines (SVMs) | Sensitive to light and shadow; no positional invariance; sensitive to target distortion; requires accurate edge detection; relatively high computational effort; performance is sensitive to parameter settings
Haar-like features | Fast to compute; simple and intuitive, based on black-and-white responses over simple rectangular areas; relatively robust to illumination changes; some degree of scale invariance | Difficult to represent complex texture and structural information; not directional; not robust to rotation; dependent on manual design
SIFT | Scale and rotation invariance; relatively stable under lighting changes; rich extracted features; feature points are generally distinctive, which aids matching and identification | High computational complexity; large feature dimensions; sensitive to image distortion
SURF | An optimized variant of SIFT; computationally faster; scale and rotation invariance; relatively small feature dimensions; relatively robust to noise and illumination variations | Sensitive to image distortion; relatively low accuracy
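For readers who want to experiment with the operators compared in Table 3, the following sketch extracts HOG and SIFT descriptors with OpenCV. It is a minimal illustration, assuming only that the opencv-python package is available; the synthetic frame and window parameters are placeholders, not settings taken from the reviewed works.

```python
# Minimal sketch: computing HOG and SIFT descriptors with OpenCV.
# The input frame and parameters are illustrative assumptions.
import cv2
import numpy as np

# Synthetic grayscale frame standing in for a UAV camera image (128 rows x 64 cols).
gray = np.random.randint(0, 256, (128, 64), dtype=np.uint8)

# HOG: dense gradient-orientation histograms over a 64x128 detection window.
hog = cv2.HOGDescriptor((64, 128), (16, 16), (8, 8), (8, 8), 9)
hog_vector = hog.compute(gray)  # 3780-dimensional feature vector for this window

# SIFT: sparse, scale- and rotation-invariant keypoints with 128-D descriptors.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(gray, None)

print(hog_vector.shape)
print(len(keypoints), None if descriptors is None else descriptors.shape)
```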
Table 4. Performance comparison of common deep learning target detection models.
Model | Speed | Accuracy | Application Scenarios
YOLO | One-stage design; usually faster inference | Good real-time detection performance, but slightly less accurate for small targets and some complex scenes | Well suited to applications requiring real-time detection, such as video analytics and autonomous driving
SSD | Faster than some two-stage frameworks | Good balance between accuracy and speed | Well suited to real-time detection scenarios that require a balance of speed and accuracy, such as video surveillance
Faster R-CNN | Relatively slow compared with one-stage frameworks | Often excels in detection accuracy, especially for large targets and complex scenes | Well suited to tasks with high accuracy requirements and relatively low real-time requirements
Mask R-CNN | More complex and usually slower than Faster R-CNN | Performs instance segmentation in addition to detection and typically achieves higher accuracy on more complex tasks | Suitable for complex tasks requiring both detection and instance segmentation, such as image segmentation and medical image analysis
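As a concrete point of reference for the models compared in Table 4, the sketch below runs single-image inference with a pretrained Faster R-CNN from the torchvision model zoo. The tooling, input, and confidence threshold are illustrative assumptions, not the pipelines used in the surveyed studies.

```python
# Minimal sketch: single-image inference with a pretrained two-stage detector
# (torchvision Faster R-CNN); an assumed toolchain, not the reviewed systems'.
import torch
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn,
    FasterRCNN_ResNet50_FPN_Weights,
)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval()

image = torch.rand(3, 480, 640)          # stand-in for an aerial RGB frame
with torch.no_grad():
    prediction = model([image])[0]       # dict with "boxes", "labels", "scores"

keep = prediction["scores"] > 0.5        # simple confidence threshold
print(prediction["boxes"][keep])
print(prediction["labels"][keep])
```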
Table 5. Experimental environment configuration and computing speed of example UAV visual detection algorithms.
Study | Experimental Environment Configuration | Algorithm Running Speed
Zhang [45] | Intel Core i7-8550U CPU @ 1.80 GHz, 8.00 GB memory; GPU: GeForce MX150, 2 GB graphics memory | YOLOv4: ~15 fps; YOLOv4-Tiny: 40 fps
Sun et al. [46] | Intel Core i7-11800H CPU, TITAN RTX graphics card | Average frame rate: 29.4 fps
Yang [47] | On-board embedded device: NVIDIA Jetson Nano (quad-core CPU, 4 GB RAM) | FP16-precision inference uses roughly half the memory of the commonly used FP32 models; detection speed up to 23 fps
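The FP16 figure reported for Yang [47] in Table 5 reflects a general property of half-precision inference. Purely as an illustration of that idea, the sketch below casts a small PyTorch model and its input to FP16 on a CUDA device, which roughly halves the weight memory relative to FP32; the model choice and setup are assumptions, not the cited deployment.

```python
# Minimal sketch: FP16 (half-precision) inference in PyTorch.
# Assumed setup for illustration; the cited work used its own deployment.
import torch
from torchvision.models import mobilenet_v3_small

device = "cuda" if torch.cuda.is_available() else "cpu"
model = mobilenet_v3_small(weights=None).eval().to(device)

x = torch.rand(1, 3, 224, 224, device=device)
if device == "cuda":
    # FP16 weights and activations: roughly half the memory of FP32.
    model, x = model.half(), x.half()

with torch.no_grad():
    y = model(x)
print(y.dtype, y.shape)
```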
Table 6. Comparison of the advantages and disadvantages of relative and absolute visual positioning.
Visual Navigation Positioning Method | Advantages | Disadvantages
Relative visual positioning | Higher real-time performance; does not rely on pre-established maps or landmarks and can adapt to unknown or dynamically changing environments; suitable for missions requiring rapid deployment and mobility | Error accumulation degrades navigation accuracy; does not provide the UAV's absolute position; sensitive to environmental constraints
Absolute visual positioning | Higher positioning accuracy; resistant to cumulative errors; suitable for complex scenes | Relies on absolute location information such as pre-built maps or GPS; relatively poor real-time performance; strongly affected by environmental changes
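As a minimal illustration of the relative visual positioning approach summarized in Table 6, the sketch below estimates the inter-frame rotation and translation direction from matched ORB features with OpenCV. The camera intrinsics and input frames are placeholders, and the recovered translation is known only up to scale, which is exactly the error-accumulation weakness noted in the table.

```python
# Minimal sketch of relative visual positioning (two-frame visual odometry)
# with OpenCV; intrinsics K and the two frames are assumed placeholders.
import cv2
import numpy as np

K = np.array([[800.0, 0.0, 320.0],   # assumed pinhole intrinsics
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Two consecutive grayscale frames; the second is a shifted copy standing in for camera motion.
frame1 = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
frame2 = np.roll(frame1, 5, axis=1)

# Detect and match ORB features between the frames.
orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(frame1, None)
kp2, des2 = orb.detectAndCompute(frame2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Essential matrix (RANSAC) and relative pose: rotation R, translation direction t.
E, inlier_mask = cv2.findEssentialMat(pts1, pts2, K, cv2.RANSAC, 0.999, 1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)

print(R)
print(t)  # known only up to scale; chaining such estimates is where drift accumulates
```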
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
