Article

Vision-Based In-Flight Collision Avoidance Control Based on Background Subtraction Using Embedded System

by Jeonghwan Park 1 and Andrew Jaeyong Choi 2,*
1 ThorDrive, 165, Seonyu-ro, Yeongdeungpo-gu, Seoul 07268, Republic of Korea
2 School of Computing, Dept. of AI-SW, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam 13306, Republic of Korea
* Author to whom correspondence should be addressed.
Sensors 2023, 23(14), 6297; https://doi.org/10.3390/s23146297
Submission received: 24 May 2023 / Revised: 20 June 2023 / Accepted: 7 July 2023 / Published: 11 July 2023
(This article belongs to the Section Navigation and Positioning)

Abstract

The development of high-performance, low-cost unmanned aerial vehicles (UAVs), paired with rapid progress in vision-based perception systems, heralds a new era of autonomous flight systems with mission-ready capabilities. One of the key features of an autonomous UAV is a robust mid-air collision avoidance strategy. This paper proposes a vision-based in-flight collision avoidance system based on background subtraction using an embedded computing system for UAVs. The pipeline of the proposed in-flight collision avoidance system is as follows: (i) subtract the dynamic background to detect moving objects, (ii) denoise using morphology and binarization methods, (iii) cluster the moving objects and remove noise blobs using Euclidean clustering, (iv) distinguish independent objects and track their movement using the Kalman filter, and (v) avoid collisions using the proposed decision-making techniques. This work focuses on the design and demonstration of a vision-based fast-moving object detection and tracking system with decision-making capabilities to perform evasive maneuvers, replacing a high-cost vision sensor such as an event camera. The novelty of our method lies in the motion-compensating moving object detection framework, which accomplishes the task with background subtraction via a two-dimensional transformation approximation. Clustering and tracking algorithms process the detection data to track independent objects, and stereo-camera-based distance estimation is conducted to estimate the three-dimensional trajectory, which is then used during decision-making procedures. The system is examined with a test quadrotor UAV, and appropriate algorithm parameters for various requirements are deduced.

1. Introduction

Lately, advancements in machine learning methodologies have enabled the development of vision-based spatial and object recognition systems, which has led to active research in the field of autonomous flight control systems, especially for unmanned aerial vehicles (UAVs). However, currently available autonomous flight control systems for UAVs focus mostly on GPS waypoint navigation, although the ability to navigate complex and dynamic environments is under active development [1,2]. Several drone manufacturers have integrated simple autonomous flight systems into their quadrotors that can follow subjects while avoiding static obstacles at low flight speeds, but these systems lack the ability to handle sudden changes in the environment, and even the standard function of static obstacle avoidance remains prone to failure.
The recent advances in unmanned aerial vehicles (UAVs) and computing technologies (i.e., artificial intelligence, embedded systems, soft computing, cloud and edge computing, sensor fusion, etc.) have expanded the potential and extended the capabilities of UAVs [3,4]. One of the key abilities required for autonomous flight systems is a methodology for mid-air collision avoidance [5]. Small-scale UAVs are exposed to various threat sources such as birds and other small aircraft, and mid-air collisions almost certainly lead to vehicle damage and payload losses. An autonomous collision avoidance system capable of detecting potentially hostile objects and performing decision-making avoidance maneuvers is therefore regarded as a crucial component of an autonomous flight system [6].
Collision avoidance systems require a means by which to perceive potential obstacles, and this is performed by one or more types of sensors. Typical sensors employed for obstacle recognition include passive sensors such as cameras and active sensors such as RADAR, LiDAR, and SONAR. Cameras, the most commonly utilized passive sensors, can be further classified into monocular, stereo, and event-based types [7,8,9,10,11]. Cameras benefit from their small size and ease of use but are sensitive to lighting conditions. Lee et al. [12] employed inverse perspective mapping for object recognition, with the downside of relatively slow operational flight speeds. Haque et al. [13] implemented a fuzzy controller with stereo cameras for obstacle avoidance. Falanga et al. [14] demonstrated low-latency collision avoidance systems with bio-inspired sensors called event cameras. These sensors require no additional image processing operations, making them ideal for resource-constrained environments such as UAVs. However, event cameras are still under active development, and their prices remain too high for mass deployment.
Active sensors emit energy waves that reflect off object surfaces and measure distances based on the round-trip times of these waves. They are robust to lighting conditions and have a relatively broad operational range but are usually heavier and harder to mount than passive sensors. Kwag et al. [15] utilized RADAR sensors to obtain position and velocity information about nearby objects and execute appropriate maneuvers based on this information. Moses et al. [16] developed a prototype X-band RADAR sensor for obstacle recognition from UAVs.
Mid-air collision avoidance systems are constrained by several requirements. Given that they must operate within very short time windows, rapid cognition and response are crucial; therefore, the computational complexity of the algorithms they use must be considered. In addition, it is desirable to utilize existing sensors such as cameras rather than additional sensors in order to minimize the impact on the payload capacity.
Camera sensors benefit from their small size and low power consumption and provide an abundance of real-world information. However, this information is often highly abstracted, and additional processing is required to extract it. One of the key objectives of vision-based perception algorithms is to require the lowest possible computational power. Furthermore, modern vision-based perception systems do not yet have robust general-inference capabilities and require further development.
There are various methodologies for moving object detection from image and video data, including statistical, image geometry-based, and machine-learning-based methods. Lv et al. [17] addressed the low accuracy and slow speed of drone detection by combining background differencing with the lightweight network SAG-YOLOv5s. The detection accuracy of their method is 24.3 percentage points higher than that of YOLOv5s, and the detection speed on 4K video reached 13.2 fps, achieving both higher detection accuracy and higher detection speed; however, the method can only detect a drone with a fixed camera. Chen et al. [18] compared the motion vectors of moving and stationary objects to detect moving objects, presenting object-level motion detection from a freely moving camera using only two consecutive video frames; however, its computational complexity limits its use in real-time applications. Kim et al. [19] clustered optical flow vectors with K-nearest neighbors clustering and distinguished moving objects based on the cluster variance. Seidaliyeva et al. [20] addressed moving object detection based on background subtraction, with classification performed using a CNN; however, the main limitation of their detector is the dependence of its performance on the presence of a moving background.
This research aims to develop and demonstrate a moving object perception and decision-making system for reactive mid-air collision avoidance with a low-cost UAV system based on a background subtraction method. The concept of the proposed system is illustrated in Figure 1.
The system utilizes an RGB-D camera to perform image geometry-based moving object detection. To meet the low latency requirements of collision avoidance systems, a detection algorithm with low computational complexity is devised. The novelty of the proposed system lies in the 2-D perspective transformation approximation of background motion. The corresponding points between images that are needed for approximation calculation are collected using optical flow, and to ensure that the perspective transform model best approximates the background motion, measures such as limiting optical flow estimation regions are employed. These components enable vision-based moving object detection with low computational requirements. The approximated background motion model is utilized for background subtraction to extract regions in which a moving object is present. To cope with inevitable noise originating from the approximation process and the visual data per se, various image filters and a binarization strategy are utilized. Custom-made clustering and tracking modules perceive and track individual objects for threat assessment. The distance to an object is measured by a stereo camera and then used to estimate the object’s three-dimensional trajectory relative to the ego-vehicle. This trajectory information is utilized for decision-making, triggering various commands that allow the ego-vehicle either to avoid the object or perform an emergency stop.
To test the system, a quadrotor UAV is equipped with a Raspberry Pi 4 low-power computer, a low-cost Intel RealSense D435i RGB-D camera, and a visual odometry camera. The proposed hardware system is shown in Figure 2. Evasive maneuvers under various conditions are tested, and the results are used to tune the operational parameters to better suit each flight environment.

2. Methodology of Moving Object Detection from a Moving Camera

2.1. Projective Transformation

A transformation in image processing refers to a function that maps an arbitrary point to another point. A 2D transformation is a specific category of transformation that performs 2D-to-2D mapping. The most general form of a 2D transformation is a projective transformation, which can be interpreted as a transformation that maps an arbitrary quadrangle onto another quadrangle. Thus, a projective transformation can describe the relationship between two images that show the same planar object. The matrix that performs a projective transformation is called a homography matrix, and the transformation can be defined as follows:
$$
\begin{bmatrix} x_t \\ y_t \\ 1 \end{bmatrix} \sim
\begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix}
\cdot
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
$$
where $[x_t \; y_t \; 1]^\top$ is the pixel coordinate after the transform and $[x \; y \; 1]^\top$ is the pixel coordinate before the transform.
A homography matrix has eight degrees of freedom, and at least four matches between images are required to compute this matrix. Generally, more than four matches are processed with outlier-rejection techniques such as RANSAC [21] or LMedS [22] to yield optimal results.
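As a concrete illustration, the following sketch shows how such a homography could be estimated from matched feature points with OpenCV's RANSAC-based solver; the point arrays and the reprojection threshold are illustrative assumptions, not values prescribed by the paper.

```python
import cv2
import numpy as np

def estimate_homography(prev_pts: np.ndarray, curr_pts: np.ndarray):
    """Estimate the homography mapping prev_pts to curr_pts.

    prev_pts, curr_pts: (N, 2) float32 arrays of matched pixel coordinates, N >= 4.
    """
    # RANSAC rejects outlier matches (e.g., points on moving objects);
    # the 3-pixel reprojection tolerance is an assumed value.
    H, inlier_mask = cv2.findHomography(prev_pts, curr_pts, cv2.RANSAC, 3.0)
    return H, inlier_mask
```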
In principle, a projective transform can only describe planar, or 2D, objects. However, as shown in Figure 3, if 3D objects are relatively far away from the camera and the pose differences between viewpoints are minor, the relationship between the images taken from these viewpoints can be approximated with a projective transformation. This is a core assumption of the moving object detection system. Details about how this assumption is utilized for background motion modeling are provided in the next section.

2.2. Feature-Point Matching between Images with an Optical Flow

The optical flow methodology estimates the motion vectors of pixels across two consecutive frames. It involves solving a differential equation called the gradient constraint equation, and the different methods of solving this equation give rise to the various optical flow algorithms. Two of the popular methods are the Lucas–Kanade and Horn–Schunck algorithms. The Lucas–Kanade algorithm only tracks specified pixels but is computationally less complex. The Horn–Schunck algorithm tracks all pixels in a frame but with less accuracy and at lower speed. For the proposed system, the Lucas–Kanade optical flow is employed to match pixels between consecutive image frames, as sketched below. A combination of epipolar geometry, projective transformation, and optical flow enables the background motion modeling and moving object detection. The detailed procedures are presented in the following section.
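A minimal sketch of sparse Lucas–Kanade tracking with OpenCV is shown below; the window size and pyramid depth are assumed defaults rather than parameters reported in the paper.

```python
import cv2
import numpy as np

def track_points(prev_gray, curr_gray, prev_pts):
    """Track sparse feature points from prev_gray to curr_gray.

    prev_pts: (N, 1, 2) float32 array of pixel locations to track.
    Returns the matched point pairs for which tracking succeeded.
    """
    curr_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, prev_pts, None,
        winSize=(21, 21), maxLevel=3)  # assumed window size and pyramid depth
    good = status.ravel() == 1
    return prev_pts[good], curr_pts[good]
```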

2.3. Moving Object Detection from a Moving Camera

The mid-air collision avoidance system proposed here consists of four operational stages: detection, recognition, tracking, and decision-making. The detection stage detects regions of moving objects in the camera’s field of view. A pseudocode of this stage is given in Algorithm 1.
Algorithm 1 Independent moving object detection
Input: It−1, It, It+1, Ht−1,t
Output: Rt
1:  FP ← [0, 0, 0, 0, 0, 0]
2:  i = 0
3:  while i < 6:
4:      FP[i] ← number of feature points in It[i]
5:      if FP[i] < n:
6:          find at least n − FP[i] feature points in It[i]
7:      end if
8:  end while
9:  OFt,t+1 ← optical flow between It[i] and It+1[i]                  ► Track background feature points
10: Ht,t+1 ← estimate homography between It[i] and It+1[i] using OF   ► Calculate homography matrix
11: Rt ← average([(It − Ht−1,t·It−1) + (It+1 − Ht,t+1·It)])           ► Background subtraction
12: Rt ← Binarize(Dilation(MedianFilter(Rt)))                         ► Apply denoising filters and binarization
The purpose of homography matrix calculation between consecutive images is to model background scene transitions caused by camera motion. If three-dimensional objects are relatively far away, and the pose difference between the viewpoints is small, the transition between two video frames can be approximated as a projective transformation. Figure 4 visualizes this approximation process.
The calculated homography matrix is then used for background subtraction. This matrix approximately relates pixels of the previous frame to pixels of the current frame; that is, pixels from the previous frame can be mapped to coincide with those of the current frame. However, the motions of moving objects are not described by the homography matrix and thus cannot be mapped to coincide between frames. Therefore, moving objects can be detected by searching for regions where pixels are not correctly mapped between consecutive frames. To locate incorrectly mapped pixels and detect moving objects, a perspective-transform-based background subtraction technique is employed: the homography matrix between two consecutive frames is used to warp the previous frame to match the current frame, and the warped frame is then subtracted from the current frame. Because the motion of a moving object differs from that of the background, its pixels are not correctly mapped to the current frame; accordingly, the corresponding brightness has a nonzero value after subtraction, revealing the image regions of moving objects. An example result is shown in Figure 5. The quadcopter drone used in the experiment measured 180 × 65 × 190 mm, showing that the proposed algorithm can detect a small moving drone.
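The warp-and-subtract step described above could be expressed as follows; this is a simplified sketch using OpenCV, assuming grayscale frames and a precomputed homography.

```python
import cv2

def background_subtract(prev_gray, curr_gray, H_prev_to_curr):
    """Warp the previous frame onto the current one and subtract.

    Background pixels follow the homography and cancel out, while pixels of
    independently moving objects remain bright in the difference image.
    """
    h, w = curr_gray.shape[:2]
    warped_prev = cv2.warpPerspective(prev_gray, H_prev_to_curr, (w, h))
    return cv2.absdiff(curr_gray, warped_prev)
```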
When calculating the homography matrix, the flight conditions of UAVs were also taken into consideration. From the viewpoint of a flying UAV, objects near the center of the field of view are typically farther away than those at the edges. Farther objects appear smaller and thus exhibit weaker appearance transitions; consequently, objects near the edge of the field of view have more influence on the perspective transformation model. It is therefore reasonable to calculate the homography matrix from the transitions of pixels close to the edge of the field of view. In this system, six 70 × 70 windows were used. The system constantly monitors the number of tracked points in these windows and initiates new tracks if the number falls below a predetermined threshold. This evenly distributes the tracked points and guarantees the stability of the homography matrix calculation. To filter out false matches and outliers, RANSAC was used when calculating the homography matrix. Limiting the optical flow calculation regions reduced the computational requirements and improved the approximation accuracy simultaneously.
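One possible way to maintain feature points inside the border windows is sketched below; the window list, the per-window point budget, and the corner detector settings are illustrative assumptions rather than the authors' exact implementation.

```python
import cv2
import numpy as np

def refill_edge_features(gray, windows, tracked_counts, min_pts=10):
    """Detect new corners in border windows whose tracked-point count fell below min_pts.

    windows: list of (x, y, w, h) regions near the image border.
    tracked_counts: number of currently tracked points per window.
    Returns new points in full-frame coordinates, or None if no refill was needed.
    """
    new_pts = []
    for (x, y, w, h), count in zip(windows, tracked_counts):
        if count >= min_pts:
            continue  # this window still has enough tracked points
        roi = gray[y:y + h, x:x + w]
        pts = cv2.goodFeaturesToTrack(roi, maxCorners=min_pts - count,
                                      qualityLevel=0.01, minDistance=5)
        if pts is not None:
            pts[:, 0, 0] += x  # shift ROI coordinates back to full-frame coordinates
            pts[:, 0, 1] += y
            new_pts.append(pts)
    return np.vstack(new_pts) if new_pts else None
```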
However, because the projective transformation approximation is not perfect, noise elements inevitably exist; that is, even after background subtraction, some areas of the background may not have zero brightness values. To compensate for this, the proposed system utilizes two methods. The first utilizes the next frame as well as the previous frame for background subtraction: each transformed frame is subtracted from the current frame, and the results are averaged. Because noise elements appear randomly and momentarily, this method improves the signal-to-noise ratio (SNR) of the resulting image. The second method applies image filters and binarization to the result of the first method. A median filter eliminates small noise components, while a dilation filter expands the remaining elements, revealing moving object regions more clearly. The image is subsequently binarized to disregard pixels below a certain brightness threshold. This process is illustrated in Figure 6.
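A compact sketch of this averaging and denoising pipeline is given below; the 5 × 5 kernel sizes and the binarization threshold of 40 follow the operational variables listed in Table 1, while everything else is an illustrative assumption.

```python
import cv2
import numpy as np

def denoise_and_binarize(diff_prev, diff_next, thresh=40):
    """Average the two background-subtraction results, filter them, and binarize.

    diff_prev: |I_t - warp(I_{t-1})|, diff_next: |I_{t+1} - warp(I_t)|, both uint8.
    """
    averaged = cv2.addWeighted(diff_prev, 0.5, diff_next, 0.5, 0.0)   # averaging improves SNR
    filtered = cv2.medianBlur(averaged, 5)                            # 5x5 median filter removes speckle noise
    dilated = cv2.dilate(filtered, np.ones((5, 5), np.uint8))         # 5x5 dilation expands object regions
    _, binary = cv2.threshold(dilated, thresh, 255, cv2.THRESH_BINARY)
    return binary
```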

3. Methodology for Object Recognition, Tracking, and Decision-Making

3.1. Object Recognition

The recognition stage applies a clustering algorithm to the result of the detection stage to determine the number and locations of all moving objects. A pseudocode of this stage is given in Algorithm 2.
Algorithm 2 Independent object recognition
Input: Rt
Output: Ct
1:  Ct ← ∅
2:  for unallocatedBlob in Rt:                                      ► A region in Rt
3:      if size(unallocatedBlob) < sizeThresh:
4:          continue                                                ► Disregard small regions
5:      for cluster in Clusters:
6:          for blob in cluster:
7:              if distance(blob, unallocatedBlob) < distThresh:
8:                  allocate unallocatedBlob to cluster
9:                  break                                           ► Add region to cluster if below distance threshold
10:     if unallocatedBlob is unallocated:
11:         allocate unallocatedBlob to Clusters and set as cluster ► Set as new cluster
12: for cluster in Clusters:
13:     append weighted mean position of cluster to Ct              ► Append representative position of blob to Ct
The detection stage outputs a binary image that marks the regions of moving objects as 1. However, as the presence of noise is inevitable, some background regions intermittently take a value of 1 as well. Additionally, a single independent moving object can appear as several “patches”, as shown in Figure 7. Therefore, a method is needed that rejects noise components while recognizing multiple nearby patches as a single object.
For the proposed system, a modified DBSCAN (density-based spatial clustering of applications with noise [23]) algorithm was developed. DBSCAN, unlike center-based algorithms such as K-means [24], determines whether a data point belongs to a cluster based on its distances to other points: data points with no neighbors are regarded as outliers, and data points with more than a certain number of neighbors as inliers. However, a vanilla DBSCAN cannot be applied directly to this task, as there are instances in which an object shows up as a single patch as well as instances in which noise components appear as several patches. Thus, the vanilla DBSCAN is modified to utilize the area information of the patches: only patches within an area threshold are considered part of a true object, and a single patch is considered a cluster if it satisfies the area threshold. The centroid of a cluster is calculated as the weighted sum of all patches in that cluster. This procedure is summarized in Figure 7.
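The following sketch illustrates the area-aware clustering idea described above: small blobs are rejected, nearby blobs are merged, and each cluster centroid is the area-weighted mean of its blobs. The thresholds follow Table 1 (sizeThresh = 25 px, distThresh = 100 px); the blob representation and merging rule are simplifying assumptions.

```python
import numpy as np

def cluster_blobs(blobs, size_thresh=25, dist_thresh=100):
    """Cluster detected blobs into independent objects.

    blobs: list of (cx, cy, area) tuples, e.g. from connected-component analysis.
    Returns a list of (x, y) cluster centroids (area-weighted means).
    """
    blobs = [b for b in blobs if b[2] >= size_thresh]  # reject small noise blobs
    clusters = []  # each cluster is a list of blobs
    for blob in blobs:
        placed = False
        for cluster in clusters:
            # merge the blob into a cluster if it is close to any member blob
            if any(np.hypot(blob[0] - b[0], blob[1] - b[1]) < dist_thresh for b in cluster):
                cluster.append(blob)
                placed = True
                break
        if not placed:
            clusters.append([blob])  # start a new cluster
    centroids = []
    for cluster in clusters:
        areas = np.array([b[2] for b in cluster], dtype=float)
        xs = np.array([b[0] for b in cluster], dtype=float)
        ys = np.array([b[1] for b in cluster], dtype=float)
        centroids.append((float(np.average(xs, weights=areas)),
                          float(np.average(ys, weights=areas))))
    return centroids
```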

3.2. Object Tracking

The tracking stage receives the objects’ locations at each frame from the recognition stage and associates them with those of the previous frame. A unique ID is assigned to each independent object, and the corresponding trajectories are monitored for threat assessment. Additional noise compensation is also provided in this stage. A pseudocode of this stage is given in Algorithm 3.
Algorithm 3 Independent object tracking
Input: Ot−1, Ct
Output: Ot
1:  Ot ← ∅
2:  M = size(Ot−1)                                ► Number of objects at previous frame
3:  N = size(Ct)                                  ► Number of objects at current frame
4:  Array ← zeros(M, N)                           ► An array that stores distances between all objects
5:  for i in Ot−1:
6:      for j in Ct:
7:          Array[i, j] = distance(Ot−1[i], Ct[j])
8:  for k in Ct:
9:      if ID[k] is not set:
10:         if Array[i, k] == min(Array[i, :]):   ► Object k: closest object to object i
11:             if Array[i, k] < distThresh:      ► If below distance threshold
12:                 ID[k] ← ID[i]                 ► Assume k and i are the same object (tracking success)
13:                 tracked length of ID[k] += 1  ► Increase tracked length
14:                 break
15:             else:
16:                 mark ID[i] as lost            ► Mark as lost if no object is nearby
17: for k in Ct:
18:     if ID of k is not set:
19:         assign new ID to k                    ► Assign new ID if new object is detected
20:         tracked length of ID[k] ← 0
21:     if tracked length of ID[k] > noiseThresh: ► If tracked longer than threshold
22:         mark ID[k] as true object             ► Approve as true object
23:     add position and ID of k to Ot
Temporal information is used to distinguish true observations from noise and to recognize object identities across multiple frames. This step operates on the central assumption that the distance travelled by an object between frames is shorter than the distances between separate objects. Moreover, because noise components are spawned intermittently and momentarily, they do not appear consistently across multiple frames.
First, the distances between previously detected coordinates and new coordinates are compared, after which the closest new detection within a distance threshold is recognized as the new position of the object. If there are no new detections within a distance threshold, the object is marked as lost. If no new detections appear near its last known location for a time threshold, the object is deleted. New detections with no associations are identified as object candidates and are given a temporary ID. If an object candidate is successfully tracked for a time exceeding a certain time threshold, it is approved as a true object and given a new ID. This procedure is summarized in Figure 8.
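A simplified sketch of this nearest-detection association rule is given below, assuming a plain dictionary of 2D track positions; the distance threshold follows Table 1 (150 px), while the lost/candidate bookkeeping described in the text is omitted for brevity.

```python
import numpy as np

def associate(tracks, detections, dist_thresh=150.0):
    """Associate previous tracks with current detections by nearest distance.

    tracks: dict mapping track ID -> (x, y) position from the previous frame.
    detections: list of (x, y) positions from the current frame.
    Returns (updated tracks, list of lost track IDs, indices of unmatched detections).
    """
    unmatched = set(range(len(detections)))
    lost = []
    for tid, pos in list(tracks.items()):
        if not unmatched:
            lost.append(tid)  # no detections left to claim
            continue
        dists = {j: np.hypot(pos[0] - detections[j][0], pos[1] - detections[j][1])
                 for j in unmatched}
        j_best = min(dists, key=dists.get)
        if dists[j_best] < dist_thresh:
            tracks[tid] = detections[j_best]  # closest detection becomes the new position
            unmatched.discard(j_best)
        else:
            lost.append(tid)  # nothing nearby: mark the track as lost
    return tracks, lost, sorted(unmatched)  # unmatched detections become object candidates
```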

3.3. Decision-Making for Avoidance Maneuvers

The decision-making stage receives the trajectory data from the tracking stage and determines whether avoidance maneuvering must be performed, and if so, in what direction it should occur. A pseudocode of this stage is given in Algorithm 4, and Figure 9 presents a diagram of this algorithm.
Algorithm 4 Decision-making for avoidance maneuvers
Input: Ot                                       ► Ot: object IDs and locations at current frame
Output: At                                      ► At: avoidance decisions at current frame
1:  for object in Ot:
2:      position ← 3D position of object        ► Acquire 3D position of object
3:      if position.z < dstopThresh:
4:          emergency stop before proceeding
5:          break
6:      if position is inside RSafeWindow:
7:          mark object as hostile               ► Consider object a threat if inside SafeWindow
8:          if position.z < davoidThresh:        ► Execute avoidance maneuver if distance is below threshold
9:              if position.xy in upperRegion:
10:                 move downward at v m/s until outside of RSafeWindow
11:             if position.xy in lowerRegion:
12:                 move upward at v m/s until outside of RSafeWindow
13:             if position.xy in leftRegion:
14:                 move right at v m/s until outside of RSafeWindow
15:             if position.xy in rightRegion:
16:                 move left at v m/s until outside of RSafeWindow
17:             break
The proposed system uses the object trajectory information from the tracking stage to perform avoidance maneuvers if the relative distance falls below a certain threshold. Additionally, if the initially measured distance to the object is too close (for example, if the moving object approaches from the side of the vehicle, outside the field of view), an emergency stop is executed to prevent a crash into the obstacle. The radius of the safe window and the avoidance threshold can be determined from prior information, such as the object’s expected maximum velocity and the maximum possible maneuver velocity.
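The decision logic of Algorithm 4 could be condensed as in the sketch below. The stop and avoidance distances and the safe-window half-width are illustrative values in the spirit of Table 1, the returned command strings stand in for the velocity setpoints actually sent to the flight controller, and the sign convention (y increasing upward) is an assumption.

```python
def decide(position, d_stop=0.5, d_avoid=1.0, safe_half_width=0.5):
    """Return an avoidance command for a tracked object.

    position: (x, y, z) of the object relative to the vehicle, z = forward range in meters.
    Assumes x increases to the right and y increases upward in the vehicle frame.
    """
    x, y, z = position
    if z < d_stop:
        return "emergency_stop"            # object already too close: stop before proceeding
    inside_safe_window = abs(x) < safe_half_width and abs(y) < safe_half_width
    if inside_safe_window and z < d_avoid:
        if abs(y) >= abs(x):               # object mainly above or below the flight path
            return "move_down" if y > 0 else "move_up"
        return "move_left" if x > 0 else "move_right"  # object mainly to one side
    return "hold_course"                   # no threat: continue the mission
```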

4. Performance Validation of In-Flight Collision Avoidance

In this section, the performance validation of the proposed in-flight collision avoidance system is demonstrated. First, in-flight dynamic object detection based on background subtraction was validated. Then, a simulation test was conducted using ROS Melodic with the Gazebo environment. Lastly, the proposed in-flight collision avoidance system was tested with an actual quadcopter drone.

4.1. System Setting for In-Flight Collision Avoidance

The UAV hardware setup for the proposed in-flight collision avoidance is shown in Figure 2. The vehicle is equipped with a Raspberry Pi 4 companion computer, a RealSense D435i stereo depth camera, and a RealSense T265 visual odometry camera for indoor autonomous flight. The system components are interconnected via ROS (Robot Operating System), a robotics development software package, as illustrated in Figure 10.
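As an illustration of this integration, a minimal ROS (Python/rospy) node is sketched below. The topic names, the use of cv_bridge, and the MAVROS velocity setpoint interface are assumptions about a typical RealSense/PX4 setup, not details confirmed by the paper.

```python
import rospy
from sensor_msgs.msg import Image
from geometry_msgs.msg import Twist
from cv_bridge import CvBridge

class CollisionAvoidanceNode:
    def __init__(self):
        self.bridge = CvBridge()
        # assumed MAVROS velocity setpoint topic for sending avoidance commands
        self.cmd_pub = rospy.Publisher(
            "/mavros/setpoint_velocity/cmd_vel_unstamped", Twist, queue_size=1)
        # assumed RealSense color stream topic
        rospy.Subscriber("/camera/color/image_raw", Image, self.on_image, queue_size=1)

    def on_image(self, msg):
        frame = self.bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
        # ... run detection, clustering, tracking, and decision-making on `frame` here ...
        cmd = Twist()  # placeholder: fill in the velocity command chosen by the decision stage
        self.cmd_pub.publish(cmd)

if __name__ == "__main__":
    rospy.init_node("collision_avoidance")
    CollisionAvoidanceNode()
    rospy.spin()
```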

4.2. Validation of Moving Object Detection and Tracking

Before the actual in-flight collision avoidance testing, in-flight dynamic object detection based on background subtraction was conducted, as demonstrated in Figure 11. As Figure 11 shows, the in-flight UAV was able to subtract the dynamic background and detect only the moving object.
Furthermore, the detection and tracking algorithm is based on the Kalman filter. The tracking results for video sequences are shown in Figure 12: IDs are assigned to independent objects, and the two-dimensional trajectory of each object is displayed as red points. The video result is available at https://youtu.be/AcWcUMl0WW8 (accessed on 20 June 2023). The tracking algorithm runs in real time on a Raspberry Pi 4 computer, compensating for noise components and keeping track of objects even during detection failures.
The moving object detection and tracking algorithms were also validated for real-time performance. The fps (frames per second) comparison with FastMCD [15] and a neural-network-based system [25] is displayed in Figure 13, where the blue line represents FastMCD, the orange line the neural-network-based system, and the gray line the proposed system. As shown in the figure, the proposed system is consistently superior in terms of real-time processing performance, processing up to 70 fps and 60 fps on average. Note that the stereo camera utilized for the collision avoidance system streams video data at 30 frames per second, so the proposed algorithms always performed sufficiently above the real-time requirement for in-flight moving object detection, tracking, and collision avoidance. The validation thus demonstrated the system's suitability for in-flight collision avoidance tasks in real time.

4.3. Validation of In-Flight Collision Avoidance

Before the actual in-flight testing, a simulation test was conducted using ROS Melodic with the Gazebo environment, as shown in Figure 14.
After the simulation test, the operational variables for the actual in-flight collision avoidance were set as listed in Table 1. To validate the proposed system, the drone was set to fly forward until avoidance maneuvers were required.
For the first and second validation tests, emergency stops were demonstrated. The drone detected an obstacle closer than 0.5 m and performed an emergency stop before proceeding. These results are shown in Figure 15, Figure 16 and Figure 17, together with plots of the trajectories of the drone and the object. Table 2 lists the detection results and avoidance performance metrics measured during the flights. For the performance validation of the proposed in-flight avoidance system, a ball approximately 200 mm in size was thrown toward the drone.
For the third and fourth validation tests, avoidance maneuvers were demonstrated. The vehicle detected and tracked an obstacle from outside the safe window and performed avoidance maneuvers when the object breached the avoidance threshold. These results are shown in Figure 18, Figure 19, Figure 20 and Figure 21. Figure 19 plots the trajectories of the vehicle and the object and the nearest distance between them at the same point in time. Table 3 lists the detection results and avoidance performance metrics for these tests.
Figure 22 presents the detection, recognition, and tracking results from the UAV’s point of view. The red points represent the tracked two-dimensional trajectory of the object.
To demonstrate the effectiveness of our approach, we compare some of our test results to those of [25,26], which describe an event-camera-based obstacle avoidance system. Although the types of sensors used in each system differ, the components of the algorithms are very similar: ego-motion compensation, thresholding, morphological operations, and clustering. The algorithm in [25] runs much faster than ours (up to 280 fps) thanks to a simpler ego-motion compensation process made possible by the characteristics of the event camera, whereas our approach is bottlenecked by the feature point extraction and homography computation processes. Nevertheless, the indoor obstacle avoidance results of the two methods are comparable: the approach in [25] was shown to enable the avoidance of dynamic obstacles with relative speeds of up to 10 m/s and reaction times below 0.2 s, while our approach was tested and demonstrated with dynamic obstacles at relative speeds of up to 6 m/s with similar reaction times. We presume that with a more powerful computer, such as the NVIDIA Jetson TX2 used in [25], our proposed system would be able to process higher-resolution images at a constant rate, which would certainly help to detect objects that are farther away or move at higher relative speeds.
However, several aspects of the current system could be improved for low-cost in-flight collision avoidance. First, the proposed background subtraction algorithm must be improved: it cannot deal with drastic lighting changes, and object detection should be robust under various light-interference conditions (i.e., dawn, flying with the sun high in the sky, etc.). To overcome these critical challenges and achieve robust real-time object detection, a lightweight CNN-based algorithm should be implemented. Second, the in-flight collision avoidance algorithm must be improved. This research focused on the development of low-cost vision-based moving obstacle detection on a moving UAV; thus, a probabilistic decision-making algorithm should be implemented. Tutsoy [27] showed that parametric machine learning algorithms can be developed not only to predict the future responses of systems but also to produce constrained policies (decisions) for optimal future behavior under unknown uncertainties.
Future work will attempt not only to extend the proposed in-flight collision avoidance system to outdoor field validation with multiple moving objects under various lighting conditions but also to extend the computing system toward deep-learning-based or probabilistic real-time decision-making.

5. Conclusions

This paper proposed a high-performance, low-cost in-flight collision avoidance system based on background subtraction for unmanned aerial vehicles (UAVs). A novel vision-based mid-air collision avoidance system for UAVs was proposed and implemented. The pipeline of the proposed in-flight collision avoidance system is as follows: (i) subtract the dynamic background to detect moving objects, (ii) denoise using morphology and binarization methods, (iii) cluster the moving objects and remove noise blobs using Euclidean clustering, (iv) distinguish independent objects and track their movement using the Kalman filter, and (v) avoid collisions using the proposed decision-making techniques. Performance validation tests were conducted in a simulation environment and with an actual quadcopter drone equipped with vision sensors.
The key contribution of the proposed system is a lightweight, error-compensating moving object detection and recognition algorithm. Specifically, background scene transitions due to ego-motion are approximated with a two-dimensional projective transformation, and a homography matrix is computed using the optical flow. To reduce the required computational load and improve the quality of approximation, the optical flow is calculated only at the edge regions of video frames. The previous and successive frames are transformed to match the current frame, and the background subtraction is performed to acquire a primitive estimate of the object location. Image filters and thresholding are utilized to improve the SNR of this result. A modified DBSCAN clustering algorithm is used to identify multiple detected image patches correctly as a single object. A distance-based tracking algorithm assigns object identities and tracks them across frames, incorporating an additional noise-filtering procedure based on the tracked period. A stereo camera system measures the distances to detected objects, and this information is used for determining whether avoidance maneuvers must be executed.
Further research on several aspects of the current system can improve low-cost in-flight collision avoidance. Future work will attempt to develop an object classification system for greater reliability and robustness by combining a lightweight CNN with a six-degree-of-freedom pose estimation algorithm to estimate the orientation of moving objects.
The proposed system effectively detects, recognizes, and tracks moving objects in its field of view with low computational requirements and achieves sufficient processing performance on a low-power onboard computer. It was implemented on a test vehicle and performed mid-air collision avoidance, demonstrating its effectiveness in real-world conditions.

Author Contributions

Conceptualization, J.P. and A.J.C.; methodology, J.P. and A.J.C.; software, J.P. and A.J.C.; validation, J.P. and A.J.C.; formal analysis, J.P. and A.J.C.; investigation, J.P. and A.J.C.; resources, A.J.C.; data curation, J.P. and A.J.C.; writing—original draft preparation, A.J.C.; writing—review and editing, A.J.C.; visualization, J.P. and A.J.C.; supervision, A.J.C.; project administration, A.J.C.; funding acquisition, A.J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Gachon University Research Fund of 2022 (GCU-202300680001).

Institutional Review Board Statement

Not Applicable.

Informed Consent Statement

Not Applicable.

Data Availability Statement

Not Applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclature

It = tth frame of a video sequence
It[i] = ith edge area of frame It
n = threshold number for a new feature point search
OFt,t+1 = optical flow between the edge areas of It and It+1
Ht−1,t = homography matrix that maps It−1 to It
Rt = binary image that displays regions where moving objects are present
Ct = list of prior object coordinates
Ot = object locations and IDs at the previous frame
davoidThresh = avoidance decision threshold
dstopThresh = emergency stop decision threshold
RSafeWindow = safe window radius
vobj = maximum expected approach speed of the object
vmax = maximum possible maneuver speed of the ego-vehicle

References

  1. Floreano, D.; Wood, R.J. Science, technology and the future of small autonomous drones. Nature 2015, 521, 460–466.
  2. Cai, Z.; Lou, J.; Zhao, J.; Wu, K.; Lui, N.; Wang, Y.X. Quadrotor trajectory tracking and obstacle avoidance by chaotic grey wolf optimization-based active disturbance rejection control. Mech. Syst. Signal Process. 2019, 128, 636–654.
  3. Choi, A.J.; Park, J.; Han, J.-H. Automated Aerial Docking System Using Onboard Vision-Based Deep Learning. J. Aerosp. Inf. Syst. 2022, 19, 421–436.
  4. Choi, A.J.; Yang, H.-H.; Han, J.-H. Study on robust aerial docking mechanism with deep learning based drogue detection and docking. Mech. Syst. Signal Process. 2021, 154, 107579.
  5. Shao, X.; Liu, N.; Wang, Z.; Zhang, W.; Yang, W. Neuroadaptive integral robust control of visual quadrotor for tracking a moving object. Mech. Syst. Signal Process. 2020, 136, 106513.
  6. Shim, D.; Chung, H.; Kim, H.J.; Sastry, S. Autonomous Exploration in Unknown Urban Environments for Unmanned Aerial Vehicles. In Proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit, American Institute of Aeronautics and Astronautics, San Francisco, CA, USA, 15–18 August 2005.
  7. Qiu, Z.; Zhao, N.; Zhou, L.; Wang, M.; Yang, L.; Fang, H.; He, Y.; Liu, Y. Vision-Based Moving Obstacle Detection and Tracking in Paddy Field Using Improved Yolov3 and Deep SORT. Sensors 2020, 20, 4082.
  8. Mejias, L.; McNamara, S.; Lai, J.; Ford, J. Vision-based detection and tracking of aerial targets for UAV collision avoidance. In Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan, 18–22 October 2010; pp. 87–92.
  9. Al-Kaff, A.; Garcia, F.; Martin, D.; Escalera, A.D.L.; Armingol, J.M. Obstacle Detection and Avoidance System Based on Monocular Camera and Size Expansion Algorithm for UAVs. Sensors 2017, 17, 1061.
  10. Aldao, E.; Gonzalez-deSantos, L.M.; Michinel, H.; Gonzalez-Jorge, H. UAV Obstacle Avoidance Algorithm to Navigate in Dynamic Building Environment. Drones 2022, 6, 16.
  11. Ahmad, T.; Cavazza, M.; Matsuo, Y.; Prendinger, H. Detecting Human Actions in Drone Images Using YoloV5 and Stochastic Gradient Boosting. Sensors 2022, 22, 7020.
  12. Lee, T.-J.; Yi, D.-H.; Cho, D.-I. A Monocular Vision Sensor-Based Obstacle Detection Algorithm for Autonomous Robots. Sensors 2016, 16, 311.
  13. Uddin Haque, A.; Nejadpak, A. Obstacle Avoidance Using Stereo Camera. arXiv 2017, arXiv:1705.04114.
  14. Falanga, D.; Kim, S.; Scaramuzza, D. How Fast Is Too Fast? The Role of Perception Latency in High-Speed Sense and Avoid. IEEE Robot. Autom. Lett. 2019, 4, 1884–1891.
  15. Kwag, Y.K.; Chung, C.H. UAV based collision avoidance radar sensor. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Barcelona, Spain, 23–27 July 2007; pp. 639–642.
  16. Moses, A.; Rutherford, M.J.; Kontitsis, M.; Valavanis, K.P. UAV-borne X-band radar for collision avoidance. Robotica 2014, 32, 97–114.
  17. Lv, Y.; Ai, Z.; Chen, M.; Gong, X.; Wang, Y.; Lu, Z. High-Resolution Drone Detection Based on Background Difference and SAG-YOLOv5s. Sensors 2022, 22, 5825.
  18. Chen, T.; Lu, S. Object-Level Motion Detection from Moving Cameras. IEEE Trans. Circuits Syst. Video Technol. 2017, 27, 2333–2343.
  19. Kim, J.; Wang, X.; Wang, H.; Zhu, C.; Kim, D. Fast moving object detection with non-stationary background. Multimed. Tools Appl. 2013, 67, 311–335.
  20. Seidaliyeva, U.; Akhmetov, D.; Ilipbayeva, L.; Matson, E. Real-Time and Accurate Drone Detection in a Video with a Static Background. Sensors 2020, 20, 3856.
  21. Fischler, M.A.; Bolles, R.C. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 1981, 24, 381–395.
  22. Massart, D.L.; Kaufman, L.; Rousseeuw, P.J.; Leroy, A. Least median of squares: A robust method for outlier and model error detection in regression and calibration. Anal. Chim. Acta 1986, 187, 171–179.
  23. Ester, M.; Kriegel, H.-P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA, 2–4 August 1996; AAAI Press: Portland, OR, USA, 1996; pp. 226–231.
  24. Lloyd, S. Least squares quantization in PCM. IEEE Trans. Inf. Theory 1982, 28, 129–137.
  25. Yang, Y.; Loquercio, A.; Scaramuzza, D.; Soatto, S. Unsupervised Moving Object Detection via Contextual Information Separation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15 June 2019; pp. 879–888.
  26. Falanga, D.; Kleber, K.; Scaramuzza, D. Dynamic obstacle avoidance for quadrotors with event cameras. Sci. Robot. 2020, 5, eaaz9712.
  27. Tutsoy, O. COVID-19 Epidemic and Opening of the Schools: Artificial Intelligence-Based Long-Term Adaptive Policy Making to Control the Pandemic Diseases. IEEE Access 2021, 9, 68461–68471.
Figure 1. Concept of applying the proposed low-cost moving object avoidance system.
Figure 2. Proposed hardware setting for the low-cost moving object avoidance.
Figure 3. Approximation and transformation of background change between two viewpoints with projective transformation.
Figure 4. Projective transformation approximation of two consecutive video frames.
Figure 5. Original image and background-subtracted result.
Figure 6. Image processing for background subtraction: (a) the original frame, (b) background-subtracted frame, (c) median-filtered frame, and (d) dilated and binarized frame.
Figure 7. Clustering procedure of the modified DBSCAN algorithm.
Figure 8. Tracking procedure of the distance-based tracking algorithm.
Figure 9. Illustration of avoidance maneuvers.
Figure 10. Software and hardware integration for the proposed system.
Figure 11. Moving object detection based on background subtraction.
Figure 12. The results of moving object detection and tracking.
Figure 13. The comparison of fps processing performance with other moving object detection systems.
Figure 14. Demonstration of the in-flight collision avoidance in ROS Gazebo simulation environment.
Figure 15. Demonstration of emergency stop 1.
Figure 16. Demonstration of emergency stop 2.
Figure 17. The trajectories of the drone and the object.
Figure 18. Demonstration of avoidance maneuver 1.
Figure 19. The trajectories of the drone and the object for avoidance maneuver 1.
Figure 20. Demonstration of avoidance maneuver 2.
Figure 21. The trajectories of the drone and the object for avoidance maneuver 2.
Figure 22. Detection, recognition, and tracking results from the UAV's point of view.
Table 1. Operational variables for the in-flight collision avoidance.

Algorithm 1 (independent moving object detection)
  image resolution: 424 × 240
  boundary region size: 70 × 70
  number of regions: 10
  median filter size: 5 × 5
  dilation filter size: 5 × 5
  binarization threshold: 40

Algorithm 2 (independent object recognition)
  sizeThresh: 25 px
  distThresh: 100 px

Algorithm 3 (independent object tracking)
  distThresh: 150 px
  noiseThresh:

Algorithm 4 (decision-making for avoidance maneuvers)
  RSafeWindow: 1 m × 1 m × 1 m
  davoidThresh: 1 m
  vmax: 3 m/s
  demergencyThresh: 1 m
Table 2. The detection results and avoidance performance metrics.

Emergency Stop
  Vehicle speed: 1.2 m/s
  Obstacle speed: 5.9 m/s
  Relative speed: 6.0 m/s
  Time between detection and recognition: 0.09 s
  Minimum distance to obstacle: 0.72 m
Table 3. The detection results and avoidance performance metrics.

Avoidance Maneuver 1
  Vehicle speed: 1.2 m/s
  Obstacle speed: 5.7 m/s
  Relative speed: 6.7 m/s
  Time between detection and recognition: 0.1 s
  Minimum distance to obstacle: 0.54 m

Avoidance Maneuver 2
  Vehicle speed: 1.2 m/s
  Obstacle speed: 5.6 m/s
  Relative speed: 5.9 m/s
  Time between detection and recognition: 0.11 s
  Minimum distance to obstacle: 0.51 m