1. Introduction
The development of dairy farming is closely tied to animal husbandry and agriculture as a whole. It is not only well connected with other industries, but also offers the advantages of grain saving and high efficiency. Compared with other countries, dairy farming in China started relatively late. However, after years of accumulation and rapid development, China’s dairy farming industry has become an important part of animal husbandry, ranking at the forefront of the world in both total milk production and farming scale [1]. The development status of dairy-related industries can be used to measure a country’s level of agricultural modernization [2]. Benefitting from the development of dairy farming, China’s total milk production reached 35.3 million tons in 2020, a 4.3-fold increase compared with 2000. It is undeniable that China’s dairy farming industry plays a key role in increasing the supply of dairy products, meeting residents’ consumption demand for dairy products, optimizing the rural industrial structure and increasing the income of dairy farmers [3].
The growth of dairy farming enterprises is an inevitable result of the development of dairy farming and related industries. In the traditional feeding process, forage is laid beside the dairy cow pen but, given the feeding habits of dairy cows, only the closest forage is eaten. Moreover, while eating, the cows gradually push part of the forage outward, so that uneaten forage accumulates and spoils over time. External intervention is therefore required to push the feed back to the pen in time, around the clock, to ensure the normal feeding of the cows. The current solution is to use manual labor or manually driven equipment to push the outer forage back into the feeding area, so breeding enterprises must allocate additional labor or equipment for pushing [4]. Relying on manual labor is not only labor-intensive, but also makes it difficult to complete the feeding work in a timely and consistent manner; if the feed is not pushed back promptly at night, the milk production of the cows is reduced. As a result, many types of pusher robots have emerged, most of which rely on magnetic induction technology for self-propelled operation. For example, Shen used magnetic induction navigation to determine the robot’s route, achieved motion control through PID control, and detected the pushing resistance and output thrust in real time [5]. This not only solved the problems of high work intensity and untimely feeding at night, but also made the feeding process fully automatic and intelligent. The drive motor has high transmission precision and a fast response speed; however, motors are sensitive to external disturbances and load changes. The control accuracy of traditional control methods (such as PID control) depends on the accuracy of the mathematical model of the controlled object [6,7]. In practical industrial applications, disturbances and load variations are inevitable, and it is almost impossible to establish an accurate mathematical model of the controlled object [8]. When the drive motor works in an uncertain environment and must maintain high precision, traditional control methods cannot satisfactorily meet the control requirements. Therefore, a control method that can adjust the control parameters in real time is required.
On the other hand, with the mature application of image processing, embedded systems, electromechanical control and other technologies, visual signals stand out for their large information content and strong applicability [9,10]. Machine vision has become an important direction for the development of autonomous robot operation technology. Agricultural robots such as pusher robots have also begun to develop toward intelligence, becoming a research hotspot in self-propelled robotics and precision animal husbandry. In recent years, researchers at home and abroad have carried out related studies on vision-based navigation path extraction for agricultural robots. Chen et al. proposed a new algorithm for fitting the navigation path of greenhouse cucumber-picking robots; their prediction-point Hough transform fits navigation paths with an average error of less than 0.5°, which is 10.25° lower than the average error of the least-squares method [11]. Kim et al. proposed a novel path detection approach for orchards using CNN-based partial area classification, which generates a path score map using a four-layer CNN for tree and path classification, path area segmentation and target path detection via boundary line determination [12]. Whether in a greenhouse sprayer [13], a rubber-tapping robot [14] or a tomato harvesting robot [15], machine vision has been used in navigation path extraction and tracking control research. Agricultural robots with autonomous navigation and obstacle avoidance functions can complete various automated operations, improve the efficiency of agricultural labor and reduce overall production costs. Therefore, in this study, the visual navigation method was applied to develop a pusher robot for dairy farms.
At present, existing research at home and abroad has solved the problem of navigation path extraction in some agricultural scenarios, but the working environment of dairy farms is rarely addressed. In this study, a new type of machine vision system was developed to fill this gap. The system is used for the extraction and tracking control of the working path of the pusher robot. A binocular vision sensor was used as the image acquisition device, and only binary images were used in the algorithm, thus reducing the computational load of the system. The method for extracting the working path of the pusher robot based on binocular vision proposed in this study is summarized as follows: Firstly, the camera mounted on the robot acquired an image of the working area in front of the robot in real time and preprocessed it by means of sky removal, noise removal and grayscale conversion. Then, the preprocessed image was segmented and, at the same time, the front fence, feed belt and ground data were collected. Finally, the navigation path was obtained by extracting the characteristics of the feed belt. The system designed in this study can autonomously generate accurate navigation paths for robots in a dynamic farm environment, providing a technical reference for the autonomous navigation of farming robots and the development of precision animal husbandry.
This paper is organized as follows:
Section 2 details the materials and methods employed to achieve the research objective. In
Section 3, the experimental results and discussion on the proposed technique are presented. Finally, in
Section 4, the conclusion and future work are presented.
2. Materials and Methods
2.1. The Composition of the Pusher Robot System in the Farm
The pusher robot was mainly composed of a vehicle navigation hardware system, pushing operation system and navigation and operation control system. Among them, the vehicle navigation hardware system and the pushing operation system were the specific execution systems of the instructions, which were responsible for receiving and executing the task instructions issued by the control system to complete the autonomous navigation and pushing operation. The navigation and operation control system was responsible for setting the working mode of the vehicle system, issuing target point instructions, displaying the robot position in real time and controlling the pushing operation system. Through the fusion and analysis of various sensor information, the pusher robot could realize autonomous positioning and navigation in the natural environment. With adaptive sensor fusion technology, it could independently complete the automatic feeding function of the robot in the dairy farm without any environmental assistance.
The hardware device and the control system communicated in real time through the wireless network to complete the autonomous navigation and operation tasks of the pusher robot in the farm together, as shown in
Figure 1. The vehicle navigation hardware system mainly included robot chassis, drive module, control module, environmental information perception module, communication module and power supply module.
According to the tasks of the farm operation and the needs of the environment, the drive module of the robot adopted a two-wheel differential drive structure, and the steering of the robot was realized by setting different speeds for the two driving wheels. The two front wheels were the driving wheels, and the rear wheel was a universal wheel. This drive system was not only simple in structure with a small turning radius, but also highly maneuverable, which greatly improved the control accuracy of the whole machine. Power was supplied by a 60 V lithium battery module. To ensure that the robot had sufficient computing power, the main control unit used an NVIDIA Jetson Nano development board equipped with a Tegra X1 heterogeneous SoC; the unit measured 100 mm × 80 mm × 29 mm. The basic framework for robot operating system navigation was built under Ubuntu 18.04, and information was exchanged with the chassis over RS-485 communication. A ZED 2 binocular camera from Stereolabs was used to build the environmental information perception module; its horizontal field of view was 90°, its vertical field of view was 60°, and it output 4416 × 1242 pixel binocular images at up to 15 Hz. The generated signal was transmitted to the main control unit via USB 3.0. The vehicle was equipped with an STM32F103 embedded board as the low-level controller. Based on the speed information provided by the encoders, the odometry data of the vehicle system (moving speed, driving distance and turning angle) were obtained through kinematic calculation. Finally, the control of the vehicle-mounted system and the pushing operation system was completed through the control algorithm.
2.2. Motion Modeling and Control
In this study, a two-wheel differential drive was used for steering control, so as to realize the autonomous movement of the robot in the farm. Steering was achieved by changing the speed difference between the servo motors driving the wheels on the two sides. A difference between the servo motor speeds caused the vehicle to deviate from the desired direction, with the yaw angle varying with the magnitude of the speed difference. Conversely, if the speeds of the servo motors on both sides were equal, there was no direction change, and the robot kept driving in a straight line without lateral deviation.
2.2.1. Motion Modeling
Without considering the slippage and sideslip of the vehicle [16], the motion model of the robot is shown in Figure 2. Firstly, we established the world coordinate system XwOYw: Xw represented the horizontal direction, O the origin of the world coordinate system, and Yw the direction perpendicular to the Xw axis. Then, the kinematics model of the pusher robot was established in this coordinate system. The rectangle represented the chassis of the robot, the two small black squares represented the two driving wheels, and the circle inside the rectangle represented the universal wheel. Point C was located midway between the two driving wheels and was also the center of mass of the entire robot chassis; its coordinates were set to C(x, y). The distance between the two wheels was denoted L, and the wheel radius was denoted r. The robot’s local coordinate system (XrCYr) was established with point C as the reference point of the robot, where Xr pointed to the right of the moving robot’s forward direction and Yr indicated the forward direction of the mobile robot. θ was the angle between the robot’s forward direction [17] and the Xw axis of the world coordinate system, representing the pose angle of the robot. vl, vr, wl and wr represented the linear and angular velocities of the left and right wheels; v, w and OC represented the instantaneous linear velocity, angular velocity and instantaneous center of the robot’s motion, respectively.
The robot’s pose is expressed as:
The motion of the robot in the world coordinate system was mapped to the robot coordinate system by the orthogonal rotation matrix R(θ) [18]:
The motion equation of the robot’s center of mass in the world coordinate system was as follows:
From this, the kinematics model of the robot was as follows:
It can be seen from Formula (5) that the pose of the robot was determined by the linear velocity and the angular velocity and, according to the kinematics principle, the velocity at point C was as follows:
The pose vector of the mobile robot was:
It can be seen from Formula (7) that the pose state of the robot at each moment was obtained from the angular velocities of the driving wheels on both sides at the corresponding moment.
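As an illustration of these relations, the mapping from wheel angular velocities to the body velocities at point C, and the integration of the world-frame pose, can be sketched as follows. This is a generic differential-drive (unicycle) implementation consistent with the definitions above, not the authors’ code; the sign convention (positive w turns left) is an assumption.

```python
import math

def wheel_to_body(w_l, w_r, r, L):
    """Map left/right wheel angular velocities (rad/s) to the robot's
    instantaneous linear velocity v at point C and angular velocity w,
    for wheel radius r and track width L."""
    v = r * (w_l + w_r) / 2.0   # mean wheel rim speed
    w = r * (w_r - w_l) / L     # speed difference over the track width
    return v, w

def update_pose(x, y, theta, v, w, dt):
    """Integrate the world-frame pose (x, y, theta) over a short step dt."""
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += w * dt
    return x, y, theta
```

With equal wheel speeds the angular velocity is zero and the robot drives straight, matching the behavior described in Section 2.2.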
2.2.2. Motion Control
The lateral deviation and angular deviation of the robot were obtained through the fitting of the cowshed fence, the extraction of the operation route and the determination of the robot’s pose state [19]. The motion deviation of the pusher robot is shown in Figure 3. During the working process of the pusher robot, the linear speed was related to the angular deviation. When there was no angular deviation, the robot moved at the set linear speed; when there was an angular deviation, within a certain range, the larger the angular deviation, the smaller the actual linear speed should be, so as to prevent the robot from deviating further and further away. The actual angular velocity was related to the lateral deviation and heading deviation [20]. When there was lateral or angular deviation, the mobile robot generated a corresponding angular velocity to eliminate it; when both the lateral deviation and the angular deviation were zero, the actual angular velocity of the mobile robot was zero.
As shown in
Figure 4, the PID control algorithm is usually composed of three units: a proportional control unit, an integral control unit and a differential control unit. Here, e(t) is the deviation between the actual value and the given value, and it is also the input of the PID control algorithm. The expression of e(t) is shown in Equation (8), where r(t) is the expected value given to the system, y(t) is the actual output of the system, and e(t) is the deviation between the expected and actual values of the system.
After the linear combination, u(t) is the output value, and it is used as the input of the controlled object. Therefore, the expression of the PID control algorithm is shown in Equation (9), where t is the sampling time, kp is the proportional coefficient of the controller, ki is the integral coefficient of the controller, and kd is the differential coefficient of the controller.
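For reference, a minimal discrete-time PID controller following the structure of Equations (8) and (9) might look as follows; the class name and the backward-difference discretization of the derivative term are illustrative choices, not taken from the paper.

```python
class PID:
    """Discrete PID controller: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0      # accumulated integral of e(t)
        self.prev_error = None   # previous e(t) for the derivative term

    def step(self, setpoint, measurement):
        e = setpoint - measurement               # e(t) = r(t) - y(t), Eq. (8)
        self.integral += e * self.dt             # rectangular integration
        de = 0.0 if self.prev_error is None else (e - self.prev_error) / self.dt
        self.prev_error = e
        return self.kp * e + self.ki * self.integral + self.kd * de
```

In the robot, the same structure would be applied to the lateral and angular deviations to produce the wheel-speed commands.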
The transfer function of Equation (9) is of the form:
The fuzzy control algorithm can adjust the PID parameters online according to the system deviation [21,22]. However, the initial PID parameters still need to be tuned, which makes it difficult for the system to perform optimally. Therefore, particle swarm optimization is used to find the optimal PID parameters, yielding a better-balanced and more responsive system.
The particle swarm optimization algorithm is a random search algorithm centered on group collaboration [23]. As shown in Figure 4, the particles in the search space represent candidate solutions to the problem, and these particles form a population. The direction and distance a particle flies are determined by its velocity. Each particle flies through the solution space toward the optimal solution, and when a set number of iterations is reached, the optimal position vector found by the particles is the best approximate solution of the problem. Particle swarm optimization has the advantages of simple iteration and fast convergence, and is suitable for parameter optimization of control systems.
The process of optimizing the fuzzy PID controller with the particle swarm optimization algorithm is shown in Figure 5. Assuming that K particles are initialized in the D-dimensional solution space, each particle has its own position Xm = (Xm1, Xm2, …, XmD) and its own velocity Vm = (Vm1, Vm2, …, VmD), where m = 1, 2, 3, …, K. At each iteration in the solution space, a particle refines its own solution through the individual optimal solution Pbestm = (Pm1, Pm2, …, PmD) and the population optimal solution Gbest = (G1, G2, …, GD) for its velocity and position. The update formulas for particle velocity and position at the Nth iteration are shown in Equations (11) and (12), where VmD represents the velocity of the mth particle, XmD represents the position of the mth particle, c1 and c2 are the learning factors, r1 and r2 are random numbers uniformly distributed in [0, 1], and N is the number of iterations.
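The velocity and position updates of Equations (11) and (12) can be sketched as below on a toy objective. The sphere function stands in for the actual control-performance index used to rate a PID parameter set, and the inertia weight w, the learning factors and the initialization bounds are assumed values not given in the text.

```python
import random

def pso(objective, dim, n_particles=20, iters=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over R^dim with a basic particle swarm."""
    rng = random.Random(seed)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in X]                      # individual best positions
    pbest_val = [objective(x) for x in X]
    g = min(range(n_particles), key=lambda m: pbest_val[m])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # population best
    for _ in range(iters):
        for m in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. (11): inertia + cognitive pull + social pull
                V[m][d] = (w * V[m][d]
                           + c1 * r1 * (pbest[m][d] - X[m][d])
                           + c2 * r2 * (gbest[d] - X[m][d]))
                X[m][d] += V[m][d]                 # Eq. (12)
            val = objective(X[m])
            if val < pbest_val[m]:
                pbest[m], pbest_val[m] = X[m][:], val
                if val < gbest_val:
                    gbest, gbest_val = X[m][:], val
    return gbest, gbest_val
```

In the controller-tuning setting, `dim` would be the number of PID (or fuzzy PID) parameters and `objective` would evaluate a closed-loop performance measure for a candidate parameter vector.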
2.3. Navigation Path Extraction Algorithm
As shown in
Figure 6, after a dairy cow had eaten for a period of time, a large portion of the forage would be pushed by the cow into the area it could not reach, making the stacking shape of the forage irregular and the stacking thickness uneven. Therefore, the pusher robot operated along the outer edge of the area where the cows could not reach the forage, pushing the forage in this area back to where the cows could reach it. The outer boundary of the no-feed zone was the tangent to the outermost forage and was parallel to the fence. As shown in Figure 7, during the entire operation, the centerline of the pusher robot was kept coincident with the outer boundary of the no-feed zone.
By analyzing the above operation process, it was found that the navigation path of the pusher robot was the outer boundary of the area the cows were unable to reach. The navigation path extraction process is shown in Figure 8. The extraction of the navigation path from the RGB images collected by the ZED 2 camera was carried out on the Jetson Nano development board. Firstly, the camera mounted on the robot acquired the RGB image of the working area in front of the robot in real time and preprocessed it by means of sky removal, noise removal and grayscale conversion. Then, the preprocessed image was segmented and, at the same time, the front fence, feed belt and ground data were collected. Finally, the navigation path was obtained by extracting the characteristics of the feed belt.
2.4. Image Acquisition
Image data were collected at the Jinlan Dairy Farming Company (Tai’an, Shandong, China) from 7 September to 10 September 2021. The company currently has 6 cowsheds (each 70 m long and 30 m wide), holding a total of 2000 high-quality Holstein dairy cows. A ZED 2 camera was installed at the top front end of the robot, 1.2 m above the ground, to collect RGB images of the cowshed environment along the robot’s driving path. In order to reduce the image processing load and meet the limited computing power of the development board, the SDK provided by the manufacturer was deployed on the Jetson Nano development board, and images were acquired at a frame rate of 2 fps and a resolution of 1280 × 720. A total of 400 cowshed images were acquired over the whole process, as shown in Figure 9; the acquisition period was 8:00–18:00. The image set covered different lighting situations (front light, back light, side light, etc.) and road environments (inside the barn, beside the barn and between barns).
2.5. Image Preprocessing
Raw images taken directly in a cowshed were always affected by a variety of factors, including changing backgrounds, lighting and occlusions. Therefore, segmentation and path extraction could not be performed directly on the original image. Generally, the original image is preprocessed through a series of algorithms to convert it into a form that is convenient and fast for the robot to process [24,25]. Image preprocessing may involve one or several steps, depending on the nature of the original image. In this work, we employed two techniques (image transformation [26] and smoothing [27]) to perform sky removal, noise removal and grayscale operations on the original image.
2.5.1. Sky Removal
During the working process of the robot, there were generally feed belts and dairy cow pens beside the road, and the background was generally the sky and the roofs outside the cow barn. After identifying the areas on both sides, the road and sky range in the middle could be obtained, and road feature information and navigation paths could then be extracted after sky removal. Therefore, an image transformation was utilized to remove the sky from the background. OpenCV was used to read the RGB values of the sky area. The sum of squared deviations (SS) between the RGB values of the sky (252, 240, 200) and those of other elements was calculated through the threshold processing function as a measure of the similarity between the current pixel and the background, as shown in Equation (13).
where p0, p1 and p2 were the G, B and R values of the other elements in the image, respectively, and SC0, SC1 and SC2 were the G, B and R values of a point in the sky, respectively.
Combined with the actual environment of the cattle farm, the threshold was determined to be 9000 after many tests. The test results showed that more pixels were covered and a better processing result was obtained. Filling the sky color with black (0, 0, 0) eliminated unnecessary image factors, which was beneficial to subsequent image segmentation and reduced computational costs. For each RGB image pixel, the values of the B, G and R channels were converted to the same value, and the grayscale effect was observed. As shown in Equation (14), in this study the average-value method was used: the G, B and R values of each pixel in the original image were averaged, the original pixels were replaced, and the original image was converted to a grayscale image. The processing effect is shown in Figure 10.
where B, G and R are the values of each component of the RGB image from which the sky was removed; Gray is the replacement value; and B1, G1 and R1 are the values of the replaced components.
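The per-pixel operations of Equations (13) and (14) can be sketched as follows. The paper applies them with OpenCV over whole images; this is a plain-Python illustration using the stated sky sample (252, 240, 200) and threshold 9000, with the channel ordering assumed.

```python
SKY_RGB = (252, 240, 200)   # sampled sky colour from the text (ordering assumed)
SS_THRESHOLD = 9000         # similarity threshold determined in the text

def sky_to_black(pixel):
    """Equation (13): blacken a pixel whose sum of squared deviations
    from the sampled sky colour falls below the threshold."""
    ss = sum((p - s) ** 2 for p, s in zip(pixel, SKY_RGB))
    return (0, 0, 0) if ss < SS_THRESHOLD else pixel

def to_gray(pixel):
    """Equation (14): average-value grayscale conversion."""
    return sum(pixel) // 3
```

Applying `sky_to_black` to every pixel and then `to_gray` reproduces the sky removal and grayscale steps of the preprocessing pipeline on a single pixel.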
K-means clustering is also often used for such problems and is widely applied in practice due to its good scalability and low complexity. For the task of removing the sky from the background, the number of clusters can be set to 2. A total of 50 images were randomly selected from each of the RGB image sets acquired outside and inside the barn and tested using the K-means algorithm. The average clustering success rate of K-means was 79%: 47 of the 50 images outside the barn were clustered successfully, and 32 of the 50 images inside the barn were clustered successfully.
The similarity measure between classes is calculated by the discrete sum-of-squares method, which is slow because the inter-class distance must be computed at each clustering step. In addition, the K-means algorithm is sensitive to the initial cluster centers, which cannot be specified accurately in advance; in complex scenes it therefore easily falls into local optima and can mistakenly remove road elements. Considering these limitations, we used the following algorithm for the sky removal operation.
The global threshold algorithm applies threshold-based binarization to the entire image to segment the pixels [28]. RGB images from inside and outside the barn were randomly selected for a segmentation test, with the maximum value in the threshold segmentation set to 255. The key to global threshold segmentation is the choice of the threshold value. The test results showed that, when the threshold was set to 210, the sky could be effectively segmented from the cowshed scene while the road surface was preserved. The processing effect is shown in Figure 11.
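A minimal sketch of this global thresholding step on a grayscale image stored as nested lists is given below. The paper uses OpenCV's threshold function; mapping above-threshold (sky) pixels to the maximum value 255 and the rest to 0 is an assumed convention for illustration.

```python
def global_threshold(gray, thresh=210, maxval=255):
    """Binarize a grayscale image (list of rows): pixels strictly above
    `thresh` map to `maxval`, all others to 0."""
    return [[maxval if p > thresh else 0 for p in row] for row in gray]
```

The binary mask then separates the sky region from the road and fence for the subsequent processing stages.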
2.5.2. Image Noise Reduction
To smooth the image, the brightness must change gradually: sudden gradients are reduced, the wide, low-frequency main part of the image is highlighted and the impulse noise of the image is suppressed. The median filter method was used in this study. Its basic principle is to replace the value of a point in the digital image with the median of the values in that point’s neighborhood, which eliminates isolated noise points by making nearby pixel values close to the true value. An effective median filter could remove the noise points without destroying the information of other normal image points, while retaining boundary information [29]. The choice of kernel size directly affected the output quality and processing time. The experimental results showed that the output quality of the 3 × 3 kernel was satisfactory without increasing the processing resources. Therefore, image regions were denoised by median filtering over a 3 × 3 kernel neighborhood. The transformation and smoothing effects of the image are shown in Figure 12.
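A plain-Python sketch of 3 × 3 median filtering as described is shown below; the paper presumably relies on OpenCV's built-in median blur, and the copy-the-border policy here is an assumption for illustration.

```python
def median_filter_3x3(img):
    """Replace each interior pixel by the median of its 3x3 neighbourhood;
    border pixels are copied through unchanged (assumed border policy)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            window = sorted(img[i + di][j + dj]
                            for di in (-1, 0, 1) for dj in (-1, 0, 1))
            out[i][j] = window[4]   # median of the 9 neighbourhood values
    return out
```

An isolated impulse (a single bright pixel among dark neighbours) is replaced by the neighbourhood median, removing the noise point without blurring edges.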
2.6. Image Segmentation
2.6.1. Gabor Filter
Depending on the problem to be solved and the content of the images, segmentation could be done in a variety of ways. However, considering the complex background of the cowshed, a Gabor filter was used to extract the texture features of the cowshed environment. The Gabor filter is robust to changes in image brightness, contrast and pose, and it expresses the local features most useful for image recognition. Therefore, Gabor filters are widely used in texture segmentation, edge detection and content detection [30]. From a biological point of view, the Gabor filter is well suited to the expression and separation of texture, and its frequency and orientation representations are similar to those of the human visual system. In previous studies, no one has attempted to use this technique in a cattle barn environment, especially for pusher robot navigation.
In the spatial domain, a two-dimensional Gabor filter is the product of a sinusoidal plane wave and a Gaussian kernel: the former is a harmonic function, and the latter a Gaussian envelope. The following is the complex expression of the two-dimensional Gabor function.
where x, y represent the original pixel coordinates and x′, y′ are the pixel coordinates after the Gabor transformation; λ is the wavelength, which directly determines the filtering scale of the filter and is specified in pixels; θ is the orientation of the filter, which specifies the direction of the parallel stripes of the Gabor function and ranges from 0° to 360°; Ψ is the phase offset of the tuning function; γ is the spatial aspect ratio, which determines the shape of the filter; and σ is the bandwidth parameter, i.e., the standard deviation of the Gaussian. The relationship between the standard deviation and the wavelength was σ = 0.56λ. The smaller the bandwidth, the larger the standard deviation, the larger the Gabor shape and the greater the number of visible parallel stripes.
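The real part of the 2-D Gabor kernel defined above, with σ tied to the wavelength by σ = 0.56λ, can be sketched as follows. The kernel size and the aspect ratio γ = 0.5 are illustrative defaults, not values from the paper; OpenCV's built-in Gabor kernel generator would serve the same purpose in practice.

```python
import math

def gabor_kernel(ksize, lam, theta, psi=0.0, gamma=0.5):
    """Real part of a 2-D Gabor filter of odd size `ksize`, wavelength
    `lam` (pixels) and orientation `theta` (radians)."""
    sigma = 0.56 * lam          # standard deviation tied to wavelength
    half = ksize // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xp = x * math.cos(theta) + y * math.sin(theta)    # rotated x'
            yp = -x * math.sin(theta) + y * math.cos(theta)   # rotated y'
            envelope = math.exp(-(xp ** 2 + gamma ** 2 * yp ** 2)
                                / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xp / lam + psi))
        kernel.append(row)
    return kernel

# The 40-filter bank of Section 2.6.2: 5 wavelengths x 8 orientations.
bank = [gabor_kernel(7, lam, math.pi * t / 8)
        for lam in (2, 3, 4, 5, 6) for t in range(8)]
```

Convolving the preprocessed image with each kernel in the bank yields the 40 amplitude features per pixel used in the next section.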
2.6.2. Principal Component Analysis
For each pixel, the Gabor filter bank extracted 40 amplitude features using 40 different filters at 5 scales (λ = 2, 3, 4, 5, 6) and 8 orientations (θ = 0, π/8, π/4, 3π/8, π/2, 5π/8, 3π/4, 7π/8). A barn image thus has 1280 × 720 × 40 = 36,864,000 feature values, which are difficult to characterize directly; therefore, in order to minimize the computational requirements during processing, the principal component analysis (PCA) technique was used. PCA is a common data analysis method, often used for dimensionality reduction of high-dimensional data, and can be used to extract the main features of the data [31]. The principle of PCA is to reduce the dimensionality of a dataset with a large number of correlated variables while preserving as much of the variation present in the dataset as possible. PCA eliminates redundancy by transforming several interrelated variables into a new set of orthogonal features, called principal components.
In this work, we implemented the PCA algorithm based on eigenvalue decomposition of the covariance matrix. For the 36,864,000-dimensional input, each feature was first centered by subtracting its mean; the eigenvalues and eigenvectors of the covariance matrix were then calculated, the 500 largest eigenvalues were selected, and the corresponding 500 eigenvectors were used as the row vectors of the matrix P. PX is then the mathematical expression of the data after dimensionality reduction, where X represents the input n-dimensional vector, n = 36,864,000, and D(X) represents the covariance matrix of the n-dimensional vector.
The Gabor filter outputs were run through the PCA algorithm to identify the principal components used as texture descriptors representing the cow-field paths, thus allowing a more natural segmentation and reducing the dimensionality of the data.
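On the full 36,864,000-dimensional data the authors use eigenvalue decomposition to keep 500 components; as a small-scale stand-in for that step, the leading principal component can be computed by power iteration on the covariance matrix, as sketched below (a toy illustration, not the authors' implementation).

```python
def top_principal_component(data, iters=200):
    """Leading principal component (unit vector) of `data`, a list of
    samples, via power iteration on the sample covariance matrix."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix C = X^T X / (n - 1) on the centered data.
    C = [[sum(centered[i][a] * centered[i][b] for i in range(n)) / (n - 1)
          for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v
```

Repeating this with deflation (subtracting each found component) would yield further components; in practice a linear algebra library computes all 500 at once.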
2.6.3. Images Segmented by Global Thresholding
The next step was to perform a threshold binarization process with a maximum value of 255 and a threshold value of 80, which could extract fences and road surfaces, as shown in
Figure 13.
The thresholding method occupies a very important position in image processing, and the selection of the correct threshold is the key to successful segmentation. Generally speaking, the available approaches are basic global thresholding, which relies on iterative threshold selection, and optimal global thresholding using Otsu’s method, which relies on variance maximization; the choice is usually guided by the histogram. When the peaks in the image histogram are clearly separated, the basic thresholding method is often a good solution, and the farther apart the peaks, the greater the chance of a clean separation. In the image preprocessing stage, we also performed image transformation and smoothing; the grayscale histograms before and after processing are shown in Figure 13.
As shown in Figure 14, after image preprocessing, the grayscale distribution of the image was smooth and uniform, and a suitable threshold value was quickly found through testing. When the threshold was set to 80, the extraction of the fence and the path achieved a good segmentation effect on the acquired images. When the threshold was less than 80, the road surface and the feed belt could not be distinguished; when it was greater than 80, the road surface could not be segmented completely.
The feature extraction of the feed belt was carried out in the cowshed, as shown in Figure 15.
2.7. Image Processing
Extracting regions of interest (ROI) from images is a research hotspot in the field of digital image processing. Therefore, in the image processing stage, an ROI was used to extract the coordinates of the robot’s navigation path. When the pusher robot moved from barn to barn, it needed to find the outline of the path. The specific processing was as follows: the edge point set of the path contour was extracted to obtain the contour of the path, as shown in Figure 16a; the moments of the contour were then computed, from which the center coordinates of the contour were calculated, as shown in Figure 16b. These center coordinates are the navigation coordinates. As the pusher robot captured front-view images at a frame rate of 2 fps, the midpoints of these images were extracted and connected to form a line, yielding the navigation path.
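The centroid-from-moments step can be sketched for a contour given as a point set; OpenCV's contour and moment routines perform the equivalent directly on the binary image, so this point-set form is a simplification for illustration.

```python
def contour_centroid(points):
    """Centroid of a contour point set via the zeroth and first moments:
    for a point set, m00 = point count, m10 = sum of x, m01 = sum of y,
    and the centroid is (m10/m00, m01/m00)."""
    m00 = len(points)
    m10 = sum(x for x, _ in points)
    m01 = sum(y for _, y in points)
    return m10 / m00, m01 / m00
```

Connecting the centroids computed from successive frames produces the navigation line described above.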
When the pusher robot pushed the forage in the cowshed, it needed to work along the outermost boundary of the area where the dairy cows could not reach the forage (whose shape was irregular) and push the forage into an area the cows could reach. Therefore, the centerline of the robot had to be kept coincident with this outermost boundary throughout the operation. Considering the computing power and working efficiency of the system, building a 3D model for each RGB image was not realistic; instead, information reflecting the three-dimensional space was extracted from the two-dimensional image. As shown in Figure 17, in path extraction the boundary was first extracted with the help of mathematical morphology. The first step was an erosion operation: the internal points of the feed belt were preserved by a 3 × 3 structuring element to obtain the eroded image, and the complement of the eroded image with respect to the original feed belt gave the feed belt boundary. After extracting the contour of the feed belt, a connected-domain analysis was performed: the 8 pixels around each pixel were examined, and a seed-filling connected-domain labeling algorithm was used to separate the foreground target of interest from the image background, so that subsequent image processing operations could be performed on it. The main idea of the seed-filling labeling algorithm is to select a pixel in the image as the seed at which the traversal starts, and then merge eligible pixels into the set according to the specified neighborhood relationship; after the traversal, all pixels in the set belong to the same connected region. In the two-dimensional image, a straight-line envelope of the irregular feed belt contour achieved the required processing effect. The minimum enveloping triangle was found through the above connected-domain analysis (its three sides contained all the points in the set), ensuring that the feed was within the pushing range and better reflecting the three-dimensional space. Compared with traditional polygon and convex hull contour processing, the path obtained from the minimum circumscribed triangle was shorter and the work efficiency was higher.
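The erosion-then-complement boundary extraction described above can be sketched on a binary image as follows; the connected-domain labeling and minimum enclosing triangle steps are omitted, and treating border pixels as eroded is an assumed convention.

```python
def erode_3x3(img):
    """Binary erosion with a 3x3 structuring element: a pixel survives
    only if its entire 3x3 neighbourhood is foreground (1)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            out[i][j] = int(all(img[i + di][j + dj]
                                for di in (-1, 0, 1) for dj in (-1, 0, 1)))
    return out

def boundary(img):
    """Morphological boundary: foreground pixels removed by the erosion,
    i.e. the complement of the eroded image within the original region."""
    er = erode_3x3(img)
    h, w = len(img), len(img[0])
    return [[int(img[i][j] and not er[i][j]) for j in range(w)]
            for i in range(h)]
```

Applied to the segmented feed belt mask, `boundary` yields the one-pixel-wide contour that the connected-domain analysis and enclosing-triangle fit then operate on.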