Article

Crop Identification and Growth Stage Determination for Autonomous Navigation of Agricultural Robots

Centre for Automation and Robotics (UPM-CSIC), Arganda del Rey, 28500 Madrid, Spain
* Author to whom correspondence should be addressed.
Agronomy 2023, 13(12), 2873; https://doi.org/10.3390/agronomy13122873
Submission received: 16 October 2023 / Revised: 17 November 2023 / Accepted: 20 November 2023 / Published: 22 November 2023

Abstract

This study introduces two methods for crop identification and growth stage determination, focused primarily on enabling mobile robot navigation: a two-phase approach involving separate models for crop and growth stage identification, and a one-phase approach employing a single model capable of handling all crops and growth stages. Both methods were validated with maize and sugar beet field images, demonstrating their effectiveness. The one-phase approach proved advantageous for scenarios with a limited variety of crops, recognizing both the crop type and its growth stage with a single model and reaching an overall Mean Average Precision (mAP) of about 67.5%. The two-phase method first recognized the crop type, achieving an overall mAP of about 74.2%, with maize detection performing particularly well at 77.6%; however, when identifying the specific maize growth stage, the mAP reached only 61.3%, owing to difficulties in accurately categorizing maize plants with six and eight leaves. On the other hand, the two-phase approach proved more flexible and scalable, making it the better choice for systems accommodating a wide range of crops.

1. Introduction

In the last hundred years, the world’s population has quadrupled. In 1915, there were 1.8 billion individuals inhabiting the planet. Based on the latest UN estimate, the global population stands at 8 billion, with projections suggesting a potential increase to 9.7 billion by 2050 and 10.4 billion by 2100 [1]. Population growth, compounded by growing challenges to global food security, including increasing dependence on animal-based foods, declining water and land resources, and the impacts of climate change, is amplifying the urgent global need for food [2].
Food demand is expected to grow steadily over the next 30 years, with global demand for food crop production projected to increase by 46.8% by 2050 relative to 2020 [2,3]. Therefore, farmers worldwide need to boost crop production by expanding agricultural land for cultivation or improving productivity on existing farmlands through fertilization and irrigation. One of the most promising strategies in recent years, however, is the adoption of innovative approaches such as precision agriculture (PA), characterized by the use of modern information and communication technologies to enhance agricultural productivity and profitability, which has garnered significant interest [4]. PA involves technologies that integrate sensors, information systems, sophisticated machinery, and informed decision making, and aims to enhance agricultural production by effectively managing the variations and uncertainties inherent in agricultural systems.
The inclusion of PA strategies to manage each plant individually has required the use of methodologies based on computer vision and Artificial Intelligence (AI) that have helped (i) to identify the needs of each plant [5], (ii) to avoid damage when carrying out mechanical weed management [6], and, in some cases, (iii) to perform autonomous navigation in the field [7].
This research addresses the critical challenge of enabling mobile robots to discern and classify crops precisely, thus empowering them to make informed decisions regarding crop-specific tasks. To achieve this objective, the state of the art in the relevant techniques is analyzed in the following subsections.

1.1. Significance of Technology and Automation in Agriculture

One of the elements that has shown the most potential to apply advanced PA techniques has been the use of robotic systems, whose usage and incorporation into the crop field have increased in recent years [8]. These robots are produced to perform all types of tasks in the field, such as guidance and mapping, automated harvesting, site-specific fertilization, environmental conditions monitoring, livestock monitoring, pesticide spraying, and precision weed management, among others.
Localization and guidance techniques in agriculture encompass a variety of approaches, such as Extended Kalman Filter (EKF) [9], Particle Filter (PF) [10], and Visual Odometry (VO) [11], among others, which have enabled robots to determine their position within the agricultural environment. Mapping applications involve the creation of maps using techniques like metric-semantic mapping, Light Detection and Ranging (LiDAR) mapping, and fusion of point cloud maps, enabling robots to navigate and interact with the agricultural landscape while facilitating specific tasks like fruit monitoring or weed control [12].
Regarding agricultural harvesting robots, they typically consist of mobile platforms carrying robotic arms [13]. These robots require advanced vision systems, employing adaptive thresholding algorithms as well as texture-based methods and color and shape feature extraction, to identify target fruits. The integration of color and depth data, the utilization of reinforcement learning methods [14,15], and the application of deep Convolutional Neural Networks for image segmentation [16] are some of the strategies currently used to separate the foreground from the background. Additionally, these robots rely on human–robot interaction strategies with 3D visualization, systematic operational planning, efficient grasping methods, and well-designed grippers [17].
Among the applications attracting the most interest in automation are pesticide spraying [18] and weed management in general [19]. These robots are designed to minimize pesticide wastage by targeting pest-affected areas on plants. Strategies such as Convolutional Neural Networks (CNNs) have proven highly beneficial for accurately identifying pests on crops, ensuring precise pesticide application. Moreover, several autonomous commercial robots designed for weed removal and precision agriculture tasks have emerged in recent years. The Small Robot Company [20] has introduced three robots: Tom digitizes the field, Dick zaps weeds with electricity, and Harry sows and records seed location, reducing the need for chemicals and heavy machinery. EcoRobotix [21], a Swiss prototype, employs computer vision to identify weeds and selectively spray them with a small dose of herbicide, significantly reducing herbicide usage. AVO [22] uses machine learning for centimeter-precise weed detection and spraying, minimizing herbicide volume by over 95% while preserving crop yield. Tertill by Franklin Robotics [23] recognizes and cuts weeds using sensors and operates on solar power. TerraSentia [24] is an agricultural robot designed for autonomous weed detection.
One of the key enablers for deploying mobile robots in the field and executing precision agriculture tasks is Global Navigation Satellite Systems (GNSS) [25]. Currently, commercial mobile robots perform tasks for which full-coverage maps are generally used, such as land preparation and seeding [26]; many others focus on perennial crops. This is because navigating on arable land with crops at an early stage of growth is challenging, especially if a precise map of the location of each plant is not available, which creates a risk of damaging the plants.

1.2. Machine Vision in Agriculture

In the past five decades, AI has demonstrated its resilience and ubiquity across various domains, and agriculture is no exception [27]. Currently, agriculture deals with numerous challenges, such as crop disease control, management of pesticide use, ecological weed control, and efficient irrigation management, among many others. Machine learning (ML), a subset of AI, has contributed to addressing all of these challenges [28]. AI-based systems, especially those utilizing artificial neural networks, have become reliable and effective solutions for agricultural purposes thanks to their predictive capabilities, parallel reasoning, and ability to adapt through training [29]. They excel in complex mapping tasks when they are provided with a reliable set of variables [30], like forecasting water resource variables [31] or predicting nutrition level in crops [32].
Neural networks have played a pivotal role in driving substantial advancements in machine vision, leading to a remarkable surge in their utilization within agriculture. Machine vision facilitates precise crop identification and assessment, delivering significant benefits [33]. Capturing insights into crop development enables actions in accordance with the agricultural cycle stage. Moreover, it aids in detecting plant pests and diseases, assessing fertilizer requirements, and optimizing irrigation management [34]. To leverage machine vision for identification, close-range or even overhead views are often needed.
Additionally, machine vision offers the valuable capability of accurate crop localization [35]. This feature empowers various on-field tasks, including treatments at distinct crop growth stages and efficient harvesting. Furthermore, having precise crop location data proves advantageous for field navigation, mitigating the risk of crop damage during agricultural operations [25].
Numerous experts within smart agriculture demand the involvement of agricultural robots, encompassing activities like fruit harvesting and crop yield tracking [36]. Consequently, agricultural robots and associated technologies hold a crucial role within the domain of smart agriculture [37]. Among these technologies, automatic navigation stands out as the fundamental and central function of autonomous agricultural robots [38].
In contrast to alternative navigation technologies, machine vision has witnessed growing adoption in autonomous agricultural robots. This preference arises from its merits, including cost-effectiveness, simplicity of maintenance, versatile applicability, and a high degree of intelligence [39]. Recent years have seen a proliferation of novel approaches, technologies, and platforms within the topic of machine vision, all of which have found pertinent applications in agricultural robotics [39]. Li et al. [40] proposed a navigation line extraction method specifically designed to cater to various stages of cotton plant growth, from emergence to blooming. Zhang et al. [41] investigated navigation in greenhouses by employing contour and height data of crop rows. The fusion of these features generated a confidence density image, which aided in determining heading and lateral errors. Experiments conducted in an indoor simulation environment have shown that agricultural robots can maneuver through rows of crops in S and O configurations.
In the agricultural field, image analysis stands as a significant research domain, where intelligent data analysis methods are actively deployed for tasks such as image recognition, classification, anomaly detection, and more across diverse agricultural applications [42].

1.3. Crop Detection and Localization

Numerous algorithms have been devised and tested to facilitate robot autonomous navigation, and new solutions are still being sought [39], many of which use crop row detection as a guide [43]. The challenge in row recognition lies in identifying robust features that remain unchanged in different environmental situations. The complexity of this task is heightened by factors like incomplete rows, absent plants, irregular size and shape of plants, and variability in light intensity, which is one of the significant challenges of computer vision in open environments. Additionally, the presence of weeds within the row can introduce noise and disrupt the row recognition process. These research studies mainly focus on advancing image segmentation techniques to extract orientation cues for crop row applications [44].
When dealing with crops in their early growth stages, characterized by considerable spacing between individual plants, applying segmentation techniques may not be the most suitable approach. While proficient at delineating the entire plant’s surface, image segmentation models face limitations in accurately capturing the morphology of plants during their initial growth phases [45]. Moreover, the presence of weeds and other vegetation can introduce inaccuracies in the results generated by segmentation algorithms. Another drawback lies in the intricacies associated with data preparation for effective model training. In such scenarios, it proves advantageous to employ object detection techniques [46], which facilitate the precise identification of each plant and subsequently enable the crop lines to be estimated. While segmentation models require delineating all plant edges, a task further complicated by the limited availability of open databases for training purposes, object detection models necessitate the identification of bounding boxes around objects of interest [47], simplifying the annotation procedure.
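As a minimal illustration of this idea (not the method evaluated later in this paper), detected bounding boxes can be reduced to plant centers and a crop-row line fitted through them by least squares; the detections and coordinates below are hypothetical.

```python
# Hypothetical sketch: estimating a crop-row line from object-detection output.
# Each detection is (x_center, y_center, width, height) in image pixels; the row
# is approximated by a least-squares line through the box centers.
import numpy as np

def fit_crop_row(detections):
    """Fit y = m*x + b through bounding-box centers; returns (m, b)."""
    centers = np.array([(d[0], d[1]) for d in detections], dtype=float)
    m, b = np.polyfit(centers[:, 0], centers[:, 1], deg=1)
    return m, b

# Five plant detections roughly aligned along one row
row = [(100, 410, 40, 60), (220, 402, 42, 58), (350, 395, 38, 55),
       (480, 390, 44, 61), (600, 381, 41, 57)]
print(fit_crop_row(row))  # slope close to 0, i.e., a nearly horizontal row in the image
```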
Object detection models have traditionally found primary applications in agriculture for tasks such as fruit detection [48]; however, their distinct advantages over segmentation models, particularly when crops exhibit spacing between individual plants, position them as a compelling alternative [39]. The versatility and accuracy offered by object detection models in identifying and localizing individual plants in such settings underscore their potential utility for various agricultural applications beyond the conventional use cases.

1.4. Object Detection

Object detection algorithms represent a vital computer vision technique within Artificial Intelligence and machine learning. Their fundamental purpose involves identifying and localizing objects within images or video frames. These algorithms play a pivotal role in recognizing multiple objects of interest within a given image while providing crucial information regarding their positions and shapes [28]. Over the past two decades, object detection has seen a remarkable technological evolution, significantly impacting the entire field of computer vision. This evolution has transitioned from traditional detection models in the early 2000s to the profound influence of deep learning, notably with the advent of Convolutional Neural Networks (CNNs) [49].
The introduction of regions with CNN features (RCNN) by R. Girshick in 2014 [47] marked a pivotal moment in the rapid advancement of object detection. In the era of deep learning, object detectors fall into two categories: “two-stage detectors” and “one-stage detectors”. The former follows a “coarse to fine” approach, while the latter aims to “complete in one step”. Examples of “two-stage detectors” include RCNN, SPPNet (Spatial Pyramid Pooling Network), Fast RCNN, Faster RCNN, and Feature Pyramid Networks (FPN). On the other hand, “one-stage detectors” encompass models like You Only Look Once (YOLO), Single Shot MultiBox Detector (SSD), CenterNet, and DEtection TRansformer (DETR) [49,50].
YOLO has emerged as a widely adopted and influential algorithm in object detection. Its distinctive features include model compactness and rapid computation speed [51]. YOLO gained prominence with the introduction of its first version by Redmon et al. in 2015 [52] and has since seen several subsequent versions published by scholars. Comparisons of open-source object detection models, such as the one by B. Jabir et al. [53], have underscored the speed and lightweight nature of YOLO v5.
Generally, the Mean Average Precision (mAP) metric is commonly used to evaluate the performance of any trained model. mAP is a standard evaluation metric widely employed in machine learning models and benchmark challenges like Pattern Analysis, Statistical Modelling, and Computational Learning Visual Object Classes (PASCAL VOC) [54]; Common Objects in Context (COCO) [55]; ImageNET (database for visual object recognition software) [56]; and the Google Open Images Challenge [57]. mAP offers a comprehensive assessment of a model’s performance, mainly in tasks involving multi-class or multi-label classification, such as object detection and image segmentation, by considering precision and recall across multiple classes or labels.
To determine the success of object detection models, it is crucial to establish the level of overlap between the predicted bounding box and the ground truth. One way to measure this is the Intersection over Union (IoU); mAP50 refers to the accuracy obtained with an IoU threshold of 50%, so a detection is successful when there is more than a 50% overlap. In evaluating the results in this article, all reported mAP values are mAP50, i.e., computed with the IoU threshold set at 50%.
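For readers unfamiliar with the criterion, the following minimal sketch (not taken from the paper) shows how IoU is computed for two axis-aligned boxes; under mAP50, a detection counts as correct when this value exceeds 0.5.

```python
# Minimal sketch: IoU between two axis-aligned boxes given as (x_min, y_min, x_max, y_max).

def iou(box_a, box_b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    # Coordinates of the intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: a predicted plant box vs. its annotation
print(iou((100, 100, 200, 200), (120, 110, 210, 220)))  # ~0.57 -> correct at IoU = 0.5
```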

1.5. Crop Identification

Crop identification using computer vision is a research field that merges image processing, machine learning, and agronomy practices to recognize diverse crop types from images taken mainly by color cameras. The objective focuses on facilitating the management and protection of crops.
Computer vision originated in the 1960s [58], when early studies acknowledged the necessity of aligning two-dimensional image features with three-dimensional object representations, which were applied to pattern recognition systems. Since then, considerable progress has been made in many sectors, among which agriculture stands out, where computer vision has been applied to plant recognition with notable results. In the last decade, some works have implemented the identification of plants and their position by relying on the plants’ outline characteristics [59]. CNNs have also been used for soybean image recognition [60]. Other methods have combined deep CNNs and binary hash codes for field weed identification [61].
Moreover, some methods have focused on identifying color characteristics to discern between sugar beet plants and weeds, reaching accuracies greater than 90% [62]. Further work on sugar beet recognition [63] reports the utilization of local features such as speeded-up robust features (SURF), scale-invariant feature transform (SIFT), and twin leaf region (TLR) in place of characteristics that describe the complete plant outlines. The method used SIFT to describe points of interest previously found by applying the Hessian–Laplace detector [64]. This method can distinguish thistles and sugar beets with an accuracy close to 100%.
Significant advances have also been made in identifying maize for weeding tasks. This crop is challenging to identify because it has thin and sparsely distributed leaves [65], which negatively affects calculation time and the robustness of the algorithms [66,67]. In some recent proposals [68], the primary emphasis is on comprehensive image processing and centroid detection strategies. The images are first processed to separate the green color using the RGB color mode and Otsu’s threshold method [69]. Subsequently, the positions of the maize plants are computed with a pixel projection histogram.
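A rough sketch of that kind of pipeline, approximated here with OpenCV and not reproducing the cited authors’ code, is shown below: an excess-green index is thresholded with Otsu’s method, and the resulting vegetation mask is projected column-wise to locate candidate plant positions. The input file name is hypothetical.

```python
# Approximate sketch of the pipeline described in [68,69] (not the authors' code):
# excess-green index -> Otsu threshold -> column-wise pixel projection histogram,
# whose peaks indicate candidate maize plant / row positions.
import cv2
import numpy as np

def plant_column_histogram(bgr_image):
    b, g, r = cv2.split(bgr_image.astype(np.float32))
    exg = 2 * g - r - b                       # excess-green index
    exg = cv2.normalize(exg, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    _, mask = cv2.threshold(exg, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask.sum(axis=0) / 255             # vegetation pixels per image column

img = cv2.imread("maize_row.jpg")             # hypothetical field image
hist = plant_column_histogram(img)
print(int(hist.argmax()))                     # column with the strongest vegetation response
```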
One limitation of this research line is the constrained capacity to differentiate among plant species. This is sufficient for locating crop plants during weeding, especially with mechanical tools, where everything that is not a crop is destroyed. However, weeding tasks that require distinguishing each weed species to apply a specific treatment (herbicides, laser, etc.) demand innovative techniques capable of handling a wide variety of plant species.

1.6. Determination of Crop Growth Stage

Until a few years ago, the estimation of crop growth stages using computer vision techniques presented low precision rates, with the limitation that the experiments reported in the literature were carried out with very few images or covered very few growth stages [70]. For example, a study reported in [71] classified two growth stages of maize (emergence and three leaves) using an image segmentation method combined with affinity propagation clustering, trained with only a few samples. With these restrictions, the algorithm achieved a classification accuracy greater than 96%.
Other methods tested have been regression analyses from 2D images to model rice panicles [72]. This study focused only on the last growth stage, achieving an accuracy greater than 90%. At the same time, color histograms and a Support Vector Machine (SVM) classifier were studied to estimate four different stages of rice growth using RGB images [73].
Regarding maize, a previous study examined early growth (6 days) using RGB images [74]. The process involved acquiring digital images, converting them from RGB to grayscale, cropping the images, and then estimating plant growth using a region-growing approach. The results showed that region growing can be used to estimate plant length growth from length and time parameters.
Advances in machine learning, especially deep CNN processing, now allow crop growth to be estimated with reasonable precision based on images. One applied method [75] uses low-level image feature extraction, a scale-invariant mid-level representation, and an SVM to learn and classify wheat growth stages. The work focuses on estimating only two growth stages for six wheat varieties.
Recently, studies have been reported [70] on the classification of cereal growth stages using computer vision with three characteristics: (i) they cover many growth stages, (ii) they use deep CNNs to estimate the growth stage of wheat and barley, and (iii) they use transfer learning. The approach is also extendable to other cereal crops.
Despite these advances in identifying growth states, it is necessary to extend the algorithms to cover more crops and identify more growth states. Therefore, the main objective of this study is to achieve crop type identification and crop growth stage determination for autonomous navigation. Two methods are proposed: a two-phase approach, where, in an initial phase, the primary objective is crop type classification, followed by a subsequent phase dedicated to growth state identification; and a one-phase approach, where a single model seeks to identify both the type of crop and its growth stage. To achieve these objectives, Section 2 presents the two-phase and single-phase approaches. Then, Section 3 presents the experiments and discusses the results. Finally, Section 4 presents the main conclusions.

2. Materials and Methods

The increasing use of mobile robots in agriculture has led to a growing need for efficient object detection models. These models can identify and classify objects in the field, such as crops, weeds, and pests, among many others. This information can then be used not only by the robot to perform more efficient and precise tasks but also by the farmers to make decisions about the management of the field and the crops. Autonomously recognizing the growth status of crops allows farmers to make informed decisions concerning irrigation, fertilization, and pest management. Moreover, detecting signs of disease or pest infestations at an early stage can help prevent their spread and minimize crop damage.
Furthermore, constantly monitoring crop growth allows farmers to assess the risk of weather-related events, such as frost or drought, and take preventative measures.

2.1. Object Detection Models

Along with efficient object detection models, the growing use of mobile robots in agriculture has highlighted the need for an optimized methodology for training this type of model. The presented approach aims to enhance the efficiency of such training procedures. The accuracy of a model can be highly dependent on the number of samples in the dataset. Therefore, it becomes crucial to devise efficient methods for executing entirely automated tasks, such as crop and growth stage identification, without the loss of accuracy that can result from incorporating additional classes into a model. As outlined, crop identification and growth stage determination are pivotal for enabling seamless interactions with fully automated robots. For this purpose, two different methods are evaluated:
Two-phase approach: in the first phase, a multiclass crop model is trained to identify the type of crop, and during the second phase, for each crop type, a multiclass model is trained, focusing on identifying the growth stages (Figure 1).
Single-phase approach: in this method, a multiclass model is trained with all the different crops and growth stages, which may be more susceptible to loss of accuracy depending on the number of images (Figure 2).
A final phase is included for each of the proposed approaches, called the final model, which represents a single-class model trained exclusively with one type of crop and one type of growth stage. This final phase is relevant for two purposes: (1) to carry out a comparative analysis of the two presented approaches and (2) to enhance the main purpose of the methodologies presented, which is to enable crop-row following at an early growth stage. In general terms, the crop grows relatively homogeneously, so once the type of crop and its growth stage have been identified, the final model can be helpful to enhance the navigation task by exclusively detecting the crop, avoiding false detections caused by the presence of weeds, the variability of the terrain, and the lighting conditions.
Implementing the single-phase approach is more straightforward when dealing with a few crops because only one model is trained for crop identification and growth stage determination. In contrast, the two-phase approach needs to train one model to identify the crop types and one more model per crop to determine its growth stage. In the case of maize and sugar beets, for example, three models have to be trained.
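A minimal sketch of how the two-phase inference could be chained at run time is given below; the weight file names and the majority-vote selection of the crop type are illustrative assumptions, not details taken from the implementation.

```python
# Minimal sketch of the two-phase inference flow (model file names are hypothetical).
# Phase 1 identifies the crop type; phase 2 loads the growth-stage model for that crop.
import torch

crop_model = torch.hub.load("ultralytics/yolov5", "custom", path="crop_type.pt")
stage_models = {
    "maize": "maize_stages.pt",
    "sugar_beet": "sugar_beet_stages.pt",
}

def identify(image):
    # Phase 1: majority crop type among the detections in the image
    det = crop_model(image).pandas().xyxy[0]
    if det.empty:
        return None, None
    crop = det["name"].mode()[0]

    # Phase 2: growth-stage model specific to the identified crop
    stage_model = torch.hub.load("ultralytics/yolov5", "custom", path=stage_models[crop])
    stages = stage_model(image).pandas().xyxy[0]
    stage = stages["name"].mode()[0] if not stages.empty else None
    return crop, stage
```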
The methods for detecting the crops and their growth stage proposed in this study are based on YOLO v5, a very popular algorithm that employs CNN to detect objects in real-time. Given that one of this study’s main purposes is to enable autonomous robot navigation in agricultural environments, YOLO was selected thanks to its inference speed and low computational requirements compared to similar strategies [76,77]. Moreover, YOLO is compatible with the Robot Operating System (ROS) [78], an open-source robotics middleware framework and a set of tools that allow a heterogeneous computer group to operate, and the robotic system used for conducting the experimental work in this study is based on ROS [79].
YOLO has many versions with different characteristics; therefore, the first step in this study was to select the most suitable version for this specific application, considering both the inference time and the time required for training. It should be noted that this study does not evaluate the performance of different object detectors but rather the configuration of the methodology for identifying different crops at different growth stages. Therefore, using a single crop in a specific growth stage (maize14), the most popular versions of YOLO, from v4 up to v8, were tested. After finding the models that obtained the best mAP results, additional tests were conducted to study each version’s behavior depending on the number of training images and the inference time. The results are illustrated in Table 1. YOLO v5 proved to be the most suitable alternative considering the combined results for mAP, inference time, and training time.
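As an illustration of this kind of integration, the sketch below wraps a YOLO v5 model loaded through torch.hub inside a simple ROS subscriber; the topic name, node name, and use of pretrained weights are assumptions rather than details of the system used in this study.

```python
# Sketch of a YOLO v5 detector inside a ROS node (topic and node names are assumptions).
# Requires rospy, cv_bridge, and torch.
import rospy
import torch
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

model = torch.hub.load("ultralytics/yolov5", "yolov5s")  # pretrained weights as a stand-in
bridge = CvBridge()

def on_image(msg):
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="rgb8")
    detections = model(frame).xyxy[0]        # tensor rows: x1, y1, x2, y2, conf, class
    rospy.loginfo("detected %d objects", detections.shape[0])

rospy.init_node("crop_detector")
rospy.Subscriber("/camera/image_raw", Image, on_image, queue_size=1)
rospy.spin()
```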
Before presenting the two approaches, the plant growth characterization and the datasets for robot navigation are introduced.

2.2. Crop Characterization

All agricultural tasks must be performed at specific plant life cycle stages. Understanding the developmental cycle of crops is fundamental in agriculture, as it entails a series of dynamic and evolving phenological stages. These stages introduce significant complexity into visual recognition, rendering it particularly challenging. To address this complexity and enhance the capacity for accurate crop identification, it is imperative to account for the various growth stages that crops undergo throughout their lifecycle. To refer to these phenological states, the “Biologische Bundesanstalt, Bundessortenamt und CHemische Industrie” (BBCH) scale is usually employed [80]. The BBCH scale is a widely recognized and standardized system for categorizing the growth and development of plants. The BBCH scale provides a common language for farmers, researchers, and agronomists to communicate about the growth stages of different crops.
The BBCH scale assigns a unique code to each developmental stage of a crop. These codes typically consist of two digits, each representing a specific growth attribute. The first digit represents the principal growth stages, while the second digit represents the sub-stages (secondary stages) or specific developments within that main stage. The principal growth stage for the two crops used in this study is leaf development, which is identified by 1. The secondary stage refers to the number of leaves grown. Table 2 and Table 3 present the phenological growth stages for maize and beet based on the BBCH scale, respectively. Moreover, some graphical examples of sugar beet and maize phenological growth stages are shown in Figure 3a and Figure 3b, respectively.
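The mapping from leaf counts to the codes used in this study can be summarized in a few lines; the class-name format below follows the maize14 notation used later in the text, while the sugar beet naming is an assumption.

```python
# Illustrative sketch of the BBCH leaf-development codes used in this study.
LEAF_DEVELOPMENT = 1                        # first digit: principal growth stage

def bbch_code(sub_stage):
    """Two-digit BBCH code: principal-stage digit followed by the sub-stage digit."""
    return LEAF_DEVELOPMENT * 10 + sub_stage

maize_stages = [bbch_code(s) for s in (0, 2, 4, 6, 8)]        # 10, 12, 14, 16, 18
sugar_beet_stages = [bbch_code(s) for s in (2, 4, 6, 8)]      # 12, 14, 16, 18
classes = ([f"maize{c}" for c in maize_stages]
           + [f"sugarbeet{c}" for c in sugar_beet_stages])
print(classes)                              # nine classes, as in the single-phase model
```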

2.3. Dataset

A key challenge in training object detection models is the need for large annotated image datasets. However, few publicly available datasets meet the requirements of mobile robot navigation [81]: most existing datasets are captured from top-down, high-angle viewpoints, whereas a navigating mobile robot generally observes the field from a forward-facing camera. Therefore, this work benefits from its own datasets of annotated images captured from a wide range of viewpoints, including those representative of the robot’s perspective.
The dataset was acquired as a part of the European initiative “Sustainable Weed Management in Agriculture with Laser-Based Autonomous Tools” (WeLASER). The core focus of the WeLASER project is to advance precision weeding technology, utilizing a high-power laser source to deliver targeted energy doses to weed meristems. This innovative approach aims to minimize the use of herbicides while simultaneously boosting agricultural productivity [79].
The two crops chosen for this study were maize (Zea mays L.) and sugar beet (Beta vulgaris L.), annual and summer crops commonly grown in Europe. They are well suited for cultivation in Europe, as they both tolerate a wide range of climatic conditions and soil types [82]. Additionally, these two crops can be intercropped, which can help improve crop production and reduce risks of pests and diseases [83,84]. The morphology of both crops changes significantly throughout their growth. However, in the early stages of growth, they may be easily confused with each other by the untrained eye.
This study employed two distinct datasets to focus on crop identification and growth stage classification: (1) the first dataset comprised images of maize plants encompassing a total of 5 discernible growth stages, and (2) the second dataset featured images of sugar beet plants exhibiting 4 distinct growth stages. Using BBCH notation, the growth stages used for maize were 10, 12, 14, 16, and 18, and for sugar beet, 12, 14, 16, and 18. Plants with intermediate numbers of leaves may also occur; for simplicity, they are included in the closest group.
Image acquisition was performed during different periods in an experimental field located at the Centre for Automation and Robotics in Madrid (40°18′45.166″ N, 3°28′51.096″ W). These images were systematically acquired using a manually operated mobile platform equipped with an RGB camera, resulting in a consistent 2048 × 1536-pixel resolution across all samples [79].
To ensure accurate analysis, a meticulous manual annotation process was undertaken. In the context of developing methodologies for autonomous mobile robot navigation, the primary Region of Interest (ROI) lies within a range of approximately 3 to 4 m from the camera’s position. Within this distance, the system is tasked with the critical objectives of identifying relevant objects and generating navigational paths. This strategic placement enables the system to focus on the immediate surroundings, facilitating real-time decision making and the precise planning of routes for the autonomous robot as it navigates through its environment. Given the above, the annotation process prioritized the crops nearest to the camera within each image frame. All crops within an image have been annotated with a single growth stage. Although seeds were sown simultaneously, variations in irrigation practices may result in uneven growth within a single image.
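For reference, a YOLO-style label produced by this kind of manual annotation stores one line per bounding box with coordinates normalized by the image size; the helper below is a sketch based on that convention, using the 2048 × 1536 resolution reported above (the example box itself is hypothetical).

```python
# Sketch of a YOLO-format annotation line: "class_id x_center y_center width height",
# all values normalized to [0, 1] by the image dimensions.

def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w=2048, img_h=1536):
    xc = (x_min + x_max) / 2 / img_w
    yc = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# Example: a maize plant box in the near-camera ROI of a 2048 x 1536 image
print(to_yolo_line(0, 820, 900, 1010, 1120))
```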

3. Results

This section presents the results obtained by applying the methods presented above. First, images of maize and sugar beet crops were acquired and annotated manually (Figure 4). Second, a combined dataset comprising 3162 images was selected. Table 4 presents the breakdown by number of images and annotations for each crop type and growth stage.
This dataset was used to train the following methods for crop identification and growth stage determination.

3.1. Two-Phase Approach: Multi-Class Model by Crop Type plus Multi-Class Model by Growth Stage

Training multi-class models demands a substantial volume of images and considerable computational resources. Employing a two-phase approach for final crop identification significantly alleviates the computational demands during training and reduces the training time.
In the initial phase, images of both crop types were used to train a two-class model with YOLO v5, with crop type classification as the primary focus. This model achieved an overall mAP of 74.2%, with an individual mAP of 70.8% for sugar beet and a higher 77.6% for maize (Table 5). While detection accuracy is vital, equal emphasis is placed on discerning false positives and misclassifications. The confusion matrix (Figure 5a) is a crucial tool in this context, revealing the extent to which one class has been misidentified as another. Notably, no instances were misclassified as the wrong crop, although some previously unlabeled plants were identified. The confusion matrix in object detection includes a class called background, which accounts for parts of the image that have not been labeled. A background prediction for an actual maize instance means that the model failed to identify a maize plant present in the image. A background ground-truth case is one where the model detected an object as maize or sugar beet, but the object was not labeled as either crop in the ground truth. These detections may be correct in the proposed dataset, as only the plants closest to the camera were labeled (Figure 6a). However, given the agricultural context, where the environment contains the target crops alongside soil and various weeds, some weeds that are not part of the study’s focal crops might be misidentified as one of the crops under investigation (Figure 6b).
Upon crop identification, the subsequent task is to identify the specific growth stage. Training models with distinct classes for each growth stage yielded mAPs exceeding 60% (Table 6 and Table 7). However, substantial disparities emerged when analyzing the mAP for individual growth stages, particularly with the highest growth stage of maize, exhibiting the lowest accuracy.
Further examination of the confusion matrices reveals a remarkable phenomenon: a considerable proportion of maize plants were misclassified as growth stage 16 when they should belong to stage 18 and vice versa (Figure 7a). This discrepancy was notably absent in the case of sugar beet (Figure 7b), suggesting the need for refined model training and evaluation strategies to enhance the accuracy of growth stage identification for maize crops.
Investigating the source of the misidentifications, specifically those concerning maize at growth stages 16 and 18, an explanation rooted in their inherent similarity was found. The classification is primarily based on the number of leaves present, with no further distinguishing characteristics considered. Consequently, even slight variations in the viewing angle or the proximity between plants can lead to such misidentifications.
A tangible example of this phenomenon can be observed in Figure 8a, wherein misidentification occurred for plants originally classified as maize growth stage 16. Due to the proximity of certain plants in the background, they were erroneously identified as maize growth stage 18. Conversely, in Figure 8b, the inverse scenario unfolds, where the nearness and angle of plants led to an identification as growth stage 16 when the actual stage was 18.

3.2. Single-Phase: Multi-Class Model including Crop Type and Growth Stage

This scenario involves training a single model, including all crops and their growth stages. Using YOLO v5, a nine-class model was trained. As shown in Table 8, the overall mAP registered at 67.5%, notably lower than the mAP achieved for crop identification. A closer examination of individual growth stage mAP values reveals that this decline in the overall mAP can be attributed to the lower accuracy observed in the more advanced stages of maize growth.
The next step focuses on discerning whether there is any overlap or confusion in identifying different crops or growth stages. To gain insights into this, the confusion matrix was scrutinized (Figure 9). Notably, in the case of crop identification, there were no instances of misclassification or crossover. However, confusion primarily between stages 16 and 18 can be observed when it comes to maize growth stages, warranting further investigation and fine tuning of the model’s ability to differentiate between these specific stages within the maize crop. This classification error is the same as that detected by the previous method.

3.3. Final Model: Single-Class Model for Crop and Growth

Both proposed methodologies converge toward identifying the crop and its corresponding growth stage. When the primary objective is to obtain general information about the field, any of the previously outlined methodologies suffice. However, when a more specific and precise outcome is required, such as crop identification for row following, opting for a single-class model emerges as the optimal choice.
In defining autonomous robot navigation, the primary requirement is to precisely ascertain the plants’ locations. This information is the foundation for distinguishing plants from weeds, delineating crop lines, and ultimately establishing the robot’s optimal path. In this context, achieving a higher level of precision is paramount: the goal extends beyond merely identifying the crop type or its growth stage, tasks for which some errors might be acceptable as long as most identifications are accurate.
For this specific application, the exact location of each plant is essential and significantly influences the robot’s navigation strategy. Therefore, the accuracy and reliability of plant location data are indispensable to guide the robot effectively through the agricultural landscape.
The concept of employing a single-class model as the ultimate choice for navigation is grounded in a set of advantages when compared to a multi-class counterpart. The reduction in the number of classes to consider translates into a diminished requirement for parameter estimation, resulting in a more streamlined and computationally efficient model. This advantage becomes particularly pronounced when faced with constraints on available computational resources.
Single-class models exhibit superior performance in scenarios where the target object possesses distinct characteristics that readily differentiate it from the background and other objects. Such models can be meticulously fine tuned to concentrate exclusively on the unique attributes of the specific object of interest.
Moreover, in real-time applications or situations characterized by limitations in computational resources, single-class models often exhibit swifter inference times owing to their inherent simplicity and reduced model complexity.
Table 9 shows the results of the nine one-class models trained with YOLO v5 for each crop and growth stage.

4. Discussion

Figure 10 presents a graph that compares the mAP data obtained with the two-phase approach (green), the single-phase approach (red), and the single-class model for every crop and growth stage (final model), i.e., the data registered in Table 5, Table 6, Table 7 and Table 8. Upon evaluating these results, the single-class models exhibited the highest mAP for maize and sugar beet except for growth stages 16 and 18 for sugar beet crops.
In the case of the one- and two-phase methods for determining crop growth stages, the findings indicate that, while the two-phase approach generally displayed superior results, these discrepancies may not significantly impact the overarching goal of growth stage identification. In this context, the primary aim is to accurately determine the crop type and its respective growth stage. Both one- and two-phase approaches successfully met this objective. However, based on the confusion matrices in Figure 7a and Figure 9, it is noteworthy that both approaches exhibited errors in classifying maize growth stages 16 and 18, highlighting the necessity for dataset refinement to rectify this specific classification challenge. Avoiding misclassification is a significant challenge due to the high similarity between these growth stages. Possible solutions to mitigate this issue include increasing the number of images and implementing data augmentation techniques, such as varying colors or flipping the images for these specific classes. Alternatively, testing different hyperparameters of the model, such as learning rates and batch sizes, could also aid in resolving this problem. Depending on the use case, if specific differentiation for these growth stages is unnecessary, another alternative is to combine all images into a single class that identifies plants above growth stage 16. These potential solutions will be addressed in further research.
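As a sketch of the augmentation idea mentioned above, the snippet below applies horizontal flips and colour jitter to images of the confused classes using torchvision; the file names are hypothetical, the corresponding bounding-box labels would need to be mirrored when an image is flipped, and YOLO v5 can also apply such augmentations internally through its training hyperparameters.

```python
# Hedged sketch of offline augmentation (flips and colour variation) for the
# classes confused by the models (e.g., maize growth stages 16 and 18).
from PIL import Image
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),        # box labels must be flipped accordingly
    transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3, hue=0.05),
])

img = Image.open("maize16_sample.jpg")             # hypothetical training image
for i in range(4):                                 # a few extra variants per image
    augment(img).save(f"maize16_sample_aug{i}.jpg")
```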
Regarding the real-time implementation of both approaches in a robotic system, the two-phase approach entails greater complexity than the single-phase approach, given that the computational capacity of robotic systems is usually limited. For the efficient implementation of this type of object detection method, the use of Graphics Processing Units (GPUs) is recommended, and loading a model generally occupies all available resources. Therefore, when implementing the two-phase approach, once the crop type has been identified, the model loaded onto the computational resources must be swapped for the corresponding growth-stage model.
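A minimal sketch of that model swap on a memory-constrained GPU is shown below; the weight file names are hypothetical, and the release/reload pattern is one possible way to implement it, not the procedure used on the robot.

```python
# Sketch of swapping models between the two phases on a GPU with limited memory.
import gc
import torch

crop_model = torch.hub.load("ultralytics/yolov5", "custom", path="crop_type.pt").cuda()
# ... run phase 1 on incoming frames and decide, e.g., that the crop is maize ...

del crop_model                     # drop the phase-1 network
gc.collect()
torch.cuda.empty_cache()           # return its GPU memory to the allocator

stage_model = torch.hub.load(
    "ultralytics/yolov5", "custom", path="maize_stages.pt"
).cuda()
# ... continue with phase 2: growth-stage detection for navigation ...
```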

5. Conclusions

The increasing utilization of mobile robots in agriculture emphasizes the significance of streamlined object detection models. These models play a critical role in recognizing and categorizing diverse elements within the agricultural environment, encompassing crops, weeds, and pests. They supply essential data that empowers these robots to make well-informed decisions, contributing to their efficiency and effectiveness. Moreover, the training methodology for these models plays a crucial role in optimizing their efficiency.
Accurate crop identification is fundamental for mobile robots to tailor their actions based on crop-specific requirements, such as growth-stage-dependent treatments. The methodology proposed in this study introduces two crop and growth stage identification approaches: a two-phase approach, where separate models identify first the crop type and then the growth stage, and a one-phase approach that combines all crops and growth stages into a single multi-class model.
In the two-phase approach, the algorithm successfully identified the crop type with an overall mAP of 74.2%, with maize outperforming at 77.6%. However, challenges arise in accurately classifying maize growth stages 16 and 18. On the other hand, in the one-phase approach, the overall mAP dropped to 67.5%, mainly because of reduced accuracy in identifying advanced maize growth stages. Nevertheless, the model effectively identified both crops and their respective growth stages.
In addition to the two main approaches, a comparative analysis has been presented by training a single-class model for each type of crop and growth state. In this final single-class model, the highest mAP was attained, emphasizing the advantages of using single-class models when precision in identifying specific objects or classes is paramount. These models offer streamlined parameter estimation, making them appropriate for resource-constrained real-time applications.
Based on the observed outcomes, it is evident that both approaches yield favorable results. The distinguishing factors between these alternatives become the critical determinants for selecting the most suitable approach. When dealing with a limited number of crops, the one-phase approach emerges as an advantageous choice, as it necessitates training and operationalizing just a single model, in contrast to the three models required in the case of two crop types, i.e., (i) one model to discriminate between maize and sugar beet, (ii) one model to determine the maize growth stage, and (iii) one model to determine the sugar beet growth stage. Conversely, if the objective is to expand the repertoire of crops, opting for the two-phase approach proves advantageous, as it facilitates a modular approach, allowing for the seamless addition of new crop models as needed. This adaptability underscores the flexibility of the two-phase approach, making it a compelling choice for scaling up the system to accommodate an expanding array of crops.

Author Contributions

Conceptualization, E.C. and L.E.; methodology, E.C. and L.E.; software, E.C.; validation, E.C. and L.E.; formal analysis, E.C., L.E. and P.G.-d.-S.; investigation, E.C., L.E. and P.G.-d.-S.; resources, P.G.-d.-S.; data curation, L.E. and P.G.-d.-S.; writing—original draft preparation, E.C. and L.E.; writing—review and editing, L.E. and P.G.-d.-S.; visualization, E.C., L.E. and P.G.-d.-S.; supervision, L.E. and P.G.-d.-S.; project administration, P.G.-d.-S.; funding acquisition, P.G.-d.-S. All authors have read and agreed to the published version of the manuscript.

Funding

This article is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No. 101000256.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to further requirements of cleaning and anonymization.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Department of Economic and Social Affairs. World Population Prospects 2022: Summary of Results. Available online: https://population.un.org/wpp/ (accessed on 25 September 2023).
  2. Tian, X.; Engel, B.A.; Qian, H.; Hua, E.; Sun, S.; Wang, Y. Will Reaching the Maximum Achievable Yield Potential Meet Future Global Food Demand? J. Clean. Prod. 2021, 294, 126285. [Google Scholar] [CrossRef]
  3. Falcon, W.P.; Naylor, R.L.; Shankar, N.D. Rethinking Global Food Demand for 2050. Popul. Dev. Rev. 2022, 48, 921–957. [Google Scholar] [CrossRef]
  4. Gebbers, R.; Adamchuk, V.I. Precision Agriculture and Food Security. Science 2010, 327, 828–831. Available online: https://www.science.org/doi/abs/10.1126/science.1183899 (accessed on 25 October 2023).
  5. Patrício, D.I.; Rieder, R. Computer Vision and Artificial Intelligence in Precision Agriculture for Grain Crops: A Systematic Review. Comput. Electron. Agric. 2018, 153, 69–81. [Google Scholar] [CrossRef]
  6. Kunz, C.; Weber, J.F.; Peteinatos, G.G.; Sökefeld, M.; Gerhards, R. Camera Steered Mechanical Weed Control in Sugar Beet, Maize and Soybean. Precis. Agric. 2018, 19, 708–720. [Google Scholar] [CrossRef]
  7. Emmi, L.; Le Flécher, E.; Cadenat, V.; Devy, M. A Hybrid Representation of the Environment to Improve Autonomous Navigation of Mobile Robots in Agriculture. Precis. Agric. 2021, 22, 524–549. [Google Scholar] [CrossRef]
  8. Botta, A.; Cavallone, P.; Baglieri, L.; Colucci, G.; Tagliavini, L.; Quaglia, G. A Review of Robots, Perception, and Tasks in Precision Agriculture. Appl. Mech. 2022, 3, 830–854. [Google Scholar] [CrossRef]
  9. Lv, M.; Wei, H.; Fu, X.; Wang, W.; Zhou, D. A Loosely Coupled Extended Kalman Filter Algorithm for Agricultural Scene-Based Multi-Sensor Fusion. Front. Plant Sci. 2022, 13, 849260. [Google Scholar] [CrossRef] [PubMed]
  10. Blok, P.M.; van Boheemen, K.; van Evert, F.K.; IJsselmuiden, J.; Kim, G.-H. Robot Navigation in Orchards with Localization Based on Particle Filter and Kalman Filter. Comput. Electron. Agric. 2019, 157, 261–269. [Google Scholar] [CrossRef]
  11. Yu, T.; Zhou, J.; Wang, L.; Xiong, S. Accurate and Robust Stereo Direct Visual Odometry for Agricultural Environment. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 2480–2486. [Google Scholar]
  12. Aguiar, A.S.; dos Santos, F.N.; Cunha, J.B.; Sobreira, H.; Sousa, A.J. Localization and Mapping for Robots in Agriculture and Forestry: A Survey. Robotics 2020, 9, 97. [Google Scholar] [CrossRef]
  13. Bac, C.W.; van Henten, E.J.; Hemming, J.; Edan, Y. Harvesting Robots for High-Value Crops: State-of-the-Art Review and Challenges Ahead. J. Field Robot. 2014, 31, 888–911. [Google Scholar] [CrossRef]
  14. Hernández-Hernández, J.L.; García-Mateos, G.; González-Esquiva, J.M.; Escarabajal-Henarejos, D.; Ruiz-Canales, A.; Molina-Martínez, J.M. Optimal Color Space Selection Method for Plant/Soil Segmentation in Agriculture. Comput. Electron. Agric. 2016, 122, 124–132. [Google Scholar] [CrossRef]
  15. Lin, G.; Tang, Y.; Zou, X.; Xiong, J.; Fang, Y. Color-, Depth-, and Shape-Based 3D Fruit Detection. Precis. Agric. 2020, 21, 1–17. [Google Scholar] [CrossRef]
  16. Kamilaris, A.; Prenafeta-Boldú, F.X. A Review of the Use of Convolutional Neural Networks in Agriculture. J. Agric. Sci. 2018, 156, 312–322. [Google Scholar] [CrossRef]
  17. Droukas, L.; Doulgeri, Z.; Tsakiridis, N.L.; Triantafyllou, D.; Kleitsiotis, I.; Mariolis, I.; Giakoumis, D.; Tzovaras, D.; Kateris, D.; Bochtis, D. A Survey of Robotic Harvesting Systems and Enabling Technologies. J. Intell. Robot. Syst. 2023, 107, 21. [Google Scholar] [CrossRef] [PubMed]
  18. Meshram, A.T.; Vanalkar, A.V.; Kalambe, K.B.; Badar, A.M. Pesticide Spraying Robot for Precision Agriculture: A Categorical Literature Review and Future Trends. J. Field Robot. 2022, 39, 153–171. [Google Scholar] [CrossRef]
  19. Zhang, W.; Miao, Z.; Li, N.; He, C.; Sun, T. Review of Current Robotic Approaches for Precision Weed Management. Curr. Robot. Rep. 2022, 3, 139–151. [Google Scholar] [CrossRef]
  20. Small Robot Co. Available online: https://smallrobotco.com/#perplant (accessed on 8 November 2023).
  21. Ecorobotix: Smart Spraying for Ultra-Localised Treatments. Available online: https://ecorobotix.com/en/ (accessed on 8 November 2023).
  22. Our Vision for the Future: Autonomous Weeding (in Development) AVO. Available online: https://ecorobotix.com/en/avo/ (accessed on 8 November 2023).
  23. Sanchez, J.; Gallandt, E.R. Functionality and Efficacy of Franklin Robotics’ TertillTM Robotic Weeder. Weed Technol. 2021, 35, 166–170. [Google Scholar] [CrossRef]
  24. EarthSense. Available online: https://www.earthsense.co/home (accessed on 8 November 2023).
  25. Rovira-Más, F.; Chatterjee, I.; Sáiz-Rubio, V. The Role of GNSS in the Navigation Strategies of Cost-Effective Agricultural Robots. Comput. Electron. Agric. 2015, 112, 172–183. [Google Scholar] [CrossRef]
  26. Galceran, E.; Carreras, M. A Survey on Coverage Path Planning for Robotics. Robot. Auton. Syst. 2013, 61, 1258–1276. [Google Scholar] [CrossRef]
  27. Eli-Chukwu, N.C. Applications of Artificial Intelligence in Agriculture: A Review. Eng. Technol. Appl. Sci. Res. 2019, 9, 4377–4383. [Google Scholar] [CrossRef]
  28. Benos, L.; Tagarakis, A.C.; Dolias, G.; Berruto, R.; Kateris, D.; Bochtis, D. Machine Learning in Agriculture: A Comprehensive Updated Review. Sensors 2021, 21, 3758. [Google Scholar] [CrossRef] [PubMed]
  29. Kujawa, S.; Niedbała, G. Artificial Neural Networks in Agriculture. Agriculture 2021, 11, 497. [Google Scholar] [CrossRef]
  30. Jha, K.; Doshi, A.; Patel, P.; Shah, M. A Comprehensive Review on Automation in Agriculture Using Artificial Intelligence. Artif. Intell. Agric. 2019, 2, 1–12. [Google Scholar] [CrossRef]
  31. Maier, H.R.; Dandy, G.C. Neural Networks for the Prediction and Forecasting of Water Resources Variables: A Review of Modelling Issues and Applications. Environ. Model. Softw. 2000, 15, 101–124. [Google Scholar] [CrossRef]
  32. Song, H.; He, Y. Crop Nutrition Diagnosis Expert System Based on Artificial Neural Networks. In Proceedings of the Third International Conference on Information Technology and Applications (ICITA’05), Sydney, Australia, 4–7 July 2005; Volume 1, pp. 357–362. [Google Scholar]
  33. Mavridou, E.; Vrochidou, E.; Papakostas, G.A.; Pachidis, T.; Kaburlasos, V.G. Machine Vision Systems in Precision Agriculture for Crop Farming. J. Imaging 2019, 5, 89. [Google Scholar] [CrossRef]
  34. Das, S.; Ghosh, I.; Banerjee, G.; Sarkar, U. Artificial Intelligence in Agriculture: A Literature Survey. Int. J. Sci. Res. Comput. Sci. Appl. Manag. Stud. 2018, 7, 1–6. [Google Scholar]
  35. Zhang, J.-L.; Su, W.-H.; Zhang, H.-Y.; Peng, Y. SE-YOLOv5x: An Optimized Model Based on Transfer Learning and Visual Attention Mechanism for Identifying and Localizing Weeds and Vegetables. Agronomy 2022, 12, 2061. [Google Scholar] [CrossRef]
  36. Zhou, H.; Wang, X.; Au, W.; Kang, H.; Chen, C. Intelligent Robots for Fruit Harvesting: Recent Developments and Future Challenges. Precis. Agric. 2022, 23, 1856–1907. [Google Scholar] [CrossRef]
  37. Idoje, G.; Dagiuklas, T.; Iqbal, M. Survey for Smart Farming Technologies: Challenges and Issues. Comput. Electr. Eng. 2021, 92, 107104. [Google Scholar] [CrossRef]
  38. Gan, H.; Lee, W.S. Development of a Navigation System for a Smart Farm. IFAC-PapersOnLine 2018, 51, 1–4. [Google Scholar] [CrossRef]
  39. Wang, T.; Chen, B.; Zhang, Z.; Li, H.; Zhang, M. Applications of Machine Vision in Agricultural Robot Navigation: A Review. Comput. Electron. Agric. 2022, 198, 107085. [Google Scholar] [CrossRef]
  40. Li, J.; Zhu, R.; Chen, B. Image Detection and Verification of Visual Navigation Route during Cotton Field Management Period. Int. J. Agric. Biol. Eng. 2018, 11, 159–165. [Google Scholar] [CrossRef]
  41. Zhang, Z.; Li, P.; Zhao, S.; Lv, Z.; Du, F.; An, Y. An Adaptive Vision Navigation Algorithm in Agricultural IoT System for Smart Agricultural Robots. Comput. Mater. Contin. 2020, 66, 1043–1056. [Google Scholar] [CrossRef]
  42. Kamilaris, A.; Prenafeta-Boldú, F.X. Deep Learning in Agriculture: A Survey. Comput. Electron. Agric. 2018, 147, 70–90. [Google Scholar] [CrossRef]
  43. Bai, Y.; Zhang, B.; Xu, N.; Zhou, J.; Shi, J.; Diao, Z. Vision-Based Navigation and Guidance for Agricultural Autonomous Vehicles and Robots: A Review. Comput. Electron. Agric. 2023, 205, 107584. [Google Scholar] [CrossRef]
  44. Shalal, N.; Low, T.; McCarthy, C.; Hancock, N. A Review of Autonomous Navigation Systems in Agricultural Environments; University of Southern Queensland: Barton, Australia, 2013. [Google Scholar]
  45. Hamuda, E.; Glavin, M.; Jones, E. A Survey of Image Processing Techniques for Plant Extraction and Segmentation in the Field. Comput. Electron. Agric. 2016, 125, 184–199. [Google Scholar] [CrossRef]
  46. Bharati, S.; Wu, Y.; Sui, Y.; Padgett, C.; Wang, G. Real-Time Obstacle Detection and Tracking for Sense-and-Avoid Mechanism in UAVs. IEEE Trans. Intell. Veh. 2018, 3, 185–197. [Google Scholar] [CrossRef]
  47. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
  48. Koirala, A.; Walsh, K.B.; Wang, Z.; McCarthy, C. Deep Learning—Method Overview and Review of Use for Fruit Detection and Yield Estimation. Comput. Electron. Agric. 2019, 162, 219–234. [Google Scholar] [CrossRef]
  49. Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object Detection in 20 Years: A Survey. Proc. IEEE 2023, 111, 257–276. [Google Scholar] [CrossRef]
  50. Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-End Object Detection with Transformers. In Computer Vision—ECCV 2020; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 213–229. [Google Scholar]
  51. Jiang, P.; Ergu, D.; Liu, F.; Cai, Y.; Ma, B. A Review of Yolo Algorithm Developments. Procedia Comput. Sci. 2022, 199, 1066–1073. [Google Scholar] [CrossRef]
  52. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 779–788. [Google Scholar]
  53. Jabir, B.; Noureddine, F.; Rahmani, K. Accuracy and Efficiency Comparison of Object Detection Open-Source Models. Int. J. Online Biomed. Eng. 2021, 17, 165. [Google Scholar] [CrossRef]
  54. Bentley, P. Pattern Analysis, Statistical Modelling and Computational Learning 2004–2008, 1st ed.; PASCAL Network of Excellence; University College London: London, UK, 2008; ISBN 978-0-9559301-0-2. [Google Scholar]
  55. Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Computer Vision—ECCV 2014; Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T., Eds.; Springer International Publishing: Cham, Switzerland, 2014; pp. 740–755. [Google Scholar]
  56. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet Large Scale Visual Recognition Challenge. Int. J. Comput. Vis. 2015, 115, 211–252. [Google Scholar] [CrossRef]
  57. Addison, H.; Alina; Walker, J.; Uijlings, J.; Pont-Tuset, J.; McDonald, M.G.V.; Kan, W. Google AI Open Images—Object Detection Track. Available online: https://kaggle.com/competitions/google-ai-open-images-object-detection-track (accessed on 6 November 2023).
  58. Andreopoulos, A.; Tsotsos, J.K. 50 Years of Object Recognition: Directions Forward. Comput. Vis. Image Underst. 2013, 117, 827–891. [Google Scholar] [CrossRef]
  59. Shahbazi, N.; Ashworth, M.B.; Callow, J.N.; Mian, A.; Beckie, H.J.; Speidel, S.; Nicholls, E.; Flower, K.C. Assessing the Capability and Potential of LiDAR for Weed Detection. Sensors 2021, 21, 2328. [Google Scholar] [CrossRef] [PubMed]
  60. dos Santos Ferreira, A.; Matte Freitas, D.; Gonçalves da Silva, G.; Pistori, H.; Theophilo Folhes, M. Weed Detection in Soybean Crops Using ConvNets. Comput. Electron. Agric. 2017, 143, 314–324. [Google Scholar] [CrossRef]
  61. Jiang, H.; Wang, P. Fast Identification of Field Weeds Based on Deep Convolutional Network and Binary Hash Code. Trans. Chin. Soc. Agric. Mach. 2018, 49, 30–38. [Google Scholar]
  62. Åstrand, B.; Baerveldt, A.-J. An Agricultural Mobile Robot with Vision-Based Perception for Mechanical Weed Control. Auton. Robots 2002, 13, 21–35. [Google Scholar] [CrossRef]
  63. Dyrmann, M.; Karstoft, H.; Midtiby, H.S. Plant Species Classification Using Deep Convolutional Neural Network. Biosyst. Eng. 2016, 151, 72–80. [Google Scholar] [CrossRef]
  64. Ferrari, F.; Verbitsky, I.E. Radial Fractional Laplace Operators and Hessian Inequalities. J. Differ. Equ. 2012, 253, 244–272. [Google Scholar] [CrossRef]
  65. Esposito, M.; Crimaldi, M.; Cirillo, V.; Sarghini, F.; Maggio, A. Drone and Sensor Technology for Sustainable Weed Management: A Review. Chem. Biol. Technol. Agric. 2021, 8, 18. [Google Scholar] [CrossRef]
  66. Peteinatos, G.G.; Reichel, P.; Karouta, J.; Andújar, D.; Gerhards, R. Weed Identification in Maize, Sunflower, and Potatoes with the Aid of Convolutional Neural Networks. Remote Sens. 2020, 12, 4185. [Google Scholar] [CrossRef]
  67. Liu, B.; Bruch, R. Weed Detection for Selective Spraying: A Review. Curr. Robot. Rep. 2020, 1, 19–26. [Google Scholar] [CrossRef]
  68. Xu, B.; Chai, L.; Zhang, C. Research and Application on Corn Crop Identification and Positioning Method Based on Machine Vision. Inf. Process. Agric. 2023, 10, 106–113. [Google Scholar] [CrossRef]
  69. Otsu, N. A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man Cybern. 1979, 9, 62–66. [Google Scholar] [CrossRef]
  70. Rasti, S.; Bleakley, C.J.; Silvestre, G.C.M.; Holden, N.M.; Langton, D.; O’Hare, G.M.P. Crop Growth Stage Estimation Prior to Canopy Closure Using Deep Learning Algorithms. Neural Comput. Appl. 2021, 33, 1733–1743. [Google Scholar] [CrossRef]
  71. Yu, Z.; Cao, Z.; Wu, X.; Bai, X.; Qin, Y.; Zhuo, W.; Xiao, Y.; Zhang, X.; Xue, H. Automatic Image-Based Detection Technology for Two Critical Growth Stages of Maize: Emergence and Three-Leaf Stage. Agric. For. Meteorol. 2013, 174–175, 65–84. [Google Scholar] [CrossRef]
  72. Zhao, S.; Zheng, H.; Chi, M.; Chai, X.; Liu, Y. Rapid Yield Prediction in Paddy Fields Based on 2D Image Modelling of Rice Panicles. Comput. Electron. Agric. 2019, 162, 759–766. [Google Scholar] [CrossRef]
  73. Marsujitullah; Zainuddin, Z.; Manjang, S.; Wijaya, A.S. Rice Farming Age Detection Use Drone Based on SVM Histogram Image Classification. J. Phys. Conf. Ser. 2019, 1198, 092001. [Google Scholar] [CrossRef]
  74. Yudhana, A.; Umar, R.; Ayudewi, F.M. The Monitoring of Corn Sprouts Growth Using The Region Growing Methods. J. Phys. Conf. Ser. 2019, 1373, 012054. [Google Scholar] [CrossRef]
  75. Sadeghi-Tehran, P.; Sabermanesh, K.; Virlet, N.; Hawkesford, M.J. Automated Method to Determine Two Critical Growth Stages of Wheat: Heading and Flowering. Front. Plant Sci. 2017, 8, 252. [Google Scholar] [CrossRef]
  76. Li, M.; Zhang, Z.; Lei, L.; Wang, X.; Guo, X. Agricultural Greenhouses Detection in High-Resolution Satellite Images Based on Convolutional Neural Networks: Comparison of Faster R-CNN, YOLO v3 and SSD. Sensors 2020, 20, 4938. [Google Scholar] [CrossRef] [PubMed]
  77. Shah, R.M.; Sainath, B.; Gupta, A. Comparative Performance Study of CNN-Based Algorithms and YOLO. In Proceedings of the 2022 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT), Bangalore, India, 8–10 July 2022; pp. 1–6. [Google Scholar]
  78. ROS—The Robot Operating System. Available online: https://www.ros.org/ (accessed on 23 October 2023).
  79. Emmi, L.; Fernández, R.; Gonzalez-de-Santos, P.; Francia, M.; Golfarelli, M.; Vitali, G.; Sandmann, H.; Hustedt, M.; Wollweber, M. Exploiting the Internet Resources for Autonomous Robots in Agriculture. Agriculture 2023, 13, 1005. [Google Scholar] [CrossRef]
  80. Meier, U. Growth Stages of Mono- and Dicotyledonous Plants; Federal Biological Research Centre for Agriculture and Forestry. Available online: https://library.wur.nl/WebQuery/titel/962304 (accessed on 18 September 2023).
  81. Lu, Y.; Young, S. A Survey of Public Datasets for Computer Vision Tasks in Precision Agriculture. Comput. Electron. Agric. 2020, 178, 105760. [Google Scholar] [CrossRef]
  82. Agricultural Production—Crops. Available online: https://ec.europa.eu/eurostat/statistics-explained/index.php?title=Agricultural_production_-_crops (accessed on 13 September 2023).
  83. Khazaie, M. The Study of Maize and Sugar Beet Intercropping. J. Crops Improv. 2015, 16, 987–997. [Google Scholar] [CrossRef]
  84. Ćirić, M. Intercropping Sugar Beet with Different Agricultural Crops. In Sugar Beet Cultivation, Management and Processing; Misra, V., Srivastava, S., Mall, A.K., Eds.; Springer Nature: Singapore, 2022; pp. 387–406. ISBN 978-981-19273-0-0. [Google Scholar]
Figure 1. Conceptual diagram of the two-phase strategy. In the first phase (gray), there is a model of n classes, one per crop (n is the number of crops). In the second phase (blue), n trained models determine each growth stage. The last phase is the final model with a class for each crop and growth stage.
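To make the two-phase strategy of Figure 1 concrete, the following minimal sketch shows one possible way to chain the models at inference time: a crop identification model runs first, and each detection is then passed to the growth-stage model trained for that crop. The weight file names (crop.pt, maize_stages.pt, sugarbeet_stages.pt) and the use of the Ultralytics YOLO API are assumptions made for illustration only, not the implementation published here.
```python
# Minimal sketch of two-phase inference (phase 1: crop, phase 2: growth stage).
# Weight file names and the Ultralytics API usage are illustrative assumptions.
from ultralytics import YOLO
from PIL import Image

crop_model = YOLO("crop.pt")                          # phase 1: one class per crop
stage_models = {                                      # phase 2: one model per crop
    "maize": YOLO("maize_stages.pt"),
    "sugar_beet": YOLO("sugarbeet_stages.pt"),
}

def detect_crop_and_stage(image_path: str):
    """Return (crop, growth_stage, bounding_box) tuples for one field image."""
    image = Image.open(image_path)
    detections = []
    for box in crop_model(image_path)[0].boxes:       # phase 1 detections
        crop_name = crop_model.names[int(box.cls)]
        x1, y1, x2, y2 = map(int, box.xyxy[0].tolist())
        plant = image.crop((x1, y1, x2, y2))          # region of the detected plant
        stage_result = stage_models[crop_name](plant)[0]  # phase 2 on that region
        if len(stage_result.boxes) > 0:
            stage = stage_models[crop_name].names[int(stage_result.boxes[0].cls)]
        else:
            stage = "unknown"
        detections.append((crop_name, stage, (x1, y1, x2, y2)))
    return detections
```
A navigation stack could then, for example, select row-following parameters according to the dominant crop/stage pair returned by such a function.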
Figure 2. Conceptual diagram of the single-phase strategy.
Figure 3. Graphical examples of phenological growth stages and their respective BBCH identification codes. (a) Sketch of sugar beet plants in growth stage 10, with the first leaves visible, stage 12 with 2 leaves unfolded, and stage 17 with 7 leaves; (b) sketch of maize plants in stage 10 of growth with the first leaf through the coleoptile, stage 11 with one leaf unfolded, and stages 13 and 15 with 3 and 5 leaves unfolded, respectively.
Figure 4. Example images from the datasets: (a) maize growth stage 10; (b) maize growth stage 12; (c) maize growth stage 14; (d) maize growth stage 16; (e) maize growth stage 18; (f) sugar beet growth stage 12; (g) sugar beet growth stage 14; (h) sugar beet growth stage 16; (i) sugar beet growth stage 18.
Figure 5. (a) Confusion matrix of the crop identification model, comparing the predicted results with the actual results for each class (maize and sugar beet). (b) mAP versus training epochs during training of the crop identification model.
Figure 6. Detections (red) vs. ground truth (green). Model detections, labeled with the crop and the detection confidence, are shown in red; the original training annotations are shown in green. (a) Correct detections of sugar beet plants that were not initially labeled. (b) Incorrect detection of weeds as sugar beet plants with a confidence of 0.26.
Figure 7. Confusion matrices comparing the predictions with the actual results for each class. (a) Maize model for the different growth stages, where each stage is a class. (b) Sugar beet model for the different growth stages.
Figure 8. Detections made with the multiclass maize growth stage model, showing classification errors between growth stages 16 and 18. (a) Image of plants in growth stage 16 in which two objects were detected as maize 18; the plants were misclassified because they grow very close together. (b) Image of maize in growth stage 18 in which plants were classified as growth stage 16; because of the camera angle, the plants appear to have fewer leaves.
Figure 9. Confusion matrix comparing the predictions with the actual results for each class in the multiclass model covering all crops and growth stages, where each crop/growth stage combination is treated as a separate class.
Figure 10. mAP values for each crop/growth stage class, comparing the results of the two proposed methods.
Table 1. Summary of the training runs for each YOLO version, using 318 images with 80% for training, 10% for testing, and 10% for validation.
Model | Training Time (Hours) | mAP | Inference Time (ms) 1
YOLO v4 | 1.330 | 66.8% | 673.1
YOLO v5 | 0.342 | 77.0% | 15.9
YOLO v6 | 2.184 | 66.9% | 8.2
YOLO v7 | 2.307 | 69.8% | 22.0
YOLO v8 | 1.663 | 77.5% | 16.6
1 Machine characteristics: NVIDIA Tesla T4, 15 GB memory.
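As a point of reference for how a comparison like Table 1 could be reproduced, the sketch below trains and evaluates one detector with the Ultralytics package (which provides YOLOv5 and YOLOv8). The dataset file crops.yaml, the epoch count, and the image size are assumed values for illustration, not the settings used in this study.
```python
# Hypothetical single training/evaluation run, in the spirit of Table 1.
# Dataset file, epochs, and image size are assumed for illustration.
from ultralytics import YOLO

model = YOLO("yolov8s.pt")                             # pretrained checkpoint
model.train(data="crops.yaml", epochs=100, imgsz=640)  # split defined in the YAML

metrics = model.val()                                  # evaluate on the validation split
print(f"mAP@0.5: {metrics.box.map50:.3f}")             # comparable to the mAP column

result = model("sample_field_image.jpg")[0]            # single-image inference
print(f"inference time: {result.speed['inference']:.1f} ms")  # comparable to the last column
```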
Table 2. BBCH identification growth codes of maize (Zea mays L.) [80].
Code | Description
10 | First leaf through coleoptile
11 | Leaves unfolded: 1
12 | Leaves unfolded: 2
13 | Leaves unfolded: 3
… | Stages continue until
19 | Leaves unfolded: >8
Table 3. BBCH identification growth codes of sugar beet (Beta vulgaris L.) [80].
Code | Description
10 | First leaf visible
11 | First pair of leaves folded
12 | Leaves unfolded: 2 (first pair)
14 | Leaves unfolded: 4 (second pair)
15 | Leaves unfolded: 5
… | Stages continue until
19 | Leaves unfolded: >8
Table 4. Description of the dataset used for validating the approaches.
Crop and Stage | Images | Annotations
Maize 10 | 331 | 1987
Maize 12 | 392 | 2219
Maize 14 | 318 | 2341
Maize 16 | 451 | 2630
Maize 18 | 404 | 2237
Sugar beet 12 | 391 | 1911
Sugar beet 14 | 357 | 2284
Sugar beet 16 | 146 | 899
Sugar beet 18 | 372 | 1382
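In the one-phase strategy, every crop/growth-stage combination listed in Table 4 becomes its own detection class. The snippet below writes one possible dataset configuration for such a model; the directory layout, file name, and class names are assumptions for illustration rather than the configuration used in this work.
```python
# Hypothetical dataset configuration for the one-phase model: each crop/growth
# stage combination in Table 4 is treated as a separate detection class.
# Paths, file name, and class names are illustrative assumptions.
import yaml  # PyYAML

stages = {"maize": [10, 12, 14, 16, 18], "sugar_beet": [12, 14, 16, 18]}
class_names = [f"{crop}_{stage}" for crop, crop_stages in stages.items()
               for stage in crop_stages]

config = {
    "path": "datasets/crops",               # dataset root (assumed layout)
    "train": "images/train",                # e.g., 80% of the images
    "val": "images/val",                    # 10%
    "test": "images/test",                  # 10%
    "names": dict(enumerate(class_names)),  # 9 classes: maize_10 ... sugar_beet_18
}

with open("crops_one_phase.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)
```
The two-phase strategy would instead use one configuration listing only the crop names, plus a separate per-crop configuration listing that crop's growth stages.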
Table 5. mAP results for the two-phase crop identification model.
Crop | mAP
Maize | 77.6%
Sugar beet | 70.8%
Average of both classes | 74.2%
Table 6. mAP results for the two-phase sugar beet growth stage model.
Growth Stage | mAP
12 | 57.2%
14 | 76.6%
16 | 86.3%
18 | 89.7%
Average (12 to 18) | 77.4%
Table 7. mAP results for the two-phase maize growth stage model.
Growth Stage | mAP
10 | 69.4%
12 | 78.1%
14 | 65.3%
16 | 56.8%
18 | 37.1%
Average (10 to 18) | 61.3%
Table 8. Multi-class crop and growth stage model mAP (one-phase crop and growth stage model).
Crop | Growth Stage | mAP
Average (Maize–sugar beet) | 10 to 18 | 67.5%
Maize | 10 | 69.6%
Maize | 12 | 75.3%
Maize | 14 | 76%
Maize | 16 | 55.1%
Maize | 18 | 30.1%
Sugar beet | 12 | 55.1%
Sugar beet | 14 | 65.2%
Sugar beet | 16 | 94.4%
Sugar beet | 18 | 87%
Table 9. Single-class crop and growth stage model mAP.
Crop | Growth Stage | mAP
Maize | 10 | 76%
Maize | 12 | 80.4%
Maize | 14 | 87%
Maize | 16 | 85%
Maize | 18 | 90%
Sugar beet | 12 | 68.7%
Sugar beet | 14 | 78%
Sugar beet | 16 | 82.7%
Sugar beet | 18 | 89.4%