# Tassel-YOLO: A New High-Precision and Real-Time Method for Maize Tassel Detection and Counting Based on UAV Aerial Images

## Abstract

## 1. Introduction

## 2. Related Work

#### 2.1. YOLO Model

#### 2.2. Global Attention Mechanism

#### 2.3. GSconv

The input feature map has $C_1$ channels. Depthwise separable convolution (DSC) is applied to half of the channels, while standard convolution (SC) is applied to the other half. The outputs of the two convolutions are concatenated for feature fusion, and a channel shuffle then spreads the information generated by SC across the information generated by DSC. The output feature map has $C_2$ channels. The mathematical expression of the GSconv module is given by Equation (1).
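
The shuffle step can be illustrated in isolation. The following is a minimal sketch, assuming the shuffle is the usual two-group channel shuffle (reshape, transpose, flatten); it operates on a plain list of channel labels rather than on tensors, and the function name `channel_shuffle` and the labels are hypothetical, not taken from the authors' code.

```python
def channel_shuffle(channels, groups=2):
    """Interleave channels taken from `groups` equal-sized contiguous groups."""
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    # Equivalent to reshaping to (groups, per_group), transposing, and flattening.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Hypothetical labels: the first half comes from SC, the second half from DSC.
concatenated = ["sc0", "sc1", "dsc0", "dsc1"]
print(channel_shuffle(concatenated))  # ['sc0', 'dsc0', 'sc1', 'dsc1']
```

After the shuffle, every SC channel sits next to a DSC channel, which is the mixing effect the paragraph describes.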

## 3. Methods

#### 3.1. Tassel-YOLO Model Architecture

#### 3.2. SIoU Loss Function

#### 3.2.1. Angle Cost

#### 3.2.2. Distance Cost

#### 3.2.3. Shape Cost

#### 3.2.4. IoU Cost

#### 3.2.5. SIoU Cost

## 4. Experimental Material

#### 4.1. The Establishment of the Dataset

#### 4.2. Data Augmentation

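Here, $\mathrm{batch}_{x1}$ is a batch of samples and $\mathrm{batch}_{y1}$ is the corresponding labels; $\mathrm{batch}_{x2}$ is another batch of samples and $\mathrm{batch}_{y2}$ is the corresponding labels. $\lambda$ is the mixing parameter sampled from the Beta distribution with parameters $\alpha$ and $\beta$, and the principal formula of Mixup is obtained accordingly:

$$
\begin{aligned}
\mathrm{mixed\_batch}_{x} &= \lambda \cdot \mathrm{batch}_{x1} + (1 - \lambda) \cdot \mathrm{batch}_{x2} \\
\mathrm{mixed\_batch}_{y} &= \lambda \cdot \mathrm{batch}_{y1} + (1 - \lambda) \cdot \mathrm{batch}_{y2}
\end{aligned}
$$

where $\mathrm{mixed\_batch}_{x}$ refers to the mixed batch samples and $\mathrm{mixed\_batch}_{y}$ refers to the corresponding labels. Mixup data augmentation increases the diversity of the training set by linearly interpolating between different images and their labels to generate new training data.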
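
The interpolation can be sketched in a few lines of standard-library Python. This is an illustrative sketch only: it assumes samples and labels are lists of numeric vectors, and the helper names `sample_lambda` and `mixup` are hypothetical. A real detection pipeline would operate on image tensors and may handle bounding-box labels differently.

```python
import random


def sample_lambda(alpha, beta, rng=random):
    """Draw the mixing parameter lambda from Beta(alpha, beta)."""
    return rng.betavariate(alpha, beta)


def mixup(batch_x1, batch_y1, batch_x2, batch_y2, lam):
    """Linearly interpolate two batches of samples and their labels."""
    def blend(a, b):
        return [[lam * u + (1.0 - lam) * v for u, v in zip(row_a, row_b)]
                for row_a, row_b in zip(a, b)]
    return blend(batch_x1, batch_x2), blend(batch_y1, batch_y2)


# A fixed lambda keeps the example deterministic.
mixed_x, mixed_y = mixup([[0.0, 2.0]], [[1.0, 0.0]],
                         [[4.0, 6.0]], [[0.0, 1.0]], lam=0.25)
print(mixed_x, mixed_y)  # [[3.0, 5.0]] [[0.25, 0.75]]
```

In training, `lam` would instead come from `sample_lambda(alpha, beta)`, drawn once per mixed batch.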

## 5. Experiment Results

#### 5.1. Experimental Platform and Evaluation Indicators

#### 5.2. Training Comparison with Other Models

#### 5.3. Counting and Detection Results

#### 5.4. Contrast Experiment Results of Introducing Attention Mechanism

#### 5.5. Ablation Experiment

## 6. Conclusions and Future Work

- This study focuses on real-time detection of maize tassels. As more data become available for more plant species and in larger quantities, we will continue to optimize Tassel-YOLO and apply the model to broader tasks, such as wheat ear and millet ear detection.
- Hyperspectral images can provide richer spectral information, and using hyperspectral images for tassel detection can provide more comprehensive and accurate data support. This is also a future research direction worth exploring.
- Maize passes through multiple growth stages, but this study investigated detection and counting only at the tasseling stage. In the future, we will experimentally analyze images from other growth stages to obtain a more comprehensive assessment of maize quantity.
- This study counted tassels only within the local area of a field covered by a single image. Estimating the tassel count of an entire maize field from overlapping images is also worth investigating.

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

**Figure 10.** Some effects of data augmentation methods. (**a**) Original image; (**b**) Rotation; (**c**) Equal scaling; (**d**) Color dithering; (**e**) Mosaic; (**f**) Mixup.

**Figure 12.** The results of maize tassel detection by Tassel-YOLO. (**a**) The detection performance of maize tassels at different sizes; (**b**) Overall image detection performance.

| Date | Weather | Device | Resolution | FPS | Image Sensor |
|---|---|---|---|---|---|
| 16 June 2022 | Sunny | DJI Mavic drone | 12 MP | 24@1080P | 1-inch CMOS |
| 2 July 2022 | Sunny | DJI Mavic drone | 12 MP | 24@1080P | 1-inch CMOS |

| Model | mAP@0.5 | Precision | Recall | F1 | FPS |
|---|---|---|---|---|---|
| YOLOv4 | 89.10% | 88.01% | 85.92% | 86.95% | 55 |
| YOLOv5 | 93.42% | 91.23% | 89.13% | 90.17% | 86 |
| YOLOv7 | 94.71% | 92.32% | 91.74% | 92.03% | 69 |
| YOLOv8 | 94.26% | 92.14% | 92.92% | 92.53% | 75 |
| Tassel-YOLO | 96.14% | 93.16% | 93.21% | 93.18% | 74 |
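
The F1 column in the table above is consistent with the standard definition of F1 as the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision and recall; the function name `f1_score` is illustrative and not taken from the authors' code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in the same units, e.g. percent)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision %, recall %) as reported in the comparison table.
reported = {
    "YOLOv4": (88.01, 85.92),
    "YOLOv7": (92.32, 91.74),
    "Tassel-YOLO": (93.16, 93.21),
}
for model, (p, r) in reported.items():
    print(f"{model}: F1 = {f1_score(p, r):.2f}%")
```

Rounded to two decimals, the recomputed values match the reported 86.95%, 92.03%, and 93.18%.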

| Group | NMC | Tassel-YOLO NAC | Tassel-YOLO CA | Tassel-YOLO MRE (%) | YOLOv8 NAC | YOLOv8 CA | YOLOv8 MRE (%) | YOLOv7 NAC | YOLOv7 CA | YOLOv7 MRE (%) | YOLOv5 NAC | YOLOv5 CA | YOLOv5 MRE (%) | YOLOv4 NAC | YOLOv4 CA | YOLOv4 MRE (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 380 | 368 | 96.8% | 0.32 | 359 | 94.5% | 0.55 | 364 | 95.8% | 0.42 | 358 | 94.2% | 0.58 | 347 | 91.3% | 0.87 |
| 2 | 791 | 771 | 97.5% | 0.25 | 756 | 95.6% | 0.44 | 763 | 96.5% | 0.35 | 754 | 95.3% | 0.47 | 729 | 92.2% | 0.78 |
| 3 | 1248 | 1221 | 97.8% | 0.22 | 1211 | 97.0% | 0.30 | 1209 | 96.9% | 0.31 | 1193 | 95.6% | 0.44 | 1158 | 92.8% | 0.72 |
| 4 | 1682 | 1650 | 98.1% | 0.19 | 1639 | 97.4% | 0.26 | 1633 | 97.1% | 0.29 | 1615 | 96.0% | 0.40 | 1569 | 93.3% | 0.67 |
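
The CA values in the counting table above can be reproduced as the ratio NAC / NMC. The sketch below checks this for the Tassel-YOLO columns. It assumes that NMC is the manually counted number of tassels, NAC the number counted by the model, and CA the counting accuracy; these expansions are inferred from the numbers and are not defined in this excerpt. The MRE column is not recomputed because its definition is not given here.

```python
def counting_accuracy(nac, nmc):
    """Counting accuracy in percent, assumed here to be NAC / NMC * 100."""
    if nmc <= 0:
        raise ValueError("NMC must be positive")
    return 100.0 * nac / nmc


# (NMC, Tassel-YOLO NAC) for groups 1 to 4, copied from the table.
groups = [(380, 368), (791, 771), (1248, 1221), (1682, 1650)]
for index, (nmc, nac) in enumerate(groups, start=1):
    print(f"Group {index}: CA = {counting_accuracy(nac, nmc):.1f}%")
```

Rounded to one decimal, this gives 96.8%, 97.5%, 97.8%, and 98.1%, in agreement with the reported Tassel-YOLO CA values.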

| Attention: SE | Attention: CBAM | Attention: GAM | Precision | Recall | F1 | mAP@0.5 | FLOPs | Parameters |
|---|---|---|---|---|---|---|---|---|
| × | × | × | 92.32% | 91.74% | 92.03% | 94.71% | 103.2 G | 36.48 M |
| √ | × | × | 92.92% | 89.48% | 91.17% | 94.33% | 103.3 G | 36.62 M |
| × | √ | × | 93.57% | 91.24% | 92.39% | 94.83% | 103.9 G | 37.63 M |
| × | × | √ | 92.84% | 92.86% | 92.85% | 95.84% | 111.5 G | 43.98 M |

| Methods | mAP@0.5 | F1 | FLOPs | Parameters | Inference Time (ms) |
|---|---|---|---|---|---|
| YOLOv7 | 94.71% | 92.03% | 103.2 G | 36.48 M | 14.5 |
| YOLOv7 + GAM | 95.84% | 92.85% | 111.5 G | 43.98 M | 15.6 |
| YOLOv7 + Slim Neck | 95.21% | 91.87% | 82.9 G | 26.69 M | 12.3 |
| YOLOv7 + SIoU | 94.92% | 92.16% | 103.2 G | 36.48 M | 14.5 |
| Tassel-YOLO | 96.14% | 93.18% | 91.8 G | 32.37 M | 13.5 |
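
The ablation figures above can be related to the FPS values in the model comparison table. The sketch below assumes that FPS is simply the reciprocal of the per-image inference time, which the excerpt does not state explicitly, and also computes the relative change in FLOPs and parameters between YOLOv7 and Tassel-YOLO. The helper names are illustrative.

```python
def fps_from_latency_ms(latency_ms):
    """Frames per second implied by a per-image inference time in milliseconds."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def relative_change_percent(baseline, value):
    """Signed change of `value` relative to `baseline`, in percent."""
    return 100.0 * (value - baseline) / baseline


print(round(fps_from_latency_ms(14.5)))  # YOLOv7
print(round(fps_from_latency_ms(13.5)))  # Tassel-YOLO
print(f"FLOPs: {relative_change_percent(103.2, 91.8):.2f}%")
print(f"Parameters: {relative_change_percent(36.48, 32.37):.2f}%")
```

Under that assumption, 14.5 ms and 13.5 ms correspond to about 69 and 74 FPS, matching the reported values, and Tassel-YOLO uses roughly 11% fewer FLOPs and parameters than the YOLOv7 baseline.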

