The indiscriminate dumping of garbage in rivers has led to a continuous deterioration of water quality and has degraded people's living environment. The accuracy of detecting garbage floating in rivers is strongly affected by factors such as floating speed, daytime or night-time natural light, and viewing angle and position. This paper proposes a novel detection model, called YOLOv5_CBS, for detecting garbage objects floating in rivers, based on improvements to the YOLOv5 model. Firstly, a coordinate attention (CA) mechanism is added to the original C3 module (without compressing the number of channels in the bottleneck), forming a new C3-CA-Uncompress Bottleneck (CCUB) module that enlarges the receptive field and allows the model to pay more attention to the important parts of the processed images. Then, the Path Aggregation Network (PAN) in YOLOv5 is replaced with a Bidirectional Feature Pyramid Network (BiFPN), as proposed by other researchers, to enhance the depth of information mining and improve the feature extraction capability and detection performance of the model. In addition, the Complete Intersection over Union (CIoU) loss function, originally used in YOLOv5 to compute the localization term of the compound loss, is replaced with the SCYLLA-IoU (SIoU) loss function, so as to speed up model convergence and improve regression precision. Experiments conducted on two datasets demonstrate that the proposed YOLOv5_CBS model outperforms the original YOLOv5 model, along with three other state-of-the-art models (Faster R-CNN, YOLOv3, and YOLOv4), in detecting floating river garbage, reaching a recall, average precision, and F1 score of 0.885, 90.85%, and 0.8669, respectively, on the private dataset, and 0.865, 92.18%, and 0.9006 on the Flow-Img public dataset.
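For readers unfamiliar with the quantities named above, the following minimal sketch illustrates the plain Intersection over Union between two boxes (the overlap measure that CIoU and SIoU both extend with additional penalty terms) and the standard definitions of precision, recall, and F1 score from detection counts. It is an illustration only, not the evaluation code used in this work: the function names `iou` and `precision_recall_f1`, the `(x1, y1, x2, y2)` box format, and the example counts are assumptions made for this sketch.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed box format for this sketch; CIoU and SIoU add penalty
    terms on top of this ratio, which are not reproduced here.
    """
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical boxes and counts, chosen only to exercise the functions.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(precision_recall_f1(tp=8, fp=2, fn=2))
```

In a typical detection evaluation, a prediction counts as a true positive when its IoU with a ground-truth box exceeds a chosen threshold; the threshold used in the experiments reported here is not stated in this abstract.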