Data Descriptor

RaspberrySet: Dataset of Annotated Raspberry Images for Object Detection

1 Institute of Horticulture, Graudu iela 1, LV-3701 Ceriņi, Latvia
2 Institute of Electronics and Computer Science, Dzērbenes iela 14, LV-1006 Riga, Latvia
* Author to whom correspondence should be addressed.
Submission received: 20 March 2023 / Revised: 24 April 2023 / Accepted: 2 May 2023 / Published: 10 May 2023

Abstract

The RaspberrySet dataset is a resource for those working in agriculture, particularly in the selection and breeding of ecologically adaptable berry cultivars. Long-term changes in temperature and weather patterns have made it increasingly important for crops to be able to adapt to their environment. To assess the suitability of different cultivars or to make yield predictions, it is necessary to describe and evaluate berries’ characteristics at various growth stages. This process is typically carried out visually, which is time-consuming and labor-intensive and requires significant expert knowledge. The RaspberrySet dataset was created to assist with this process. It includes images of raspberries at four stages of development: flower buds, flowers, unripe berries, and ripe berries. The objects in the images are classified into five classes (buds, damaged buds, flowers, unripe berries, and ripe berries), annotated with ground-truth regions of interest (ROI), and presented in YOLO format. The dataset includes 2039 high-resolution RGB images with a total of 46,659 annotations provided by experts using Label Studio software (version 1.7.1). The images were taken in various weather conditions, at different times of the day, and from different angles, and they include both fully visible and partially obscured buds, flowers, and berries. This dataset is intended to improve the efficiency of berry breeding and yield estimation and to identify the raspberry phenotype more accurately. It may also be useful for breeding other fruit crops, as it allows for the reliable detection and phenotyping of yield components at different stages of development. The dataset is homogenized and consists of images taken on-site at the Institute of Horticulture in Dobele, Latvia.
Dataset License: CC BY 4.0.

1. Summary

Raspberry breeding has been carried out at the Institute of Horticulture (LatHort), Dobele, Latvia (GPS location: 56°36′39″ N, 23°17′50″ E) since 1980. The main objectives of raspberry breeding are to achieve ecological plasticity of plants, high yield and fruit quality, and resistance to diseases and pests. The raspberry cultivar assortment in the Baltic countries has been shaped by the region’s twentieth-century history and by climatic conditions, especially the need for winter hardiness; the cultivars most widely grown commercially were mainly bred in Russia. A similar situation is observed for the genetic resources in the Baltic countries, which consist of some old European and American cultivars but mostly of Russian cultivars and hybrids. Small breeding programs are running only in Latvia and Estonia [1]. The hybridization program involves the evaluation of about 1500 raspberry hybrids each growing season. The evaluation covers more than 30 traits, most of which, including yield components, are evaluated visually. Raspberries are an example of a plant with a complex set of traits influenced by both genotype and the environment, i.e., meteorological conditions.
LatHort has developed rich genetic material for red raspberry, including cultivars and promising hybrids, which are intensively used in hybridization. The genotypes differ in yield components (number of canes, fruit laterals per cane, and the weight of fruit); winter hardiness; disease resistance; fruit quality characteristics including shape, color, biochemical composition, etc.; and fruit ripening time. Table 1 summarizes some of the most important fruit and yield component parameters of the floricane raspberry cultivars and promising hybrids.
Table 2 summarizes some of the most important fruit and yield component parameters of the primocane raspberry cultivars and promising hybrids.
The process of raspberry breeding takes 15–20 years from crossing to cultivar. To select candidates for cultivars, the characteristics of several thousand seedlings must be described and evaluated, most of which is performed visually. This is a time-consuming and labor-intensive process that also requires sufficient manpower. In addition, visual scoring is relatively subjective, and results may vary among different evaluators. Therefore, the utility of new techniques for non-invasive fruit detection and phenotyping to improve yield performance should be evaluated by adopting Machine Learning (ML) techniques, considering cost–benefit and human-centered considerations.
ML and deep learning (DL) techniques have shown very promising results in fruit classification and detection problems [2] and yield quality evaluation [3]. In precision agriculture, a clean image dataset, annotated with an image labeling tool, is a basic requirement for building accurate and robust ML models for real-time environments.

2. Data Description

The annotated raspberry (Rubus idaeus) dataset is a collection of images and annotations of the fruit, designed for use in deep learning (DL). The dataset includes a total of 2039 original raw images, each with a resolution of 1773 × 1773 pixels and saved in the .jpg format for compatibility with a variety of image processing software. Each image is accompanied by a corresponding .txt annotation file in the YOLO (“You Only Look Once”) format [4]. Detection results obtained from the dataset are shown in Figure 1.
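Because each .jpg image is expected to have one .txt annotation file, a quick consistency check after download can be sketched as follows. This is a minimal sketch, not part of the published dataset tooling: it assumes that an image and its annotation share the same file stem, that all files sit somewhere under one directory, and that a `classes.txt` file (if present) is a class list rather than an annotation. The function names and the example file names are hypothetical.

```python
from pathlib import Path


def unpaired_stems(image_names, label_names):
    """Return (images without a label, labels without an image), matched by file stem."""
    image_stems = {Path(name).stem for name in image_names}
    label_stems = {Path(name).stem for name in label_names}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


def check_dataset(dataset_dir):
    """Scan a directory tree and report unpaired .jpg and .txt files.

    Assumption: a file named classes.txt, if present, holds class names
    and is not an annotation file.
    """
    root = Path(dataset_dir)
    images = [p.name for p in root.rglob("*.jpg")]
    labels = [p.name for p in root.rglob("*.txt") if p.name != "classes.txt"]
    return unpaired_stems(images, labels)


if __name__ == "__main__":
    # Hypothetical file names, used only to illustrate the check.
    missing_labels, missing_images = unpaired_stems(
        ["IMG_3516.jpg", "IMG_3517.jpg"], ["IMG_3516.txt"]
    )
    print("images without labels:", missing_labels)
    print("labels without images:", missing_images)
```

Calling `check_dataset` on the extracted dataset directory should return two empty lists if every image has its annotation file.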
YOLO is a popular choice in the DL community as an efficient and representative one-stage detection architecture, which can detect, locate, and classify individual objects within an image. This is particularly useful in plant detection, where it is necessary to quickly and accurately identify the various plant species and characteristics present in an image.
In the case of raspberry detection, YOLO [5] was chosen for its ability to quickly process and detect relatively small raspberry fruits. Furthermore, compared to two-stage models, one-stage models are generally faster at detecting and counting fruits, making them a practical choice for agricultural applications [6]. The dataset was divided into five classes: “buds”, “damaged buds”, “flowers”, “unripe berries”, and “ripe berries”. The images were captured under field conditions, with images of buds, flowers, and unripe berries photographed in June 2021 and images of buds, flowers, unripe and ripe berries, and damaged buds photographed in July and August 2021. The images were collected from different raspberry genotypes, which can exhibit variations in bush form, yield components, and fruit location. The images were taken in an orchard at the Institute of Horticulture (LatHort) in Dobele, Latvia, by experts from LatHort who were responsible for image acquisition and manual annotation. The Institute of Electronics and Computer Science (EDI) also contributed to the creation of the dataset by providing software and hardware support. Of the 46,659 annotations in the raspberry dataset, 11,788 are for buds, 4748 for flowers, 29,156 for unripe berries, 463 for ripe berries, and 504 for damaged buds.
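The per-class annotation counts reported above can be checked against the stated total, and the class shares make the imbalance between classes explicit. The snippet below is a small sketch that uses only the numbers given in the text; the variable and function names are illustrative.

```python
# Annotation counts per class, as reported in the text.
REPORTED_COUNTS = {
    "buds": 11788,
    "flowers": 4748,
    "unripe berries": 29156,
    "ripe berries": 463,
    "damaged buds": 504,
}
REPORTED_TOTAL = 46659


def class_shares(counts):
    """Return each class's fraction of the total number of annotations."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    total = sum(REPORTED_COUNTS.values())
    print("sum of class counts:", total, "matches total:", total == REPORTED_TOTAL)
    for name, share in class_shares(REPORTED_COUNTS).items():
        print(f"{name:15s} {REPORTED_COUNTS[name]:6d} {share:6.1%}")
```

The counts sum to the reported total. Unripe berries account for roughly 62% of all annotations, while ripe berries and damaged buds each make up about 1%, which is worth keeping in mind when splitting the data or choosing evaluation metrics.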

3. Methods

All the software used is provided in Table 3.

3.1. Image Capturing

The raspberry images in the dataset were taken in an orchard at the Institute of Horticulture in Dobele, Latvia. The orchard (coordinates: 56°36′23.5″ N, 23°18′09.8″ E) was planted with 14 genotypes of raspberry, and the images were captured using an Apple iPhone XS smartphone. The images were taken at four different stages of the raspberry’s phenological development: buds, flowers, unripe fruits, and ripe fruits. The distance between the raspberry bushes and the camera was about 30 cm, giving close-up views so that the crop elements could be seen as clearly as possible in the images. The images were taken from a variety of angles, depending on shoot length: for shoots 1.0–1.4 m long, the photographing angle relative to the soil was 45°; for shoots around 1.5–1.6 m long, the angle was 90°; and for shoots longer than 1.7 m, the angle was 120°.
The images were taken under a variety of weather conditions, including sunny, cloudy, and partly cloudy. Experts from the Institute of Horticulture evaluated the images and divided them into five classes: “buds”, “flowers”, “unripe berries”, “ripe berries”, and “damaged buds”. Images of floricane raspberry buds, flowers, and unripe berries were captured on 15 and 16 June 2021, and images of buds, damaged buds, flowers, and unripe and ripe berries were captured on 2 July 2021. Images of primocane raspberry buds, damaged buds, flowers, and unripe and ripe berries were captured on 6 August 2021 (Table 4).
Temperature is one of the factors that influence yield, but when plants are grown under uncontrolled conditions (in the open field), it varies from year to year and thus affects the yield elements. For example, low temperatures during flowering can affect berry formation, as the flowers are less likely to be pollinated. In spring, high temperatures and insufficient moisture supply can intensify winter damage, resulting in bud dieback or the death of corroding shoots, which affects the overall view of yield elements. This may be less important for the identification of the objects themselves, but it certainly has an impact on yield and berry size. From a biological point of view, it is important that the plant characteristics are obtained under known environmental conditions, because changing conditions, in this case temperature, will change the yield elements and the overall characteristics. This is therefore also relevant for yield forecasting, and it could be particularly important when analyzing 3D images and comparing them with measured data.

3.2. Image Annotation

The dataset uploaded to the Institute of Electronics and Computer Science (EDI) consists of raw images of red raspberry, each saved in the .jpg format. The dataset is divided into five classes: “buds”, “flowers”, “unripe berries”, “ripe berries”, and “damaged buds”. The dataset includes .txt files in the YOLO format, which provide bounding-box annotations for the locations of the raspberry objects in the images. These annotations were created using the Label Studio software, and the bounding boxes may overlap so that each berry is covered entirely. In the YOLO format, each line of a .txt file describes one bounding box. The first value is the class index (0: buds, 1: flowers, 2: unripe berries, 3: ripe berries, 4: damaged buds), and the following values are the x and y coordinates of the box center and the width and height of the bounding box, all normalized to the image dimensions.
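As an illustration of how one annotation line can be read, the following sketch converts a YOLO line into a class name and a pixel-space box. It assumes the standard YOLO text layout (`class x_center y_center width height`, with coordinates normalized to the range 0–1) and the class indices listed above. The function name and the example line are hypothetical and are not taken from the dataset.

```python
# Class indices as listed in the annotation description above.
CLASS_NAMES = ["buds", "flowers", "unripe berries", "ripe berries", "damaged buds"]


def yolo_to_pixel_box(line, img_w, img_h):
    """Convert one YOLO annotation line to (class name, (left, top, right, bottom)).

    Assumes normalized center coordinates followed by normalized width and height.
    """
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(v) for v in parts[1:5])
    left = (x_center - width / 2) * img_w
    top = (y_center - height / 2) * img_h
    right = (x_center + width / 2) * img_w
    bottom = (y_center + height / 2) * img_h
    return CLASS_NAMES[class_id], (left, top, right, bottom)


if __name__ == "__main__":
    # Hypothetical annotation line for a 1000 x 1000 pixel image.
    name, box = yolo_to_pixel_box("2 0.5 0.5 0.25 0.5", 1000, 1000)
    print(name, box)  # unripe berries (375.0, 250.0, 625.0, 750.0)
```

Passing the image width and height explicitly keeps the function independent of any image library; the sizes can be read from the .jpg files with whichever tool is already in use.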

Author Contributions

Conceptualization, S.S.; methodology, S.S., I.K., K.S. and E.K.; software, A.N.; validation, K.S.; formal analysis, K.S.; investigation, I.N.; resources, S.S.; data curation, K.S.; writing—original draft preparation, E.E.; writing—review and editing, K.S., S.S., I.K. and E.K.; visualization, I.N.; supervision, E.K.; project administration, S.S.; funding acquisition, S.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research and APC were funded by the Latvian Council of Science, grant number lzp-2020/1-0353 “Smart noninvasive phenotyping of raspberries and Japanese quinces using machine learning and hyperspectral and 3D imaging”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset is available on the Zenodo platform: https://doi.org/10.5281/zenodo.7014728 (accessed on 9 December 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lācis, G.; Kota-Dombrovska, I.; Strautiņa, S. Evaluation of Red Raspberry Cultivars Used for Breeding and Commercial Growing in the Baltic Region. Proc. Latv. Acad. Sci. Sect. B Nat. Exact Appl. Sci. 2017, 71, 203–210.
  2. Mimma, N.-E.-A.; Ahmed, S.; Rahman, T.; Khan, R. Fruits Classification and Detection Application Using Deep Learning. Sci. Program. 2022, 2022, 4194874.
  3. Apolo-Apolo, O.E.; Martínez-Guanter, J.; Egea, G.; Raja, P.; Pérez-Ruiz, M. Deep Learning Techniques for Estimation of the Yield and Size of Citrus Fruits Using a UAV. Eur. J. Agron. 2020, 115, 126030.
  4. Sudars, K.; Namatevs, I.; Judvaitis, J.; Balass, R.; Nikulins, A.; Peter, A.; Strautina, S.; Kaufmane, E.; Kalnina, I. YOLOv5 Deep Neural Network for Quince and Raspberry Detection on RGB Images. In Proceedings of the 2022 Workshop on Microwave Theory and Techniques in Wireless Communications (MTTW), Riga, Latvia, 5 October 2022; IEEE: New York, NY, USA, 2022; pp. 19–22.
  5. Bresilla, K.; Perulli, G.D.; Boini, A.; Morandi, B.; Corelli Grappadelli, L.; Manfrini, L. Single-Shot Convolution Neural Networks for Real-Time Fruit Detection Within the Tree. Front. Plant Sci. 2019, 10, 611.
  6. Li, Z.; Guo, R.; Li, M.; Chen, Y.; Li, G. A Review of Computer Vision Technologies for Plant Phenotyping. Comput. Electron. Agric. 2020, 176, 105672.
Figure 1. Detection results obtained with the trained detector.
Table 1. Characterization of floricane raspberry yield components.
| Cultivars and Hybrids | Fruit Laterals per Cane | Fruit per Fruit Lateral | Average Weight of Fruit, g | Yield per Cane, g | Yield per Bush, g | Fruit Length, mm | Fruit Width, mm | Shape Index (Ratio Length:Width) | Account of Drupe | Fruit Glossiness (Score 1–9) | Fruit Firmness (Score 1–9) | Fruit Shape | Fruit Colour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Bozhestvennaja | 10.5 | 7.2 | 2.7 | 204.1 | 1633.0 | 23.1 | 15.6 | 1.5 | 106.2 | 2.0 | 6.0 | trapezoidal | light red |
| Glen Ample | 6.9 | 7.3 | 2.2 | 110.8 | 886.5 | 17.7 | 18.2 | 1.0 | 64.5 | 2.2 | 6.5 | broad conical | light red |
| Kapriz Bogov | 13.9 | 7.8 | 2.1 | 227.7 | 1821.5 | 20.0 | 18.7 | 1.1 | 81.1 | 4.9 | 4.0 | broad conical | red |
| Lina | 11.7 | 8.5 | 2.7 | 268.5 | 2148.1 | 17.7 | 15.8 | 1.1 | 85.5 | 3.0 | 6.0 | broad conical | light red |
| Lubetovskaja | 13.2 | 10.1 | 2.1 | 280.0 | 2239.8 | 17.4 | 15.4 | 1.1 | 71.0 | 3.7 | 5.0 | conical | dark red |
| Octavia | 8.7 | 8.8 | 2.2 | 168.4 | 1347.5 | 18.4 | 17.4 | 1.1 | 79.3 | 3.0 | 7.0 | broad conical | light red |
| Patricija | 15.6 | 8.7 | 2.3 | 312.2 | 2497.2 | 25.7 | 18.1 | 1.4 | 112.3 | 3.8 | 4.7 | trapezoidal | light red |
| Ruvi | 15.4 | 9.1 | 1.8 | 252.3 | 2018.0 | 15.8 | 14.9 | 1.1 | 77.6 | 4.0 | 5.0 | conical | light red |
| Shahrizada | 9.7 | 6 | 2.3 | 133.9 | 1070.9 | 17.7 | 15.3 | 1.2 | 86.5 | 4.2 | 6.3 | conical | dark red |
| Sulamifa | 18.6 | 7.8 | 1.3 | 188.6 | 1508.8 | 21.4 | 17.3 | 1.2 | 75.6 | 2.1 | 3.7 | trapezoidal | dark red |
| S1-12-13 | 15.4 | 9.1 | 1.8 | 252.3 | 2018.0 | 11.7 | 11.8 | 1.0 | 74.3 | 5.6 | 6.3 | conical | dark red |
| S11-25a-4 | 15.1 | 12.4 | 2.5 | 468.1 | 3744.8 | 17.3 | 16.6 | 1.0 | 80.1 | 3.8 | 4.2 | conical | red |
| S2-6-13 | 21.5 | 11.7 | 2 | 503.1 | 4024.8 | 17.0 | 15.1 | 1.1 | 94.8 | 2.9 | 5.7 | trapezoidal | red |
| S2-6-8 | 18.2 | 14.4 | 1.8 | 471.7 | 3774.0 | 19.0 | 18.2 | 1.0 | 75.5 | 2.2 | 4.4 | conical | light red |
Table 2. Characterization of primocane raspberry yield components.
| Cultivars and Hybrids | Length of Cane, cm | Length of Fruiting Part of the Cane, cm | Fruit Laterals per Cane | Average Weight of Fruit, g | Yield per Cane, g | Yield per Bush, g | Fruit Length, mm | Fruit Width, mm | Shape Index (Ratio Length:Width) | Account of Drupe | Fruit Glossiness (Score 1–9) | Fruit Firmness (Score 1–9) | Fruit Shape | Fruit Colour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Brilliantovaja | 77.6 | 42.5 | 12.0 | 2.7 | 27.6 | 220.8 | 23.8 | 21.4 | 1.1 | 78.2 | 4.9 | 5.0 | conical | red |
| Gerakl | 129.1 | 57.8 | 16.1 | 2.2 | 53.0 | 424.0 | 17.3 | 19.7 | 0.9 | 56.8 | 4.4 | 7.1 | round | dark red |
| Poemat | 135.0 | 40.7 | 13.8 | 2.7 | 81.4 | 651.2 | 17.2 | 17.4 | 1.0 | 72.7 | 4.2 | 5.9 | round | light red |
| Polana | 132.8 | 47.8 | 16.0 | 2.2 | 124.7 | 997.6 | 23.4 | 20.3 | 1.2 | 108.7 | 6.7 | 5.0 | conical | red |
| Polonez | 138.1 | 35.9 | 11.8 | 2.2 | 81.1 | 649.0 | 21.5 | 18.0 | 1.2 | 103.2 | 5.6 | 6.4 | conical | light red |
| Rubinovij Gigant | 127.6 | 52.8 | 17.5 | 2.1 | 50.7 | 405.6 | 20.3 | 22.2 | 0.9 | 67.5 | 5.4 | 6.3 | broad conical | red |
| Rubinovoje Ožerelje | 125.0 | 40.7 | 13.2 | 2.1 | 84.4 | 675.2 | 24.1 | 18.9 | 1.3 | 82.2 | 4.8 | 6.7 | conical | red |
| B6R9 | 103.0 | 43.5 | 13.7 | 3.3 | 135.6 | 1084.8 | 18.1 | 18.6 | 1.0 | 66.3 | 5.2 | 4.3 | round | dark red |
| P6R3 | 107.4 | 41.2 | 14.4 | 3.9 | 183.1 | 1464.8 | 24.0 | 21.6 | 1.1 | 113.9 | 8.0 | 7.0 | conical | red |
| P6R33 | 111.8 | 45.2 | 13.9 | 2.9 | 157.9 | 1263.2 | 19.6 | 19.4 | 1.0 | 70.2 | 6.9 | 6.7 | round | red |
Table 3. The list of software used.
| Software/Platform | Version | Information |
|---|---|---|
| Label Studio | 1.7.1 | https://github.com/heartexlabs/label-studio (accessed on 9 December 2022) |
Table 4. The weather conditions under which all the images were gathered.
| Date | Classes | No. of Images | Time | Air Temperature, °C | Humidity, % | PPFD, µmol/m²/s | Soil Temperature, °C | Soil Moisture Content, % |
|---|---|---|---|---|---|---|---|---|
| 15 June 2021 | “Buds”, “Flowers”, “Unripe Berries” | Range 1 (3516–4076): 558 images | 11:13–11:55 | 21.8 | 56.7 | 1387.8 | 18.7 | 18.5 |
| 16 June 2021 | “Buds”, “Flowers”, “Unripe Berries” | Range 2 (4132–4456): 324 images | 9:19–9:59 | 19.7 | 49.1 | 1472.2 | 17.5 | 15.3 |
| 2 July 2021 | “Buds”, “Flowers”, “Unripe Berries”, “Ripe Berries”, “Damaged Buds” | Range 3 (5095–5803): 678 images | 8:48–10:18 | 26.9 | 54.8 | 1430.3 | 23.6 | 9.8 |
| 6 August 2021 | “Buds”, “Flowers”, “Unripe Berries”, “Ripe Berries”, “Damaged Buds” | Range 4 (6843–7390): 512 images | 8:55–9:33 | 19.0 | 57.0 | 854.0 | 18.6 | 9.9 |

Share and Cite

MDPI and ACS Style

Strautiņa, S.; Kalniņa, I.; Kaufmane, E.; Sudars, K.; Namatēvs, I.; Nikulins, A.; Edelmers, E. RaspberrySet: Dataset of Annotated Raspberry Images for Object Detection. Data 2023, 8, 86. https://doi.org/10.3390/data8050086

