Article

Machine Learning for the Fast and Accurate Assessment of Fitness in Coral Early Life History

1 Australian Institute of Marine Science, Townsville 4810, Australia
2 Monash Institute of Pharmaceutical Sciences, Monash University, Parkville 3052, Australia
* Author to whom correspondence should be addressed.
Remote Sens. 2021, 13(16), 3173; https://doi.org/10.3390/rs13163173
Submission received: 9 June 2021 / Revised: 31 July 2021 / Accepted: 6 August 2021 / Published: 11 August 2021
(This article belongs to the Section Ecological Remote Sensing)

Abstract
As coral reefs continue to degrade globally due to climate change, considerable effort and investment is being put into coral restoration. The production of coral offspring via asexual and sexual reproduction is among the proposed tools for restoring coral populations and will need to be delivered at scale. Simple, inexpensive, and high-throughput methods are therefore needed for the rapid analysis of thousands of coral offspring. Here we develop a machine learning pipeline to rapidly and accurately measure three key indicators of coral juvenile fitness: survival, size, and color. Using machine learning, we classify pixels through an open-source, user-friendly interface to quickly identify and measure coral juveniles on two substrates (field-deployed terracotta tiles and experimental, laboratory PVC plastic slides). The method's ease of use and ability to be trained quickly and accurately using small training sets make it suitable for application to images of sexually produced coral species without existing datasets. Our results show higher accuracy of survival measurements for slides (94.6% accuracy with five training images) compared to field tiles measured over multiple months (March: 77.5%, June: 91.3%, October: 97.9% accuracy with 100 training images). When using fewer training images, the accuracy of area measurements was also higher on slides (7.7% average size difference) than on tiles (24.2% average size difference for October images). The pipeline was 36× faster than manual measurements. The slide images required fewer training images than the tiles, and we provide cut-off guidelines for training on both substrates. These results highlight the importance of high-throughput methods, substrate choice, image quality, and the number of training images for measurement accuracy. This study demonstrates the utility of machine learning tools for scalable ecological studies and conservation practices, facilitating rapid management decisions for reef protection.

Graphical Abstract

1. Introduction

The continued increase in sea surface temperatures due to climate change has been the major driver of the loss of up to 50% of the world's coral reefs [1,2,3,4,5,6]. Mass bleaching and mortality events, in which corals lose their symbiotic dinoflagellates (Symbiodiniaceae) en masse, have become more frequent [7]. Even relatively healthy ecosystems, such as the Great Barrier Reef (GBR), suffered back-to-back bleaching events in 2016 and 2017 [2,6]. An increase in the frequency of these mass bleaching events impedes a reef's ability to recover to previous levels of coral cover before the next disturbance event [8]. Coral reefs show some potential to endure anthropogenic impacts through rapid acclimation and adaptation [9,10,11]. However, some reef restoration may be needed to maintain resilience whilst global efforts to reduce warming are implemented. As a result, there has been a rapid increase in investment in intervention and restoration initiatives focused on improving coral survival [12,13,14], especially in the large-scale production of sexually produced corals [15]. It is therefore vital that the development of conservation tools for habitat protection and decision making takes advantage of high-speed imaging technology that can quickly process large amounts of complex data [16].
Central to restoration-based interventions is "scalability", or the ability of these interventions to influence large areas of reef. Previous techniques involving the manual census of coral fragments [17,18] and early life-history stages [19] are limited in their ability to process large volumes of data. This is particularly relevant for coral juveniles that are out-planted on devices designed for high survival [20], as they can be grown in the hundreds of thousands due to their small size and mass reproductive strategies (i.e., broadcast spawning; [21]). The early life-stages of corals, most notably post-settlement recruits, are susceptible to high rates of mortality. Current research focuses on determining and reducing the drivers of mortality by testing population-level heat tolerance so that colonies can be chosen for restoration based on fitness [19,20,22,23,24]. Analyses that can resolve minute differences between juveniles are therefore needed. Coral juvenile survival, growth, and color (as an indicator of algal-symbiont density and therefore bleaching) are all factors used for assessing the success of coral juvenile communities [25]. Current analysis techniques rely on time-consuming manual size measurements, visual color grading using color reference cards [18], and potentially subjective assessments of survival, which predispose the results to human error. Manual image annotation is also difficult to scale for large intervention projects. Semi-automated tools exist (e.g., the cell counting tool in Fiji ImageJ; [26,27]), but results may vary with different surface substrates, coral color, and structure. This highlights the need to develop workflows that are easy to use and scalable. Lessons from automated and semi-automated methods developed for medical imaging [28] and cancer detection [29] should be harnessed to detect minute differences and to minimise time constraints.
Automated, computer-driven software tools are accelerating data processing in both terrestrial [30,31] and marine research [32,33]. The deterioration of habitats under climate change impacts and local pressures creates a pressing need for high-throughput conservation tools that can rapidly process complex datasets. Software that enables automated image analysis is widely used to identify and count cells, with many of the employed techniques using model-based segmentation [34,35]. Often, these techniques are computationally complex, depend on building models of a single object type, and require homogeneous shapes and intensities to correctly classify an object. Effective methods of coral analysis exist that utilise various forms of artificial intelligence (AI) through machine learning (ML) and both deep neural network (DNN) and convolutional neural network (CNN) techniques. Here, we define ML as the use of computer algorithms that improve automatically through experience and the use of data (sensu Crisci et al. [36]). These methods have been used as conservation tools to assess large areas of reef for coral detection and classification [37,38,39,40,41,42,43,44,45,46,47] and for the quantification of benthic cover [47,48,49]. This diverse set of approaches enables quick, large-scale in situ habitat assessment and coral cover analysis. However, these approaches lack certain features needed for assessing the fitness of coral juveniles produced for coral restoration. For example, micron-scale resolution is needed to measure the growth of juveniles that are often less than a millimeter in diameter.
Approaches that simplify the classification and segmentation of objects at the micron scale are being developed [50]. In particular, ML using random forests can further accelerate data acquisition and processing for small objects (e.g., coral juveniles) due to its effective supervised learning. This method has shown success at distinguishing the boundaries of objects using pixel classification. Notably, comparisons between ML programs (e.g., Ilastik) and other methods that use deep learning approaches show less than 2% difference in model accuracy [51]. This is important when considering that ML methods are considerably less computationally expensive, less time-consuming, and require no programming ability [52]. Bringing these AI technologies into coral reef research and restoration may be analogous to the harnessing of "big data" previously experienced during the molecular biology sequencing revolution [53].
Machine learning, compared to deep learning algorithms, may be considered a more reproducible workflow when applied to new datasets. This is, in part, due to its robustness when encountering different backgrounds and its high replicability (see damage estimations; [54]). In contrast, deep neural networks are more complex and can show high variability in their optimisation, hyperparameters, and architecture [55]. We therefore propose this ML method as more suitable for assessing relatively simple coral juvenile fitness traits. It should be noted that true deep learning, although inordinately more complex and time-consuming relative to machine learning pixel classification, is considerably more powerful [56]. Deep learning should, therefore, be considered over pixel classification for more complex, field-based classifications if the size of the dataset and the desired accuracy of the outputs outweigh the effort of training. Pixel classification using random forests was chosen for this pipeline over other supervised ML techniques due to its ability to more accurately quantify the size of objects and its lower variability under different background surroundings and lighting [54]. Finally, deep learning often depends on large amounts of training data, while coral juveniles are often available in low numbers and lack homogeneous structures, textures, and shapes. The training that deep learning methods would require for each timepoint of these experiments is therefore not feasible.
Here we present a novel ML pipeline for the high-throughput data acquisition of coral juveniles from images that is accessible to coral restoration practitioners with little training (Figure 1). This tool uses the open-source pixel classification software Ilastik and a custom-made script for use in Fiji ImageJ to rapidly and accurately measure three coral traits: survival, growth, and color. We use both field- and laboratory-based images of small coral juveniles (less than one year old) to compare the three measurements of fitness between the ML pipeline and human "ground-truthed" measurements. We also assess the trade-off between training time and the accuracy of the model's measurements, and provide guidelines for practitioners on how image acquisition, processing, quality, substrate type, and the number of training images affect the success of image analysis using ML.

2. Methods

2.1. Datasets and Manual Measurements of Juveniles Using ImageJ

The machine learning (ML) image analysis pipeline was assessed using coral juveniles settled onto two types of substrates: (1) field-deployed terracotta tiles (hereafter 'tile images') and (2) laboratory-maintained PVC plastic slides (hereafter 'slide images'). Both substrates held coral juveniles of the species Acropora tenuis in the same age range (within the first year of life). For the two coral juvenile experiments used here, there were thousands of tiles and slides from several time points. Specifically, there were 1200 tile images from three different time points in the field, and 900 slides with images taken at 12 time points in the laboratory. Terracotta tile images were obtained from a year-long field study conducted at Davies Reef on the Great Barrier Reef (a site description and map can be found in Quigley et al. [19]). Images were taken at three time points using a Nikon D810 camera body with a Nikon AF-S 60 mm micro lens (image resolution: 7360 × 4912 pixels) and an Olympus Tough TG-5 camera with Ikelite DS160 strobes (image resolution: 4000 × 3000 pixels). PVC plastic slide images with settled juveniles were taken with a Nikon D810 camera body and Nikon AF-S 60 mm micro lens (image resolution: 7360 × 4912 pixels), as per the first time point of the tile field-based study. For both sets of images, "ground-truthed" juvenile measurements were taken manually. The area of each individual juvenile was measured using the polygon selection tool in the ImageJ software [57]. Size was calibrated using the scale bar present in each image. The color of juveniles was assessed using the CoralWatch Health Chart [18] and was matched to the closest score on the "D" scale by a single person to minimise observer bias. Survival of juveniles was classified by eye as either alive or dead.
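To make the calibration step concrete, the sketch below (with hypothetical scale-bar and polygon values, not measurements from this study) shows how a pixel area from ImageJ's polygon tool converts to mm² via the scale bar:

```python
# Minimal sketch of scale-bar calibration; all numeric values are assumed.
scale_bar_mm = 10.0   # known physical length of the scale bar
scale_bar_px = 812.0  # measured pixel length of the same bar in the image
mm_per_px = scale_bar_mm / scale_bar_px

area_px = 15432.0     # polygon area reported by ImageJ, in pixels
area_mm2 = area_px * mm_per_px ** 2
print(f"juvenile area: {area_mm2:.3f} mm^2")  # ~2.340 mm^2
```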

2.2. Machine Learning Image Analysis Pipeline

2.2.1. Training Image Production

For the tile images, the initial training set consisted of images of five alive juveniles, five dead juveniles, five dying juveniles, and five images of miscellaneous substrate and debris, including crustose coralline algae (CCA), snails, and macroalgae. Additional training images were produced by dividing the complete tile images into multiple images. Training images containing coral juveniles were grouped and chosen at random. The slides in the slide images feature four wells containing settled coral juveniles, with the number of juveniles varying between zero and six per well. In this analysis, the number of slides was considered rather than the number of single juveniles, owing to the gregarious settlement of juveniles and the formation of groups of settled juveniles. The initial training set for the slide images consisted of five images of wells containing alive juveniles and five images of other parts of the slides, including the number engraved on each slide for unique identification and the spaces between the slides in each slide rack. Additional training images included one image of a set of five slides; each such image was counted as five training images (see Supplementary Figure S1A). All images of juveniles from the tiles were first cropped and batch processed. Cropping the images into sets was found to be more accurate than using the whole tile image, owing to the large number of false-positive occurrences caused by the complex and variable substrate and the lower resolution of the images.
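As an illustration of the cropping step, a batch routine along the following lines could divide each full tile image into sub-images; the grid size, file paths, and formats are assumptions rather than the authors' exact settings:

```python
# Hedged sketch of batch cropping: split each tile image into a grid of crops.
from pathlib import Path
from PIL import Image

def crop_to_grid(image_path: Path, out_dir: Path, rows: int = 4, cols: int = 6) -> None:
    """Save a rows x cols grid of crops from one tile image."""
    img = Image.open(image_path)
    w, h = img.size
    tile_w, tile_h = w // cols, h // rows
    out_dir.mkdir(parents=True, exist_ok=True)
    for r in range(rows):
        for c in range(cols):
            box = (c * tile_w, r * tile_h, (c + 1) * tile_w, (r + 1) * tile_h)
            img.crop(box).save(out_dir / f"{image_path.stem}_r{r}_c{c}.png")

for path in Path("tile_images").glob("*.jpg"):  # assumed input directory
    crop_to_grid(path, Path("training_crops"))
```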

2.2.2. Measurements of Juveniles Using the Ilastik Pipeline

Ilastik [50,58] is an open-source pixel classification tool used to produce simple semantic segmentations of images (Figure 2). It performs classification using a random forest classifier with 100 trees, chosen for its good 'generalisation' performance, robustness with small sample sizes, and reproducibility of results [59,60,61]. Pixel features and their scales are chosen to discriminate between the different classes of pixels. Ilastik gives the option to choose features that use color/intensity, edge (brightness or color gradient), and texture to discern objects. Each feature can be selected at different scales, corresponding to the sigma of the Gaussian, and the classifier's predictions are evaluated against the user's annotations. Filters with higher sigma values pull information from larger neighbourhoods. Initially, all features were selected to give the model the highest power, as recommended [58]. During project testing, the 'suggested features' tool within Ilastik was used to select the features most effective at discriminating between pixels of different classes. After comparing project sensitivity ('all features' sensitivity = 0.95, 'suggested features' sensitivity = 0.93) and the average percentage difference in area ('all features' area difference = 7.3%, 'suggested features' area difference = 8.6%), we chose to use all features. This is recommended by the Ilastik developers [50] and demonstrates high accuracy [51]. Sensitivity is defined as:
\[ \text{Sensitivity} = \frac{tp}{tp + fp} \]
where tp = True Positive and fp = False Positive. Image filter outputs include pixel color, intensity, edge, and texture. The features used were: Gaussian Smoothing (σ = 0.3, 0.7, 1.0, 1.6, 3.5, 5.0, 10.0 in 2D for R, G, and B), Laplacian of Gaussian (σ = 0.7, 1.0, 1.6, 3.5, 5.0, 10.0 in 2D for R, G, and B), Gaussian Gradient Magnitude (σ = 0.7, 1.0, 1.6, 3.5, 5.0, 10.0 in 2D for R, G, and B), Difference of Gaussians (σ = 0.7, 1.0, 1.6, 3.5, 5.0, 10.0 in 2D for R, G, and B), Structure Tensor Eigenvalues (σ = 0.7, 1.0, 1.6, 3.5, 5.0, 10.0 in 2D for R, G, and B), and Hessian of Gaussian Eigenvalues (σ = 0.7, 1.0, 1.6, 3.5, 5.0, 10.0 in 2D for R, G, and B). The exported segmentation masks were used to compute the coral health parameters using custom, open-source code (https://github.com/LaserKate/Coral_Ilastik_Pipeline; accessed on 7 August 2021) in Fiji ImageJ (version 1.53h). Outputs from this code include the survival count of juveniles per image, juvenile area, and their Hue, Saturation, and Lightness (HSL) values. These data were then exported to a '.csv' file, which can be directly loaded into other open-source software such as R (see Supplementary Figure S2 and Supplementary Information S1).
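For readers who want a concrete picture of these two stages, the following sketch approximates them in Python rather than the Ilastik/Fiji tools actually used: a 100-tree random forest is trained on sparsely labeled pixels described by a Gaussian filter bank, and the resulting mask is reduced to a survival count, calibrated areas, and a mean HSL value. The filenames, label convention, reduced feature set, and calibration constant are illustrative assumptions, not the published configuration:

```python
# Sketch of (1) random-forest pixel classification on Gaussian features and
# (2) measurement of count, area, and mean HSL from the segmentation mask.
import colorsys
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_gradient_magnitude
from skimage import io, measure
from sklearn.ensemble import RandomForestClassifier

SIGMAS = (0.7, 1.0, 1.6, 3.5, 5.0, 10.0)  # a subset of the scales listed above

def pixel_features(rgb: np.ndarray) -> np.ndarray:
    """Per-channel Gaussian smoothing and gradient magnitude at several scales."""
    channels = [rgb[..., i].astype(float) for i in range(3)]
    feats = list(channels)
    for s in SIGMAS:
        for ch in channels:
            feats.append(gaussian_filter(ch, s))
            feats.append(gaussian_gradient_magnitude(ch, s))
    return np.stack(feats, axis=-1).reshape(-1, len(feats))

# Train on sparse brush-stroke labels (0 = unlabeled, 1 = juvenile, 2 = other).
train_rgb = io.imread("training_crop.png")[..., :3]
labels = io.imread("training_labels.png").ravel()
X = pixel_features(train_rgb)
clf = RandomForestClassifier(n_estimators=100)  # 100 trees, as in Ilastik
clf.fit(X[labels > 0], labels[labels > 0])

# Segment a test image and measure each connected "juvenile" region.
test_rgb = io.imread("test_crop.png")[..., :3]
mask = clf.predict(pixel_features(test_rgb)).reshape(test_rgb.shape[:2]) == 1
regions = measure.regionprops(measure.label(mask))
MM_PER_PX = 0.01  # assumed scale-bar calibration
print("survival count:", len(regions))
print("areas (mm^2):", [r.area * MM_PER_PX ** 2 for r in regions])
h, l, s = colorsys.rgb_to_hls(*(test_rgb[mask].mean(axis=0) / 255.0))
print("mean HSL:", (h * 360, s * 100, l * 100))
```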

2.3. Model Validation; Assessment of Pipeline Accuracy and Speed

Ilastik projects were produced using various numbers of training images for both the tile and slide substrates, to test how many training images are needed to give the most accurate outputs. For the tile images, the randomly selected test set comprised 111 March images, 81 June images, and 95 October images. A total of 275 test images of juveniles were randomly selected for the slide images. There was no overlap between training and test data, and no areas in the test images were seen by Ilastik during training. As juvenile appearance varies between species, separate projects would have to be trained for each species, especially if a different substrate were used. The accuracy of the survival counts, area, and HSL measurements was compared between projects, and the mean differences ± standard deviations between the manual method and the machine learning pipeline were calculated.
The number of juveniles counted by the pipeline was automatically recorded to an Excel file, giving a 'survival count' of juveniles in each image. This count was then compared with the ground-truth manual survival counts from the same images. Regions of interest (ROIs) were visually (qualitatively) assessed against manual assessments to calculate false positive and false negative rates. For example, at times multiple juveniles were classified as a single juvenile due to their close proximity to each other (false positives). The accuracy of juvenile size measurements was assessed by comparing the pipeline ROI areas in the output Excel file with the ground-truth manual measurements made in ImageJ. Color accuracy was also assessed by comparing manual and pipeline assessments against the CoralWatch Health Chart "D" scale [18]. Manual assessments were made for every juvenile in the test images, and each was given a visual score according to the health chart. The output .csv file from the pipeline contained an average HSL value for all the juveniles counted; these values were directly compared with the HSL values of the CoralWatch Chart color scores. The time to perform the manual and pipeline measurements of survival, size, and color was quantified for both tile and slide images, including the data handling time for each step.
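The chart comparison can be pictured as a nearest-neighbour lookup in HSL space, as in the sketch below; the chart values shown are placeholders rather than the true CoralWatch colors, which would need to be measured from a photographed chart:

```python
# Hedged sketch: match a juvenile's mean HSL to the closest "D" score.
import math

CHART_HSL = {  # hypothetical (hue deg, saturation %, lightness %) per score
    "D1": (35, 30, 85), "D2": (35, 40, 70), "D3": (35, 50, 55),
    "D4": (35, 60, 45), "D5": (35, 70, 35), "D6": (35, 80, 25),
}

def nearest_chart_score(hsl: tuple) -> str:
    """Return the chart score whose HSL triple is closest in Euclidean distance."""
    return min(CHART_HSL, key=lambda score: math.dist(CHART_HSL[score], hsl))

print(nearest_chart_score((34, 52, 54)))  # -> "D3" with these placeholders
```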

2.4. Statistical Analysis

To evaluate the difference in juvenile counts between methods, the “correct” values (defined as those taken from the manually derived measurements of images) were compared to Ilastik outputs for each test image. True positives were measurements that were identified in both the manual and pipeline assessments. False positives were juveniles counted in the pipeline but not manually and false negatives were juveniles that were counted manually but not in the pipeline. From this, an F-score (F1) was calculated for each project using the equation [62]:
\[ F_1 = \frac{2 \times PPV \times TPR}{PPV + TPR} = \frac{2TP}{2TP + FP + FN} \]
Abbreviations are as follows: PPV = positive predictive value, TPR = true positive rate, FNR = false negative rate, TP = true positives, FP = false positives, and FN = false negatives. Survival means and standard deviations were calculated using R (version 4.0.3; [63]). The packages 'rstatix' and 'broom' were used for the area analysis [64,65]. For the tile image size and color analyses, a two-way ANCOVA and pairwise comparisons were performed to examine the effects of the number of training images on area, with the manual measurements set as the covariate. A Bonferroni adjustment was applied using the package 'emmeans' [66]. For the analysis of slide image areas and color, a one-way ANCOVA and pairwise comparisons were performed to examine the effect of the number of training images on measurement accuracy.
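As a sketch of these statistics, the snippet below computes the F-score from TP/FP/FN counts and fits an ANCOVA of pipeline area on training-set size with the manual measurement as covariate; it uses Python's statsmodels in place of the R packages used in the study, and the column names are assumptions:

```python
# Sketch of the validation statistics: F-score plus a one-way ANCOVA.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts, matching the second form of the equation above."""
    return 2 * tp / (2 * tp + fp + fn)

print(f1_score(tp=260, fp=13, fn=1))  # ~0.974, on the scale of the best slide project

# Assumed columns: pipeline_area, manual_area (covariate), n_training (factor).
df = pd.read_csv("area_measurements.csv")
model = smf.ols("pipeline_area ~ manual_area + C(n_training)", data=df).fit()
print(anova_lm(model, typ=2))
```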

3. Results

3.1. Assessment of Manual Versus Pipeline Calculations of Coral Juvenile Survival

The optimal number of training images for measuring survival on the tiles was consistent across timepoints, although accuracy varied with the month from which the training images were drawn (Figure 3A). The optimal number of training images was 100 for all months (March, June, and October), giving True Positive Rate (TPR) values of 77.5%, 91.3%, and 97.9%, respectively, when trained using images from the same time point (Figure 3B). The number of juveniles counted as alive was consistently low (<25%) if the project was trained using images from a different time point.
When the slide images were input into the pipeline with five training images, an average TPR of 94.6% was obtained, with a 0.89% FNR and a 7.3% FPR in comparison to the ground-truth counts (Figure 3C,D). TPRs and FNRs did not drop to less than 0.9% and 0.5%, respectively, when training with more than five images. However, the lowest FPR was recorded with 40 training images (4.9%; Figure 3D). The number of training images with the highest F-score was 40 images (F1 = 0.973), with the initial training set having the lowest (F1 = 0.756). Training the initial set with just five additional slide images increased the F-score to 0.958.

3.2. Assessment of Manual vs. Pipeline Calculations of Coral Juvenile Size

The correlation between ground-truth manual measurements of coral juvenile area and the measurements of the segmentation masks made by the pipeline on the tile images varied by month and by the number of images used in training (Figure 4A). For the March data, the regression with the best fit was found when training with 150 images (R² = 0.634, p = 1.76 × 10⁻¹⁴), although training with 60 images showed a similar fit (R² = 0.614, p = 6.84 × 10⁻⁸). The best-fit regression for the June data was found when training with 40 images (R² = 0.936, p = 8.51 × 10⁻¹⁵), whilst the October data were best fit when trained with 60 images (R² = 0.867, p = 5.76 × 10⁻¹⁰; Figure 4A). The regressions indicate more accurate measurement by the pipeline at the later timepoints (i.e., June and October), and therefore higher measurement accuracy for older corals. The area was consistently underpredicted (~30%) by the pipeline when trained with 20 or more images (Figure 4B). Beyond 40 training images, the difference between manual and pipeline measurements did not vary considerably, with the mean difference in area around 27.9% ± 23.7 for 40 images and 29.1% ± 19.6 for 150 images (Supplementary Table S1). Pairwise comparisons for June and October showed a statistically significant difference between the initial training set and all larger numbers of training images (Supplementary Table S2; all p < 0.05). There was no statistically significant difference between any other numbers of training images. For March, the pairwise comparisons showed that all analyses with more than 40 training images were not significantly different (Supplementary Table S2).
For the slide images, no significant difference was seen when training with different numbers of images (F = 1.914, p = 0.054). The mean difference in the size of juveniles settled onto slides when using just the initial training images was an average underestimate of −0.11% ± 116.75. When using five more images to train, the mean underestimate was −10.84% ± 15.91, substantially reducing the standard deviation and the median range of values (Figure 4C), although the pipeline mean was farther from the mean of the manual measurements. The smallest mean underestimate was measured when training with 40 images (−3.47% ± 19.35).

3.3. Assessment of Manual vs. Pipeline Calculations of Coral Juvenile Color

Coral color is used as a health proxy for corals, indicating the relative abundance of symbiont cells (Symbiodiniaceae) inside their tissues [18]: "D1" scores represent low symbiont densities (pale, bleached corals), and "D6" scores represent tissues with high symbiont densities (Figure 5A). Pairwise comparisons for the slide images showed significant differences in over- and under-prediction of HSL values when comparing the initial training set with all other numbers of training images (Figure 5B; Supplementary Tables S3–S5; all p < 0.05). However, there were no significant differences in over- or under-prediction of HSL values when five or more training images were used, indicating that the accuracy of color assessment does not significantly improve beyond five training images.
In the tile images, the hue and lightness measurements did not significantly differ when using more training images for all three time-points (hue: F = 0.989, p = 0.462, lightness: F = 0.675, p = 0.800). However, pairwise comparisons between manual and pipeline measurements showed that saturation, when using the initial training set, significantly under-predicted juvenile values compared to predictions from all other numbers of training images (Figure 5C; Supplementary Table S4).

3.4. Assessment of Time Saving for the Measurement of Coral Juvenile Survival, Size, and Color in Manual vs. the Pipeline Measurements

On average, it took ~720 h to manually measure 1200 tiles, compared to between ~115 and 215 h using the pipeline (depending on the number of training images used). This is equivalent to 6.2× faster per time point when using 20 training images, and 5× faster when training with 60 images (Figure 6A). The time to process slides was also significantly shorter using the pipeline compared to manual measurements (Figure 6B). Time efficiency increased drastically for larger datasets, with the analysis of 900 slides being 36× faster than manual measurements per time point using five training images, and still 4× faster when a large number of images (50) was used for training. The threshold at which manual measurement becomes slower than pipeline processing occurs at >250 tiles (60 training images) and >30 slides (5 training images; Figure 6A,B).
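The break-even thresholds can be reproduced with a simple model in which pipeline time is a fixed training overhead plus a small per-image cost while manual time scales linearly; the constants below are illustrative assumptions chosen to match the ~250-tile threshold, not the measured timings:

```python
# Back-of-envelope sketch of the break-even point; constants are assumed.
import math

def break_even(train_h: float, pipe_min_per_img: float, manual_min_per_img: float) -> int:
    """Smallest n where training + pipeline processing beats manual effort."""
    return math.ceil(train_h * 60 / (manual_min_per_img - pipe_min_per_img))

print(break_even(train_h=125, pipe_min_per_img=6.0, manual_min_per_img=36.0))  # -> 250
```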

4. Discussion

As coral reef ecosystems face continuing stress and degradation from persistent ocean warming and other anthropogenic pressures [6], conservation interventions are increasingly being considered to restore reefs and improve their resilience to stress. This includes understanding processes that promote the potential for coral recovery and resilience. While a global effort to reduce carbon dioxide emissions should be central, large investments are being made to design coral restoration interventions that can be used to mitigate degradation [67]. Some of these interventions include the seeding of reefs with enhanced, heat-tolerant reef-building coral species using assisted gene-flow methods to prepare them for future warming [19,68]. This may involve the seeding of hundreds of thousands of individuals, necessitating tools that can rapidly count and assess corals at large scales. Open-source tools are therefore critical and should be made available to aid in the development and vetting of these proposed management strategies [25]. We present a pipeline for coral juvenile analysis targeted at three key coral fitness traits relevant for ecological and restoration purposes that performs up to 36× faster with minimal differences in accuracy, especially for survival and size. With current methods of manual measurement using tools such as ImageJ, the analysis of coral juveniles can take many months [19]. This pipeline therefore has the potential to accelerate the turnover of results between experiments and to scale up experimental size through a user-friendly, free, and reproducible interface. Once training is completed, the pipeline can batch process images with little further human input, thereby allowing the scale of experimental analysis to be expanded with little increase in analysis time.
This pipeline represents an open-source interactive learning and segmentation toolkit. It leverages machine learning (ML) algorithms to easily segment, classify, track, and count cells and other experimental data in an interactively supervised format, removing the need to understand the complex algorithms underpinning them [50,58,69]. This allows multiple user groups to take advantage of its capabilities. For example, the interface cues the user to draw label annotations onto images, allowing the program to classify pixels using a powerful non-linear algorithm. This creates a decision surface in feature space, generates an accurate classification, and projects the class assignments onto the original image. Other deep learning methods of image analysis are extremely powerful and show great success, but rely on large amounts of training data to build a high-dimensional feature space [70], compared to the small training set needed here for algorithm parameterisation through user supervision [50,58]. Additionally, this pipeline offers a limited number of pre-defined features that are powerful enough to detect image features that the human eye cannot [29], which is relevant for small, cryptic, and taxonomically amorphous coral juveniles.

4.1. Overall Assessment of the Pipeline Performance in Accurately and Rapidly Measuring Survival, Size, and Color

The assessment of coral juvenile survival on the tile images was challenging given the diversity of organisms that recruited onto the tiles in the field and the changing community composition through time. Nevertheless, this pipeline successfully detected and quantified surviving juveniles (77.5–97.9% detection with 100 training images) when trained with juvenile images from the same time point. The higher detection accuracy at later timepoints may also have been due to changing coral morphology over time, as the juveniles grew and developed increasing branch complexity. This also shows the importance of training with images from the corresponding time point when assessing measurements taken in the field, given the successional changes in flora and fauna and in environmental parameters such as temperature and light intensity. It also highlights the importance of minimizing the number of arbitrary objects in images that could be difficult for both the pipeline and the user to classify. Low accuracy when using images from different timepoints further suggests that this pipeline would perform poorly if assessing images of a substrate not seen in training. Moreover, image quality is important for the same reason. While training, the pipeline creates uncertainty masks that can be retrained to give more accurate segmentation masks; however, low-resolution images lead to high uncertainty in these masks. This is especially difficult if the user is also uncertain of the classification of specific areas of an image due to low resolution. Finally, although high-resolution images will give a more accurate analysis of the juveniles, they will increase the time needed for the pipeline to compute pixel classifications for the training images. In contrast, the slides used in this study were specifically designed to provide a high-contrast (grey matte PVC versus brown coral tissue), low-complexity surface for rapid image classification. The slide images were therefore devoid of other objects and had fewer objects to classify. This allowed accurate training with fewer images, with an almost 100% accurate assessment of survival using as few as five training images.
Juvenile size analysis for the tile images required 40 training images to give the most accurate assessment of size compared with manual measurements. Although the pipeline significantly and consistently underestimated the size of juveniles on the complex surfaces of the tiles (regardless of the number of training images), by the last time point, at which juveniles were larger, the underestimation was less than 10% (compared to an initial ~40%) and was consistent, suggesting that a calibration factor could be applied if needed. This further highlights the importance of image quality. When the boundary between a juvenile and the substrate was indistinguishable, it was difficult for both the human user and the ML algorithm to accurately identify and then train to classify pixels near the boundary. Interestingly, when juveniles are very small and the substrate is complex, this raises the question of whether the manual measurement or the ML measurement is the correct one. In contrast, where substrate complexity was kept low by design on the slides, the size comparisons between the manual and pipeline measurements were almost indistinguishable (median ~0% difference), especially with more than five training images. It is important to note that the slide images still exhibited outliers with extreme differences in area (seen in the median interquartile ranges for different numbers of training images). These often occurred when juveniles had settled in close proximity to one another and the pipeline classified the slide pixels between the juveniles as "juvenile" pixels, joining several juveniles together. However, these cases were rare when using high-resolution images and can easily be excluded during the qualitative visual analysis of the output images.
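A calibration of this kind could be as simple as dividing each exported area by one minus the mean fractional underestimate, as sketched below with an assumed 10% bias:

```python
# Toy sketch of a constant calibration factor; the 10% bias is an assumption
# based on the late-timepoint result above, not a published constant.
UNDERESTIMATE = 0.10  # mean fractional underestimate from a validation set

def calibrate_area(pipeline_area_mm2: float) -> float:
    """Correct a pipeline area for a known, consistent underestimation."""
    return pipeline_area_mm2 / (1.0 - UNDERESTIMATE)

print(calibrate_area(4.5))  # 4.5 mm^2 raw -> 5.0 mm^2 calibrated
```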
The assessment of coral color was more complex, given the three measurements used to quantitatively describe it. However, hue, saturation, and lightness (HSL) did not change when using more than five training images. The percentage difference in HSL between the manual and pipeline measurements when using just the initial training set, compared to using five or more images, is likely due to the pipeline classifying pixels differently at the borders between juvenile and tile, meaning that the pipeline calculates an average HSL value over a different number of pixels. A more accurate prediction of the object border and area reduces the influence of false-positive pixels on the color classification, creating more accurate HSL values that can be further improved with subsequent testing. Interestingly, the final October time point showed more variability in hue than March and June, indicating higher variability in hue in later life stages of coral development, potentially due to the colonization of symbionts throughout new growth regions of the coral. Even though the HSL scores were consistent between the numbers of training images used in most instances, these values differed from the HSL scores on the coral health chart; for example, hue values were higher, while saturation and lightness values were lower. The coral health chart has been extremely important in standardizing the analysis of coral stress and bleaching and is a key tool for making standardized in situ assessments [18]. Its simplicity means that it can be used in citizen science initiatives, giving consistent comparisons, and it will remain a key tool in bleaching assessment. However, the method has some limitations. Coral color is extremely diverse and may not fall under one of the four scales provided on the coral health chart. Thus, this pipeline gives additional options in the assessment of hue, allowing the user to standardize to symbiont type, more accurately predict symbiont density, and potentially include other measurements such as fluorescence [71], thereby uncovering new traits of temporal color change between life stages that the color chart may not reveal. Finally, while saturation and lightness can be assessed using the coral chart scale, the output given by this pipeline is a mean value across the entire coral juvenile. This potentially reduces the human bias in bleaching scores, given that color is often patchily distributed across a coral colony. When a coral juvenile first takes up symbionts, they are often ingested through the mouth and displayed first in the tentacles; the coenosarc (the connecting tissue between corallites) frequently does not display symbionts until the polyp has dense aggregations of algal cells. This variation in cell density between polyp and coenosarc gives the coral a non-uniform color, can make it difficult to score using the coral health chart [72], and introduces additional human bias. The proposed pipeline therefore has the potential to eliminate this bias by taking a mean value across the full surface of the individual.

4.2. Recommendations for Training Parameters for Pipeline Users

While the pipeline successfully analyzed coral juvenile survival, size, and color, there are factors to consider when preparing images. For example, different image types require different numbers of training images to produce the most accurate outputs that minimize the difference between manual and ML measurements. We advise users to train this pipeline with several numbers of training images until the difference between the output images and the test set images is minimized. For the substrates used here, this was generally between 5 and 100 training images, depending on the trait. As coral juveniles grow, their appearance will also change, impacting accuracy. If several time points are being analyzed with little visual change to the juveniles, the same pipeline and training images may be used. However, if there is substantial time between time points, especially in field-based studies, it is advised that a test set of images from the corresponding time point be analyzed.
One of the key benefits of this pipeline is the ability to increase the number of coral juveniles analyzed with little increase in analysis time. Once trained, the pipeline can be left to analyze a large number of images without supervision. The results from this analysis show that when training with 60 images (the number that gave accurate results for survival, size, and color on the tiles), the pipeline is nearly 5× faster than manual measurements, even if retraining is needed for each time point. The slide analysis, when training with five images (the number that gave accurate results for survival, size, and color on the slides), is 36× faster than manual measurements if retraining is needed for each time point. If retraining is not needed for every time point, analysis is even faster. The overlap between manual and pipeline estimates clearly demonstrates the pipeline's efficiency.
Alternatively, if an experiment has only a small number of juveniles to analyze (150 slides or 200 tiles), the training time may outweigh the time it takes to manually measure the juveniles. Training time will also change with more complex substrates. This is relevant as the development of artificial substrates for sexually produced restoration corals using biomimicry of CCA is being discussed as a way to increase settlement success [15]. The two substrates used here were terracotta tiles and PVC plastic slides. The tiles are ideal for coral recruit settlement given their rough texture and crevices, which create areas for larvae to attach to and escape predation. However, this also allows other organisms to attach easily, creating a more difficult image to train on than the smooth, matte, and uniform PVC, on which less accumulation of algae and other organisms can occur. The smooth surface may, however, also hinder the ability of certain coral species to attach during settlement. Therefore, the most suitable substrate would be both visually uniform and have sufficient micro-texture for attachment.
Image resolution was also extremely important for analysis accuracy as image ambiguity increases training time and produces uncertainty for both the trainer and the pipeline. Images should be clear with distinct boundaries in order to provide accurate training classifications, thereby generating accurate outputs. If too many of the same objects are trained on, overtraining can occur [69]. Overtraining can then lead to overfitting, which arises when a classifier fits the training data too “tightly”, leading to unwanted behaviour from the ML algorithm. It can cause the algorithm to perform well on the training data, but not on independent test data. We recommend testing for this by comparing the test data outputs using several numbers of training images, as we have demonstrated here. Once accuracy starts to decrease, then the algorithm is overfitting, and the training number should be reduced.
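This check can be automated by sweeping training-set sizes against a fixed test set, as in the sketch below; `evaluate_project` is a hypothetical stand-in for running a trained Ilastik project over the test images, and the accuracy values shown are illustrative:

```python
# Sketch of the recommended overtraining check: stop adding training images
# once accuracy on the same held-out test set begins to fall.
def pick_training_size(sizes, evaluate_project):
    """Return the largest training-set size before test accuracy declines."""
    best_size, best_acc = sizes[0], evaluate_project(sizes[0])
    for n in sizes[1:]:
        acc = evaluate_project(n)
        if acc < best_acc:  # accuracy decreasing: likely overfitting
            break
        best_size, best_acc = n, acc
    return best_size, best_acc

accs = {5: 0.76, 10: 0.90, 20: 0.93, 40: 0.97, 60: 0.96, 100: 0.96, 150: 0.95}
print(pick_training_size(sorted(accs), accs.get))  # -> (40, 0.97)
```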

4.3. Future Directions in Tool Development for Coral Conservation

As new restoration intervention technologies are developed, further improvements will be made to conservation tools. For example, although this analysis used 2D images, the pipeline is also capable of processing 3D images [34]. This would deliver estimates of the size and color of 3D-scanned corals, further improving the biological information, throughput, time efficiency, accuracy, and ease of use of current 3D methods [34,73,74,75], and extend them to early life stages, which are especially challenging to capture for small coral juveniles (see Quigley et al. [19]). Understanding coral ontogeny across all life stages is essential, and multiple methods are needed to understand the growth form and structural complexity of corals. Although 2D images suit the analysis of coral recruits and young juveniles due to their flat form in the early months (for particular species), 3D analysis becomes essential as coral juveniles start to grow into more complex shapes [68,74]. Open-source tools that combine powerful software libraries for biological image analysis with easy functionality will be pivotal to their rapid application [27]. For example, the plugin designed here and project algorithms trained in other open-source programs can be combined in software such as Fiji, so that segmentation masks can be created and analyzed simultaneously.

5. Conclusions

The use of ML in coral juvenile analysis can provide researchers, managers, and conservation practitioners with free, open-source tools to improve the health of coral reefs by enhancing the understanding of factors influencing coral growth and survival. Our pipeline successfully analyzed coral juveniles on two laboratory- and field-based substrates and demonstrated an improvement in efficiency of up to 36×. This adaptable pipeline could be used for various organisms to assess diverse benthic communities. This information will help spur and refine conservation practices in the new era of large-scale and ambitious restoration projects globally.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/rs13163173/s1. Supplementary Figure S1: Example of a PVC plastic slide image showing the five slides in the holder (A), a collage of cropped images of terracotta tiles illustrating the variability of substrates in the images (B), and a close-up image of two coral juveniles in a well of a PVC slide (C); Supplementary Figure S2: Flow of training in Ilastik; Supplementary Information S1: Ilastik pipeline overview; Supplementary Table S1: Mean % and SE of area difference between ImageJ and Ilastik tiles; Supplementary Table S2: Pairwise comparison for tile area analysis; Supplementary Table S3: Pairwise comparison of slide hue; Supplementary Table S4: Pairwise comparison of slide saturation; Supplementary Table S5: Pairwise comparison of slide lightness.

Author Contributions

Conceptualization, K.Q., C.J.N.; methodology, All Authors; validation, All Authors; formal analysis, All Authors; investigations, All Authors; resources, All Authors; data curation, All Authors; writing—original draft preparation, A.M.; writing—review and editing, All Authors; visualisation, A.M.; supervision, K.Q., C.J.N.; project administration, K.Q.; funding acquisition, K.Q. All authors have read and agreed to the published version of the manuscript.

Funding

Great Barrier Reef Foundation and Australian Institute of Marine Science (AIMS).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The Fiji software is available at https://imagej.net/Fiji (accessed on 7 August 2021). The Ilastik software is available at https://www.ilastik.org (accessed on 7 August 2021). The coral photos and analysis code for use in Fiji and Ilastik are available at https://github.com/LaserKate/Coral_Ilastik_Pipeline (accessed on 7 August 2021).

Acknowledgments

We would like to acknowledge support from the Great Barrier Reef Foundation and the Australian Institute of Marine Science.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationship that could be construed as potential conflicts of interest.

References

1. Burke, L.; Reytar, K.; Spalding, M.; Perry, A. Reefs at Risk Revisited; World Resources Institute: Washington, DC, USA, 2011.
2. Hughes, T.P.; Barnes, M.L.; Bellwood, D.R.; Cinner, J.E.; Cumming, G.S.; Jackson, J.B.C.; Kleypas, J.; van de Leemput, I.A.; Lough, J.M.; Morrison, T.H.; et al. Coral reefs in the Anthropocene. Nature 2017, 546, 82–90.
3. De’ath, G.; Fabricius, K.E.; Sweatman, H.; Puotinen, M. The 27-year decline of coral cover on the Great Barrier Reef and its causes. Proc. Natl. Acad. Sci. USA 2012, 109, 17995–17999.
4. Pandolfi, J.M.; Bradbury, R.H.; Sala, E.; Hughes, T.P.; Bjorndal, K.A.; Cooke, R.G.; McArdle, D.; McClenachan, L.; Newman, M.J.; Paredes, G. Global trajectories of the long-term decline of coral reef ecosystems. Science 2003, 301, 955–958.
5. Wilkinson, C.R.; Souter, D. Status of Caribbean Coral Reefs after Bleaching and Hurricanes in 2005; NOAA: Washington, DC, USA, 2008.
6. Hughes, T.P.; Anderson, K.D.; Connolly, S.R.; Heron, S.F.; Kerry, J.T.; Lough, J.M.; Baird, A.H.; Baum, J.K.; Berumen, M.L.; Bridge, T.C. Spatial and temporal patterns of mass bleaching of corals in the Anthropocene. Science 2018, 359, 80–83.
7. Heron, S.F.; Maynard, J.A.; Van Hooidonk, R.; Eakin, C.M. Warming trends and bleaching stress of the world’s coral reefs 1985–2012. Sci. Rep. 2016, 6, 38402.
8. Emslie, M.J.; Bray, P.; Cheal, A.J.; Johns, K.A.; Osborne, K.; Sinclair-Taylor, T.; Thompson, C.A. Decades of monitoring have informed the stewardship and ecological understanding of Australia’s Great Barrier Reef. Biol. Conserv. 2020, 252, 108854.
9. Ateweberhan, M.; Feary, D.A.; Keshavmurthy, S.; Chen, A.; Schleyer, M.H.; Sheppard, C.R.C. Climate change impacts on coral reefs: Synergies with local effects, possibilities for acclimation, and management implications. Mar. Pollut. Bull. 2013, 74, 526–539.
10. Bay, R.A.; Palumbi, S.R. Rapid Acclimation Ability Mediated by Transcriptome Changes in Reef-Building Corals. Genome Biol. Evol. 2015, 7, 1602–1612.
11. Elder, H.; Weis, V.; Montalvo-Proano, J.; Mocellin, V.J.L.; Baird, A.H.; Meyer, E.; Bay, L.K. Genetic variation in heat tolerance of the coral Platygyra daedalea offers the potential for adaptation to ocean warming. bioRxiv 2020.
12. Anthony, K.; Bay, L.K.; Costanza, R.; Firn, J.; Gunn, J.; Harrison, P.; Heyward, A.; Lundgren, P.; Mead, D.; Moore, T.; et al. New interventions are needed to save coral reefs. Nat. Ecol. Evol. 2017, 1, 1420–1422.
13. Anthony, K.R.N.; Helmstedt, K.J.; Bay, L.K.; Fidelman, P.; Hussey, K.E.; Lundgren, P.; Mead, D.; McLeod, I.M.; Mumby, P.J.; Newlands, M.; et al. Interventions to help coral reefs under global change—A complex decision challenge. PLoS ONE 2020, 15, e0236399.
14. Board, O.S.; National Academies of Sciences, Engineering, and Medicine. A Decision Framework for Interventions to Increase the Persistence and Resilience of Coral Reefs; National Academies Press: Washington, DC, USA, 2019.
15. Randall, C.J.; Negri, A.P.; Quigley, K.M.; Foster, T.; Ricardo, G.F.; Webster, N.S.; Bay, L.K.; Harrison, P.L.; Babcock, R.C.; Heyward, A.J. Sexual production of corals for reef restoration in the Anthropocene. Mar. Ecol. Prog. Ser. 2020, 635, 203–232.
16. Madin, E.M.; Darling, E.S.; Hardt, M.J. Emerging technologies and coral reef conservation: Opportunities, challenges, and moving forward. Front. Mar. Sci. 2019, 6, 727.
17. Boström-Einarsson, L.; Babcock, R.C.; Bayraktarov, E.; Ceccarelli, D.; Cook, N.; Ferse, S.C.A.; Hancock, B.; Harrison, P.; Hein, M.; Shaver, E.; et al. Coral restoration—A systematic review of current methods, successes, failures and future directions. PLoS ONE 2020, 15, e0226631.
18. Siebeck, U.E.; Marshall, N.J.; Klüter, A.; Hoegh-Guldberg, O. Monitoring coral bleaching using a colour reference card. Coral Reefs 2006, 25, 453–460.
19. Quigley, K.M.; Marzonie, M.; Ramsby, B.; Abrego, D.; Milton, G.; van Oppen, M.J.; Bay, L.K. Variability in Fitness Trade-Offs Amongst Coral Juveniles With Mixed Genetic Backgrounds Held in the Wild. Front. Mar. Sci. 2021, 8, 161.
20. Randall, C.J.; Giuliano, C.; Heyward, A.J.; Negri, A.P. Enhancing coral survival on deployment devices with microrefugia. Front. Mar. Sci. 2021, 8.
21. Baird, A.; Guest, J. Spawning synchrony in scleractinian corals: Comment on Mangubhai & Harrison (2008). Mar. Ecol. Prog. Ser. 2009, 374, 301–304.
22. Quigley, K.M.; Baker, A.; Coffroth, M.; Willis, B.L.; van Oppen, M.J. Bleaching resistance and the role of algal endosymbionts. Coral Bleach. 2018, 233, 111–151.
23. Quigley, K.M.; Randall, C.J.; van Oppen, M.J.H.; Bay, L.K. Assessing the role of historical temperature regime and algal symbionts on the heat tolerance of coral juveniles. Biol. Open 2020, 9, bio047316.
24. Whitman, T.N.; Negri, A.P.; Bourne, D.G.; Randall, C.J. Settlement of larvae from four families of corals in response to a crustose coralline alga and its biochemical morphogens. Sci. Rep. 2020, 10, 16397.
25. Baums, I.B.; Baker, A.C.; Davies, S.W.; Grottoli, A.G.; Kenkel, C.D.; Kitchen, S.A.; Kuffner, I.B.; LaJeunesse, T.C.; Matz, M.V.; Miller, M.W.; et al. Considerations for maximizing the adaptive potential of restored coral populations in the western Atlantic. Ecol. Appl. 2019, 29, e01978.
26. Dang, M.; Nowell, C.; Nguyen, T.; Bach, L.; Sonne, C.; Nørregaard, R.; Stride, M.; Nowak, B. Characterisation and 3D structure of melanomacrophage centers in shorthorn sculpins (Myoxocephalus scorpius). Tissue Cell 2019, 57, 34–41.
27. Rao, M.K.; Rajamani, K.T.; Palanisamy, T.; Narayan, K.; Chinnadorai, R. Novel generalized workflow for cell counting. In Proceedings of the 2015 Third International Conference on Image Information Processing (ICIIP), Waknaghut, India, 21–24 December 2015; pp. 468–473.
28. Maco, B.; Holtmaat, A.; Cantoni, M.; Kreshuk, A.; Straehle, C.N.; Hamprecht, F.A.; Knott, G.W. Correlative In Vivo 2 Photon and Focused Ion Beam Scanning Electron Microscopy of Cortical Neurons. PLoS ONE 2013, 8, e57405.
29. Desmeules, P.; Hovington, H.; Nguilé-Makao, M.; Léger, C.; Caron, A.; Lacombe, L.; Fradet, Y.; Têtu, B.; Fradet, V. Comparison of digital image analysis and visual scoring of KI-67 in prostate cancer prognosis after prostatectomy. Diagn. Pathol. 2015, 10, 67.
30. Mascaro, J.; Asner, G.P.; Knapp, D.E.; Kennedy-Bowdoin, T.; Martin, R.E.; Anderson, C.; Higgins, M.; Chadwick, K.D. A tale of two “forests”: Random Forest machine learning aids tropical forest carbon mapping. PLoS ONE 2014, 9, e85993.
31. Znidersic, E.; Towsey, M.; Roy, W.K.; Darling, S.E.; Truskinger, A.; Roe, P.; Watson, D.M. Using visualization and machine learning methods to monitor low detectability species—The least bittern as a case study. Ecol. Inform. 2020, 55, 101014.
32. Ditria, E.M.; Connolly, R.M.; Jinks, E.L.; Lopez-Marcano, S. Annotated Video Footage for Automated Identification and Counting of Fish in Unconstrained Seagrass Habitats. Front. Mar. Sci. 2021, 8.
33. Lopez-Marcano, S.; L Jinks, E.; Buelow, C.A.; Brown, C.J.; Wang, D.; Kusy, B.; M Ditria, E.; Connolly, R.M. Automatic detection of fish and tracking of movement for ecology. Ecol. Evol. 2021, 11, 8254–8263.
34. Lefevre, J.G.; Koh, Y.W.H.; Wall, A.A.; Condon, N.D.; Stow, J.L.; Hamilton, N.A. LLAMA: A robust and scalable machine learning pipeline for analysis of cell surface projections in large scale 4D microscopy data. bioRxiv 2020.
35. Heimann, T.; Delingette, H. Model-based segmentation. In Biomedical Image Processing; Springer: Berlin/Heidelberg, Germany, 2011; pp. 279–303.
36. Crisci, C.; Ghattas, B.; Perera, G. A review of supervised machine learning algorithms and their applications to ecological data. Ecol. Model. 2012, 240, 113–122.
37. Beijbom, O.; Edmunds, P.J.; Kline, D.I.; Mitchell, B.G.; Kriegman, D. Automated annotation of coral reef survey images. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 1170–1177.
38. Beijbom, O.; Edmunds, P.J.; Roelfsema, C.; Smith, J.; Kline, D.I.; Neal, B.P.; Dunlap, M.J.; Moriarty, V.; Fan, T.-Y.; Tan, C.-J.; et al. Towards Automated Annotation of Benthic Survey Images: Variability of Human Experts and Operational Modes of Automation. PLoS ONE 2015, 10, e0130312.
39. Elawady, M. Sparse coral classification using deep convolutional neural networks. arXiv 2015, arXiv:1511.09067.
40. Gómez-Ríos, A.; Tabik, S.; Luengo, J.; Shihavuddin, A.; Krawczyk, B.; Herrera, F. Towards highly accurate coral texture images classification using deep convolutional neural networks and data augmentation. Expert Syst. Appl. 2019, 118, 315–328.
41. Mahmood, A.; Bennamoun, M.; An, S.; Sohel, F.A.; Boussaid, F.; Hovey, R.; Kendrick, G.A.; Fisher, R.B. Deep image representations for coral image classification. IEEE J. Ocean. Eng. 2018, 44, 121–131.
42. Nunes, J.A.C.; Cruz, I.C.; Nunes, A.; Pinheiro, H.T. Speeding up coral reef conservation with AI-aided automated image analysis. Nat. Mach. Intell. 2020, 2, 292.
43. Pizarro, O.; Rigby, P.; Johnson-Roberson, M.; Williams, S.B.; Colquhoun, J. Towards image-based marine habitat classification. In Proceedings of the OCEANS 2008, Quebec City, QC, Canada, 15–18 September 2008; pp. 1–7.
44. Raphael, A.; Dubinsky, Z.; Iluz, D.; Benichou, J.I.; Netanyahu, N.S. Deep neural network recognition of shallow water corals in the Gulf of Eilat (Aqaba). Sci. Rep. 2020, 10, 1–11.
45. Raphael, A.; Dubinsky, Z.; Iluz, D.; Netanyahu, N.S. Neural network recognition of marine benthos and corals. Diversity 2020, 12, 29.
46. Stokes, M.D.; Deane, G.B. Automated processing of coral reef benthic images. Limnol. Oceanogr. Methods 2009, 7, 157–168.
47. Williams, I.D.; Couch, C.S.; Beijbom, O.; Oliver, T.A.; Vargas-Angel, B.; Schumacher, B.D.; Brainard, R.E. Leveraging Automated Image Analysis Tools to Transform Our Capacity to Assess Status and Trends of Coral Reefs. Front. Mar. Sci. 2019, 6.
48. Johnson-Roberson, M.; Kumar, S.; Willams, S. Segmentation and Classification of Coral for Oceanographic Surveys: A Semi-Supervised Machine Learning Approach. In Proceedings of the OCEANS 2006—Asia Pacific, Singapore, 16–19 May 2006; pp. 1–6.
49. Yuval, M.; Alonso, I.; Eyal, G.; Tchernov, D.; Loya, Y.; Murillo, A.C.; Treibitz, T. Repeatable Semantic Reef-Mapping through Photogrammetry and Label-Augmentation. Remote Sens. 2021, 13, 659.
50. Sommer, C.; Straehle, C.; Köthe, U.; Hamprecht, F. Ilastik: Interactive learning and segmentation toolkit. In Proceedings of the 2011 IEEE International Symposium on Biological Imaging: From Nano to Macro, Chicago, IL, USA, 30 March–2 April 2011; pp. 230–233.
51. Baltissen, D.; Wollmann, T.; Gunkel, M.; Chung, I.; Erfle, H.; Rippe, K.; Rohr, K. Comparison of segmentation methods for tissue microscopy images of glioblastoma cells. In Proceedings of the 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), Washington, DC, USA, 4–7 April 2018; pp. 396–399.
52. Ongsulee, P. Artificial intelligence, machine learning and deep learning. In Proceedings of the 2017 15th International Conference on ICT and Knowledge Engineering (ICT&KE), Bangkok, Thailand, 22–24 November 2017; pp. 1–6.
53. Sherman, R.M.; Salzberg, S.L. Pan-genomics in the human genome era. Nat. Rev. Genet. 2020, 21, 243–254.
54. Ojeda-Martinez, D.; Martinez, M.; Diaz, I.; Santamaria, M.E. Saving time maintaining reliability: A new method for quantification of Tetranychus urticae damage in Arabidopsis whole rosettes. BMC Plant Biol. 2020, 20, 1–19.
55. Renard, F.; Guedria, S.; De Palma, N.; Vuillerme, N. Variability and reproducibility in deep learning for medical image segmentation. Sci. Rep. 2020, 10, 13724.
56. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
57. Rueden, C.T.; Eliceiri, K.W. ImageJ for the next generation of scientific image data. Microsc. Microanal. 2019, 25, 142–143.
58. Berg, S.; Kutra, D.; Kroeger, T.; Straehle, C.N.; Kausler, B.X.; Haubold, C.; Schiegg, M.; Ales, J.; Beier, T.; Rudy, M.; et al. ilastik: Interactive machine learning for (bio)image analysis. Nat. Methods 2019, 16, 1226–1232.
59. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
60. Fernández-Delgado, M.; Cernadas, E.; Barro, S.; Amorim, D. Do we need hundreds of classifiers to solve real world classification problems? J. Mach. Learn. Res. 2014, 15, 3133–3181.
61. Geurts, P.; Irrthum, A.; Wehenkel, L. Supervised learning with decision tree-based methods in computational and systems biology. Mol. Biosyst. 2009, 5, 1593–1605.
  62. Sasaki, Y. The Truth of the F-Measure. Available online: http://www.cs.odu.edu/~mukka/cs795sum09dm/Lecturenotes/Day3/F-measure-YS-26Oct07.pdf (accessed on 7 August 2021).
  63. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2020; Available online: https://www.r-project.org/ (accessed on 7 August 2021).
  64. Kassambara, A. Rstatix: Pipe-Friendly Framework for Basic Statistical Tests (0.7. 0) 687 [Computer Software]. 2021. Available online: https://cran.r-project.org/package=rstatix (accessed on 7 August 2021).
  65. Robinson, D. Broom: Convert Statistical Objects into Tidy Tibbles [Computer Software]. 2021. Available online: https://cran.r-project.org/package=broom (accessed on 7 August 2021).
  66. Lenth, R.; Singmann, H.; Love, J.; Buerkner, P.; Herve, M. Emmeans: Estimated Marginal Means, Aka Least-Squares Means [Computer Software]. 2018. Available online: https://cran.r-project.org/package=emmeans (accessed on 7 August 2021).
  67. Bay, L.; Rocker, M.; Boström-Einarsson, L.; Babcock, R.; Buerger, P.; Cleves, P.; Harrison, D.; Negri, A.; Quigley, K.; Randall, C. Reef Restoration and Adaptation Program: Intervention Technical Summary. In A Report Provided to the Australian Government by the Reef Restoration and Adaptation Program; Australian Institute of Marine Science (AIMS): City of Townsville, QLD, Australia, 2019. [Google Scholar]
  68. Quigley, K.M.; Alvarez Roa, C.; Torda, G.; Bourne, D.G.; Willis, B.L. Co-dynamics of Symbiodiniaceae and bacterial populations during the first year of symbiosis with Acropora tenuis juveniles. MicrobiologyOpen 2020, 9, e959. [Google Scholar] [CrossRef] [PubMed]
  69. Haubold, C.; Schiegg, M.; Kreshuk, A.; Berg, S.; Koethe, U.; Hamprecht, F.A. Segmenting and Tracking Multiple Dividing Targets Using ilastik. Adv. Anat. Embryol. Cell Biol. 2016, 219, 199–229. [Google Scholar] [CrossRef] [PubMed]
  70. Shpilman, A.; Boikiy, D.; Polyakova, M.; Kudenko, D.; Burakov, A.; Nadezhdina, E. Deep Learning of Cell Classification Using Microscope Images of Intracellular Microtubule Networks. In Proceedings of the 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), Cancun, Mexico, 18–21 December 2017; pp. 1–6. [Google Scholar] [CrossRef]
  71. Dizon, E.G.S.; Da-Anoy, J.P.; Roth, M.S.; Conaco, C. Fluorescent protein expression in temperature tolerant and susceptible reef-building corals. J. Mar. Biol. Assoc. UK 2021, 101, 71–80. [Google Scholar] [CrossRef]
  72. Chakravarti, L.J.; Beltran, V.H.; van Oppen, M.J.H. Rapid thermal adaptation in photosymbionts of reef-building corals. Glob. Chang. Biol. 2017, 23, 4675–4688. [Google Scholar] [CrossRef]
  73. Ferrari, R.; Figueira, W.F.; Pratchett, M.S.; Boube, T.; Adam, A.; Kobelkowsky-Vidrio, T.; Doo, S.S.; Atwood, T.B.; Byrne, M. 3D photogrammetry quantifies growth and external erosion of individual coral colonies and skeletons. Sci. Rep. 2017, 7, 1–9. [Google Scholar] [CrossRef]
  74. Koch, H.R.; Wallace, B.; DeMerlis, A.; Clark, A.S.; Nowicki, R.J. 3D scanning as a tool to measure growth rates of live coral microfragments used for coral reef restoration. Front. Mar. Sci. 2021. [Google Scholar] [CrossRef]
  75. Reichert, J.; Schellenberg, J.; Schubert, P.; Wilke, T. 3D scanning as a highly precise, reproducible, and minimally invasive method for surface area and volume measurements of scleractinian corals. Limnol. Oceanogr. Methods 2016, 14, 518–526. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Overview of the coral juvenile machine learning image analysis pipeline. The pipeline is divided into four sections, as outlined in the bottom right-hand corner of the figure: data preparation (yellow), ilastik workflow (green), Fiji ImageJ workflow (blue), and Excel workflow (red).
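Because the ilastik stage of the pipeline is applied to many images at once, it lends itself to scripted batch execution. The snippet below is a minimal sketch using ilastik's documented headless mode; the project file name, image folder, and output pattern are hypothetical stand-ins, and the flags should be checked against the ilastik version in use.

```python
# Minimal sketch: batch-export "Simple Segmentation" masks from a trained
# ilastik pixel-classification project. All paths and the project name
# are hypothetical placeholders, not values from the published pipeline.
import glob
import subprocess

inputs = sorted(glob.glob("tile_images/*.jpg"))  # raw juvenile photographs

subprocess.run(
    [
        "./run_ilastik.sh", "--headless",
        "--project=coral_juveniles.ilp",        # trained pixel classifier
        "--export_source=Simple Segmentation",  # grey-scale mask per class
        "--output_format=png",
        "--output_filename_format={dataset_dir}/{nickname}_seg.png",
        *inputs,
    ],
    check=True,  # raise an error if ilastik exits abnormally
)
```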
Figure 2. Infographic showing the evolution of images through the machine learning image analysis pipeline. The sequence includes the original image, training annotations, the resulting prediction layer, and the uncertainty layer, which flags areas that require further training. After training, the final segmentation layer shows the outline of the juvenile with precise boundaries. This is then converted into a simple segmentation mask, which assigns a grey-scale value to each classification label. The final image shows the juvenile after being labeled with a unique identifier and outlined to show the area measured.
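The labeling and area-measurement step in Figure 2 was performed in Fiji ImageJ; an equivalent operation can be sketched in Python with scikit-image, assuming the simple segmentation mask encodes the coral class as a single grey value and using a hypothetical pixel-to-millimetre calibration in place of the real one.

```python
# Minimal sketch: assign a unique identifier to each juvenile in a simple
# segmentation mask and report its area. The mask filename, class value,
# and calibration are hypothetical stand-ins.
import imageio.v3 as iio
from skimage import measure

mask = iio.imread("tile_01_seg.png")  # grey-scale simple segmentation mask
coral = mask == 1                     # assumed grey value of the coral class
px_per_mm = 40.0                      # assumed image calibration

labels = measure.label(coral)         # unique integer label per juvenile
for region in measure.regionprops(labels):
    area_mm2 = region.area / px_per_mm ** 2  # pixel count -> mm^2
    print(f"juvenile {region.label}: {area_mm2:.3f} mm^2")
```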
Figure 3. Assessment of the machine learning image analysis pipeline for coral juvenile survival. Image showing an example of a field-deployed tile with a juvenile highlighted in the inset; juveniles were cropped to use as training and test images (A). Percentage of juveniles correctly counted as alive (true positive rate, TPR) by the pipeline when using the tile images; facets indicate the time point (month) the test images were from, and color indicates the month the training images were from (B). Image showing three plastic PVC slides with settled coral juveniles in each well (C). Average percentage ± standard deviation of juveniles correctly counted as alive (true positives), false positives, and false negatives per slide for different numbers of training images (D).
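The survival scores in Figure 3 reduce to simple count arithmetic once the pipeline's detections are compared against manual annotation. A sketch follows; the counts in the example call are illustrative only, and the F-measure is computed as defined in [62].

```python
def survival_metrics(tp: int, fp: int, fn: int) -> dict:
    """Survival accuracy from counts of juveniles the pipeline scored as
    alive: tp = correct detections, fp = spurious detections, fn = misses."""
    tpr = tp / (tp + fn) if tp + fn else 0.0          # true positive rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * tpr / (precision + tpr)     # F-measure, as in [62]
          if precision + tpr else 0.0)
    return {"TPR": tpr, "precision": precision, "F1": f1}

print(survival_metrics(tp=88, fp=3, fn=5))  # illustrative counts only
```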
Figure 4. Assessment of the machine learning image analysis pipeline for coral juvenile size. Linear regressions comparing juvenile sizes from manual measurements and from the pipeline for the tile images; facets indicate the number of training images used, and color indicates the time point (month) the images were from (A). Boxplots showing the percentage change in area (mm2) between manual measurements and the pipeline using different numbers of training images for the tile images; facets and color indicate the time point (month) the images were from (B). Boxplots of the percentage change in area (mm2) between manual measurements and the pipeline for the slide images (C). Note that the y-axes in (B,C) have different scales. In (B,C), points below 0 indicate that the pipeline under-predicts juvenile size relative to the manual measurements; points above 0 indicate over-prediction.
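The percentage change plotted in Figure 4B,C follows directly from its description: a signed relative difference between pipeline and manual areas. A one-line sketch makes the sign convention explicit.

```python
def percent_area_change(pipeline_mm2: float, manual_mm2: float) -> float:
    """Signed percentage difference in area; negative values mean the
    pipeline under-predicts size relative to the manual measurement."""
    return 100.0 * (pipeline_mm2 - manual_mm2) / manual_mm2

print(percent_area_change(1.90, 2.00))  # -5.0, i.e., 5% under-prediction
```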
Figure 5. Assessment of the machine learning image analysis pipeline for coral juvenile color. Example of the coral color health chart and corresponding values for hue (H), saturation (S), and lightness (L) for the “D” scale [18] (A). Boxplots of the percentage change in hue (green), saturation (red), and lightness (blue) between manual and pipeline measurements for the slide (B) and tile images (C). Facets in (C) indicate the time points (months) the training images were from. Note that the y-axes in (B,C) have different scales. In (B,C), points below 0 indicate that the pipeline under-predicts the HSL value; points above 0 indicate over-prediction.
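Comparing juvenile color against the health chart in Figure 5A requires expressing pixels in hue-saturation-lightness space. Below is a minimal sketch using Python's standard colorsys module, assuming the juvenile's pixels have already been isolated by the segmentation; note that colorsys returns values in H, L, S order, and the example pixel values are hypothetical.

```python
# Minimal sketch: mean HSL of the pixels belonging to one juvenile,
# with RGB values scaled to [0, 1]. Example pixels are hypothetical.
import colorsys
import numpy as np

def mean_hsl(rgb_pixels):
    """Return (hue, saturation, lightness) of the mean RGB color of an
    N x 3 pixel array; colorsys.rgb_to_hls yields H, L, S order."""
    r, g, b = np.asarray(rgb_pixels, dtype=float).mean(axis=0)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return h, s, l

print(mean_hsl([[0.55, 0.35, 0.20], [0.60, 0.40, 0.25]]))
```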
Figure 6. Time comparison between manual and pipeline measurements. The time taken (hours) to process tiles (A) and slides (B) when training with different numbers of images (represented by different colors) versus when measuring manually. Manual measurements are indicated by the light blue line on both graphs.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
