Article

Predicting Plant Growth and Development Using Time-Series Images

1 College of Mechanical and Electronic Engineering, Shandong Agricultural University, Taian 271018, China
2 State Key Laboratory of Crop Biology, College of Life Sciences, Shandong Agricultural University, Taian 271018, China
3 School of Agricultural and Food Engineering, Shandong University of Technology, Zibo 255000, China
* Authors to whom correspondence should be addressed.
Agronomy 2022, 12(9), 2213; https://doi.org/10.3390/agronomy12092213
Submission received: 4 August 2022 / Revised: 1 September 2022 / Accepted: 13 September 2022 / Published: 16 September 2022

Abstract

Early prediction of the growth and development of plants is important for the intelligent breeding process, yet accurate prediction and simulation of plant phenotypes is difficult. In this work, a prediction model of plant growth and development based on spatiotemporal long short-term memory (ST-LSTM) and memory in memory network (MIM) was proposed to predict the image sequences of future growth and development including plant organs such as ears. A novel dataset of wheat growth and development was also compiled. The performance of the prediction model of plant growth and development was evaluated by calculating structural similarity index measure (SSIM), mean square error (MSE), and peak signal to noise ratio (PSNR) between the predicted and real plant images. Moreover, the optimal number of time steps and the optimal time interval between steps were determined for the proposed model on the wheat growth and development dataset. Under the optimal setting, the SSIM values surpassed 84% for all time steps. The mean of MSE values was 46.11 and the MSE values were below 68 for all time steps. The mean of PSNR values was 30.67. When the number of prediction steps was set to eight, the prediction model had the best prediction performance on the public Panicoid Phenomap-1 dataset. The SSIM values surpassed 78% for all time steps. The mean of MSE values was 77.78 and the MSE values were below 118 for all time steps. The mean of PSNR values was 29.03. The results showed a high degree of similarity between the predicted images and the real images of plant growth and development and verified the validity, reliability, and feasibility of the proposed model. The study shows the potential to provide the plant phenotyping community with an efficient tool that can perform high-throughput phenotyping and predict future plant growth.

1. Introduction

Plant growth is a dynamic and complex physiological, biochemical, and metabolic process [1]. Plant phenotyping can assess the complex traits of plant growth and measure individual quantitative parameters [2,3]. These data can be used to quantify small changes in crop growth over short or long periods and provide technical support for timely and accurate planting management and plant breeding. Plants grow slowly under natural conditions, which limits experimental cycles. Predicting plant growth and development early, and measuring phenotypic traits from the predictions, could shorten experimental cycles and accelerate plant breeding by reducing the time spent growing, imaging, and measuring plants [4,5]. Predicting plant growth and development is therefore a promising way to address the long cycles, low efficiency, and high uncertainty of the plant breeding industry.
Deep learning models make the prediction and simulation of phenotypic traits possible [6,7,8,9]. Time-series prediction models have been used to predict the effects of environmental factors on plant growth and development. For example, Yang et al. established predictive models based on long short-term memory (LSTM) and convolutional LSTM (ConvLSTM) to predict sunshine hours [10]. Their models could predict the sunshine hours, accumulated precipitation, and average temperature for the following year, and a data-driven model was also developed in that work to predict each growth stage. However, these studies concentrate on predicting and simulating the dynamics of a single deterministic factor and cannot visualize plant growth and development.
A first attempt was made to predict plant growth and development from historical plant images using ConvLSTM [5]. Subsequently, the growth of plant leaves and roots was predicted from time-series images using a generative adversarial network (GAN) [4]. Nevertheless, these models predicted the images of growth and development from plant masks, so the texture and color information of the plants was ignored during the prediction.
Another generative growth model based on conditional generative adversarial networks was proposed to predict the future appearance of individual plants [11]. A model using the cycle-consistent generative adversarial network (CycleGAN) was proposed to forecast a probable time-series of images with an advancing disease spread [12]. Although these GAN models generated realistic-looking and reliable images of future plant growth stages, the structural similarity between the generated and real plant images still needs improvement. In another line of work, a prediction model of plant growth and development based on spatiotemporal long short-term memory (ST-LSTM, a variant of ConvLSTM) was proposed to predict images of future growth and development [13]. The structural similarity between its generated and real plant images was relatively good, but the timeliness of its predictions was insufficient and the predicted images were blurry. This is because such predictive models discard long-term non-stationary trend information in their memory states, resulting in catastrophic forgetting. The processes of plant growth and development can be highly non-stationary in many ways, including low-level non-stationarity such as spatial correlations or temporal dependencies of local pixel values in plant images, and high-level variations such as the accumulation and stress adaptation of plant growth and development.
The memory in memory network (MIM) was proposed to learn higher-order non-stationarity from spatiotemporal dynamics [14]. MIM uses two cascaded recurrent modules to handle the non-stationary and approximately stationary components in the spatiotemporal domain, compensating for the inability of LSTM-based variants to model non-stationary trends. The above-mentioned studies were conducted predominantly on the future growth and development of Brassicaceae, such as Arabidopsis thaliana, and the current prediction models still leave room for improvement. Therefore, a prediction model based on ST-LSTM and MIM was proposed in this work.
The major contributions in this work are described as follows.
(1)
A novel dataset of wheat growth and development was compiled. The prediction model of plant growth and development was proposed based on ST-LSTM and MIM to predict images of future plant growth.
(2)
The performance of the prediction model of plant growth and development was evaluated by calculating SSIM, MSE, and PSNR between the predicted and real plant images. The leaf number, projected area, and length and width of the minimum bounding rectangle of the predicted and real images were also measured and compared to assess whether the prediction of plant growth and development was biologically accurate.
(3)
Comparison of evaluation results between the proposed model and the existing models (i.e., a model based only on ConvLSTM [5] and a model based only on ST-LSTM [13]) was conducted.

2. Materials and Methods

2.1. Dataset

In this work, a novel dataset of wheat growth and development was constructed, comprising a multi-view wheat dataset and a successive-view wheat dataset. The multi-view wheat dataset was composed of successive images of four wheat varieties: Fielder, Shannong28 (SN28), Jimai22 (JM22), and Kenong199 (KN199). All wheat samples were cultivated in a growth chamber under controlled conditions (23 °C, 16/8 h light/dark, 30 klux, 90% humidity). Every pot of wheat was constantly monitored via an image acquisition platform that we designed and installed in the growth chamber, as shown in Figure 1a. The image sequences in the multi-view wheat dataset were captured weekly from eight fixed side views, and the obtained sequence for each plant from one view comprised eight successive images (Figure 1c). The dataset contained 592 sequences in total. The successive-view wheat dataset, composed of successive images of Fielder, was also constructed: two Fielder samples were cultivated in the same growth chamber, and images were taken every half hour from the same view after the two-leaf stage. The obtained sequence for each plant comprised 3886 successive images (Figure 1d).
After collecting the successive images of wheat growth and development, the images were pre-processed to remove the background so that it could not influence the prediction of growth and development, as shown in Figure 1b. The wheat images were converted from RGB color space into HSV color space, image segmentation was performed using K-means clustering, and a color filter was applied to the HSV images to extract the wheat (the green range of the images).
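The color-filtering step can be sketched in plain Python. This is an illustration only, not the authors' pipeline: the paper does not report the exact hue, saturation, and value thresholds, so the green range below is an assumption, and the K-means segmentation step is omitted.

```python
# Hedged sketch of HSV-based green-pixel extraction using only the standard
# library. Thresholds (hue roughly 60-180 degrees, minimum saturation/value)
# are illustrative assumptions, not the values used in the paper.
import colorsys

def green_mask(rgb_image, hue_lo=1/6, hue_hi=1/2, sat_min=0.25, val_min=0.15):
    """Return a binary mask (1 = plant pixel) for a nested-list RGB image
    with channel values in 0..255. colorsys returns hue in [0, 1]."""
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            keep = hue_lo <= h <= hue_hi and s >= sat_min and v >= val_min
            mask_row.append(1 if keep else 0)
        mask.append(mask_row)
    return mask

# Tiny 1x3 "image": a green leaf pixel, a soil-brown pixel, a white pixel.
img = [[(40, 180, 60), (120, 80, 40), (250, 250, 250)]]
print(green_mask(img))  # → [[1, 0, 0]]
```

Only the green leaf pixel survives the filter; soil and bright background pixels fall outside the hue or saturation range and are set to zero.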
The experiment was also conducted on the Panicoid Phenomap-1 dataset [15], which contained 39 varieties of panicoid grain crops. The total group number of panicoid grain crops in the Panicoid Phenomap-1 dataset was 176. The images of panicoid grain crops were captured from 0-degree and 90-degree side views once per day for 29 days. These panicoid grain crops were segmented from the background by auto-thresholding using the Otsu algorithm, and the background was removed by setting it to black. Pre-processed examples of the Panicoid Phenomap-1 dataset content are shown in Figure 1e.

2.2. Formulation of Plant Growth and Development Predicting Problem

The prediction of plant growth and development could be regarded as a spatiotemporal sequence forecasting problem. The goal of the prediction model is to use the j past images ($I_{t-j+1:t}$) to predict the k future images of plant growth and development ($\hat{I}_{t+1:t+k}$). The prediction process is defined as Equation (1):

$\hat{I}_{t+1}, \ldots, \hat{I}_{t+k} = \arg\max_{I_{t+1}, \ldots, I_{t+k}} p(I_{t+1}, \ldots, I_{t+k} \mid I_{t-j+1}, \ldots, I_t)$ (1)

where $I_t$ is the RGB image of the plant at time t, represented by a tensor $I_t \in \mathbb{R}^{3 \times m \times n}$, and $\hat{I}_{t+1}$ is the predicted image of the plant at time t + 1.

2.3. Prediction Model of Plant Growth and Development

On the tightly coupled spatiotemporal correlation of plant growth and development, a prediction model based on ST-LSTM and MIM was proposed by taking the timing images of plant growth and development as the research object, as shown in Figure 2. The prediction model based on ST-LSTM and MIM could learn stationary variations and the higher-order non-stationarity variations from the dynamics of plant growth and development.
The prediction model consisted of an input layer, ST-LSTM layers, MIM layers, a convolutional layer, and an output layer. Three ST-LSTM layers were stacked, each containing 64 ST-LSTM units, and two MIM layers were stacked, each containing 64 MIM units. The input of the prediction model was the j past time-series RGB images of plant growth and development ($I_{t-j+1:t}$), and the output was the predicted k future images of growth and development ($\hat{I}_{t+1:t+k}$). The hidden representations of the spatiotemporally stationary variations in the time-series images were generated by the ST-LSTM layers. Then, the temporal differencing obtained by subtracting the hidden state $H_{p-1}^{l-1}$ from the hidden state $H_p^{l-1}$ was used as the input of the MIM layers, which extracted temporal and spatiotemporal features; the non-stationary variations of the temporal differencing were captured to improve the ability to extract temporal features. Because the predicted time-series images of plant growth and development had the same dimensionality as the input, all the temporal and spatiotemporal states were concatenated and fed into a 1 × 1 convolutional layer to generate the final prediction.
The ST-LSTM was proposed on the basis of ConvLSTM and introduced a gate-controlled dual-memory structure to extract and memorize temporal and spatiotemporal representations simultaneously. The structure of the ST-LSTM unit is shown in Figure 3a. The equations of the ST-LSTM are given as Equations (2)–(11), where '∗' denotes the convolution operator and '∘' denotes the Hadamard product. The outputs of the ST-LSTM cell were two memory states ($C_p^l$ and $M_p^l$) and a hidden state ($H_p^l$). $C_p^l$ was the temporal memory delivered from node p − 1 of the same hidden layer; $M_p^l$ was the spatiotemporal memory conveyed vertically from the previous layer at the same time step. These memory states were derived from different directions and concatenated. Unlike simple memory concatenation, the ST-LSTM unit used a shared output gate for both memory types to enable seamless memory fusion.
$g_p = \tanh(\omega_{xg} * X_p + \omega_{hg} * H_{p-1}^{l} + b_g)$ (2)
$i_p = \sigma(\omega_{xi} * X_p + \omega_{hi} * H_{p-1}^{l} + b_i)$ (3)
$f_p = \sigma(\omega_{xf} * X_p + \omega_{hf} * H_{p-1}^{l} + b_f)$ (4)
$C_p^{l} = f_p \circ C_{p-1}^{l} + i_p \circ g_p$ (5)
$g'_p = \tanh(\omega'_{xg} * H_p^{l-1} + \omega_{mg} * M_p^{l-1} + b'_g)$ (6)
$i'_p = \sigma(\omega'_{xi} * H_p^{l-1} + \omega_{mi} * M_p^{l-1} + b'_i)$ (7)
$f'_p = \sigma(\omega'_{xf} * H_p^{l-1} + \omega_{mf} * M_p^{l-1} + b'_f)$ (8)
$M_p^{l} = f'_p \circ M_p^{l-1} + i'_p \circ g'_p$ (9)
$o_p = \sigma(\omega_{xo} * X_p + \omega_{ho} * H_{p-1}^{l} + \omega_{co} \circ C_p^{l} + \omega_{mo} \circ M_p^{l} + b_o)$ (10)
$H_p^{l} = o_p \circ \tanh(\omega_{1 \times 1} * [C_p^{l}, M_p^{l}])$ (11)
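To make the gating structure of Equations (2)–(11) concrete, the update of a single ST-LSTM unit can be sketched numerically. This is a simplified sketch, not the authors' implementation: the learned convolutions are reduced to shared scalar (1 × 1) weights, and the 1 × 1 fusion of the concatenated memories [C, M] is approximated by an average.

```python
# Simplified numerical sketch of one ST-LSTM update (Eqs. (2)-(11)).
# Scalar weights w and bias b stand in for learned convolution kernels.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def st_lstm_cell(X, H_prev, C_prev, M_below, w=0.5, b=0.1):
    # Temporal path, Eqs. (2)-(5): gates driven by X_p and H_{p-1}^l.
    g = np.tanh(w * X + w * H_prev + b)
    i = sigmoid(w * X + w * H_prev + b)
    f = sigmoid(w * X + w * H_prev + b)
    C = f * C_prev + i * g
    # Spatiotemporal path, Eqs. (6)-(9): gates driven by H_p^{l-1} (= X here)
    # and the memory M_p^{l-1} conveyed from the layer below.
    g2 = np.tanh(w * X + w * M_below + b)
    i2 = sigmoid(w * X + w * M_below + b)
    f2 = sigmoid(w * X + w * M_below + b)
    M = f2 * M_below + i2 * g2
    # Shared output gate and fusion, Eqs. (10)-(11); the 1x1 convolution over
    # the concatenated [C, M] is approximated by their average.
    o = sigmoid(w * X + w * H_prev + w * C + w * M + b)
    H = o * np.tanh(0.5 * (C + M))
    return H, C, M

zeros = np.zeros((4, 4))
H, C, M = st_lstm_cell(zeros + 1.0, zeros, zeros, zeros)
print(H.shape)  # → (4, 4)
```

A real implementation would learn separate convolution kernels for every gate; the point here is that both memory streams feed a single shared output gate, which is what the paper means by seamless memory fusion.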
Based on the idea of the difference-stationary assumption, the MIM was proposed to model non-stationary variations using a series of cascaded memory transitions, as shown in Figure 3b. Two cascaded, self-renewed memory modules (a non-stationary module and a stationary module) were introduced in the MIM to replace the forget gate in the temporal memory update of $C_p^l$ in the ST-LSTM. The key calculations of the MIM block are shown as Equations (12)–(26). The temporal differencing ($H_p^{l-1} - H_{p-1}^{l-1}$), obtained by subtracting the hidden state $H_{p-1}^{l-1}$ from the hidden state $H_p^{l-1}$, was used as the input of the MIM, together with the temporal memory $C_{p-1}^l$, the spatiotemporal memory $M_p^{l-1}$, and the hidden state $H_{p-1}^l$. The non-stationary module (MIM-N) generated the differential features $D_p^l$ to capture the non-stationary variations of the temporal differencing, as shown in Figure 4. The output $D_p^l$ of MIM-N and the outer temporal memory $C_{p-1}^l$ were taken as inputs of the stationary module (MIM-S) to capture the approximately stationary variations in the spatiotemporal sequences, as shown in Figure 4.
$D_p^{l} = \mathrm{MIM\text{-}N}(H_p^{l-1}, H_{p-1}^{l-1}, N_{p-1}^{l})$ (12)
$T_p^{l} = \mathrm{MIM\text{-}S}(D_p^{l}, C_{p-1}^{l}, S_{p-1}^{l})$ (13)
$C_p^{l} = T_p^{l} + i_p \circ g_p$ (14)
The MIM-N module is computed as:
$g_p = \tanh(\omega_{xg} * (H_p^{l-1} - H_{p-1}^{l-1}) + \omega_{ng} * N_{p-1}^{l} + b_g)$ (15)
$i_p = \sigma(\omega_{xi} * (H_p^{l-1} - H_{p-1}^{l-1}) + \omega_{ni} * N_{p-1}^{l} + b_i)$ (16)
$f_p = \sigma(\omega_{xf} * (H_p^{l-1} - H_{p-1}^{l-1}) + \omega_{nf} * N_{p-1}^{l} + b_f)$ (17)
$N_p^{l} = f_p \circ N_{p-1}^{l} + i_p \circ g_p$ (18)
$o_p = \sigma(\omega_{xo} * (H_p^{l-1} - H_{p-1}^{l-1}) + \omega_{no} * N_p^{l} + b_o)$ (19)
$D_p^{l} = o_p \circ \tanh(N_p^{l})$ (20)
The MIM-S module is computed as:
$g_p = \tanh(\omega_{dg} * D_p^{l} + \omega_{cg} * C_{p-1}^{l} + b_g)$ (21)
$i_p = \sigma(\omega_{di} * D_p^{l} + \omega_{ci} * C_{p-1}^{l} + b_i)$ (22)
$f_p = \sigma(\omega_{df} * D_p^{l} + \omega_{cf} * C_{p-1}^{l} + b_f)$ (23)
$S_p^{l} = f_p \circ S_{p-1}^{l} + i_p \circ g_p$ (24)
$o_p = \sigma(\omega_{do} * D_p^{l} + \omega_{co} * C_{p-1}^{l} + \omega_{so} \circ S_p^{l} + b_o)$ (25)
$T_p^{l} = o_p \circ \tanh(S_p^{l})$ (26)
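The role of the differencing input in MIM-N (Equations (15)–(20)) can be sketched numerically. This is a simplified illustration, not the authors' implementation: scalar weights stand in for the learned convolutions.

```python
# Hedged sketch of the MIM-N update (Eqs. (15)-(20)): the gates are driven by
# the temporal differencing of adjacent hidden states, so a stationary
# sequence (zero differencing) yields only weak differential features.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mim_n(H_cur, H_prev, N_prev, w=0.5, b=0.1):
    d = H_cur - H_prev                 # temporal differencing H_p^{l-1} - H_{p-1}^{l-1}
    g = np.tanh(w * d + w * N_prev + b)
    i = sigmoid(w * d + w * N_prev + b)
    f = sigmoid(w * d + w * N_prev + b)
    N = f * N_prev + i * g             # Eq. (18): differential memory update
    o = sigmoid(w * d + w * N + b)
    D = o * np.tanh(N)                 # Eq. (20): differential features
    return D, N

H = np.ones((4, 4))
D_stat, _ = mim_n(H, H, np.zeros((4, 4)))        # stationary: zero differencing
D_move, _ = mim_n(H + 1.0, H, np.zeros((4, 4)))  # non-stationary: unit differencing
print(np.abs(D_stat).max() < np.abs(D_move).max())  # → True
```

The comparison at the end shows the intended behavior: a changing sequence produces markedly stronger differential features than a static one.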

2.4. Evaluation of the Model Performance

To assess the performance of the prediction model of plant growth and development, the predicted plant images were evaluated quantitatively using three metrics: mean square error (MSE) [4], peak signal to noise ratio (PSNR) [16], and structural similarity index measure (SSIM) [17]. MSE, calculated by Equation (27), measures the average squared difference between the predicted image and the real image; a smaller MSE value represents a higher similarity. PSNR, shown in Equation (28), was used to evaluate the level of image noise; the higher the PSNR value, the better the quality of the predicted image. SSIM, calculated using Equation (29), was used to assess the similarity between the predicted image and the ground truth image; a larger SSIM value represents a higher similarity, and an SSIM of one means the predicted image is identical to the ground truth image.
$MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} [X_k(i,j) - \hat{X}_k(i,j)]^2$ (27)
$PSNR = 10 \log_{10} \left( \frac{\max(X_k)^2}{MSE} \right)$ (28)
$SSIM = \frac{(2 \mu_{X_k} \mu_{\hat{X}_k} + c_1)(2 \sigma_{X_k \hat{X}_k} + c_2)}{(\mu_{X_k}^2 + \mu_{\hat{X}_k}^2 + c_1)(\sigma_{X_k}^2 + \sigma_{\hat{X}_k}^2 + c_2)}$ (29)
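The three metrics can be sketched in NumPy as follows. One caveat: SSIM is commonly evaluated over local windows and averaged (as in standard implementations); the global-statistics version below is a simplification for illustration, and the constants c1 and c2 follow the conventional choices for 8-bit images.

```python
# Hedged NumPy sketch of the evaluation metrics (Eqs. (27)-(29)).
import numpy as np

def mse(x, y):
    # Eq. (27): mean squared pixel difference.
    return float(np.mean((np.asarray(x, float) - np.asarray(y, float)) ** 2))

def psnr(x, y, max_val=255.0):
    # Eq. (28): peak signal to noise ratio in dB; infinite for identical images.
    e = mse(x, y)
    return float("inf") if e == 0 else 10.0 * np.log10(max_val ** 2 / e)

def ssim_global(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    # Eq. (29) evaluated on global image statistics (simplified, unwindowed).
    x, y = np.asarray(x, float), np.asarray(y, float)
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (x.var() + y.var() + c2))

rng = np.random.default_rng(0)
real = rng.integers(0, 256, size=(64, 64)).astype(float)
pred = np.clip(real + rng.normal(0, 5, size=(64, 64)), 0, 255)
print(mse(real, real), ssim_global(real, real))  # → 0.0 1.0
```

For an identical image pair the MSE is exactly zero and the SSIM is exactly one, matching the interpretation given above; adding noise lowers PSNR and SSIM while raising MSE.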
Moreover, parameters of the predicted and real images, namely the leaf number, the projected area, and the length and width of the minimum bounding rectangle, were measured and compared to assess whether the prediction was biologically accurate. The projected area is the number of pixels in the plant binary image; changes in projected area indirectly reflect the growth rate and morphological changes of the plant. The size of the minimum bounding rectangle reflects the height and morphological compactness of the plant.
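Two of these parameters can be read directly off a binary mask. The sketch below uses an axis-aligned bounding box as a simplified stand-in for the minimum bounding rectangle (a rotated minimum-area rectangle would require, e.g., OpenCV's minAreaRect); leaf counting is not shown.

```python
# Hedged sketch: projected area and bounding-box size from a binary plant mask.
import numpy as np

def mask_parameters(mask):
    """Return (projected area, vertical extent, horizontal extent) of a
    binary mask. The axis-aligned box approximates the minimum bounding
    rectangle described in the text."""
    ys, xs = np.nonzero(mask)
    area = int(mask.sum())                  # projected area = foreground pixel count
    if area == 0:
        return area, 0, 0
    height = int(ys.max() - ys.min() + 1)   # plant "length" (vertical extent)
    width = int(xs.max() - xs.min() + 1)    # plant width (horizontal extent)
    return area, height, width

mask = np.zeros((10, 10), dtype=int)
mask[2:7, 3:5] = 1                          # a 5 x 2 block of plant pixels
print(mask_parameters(mask))                # → (10, 5, 2)
```

On the toy mask the projected area is 10 pixels and the bounding box is 5 rows by 2 columns, which is how the growth-rate and compactness comparisons in the text would be computed frame by frame.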

3. Results and Discussion

The prediction model of plant growth and development was trained on the training and validation datasets, and its performance was evaluated on the testing dataset. The proposed model consisted of one input layer, three ST-LSTM layers, two MIM layers, one convolutional layer, and one output layer; the three stacked ST-LSTM layers each contained 64 ST-LSTM units, and the two stacked MIM layers each contained 64 MIM units. The L2 loss was chosen as the loss function, and the Adam optimizer with a learning rate of 0.001 was used to minimize it. The batch size was set to two and the number of iterations was set to 80,000 for every training run.
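The optimization setup can be illustrated with a minimal sketch of an Adam update (learning rate 0.001) minimizing an L2 loss. Here a single parameter tensor fitted to a target frame stands in for the full network; this is an illustration of the optimizer, not the authors' training code.

```python
# Hedged sketch: one Adam step (lr = 0.001) applied repeatedly to an L2 loss.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad            # first-moment (momentum) estimate
    v = b2 * v + (1 - b2) * grad ** 2       # second-moment estimate
    m_hat = m / (1 - b1 ** t)               # bias correction
    v_hat = v / (1 - b2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

target = np.full((8, 8), 0.5)               # stand-in for a target frame
theta = np.zeros((8, 8))                    # stand-in for model parameters
m = v = np.zeros((8, 8))
for t in range(1, 2001):
    grad = 2 * (theta - target)             # gradient of the L2 loss
    theta, m, v = adam_step(theta, grad, m, v, t)
loss = float(np.mean((theta - target) ** 2))
print(loss < 0.25)  # → True (loss falls from its initial value of 0.25)
```

With the small 0.001 learning rate, Adam advances roughly one step-size per iteration toward the target, which is why many thousands of iterations (80,000 in the paper) are typical for such training.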
In order to validate the performance of the proposed prediction model, it was compared with the existing models, such as the prediction model based only on ConvLSTM [5] and the prediction model based only on ST-LSTM [13]. The comparison of evaluation results between the proposed prediction model and the existing models tested on the given dataset was conducted.

3.1. Successive-View Wheat Dataset

In the experimental study, the successive images of Fielder were first used to verify the effectiveness of the proposed prediction model. The sliding time window method was used to construct continuous time-series images as the input sequences of the model: a window of fixed length slides along the time axis. The test dataset contained three batches of data (599 successive images at the middle growth stage, 144 successive images at the late growth stage, and 502 successive images at the mid-to-late growth stage) selected randomly from the successive images of plant growth and development. The remaining successive images formed the training and validation datasets.
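The sliding-window construction can be sketched in plain Python: a window of length j + k slides along the frame sequence, with the first j frames becoming the model input and the remaining k frames the prediction target. The j = k = 5 split below matches the window length of 10 used in this experiment.

```python
# Sketch of the sliding time window used to build (input, target) sequence
# pairs from a list of successive frames.
def sliding_windows(frames, j=5, k=5):
    pairs = []
    for start in range(len(frames) - (j + k) + 1):
        window = frames[start:start + j + k]
        pairs.append((window[:j], window[j:]))   # (j input frames, k targets)
    return pairs

frames = list(range(12))            # integer stand-ins for 12 successive images
pairs = sliding_windows(frames, j=5, k=5)
print(len(pairs), pairs[0])         # → 3 ([0, 1, 2, 3, 4], [5, 6, 7, 8, 9])
```

Each shift of the window by one frame yields one more training pair, which is how long image sequences are turned into many overlapping input sequences for the model.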
Five images of future growth and development were predicted at once based on five input images. The window length was set as 10. The prediction model was trained on the training and validation dataset and tested on the test dataset. The qualitative comparison between the predicted images and the real images of future growth and development is shown in Figure 5a. The average values of MSE, PSNR, and SSIM at each step from t + 1 to t + 5 are shown in Figure 5b. The corresponding parameters (the leaf number, the projected area, and the length and width of the minimum bounding rectangle) of the predicted and real images were measured and compared, as shown in Figure 5c.
The SSIM values surpassed 91% for all time steps. The mean of the MSE values was 18.50 and the MSE values were below 30 for all time steps. The mean of the PSNR values was 34.45. The results showed a high degree of similarity between the predicted images and the real images of plant growth and development. The PSNR and SSIM values typically decreased gradually, and the MSE values increased gradually, as the time step increased. This may be caused by prediction deviations accumulating over time and by the complexity of plant growth and development. Because the time interval between two steps was only 30 min, the leaf number and the length and width of the minimum bounding rectangle of the real images changed little and were similar to those of the predicted images, as shown in Figure 5c. Yet rhythmic leaf movement and the growth of young leaves may change the projected area, so there was a larger discrepancy between the projected areas of the predicted and real images, although their changing trends were similar. These results reflected the predictive validity of the proposed model.
The prediction of plant growth and development at different numbers of time steps and different time intervals between two steps were compared to determine the optimal number of time steps and optimal time interval between two steps. The spacer input sequences of the model were acquired from the dataset at the set number of prediction steps and the time interval between time steps. The prediction results of different numbers of steps and different time intervals between steps are shown in Figure 6.
The values of MSE, PSNR, and SSIM at each step between the predicted and real images are shown in Figure 6, where the time interval between two steps was set as 30 min and the number of prediction steps was set as 3, 5, 8, 10, and 20. When the number of prediction steps was 20, the PSNR and SSIM values at each step were significantly lower, and the MSE values significantly higher, than the other results. When the number of prediction steps was lower than 10, the MSE values at each step (from t + 1 to t + 5) increased slightly as the number of prediction steps increased (Figure 6c), and the PSNR and SSIM values at each step decreased accordingly (Figure 6a,b). The standard deviations of the MSE and PSNR values at each step were both less than two, and the standard deviation of the SSIM values at each step was smaller than 0.01. These results illustrate that the number of prediction steps had a comparatively small effect on the performance of the proposed prediction model until it exceeded 10.
Next, the performance of the proposed prediction model was evaluated at different time intervals to explore the effect of the time interval between two steps. The average values of MSE, PSNR, and SSIM at each step from t + 1 to t + 5 are shown in Figure 7, where the time interval between two steps was set as 30 min, 1 h, 2 h, 6 h, and 12 h. The MSE values at each step (from t + 1 to t + 5) increased as the time interval between two steps increased (Figure 7c), and the PSNR and SSIM values at each step decreased accordingly (Figure 7a,b). When the time interval was set as 2 h, the MSE values at each step increased significantly; when it was set as 6 h, the MSE values at each step were close to those obtained with a 12 h interval. With a 6 h interval, the SSIM values surpassed 73% for all time steps (79.85% at the first time step), the mean of the PSNR values was 26.68, and the mean of the MSE values was 34.45. These results illustrate that the time interval had a very large effect on the performance of the proposed prediction model. To achieve 85% SSIM between the predicted and real plant images, the time interval needed to be set to 1 h. Therefore, for a more reliable and longer-term prediction of plant growth and development, the optimal number of time steps is 10 and the optimal time interval between two steps is 1 h.
Under the optimal setting, the performances of the proposed prediction model and the existing models are shown in Figure 8. The mean of the PSNR values of the proposed model was 30.67, and its SSIM values surpassed 85% for all time steps, higher than those of the prediction models based only on ConvLSTM and only on ST-LSTM. The mean of the MSE values of the proposed model was 46.11 and its MSE values were below 68 for all time steps. The proposed model performed less well on MSE: its MSE values at each step from t + 4 to t + 10 were higher than those of the models based only on ConvLSTM and only on ST-LSTM. However, as shown in Figure 5c, the projected area of the wheat was as high as 1000 pixels (the projected area being the pixel count of the binary images), so a relative difference in MSE of less than 30 between the proposed model and the existing models is acceptable. These results validated the proposed prediction model and showed its robustness compared with the existing models.

3.2. Multi-View Wheat Dataset

In the experimental study, the successive images of four different varieties of wheat without background were used to further verify the effectiveness of the proposed model in predicting the growth and development of different varieties and different views. The test dataset contained 144 sequences (obtained from eight views of 18 plants) selected randomly from the multi-view wheat dataset. The remaining sequences of successive images were considered for the training and validation dataset. The prediction model of plant growth and development was trained on the training and validation dataset. Two images of future growth and development were predicted at once based on three input images.
The proposed prediction model of growth and development was tested on the test dataset. The qualitative comparison between the predicted and real images of future growth and development is shown in Figure 9. The average values of MSE, PSNR, and SSIM at each step from t + 1 to t + 2 were also calculated to evaluate the predicted results of wheat growth and development. The SSIM values at the t + 1 and t + 2 steps were 81.63% and 80.46%, respectively. These results again illustrate the validity of the proposed model: even with an increased time interval, the predicted images could still achieve relatively good structural similarity with the real images when the amount of training data was increased.

3.3. Panicoid Phenomap-1 Dataset

The proposed prediction model was also evaluated on the Panicoid Phenomap-1 dataset. Successive images of 39 varieties of panicoid grain crops without background were used to verify the effectiveness and robustness of the proposed model in predicting growth and development. The test dataset contained 39 randomly chosen groups containing all genotypes of panicoid grain crops. The remaining 137 sequences of successive images were considered for the training and validation dataset. The prediction model was trained on the training and validation dataset and tested on the test dataset. Five images of future growth and development were predicted at once based on five input images. The window length was set as 10. The predicted images and the real images of future growth and development are shown in Figure 10. The corresponding parameters (leaf number, projected area, length and width of the minimum bounding rectangle) of the predicted and real images were measured and compared, as shown in Figure 11. The average values of MSE, PSNR, and SSIM at each step from t + 1 to t + 5 are also shown in Figure 12.
The predicted results obtained by the proposed model on the Panicoid Phenomap-1 dataset were similar to the results above. The leaf number, projected area, and length and width of the minimum bounding rectangle of the predicted images were comparable to, and showed good agreement with, those of the real images. However, accuracy at late prediction time steps was lower, especially for the length of the minimum bounding rectangle. This problem is also reflected in the trends of SSIM, MSE, and PSNR: when the number of time steps was set as five, the PSNR and SSIM values typically decreased and the MSE values increased with time. This again shows that the predicted results gradually worsen, which may be caused by prediction deviations accumulating over time and by the complexity of plant growth and development.
On the other hand, the number of prediction steps was set as 3, 5, 8, 10, and 20, and the average values of MSE, PSNR, and SSIM at each step were calculated to determine the optimal number of prediction steps, as shown in Figure 12. The MSE values at each step first decreased and then increased as the number of prediction steps increased (Figure 12c), and the PSNR and SSIM values at each step first increased and then decreased (Figure 12a,b). The standard deviations of the PSNR values at each step were less than two, and the standard deviations of the SSIM values were smaller than 0.02. These results again illustrated that the number of prediction steps had a comparatively small effect on the performance of the proposed prediction model, although a larger difference was found in the MSE values obtained with different numbers of prediction steps. When the number of prediction steps was set to eight, the model had the best prediction performance on the Panicoid Phenomap-1 dataset: the SSIM values surpassed 78% for all time steps, the mean of the MSE values was 77.78 (with all values below 118), and the mean of the PSNR values was 29.03. The trend of the predicted results on the Panicoid Phenomap-1 dataset differed from that on the successive-view wheat dataset, which may be caused by the increase in the time interval. The time intervals between two steps of the successive-view wheat dataset were less than 12 h, so the real images of plant growth and development at time step t + 1 bore a strong visual resemblance to those at time step t, whereas the time interval of the Panicoid Phenomap-1 dataset was 24 h. As the number of prediction steps increased, the proposed model could better model the dynamics of plant growth and development to predict future growth.
In parallel, deviations of the predictions accumulated over time became more conspicuous. Therefore, when the number of prediction steps was set to eight, the model had the best prediction performance on the Panicoid Phenomap-1 dataset.
Under the optimal setting, the performances of the proposed prediction model and the existing models are shown in Figure 13. The SSIM and PSNR values of the proposed model were significantly higher than those of the prediction models based only on ConvLSTM and only on ST-LSTM. As before, the proposed model performed less well on MSE: its MSE values at each step were higher than those of the two existing models, although the relative difference in MSE between the proposed model and the existing models was less than 40. These results again validated the robustness of the proposed prediction model.
Compared with other work using GAN models, the proposed model produced blurrier predicted images. Nevertheless, this work has two advantages. First, the images of future plant growth and development predicted by the proposed model have higher structural similarity to the real images. The proposed model predicts future images of plant growth and development by modeling the dynamic behavior of plant growth and development with ST-LSTM and MIM modules. The hidden representations of the stationary spatiotemporal variations in the time-series images were generated by the ST-LSTM layers, while the MIM exploits the differential signals between adjacent recurrent states to model the non-stationary and approximately stationary properties of the spatiotemporal dynamics with two cascaded, self-renewed memory modules. By stacking multiple MIMs, higher-order non-stationarity of plant growth and development could potentially be handled. Second, the effects of the number of time steps and of the time interval between two steps on the prediction performance of the proposed model were analyzed, and the optimal values of both were determined, which provides a valuable reference for future studies on the prediction of plant growth and development.
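The closed-loop, multi-step prediction discussed above can be sketched as follows. Here `step_fn` is a hypothetical stand-in for one forward pass through the stacked ST-LSTM/MIM cells, not the authors' actual API: the recurrent state is first warmed up on the observed frames, after which each predicted frame is fed back in as the next input.

```python
def predict_future(step_fn, past_frames, n_pred):
    # step_fn: callable (state, frame) -> (state, next_frame); a
    # hypothetical interface for one recurrent forward step.
    state, next_frame = None, None
    for frame in past_frames:
        # Warm up the recurrent state on the observed context frames.
        state, next_frame = step_fn(state, frame)
    preds = []
    for _ in range(n_pred):
        # Autoregressive rollout: each prediction becomes the next input,
        # so deviations accumulate as the horizon grows.
        preds.append(next_frame)
        state, next_frame = step_fn(state, next_frame)
    return preds
```

With a toy step function that simply increments its input, `predict_future(lambda s, f: (s, f + 1), [0, 1, 2], 2)` returns `[3, 4]`; with real models the fed-back frames carry prediction error, which is why quality degrades at later steps.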

4. Conclusions

In this work, we proposed a prediction model of plant growth and development that reliably generates images of future plant growth stages, and we presented a novel dataset of wheat growth and development. The performance of the model was evaluated by calculating the SSIM, MSE, and PSNR between the predicted and real plant images. The leaf number, projected area, and length and width of the minimum bounding rectangle of the predicted and real images were also measured and compared to assess whether the predictions were biologically accurate. The findings reveal a high degree of consistency and similarity between the predicted and real plant frames. Moreover, the optimal number of time steps and the optimal time interval between two steps were determined, providing a valuable reference for studies on the prediction of plant growth and development. A comparison of evaluation results between the proposed prediction model and the existing models on the given dataset was conducted to validate its robustness. The proposed model could also be retrained and adapted to other domains, such as predicting plant growth and development under abiotic and biotic stresses. This work could speed up the experimental cycles of biologists, geneticists, and breeders by reducing the time required to grow, image, and measure plants, and could further accelerate breeding to address the challenge of declining food security.
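The biological traits compared above can be extracted from a background-removed binary mask, as in the sketch below. For simplicity it uses an axis-aligned bounding rectangle; a true minimum bounding rectangle may be rotated (e.g., via OpenCV's `cv2.minAreaRect`), so this is an illustrative approximation rather than the authors' exact measurement pipeline.

```python
import numpy as np

def plant_traits(mask):
    # mask: 2-D boolean array, True where plant pixels remain after
    # background removal.
    area = int(mask.sum())                  # projected area in pixels
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return area, 0, 0                   # empty mask: no plant pixels
    length = int(ys.max() - ys.min() + 1)   # bounding-rectangle length
    width = int(xs.max() - xs.min() + 1)    # bounding-rectangle width
    return area, length, width
```

Running this on both the predicted and the real frame at each time step, and comparing the resulting values, gives the kind of trait-level agreement reported in Figures 5c and 11.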

Author Contributions

Conceptualization, C.W., P.L. and X.L.; methodology, C.W.; software, C.W.; validation, C.W., W.P. and X.S.; formal analysis, C.W.; investigation, C.W., H.Y. and J.Z.; resources, C.W., H.Y., P.L., J.Z. and X.L.; data curation, C.W. and W.P.; writing—original draft preparation, C.W.; writing—review and editing, C.W., P.L. and X.S.; visualization, C.W. and W.P.; supervision, X.L.; project administration, P.L. and X.L.; funding acquisition, P.L. and X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Shandong Province (ZR2020KF002); Shandong Provincial Key Research and Development Plan (Major Science and Technology Innovation Project) (2021LZGC013; 2021TZXD001); and NSFC (31871543). The authors are grateful to all study participants.

Data Availability Statement

The data presented in this work are available on request from the corresponding author.

Acknowledgments

We would like to acknowledge the State Key Laboratory of Crop Biology, College of Mechanical and Electronic Engineering, Shandong Agricultural University, and Shandong Provincial Key Laboratory of Horticultural Machinery and Equipment for infrastructural support.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wang, C.; Sun, M.; Liu, L.; Zhu, W.; Liu, P.; Li, X. A High-accuracy Genotype Classification Approach Using Time Series Imagery. Biosyst. Eng. 2022, 220, 172–180.
  2. Li, X.; Zeng, R.; Liao, A.H. Improving Crop Nutrient Efficiency Through Root Architecture Modifications. J. Integr. Plant Biol. 2016, 58, 193–202.
  3. Wang, C.; Liu, B.; Liu, L.; Zhu, Y.; Hou, J.; Liu, P.; Li, X. A Review of Deep Learning Used in the Hyperspectral Image Analysis for Agriculture. Artif. Intell. Rev. 2021, 54, 5205–5253.
  4. Yasrab, R.; Zhang, J.; Smyth, P.; Pound, M.P. Predicting Plant Growth from Time-Series Data Using Deep Learning. Remote Sens. 2021, 13, 331.
  5. Sakurai, S.; Uchiyama, H.; Shimada, A.; Taniguchi, R.I. Plant Growth Prediction using Convolutional LSTM. In Proceedings of the 14th International Conference on Computer Vision Theory and Applications, Prague, Czech Republic, 27 February 2019.
  6. Kaur, P.; Harnal, S.; Tiwari, R.; Alharithi, F.S.; Almulihi, A.H.; Noya, I.D.; Goyal, N. A Hybrid Convolutional Neural Network Model for Diagnosis of COVID-19 Using Chest X-ray Images. Int. J. Environ. Res. Public Health 2021, 18, 12191.
  7. Sapra, L.; Sandhu, J.K.; Goyal, N. Intelligent Method for Detection of Coronary Artery Disease with Ensemble Approach. In Lecture Notes in Electrical Engineering, Proceedings of the Advances in Communication and Computational Technology, Singapore, 14 August 2020; Springer: Singapore, 2021.
  8. Dong, S.K.; Kim, S.K. Prediction of Strawberry Growth and Fruit Yield based on Environmental and Growth Data in a Greenhouse for Soil Cultivation with Applied Autonomous Facilities. Hortic. Sci. Technol. 2020, 38, 840–849.
  9. Shibata, S.; Mizuno, R.; Mineno, H. Semisupervised Deep State-Space Model for Plant Growth Modeling. Plant Phenomics 2020, 2020, 4261965.
  10. Yue, Y.; Li, J.-H.; Fan, L.-F.; Zhang, L.-L.; Zhao, P.-F.; Zhou, Q.; Wang, N.; Wang, Z.-Y.; Huang, L.; Dong, X.-H. Prediction of Maize Growth Stages based on Deep Learning. Comput. Electron. Agric. 2020, 172, 105351.
  11. Drees, L.; Junker-Frohn, L.V.; Kierdorf, J.; Roscher, R. Temporal Prediction and Evaluation of Brassica Growth in the Field Using Conditional Generative Adversarial Networks. Comput. Electron. Agric. 2021, 190, 106415.
  12. Förster, A.; Behley, J.; Behmann, J.; Roscher, R. Hyperspectral Plant Disease Forecasting Using Generative Adversarial Networks. In Proceedings of the 2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July 2019.
  13. Wang, C.; Pan, W.; Li, X.; Liu, P. Plant Growth and Development Prediction Model Based on ST-LSTM. Trans. Chin. Soc. Agric. Mach. 2022, 53, 9.
  14. Wang, Y.; Zhang, J.; Zhu, H.; Long, M.; Wang, J.; Yu, P.S. Memory in Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity From Spatiotemporal Dynamics. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
  15. Choudhury, S.D.; Stoerger, V.; Samal, A.; Schnable, J.C.; Liang, Z.; Yu, J.-G. Automated Vegetative Stage Phenotyping Analysis of Maize Plants using Visible Light Images. In Proceedings of the KDD: Data Science for Food, Energy and Water, San Francisco, CA, USA, 14 August 2016.
  16. Cheong, H.; Krishna Devalla, S.; Chuangsuwanich, T.; Tun, T.A.; Wang, X.; Aung, T.; Schmetterer, L.; Buist, M.L.; Boote, C.; Thiéry, A.H.; et al. OCT-GAN: Single Step Shadow and Noise Removal from Optical Coherence Tomography Images of the Human Optic Nerve Head. Biomed. Opt. Express 2021, 12, 1482–1498.
  17. Zheng, M.; Zhao, Y.; Han, S.; Ji, D.; Li, Y.; Lv, W.; Xin, X.; Zhao, X.; Hu, C. Iterative Reconstruction Algorithm Based on Discriminant Adaptive-weighted TV Regularization for Fibrous Biological Tissues Using in-line X-ray Phase-contrast Imaging. Biomed. Opt. Express 2021, 12, 2460–2483.
Figure 1. Flowchart of dataset construction and examples of the given dataset content. (a) Data acquisition using an image acquisition platform we designed and installed at the growth chamber. (b) Image pre-processing to obtain plant images without background. (c) Examples of the multi-view wheat dataset content. (d) Examples of the successive-view wheat dataset content. (e) Pre-processed examples of the Panicoid Phenomap-1 dataset content.
Figure 2. Structural diagram of the prediction model for plant growth and development. Yellow arrows: the diagonal state transition paths of H_{p-1}^{l-1} for differential modeling. Green arrows: the horizontal transition paths of the memory cells C_p^l, D_p^l, and S_{p-1}^l. Red arrows: the zigzag state transition paths of M_p^{l-1}.
Figure 3. Structure diagrams of the ST-LSTM (a) and MIM (b) cells.
Figure 4. Internal structure diagrams of MIM-N and MIM-S. i_p, g_p, f_p, o_p: the input gate, input-modulation gate, forget gate, and output gate in MIM-N and, correspondingly, in MIM-S.
Figure 5. Samples of the results predicted by the proposed prediction model on the successive-view wheat dataset. (a) Comparison between the predicted images and the real images of future growth and development. (b) Average evaluation results of MSE, PSNR, and SSIM at each time step from t + 1 to t + 5. (c) Comparison of leaf number, projected area, length of the minimum bounding rectangle, and width of the minimum bounding rectangle between the predicted and real images.
Figure 6. Average evaluation results of MSE (a), PSNR (b), and SSIM (c) at each time step where the number of prediction steps was set as 3, 5, 8, 10, and 20.
Figure 7. Average evaluation results of MSE (a), PSNR (b), and SSIM (c) at each time step where the time interval between two steps was set as 30 min, 1 h, 2 h, 6 h, and 12 h.
Figure 8. Comparison of MSE (a), PSNR (b), and SSIM (c) between the proposed prediction model and the existing models, where the time intervals between two steps were set as 1 h.
Figure 9. Comparison between the predicted images and the real images of future growth and development on the multi-view wheat dataset. The past 3 images (I_{t-2:t}) were used to predict the future images of plant growth and development (Î_{t+1:t+2}), where I_t is the RGB image of the plant at time t.
Figure 10. Comparison between the predicted images of panicoid grain crops and the real images of future growth and development of the experimental results on the Panicoid Phenomap-1 dataset.
Figure 11. Comparison of leaf number, projected area, and length and width of the minimum bounding rectangle between the predicted and real images of the experimental results on the Panicoid Phenomap-1 dataset.
Figure 12. Average evaluation results of MSE (a), PSNR (b), and SSIM (c) at each time step on the Panicoid Phenomap-1 dataset, where the number of prediction steps was set as 3, 5, 8, 10, and 20.
Figure 13. Comparison of evaluation results (MSE (a), PSNR (b), and SSIM (c)) between the proposed prediction model and the existing models tested on the Panicoid Phenomap-1 dataset.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
