# Reconstruction of Single-Cell Trajectories Using Stochastic Tree Search

## Abstract

## 1. Introduction

## 2. Methods

#### 2.1. Preprocessing

#### 2.2. Penalized Likelihood

#### 2.3. Projection of the Data to the Tree

#### 2.4. Updating Vertex Embedding Location

1. ${\tau}^{b}=p{\tau}^{b-1}$
2. ${y}^{b+1}={y}^{b}-{\tau}^{b}\nabla {S}_{b}$
3. Continue the iteration until we have $S({y}^{b}-{\tau}^{b}\nabla {S}_{b})-S\left({y}^{b}\right)\le c{\tau}^{b}\nabla {S}_{b}^{T}\nabla {S}_{b}$.
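
The three steps above describe a gradient update whose step size is shrunk by a factor $p$ until a sufficient-decrease test holds. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes the standard Armijo form of the test, in which the right-hand side carries a minus sign, and it restarts the step size at `tau0` for every outer iteration. The toy quadratic objective, the function names `backtracking_step` and `minimize`, and the default values of `tau0`, `p`, and `c` are all illustrative assumptions.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def backtracking_step(
    S: Callable[[Vector], float],
    grad_S: Callable[[Vector], Vector],
    y: Vector,
    tau0: float = 1.0,   # assumed initial step size
    p: float = 0.5,      # assumed shrink factor, 0 < p < 1
    c: float = 1e-4,     # assumed sufficient-decrease constant
    max_shrinks: int = 50,
) -> Tuple[Vector, float]:
    """Take one gradient step, shrinking tau by p until the Armijo test holds."""
    g = grad_S(y)
    g_sq = sum(gi * gi for gi in g)          # grad^T grad
    s_old = S(y)
    tau = tau0
    for _ in range(max_shrinks):
        candidate = [yi - tau * gi for yi, gi in zip(y, g)]
        # Standard Armijo test: S(new) - S(old) <= -c * tau * grad^T grad
        if S(candidate) - s_old <= -c * tau * g_sq:
            return candidate, tau
        tau *= p                             # tau^b = p * tau^(b-1)
    return list(y), tau                      # no acceptable step found


def minimize(S, grad_S, y0: Vector, n_iter: int = 100, tol: float = 1e-12) -> Vector:
    """Repeat backtracking steps until the squared gradient norm is below tol."""
    y = list(y0)
    for _ in range(n_iter):
        if sum(gi * gi for gi in grad_S(y)) < tol:
            break
        y, _tau = backtracking_step(S, grad_S, y)
    return y


# Toy objective (an assumption for illustration): squared distance to a target.
target = [1.0, -2.0]


def S(y: Vector) -> float:
    return sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def grad_S(y: Vector) -> Vector:
    return [2.0 * (yi - ti) for yi, ti in zip(y, target)]


y_hat = minimize(S, grad_S, [0.0, 0.0])
```

On this toy problem the full step `tau0 = 1.0` overshoots and leaves the objective unchanged, so the step is halved once and then accepted.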

#### 2.5. Tree Similarity Score

#### 2.6. Stochastic Optimization

#### 2.6.1. Initial Tree Generation

#### 2.6.2. Grow Trees by Adding Nodes

#### 2.6.3. Optimizing Tree with Data

#### 2.6.4. Final Optimal Tree

#### 2.7. Pseudotime Calculation

#### 2.8. Extension from Linear Trees to Nonlinear Trees

## 3. Simulation

#### 3.1. Design

#### 3.2. Comparison with Other Methods

#### 3.3. Comparison between Linear and Curved Tree Methods

## 4. Application

#### 4.1. Induction of Mouse Embryonic Stem (ES) Cell Differentiation

#### 4.2. Resolution of Cell Fate Decisions from Zygote to Blastocyst

## 5. Discussion

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
TR | Trajectory Reconstruction |
PCA | Principal Component Analysis |
ICA | Independent Component Analysis |
LLE | Locally Linear Embedding |
MST | Minimal Spanning Tree |
KNN | K-nearest Neighbor |
DPT | Diffusion Pseudotime |
STS | Stochastic Tree Search |
RGE | Reversed Graph Embedding |

## Appendix A. Updating Vertex Embedding Location with a Closed-Form Solution

## Appendix B. Extension to Curved Tree Method

#### Appendix B.1. Curved Embedding Tree

#### Appendix B.2. Penalized Likelihood

#### Appendix B.3. Projection of the Data to the Tree

#### Appendix B.4. Updating Vertex Embedding Location

Update of ${\gamma}_{k}$ with $y$ held fixed:

1. ${\tau}_{\gamma}^{b}=p{\tau}_{\gamma}^{b-1}$,
2. ${\gamma}_{k}^{b+1}={\gamma}_{k}^{b}-{\tau}_{\gamma}^{b}\frac{\partial {S}_{b}}{\partial {\gamma}_{k}^{b}}$,
3. Continue the iteration until $S(y,{\gamma}_{k}^{b}-{\tau}_{\gamma}^{b}\frac{\partial {S}_{b}}{\partial {\gamma}_{k}^{b}})-S(y,{\gamma}_{k}^{b})\le c{\tau}_{\gamma}^{b}{\left(\frac{\partial {S}_{b}}{\partial {\gamma}_{k}^{b}}\right)}^{T}\frac{\partial {S}_{b}}{\partial {\gamma}_{k}^{b}}$.

Update of $y$ with ${\alpha}_{k}^{*}$ and ${\beta}_{k}^{*}$ held fixed:

1. ${\tau}_{y}^{b}=p{\tau}_{y}^{b-1}$,
2. ${y}^{b+1}={y}^{b}-{\tau}_{y}^{b}\frac{\partial {S}_{b}}{\partial {y}^{b}}$,
3. Continue the iteration until $S({y}^{b}-{\tau}_{y}^{b}\frac{\partial {S}_{b}}{\partial {y}^{b}},{\alpha}_{k}^{*},{\beta}_{k}^{*})-S({y}^{b},{\alpha}_{k}^{*},{\beta}_{k}^{*})\le c{\tau}_{y}^{b}{\left(\frac{\partial {S}_{b}}{\partial {y}^{b}}\right)}^{T}\frac{\partial {S}_{b}}{\partial {y}^{b}}$.
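
The two update schemes above alternate between one block of parameters and the other, each with its own shrinking step size. A hedged Python sketch of such an alternating scheme is given below. It is an illustration under stated assumptions, not the authors' code: the coupled quadratic objective is a toy stand-in for $S$, the names `armijo_update` and `alternate` are hypothetical, and the acceptance test uses the standard Armijo form with a minus sign on the right-hand side.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def armijo_update(
    f: Callable[[Vector], float],
    grad: Callable[[Vector], Vector],
    x: Vector,
    tau0: float = 1.0,   # assumed initial step size
    p: float = 0.5,      # assumed shrink factor
    c: float = 1e-4,     # assumed sufficient-decrease constant
    max_shrinks: int = 60,
) -> Vector:
    """One gradient step on x, with tau shrunk by p until the Armijo test holds."""
    g = grad(x)
    g_sq = sum(gi * gi for gi in g)
    f_old = f(x)
    tau = tau0
    for _ in range(max_shrinks):
        cand = [xi - tau * gi for xi, gi in zip(x, g)]
        if f(cand) - f_old <= -c * tau * g_sq:
            return cand
        tau *= p
    return list(x)       # keep the current value if no step is accepted


def alternate(S, dS_dgamma, dS_dy, y: Vector, gamma: Vector,
              n_sweeps: int = 200) -> Tuple[Vector, Vector]:
    """Alternate a gamma update (y fixed) with a y update (gamma fixed)."""
    for _ in range(n_sweeps):
        gamma = armijo_update(lambda gm: S(y, gm),
                              lambda gm: dS_dgamma(y, gm), gamma)
        y = armijo_update(lambda yy: S(yy, gamma),
                          lambda yy: dS_dy(yy, gamma), y)
    return y, gamma


# Toy coupled objective (an assumption): convex quadratic in (y, gamma).
def S(y: Vector, gamma: Vector) -> float:
    return (y[0] - 1.0) ** 2 + (gamma[0] - 2.0) ** 2 + 0.5 * y[0] * gamma[0]


def dS_dy(y: Vector, gamma: Vector) -> Vector:
    return [2.0 * (y[0] - 1.0) + 0.5 * gamma[0]]


def dS_dgamma(y: Vector, gamma: Vector) -> Vector:
    return [2.0 * (gamma[0] - 2.0) + 0.5 * y[0]]


y_opt, gamma_opt = alternate(S, dS_dgamma, dS_dy, [0.0], [0.0])
```

For this toy objective the stationary point can be solved by hand as $y = 8/15$ and $\gamma = 28/15$, which gives a simple check on the alternating scheme.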

#### Appendix B.5. Pseudotime Calculation

**Figure 2.** Kendall correlations at different noise levels for two noise distributions. (**a**) Normal. (**b**) *t* distribution.

**Figure 4.** Estimated cell trajectories by our linear STS and curved tree methods on the mouse ES cell dataset. (**a**) Linear tree method. (**b**) Curved tree method.

**Figure 5.** Scatter plots between the estimated pseudotimes and the true times for our approach and four existing TR methods on the mouse ES cell dataset. (**a**) SLICER. (**b**) Slingshot. (**c**) Monocle ICA. (**d**) Monocle DDRTree. (**e**) Linear Tree. (**f**) Curved Tree.

**Figure 6.** Cell trajectories estimated by our linear STS and curved tree methods on the Zygote–Blastocyst dataset. (**a**) Linear tree method. (**b**) Curved tree method.

**Figure 7.** Scatter plots between estimated pseudotimes and true times for our approach and four existing TR methods on the Zygote–Blastocyst dataset. (**a**) SLICER. (**b**) Slingshot. (**c**) Monocle ICA. (**d**) Monocle DDRTree. (**e**) Linear Tree. (**f**) Curved Tree.

**Table 1.** Mean Kendall correlations and mean residual standard errors for the curved tree and linear tree algorithms at different noise levels.

Noise Level | Mean Kendall Correlation (SD), Curved Tree | Mean Kendall Correlation (SD), Linear Tree | Mean Residual Standard Error (SD), Curved Tree | Mean Residual Standard Error (SD), Linear Tree |
---|---|---|---|---|
0.01 | 0.87 (0.23) | 0.84 (0.26) | 0.0356 (0.0325) | 0.1812 (0.0365) |
0.05 | 0.83 (0.29) | 0.82 (0.29) | 0.0529 (0.0293) | 0.1839 (0.0399) |
0.10 | 0.83 (0.25) | 0.81 (0.29) | 0.0889 (0.0285) | 0.1988 (0.0410) |
0.15 | 0.81 (0.27) | 0.79 (0.30) | 0.0889 (0.0283) | 0.1978 (0.0403) |

**Table 2.** Mean Kendall correlations and mean residual standard errors for the curved tree and linear tree algorithms at different curvatures.

Curvature | Kendall Correlation, Curved Tree | Kendall Correlation, Linear Tree | Residual Standard Error, Curved Tree | Residual Standard Error, Linear Tree |
---|---|---|---|---|
0 | 0.85 (0.24) | 0.84 (0.24) | 0.0823 (0.0262) | 0.1753 (0.0452) |
0.5 | 0.81 (0.29) | 0.81 (0.29) | 0.0889 (0.0285) | 0.1988 (0.0410) |
1.0 | 0.83 (0.24) | 0.78 (0.26) | 0.1027 (0.0420) | 0.2621 (0.0366) |
1.5 | 0.60 (0.34) | 0.57 (0.34) | 0.1325 (0.0466) | 0.3039 (0.0376) |
2.0 | 0.48 (0.43) | 0.43 (0.40) | 0.1405 (0.0478) | 0.3248 (0.0409) |

Method | Linear Tree | Curved Tree | SLICER | Slingshot | Monocle ICA | Monocle DDRTree |
---|---|---|---|---|---|---|
Kendall Correlation | 0.87 | 0.87 | 0.39 | 0.87 | 0.86 | 0.87 |

Method | Linear Tree | Curved Tree | SLICER | Slingshot | Monocle ICA | Monocle DDRTree |
---|---|---|---|---|---|---|
Kendall Correlation | 0.76 | 0.76 | 0.69 | 0.78 | 0.51 | 0.69 |
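
The tables above report Kendall correlations, a rank-agreement measure that the figure captions apply to estimated pseudotimes and true times. As a minimal illustration of what that statistic measures, the Python sketch below computes the tau-a variant by counting concordant and discordant pairs. The source does not state which Kendall variant or software was used, so the choice of tau-a, the function name `kendall_tau_a`, and the example values are assumptions for illustration only.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau_a(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall tau-a: (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical example: five cells with known collection times and
# made-up pseudotime estimates in which one neighbouring pair is swapped.
true_time = [0, 1, 2, 3, 4]
pseudotime = [0.1, 0.3, 0.2, 0.8, 0.9]
tau = kendall_tau_a(true_time, pseudotime)
```

A value near 1 means the estimated ordering of cells agrees with the true ordering, and a value near -1 means the ordering is reversed.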

