# Group Lasso Regularized Deep Learning for Cancer Prognosis from Multi-Omics and Clinical Features

^{1}

^{2}

^{3}

^{4}

^{5}

^{*}

^{†}

## Abstract

**:**

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Data Collection

#### 2.2. TCGA Data Preprocessing

#### 2.3. Data Simulation

^{−7}) is the scale parameter and $v$ (2.8) is the shape parameter of Weibull distribution. ${X}_{i}$ ($i$ = 1, 2, …, G) is the feature vector with the length of the group size, and it is generated from multivariate normal distribution (covariance matrix with diagonal elements set to be 1 and other elements set to be 0.3). G is the number of groups. U is randomly generated from a uniform distribution with the interval between 0 and 1. β

_{i}is the coefficients vector for group i with the length of group size, and the value of it depends on the type of group. If group i is the relevant survival time (this group is called relevant group), then each element of β

_{i}is drawn from the normal distribution with mean of 0 and standard deviation of $\frac{10}{\#ofrelevantfeatures}$; otherwise β

_{i}is set to be a vector of all zeros.

^{−3}) is the rate of exponential distribution. Furthermore, the event status (S or survival status) and censoring status (C) were formulated according to:

#### 2.4. GDP Model

_{1}is the first hidden layer of the GDP and H

_{2}is the second hidden layer, where g is the activation function of the neural network, where ReLU (rectified linear unit) was specifically used with the definition:

#### 2.5. Model Training

#### 2.6. Model Evaluation and Feature Selection

#### 2.7. Availabilities of Software

## 3. Results

#### 3.1. Group Lasso Prevents Overfitting During GDP Training

#### 3.2. Group Lasso Performed Better than Lasso Regularization on the Simulated Time-To-Event Data with Group Prior Knowledge

#### 3.3. Influence of Group Size on the Performance of GDP Survival Prediction

#### 3.4. GDP Performed Better than CPH under Complex Simulations

#### 3.5. GDP Performances on TCGA Cancer Data

## 4. Discussions

## Supplementary Materials

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Abbreviations

Lasso | Least absolute shrinkage and selection operator |

TCGA | The Cancer Genome Atlas |

C-index | Concordance index |

CPH | Cox proportional hazard model |

NSS | Non-linear survival simulation |

LSS | Linear survival simulation |

CNV | Copy number variation |

GDP | Group lasso regularized deep learning for the survival prediction in cancer patients |

TRIPOD | Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis |

iCAGES | Integrated CAncer Genome Score |

GISTIC | Genomic Identification of Significant Targets in Cancer |

GDAC | Genome Data Analysis Center |

## References

- James, N.D.; Sydes, M.R.; Clarke, N.W.; Mason, M.D.; Dearnaley, D.P.; Spears, M.R.; Ritchie, A.W.S.; Parker, C.C.; Russell, J.M.; Attard, G.; et al. Addition of docetaxel, zoledronic acid, or both to first-line long-term hormone therapy in prostate cancer (STAMPEDE): Survival results from an adaptive, multiarm, multistage, platform randomised controlled trial. Lancet
**2016**, 387, 1163–1177. [Google Scholar] [CrossRef] - Von Minckwitz, G.; Procter, M.; de Azambuja, E.; Zardavas, D.; Benyunes, M.; Viale, G.; Suter, T.; Arahmani, A.; Rouchet, N.; Clark, E.; et al. Adjuvant pertuzumab and trastuzumab in early HER2-positive breast cancer. N. Engl. J. Med.
**2017**, 377, 122–131. [Google Scholar] [CrossRef] [PubMed] - Mlecnik, B.; Bindea, G.; Angell, H.K.; Maby, P.; Angelova, M.; Tougeron, D.; Church, S.E.; Lafontaine, L.; Fischer, M.; Fredriksen, T.; et al. Integrative analyses of colorectal cancer show immunoscore is a stronger predictor of patient survival than microsatellite instability. Immunity
**2016**, 44, 698–711. [Google Scholar] [CrossRef] [PubMed] - Flynn, R. Survival analysis. J. Clin. Nurs.
**2012**, 21, 2789–2797. [Google Scholar] [CrossRef] - Cox, D.R. Regression models and life-tables. J. R. Stat. Soc. B
**1972**, 34, 187. [Google Scholar] [CrossRef] - Goodwin, S.; McPherson, J.D.; McCombie, W.R. Coming of age: Ten years of next-generation sequencing technologies. Nat. Rev. Genet.
**2016**, 17, 333–351. [Google Scholar] [CrossRef] - Weinstein, J.N.; Collisson, E.A.; Mills, G.B.; Shaw, K.R.M.; Ozenberger, B.A.; Ellrott, K.; Shmulevich, I.; Sander, C.; Stuart, J.M. Network CGAR: The cancer genome atlas pan-cancer analysis project. Nature Genet.
**2013**, 45, 1113–1120. [Google Scholar] [CrossRef] - Cancer Genome Atlas Research Network. Electronic address edsc, cancer genome atlas research N: Comprehensive and integrated genomic characterization of adult soft tissue sarcomas. Cell
**2017**, 171, 950–965. [Google Scholar] [CrossRef] - Tibshirani, R.; Witten, D.M. Survival analysis with high-dimensional covariates. Stat. Methods Med. R.
**2010**, 19, 29–51. [Google Scholar] - Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc.
**1996**, 58, 267–288. [Google Scholar] [CrossRef] - Tibshirani, R. The lasso method for variable selection in the Cox model. Stat. Med.
**1997**, 16, 385–395. [Google Scholar] [CrossRef] - Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc.
**2005**, 67, 301–320. [Google Scholar] [CrossRef] - Werner, H.M.; Mills, G.B.; Ram, P.T. Cancer systems biology: A peek into the future of patient care? Nat. Rev. Clin. Oncol.
**2014**, 11, 167–176. [Google Scholar] [CrossRef] - Yuan, M.; Lin, Y. Model selection and estimation in regression with grouped variables. J. R. Stat. Soc B
**2006**, 68, 49–67. [Google Scholar] [CrossRef] [Green Version] - Meier, L.; van de Geer, S.A.; Buhlmann, P. The group lasso for logistic regression. J. R. Stat. Soc. B
**2008**, 70, 53–71. [Google Scholar] [CrossRef] [Green Version] - LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature
**2015**, 521, 436–444. [Google Scholar] [CrossRef] [PubMed] - Silver, D.; Huang, A.; Maddison, C.J.; Guez, A.; Sifre, L.; van den Driessche, G.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam, V.; Lanctot, M.; et al. Mastering the game of go with deep neural networks and tree search. Nature
**2016**, 529, 484–489. [Google Scholar] [CrossRef] - Karpathy, A.; Fei-Fei, L. Deep visual-semantic alignments for generating image descriptions. IEEE Trans. Pattern Anal. Mach. Intell.
**2017**, 39, 664–676. [Google Scholar] [CrossRef] [PubMed] - Chen, C.; Seff, A.; Kornhauser, A.; Xiao, J. DeepDriving: Learning affordance for direct perception in autonomous driving. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Las Condes, Chile, 7–13 December 2015. [Google Scholar]
- Alipanahi, B.; Delong, A.; Weirauch, M.T.; Frey, B.J. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat. Biotechnol.
**2015**, 33, 831. [Google Scholar] [CrossRef] [PubMed] - Xiong, H.Y.; Alipanahi, B.; Lee, L.J.; Bretschneider, H.; Merico, D.; Yuen, R.K.C.; Hua, Y.M.; Gueroussov, S.; Najafabadi, H.S.; Hughes, T.R.; et al. The human splicing code reveals new insights into the genetic determinants of disease. Science
**2015**, 347, 6218. [Google Scholar] [CrossRef] - Zhou, J.; Troyanskaya, O.G. Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods
**2015**, 12, 931–934. [Google Scholar] [CrossRef] - Katzman, J.L.; Shaham, U.; Cloninger, A.; Bates, J.; Jiang, T.; Kluger, Y. DeepSurv: Personalized Treatment Recommender System Using A Cox Proportional Hazards Deep Neural Network. BMC Med. Res. Methodol.
**2016**, 18, 24. [Google Scholar] [CrossRef] [PubMed] - Yousefi, S.; Amrollahi, F.; Amgad, M.; Dong, C.; Lewis, J.E.; Song, C.; Gutman, D.A.; Halani, S.H.; Vega, J.E.V.; Brat, D.J.; et al. Predicting clinical outcomes from large scale cancer genomic profiles with deep survival models. Sci. Rep.
**2017**, 7, 11707. [Google Scholar] [CrossRef] - Martín Abadi, A.A.; Paul, B.; Brevdo, E.; Zhifeng, C.; Craig, C.; Greg, S.; Corrado, A.D.; Jeffrey, D.; Devin, M.; Sanjay, G. Google research: TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv, 2016; arXiv:1603.04467. [Google Scholar]
- Li, B.; Dewey, C.N. RSEM: Accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinform.
**2011**, 12, 323. [Google Scholar] [CrossRef] [PubMed] - Mermel, C.H.; Schumacher, S.E.; Hill, B.; Meyerson, M.L.; Beroukhim, R.; Getz, G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol.
**2011**, 12, R41. [Google Scholar] [CrossRef] - Danecek, P.; Auton, A.; Abecasis, G.; Albers, C.A.; Banks, E.; DePristo, M.A.; Handsaker, R.E.; Lunter, G.; Marth, G.T.; Sherry, S.T.; et al. The variant call format and VCFtools. Bioinformatics
**2011**, 27, 2156–2158. [Google Scholar] [CrossRef] - Dong, C.; Guo, Y.; Yang, H.; He, Z.; Liu, X.; Wang, K. ICAGES: Integrated cancer genome score for comprehensively prioritizing driver genes in personal cancer genomes. Genome Med.
**2016**, 8, 135. [Google Scholar] [CrossRef] - Bender, R.; Augustin, T.; Blettner, M. Generating survival times to simulate COX proportional hazards models. Stat. Med.
**2005**, 24, 1713–1723. [Google Scholar] [CrossRef] - Harrell, F.E., Jr.; Califf, R.M.; Pryor, D.B.; Lee, K.L.; Rosati, R.A. Evaluating the yield of medical tests. JAMA
**1982**, 247, 2543–2546. [Google Scholar] [CrossRef] - Helbing, C.C.; Veillette, C.; Riabowol, K.; Johnston, R.N.; Garkavtsev, I. A novel candidate tumor suppressor, ING1, is involved in the regulation of apoptosis. Cancer Res.
**1997**, 57, 1255–1258. [Google Scholar] - Tallen, U.G.; Truss, M.; Kunitz, F.; Wellmann, S.; Unryn, B.; Sinn, B.; Lass, U.; Krabbe, S.; Holtkamp, N.; Hagemeier, C.; et al. Down-regulation of the inhibitor of growth 1 (ING1) tumor suppressor sensitizes p53-deficient glioblastoma cells to cisplatin-induced cell death. J. Neurooncol.
**2008**, 86, 23–30. [Google Scholar] [CrossRef] - Yuan, Y.; van Allen, E.M.; Omberg, L.; Wagle, N.; Amin-Mansour, A.; Sokolov, A.; Byers, L.A.; Xu, Y.; Hess, K.R.; Diao, L.; et al. Assessing the clinical utility of cancer genomic and proteomic data across tumor types. Nat. Biotechnol.
**2014**, 32, 644–652. [Google Scholar] [CrossRef] [PubMed] [Green Version] - Simon, N.; Friedman, J.; Hastie, T.; Tibshirani, R. A sparse-group lasso. J. Comput. Graph. Stat.
**2013**, 22, 231–245. [Google Scholar] [CrossRef] - Collins, G.S.; Reitsma, J.B.; Altman, D.G.; Moons, K.G. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD). Ann. Intern. Med.
**2015**, 162, 735–736. [Google Scholar] [CrossRef] [PubMed]

Sample Availability: The R scripts for survival data simulation and the GDP python package can be found in GitHub (https://github.com/WGLab/GDP). |

**Figure 1.**Basic framework of GDP. GDP can perform survival analysis for the cancer data with high-dimensional features. The number of features for each type of datasets is given in the parenthesis. GISTIC CNV means the CNV data was processed using GISTIC 2.0 and the focal data via gene output was used. Furthermore, iCAGES mutation means the DNA mutation data was processed using iCAGES and the iCAGES gene level scores were used. The number of molecular features shown here is the average number of molecular features from 14 TCGA tumor types selected in this study.

**Figure 2.**Group lasso overcame the overfitting of GDP training for simulated data with group information as seen in the GDP training process comparison among three different types of regularization methods: (

**A**) group lasso, (

**B**) lasso, and (

**C**) no regularization. One training step is one round of GDP training with 50 randomized samples fed in batch as input, and the model trained after each step was evaluated on training, validation, and testing data. The biggest overfitting gap between testing data and training data was observed in the training process without regularization. Both lasso and group lasso reduced the overfitting gap, and the latter of which more significantly improved the survival prediction accuracy in the testing data (c-index approaches 0.8). For simulation details, see simulation s1 in Table S2.

**Figure 3.**Group lasso regularization method achieved higher survival prediction accuracy than the lasso regularization method. (

**A**) Grid search of the best hyper-parameters evaluated on validation data. GDP with group lasso regularization (red) was compared to both of GDP with lasso regularization (green) and no regularization (blue). Group lasso performed best at a scale of 16 (red box), while lasso performed best at a scale of 0.5 (green box). (

**B**) Group lasso performed significantly better than both of lasso and no regularization on testing data. The p-values based on a two-sided t-test between different methods were: 0.0078 (group lasso vs lasso), 1.48 × 10

^{−8}(group lasso vs no regularization), and 7.34 × 10

^{−5}(lasso vs no regularization). α and scale were the parameters for the regularization terms. α controlled the proportion of group lasso regularization value and scale regulated the proportion of the whole regularization value in the loss function. For simulation details, see s1 in Table S2. (

**C**) GDP performance comparison for the simulated data with different group sizes under constant number of relevant features. Simulation settings can be found in s2A–E in Table S2. GDP performed best when the group size was 4 or 8, and performed worst when the group size was 1 (reduced to lasso at group size 1).

**Figure 4.**Comparison between GDP and CPH models. C-index comparison between GDP and CPH under different types of feature transformations. Normal: no additional layer of function was applied to the simulation model. Quadratic: quadratic function layer was added to the simulation model. Absolute: absolute function layer was added to the simulation model. Simulation details can be found in S3A–C of Table S2.

**Figure 5.**Group lasso performed significantly better regarding survival prediction for GBM, KIRC, and BLCA than the lasso method. GDP survival analysis was done on 14 tumor types from TCGA (Table 1). The group lasso method was compared to lasso and no regularization scenarios. For each tumor type datasets, 20% of the data was kept as testing datasets, and 80% of them was used for training and evaluation. Among this 80%, 75% was used for training and 25% was used for cross-evaluation.

**Table 1.**Summary of the selected TCGA datasets. Only the TCGA tumor types with a ratio of non-censored over censored of no less than 0.3 and the number of patients no less than 100 were considered in the GDP analysis.

Tumor Name | Tumor Full Name | # of Patients | # Censored |
---|---|---|---|

GBM | Glioblastoma multiforme | 579 | 101 |

OV | Ovarian serous cystadenocarcinoma | 571 | 232 |

KIRC | Kidney renal clear cell carcinoma | 532 | 355 |

HNSC | Head and neck squamous cell carcinoma | 528 | 304 |

LUAD | Lung adenocarcinoma | 507 | 322 |

LUSC | Lung squamous cell carcinoma | 504 | 284 |

SKCM | Skin cutaneous melanoma | 469 | 249 |

STAD | Stomach adenocarcinoma | 443 | 270 |

BLCA | Bladder urothelial carcinoma | 409 | 229 |

LIHC | Liver hepatocellular carcinoma | 377 | 245 |

SARC | Sarcoma | 261 | 162 |

LAML | Acute myeloid leukemia | 198 | 66 |

PAAD | Pancreatic adenocarcinoma | 185 | 85 |

ESCA | Esophageal carcinoma | 185 | 108 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Xie, G.; Dong, C.; Kong, Y.; Zhong, J.F.; Li, M.; Wang, K.
Group Lasso Regularized Deep Learning for Cancer Prognosis from Multi-Omics and Clinical Features. *Genes* **2019**, *10*, 240.
https://doi.org/10.3390/genes10030240

**AMA Style**

Xie G, Dong C, Kong Y, Zhong JF, Li M, Wang K.
Group Lasso Regularized Deep Learning for Cancer Prognosis from Multi-Omics and Clinical Features. *Genes*. 2019; 10(3):240.
https://doi.org/10.3390/genes10030240

**Chicago/Turabian Style**

Xie, Gangcai, Chengliang Dong, Yinfei Kong, Jiang F. Zhong, Mingyao Li, and Kai Wang.
2019. "Group Lasso Regularized Deep Learning for Cancer Prognosis from Multi-Omics and Clinical Features" *Genes* 10, no. 3: 240.
https://doi.org/10.3390/genes10030240