Redefining Prognostic Risk in Colorectal Cancer: Calibrated Deep Learning Reclassifies High-Risk Mortality and Mitigates Overtreatment

Keywords

Colorectal cancer
DeepSurv
Survival analysis
Model calibration
Simulation study

DOI

10.26689/par.v10i2.14542

Submitted: 2026-03-15
Accepted: 2026-03-30
Published: 2026-04-14

Abstract

Accurate prognostic risk stratification is critical in colorectal cancer (CRC), yet traditional linear models struggle to capture complex non-linear interactions in multi-omics data. We compared three mainstream survival models (regularized Cox, random survival forests [RSF], and DeepSurv) in multi-scenario simulations spanning linear to strongly non-linear risk structures, and validated them in a TCGA cohort (n = 610) and an independent GEO cohort (n = 566) using a five-dimensional evaluation framework, with blinded isotonic regression for model calibration. DeepSurv showed significantly better predictive performance in non-linear scenarios: it achieved a global C-index of 0.7820 in TCGA (vs. 0.7610 for regularized Cox), a 42.18% net reclassification improvement for high-risk mortality patients, and a 20.60% reduction in prediction error after calibration, and it performed robustly in external validation. The regularized Cox model remained robust for linear, low-dimensional data. In conclusion, calibrated DeepSurv is optimal for identifying high-risk CRC patients in complex multi-omics data, and the study provides a standardized paradigm for survival model selection.
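Two of the quantities named in the abstract, the C-index and isotonic-regression calibration, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `harrell_c_index` and `isotonic_fit` are hypothetical, the C-index shown is Harrell's pairwise form (the study may use a different estimator), and the calibration step assumes a binary outcome observed at a fixed time horizon, ignoring censoring before that horizon. The abstract does not define "blinded", so it is not modeled here.

```python
from typing import List, Sequence


def harrell_c_index(times: Sequence[float],
                    events: Sequence[int],
                    risks: Sequence[float]) -> float:
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when subject i has an observed event
    and a strictly shorter time than subject j. The pair is concordant
    when the model assigns i the higher risk; risk ties count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def isotonic_fit(outcomes: Sequence[float]) -> List[float]:
    """Pool-adjacent-violators algorithm (PAVA).

    `outcomes` must already be ordered by increasing predicted risk.
    Returns the non-decreasing least-squares fit, which serves as the
    calibrated probability for each subject (assumption: outcomes are
    0/1 event indicators at a fixed horizon).
    """
    blocks: List[List[float]] = []  # each block holds [sum, count]
    for value in outcomes:
        blocks.append([float(value), 1.0])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))
    return fitted


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    times = [2.0, 4.0, 6.0, 8.0]
    events = [1, 1, 0, 1]
    risks = [0.9, 0.7, 0.8, 0.1]
    print(harrell_c_index(times, events, risks))
    print(isotonic_fit([0, 1, 0, 1, 1]))
```

The isotonic step only reorders nothing and changes no ranks, so it leaves a rank-based measure such as the C-index essentially unchanged while adjusting the predicted probabilities toward observed event frequencies.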
