Evaluation of different survival analysis models for NKI breast cancer data

John Edmon Alejandro Ganas; Peter John Berces Aranas

doi:10.55529/jhtd.36.1.9

Authors

John Edmon Alejandro Ganas, Graduate School, Polytechnic University of the Philippines-Manila, Philippines.
Peter John Berces Aranas, School of Statistics, University of the Philippines-Diliman, Philippines.

Keywords:

Survival Analysis Models, Concordance Index, Akaike Information Criterion (AIC), Breast Cancer, Semiparametric Models, Parametric Models.

Abstract

Background: Survival analysis is central to oncological research, but choosing an appropriate model for a specific dataset remains a major methodological problem. This study evaluates six survival analysis models on the NKI Breast Cancer dataset, which spans an 18-year clinical trial period, to determine which performs best for survival prediction. Objective: To compare semi-parametric and parametric survival analysis approaches using the Concordance Index (C-index) and the Akaike Information Criterion (AIC), and to identify the most suitable basis for a breast cancer survival prediction tool. Methods: Subjects lost to follow-up were treated as right-censored. A proportional hazards test was applied to check whether the Cox-based models are appropriate, and a time-to-event distribution test was used to judge whether the parametric models are applicable. Three semi-parametric methods were evaluated: Classical Cox, Cox-Lasso, and Cox-Ridge Regression. Three parametric Accelerated Failure Time (AFT) models were evaluated: Weibull AFT, Log-logistic AFT, and Log-Normal AFT. Results: Among the semi-parametric models, Cox-Ridge Regression had the highest C-index (0.7709) and the lowest AIC (752.6703), outperforming Classical Cox and Cox-Lasso. Among the parametric models, Log-Normal AFT performed best, with a C-index of 0.780 and an AIC of 608.822, outperforming both Weibull AFT and Log-logistic AFT. Across all six models, Log-Normal AFT was the best-performing option. Conclusions: Log-Normal AFT outperformed all other semi-parametric and parametric alternatives on both evaluation criteria. These results suggest that survival times in the NKI Breast Cancer dataset are reasonably well described by a log-normal distribution, and Log-Normal AFT is therefore recommended as the basis for a breast cancer survival prediction model.
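The two evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. It assumes Harrell's pairwise definition of the C-index under right-censoring and the standard AIC formula, 2k - 2 ln L; the abstract does not state which implementation the authors used. The function names and the toy data are hypothetical and are not taken from the NKI dataset.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times:       observed follow-up time per subject
    events:      1 if the event was observed, 0 if right-censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs tied in time are skipped
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk failed earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical toy data, not from the NKI dataset.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 0, 1, 1, 0]          # subjects 2 and 5 are right-censored
toy_risk = [0.9, 0.3, 0.2, 0.7, 0.1]  # made-up model risk scores

toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
toy_aic = aic(log_likelihood=-300.0, n_params=4)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a higher C-index together with a lower AIC is the pattern reported for the preferred model.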
