Evaluation of different survival analysis models for NKI breast cancer data
Keywords:
Survival Analysis Models, Concordance Index, Akaike Information Criterion (AIC), Breast Cancer, Semiparametric Models, Parametric Models
Abstract
Background: Survival analysis is central to oncological research, but choosing the right model for a specific dataset remains a major methodological problem. This work compares six survival analysis models on the NKI Breast Cancer dataset, which covers an 18-year clinical trial period, to determine which performs best for survival prediction.
Objective: To contrast semi-parametric and parametric survival analysis approaches using the Concordance Index (C-index) and the Akaike Information Criterion (AIC), and to identify the most suitable approach for building a breast cancer survival prediction tool.
Methods: Subjects lost to follow-up were treated as right-censored. A proportional hazards test was applied to check whether Cox-based models are appropriate, and a time-to-event distribution test was run to judge whether the parametric models are applicable. In the semi-parametric group, three methods were evaluated: Classical Cox, Cox-Lasso, and Cox-Ridge Regression. In the parametric group, three Accelerated Failure Time (AFT) models were evaluated: Weibull AFT, Log-Logistic AFT, and Log-Normal AFT.
Results: Among the semi-parametric models, Cox-Ridge Regression had the highest C-index (0.7709) and the lowest AIC (752.6703), outperforming Classical Cox and Cox-Lasso. Among the parametric models, Log-Normal AFT performed best, with a C-index of 0.780 and an AIC of 608.822, ahead of both Weibull AFT and Log-Logistic AFT. Across all six models, Log-Normal AFT remained the best-performing option.
Conclusions: Log-Normal AFT outperformed all other semi-parametric and parametric alternatives on both evaluation criteria. These results suggest that survival times in the NKI Breast Cancer dataset follow a log-normal distribution fairly well, and Log-Normal AFT is therefore suggested as the basis for a breast cancer survival prediction model.
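For readers unfamiliar with the two evaluation criteria, the following is a minimal, illustrative sketch in plain Python and not the authors' implementation. It computes Harrell's C-index for right-censored data and the AIC from a fitted model's log-likelihood. The function names, the convention that a higher risk score means an earlier expected event, the choice to skip pairs with tied times, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times:       observed follow-up times
    events:      1 if the event was observed, 0 if right-censored
    risk_scores: model output where a HIGHER score means an EARLIER
                 expected event (assumed convention)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored, so the order is unknown
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2*log(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical toy data, not taken from the NKI dataset.
toy_times = [5, 8, 12, 3, 9]
toy_events = [1, 0, 1, 1, 0]
toy_risk = [0.9, 0.4, 0.2, 0.8, 0.5]

toy_c = concordance_index(toy_times, toy_events, toy_risk)
toy_aic = aic(log_likelihood=-100.0, n_params=3)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The log-likelihood passed to `aic` is whatever the fitted model reports, which for Cox models is a partial log-likelihood.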
Copyright (c) 2023 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.