Software defect prediction using ensemble machine learning on open-source code repositories

Authors

  • Dr. Ruwaida Mohammed Yas, University of Information Technology and Communications / Informatics Institute for Post Graduate Studies, Baghdad, Iraq.

Keywords:

Software Defect Prediction, Ensemble Learning, Stacking, Random Forest, PROMISE Repository, CK Metrics, Software Quality Assurance.

Abstract

Software defect prediction is an important software quality assurance activity that helps development teams allocate testing resources effectively and identify fault-prone modules before release. Traditional single-classifier methods often suffer from bias-variance trade-offs and generalize poorly across heterogeneous codebases. This paper proposes an ensemble machine learning framework that uses Random Forest, XGBoost, Support Vector Machine (SVM), and LightGBM as base learners in a stacking meta-ensemble architecture for defect prediction in open-source software projects. The framework derives its features from the Chidamber-Kemerer (CK) object-oriented metrics of six open-source projects: Camel, jEdit, Xerces, Ant, Log4j, and Lucene. To address the class imbalance present in defect datasets, the Synthetic Minority Oversampling Technique (SMOTE) is applied during preprocessing. A logistic regression meta-learner combines the probability outputs of the four base classifiers, which allows the stacking ensemble to capture a wide range of decision boundaries and reduce prediction error. Experimental validation with stratified 10-fold cross-validation on the benchmark PROMISE and NASA MDP datasets shows that the proposed ensemble achieves an accuracy of 94.3%, a precision of 93.1%, a recall of 92.7%, an F1-score of 92.9%, and an AUC-ROC of 0.97. These scores are improvements of 3.3 to 10.6 percentage points over single classifiers. Wilcoxon signed-rank tests confirm that the improvements are statistically significant (p < 0.05).
The paper also examines cross-project transferability and feature-importance rankings, and finds that complexity and coupling metrics are the most relevant for detecting defects. The results indicate that ensemble stacking is practically viable, robust, and broadly applicable to large-scale software quality management in industry.
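
The stacking arrangement described in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the paper's implementation and makes several assumptions: the data are synthetic (generated by `make_classification` as a stand-in for CK-metric features), `GradientBoostingClassifier` is swapped in for XGBoost and LightGBM (which live in separate packages), SMOTE is omitted because it requires the separate imbalanced-learn package, and the helper name `build_stacking_model` and all hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_stacking_model(seed: int = 0) -> StackingClassifier:
    """Hypothetical helper: base learners feed a logistic regression meta-learner."""
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        # Assumption: a gradient-boosting stand-in for XGBoost / LightGBM.
        ("gb", GradientBoostingClassifier(n_estimators=50, random_state=seed)),
        # SVMs are scale-sensitive, so the features are standardized first.
        ("svm", make_pipeline(StandardScaler(),
                              SVC(probability=True, random_state=seed))),
    ]
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        stack_method="predict_proba",  # meta-learner sees class probabilities
        cv=5,  # out-of-fold predictions are used to train the meta-learner
    )


# Synthetic, imbalanced data standing in for CK-metric features (assumption).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4,
    weights=[0.8, 0.2], random_state=0,
)

model = build_stacking_model()
folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
auc_scores = cross_val_score(model, X, y, cv=folds, scoring="roc_auc")
print(f"mean AUC-ROC over {len(auc_scores)} folds: {auc_scores.mean():.3f}")
```

If oversampling such as SMOTE were added to this sketch, it would normally be applied only to the training portion of each fold, so that synthetic samples do not leak into the evaluation data.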

Published

2026-04-02

How to Cite

Yas, D. R. M. (2026). Software defect prediction using ensemble machine learning on open-source code repositories. International Journal of Information Technology & Computer Engineering, 6(1), 46–56. Retrieved from https://hmjournals.com/journal/index.php/IJITC/article/view/6190

Issue

Section

Article Publication