TY - JOUR
T1 - A novel approach to address external validity issues in fault prediction using bandit algorithms
AU - HAYAKAWA, Teruki
AU - TSUNODA, Masateru
AU - TODA, Koji
AU - NAKASAI, Keitaro
AU - TAHIR, Amjed
AU - BENNIN, Kwabena Ebo
AU - MONDEN, Akito
AU - MATSUMOTO, Kenichi
N1 - Funding Information:
This research was partially supported by the Japan Society for the Promotion of Science (JSPS) [Grants-in-Aid for Scientific Research (C) (No. 20K11749)].
Publisher Copyright:
© 2021 The Institute of Electronics.
PY - 2021/2/1
Y1 - 2021/2/1
N2 - Various software fault prediction models have been proposed in the past twenty years. Many studies have compared and evaluated existing prediction approaches in order to identify the most effective ones. However, in most cases, such models and techniques provide varying results, and their outcomes do not result in the best possible performance across different datasets. This is mainly due to the diverse nature of software development projects, and therefore, there is a risk that the selected models lead to inconsistent results across multiple datasets. In this work, we propose the use of bandit algorithms in cases where the accuracy of the models is inconsistent across multiple datasets. In the experiment discussed in this work, we used four conventional prediction models, tested on three different datasets, and then selected the best possible model dynamically by applying bandit algorithms. We then compared our results with those obtained using majority voting. As a result, Epsilon-greedy with ϵ = 0.3 showed the best or second-best prediction performance compared with using only one prediction model and majority voting. Our results showed that bandit algorithms can provide promising outcomes when used in fault prediction.
AB - Various software fault prediction models have been proposed in the past twenty years. Many studies have compared and evaluated existing prediction approaches in order to identify the most effective ones. However, in most cases, such models and techniques provide varying results, and their outcomes do not result in the best possible performance across different datasets. This is mainly due to the diverse nature of software development projects, and therefore, there is a risk that the selected models lead to inconsistent results across multiple datasets. In this work, we propose the use of bandit algorithms in cases where the accuracy of the models is inconsistent across multiple datasets. In the experiment discussed in this work, we used four conventional prediction models, tested on three different datasets, and then selected the best possible model dynamically by applying bandit algorithms. We then compared our results with those obtained using majority voting. As a result, Epsilon-greedy with ϵ = 0.3 showed the best or second-best prediction performance compared with using only one prediction model and majority voting. Our results showed that bandit algorithms can provide promising outcomes when used in fault prediction.
KW - Defect prediction
KW - Diversity of datasets
KW - Dynamic model selection
KW - Multi-armed bandit
KW - Risk-based testing
UR - http://www.scopus.com/inward/record.url?scp=85100794769&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100794769&partnerID=8YFLogxK
U2 - 10.1587/transinf.2020EDL8098
DO - 10.1587/transinf.2020EDL8098
M3 - Article
AN - SCOPUS:85100794769
SN - 0916-8532
VL - E104D
SP - 327
EP - 331
JO - IEICE Transactions on Information and Systems
JF - IEICE Transactions on Information and Systems
IS - 2
ER -