TY - GEN
T1 - Using Bandit Algorithms for Selecting Feature Reduction Techniques in Software Defect Prediction
AU - Tsunoda, Masateru
AU - Monden, Akito
AU - Toda, Koji
AU - Tahir, Amjed
AU - Bennin, Kwabena Ebo
AU - Nakasai, Keitaro
AU - Nagura, Masataka
AU - Matsumoto, Kenichi
N1 - Funding Information:
This research is partially supported by the Japan Society for the Promotion of Science [Grants-in-Aid for Scientific Research (C) and (S) (No. 21K11840 and No. 20H05706)]. A. Tahir is also partially supported by an NZ National Science Challenge grant.
Publisher Copyright:
© 2022 ACM.
PY - 2022
Y1 - 2022
N2 - Background: Selecting a suitable feature reduction technique when building a defect prediction model can be challenging. Different techniques can select different independent variables, which affects the overall performance of the prediction model. To help with the selection, previous studies have assessed the impact of each feature reduction technique using different datasets. However, there are many reduction techniques, and therefore some well-known techniques have not been assessed by those studies. Aim: The goal of the study is to select a high-accuracy reduction technique from several candidates without preliminary assessments. Method: We used a bandit algorithm (BA) to help select the best feature reduction technique from a list of candidates. To select the best feature reduction technique, BA evaluates the prediction accuracy of the candidates by comparing the testing results of different modules with their prediction results. By substituting the reduction technique for the prediction method, BA can then be used to select the best reduction technique. In the experiment, we evaluated the performance of BA in selecting a suitable reduction technique. We performed cross-version defect prediction using 14 datasets. As feature reduction techniques, we used two assessed and two non-assessed techniques. Results: On average, the prediction accuracy with BA was higher than or equivalent to that of existing approaches, compared with techniques selected based on an assessment. Conclusions: BA can have a larger impact on improving prediction models by helping not only in selecting suitable models, but also in selecting suitable feature reduction techniques.
AB - Background: Selecting a suitable feature reduction technique when building a defect prediction model can be challenging. Different techniques can select different independent variables, which affects the overall performance of the prediction model. To help with the selection, previous studies have assessed the impact of each feature reduction technique using different datasets. However, there are many reduction techniques, and therefore some well-known techniques have not been assessed by those studies. Aim: The goal of the study is to select a high-accuracy reduction technique from several candidates without preliminary assessments. Method: We used a bandit algorithm (BA) to help select the best feature reduction technique from a list of candidates. To select the best feature reduction technique, BA evaluates the prediction accuracy of the candidates by comparing the testing results of different modules with their prediction results. By substituting the reduction technique for the prediction method, BA can then be used to select the best reduction technique. In the experiment, we evaluated the performance of BA in selecting a suitable reduction technique. We performed cross-version defect prediction using 14 datasets. As feature reduction techniques, we used two assessed and two non-assessed techniques. Results: On average, the prediction accuracy with BA was higher than or equivalent to that of existing approaches, compared with techniques selected based on an assessment. Conclusions: BA can have a larger impact on improving prediction models by helping not only in selecting suitable models, but also in selecting suitable feature reduction techniques.
KW - external validity
KW - online optimization
KW - Software fault prediction
KW - variable selection
UR - http://www.scopus.com/inward/record.url?scp=85134070516&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85134070516&partnerID=8YFLogxK
U2 - 10.1145/3524842.3529093
DO - 10.1145/3524842.3529093
M3 - Conference contribution
AN - SCOPUS:85134070516
T3 - Proceedings - 2022 Mining Software Repositories Conference, MSR 2022
SP - 670
EP - 681
BT - Proceedings - 2022 Mining Software Repositories Conference, MSR 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 Mining Software Repositories Conference, MSR 2022
Y2 - 23 May 2022 through 24 May 2022
ER -