TY - GEN
T1 - Which Metrics Should Researchers Use to Collect Repositories
T2 - 20th IEEE International Conference on Software Quality, Reliability, and Security, QRS 2020
AU - Yamamoto, Kai
AU - Kondo, Masanari
AU - Nishiura, Kinari
AU - Mizuno, Osamu
N1 - Funding Information:
This work has been supported by JSPS KAKENHI Japan (Grant Numbers: JP19J23477).
Publisher Copyright:
© 2020 IEEE.
PY - 2020/12
Y1 - 2020/12
AB - GitHub is a huge, publicly available development platform for hosting a version control system based on Git; software developers prefer to host their various software projects on GitHub. Therefore, researchers who are interested in mining software repositories frequently use GitHub to collect software projects as datasets. GitHub provides us with repository metrics such as popularity, contribution, and interest. We believe that such metrics are related to the quality of software, and we use them to select the repositories to study according to our research purpose. However, to the best of our knowledge, no evidence supports this assumption. Our main purpose is to provide researchers who study software quality, especially issue management, with repository metrics for selecting appropriate repositories for their studies. In this paper, we study how the characteristics of repositories' issue pages relate to the repository metrics used to select them, in order to identify the best metric for selecting suitable repositories. The following findings are the highlights of our study: (1) The number of contributors, which indicates the number of developers who contribute to a GitHub repository, can be used to select repositories whose issue pages are well maintained. More specifically, such repositories have more issues, and their developers use labels more frequently, than repositories selected by other metrics. (2) The number of dependencies selects repositories that have fewer issues, and in which developers use labels less often, than repositories selected by other metrics.
KW - GitHub
KW - issue
KW - issue tracking system
KW - Libraries.io
KW - repository
UR - http://www.scopus.com/inward/record.url?scp=85099280962&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099280962&partnerID=8YFLogxK
U2 - 10.1109/QRS51102.2020.00065
DO - 10.1109/QRS51102.2020.00065
M3 - Conference contribution
AN - SCOPUS:85099280962
T3 - Proceedings - 2020 IEEE 20th International Conference on Software Quality, Reliability, and Security, QRS 2020
SP - 458
EP - 466
BT - Proceedings - 2020 IEEE 20th International Conference on Software Quality, Reliability, and Security, QRS 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 11 December 2020 through 14 December 2020
ER -