TY - JOUR
T1 - Optimum Tuning Parameter Selection in Generalized lasso for Clustering with Spatially Varying Coefficient Models
AU - Rahardiantoro, S.
AU - Sakamoto, Wataru
N1 - Funding Information: This research is supported by JICA (Japan International Cooperation Agency). Publisher Copyright: © Published under licence by IOP Publishing Ltd.
PY - 2022/1/17
Y1 - 2022/1/17
N2 - Spatial clustering with spatially varying coefficient models is useful for identifying regions in which variables have common effects in spatial data. This study focuses on selecting the optimum tuning parameter of the generalized lasso for clustering with the spatially varying coefficient model. k-fold cross-validation (CV) may fail to split spatial data into training and testing sets if a region contains only a few observations. Moreover, k-fold CV is known to give a biased estimate of the out-of-sample prediction error. Therefore, we investigated the performance of approximate leave-one-out cross-validation (ALOCV) in comparison with k-fold CV for selecting the tuning parameter in a simulation study on a 2-dimensional grid. ALOCV yielded a smaller error than k-fold CV and appropriately detected the edges whose differences were shrunk by the generalized lasso. ALOCV was then applied to the Chicago crime data to select the optimum tuning parameter of the generalized lasso in fitting the spatially varying coefficient model. The selection made by ALOCV was in accordance with the conclusion suggested in the preceding literature. Clustering into regions in advance to make k-fold CV feasible may lead to incorrect clustering results with a spatially varying coefficient model.
UR - http://www.scopus.com/inward/record.url?scp=85123821821&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85123821821&partnerID=8YFLogxK
U2 - 10.1088/1755-1315/950/1/012093
DO - 10.1088/1755-1315/950/1/012093
M3 - Conference article
AN - SCOPUS:85123821821
SN - 1755-1307
VL - 950
JO - IOP Conference Series: Earth and Environmental Science
JF - IOP Conference Series: Earth and Environmental Science
IS - 1
M1 - 012093
T2 - 2nd International Seminar on Natural Resources and Environmental Management, ISeNREM 2021
Y2 - 4 August 2021 through 5 August 2021
ER -