Two-stage credit rating prediction using machine learning techniques
Abstract
Purpose
Credit ratings have become one of the primary references that financial institutions use to assess credit risk. Conventional credit rating approaches have mainly concentrated on two-class classification (i.e. good or bad credit), which lacks adequate precision for practical credit risk evaluation. In addition, most previous research has focussed directly on applying various data mining techniques, while few studies have discussed the influence of data preprocessing before classifier construction. The paper aims to discuss these issues.
Design/methodology/approach
This study applies nine-class classification (i.e. nine credit risk levels) to credit rating prediction. To develop more accurate classifiers, the paper adopts a two-stage analysis that integrates multiple data preprocessing and supervised learning techniques. Specifically, the first stage applies feature selection, data clustering, and data resampling methods to preprocess the data, and the second stage then uses several classification techniques and classifier ensembles to construct prediction models.
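As a rough illustration of the two-stage idea, the sketch below balances the classes by random oversampling (stage one) and then trains a bagged ensemble of small decision trees (stage two). It is not the authors' implementation: the abstract does not specify the resampling method, the tree algorithm, or any parameters, so the function names, the depth limit, the misclassification-count split criterion, and the toy data are all assumptions made for illustration. Feature selection and data clustering are omitted for brevity.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def oversample(rows, labels, rng):
    """Stage one: duplicate random minority rows until all classes are equal in size."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def fit_tree(rows, labels, depth=3):
    """Small decision tree. A node is either a label (leaf) or a tuple
    (feature, threshold, left_subtree, right_subtree); labels must not be tuples.
    Splits minimise the misclassification count, a simplifying assumption."""
    if depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    best = None
    for f in range(len(rows[0])):
        values = sorted({row[f] for row in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold: midpoint of neighbouring values
            left = [i for i, row in enumerate(rows) if row[f] <= t]
            right = [i for i, row in enumerate(rows) if row[f] > t]
            if not left or not right:
                continue
            errors = sum(
                len(side) - Counter(labels[i] for i in side).most_common(1)[0][1]
                for side in (left, right)
            )
            if best is None or errors < best[0]:
                best = (errors, f, t, left, right)
    if best is None:  # no usable split, e.g. every row is identical
        return majority(labels)
    _, f, t, left, right = best
    return (
        f,
        t,
        fit_tree([rows[i] for i in left], [labels[i] for i in left], depth - 1),
        fit_tree([rows[i] for i in right], [labels[i] for i in right], depth - 1),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_bagging(rows, labels, n_estimators, rng):
    """Stage two: train each tree on a bootstrap sample drawn with replacement."""
    n = len(rows)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_tree([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Combine the trees by majority vote."""
    return majority([predict_tree(model, row) for model in models])


# Hypothetical toy data: one made-up financial ratio per firm and three
# rating classes (the study itself uses nine).
rows = [[9.0], [9.2], [9.5], [10.0], [5.0], [5.5], [1.0]]
labels = ["AAA", "AAA", "AAA", "AAA", "BBB", "BBB", "CCC"]

rng = random.Random(0)
balanced_rows, balanced_labels = oversample(rows, labels, rng)
models = fit_bagging(balanced_rows, balanced_labels, n_estimators=15, rng=rng)
```

A new firm would then be rated with a call such as `predict_bagging(models, [9.4])`. Resampling before bagging means each bootstrap sample is drawn from a balanced pool, so rare rating classes are less likely to be missing from an individual tree's training data.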
Findings
The results show that Bagging-DT with the data resampling method achieves excellent accuracy (82.96 percent), indicating that the proposed two-stage prediction model outperforms conventional one-stage models.
Originality/value
The practical implications of this study include lower credit rating expenses and instant access to credit rating information for corporations.
Citation
Wu, H.-C., Hu, Y.-H. and Huang, Y.-H. (2014), "Two-stage credit rating prediction using machine learning techniques", Kybernetes, Vol. 43 No. 7, pp. 1098-1113. https://doi.org/10.1108/K-10-2013-0218
Publisher
Emerald Group Publishing Limited
Copyright © 2014, Emerald Group Publishing Limited