Performance Evaluation of Machine Learning Approaches for Credit Scoring

Anqi Cao; Hongliang He; Zixuan Chen; Wenyu Zhang

doi:10.11648/j.ijefm.20180606.12

Peer-Reviewed

Published in International Journal of Economics, Finance and Management Sciences (Volume 6, Issue 6)

Received: 9 December 2018 Published: 11 December 2018

Abstract

Assessing financial credit risk is of great importance in both accounting and finance. Financial institutions need to keep credit default risk at an acceptable level in order to achieve higher profitability. With the rapid development of data science, many machine learning methods have been applied to make accurate predictions from information extracted from diverse data sources. This study applies data mining techniques to gather evidence on which classifier performs best in credit scoring. Two datasets are used in the analysis: the “Give Me Some Credit” dataset and the “PPDai” dataset. Eight classification methods are compared: Linear Discriminant Analysis (LDA), Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), eXtreme Gradient Boosting (XGBoost) and Multi-Layer Perceptron (MLP). Three indicators (accuracy, AUC and logistic loss) are used to analyze the performance of each classifier. The experimental results indicate that the XGBoost classifier outperforms the other seven models. The results also offer practical value to financial institutions in choosing an appropriate classifier, so that they can make sound judgements on real-world credit decisions.
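
The three indicators named in the abstract can be illustrated with a minimal sketch for a binary default / non-default task. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions chosen for illustration. Higher accuracy and AUC are better, while a lower logistic loss is better.

```python
import math

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction equals the label.
    The 0.5 threshold is an assumed default, not taken from the paper."""
    correct = sum((p >= threshold) == (y == 1) for y, p in zip(y_true, y_prob))
    return correct / len(y_true)

def auc(y_true, y_prob):
    """Area under the ROC curve, computed as the probability that a random
    positive is scored above a random negative (ties count as 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels; probabilities are
    clipped to [eps, 1 - eps] to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

# Toy data (hypothetical): 1 = default, 0 = no default.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]  # predicted probability of default

print(accuracy(y_true, y_prob))  # 0.75
print(auc(y_true, y_prob))       # 0.75
print(round(log_loss(y_true, y_prob), 4))
```

In a comparison like the one described, each of the eight classifiers would produce its own `y_prob` on a held-out set, and the three values would be tabulated per classifier.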

Page(s)	255-260
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2018. Published by Science Publishing Group

Keywords

Data Mining, Credit Scoring, Machine Learning, Performance Evaluation

Cite This Article

APA Style

Anqi Cao, Hongliang He, Zixuan Chen, Wenyu Zhang. (2018). Performance Evaluation of Machine Learning Approaches for Credit Scoring. International Journal of Economics, Finance and Management Sciences, 6(6), 255-260. https://doi.org/10.11648/j.ijefm.20180606.12

ACS Style

Anqi Cao; Hongliang He; Zixuan Chen; Wenyu Zhang. Performance Evaluation of Machine Learning Approaches for Credit Scoring. Int. J. Econ. Finance Manag. Sci. 2018, 6(6), 255-260. doi: 10.11648/j.ijefm.20180606.12

AMA Style

Anqi Cao, Hongliang He, Zixuan Chen, Wenyu Zhang. Performance Evaluation of Machine Learning Approaches for Credit Scoring. Int J Econ Finance Manag Sci. 2018;6(6):255-260. doi: 10.11648/j.ijefm.20180606.12

BibTeX

@article{10.11648/j.ijefm.20180606.12,
  author = {Anqi Cao and Hongliang He and Zixuan Chen and Wenyu Zhang},
  title = {Performance Evaluation of Machine Learning Approaches for Credit Scoring},
  journal = {International Journal of Economics, Finance and Management Sciences},
  volume = {6},
  number = {6},
  pages = {255-260},
  doi = {10.11648/j.ijefm.20180606.12},
  url = {https://doi.org/10.11648/j.ijefm.20180606.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijefm.20180606.12},
  year = {2018}
}

RIS

TY  - JOUR
T1  - Performance Evaluation of Machine Learning Approaches for Credit Scoring
AU  - Anqi Cao
AU  - Hongliang He
AU  - Zixuan Chen
AU  - Wenyu Zhang
Y1  - 2018/12/11
PY  - 2018
N1  - https://doi.org/10.11648/j.ijefm.20180606.12
DO  - 10.11648/j.ijefm.20180606.12
T2  - International Journal of Economics, Finance and Management Sciences
JF  - International Journal of Economics, Finance and Management Sciences
JO  - International Journal of Economics, Finance and Management Sciences
SP  - 255
EP  - 260
PB  - Science Publishing Group
SN  - 2326-9561
UR  - https://doi.org/10.11648/j.ijefm.20180606.12
VL  - 6
IS  - 6
ER  -

Author Information

Anqi Cao
Dongfang College, Zhejiang University of Finance and Economics, Hangzhou, China

Hongliang He
School of Information, Zhejiang University of Finance and Economics, Hangzhou, China

Zixuan Chen
School of Information, Zhejiang University of Finance and Economics, Hangzhou, China

Wenyu Zhang
School of Information, Zhejiang University of Finance and Economics, Hangzhou, China
