Risk Analysis in Microfinance Using Machine Learning and Potential Integration with Artificial Intelligence Agent
DOI: https://doi.org/10.21678/jb.2026.2798
Keywords: Microfinance, credit risk, payment default, small businesses, machine learning, predictive modeling, artificial intelligence agents.
Abstract
This study proposes a comprehensive approach for the early detection of default risk in microfinance portfolios, combining machine learning techniques with historical analysis of clients’ payment behavior. It uses a database of more than 50,000 microcredits granted by a microfinance institution in Huancayo, Peru (2019–2021) and constructs a risk indicator based on the proportion of days in arrears relative to the agreed payment frequency, with a critical threshold of 25% of the installment period. This criterion separates clients with a higher propensity to default from those with only minor delays, so that small slippages are not penalized and the resulting labels are more informative.
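The risk indicator described above can be illustrated with a minimal Python sketch. The abstract gives only the ratio and the 25% threshold, so the names `Installment`, `arrears_ratio`, and `is_high_risk` are hypothetical, and both the strict comparison and the per-installment application are assumptions rather than the authors' exact rule.

```python
from dataclasses import dataclass

# Threshold taken from the abstract: 25% of the installment period.
ARREARS_THRESHOLD = 0.25


@dataclass
class Installment:
    """One scheduled payment (hypothetical structure for illustration)."""
    days_in_arrears: int  # days past the due date, 0 if paid on time
    period_days: int      # agreed payment frequency in days, e.g. 30 for monthly


def arrears_ratio(inst: Installment) -> float:
    """Days in arrears as a proportion of the agreed payment period."""
    if inst.period_days <= 0:
        raise ValueError("period_days must be positive")
    # Early payments (negative arrears) are treated as zero delay.
    return max(inst.days_in_arrears, 0) / inst.period_days


def is_high_risk(inst: Installment, threshold: float = ARREARS_THRESHOLD) -> bool:
    """Flag the installment when the delay exceeds the threshold.

    Assumption: the comparison is strict (ratio > threshold).
    """
    return arrears_ratio(inst) > threshold
```

Under these assumptions, a 5-day delay on a monthly installment (ratio of about 0.17) is not flagged, while a 2-day delay on a weekly installment (ratio of about 0.29) is, which shows how the indicator scales with payment frequency.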
The study focuses on microenterprises and informal entrepreneurs, traditionally excluded from formal banking. It provides predictive tools adapted to segments with limited credit history, fostering financial inclusion and strengthening risk management in microfinance institutions.
Four predictive models were evaluated, representing the main families of supervised learning: Gradient Boosting Machine (GBM) for Boosting, Bayesian Additive Regression Trees (BART) for Bayesian ensembles, Random Forest (RF) for Bagging, and Support Vector Machines (SVM) as optimal margin classifiers. This selection allows contrasting methodologies and identifying the most suitable approach for the microfinance context.
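A comparison of this kind might be set up as in the following sketch, which uses scikit-learn on synthetic, imbalanced data. It is not the authors' pipeline: the data, the hyperparameters, and the function name `compare_models` are assumptions for illustration, and BART is omitted because scikit-learn does not provide an implementation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import cohen_kappa_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_models(seed: int = 0) -> dict:
    """Fit three model families on synthetic data and score each one."""
    # Synthetic stand-in for a loan portfolio: about 20% positive (default) cases.
    X, y = make_classification(
        n_samples=600, n_features=10, n_informative=5,
        weights=[0.8, 0.2], random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        "GBM": GradientBoostingClassifier(random_state=seed),      # boosting
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),  # bagging
        # SVMs are scale-sensitive, so features are standardized first.
        "SVM": make_pipeline(StandardScaler(), SVC(class_weight="balanced")),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        scores[name] = {
            "kappa": cohen_kappa_score(y_test, pred),
            "f1": f1_score(y_test, pred),
        }
    return scores
```

Calling `compare_models()` returns one dictionary of scores per model. The numbers obtained on synthetic data say nothing about the study's results; the sketch only shows how the families can be evaluated on a common split.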
Supervised learning is appropriate because the data carry historical labels of default and non-default, so the resulting predictions apply directly to credit decision-making. Performance was assessed using Cohen’s Kappa, the Geometric Mean, and the F1-score. Results show that GBM delivers the most consistent performance across metrics, BART achieves the best F1-score, and SVM scores highest on the Geometric Mean. These findings support the effectiveness of supervised learning in segmenting credit risk and optimizing operational management, and they lay the foundation for incorporating artificial intelligence agents that monitor payments in real time and reduce losses from default.
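For reference, the three metrics named above can be computed from a binary confusion matrix as in the sketch below. The formulas are the standard definitions; the function name `classification_metrics` and the example counts are illustrative and are not taken from the study.

```python
import math


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Cohen's Kappa, Geometric Mean, and F1 from confusion-matrix counts.

    Undefined ratios (zero denominators) are returned as 0.0 by convention.
    """
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    # Cohen's Kappa: observed agreement corrected for chance agreement.
    p_observed = (tp + tn) / n
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected) if p_expected < 1 else 0.0

    # Geometric Mean of sensitivity (recall on defaults) and specificity.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    g_mean = math.sqrt(sensitivity * specificity)

    # F1-score: harmonic mean of precision and recall.
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

    return {"kappa": kappa, "g_mean": g_mean, "f1": f1}
```

With hypothetical counts `tp=40, fp=10, fn=20, tn=130`, the function gives a Kappa of 0.625, a Geometric Mean of about 0.787, and an F1-score of about 0.727. All three are less sensitive to class imbalance than raw accuracy, which matters when defaults are a minority of loans.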
Copyright (c) 2026 Diego Arriola León, Mohsen Ghodrat

This work is licensed under a Creative Commons Attribution 4.0 International License.
Journal of Business publishes all its articles and reviews under a Creative Commons Attribution (CC BY 4.0) license with the objective of promoting academic exchange worldwide. Therefore, articles and book reviews can be distributed, edited, amended, etc., as the author sees fit. The only condition is that the name of the author(s) and Apuntes. Revista de Ciencias Sociales (as the publisher) be cited.

