GDP Nowcasting with Machine Learning and Unstructured Data
DOI:
https://doi.org/10.21678/apuntes.99.2189
Keywords:
nowcasting, machine learning, GDP growth
Abstract
Nowcasting models based on machine learning (ML) algorithms offer a notable advantage for decision-making in the public and private sectors because they are flexible and can handle large amounts of data. This article introduces real-time forecasting models for the monthly growth rate of Peruvian GDP. The models combine structured macroeconomic indicators with high-frequency unstructured sentiment variables. The analysis spans January 2007 to May 2023 and covers a set of 91 leading economic indicators. Six ML algorithms were evaluated to identify the most effective predictors for each model. The findings show that ML models yield more precise and more forward-looking predictions than conventional time series models. Notably, the gradient boosting machine, LASSO, and elastic net models were the standout performers, reducing prediction errors by 20% to 25% relative to autoregression and various specifications of the dynamic factor model. These results could be influenced by the analysis period, which includes crisis events featuring high uncertainty, during which ML models with unstructured data gain importance.
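The error reduction quoted in the abstract is a relative measure: a model's forecast error is compared with that of a benchmark such as autoregression. The following is a minimal sketch of that arithmetic, assuming root mean squared error (RMSE) as the error metric, which the abstract does not specify. The function names and all numbers are hypothetical and chosen only to make the calculation easy to follow; they are not taken from the article.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def error_reduction(actual, benchmark_pred, model_pred):
    """Fractional RMSE reduction of a model relative to a benchmark.

    A positive value means the model has a lower RMSE than the benchmark.
    """
    return 1.0 - rmse(actual, model_pred) / rmse(actual, benchmark_pred)


# Made-up monthly growth rates in percent, for illustration only.
actual = [2.0, 1.0, -1.0, 3.0]
benchmark_pred = [4.0, 3.0, 1.0, 1.0]  # hypothetical benchmark, each error has size 2.0
model_pred = [3.5, 2.5, 0.5, 1.5]      # hypothetical ML nowcast, each error has size 1.5

reduction = error_reduction(actual, benchmark_pred, model_pred)
print(f"RMSE reduction vs benchmark: {reduction:.0%}")  # prints "RMSE reduction vs benchmark: 25%"
```

In this toy case the benchmark RMSE is 2.0 and the model RMSE is 1.5, so the ratio is 0.75 and the reduction is 25%. In a real-time evaluation the two prediction lists would come from out-of-sample nowcasts produced with only the data available at each forecast date.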
License
Copyright (c) 2025 Juan Tenorio

This work is licensed under a Creative Commons Attribution 4.0 International License.
Apuntes publishes all its articles and reviews under a Creative Commons Attribution (CC BY 4.0) license with the objective of promoting academic exchange worldwide. Therefore, articles and book reviews can be distributed, edited, amended, etc., as the author sees fit. The only condition is that the name of the author(s) and Apuntes. Revista de Ciencias Sociales (as the publisher) be cited.