Precipitation Probability Prediction through NWP Bias Correction for South Korea Using Random Forest

Yun Am Seo, Jieun Cha


This study presents an effort to improve precipitation forecasts (> 0.1 mm/hr or > 0.1 mm/3hr) from the Local Data Assimilation and Prediction System (LDAPS) and the Global Data Assimilation and Prediction System (GDAPS) over South Korea by applying a Random Forest (RF) model. LDAPS and GDAPS are Numerical Weather Prediction (NWP) models operated by the Korea Meteorological Administration (KMA) for weather forecasting. GDAPS operates both the Unified Model (UM) and the Korean Integrated Model (KIM). This study used forecast data from LDAPS, GDAPS/KIM, and GDAPS/UM. Precipitation forecasts from LDAPS and GDAPS were corrected by training the RF on rain gauge observations from about 685 stations, with approximately 35 selected NWP output variables as inputs. To reflect recent trends in the bias between observations and NWP, the precipitation probability prediction model was designed for real-time learning using a sliding window technique. In addition, the precipitation data were imbalanced, with far fewer precipitation cases than non-precipitation cases, so an under-sampling method was applied to address this problem. Compared with the raw NWP precipitation forecasts, the proposed method improved the Critical Success Index (CSI) by 14.7-23.1% (LDAPS), 33.9% (GDAPS/KIM), and 6.7-38% (GDAPS/UM), and accuracy was also higher. In future research, automating the selection of the under-sampling rate to reflect recent weather trends is likely to improve forecast performance further.
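The three building blocks named in the abstract (sliding-window retraining, under-sampling of the majority class, and CSI verification) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `ratio`, `window`, and `horizon` parameters, and the choice to keep every precipitation case are all assumptions made for illustration, and the RF training step itself is left out so that the example needs only the Python standard library.

```python
import random


def critical_success_index(observed, predicted):
    """CSI = hits / (hits + misses + false alarms); correct negatives are ignored."""
    pairs = list(zip(observed, predicted))
    hits = sum(1 for o, p in pairs if o and p)
    misses = sum(1 for o, p in pairs if o and not p)
    false_alarms = sum(1 for o, p in pairs if not o and p)
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")


def undersample(labels, ratio=1.0, seed=0):
    """Return sorted sample indices after under-sampling the dry (label 0) class.

    Assumption: every precipitation case is kept and at most
    ratio * (number of precipitation cases) dry cases are drawn at random.
    """
    rng = random.Random(seed)
    rain = [i for i, y in enumerate(labels) if y]
    dry = [i for i, y in enumerate(labels) if not y]
    keep_dry = min(len(dry), int(round(ratio * len(rain))))
    return sorted(rain + rng.sample(dry, keep_dry))


def sliding_windows(n_samples, window, horizon=1):
    """Yield (train, test) index ranges.

    Each step trains on the most recent `window` samples and evaluates on
    the next `horizon` samples, then moves forward by `horizon`.
    """
    for end in range(window, n_samples - horizon + 1, horizon):
        yield range(end - window, end), range(end, end + horizon)
```

In use, each training range produced by `sliding_windows` would be thinned with `undersample` before fitting a classifier, and the resulting forecasts for the test range would be scored with `critical_success_index`. The `ratio` argument plays the role of the sampling rate that the abstract proposes to select automatically in future work.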


Precipitation forecast; imbalanced data; sliding window; bias correction








Published by INSIGHT - Indonesian Society for Knowledge and Human Development