Jonas Botz

Dynamic Ensemble Model for Short-term Forecasts in Pandemic Situations

PhD candidate Jonas describes how dynamic ensembles can help reduce bias in short-term forecasts in pandemic situations. This work is part of the AIOLOS project.

Motivation

In late 2019, the novel coronavirus SARS-CoV-2 emerged. It gave rise to the COVID-19 pandemic, which affected every aspect of human life, from an economic downturn and disruptions in education and social interactions to severe health consequences, including millions of deaths. Early on, governments struggled to balance containing the spread of the virus against keeping the economy, social life, and educational services running as far as possible. One indicator used for decision-making was the number of confirmed cases and, later, the number of hospitalizations. During that time, models were developed to forecast these numbers. Many of them, however, were and still are limited in how well they capture the dynamics of a pandemic, which makes their forecasts less reliable.

Ensemble Models – Multiple Base Models with Reduced Bias

In the past, so-called ensemble models have been used to forecast the spread of infectious diseases such as influenza or Ebola. In principle, an ensemble model is a collection of base models, each producing an output based on its own assumptions, plus an algorithm that either selects one of these outputs or combines them into a single ensemble output. The advantage of such an approach is that the bias of the individual models is reduced, making the final output more robust. In the literature, ensemble methods often use the mean or median of the base model outputs. However, pandemics are dynamic: there are phases in which the number of cases barely changes, phases of exponential growth and decay, and turning points of waves, all of which can depend on external factors such as interventions, people's behavior, seasonality, or variants of concern. A fixed rule such as the mean or median does not adapt to these changing phases.
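As a minimal sketch of the static combination rules mentioned above, the following Python snippet takes the mean or median of several base-model forecasts for each forecast day. The model names, the helper `static_ensemble`, and all numbers are hypothetical and only serve as an illustration; they are not taken from the study.

```python
from statistics import mean, median


def static_ensemble(forecasts, how="median"):
    """Combine base-model forecasts per forecast day with a fixed rule."""
    combine = {"mean": mean, "median": median}[how]
    # zip(*...) regroups the values so that each tuple holds
    # all models' forecasts for one day of the horizon.
    return [combine(day) for day in zip(*forecasts.values())]


# Hypothetical 3-day-ahead case forecasts from three base models.
forecasts = {
    "model_a": [100.0, 110.0, 120.0],
    "model_b": [90.0, 95.0, 100.0],
    "model_c": [130.0, 150.0, 170.0],
}

ensemble_median = static_ensemble(forecasts, "median")
ensemble_mean = static_ensemble(forecasts, "mean")
print(ensemble_median)
print(ensemble_mean)
```

Note that the rule is the same for every day and every phase of a wave: no model is ever preferred over another, regardless of how well it has recently performed.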

Dynamic Ensemble Model

To capture these dynamics and to be better prepared for future pandemics, we developed an ensemble model that uses a meta-model to adjust dynamically, either selecting the most suitable base model at a given time or weighting the models' predictions according to the current situation. First, the input data, i.e., surveillance data such as the number of confirmed cases, is fed into each base model. Then, each base model makes a short-term forecast, which is evaluated by comparing it with the observed values. The models' forecasts and performances are then used as input for our meta-model, which is trained in one of two ways: 1. to select one of the models (selection), or 2. to combine the models' predictions into one prediction (stacking). The advantage over simply taking the mean or median of the models is that the meta-model weights are adjusted for each testing period. We then take this a step further by including metadata as covariates in the meta-model. This metadata can be, for example, weather data or data from online sources such as social media or Google Trends, which have been shown to correlate with the surveillance data. This, in turn, can improve the performance of the meta-model. The full pipeline is shown in Figure 1.
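The difference between selection and stacking can be illustrated with a small Python sketch. It is not the trained meta-model described above: as a simple stand-in, it derives the weights directly from each model's recent mean absolute error (inverse-error weighting), and it leaves out the metadata entirely. All function names, model names, and numbers are hypothetical.

```python
def mean_absolute_error(forecast, observed):
    """Average absolute difference between forecast and observed values."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)


def select(errors):
    """Selection: return the name of the model with the smallest recent error."""
    return min(errors, key=errors.get)


def stacking_weights(errors, eps=1e-9):
    """Stacking stand-in: inverse-error weights, normalized to sum to 1.

    Assumption: a trained meta-model would learn these weights instead.
    """
    inverse = {name: 1.0 / (err + eps) for name, err in errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine(forecasts, weights):
    """Weighted sum of the base-model forecasts for each forecast day."""
    horizon = len(next(iter(forecasts.values())))
    return [
        sum(weights[name] * forecasts[name][day] for name in forecasts)
        for day in range(horizon)
    ]


# Hypothetical forecasts for a past period and the values observed afterwards.
past_forecasts = {"model_a": [100.0, 110.0], "model_b": [90.0, 95.0]}
observed = [98.0, 108.0]

# Evaluate each base model on the past period.
errors = {
    name: mean_absolute_error(forecast, observed)
    for name, forecast in past_forecasts.items()
}

best_model = select(errors)          # selection
weights = stacking_weights(errors)   # stacking

# Apply the updated weights to the base models' next forecasts.
new_forecasts = {"model_a": [120.0, 130.0], "model_b": [100.0, 105.0]}
ensemble_forecast = combine(new_forecasts, weights)
print(best_model, weights, ensemble_forecast)
```

Because the errors are recomputed for every evaluation period, the selected model and the weights can change over time, which is the property a fixed mean or median lacks.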

Read more (Preprint)

© Fraunhofer SCAI
Figure 1: Model Pipeline
The surveillance data is used as input for the base models, which output a short-term forecast. The forecast is then evaluated against the observed data. The metadata is fed into an LSTM to learn long-range temporal patterns. Next, the output of the LSTM is concatenated with the predictions and performances of the base models. This is used for training the meta-model, which can learn either to select one of the model outputs (selection) or to weight all model outputs and combine them into one ensemble output (stacking).

References:

  1. Coronavirus disease (COVID-19) pandemic [WWW Document], n.d. URL https://www.who.int/europe/emergencies/situations/covid-19 (accessed 1.5.24).
  2. Botz, J., Wang, D., Lambert, N., Wagner, N., Génin, M., Thommes, E., Madan, S., Coudeville, L., Fröhlich, H., 2022. Modeling approaches for early warning and monitoring of pandemic situations as well as decision support. Front Public Health 10, 994949. https://doi.org/10.3389/fpubh.2022.994949
  3. Reich, N.G., McGowan, C.J., Yamana, T.K., Tushar, A., Ray, E.L., Osthus, D., Kandula, S., Brooks, L.C., Crawford-Crudell, W., Gibson, G.C., Moore, E., Silva, R., Biggerstaff, M., Johansson, M.A., Rosenfeld, R., Shaman, J., 2019. Accuracy of real-time multi-model ensemble forecasts for seasonal influenza in the U.S. PLOS Computational Biology 15, e1007486. https://doi.org/10.1371/journal.pcbi.1007486
  4. Wang, D., Lentzen, M., Botz, J., Valderrama, D., Deplante, L., Perrio, J., Génin, M., Thommes, E., Coudeville, L., Fröhlich, H., 2023. Development of an early alert model for pandemic situations in Germany. Sci Rep 13, 20780. https://doi.org/10.1038/s41598-023-48096-3