International Journal of Forecasting 39 (2023) 1820-1838
Journal homepage: www.elsevier.com/locate/ijforecast
https://doi.org/10.1016/j.ijforecast.2022.09.002

Macroeconomic forecasting in the euro area using predictive combinations of DSGE models

Jan Čapek (a), Jesus Crespo Cuaresma (b, a, c, d, *), Niko Hauzenberger (e, a), Vlastimil Reichel (a)

a Masaryk University, Czech Republic
b Vienna University of Economics and Business, Austria
c IIASA, Austria
d WIFO, Austria
e University of Salzburg, Austria

* Corresponding author at: Vienna University of Economics and Business, Austria. E-mail address: jcrespo@wu.ac.at (J. Crespo Cuaresma).

The authors would like to thank two anonymous referees for very helpful comments on an older version of the paper. Financial support from the Czech Science Foundation, Grants 17-14263S and 21-10562S, is gratefully acknowledged. This work was supported by the Ministry of Education, Youth, and Sports of the Czech Republic through the e-INFRA CZ (ID: 90140). Jesus Crespo Cuaresma gratefully acknowledges funding from IIASA, Austria and the National Member Organizations, Austria that support the institute. Niko Hauzenberger gratefully acknowledges financial support from the Jubiläumsfonds of the Oesterreichische Nationalbank (OeNB, grant no. 18718).

Keywords: Forecasting; Model averaging; Prediction pooling; DSGE models; Macroeconomic variables

Abstract

We provide a comprehensive assessment of the predictive power of combinations of dynamic stochastic general equilibrium (DSGE) models for GDP growth, inflation, and the interest rate in the euro area. We employ a battery of static and dynamic pooling weights based on Bayesian model averaging principles, prediction pools, and dynamic factor representations, and entertain six different DSGE specifications and five prediction weighting schemes. Our results indicate that exploiting mixtures of DSGE models produces competitive forecasts compared to individual specifications for both point and density forecasts over the last three decades. Although these combinations do not tend to systematically achieve superior forecast performance, we find improvements for particular periods of time and variables when using prediction pooling, dynamic model averaging, and combinations of forecasts based on Bayesian predictive synthesis.

© 2022 The Author(s). Published by Elsevier B.V. on behalf of International Institute of Forecasters. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

1. Introduction

Dynamic stochastic general equilibrium (DSGE) models have become the workhorse of modern macroeconomic research, due to their internal consistency and their ability to assess the effects of policy shocks in a rigorous manner. In spite of their importance in modern economic analysis, the existing results concerning their out-of-sample forecasting ability are mixed.

A series of studies have assessed the predictive ability of different types of DSGE models. Christoffel et al. (2011) examine the out-of-sample predictive ability of the European Central Bank's New Area-Wide Model (NAWM), the DSGE model used to create projections of macroeconomic variables by the monetary authorities of the euro area. The results in Christoffel et al. (2011) indicate that this DSGE model, as compared to other alternative reduced-form specifications, provides good predictions for a set of 12 different macroeconomic variables. The predictive accuracy of DSGE models, however, does not necessarily remain stable over time. Del Negro et al. (2016) provide evidence that forecasts produced using a Smets-Wouters type of DSGE model (Smets & Wouters, 2003, 2007) with financial frictions perform particularly well in periods of financial turmoil (in particular in the Great Recession), but that the predictive accuracy of the model tends to suffer in tranquil periods. The forecast quality of DSGE structures that include financial frictions has also been assessed by Kolasa and Rubaszek (2015), and improvements in forecast ability are reported in episodes of financial turmoil when housing market frictions are included in the model, although no systematic gains in predictive performance are found in more stable periods.

Another strand of the literature on macroeconomic forecasting has shown interest in analyzing predictive combinations based on a wide range of models, rather than focusing on a single specification, an idea that dates back to the work by Bates and Granger (1969). Amisano and Geweke (2017), for instance, find improvements in out-of-sample prediction errors for macroeconomic variables in the US by pooling forecasts from different macroeconomic models using Bayesian predictive distributions.

In this study, we evaluate the forecast ability of weighted combinations of six different DSGE models for GDP growth, inflation, and the interest rate in the euro area, making use of several prediction combination techniques. Our analysis expands the work by Wolters (2015), which assesses the forecast accuracy of four DSGE models for the US, as well as the potential predictive gains obtained by using combinations of these. We entertain six different DSGE specifications for the euro area and five forecast combination methods, both static and dynamic, and evaluate point forecasts as well as density predictions.
Our set of prediction combination techniques contains some of the forecast pooling techniques entertained in existing studies for DSGE models (Wolters, 2015, for example), as well as more novel methods based on the optimization of weights that can potentially be time-varying and evolve according to flexible laws of motion. In particular, we use static weights based on principles of Bayesian model averaging and prediction pools, and dynamic weights that build upon dynamic (latent) factor representations of the variables of interest. The combination techniques employed in our analysis result in significantly different weighting schemes across models. While dynamic Bayesian model averaging and combinations based on dynamic factors lead to pooled forecasts which assign positive weights to all of the DSGE specifications, the technique based on prediction pools acts as a dynamic model selection tool, assigning weights close to zero to most individual model predictions over the out-of-sample period. The potential gains in predictive accuracy that can be exploited are specific to sub-periods, variables, and forecasting horizons, and no single predictive combination strategy ensures systematic improvements in all situations.

The rest of this paper is organized as follows. Section 2 introduces the DSGE models used in the analysis, and Section 3 presents the predictive density combination methods. Section 4 shows the results of the out-of-sample forecasting exercise, and Section 5 concludes.

2. The battery of DSGE models

2.1. Individual DSGE models

For our empirical analysis, we use a battery of DSGE models for the euro area. Their specifications differ in size, complexity, and the particular features highlighted.
Since the analysis is conducted on a set of three core macroeconomic variables (GDP growth, inflation, and the interest rate), we ensure that these three observable variables are common across all models. The sparsest model entertained is a basic three-equation New Keynesian model, which serves as a benchmark in terms of simplicity. The model presented in Cogley et al. (2010) also requires only three basic observable variables, but introduces two additional shocks and allows the inflation target to change over time. The specification by Christensen and Dib (2008) adds investment and money as additional observable variables. This group of models is extended with three more complex specifications that share the set of observable variables of the model by Smets and Wouters (2007): GDP growth, inflation, the interest rate, consumption growth, investment growth, real wage growth, and hours worked. The specification by Justiniano et al. (2011) contains the relative price of consumption to investment as the eighth observable variable, whereas Del Negro et al. (2015) add spread and inflation expectations as observable variables to the modeling framework and allow the shocks to be of a non-stationary nature.

The group of DSGE specifications used spans model structures which differ in the mechanisms highlighted for the transmission of macroeconomic shocks. Tracking the predictive ability of such models over time can thus help us grasp changes in the relative importance of particular theoretical channels as determinants of macroeconomic dynamics in the euro area. Table 1 lists the models entertained, together with their corresponding abbreviations (which are used in the description of the results of the analysis and in all subsequent figures and tables), and summarizes information about the number of observable variables, the number of exogenous shocks, and the main features of each model.
The particular observable variables included in each one of the DSGE models are presented in Table 2.

2.2. Data

The models in Table 1 are estimated using quarterly data for the euro area in its 19-country composition. The database spans information from 1970Q3 to 2019Q1 and thus contains 195 quarterly observations. The core of the database is sourced from the Area Wide Model (AWM) presented in Fagan et al. (2005) and updated and extended by Brand and Toulemonde (2015). The original AWM database is updated using data from the European Central Bank or Eurostat since the 1990s and is extended by population and hours worked from the Total Economy Database and Eurostat. Data on monetary aggregates are obtained directly from the OECD. We use time series compiled by Gilchrist and Mojon (2018) for the interest rate spread variable. Inflation expectations are sourced from the European Central Bank's Survey of Professional Forecasters. The longest-term forecast available was selected, which spans four to five years ahead. Growth rates are calculated as quarter-on-quarter differences of logs, and the interest rate is calculated per quarter. Details on the sources of the different variables are provided in Appendix A.

The data transformations applied to the model variables correspond to those used in Smets and Wouters (2007). Real consumption, investment, and GDP are divided by population and transformed to growth rates. Hours worked are divided by population and logged. Inflation is defined as the growth rate of the GDP deflator. The nominal wage is deflated by the GDP deflator and transformed to growth rates. The interest rates are short-term market interest rates. The monetary aggregate M1 is deflated by the GDP deflator, divided by population, and transformed to growth rates. Finally, the relative price of investment is calculated as the investment deflator divided by the consumption deflator, and transformed to growth rates.

Table 1
Euro area DSGE models used in the forecasting exercise.

Reference                    Name        Observables  Shocks  Features
3-equation basic NK model    NKModel     3            3       IS, PC, Taylor rule
Cogley et al. (2010)         CPS 2010    3            5       Inflation target can change over time
Christensen and Dib (2008)   CD 2008     5            5       Financial frictions as in Bernanke et al. (1999)
Smets and Wouters (2007)     SW 2007     7            7       Deterministic growth rate driven by labor-augmenting technological progress
Justiniano et al. (2011)     JPT 2011    8            8       Two investment shocks
Del Negro et al. (2015)      DNGS 2015   9            9       Smets and Wouters (2007) with financial frictions + spread and inflation expectations as observables

Table 2
DSGE models: Observable variables (x = included).

Variable                   CPS 2010  NKModel  CD 2008  SW 2007  JPT 2011  DNGS 2015
Output                     x         x        x        x        x         x
Inflation                  x         x        x        x        x         x
Interest rate              x         x        x        x        x         x
Consumption                                            x        x         x
Investment                                    x        x        x         x
Hours worked                                           x        x         x
Wage                                                   x        x         x
Money supply (M1)                             x
Relative investment price                                       x
Spread                                                                    x
Inflation expectations                                                    x

2.3. Detrending macroeconomic variables

The macroeconomic variables used in the estimation of DSGE specifications are often highly persistent and need to be detrended using methods that are consistent with the theoretical framework used in the model. For some existing models, the authors specify the particular filter employed to detrend the variables, while in other cases, these details are not specified (see, e.g., Gorodnichenko & Ng, 2010).
Delle Chiaie (2009) investigates the effects of detrending observable variables with the Hodrick-Prescott (HP) filter and a linear trend in the model by Smets and Wouters (2003), and finds that structural parameter estimates are rather sensitive to the choice of a particular filtering method. Consequently, forecasting performance may be significantly affected by the choice of a detrending approach.

The original contributions on which we base our individual specifications use different detrending methods for the macroeconomic variables. While Christensen and Dib (2008) use the HP filter, Smets and Wouters (2007), and models that build upon a similar structure, introduce some of the observable variables in first differences when estimating the parameters of the DSGE specification. Gorodnichenko and Ng (2010) offer a perspective on the detrending approaches commonly used in the broader literature by compiling the detrending methods employed in 21 different models. The list of data filters used in various DSGE models shows a predominance of linear detrending, HP filtering, and first difference transformations.

Our analysis employs several approaches used in the literature, while keeping the detrending method identical across all models considered. By doing so, we aim to separate the influence on forecasting performance of core model features, such as financial frictions or flexible inflation targets, from that of the trend formulation. In our baseline detrending specification, we use the data for GDP (and its sub-components) in first differences. Time series which present higher persistence are filtered using one-sided HP filters.^1 Alternatively, we also assess the forecasting performance of our models employing the filtering strategy proposed by Del Negro et al. (2015), Justiniano et al. (2011), and Smets and Wouters (2007) and find evidence that our baseline detrending approach leads to superior forecasting performance in the majority of cases (see Table C.4 in Appendix C).
We also perform the analysis using different detrending approaches, such as using the (one-sided) HP filter for all data series, employing the regression-based data filter introduced in Hamilton (2018), and demeaning the time series in the models. The results for these alternative detrending specifications can be found in Appendix C.

2.4. Estimation and predictive densities

Each one of the models employed in the forecasting exercise is estimated recursively using Bayesian methods, starting with a sample of 78 observations (corresponding to the time frame 1970Q3-1989Q4) and adding one quarter at a time to the sample up to a maximum of 195 observations (corresponding to the full sample, which spans the period 1970Q3-2019Q1). Additionally, we repeat the forecasting exercise estimating the models with a rolling window of 60 observations. The models are estimated using a minimum of a half million Metropolis-Hastings replications in two chains for the NKModel, one million replications for CD 2008 and CPS 2010, two million replications for JPT 2011 and SW 2007, and a minimum of four million replications for the DNGS 2015 model. To ensure convergence of the Markov chain, the checks in Brooks and Gelman (1998) are performed and, if these fail, the number of replications is increased until convergence of the posterior distributions is achieved. We use a Monte Carlo-based optimization routine to ensure that the optimal acceptance ratio of the Metropolis-Hastings algorithm is reached, and we discard 90% of the replications as burn-ins.^2 Forecasts are computed using 10,000 draws from the posterior distribution for every estimated model and each in-sample period. In each instance, we calculate one- to four-step-ahead out-of-sample forecasts of GDP growth, inflation, and the interest rate, which correspond to periods ranging from 1990Q1 to 2018Q4.
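The sample sizes quoted above can be checked with a few lines of quarter arithmetic. The following is only a minimal sketch in plain Python; the helper names `quarter_index` and `n_quarters` are hypothetical and are not taken from the paper or from any estimation toolbox.

```python
def quarter_index(year, quarter):
    """Map (year, quarter) to a running integer index (hypothetical helper)."""
    return 4 * year + (quarter - 1)


def n_quarters(start, end):
    """Number of quarterly observations from start to end, both inclusive."""
    return quarter_index(*end) - quarter_index(*start) + 1


# Initial estimation sample and full database, as described in the text.
first_sample = n_quarters((1970, 3), (1989, 4))
full_sample = n_quarters((1970, 3), (2019, 1))

# Recursive (expanding-window) scheme: one additional quarter per estimation round.
sample_sizes = list(range(first_sample, full_sample + 1))
```

Under these assumptions, `first_sample` and `full_sample` reproduce the 78 and 195 observations stated above, and `sample_sizes` enumerates the length of each recursively enlarged estimation sample.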
The analysis of forecasts is conducted after re-imposing the trend of the observable variables, so as to ensure comparability across detrending approaches.

^2 All models are estimated using Dynare (Adjemian et al., 2011).

3. Predictive combinations of DSGE models

In this section, we outline the forecast combination methods employed to average the predictions of our set of models. Each DSGE model typically includes a different set of observables, targets a specific feature of the economy, and thus provides its own characterization of the economy by imposing different (structural) dynamics on the macroeconomic variables of interest. Some small-scale DSGE models abstract from the interaction between developments in the real economy, the labor market, and the financial sectors, while others include features and mechanisms related to these linkages. We concentrate exclusively on one- and four-step-ahead predictive densities of GDP growth, inflation, and the interest rate, which are common to all the specifications used. We assess and combine the joint predictive density of these three variables, as well as their corresponding marginal predictive densities.

In the following, we illustrate the methods we use to combine predictive densities by focusing on a scalar time series $y_{t+1}$ and the one-step-ahead horizon. With only minor adjustments, these techniques work analogously for the joint predictive densities and for the multi-step-ahead horizon. In our analysis, we therefore consider predictive densities for $y_{t+1}$, which are available from $K$ different DSGE models. Each DSGE model, indexed by $j = 1, \dots, K$, incorporates information up to time $t$ to generate a predictive density $p_j(y_{t+1} \mid \mathcal{I}_j(t))$ for period $t+1$.
The information set $\mathcal{I}_j(t)$ typically consists of the target variable $y_{1:t} = (y_1, \dots, y_t)'$, as well as of the information up to time $t$ provided by additional variables specific to that particular DSGE model. We aim to combine the $K$ predictive densities $\{p_j(y_{t+1} \mid \mathcal{I}_j(t))\}_{j=1}^{K}$ using a $K \times 1$ weights vector $\omega_{t+1} = (\omega_{1,t+1}, \dots, \omega_{K,t+1})'$ that is specific to the one-step-ahead forecast horizon and potentially time-varying. The combined predictive density for $y_{t+1}$ is then given by

$$p(y_{t+1} \mid \mathcal{I}_1(t), \dots, \mathcal{I}_K(t)) = \sum_{j=1}^{K} \omega_{j,t+1} \, p_j(y_{t+1} \mid \mathcal{I}_j(t)). \qquad (1)$$

Eq. (1) directly relates to the Bayesian predictive synthesis of McAlinn et al. (2019) and McAlinn and West (2019), where $\omega_{t+1}$ is described as a dynamic synthesis function.^3 This synthesis function can incorporate different objectives based on policy targets and historical performance up to period $t$, and nests traditional approaches to forecast combination, such as prediction pools (Geweke & Amisano, 2011; Hall & Mitchell, 2007) and Bayesian dynamic model averaging (Koop & Korobilis, 2012, 2013; Raftery et al., 2010). We start by discussing a simple static weighting scheme implying $\omega_{t+1} = \omega$, and then turn to more general approaches based on using dynamic weights for the predictive densities.

Equal static weights

An obvious starting point to combine predictions from different DSGE models, which provides a benchmark to evaluate different weighting schemes, is to use

$$\omega_{j,t+1} = \frac{1}{K}, \quad \text{for } j = 1, \dots, K.$$

Since $\omega_{j,t+1} > 0$ and $\sum_{j=1}^{K} \omega_{j,t+1} = 1$, the combination of predictive densities also constitutes a predictive density (Geweke & Amisano, 2011; Hall & Mitchell, 2007). This agnostic approach neglects the fact that different models might not be equally suitable for prediction at different time periods, and does not provide updates of the corresponding weights as information is gained about the differential predictive ability of model specifications.
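As an illustration of Eq. (1) with equal static weights, the sketch below mixes three predictive densities and evaluates the combination at one point. It is a minimal example under strong assumptions: the individual predictive densities are taken to be Gaussian with made-up means and standard deviations, whereas the densities produced by the DSGE models need not have this form. All names (`normal_pdf`, `combined_density`, `models`) are hypothetical.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def combined_density(y, weights, densities):
    """Weighted mixture of model-specific predictive densities, as in Eq. (1)."""
    return sum(w * p(y) for w, p in zip(weights, densities))


# Hypothetical (mean, standard deviation) pairs standing in for K = 3 models.
models = [(0.4, 0.5), (0.6, 0.7), (0.1, 0.6)]
densities = [lambda y, m=m, s=s: normal_pdf(y, m, s) for m, s in models]

K = len(densities)
equal_weights = [1.0 / K] * K  # omega_j = 1/K for every model

# Combined predictive density evaluated at a candidate value y = 0.5.
value_at_half = combined_density(0.5, equal_weights, densities)
```

Because the weights are positive and sum to one, the mixture integrates to one, which is the property that makes the combination itself a predictive density.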
An equal weighting scheme is commonly found to be a good competitor in terms of out-of-sample forecasting accuracy, as it tends to hedge against large forecast errors of single specifications (see Timmermann, 2006).

Dynamic Bayesian model averaging

A natural choice of model weights can be achieved by pooling forecasts according to particular model selection criteria (for example, based on the predictive marginal likelihood or past forecast performance). For a given set of priors over specifications, traditional Bayesian model averaging (BMA) approaches give models with a higher marginal likelihood more support while downweighting models with deficient predictive characteristics. Following Raftery et al. (2010) and Koop and Korobilis (2012, 2013), we consider posterior weights for individual specifications based on their (discounted) historical predictive likelihood, a procedure known as dynamic model averaging (DMA). According to this literature, DMA consists of a prediction equation

$$\omega_{j,t+1|t} = \frac{\omega_{j,t|t}^{\delta}}{\sum_{k=1}^{K} \omega_{k,t|t}^{\delta}}$$

and an updating equation

$$\omega_{j,t+1|t+1} = \frac{\omega_{j,t+1|t} \, p_j\left(y^{(r)}_{t+1} \mid \mathcal{I}_j(t)\right)}{\sum_{k=1}^{K} \omega_{k,t+1|t} \, p_k\left(y^{(r)}_{t+1} \mid \mathcal{I}_k(t)\right)}.$$

Here, $\omega_{t+1|t+1} = (\omega_{1,t+1|t+1}, \dots, \omega_{K,t+1|t+1})'$ is a $K \times 1$ vector of updated weights, and $p_j(y^{(r)}_{t+1} \mid \mathcal{I}_j(t))$ refers to the one-step-ahead predictive density for model $j$ evaluated at the realized value $y^{(r)}_{t+1}$ (i.e., the predictive likelihood).^4 Moreover, a forgetting factor $\delta \in (0, 1)$ discounts predictive performance further in the past more heavily, so that more recent predictive likelihoods receive more weight.

^3 Del Negro et al. (2016) and McAlinn and West (2019) provide a formal treatment of the decision problem concerning the choice of time-varying weights $\omega_{t+1}$.
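The prediction and updating equations of DMA can be sketched in a few lines. The code below is an illustrative implementation under stated assumptions, not the authors' code: the recursion is started from equal weights (an assumption), the function names are hypothetical, and the predictive likelihoods in the example are made-up numbers.

```python
def dma_predict(weights, delta):
    """Prediction step: raise weights to the power delta and renormalize."""
    powered = [w ** delta for w in weights]
    total = sum(powered)
    return [p / total for p in powered]


def dma_update(predicted_weights, predictive_likelihoods):
    """Updating step: reweight by each model's predictive likelihood."""
    joint = [w * lik for w, lik in zip(predicted_weights, predictive_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def dma_weights(likelihood_history, delta):
    """Run the DMA recursion over a history of predictive likelihoods.

    likelihood_history[t][j] is model j's predictive likelihood in period t.
    Returns the predicted weights used in each period and the weights
    predicted for the first period after the history ends.
    """
    n_models = len(likelihood_history[0])
    updated = [1.0 / n_models] * n_models  # assumed equal initial weights
    path = []
    for likelihoods in likelihood_history:
        predicted = dma_predict(updated, delta)
        path.append(predicted)
        updated = dma_update(predicted, likelihoods)
    return path, dma_predict(updated, delta)


# Hypothetical predictive likelihoods for two models over three quarters.
history = [[0.20, 0.10], [0.30, 0.10], [0.05, 0.25]]
weight_path, next_weights = dma_weights(history, delta=0.95)
```

With a forgetting factor below one, the prediction step shrinks the weights toward equality before each update, so a model that predicted well only in the distant past gradually loses its advantage.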
In the empirical application, we set $\delta = 0.95$, implying that the predictive likelihood four quarters (i.e., one year) in the past receives around 80% of the weight of the predictive likelihood of the most recent quarter.^5 The DMA algorithm, moreover, is easy to implement without the need for any simulation techniques.

Prediction pools

Recent approaches to forecast combination assess the set of model-specific forecasts as if it was a portfolio of predictions, which must be chosen optimally with respect to a particular loss function (see, inter alia, Geweke & Amisano, 2011, 2012; Hall & Mitchell, 2007; Pettenuzzo & Ravazzolo, 2016). Following Geweke and Amisano (2011), the loss function is defined as a function of historical log predictive scores, which gives rise to optimal weights after minimization. Similar to BMA and DMA methods, this approach ensures that forecasts from DSGE models with poor predictive abilities are downweighted, and those computed from specifications that predict more successfully receive higher weights. Information up to time $t$ is available in order to choose the predictive weight $\omega_{t+1|t}$. By construction, both [...]

$$\omega_{t+1|t} = \arg\min_{\omega} \; -\sum_{\tau=1}^{t} \delta^{t-\tau} \log\left( \sum_{j=1}^{K} \omega_j \, p_j\left(y^{(r)}_{\tau} \mid \mathcal{I}_j(\tau-1)\right) \right),$$

where $\delta$ again denotes a discount factor that serves the same purpose as in the DMA procedure by assigning increasing weight to the most recent predictive performance. We additionally impose the restriction that weights are non-negative and sum to one. Note that we use standard numerical optimization algorithms for prediction pools, which are therefore easy to implement and computationally fast.

Bayesian predictive synthesis with a dynamic factor model

As noted by Del Negro et al. (2016), the predictive ability of particular specifications may be affected by structural breaks in the parameters governing the dynamics of macroeconomic variables.
Such changes in predictive power should be addressed when combining the $K$ predictive densities over time, and thus the mapping from the forecasts of each model to the combined predictive density should be adjusted accordingly. Eq. (1) can be directly related to a dynamic factor model, as proposed by McAlinn and West (2019) in the context of dynamic Bayesian predictive synthesis (BPS) methods, by defining the synthesis function as

$$y^{(r)}_{t+1} = F_{t+1}' \omega_{t+1} + \epsilon_{t+1}, \quad \epsilon_{t+1} \sim \mathcal{N}(0, \varsigma),$$

where we define the latent factors $F_{t+1} = (y_{1,t+1}, \dots, y_{K,t+1})'$ with $y_{j,t+1}$, for $j = 1, \dots, K$, being a draw from the one-step-ahead predictive density $p_j(y_{t+1} \mid \mathcal{I}_j(t))$ of each model $j$ for period $t+1$. Further, $\omega_{t+1}$ refers to time-varying loadings, and the shock in the observation equation $\epsilon_{t+1}$ is Gaussian with zero mean and variance $\varsigma$. The latent loadings (or states), which relate the draws from the predictive distributions to the realized value $y^{(r)}_{t+1}$, evolve according to a random walk:

$$\omega_{t+1} = \omega_t + \eta_{t+1}, \quad \eta_{t+1} \sim \mathcal{N}(0, \Omega),$$

where $\eta_{t+1}$ refers to a $K \times 1$ vector of Gaussian state innovations, which are centered on zero and feature a $K \times K$ variance-covariance matrix $\Omega$.

In contrast to equal weighting, DMA, and predictive pooling, the weights $\omega_{t+1}$ are no longer necessarily non-negative and do not need to sum up to one. The elements of $\omega_{t+1}$ are thus to be interpreted as (time-varying) calibration parameters relating draws from the predictive densities to the actual realization $y^{(r)}_{t+1}$. A further difference from other weighting schemes is that we consider a measurement error $\epsilon_{t+1}$ in the observation equation that explicitly accounts for model incompleteness (see, e.g., Aastveit et al., 2018; Hoogerheide et al., 2010; McAlinn & West, 2019). Moreover, the latent weights $\omega_{t+1}$ are allowed to be correlated among models via a full variance-covariance matrix $\Omega$. This [...] $\omega_{t+1}$, but also takes into account the dependencies between individual predictive specifications that share similar characteristics.

We use weakly informative priors, which are standard in the literature for dynamic factor models. This implies the use of a multivariate normal prior for $\omega_0$, an inverse Gamma prior for $\varsigma$, and an inverse Wishart prior for $\Omega$. We repeat this procedure for $R$ draws from the predictive density and explicitly account for a potentially non-trivial form of the predictive densities of the DSGE models.

To estimate the model we rely on standard Bayesian estimation techniques used for time-varying parameter models. In particular, we use a Gibbs sampler which iterates through these $R$ draws. Conditional on all other quantities, we update the latent states $\omega_{t+1}$ with a standard forward filtering backward sampling (FFBS) algorithm (Carter & Kohn, 1994; Frühwirth-Schnatter, 1994). In a next step, conditional on the time-varying calibration parameters, we independently draw the observation equation variance $\varsigma$ and the state equation variance-covariance matrix $\Omega$. All steps involve standard conditional posteriors (for details, see McAlinn & West, 2019). Moreover, by using the filtering step in the FFBS algorithm, we directly obtain the predictive weights $\omega_{t+1|t}$, which are used to combine the most recent predictive densities when the realization is not yet available.

The MCMC algorithm of the dynamic factor model is somewhat more computationally demanding than the approximate procedure of DMA and the numerical optimization used for the pooling approach. However, compared to sequential Monte Carlo techniques such as particle filters (see, e.g., Billio et al., 2013; Del Negro et al., 2016), the computational burden can still be considered light.

The DECO approach

In addition to the combination methods outlined above, we consider the dynamic predictive density combination (DECO) approach of Billio et al. (2013).
Like BPS, DECO allows for the specification of time-varying weights that evolve according to a flexible law of motion, and accounts for model incompleteness:

$$y^{(r)}_{t+1} = F_{t+1}' \omega_{t+1} + \epsilon_{t+1}, \quad \epsilon_{t+1} \sim \mathcal{N}(0, \varsigma), \qquad (2)$$

with $\omega_{t+1}$ relating draws from the predictive densities to the actual realization $y^{(r)}_{t+1}$ and considering a Gaussian-distributed measurement error $\epsilon_{t+1}$. The main difference from BPS lies in the state equation that governs the evolution of the weights $\omega_{t+1}$ and thus the learning mechanism used in prediction. Instead of assuming that the weights evolve according to a multivariate random walk with a full variance-covariance matrix $\Omega$, a non-linear link function between the elements in $\omega_{t+1}$ and $K$ independent dynamic latent processes $\xi_{t+1} = (\xi_{1,t+1}, \dots, \xi_{K,t+1})'$ is introduced:

$$\omega_{j,t+1} = \frac{\exp(\xi_{j,t+1})}{\sum_{k=1}^{K} \exp(\xi_{k,t+1})}, \quad \text{for } j = 1, \dots, K.$$

This logistic link function does not allow for the use of unconstrained calibration parameters via a synthesis function, as in BPS, since it restricts the elements in $\omega_{t+1}$ to be non-negative and sum to one. These restrictions thus effectively result in a non-linear state-space model, where Eq. (2) can be interpreted as a dynamic location mixture with a fixed variance. In what follows, $\xi_{t+1}$ encodes the learning mechanism and governs the weight dynamics. Each element in $\xi_{t+1}$ evolves according to independent random walks:

$$\xi_{j,t+1} = \xi_{j,t} + \eta_{j,t+1}, \quad \eta_{j,t+1} \sim \mathcal{N}(0, \psi_j), \quad \text{for } j = 1, \dots, K.$$

Here, $\eta_{j,t+1}$ denotes element-specific state innovations with zero mean and variance $\psi_j$. In DECO, the state innovation variances $\psi_j$ encode the learning mechanism and depend on a scoring rule preselected by the researcher, a discount factor $\delta$, and a number of past observations $\tau$ considered. For example, if the scoring rule indicates that the predictive performance of some particular model has deteriorated for the past realized values, the mechanism allows for the corresponding adjustment of the weights by increasing $\psi_j$, thus introducing time variation in $\omega_{j,t+1}$. Sequential Monte Carlo techniques are commonly used for such a non-linear state-space model.
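To make the role of the logistic link concrete, the following sketch simulates latent random walks and maps them into weights. It is purely illustrative and rests on assumptions: the innovation variances are fixed, made-up numbers rather than quantities driven by a scoring rule, no particle filter or estimation step is involved, and the function names are hypothetical. It is not the DeCo toolbox.

```python
import math
import random


def logistic_link(latent):
    """Map K unconstrained latent states to weights that are positive and sum to one."""
    shift = max(latent)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(x - shift) for x in latent]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_weights(innovation_vars, n_periods, seed=0):
    """Simulate independent random walks and the implied weight paths.

    innovation_vars[j] plays the role of the state innovation variance of
    latent process j; here it is simply held fixed (an assumption).
    """
    rng = random.Random(seed)
    latent = [0.0] * len(innovation_vars)  # assumed starting point: equal weights
    path = []
    for _ in range(n_periods):
        latent = [x + rng.gauss(0.0, math.sqrt(v))
                  for x, v in zip(latent, innovation_vars)]
        path.append(logistic_link(latent))
    return path


# Three hypothetical models; the third has the most volatile latent process.
weight_path = simulate_weights([0.01, 0.01, 0.25], n_periods=40)
```

A larger innovation variance lets the corresponding latent process, and hence its weight, move more from one period to the next, which is the channel through which the learning mechanism described above operates.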
For the empirical implementation of DECO, we specify the key learning hyperparameters according to the following standard setting: we use the Kullback-Leibler scoring rule, set the number of past realized values to $\tau = 9$, and the discount factor $\delta = 0.95$. The remaining parameters are estimated from the data. For the particle filter, moreover, we define 50 particles and use an additional smoothing factor of 0.01.^6

4. Forecasting macroeconomic variables in the euro area using DSGE models

We start by quantitatively assessing the predictive ability differences across DSGE models, before moving to the analysis of the potential improvements in forecasting quality from combining the predictions of individual models, and of the dynamics of predictive weights over the out-of-sample period.

4.1. Overall forecast performance of individual DSGE models

The top panel of Table 3 presents the forecasting performance of individual DSGE models, which are estimated recursively over the out-of-sample period. We present the root mean squared forecast error (RMSE) ratios, as well as the average log predictive Bayes factors (LPBFs), defined as the difference in average log predictive scores (LPSs), for one-step-ahead and four-step-ahead predictions. For the RMSEs, Table 3 also shows the results of Diebold-Mariano tests of equal predictive performance (Diebold & Mariano, 1995), and for the LPSs, those of the Amisano-Giacomini tests (Amisano & Giacomini, 2007). In both cases, the equality of predictive ability is tested using the SW 2007 model as the benchmark specification.
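The two evaluation measures can be illustrated for a single target variable as follows. This is a minimal sketch with made-up numbers; the dictionary layout and function names are hypothetical, and the significance tests mentioned above are not implemented.

```python
import math


def rmse(point_forecasts, realized):
    """Root mean squared forecast error of a sequence of point forecasts."""
    squared = [(f - y) ** 2 for f, y in zip(point_forecasts, realized)]
    return math.sqrt(sum(squared) / len(squared))


def average_log_score(predictive_likelihoods):
    """Average log predictive score (LPS) over the evaluation sample."""
    return sum(math.log(p) for p in predictive_likelihoods) / len(predictive_likelihoods)


def relative_performance(model, benchmark, realized):
    """RMSE ratio and average log predictive Bayes factor versus a benchmark.

    A ratio below one and a positive LPBF both favor the model over the benchmark.
    """
    ratio = rmse(model["point"], realized) / rmse(benchmark["point"], realized)
    lpbf = average_log_score(model["lik"]) - average_log_score(benchmark["lik"])
    return ratio, lpbf


# Hypothetical forecasts for three quarters; "lik" holds predictive likelihoods
# evaluated at the realized values.
realized = [0.3, 0.5, 0.1]
candidate = {"point": [0.4, 0.4, 0.2], "lik": [0.9, 0.8, 0.7]}
benchmark = {"point": [0.6, 0.2, 0.4], "lik": [0.5, 0.6, 0.4]}
ratio, lpbf = relative_performance(candidate, benchmark, realized)
```

In this made-up example the candidate model has smaller forecast errors and higher predictive likelihoods than the benchmark, so the RMSE ratio is below one and the LPBF is positive.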
^6 An efficient algorithm for this approach is implemented in the DeCo toolbox in Matlab (see Casarin et al., 2015) for one-step-ahead forecasts.

Table 3
Forecasting performance of recursively estimated DSGE models and combinations of these models. For each target, the first row reports the RMSE ratio and the second row the LPBF (in parentheses).

DSGE model
Target variable(s)   CD 2008      CPS 2010     DNGS 2015    JPT 2011     NKModel      SW 2007

One step ahead
Joint                1.246***     0.929**      1.069**      1.196***     0.889**      0.320
                     (-0.194)     (0.340***)   (-0.063**)   (-0.452)     (0.159*)     (0.125)
GDP growth           1.277***     0.938        1.069**      1.206***     0.885*       0.511
                     (-0.236**)   (0.064*)     (-0.069)     (-0.224***)  (0.102***)   (-0.777)
Inflation            1.033        0.858***     1.064        1.107**      0.896**      0.194
                     (-0.013)     (0.117***)   (0.036***)   (-0.089)     (0.019*)     (0.022)
Interest rate        1.122        0.973*       1.088        1.291***     1.016        0.085
                     (-0.011*)    (0.138***)   (-0.017)     (-0.167***)  (0.012)      (0.819)

Four steps ahead
Joint                1.115        0.981        1.017        1.176***     0.941**      0.379
                     (0.228)      (0.419***)   (0.308***)   (-0.255***)  (0.418***)   (-1.219)
GDP growth           1.115        0.996        1.037*       1.148**      0.963        0.573
                     (-0.162)     (-0.041)     (-0.043)     (-0.241***)  (-0.026)     (-0.865)
Inflation            1.099        0.863        0.938        1.375***     0.775**      0.220
                     (0.268***)   (0.388***)   (0.268***)   (-0.006)     (0.331***)   (-0.396)
Interest rate        1.128        0.988        0.968        1.148***     0.940        0.234
                     (0.051***)   (0.122***)   (0.089***)   (-0.038*)    (0.162***)   (-0.101)

Combination method
Target variable(s)   EQ           DMA          POOL         BPS          DECO

One step ahead
Joint                1.061*       0.926        0.936        0.993        0.941
                     (0.085)      (0.328***)   (0.330***)   (0.164***)   (0.226***)
GDP growth           1.074*       0.928        0.942        1.003        0.949
                     (-0.006)     (0.072***)   (0.085***)   (-0.011)     (-0.145***)
Inflation            0.958        0.877**      0.884**      0.929**      0.872***
                     (0.028)      (0.082)      (0.115***)   (0.109***)   (0.307***)
Interest rate        1.102        1.086        0.994        0.925        0.987
                     (0.017)      (0.102***)   (0.123***)   (0.118***)   (0.158***)

Four steps ahead
Joint                1.070**      0.999        0.985        0.926
                     (0.321***)   (0.440***)   (0.448***)   (0.673***)
GDP growth           1.089**      1.015        1.005        0.974
                     (-0.020)     (0.008)      (-0.037)     (-0.013)
Inflation            1.019        0.896        0.871        0.831**
                     (0.234***)   (0.365***)   (0.386***)   (0.396***)
Interest rate        0.997        0.991        0.961        0.667**
                     (0.108**)    (0.143***)   (0.150***)   (0.409**)

Notes: The table shows root mean squared errors (RMSEs), and average log predictive Bayes factors (LPBFs) in parentheses, relative to the SW 2007 model. Bold numbers indicate the best performing DSGE model as well as the best performing combination method that obtains the smallest RMSE ratio (largest LPBF). The SW 2007 column (highlighted in gray) shows the actual RMSEs and log predictive scores (LPSs) of our benchmark. Asterisks indicate statistical significance relative to the SW 2007 model at the 1% (***), 5% (**), and 10% (*) significance levels in terms of Diebold and Mariano (1995) tests for RMSEs and Amisano and Giacomini (2007) tests for LPSs.

The results of this predictive ability analysis based on rolling window estimation (instead of parameter estimates based on enlarging the in-sample period recursively) can be found in Appendix B, and the results based on alternative detrending methods are presented in Appendix C. The forecast error measures are presented for the joint vector of GDP growth, inflation, and the interest rate, as well as for these three variables individually. We start by considering the overall forecasting ability for the group of macroeconomic variables, reflected in the characteristics of the joint predictive distribution. The results in the top panel of Table 3 for the full out-of-sample period indicate that the simple NKModel has particularly good predictive ability compared to other DSGE specifications with more complex model structures.
In terms of the joint accuracy of point forecasts (i.e., for the full vector of variables) as measured by the average RMSEs, this specification outperforms all other DSGE models for both one-step-ahead and four-step-ahead predictions. Considering each variable individually, the quality of point predictions of the NKModel appears particularly high for four-step-ahead predictions. The quality of point forecasts from the NKModel partially translates to good performance in density forecasting (as measured by the LPBFs) at both of the prediction horizons considered. The joint density predictions of the NKModel specification, however, appear less accurate than those of the CPS2010 model, which includes five structural shocks instead of the three of the NKModel. The focus of the CPS2010 specification on offering a structural modeling framework for inflation dynamics (based on the inclusion of changes in the inflation target in the model) is successful at improving out-of-sample density predictions for this variable compared to the rest of the DSGE models entertained. Furthermore, the average predictive performance of the CPS2010 model for short-run density forecasts of the interest rate is also the best among the set of models considered. The particularly good forecasting ability of models that include a small number of observable variables is broadly robust to the use of different detrending methods and to the use of parameter estimates based on a rolling sample instead of on recursive estimation (see Appendices B and C).

4.2. Overall forecast performance of predictive combinations

The comparison of results concerning predictive ability presented in Table 3 indicates that using forecast combinations can lead to improvements in average predictive ability over the full out-of-sample period.
The best individual models in terms of forecasting ability at short horizons outperform all of the combination methods for GDP growth and jointly for all three variables. Concentrating on point prediction performance at both horizons analyzed, individual DSGE models predict GDP growth and inflation better than any combination scheme considered. The combinations, on the other hand, outperform individual DSGE specifications at predicting the interest rate. Combining predictions of DSGE models also delivers better density forecasts of inflation, and yields the best joint predictive performance at the longer horizon. Since the forecasting ability results of single DSGE specifications and their combinations for the full sample may be driven by differences in out-of-sample predictive quality in sub-periods of the out-of-sample interval chosen, a more detailed analysis of the dynamics of the weights that combination methods assign to different DSGE models appears necessary. In the following sub-section, we analyze the dynamics of the predictive weights for the different averaging methods entertained, thus moving beyond average forecast quality and turning to the assessment of changes in predictive accuracy over time.

4.3. The dynamics of predictive weights

We start by assessing the dynamics in the relative predictive ability of DSGE models by studying the evolution of predictive weights along the hold-out sample for our three target variables: GDP growth, inflation, and the interest rate. For each observable variable, we combine the predictions from DSGE models using statistics based on marginal predictive densities rather than on the joint predictive density of all target variables. One key advantage of this approach is that the weights used to combine predictive densities are thus specific to each variable and reflect changes in the relative forecasting ability of each DSGE specification for that particular phenomenon.
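To make the idea of variable-specific weights concrete, the following sketch combines the marginal predictive densities of two hypothetical models into a linear pool, using a separate weight vector for each target variable. The Gaussian densities, the numerical values, and all names are illustrative assumptions and are not taken from the paper.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a Gaussian predictive distribution evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def pooled_density(y, weights, means, sds):
    """Linear pool: weighted average of the models' marginal predictive densities."""
    return sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sds))

# Hypothetical marginal predictive distributions of two models per variable.
predictions = {
    "gdp_growth": {"means": [0.4, 0.1], "sds": [0.5, 0.8]},
    "inflation": {"means": [0.5, 0.3], "sds": [0.3, 0.4]},
}
# Variable-specific weights: each variable has its own weight vector.
weights = {"gdp_growth": [0.7, 0.3], "inflation": [0.2, 0.8]}

realized = {"gdp_growth": 0.3, "inflation": 0.35}
for name, y in realized.items():
    p = predictions[name]
    density = pooled_density(y, weights[name], p["means"], p["sds"])
    print(name, math.log(density))  # log predictive score of the pooled density
```

Because each variable is scored on its own marginal density, a model can receive a large weight for inflation and a small one for GDP growth at the same point in time.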
We calibrate the weights for each forecast combination scheme with at least eight quarters (1990Q1 to 1991Q4) for the first period of our hold-out sample (1992Q1) and employ δ = 0.95. Figs. 1 and 2 show the weights obtained for each model and target variable in the hold-out sample period for one-step-ahead (Fig. 1) and four-step-ahead forecasts (Fig. 2). The weighting schemes across forecast horizons are relatively similar, indicating a certain degree of stability of the predictive power of DSGE models across forecast horizons. In spite of the fact that the loss functions in the DMA and prediction pool methods are both based on log predictive scores, we observe substantial differences in the magnitude of the weights obtained for these two approaches. The weights in the prediction pool approach typically suggest a dynamic model selection scheme where single models tend to receive a weight close to one in a given period of time, while DMA usually assigns positive weights to forecasts from all different DSGE models. For the combination approach based on Bayesian predictive synthesis, weights (corresponding to factor loadings) are positive and relatively similar across models for the majority of periods. However, during the financial crisis, individual negative factor loadings can be observed, implying a reversal of the sign of the prediction of the respective DSGE model in the combined forecast for these quarters.

Focusing on one-step-ahead weights, the first row of panels in Fig. 1 shows the results for the different combination techniques for GDP growth. For DMA, we observe that CPS2010 and NKModel tend to dominate in terms of predictive ability prior to the financial crisis. In the subsequent years, and in particular after the debt crisis in the euro area, the relevance of CPS2010 within the group of combined predictions decreases in favor of SW2007.
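The contrast between the two score-based schemes can be reproduced in a stylized setting. The sketch below assumes the forgetting-factor recursion that is common in the dynamic model averaging literature and, for the prediction pool, a static maximization of the pooled log score by EM-style multiplicative updates. Whether these match the exact implementations behind Figs. 1 and 2 is an assumption, and the log scores are hypothetical.

```python
import math

def dma_weights(log_scores, delta=0.95):
    """Weights used in each period, based only on scores from earlier periods.

    log_scores[t][k] is the log predictive score of model k in period t.
    """
    n_models = len(log_scores[0])
    posterior = [1.0 / n_models] * n_models
    history = []
    for scores_t in log_scores:
        # Prediction step: flatten last period's weights with the forgetting factor.
        powered = [w ** delta for w in posterior]
        total = sum(powered)
        predicted = [w / total for w in powered]
        history.append(predicted)
        # Update step: reweight by the realized predictive likelihoods.
        shift = max(scores_t)
        unnorm = [w * math.exp(s - shift) for w, s in zip(predicted, scores_t)]
        total = sum(unnorm)
        posterior = [w / total for w in unnorm]
    return history

def pool_weights(log_scores, n_iter=200):
    """Weights on the unit simplex that maximize the sum of pooled log scores."""
    densities = [[math.exp(s) for s in row] for row in log_scores]
    n_periods, n_models = len(densities), len(densities[0])
    weights = [1.0 / n_models] * n_models
    for _ in range(n_iter):
        new = [0.0] * n_models
        for row in densities:
            mix = sum(w * p for w, p in zip(weights, row))
            for k in range(n_models):
                new[k] += weights[k] * row[k] / mix
        weights = [v / n_periods for v in new]  # sums to one by construction
    return weights

# Hypothetical log scores: model 0 is slightly better in every one of 12 periods.
log_scores = [[-1.0, -1.2, -1.5] for _ in range(12)]

dma_last = dma_weights(log_scores)[-1]
pool = pool_weights(log_scores)
print(dma_last)  # all three weights stay strictly positive
print(pool)      # essentially all weight on model 0
```

In this toy case the pool behaves like a model selection device, since a model that is marginally better in every period receives a weight close to one, while the forgetting factor keeps the DMA weights of the weaker models away from zero.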
For prediction pooling, the distribution of weights shows the importance of predictions from CPS2010 and NKModel for the combined forecast in particular periods, with SW2007 gaining importance only in the aftermath of the debt crisis. Both the DMA and DECO combination schemes give high weights to predictions from CPS2010 and NKModel, and the weights from DECO reflect the importance of forecasts from DNGS2015 until the mid-2000s. The distribution of weights implied by Bayesian predictive synthesis is much more uniform and stable over time.

The second row of panels in Fig. 1 depicts the dynamics of weighting schemes for inflation as a target variable for one-step-ahead forecasts. Using DMA, the highest weights are assigned to CPS2010 and DNGS2015, with the latter gaining importance during the financial crisis. Both of these models are designed with a focus on tracking inflation dynamics: CPS2010 features a time-varying inflation target, and DNGS2015 includes inflation expectations, operationalized by making use of data from the Survey of Professional Forecasters. With prediction pools, a qualitatively similar scheme appears, with weights close to unity alternating between these two DSGE models, and predictions from DNGS2015 being particularly important during the financial crisis years. Bayesian predictive synthesis and DECO assign practically identical stable weights across models for the full period.

For interest rate predictions, the resulting weighting schemes are presented in the third row of panels in Fig. 1. In general, for the interest rate we observe a more persistent pattern in the weighting scheme, similar to that found for inflation. The DMA method leads to large and stable weights for CPS2010 throughout the hold-out
sample, with the exception of the period corresponding to the financial crisis, when DNGS2015 and NKModel receive relatively larger weights. The results from prediction pools are qualitatively similar, with forecasts from CPS2010 receiving weights close to unity throughout the period, except in the mid-1990s and during the financial crisis, when predictions from SW2007 and DNGS2015 play a small role. As in the case of inflation, for interest rates, Bayesian predictive synthesis and DECO assign stable and similar weights to the individual model predictions throughout the hold-out sample.

Fig. 1. Evolution of model weights over the hold-out sample for one-step-ahead predictions. Notes: The figure shows four different weighting schemes for the three target variables: GDP growth, inflation, and the interest rate. For BPS and DECO we use the posterior mean as a point estimate. (Panels: BPS, DECO, DMA, and POOL; models: CD 2008, CPS 2010, DNGS 2015, JPT 2011, NKmodel, and SW 2007; horizontal axis: 1992Q1 to 2018Q4.)

For four-step-ahead forecasts of GDP growth, Fig. 2 shows a partly similar evolution of the weights for DMA combinations, but with weights that are more spread across DSGE specifications, especially before the financial crisis. In contrast to one-step-ahead predictions, for the longer horizon, the forecasts of GDP growth from SW2007 gain importance during the euro area debt crisis period, and weights in the last part of our hold-out sample are more uniformly spread across DSGE specifications. For output, the combination chosen by prediction pooling leads to a more erratic weighting scheme prior to the financial crisis as compared to one-step-ahead predictions.
Output growth forecasts from CD2008 gain relevance right before the financial crisis, as do those from NKModel and SW2007 in the aftermath of the debt crisis in the euro area. The weights from the combination method based on Bayesian predictive synthesis for four-step-ahead forecasts roughly resemble those found for one-step-ahead predictions.

The evolution of weighting schemes along the hold-out sample for inflation predictions at the four-step-ahead horizon is relatively similar to that for the one-step-ahead predictions. The pooling combination scheme selects the CPS2010 model for almost the whole time period under study, as in the case of the shorter prediction horizon. More notable differences across prediction horizons can be found for DMA combinations. For the longer prediction horizon, the JPT2011 and SW2007 models are assigned almost zero weight, while DNGS2015 receives higher weight in the aftermath of the debt crisis in the euro area. The particular characteristics of the DNGS2015 model, which includes financial frictions and aims to explain the dynamics of output and inflation after financial shocks, make it conceptually adequate for predictions in the environment of debt distress. The Bayesian predictive synthesis combination method results in roughly uniformly distributed weights across models.

Finally, the results for interest rate predictions at the four-step-ahead horizon, presented in the last row of Fig. 2, differ strongly from those obtained for one-step-ahead forecasts. The predictions of the CPS2010 model, which obtained the highest weights using DMA and prediction pools for the shorter-term horizon, now receive
low weights over the hold-out sample and are replaced by the NKModel for the majority of the hold-out period, with the weights for CD2008 and DNGS2015 being prominent during the outbreak of the financial crisis.

Fig. 2. Evolution of model weights over the hold-out sample for four-step-ahead predictions. Notes: The figure shows three different weighting schemes for the three target variables: GDP growth, inflation, and the interest rate. For BPS we use the posterior mean as a point estimate. Note that DECO is only used for the one-step-ahead horizon. (Panels: BPS, DMA, and POOL; models: CD 2008, CPS 2010, DNGS 2015, JPT 2011, NKmodel, and SW 2007; horizontal axis: 1992Q1 to 2018Q4; weight scale: -0.25 to 1.00.)

The results of the analysis of the evolution of weight estimates for combinations of DSGE model predictions illustrate the stark differences in weights across forecast pooling methods and over time. The fact that the combination method based on prediction pools acts as a dynamic model-selection device contrasts with the weighting schemes resulting from the other approaches entertained in the exercise, which tend to lead to composite predictions with positive weights for all specifications. The relative predictive performance of these combination approaches along the hold-out sample, as well as that of individual model forecasts, is explored in more detail in the next section.⁷

⁷ The evolution of predictive weights across methods and over time for rolling samples can be found in Appendix B.

4.4. Predictive ability of individual specifications and forecast combinations: Variation over time

In this section, we examine the variation over time of the predictive performance of the individual DSGE models and the forecast combinations. We concentrate on the
analysis of the evolution of log predictive Bayes factors, as a measure for the marginal likelihood, over the hold-out sample. Fig. 3 presents the predictive performance of forecasts based on the different weighting schemes across variables and forecast horizons by means of log predictive Bayes factors relative to the SW2007 model.

Fig. 3. Evolution of average log predictive Bayes factors (LPBFs) relative to the SW 2007 model. Combination methods. Notes: The gray shaded areas indicate OECD recessions for the euro area. Note that DECO is only used for the one-step-ahead horizon. (Panels: a) one-step-ahead and b) four-step-ahead, each for Joint, Output growth, Inflation, and Interest rate; horizontal axis: 1992Q1 to 2018Q4.)

In panel a) of Fig. 3, the results for one-step-ahead forecasts are shown. The overall evolution of the predictive ability of forecast combination methods at this prediction horizon presents similar dynamics across most of the approaches, with improvements in predictive ability over the hold-out sample and a relatively stable forecasting performance at the end of the out-of-sample period. A notable exception is the DECO scheme, especially for output growth and inflation.
Practically all forecast combination methods tend to perform poorly at the very beginning of our hold-out sample compared to the SW2007 benchmark, a feature that is likely related to the imprecise estimation of weights.⁸ Considering the joint set of macroeconomic variables of interest as a whole, the predictive ability of prediction pooling and DMA tends to be similar and to dominate all other combination methods after the mid-1990s, a result which is mostly driven by their ability to provide precise predictions of GDP growth. Combinations of forecasts based on the DECO method, on the other hand, dominate the other combination alternatives when predicting inflation and interest rates after the mid-1990s.

⁸ We also perform the exercise based on rolling samples instead of a recursive reestimation scheme, and the results are presented in Appendix B. The relative forecasting ability of individual models does not change qualitatively, while the performance of combination schemes with respect to the SW2007 benchmark tends to worsen, thus lending support to this conclusion.

In contrast to the results obtained for the shorter-term horizon, the Bayesian predictive synthesis method of forecast averaging systematically outperforms the other predictive combinations for the joint group of observable macroeconomic variables after the mid-1990s at the longer horizon. The predictive quality shown by this method is fueled by its performance at predicting interest rates in the longer term, while in the other two variables, the forecast error appears comparable to that of other combination methods.

In Fig. 4 we present the log predictive Bayes factors of individual specifications over the hold-out period with respect to the benchmark model, SW2007. A comparison across DSGE models reveals a systematically good relative predictive performance of the CPS2010 model (in particular after the mid-1990s) that extends to all three variables and to both forecasting horizons.
In addition, a worsening in forecast ability of some specifications with respect to the SW2007 benchmark during the financial crisis and in its aftermath can be observed for many of the individual DSGE specifications. This is particularly the case for CD2008 at both horizons, but the loss of predictive quality also takes place in other specifications and is asymmetric across macroeconomic variables, with GDP growth forecasts being the most affected. The loss of predictive power triggered by the financial crisis is in many cases persistent, and relative predictive scores (as measured by the log predictive Bayes factor) do not always reach the level they had prior to the crisis. An interesting exception to this stylized fact is the inflation predictions from the DNGS2015 model, whose specification incorporates a more sophisticated assessment of inflation expectations than the rest of the DSGE models used, and whose predictive ability for this variable improves in the crisis period.

Fig. 4. Evolution of average log predictive Bayes factors (LPBFs) relative to the SW 2007 model. DSGE models. Notes: The gray shaded areas indicate OECD recessions for the euro area. (Panels: a) one-step-ahead and b) four-step-ahead, each for Joint, Output growth, Inflation, and Interest rate; horizontal axis: 1992Q1 to 2018Q4.)
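The time paths of average LPBFs discussed above can be thought of as running averages of log score differences. The sketch below assumes that the average is taken over an expanding window starting at the beginning of the hold-out sample, which is an assumption about how such paths are built rather than a statement from the paper; the log scores are hypothetical.

```python
def running_average_lpbf(model_scores, bench_scores):
    """Expanding-window average of log score differences relative to the benchmark."""
    path = []
    cumulative = 0.0
    for t, (m, b) in enumerate(zip(model_scores, bench_scores), start=1):
        cumulative += m - b
        path.append(cumulative / t)
    return path

# Hypothetical log predictive scores for a model and the benchmark.
model_scores = [-1.0, -1.0, -3.0, -1.5]
bench_scores = [-2.0, -2.0, -2.0, -2.0]

path = running_average_lpbf(model_scores, bench_scores)
print(path)  # a single poor period (the third) lowers the average persistently
```

A path of this kind makes visible why a loss of predictive power in a crisis quarter can remain in the average for a long time, even if the model performs well afterwards.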
A comparison of the predictive ability of forecast combinations and individual DSGE models over the hold-out period reveals that in some periods and for particular variables, weighted averages of forecasts achieve higher and less volatile log predictive Bayes factors. However, the results show that it is not possible to find a one-size-fits-all method to combine predictions from DSGE models that would provide systematically superior predictions for all variables under scrutiny and over the full period studied. The difficulty in finding such a forecast averaging method for our sample is related to the particular characteristics of the economic area being studied. The existence of cross-country heterogeneity in shock transmission mechanisms and macroeconomic outcomes across euro area economies, in particular since the onset of the sovereign bond crisis, is widely documented in the literature (see Burriel & Galesi, 2018; Holton & d'Acri, 2018, just to name two recent examples). The difference in shock propagation between countries in the euro area aggregate poses particular challenges in terms of how they can be accommodated in DSGE specifications such as those entertained in our analysis.

5. Conclusions

The results of our analysis show that combining forecasts from DSGE models does not systematically lead to improvements in predictive ability for macroeconomic variables for the euro area over the full period under scrutiny, which spans the last three decades. For some variables and periods, predictive weighting schemes are able to reach superior forecasting performance over individual DSGE specifications. In particular, the gains in the predictive ability of forecast combinations of DSGE models are larger in the last part of our sample. The weighting schemes implied by the combination methods employed are fundamentally different across techniques.
Weighting based on prediction pools tends to lead to forecasts based on dynamic model selection, assigning zero weights to many individual model predictions over the out-of-sample period. DMA and weighting based on dynamic factors, on the other hand, result in combined forecasts with positive weights for practically all of the DSGE specifications. The forecasting performance of individual DSGE models and combinations thereof systematically worsens during the financial crisis with respect to the benchmark, although the loss of predictive power and the volatility of forecast errors appear larger in individual specifications as compared to predictive combinations.

The results of our analysis may be significantly affected by the focus on the euro area economy, which is characterized by differences in the propagation of macroeconomic shocks across the countries that compose it. The suite of DSGE models employed in our forecasting exercise does not contain any specification that explicitly addresses the differential structural characteristics of the euro area. In this context, the results of our analysis should be considered very conservative estimates of the potential of predictive combination methods combined with forecasts from DSGE models. Refining the theoretical structure of the models employed for predictive combinations to address the particularities of the euro area is likely to be a fruitful avenue of further research building upon the analysis presented here.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Data

See Table A.1.

Appendix B. Forecasting performance based on rolling window estimation

See Table B.1 and Figs. B.1-B.4.

Appendix C. Forecasting performance based on alternative detrending schemes

See Tables C.1-C.4.

Table A.1
Source of data.

| Variable | Source | Database, mnemonic |
|---|---|---|
| Output | AWM, Eurostat | AWM:YER, Eurostat:namq_10_gdp (Q.CLV10_MEUR.SCA.B1GQ.EA19) |
| Inflation | AWM, Eurostat | AWM:YED, Eurostat:namq_10_gdp (Q.PD10_EUR.SCA.B1GQ.EA19) |
| Interest rate | AWM, Eurostat | AWM:STN, Eurostat:irt_st_q (Q.IRT_M3.EA) |
| Consumption | AWM, Eurostat | AWM:PCR, Eurostat:namq_10_gdp (Q.CLV10_MEUR.SCA.P31_S14_S15.EA19) |
| Investment | AWM, Eurostat | AWM:ITR, Eurostat:namq_10_gdp (Q.CLV10_MEUR.SCA.P51G.EA19) |
| Hours worked | Conference Board, Eurostat | CB:Total Economy Database ("Total Hours Worked"), Eurostat:namq_10_a10_e (Q.THS_HW.TOTAL.SCA.EMP_DC.EA19) |
| Wage | AWM, Eurostat | AWM:WIN, Eurostat:namq_10_a10 (Q.CP_MEUR.SCA.TOTAL.D1.EA19) |
| Money supply (M1) | OECD | MANMM101* |
| Relative investment price | AWM, Eurostat | AWM:PCD, ITD, Eurostat:namq_10_gdp (Q.PD10_EUR.SCA.P31_S14_S15.EA19, Q.PD10_EUR.SCA.P51G.EA19) |
| Spread | Gilchrist and Mojon (2018) | spr_nfc_bund_ea |
| Inflation expectations | ECB | SPF - Survey of Professional Forecasters (SPF.Q.U2.HICP.POINT.LT.Q.AVG) |
| Population | Eurostat | demo_pjanbroad (ANR.Y15-64.T), lfsq_pganws (Q.THS.T.TOTAL.Y15-64.POP.EA19) |

Notes: * Although the time series of the monetary aggregate M1 is described as seasonally adjusted in the OECD database, some parts of the series still exhibit a clear seasonal pattern, which we removed making use of the TRAMO-SEATS method in JDemetra+.
Fig. B.1. Evolution of average log predictive Bayes factors (LPBFs) relative to the SW 2007 model. Notes: The gray shaded areas indicate OECD recessions for the euro area. DSGE models are estimated based on a rolling window. (Panels: a) 1-step-ahead and b) 4-step-ahead; DSGE models: CD 2008, CPS 2010, DNGS 2015, JPT 2011, NKmodel, and SW 2007; horizontal axis: 1992Q1 to 2018Q4.)

Fig. B.2. Evolution of average log predictive Bayes factors (LPBFs) relative to the SW 2007 model. Notes: The gray shaded areas indicate OECD recessions for the euro area. Note that DECO is only used for the one-step-ahead horizon. DSGE models are estimated based on a rolling window. (Panels: a) 1-step-ahead and b) 4-step-ahead; methods: EQ, DMA, POOL, BPS, and DECO; horizontal axis: 1992Q1 to 2018Q4.)

Fig. B.3. Evolution of model weights over the hold-out sample for one-step-ahead predictions. Notes: The figure shows four different weighting schemes for the three target variables: output growth, inflation, and the interest rate.
For BPS and DECO we use the posterior mean as a point estimate. DSGE models are estimated based on a rolling window.

Fig. B.4. Evolution of model weights over the hold-out sample for four-step-ahead predictions. Notes: The figure shows three different weighting schemes for the three target variables: output growth, inflation, and the interest rate. For BPS we use the posterior mean as a point estimate. Note that DECO is only used for the one-step-ahead horizon. DSGE models are estimated based on a rolling window. (Panels: BPS, DMA, and POOL; horizontal axis: 1992Q1 to 2018Q4; weight scale: -0.25 to 1.00.)

Table B.1
Forecasting performance of DSGE models based on rolling window estimation and combinations of these models.
| Target variable(s) | CD 2008 | CPS 2010 | DNGS 2015 | JPT 2011 | NKmodel | SW 2007 |
|---|---|---|---|---|---|---|
| One step ahead | | | | | | |
| Joint | 1.181 (-0.295) | 0.947** (0.196***) | 1.089* (-0.187) | 1.164*** (-0.543**) | 0.888** (0.146***) | 0.319 (0.501) |
| GDP growth | 1.215 (-0.321) | 0.957 (-0.016) | 1.097* (-0.051) | 1.165** (-0.161**) | 0.884** (0.046) | 0.511 (-0.794) |
| Inflation | 0.899 (0.010***) | 0.850*** (0.135) | 1.011 (-0.047**) | 1.113 (-0.088) | 0.877*** (0.066***) | 0.196 (0.226) |
| Interest rate | 1.264*** (-0.107) | 1.085 (0.066) | 1.198*** (-0.145***) | 1.431*** (-0.318***) | 1.136 (0.018) | 0.075 (1.030) |
| Four steps ahead | | | | | | |
| Joint | 1.017 (-0.328) | 0.996 (-0.036) | 1.115 (-0.155*) | 1.079 (-0.347*) | 0.961** (0.034) | 0.382 (-0.787) |
| GDP growth | 1.022 (-0.384) | 1.012 (-0.424) | 1.122 (-0.135) | 1.082 (-0.227**) | 0.981 (-0.330) | 0.583 (-0.877) |
| Inflation | 0.949 (0.127*) | 0.903 (0.230***) | 1.026 (0.074***) | 1.052 (-0.037**) | 0.804** (0.218*) | 0.210 (-0.011) |
| Interest rate | 1.039 (-0.170*) | 0.970 (0.111***) | 1.135 (-0.188) | 1.083 (-0.095*) | 0.956 (0.093***) | 0.233 (0.015) |

| Target variable(s) | EQ | DMA | POOL | BPS | DECO |
|---|---|---|---|---|---|
| One step ahead | | | | | |
| Joint | 1.005 (0.150) | 1.018 (0.223***) | 0.932** (0.261***) | 1.036 (-0.010***) | 0.951** (-0.265***) |
| GDP growth | 0.992 (0.060**) | 1.035 (0.078*) | 0.935* (0.065) | 1.054 (0.031) | 0.962 (-0.296***) |
| Inflation | 1.071 (0.072***) | 0.858*** (0.110) | 0.893*** (0.090) | 0.903*** (0.009***) | 0.857*** (0.078***) |
| Interest rate | 1.134** (-0.015**) | 1.186 (0.057) | 1.053 (0.067) | 1.033 (-0.015**) | 1.065 (-0.013) |
| Four steps ahead | | | | | |
| Joint | 1.066 (0.241) | 0.990 (0.271) | 1.007 (0.369) | 0.922*** (0.437*) | |
| GDP growth | 1.082 (-0.013) | 1.006 (0.025) | 1.025 (0.037) | 0.967 (-0.033) | |
| Inflation | 0.973 (0.138) | 0.882* (0.229***) | 0.871 (0.227***) | 0.842** (0.137) | |
| Interest rate | 1.037 (0.070***) | 0.973 (0.115***) | 0.995 (0.079***) | 0.657** (0.402) | |

Notes: The table shows root mean squared errors (RMSEs), and average log predictive Bayes factors (LPBFs) in parentheses, relative to the SW 2007 model.
Bold numbers indicate the best performing DSGE model as well as the best combination method that obtains the smallest RMSE ratio (largest LPBF). The SW 2007 column shows the actual RMSEs and LPSs of our benchmark. Asterisks indicate statistical significance relative to SW 2007 at the 1% (***), 5% (**), and 10% (*) significance levels in terms of Diebold and Mariano (1995) tests for RMSEs and Amisano and Giacomini (2007) tests for LPSs.

Table C.1
Forecasting performance of recursively estimated DSGE models with HP filter detrending.

| Target variable(s) | CD 2008 | CPS 2010 | DNGS 2015 | JPT 2011 | NKmodel | SW 2007 |
|---|---|---|---|---|---|---|
| One step ahead | | | | | | |
| Joint | 1.078** (-0.092**) | 0.923** (0.220***) | 1.064*** (-0.221) | 1.103 (-0.313**) | 0.881*** (0.085) | 0.278 (0.492) |
| GDP growth | 1.082** (-0.066) | 0.920* (0.091***) | 1.050*** (-0.030) | 1.107 (-0.092*) | 0.853*** (0.154***) | 0.446 (-0.630) |
| Inflation | 1.027 (-0.042*) | 0.928* (0.004*) | 1.094 (-0.040**) | 1.020 (-0.058***) | 1.001 (-0.058***) | 0.166 (0.178) |
| Interest rate | 1.202*** (0.021***) | 0.989 (0.151***) | 1.353*** (-0.115*) | 1.325*** (-0.186***) | 1.190** (-0.033) | 0.074 (0.873) |
| Four steps ahead | | | | | | |
| Joint | 1.010 (0.113**) | 1.018 (0.069) | 1.166*** (-0.182**) | 1.148** (-0.201*) | 0.989 (0.128) | 0.300 (-0.455) |
| GDP growth | 0.976 (0.022) | 1.021 (-0.017) | 1.135*** (-0.098**) | 1.141** (-0.166***) | 0.981 (0.015) | 0.457 (-0.672) |
| Inflation | 1.134 (0.070) | 1.025 (0.101) | 1.255*** (0.015) | 1.261** (-0.054) | 0.953 (0.087) | 0.166 (-0.035) |
| Interest rate | 1.105** (0.070***) | 0.997 (0.076) | 1.270** (-0.115*) | 1.094 (0.003***) | 1.064 (0.082***) | 0.187 (0.111) |

Notes: The table shows root mean squared errors (RMSEs), and average log predictive Bayes factors (LPBFs) in parentheses, relative to the SW 2007 model. Bold numbers indicate the best performing DSGE model that obtains the smallest RMSE ratio (largest LPBF).
The SW 2007 column shows the actual RMSEs and log predictive scores (LPSs) of our benchmark. Asterisks indicate statistical significance relative to SW 2007 at the 1% (***), 5% (**), and 10% (*) significance levels in terms of Diebold and Mariano (1995) tests for RMSEs and Amisano and Giacomini (2007) tests for LPSs.

Table C.2
Forecasting performance of recursively estimated DSGE models with Hamilton filter detrending.

                    DSGE model
Target variable(s)  CD 2008      CPS 2010     DNGS 2015    JPT 2011     NKmodel      SW 2007
One step ahead
Joint               1.235***     0.909***     0.979        1.104***     0.901*       0.340
                    (-0.428**)   (0.310)      (-0.023)     (-0.452***)  (0.020)      (-0.494)
GDP growth          1.277***     0.920**      0.988        1.092*       0.878**      0.519
                    (-0.214***)  (0.110***)   (0.014)      (-0.116***)  (0.141***)   (-0.821)
Inflation           1.009        0.827***     0.900*       1.057        0.889        0.250
                    (-0.103)     (0.088**)    (0.059)      (-0.094)     (-0.006**)   (-0.124)
Interest rate       1.332***     1.033        1.116***     1.469***     1.296***     0.120
                    (-0.074)     (0.136***)   (-0.054***)  (-0.232***)  (-0.104***)  (0.376)
Four steps ahead
Joint               1.123        0.984        0.990        1.120***     0.979        0.418
                    (0.018)      (0.256**)    (0.188***)   (-0.183***)  (0.192***)   (-1.780)
GDP growth          1.129        1.009*       1.035        1.139        0.993*       0.552
                    (-0.131)     (-0.016)     (-0.022)     (-0.223***)  (-0.030)     (-0.864)
Inflation           0.979        0.825        0.786        1.056        0.779        0.315
                    (0.160***)   (0.287***)   (0.168***)   (0.010)      (0.203***)   (-0.577)
Interest rate       1.212        1.038        1.022        1.123        1.084        0.347
                    (0.020)      (0.094*)     (0.062)      (0.015)      (0.098)      (-0.560)

Notes: The table shows root mean squared errors (RMSEs), and average log predictive Bayes factors (LPBFs) in parentheses, relative to the SW 2007 model. Bold numbers indicate the best performing DSGE model that obtains the smallest RMSE ratio (largest LPBF). The SW 2007 column shows the actual RMSEs and log predictive scores (LPSs) of our benchmark. Asterisks indicate statistical significance relative to SW 2007 at the 1% (***), 5% (**), and 10% (*) significance levels in terms of Diebold and Mariano (1995) tests for RMSEs and Amisano and Giacomini (2007) tests for LPSs.
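
As a reading aid for the relative measures reported in these tables, the following minimal Python sketch computes an RMSE ratio and an average LPBF against a benchmark. It assumes that the LPBF is the hold-out average of the difference between the log predictive scores of a model and those of the benchmark. The function names and all numbers are hypothetical and are not taken from the tables.

```python
import numpy as np


def relative_rmse(errors_model, errors_benchmark):
    """RMSE of a model divided by the RMSE of the benchmark.

    Values below 1 indicate smaller point forecast errors than the benchmark.
    """
    rmse_model = np.sqrt(np.mean(np.square(errors_model)))
    rmse_benchmark = np.sqrt(np.mean(np.square(errors_benchmark)))
    return float(rmse_model / rmse_benchmark)


def average_lpbf(lps_model, lps_benchmark):
    """Average log predictive Bayes factor (assumed definition).

    Mean over the hold-out sample of the model's log predictive score
    minus the benchmark's; positive values favor the model.
    """
    diff = np.asarray(lps_model, dtype=float) - np.asarray(lps_benchmark, dtype=float)
    return float(np.mean(diff))


# Hypothetical forecast errors and log predictive scores (illustration only).
errors_model = [0.2, -0.1, 0.3, -0.2]
errors_benchmark = [0.3, -0.2, 0.4, -0.1]
lps_model = [-0.40, -0.55, -0.35]
lps_benchmark = [-0.50, -0.60, -0.45]

print(relative_rmse(errors_model, errors_benchmark))
print(average_lpbf(lps_model, lps_benchmark))
```

Under this reading, an entry below one in the RMSE rows and a positive entry in parentheses both point to an improvement over the SW 2007 benchmark.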

Table C.3
Forecasting performance of recursively estimated DSGE models with demeaned observables.

                    DSGE model
Target variable(s)  CD 2008      CPS 2010     DNGS 2015    JPT 2011     NKmodel      SW 2007
One step ahead
Joint               1.206***     0.801***     0.889***     1.031        0.905        0.373
                    (-0.734***)  (0.512***)   (0.242)      (-0.356***)  (-0.112)     (-0.298)
GDP growth          1.025        0.794***     0.875***     1.015        0.848**      0.602
                    (-0.057)     (0.203***)   (0.117***)   (-0.020)     (0.076*)     (-0.919)
Inflation           2.136***     0.824***     0.965        1.066        1.216***     0.216
                    (-0.598***)  (0.119)      (0.136**)    (-0.071**)   (-0.206***)  (-0.124)
Interest rate       1.313***     0.950        1.020        1.465***     1.217***     0.091
                    (-0.104*)    (0.168***)   (0.012)      (-0.234***)  (-0.098***)  (0.695)
Four steps ahead
Joint               1.059        0.808***     0.879***     1.019        1.139**      0.474
                    (-0.090)     (0.751***)   (0.723***)   (-0.018)     (-0.132)     (-1.998)
GDP growth          0.827**      0.809***     0.888***     0.958        0.988        0.703
                    (0.082)      (0.134)      (0.089)      (-0.084)     (-0.115**)   (-1.036)
Inflation           1.978***     0.755***     0.834*       1.169        1.700***     0.280
                    (-0.223**)   (0.429***)   (0.405***)   (0.078**)    (-0.183***)  (-0.625)
Interest rate       1.046        0.837**      0.870        1.171***     1.279***     0.319
                    (0.061)      (0.163***)   (0.174**)    (-0.015)     (-0.153**)   (-0.388)

Notes: The table shows root mean squared errors (RMSEs), and average log predictive Bayes factors (LPBFs) in parentheses, relative to the SW 2007 model. Bold numbers indicate the best performing DSGE model that obtains the smallest RMSE ratio (largest LPBF). The SW 2007 column shows the actual RMSEs and log predictive scores (LPSs) of our benchmark. Asterisks indicate statistical significance relative to SW 2007 at the 1% (***), 5% (**), and 10% (*) significance levels in terms of Diebold and Mariano (1995) tests for RMSEs and Amisano and Giacomini (2007) tests for LPSs.
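
The asterisks in these tables come from tests of equal predictive accuracy. A minimal Python sketch of a Diebold and Mariano (1995) type statistic under squared-error loss is given below as an illustration. The rectangular-kernel long-run variance with h - 1 autocovariances, the two-sided normal approximation, and all names and simulated numbers are assumptions made for this sketch. The exact settings behind the reported significance levels are not stated in these notes.

```python
import math

import numpy as np


def diebold_mariano(errors_model, errors_benchmark, horizon=1):
    """Diebold-Mariano type test of equal squared-error loss (illustrative sketch).

    Returns the test statistic and a two-sided p-value based on the
    standard normal approximation. Negative statistics favor the model.
    """
    e_model = np.asarray(errors_model, dtype=float)
    e_bench = np.asarray(errors_benchmark, dtype=float)
    d = e_model**2 - e_bench**2          # loss differential
    n = d.size
    d_bar = d.mean()
    dc = d - d_bar

    # Long-run variance: gamma_0 + 2 * sum of the first (horizon - 1) autocovariances.
    lrv = np.dot(dc, dc) / n
    for k in range(1, horizon):
        lrv += 2.0 * np.dot(dc[k:], dc[:-k]) / n
    if lrv <= 0.0:
        raise ValueError("non-positive long-run variance estimate")

    stat = d_bar / math.sqrt(lrv / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))  # two-sided normal p-value
    return stat, p_value


# Simulated forecast errors (illustration only, not the paper's data).
rng = np.random.default_rng(0)
bench_errors = rng.normal(size=100)
model_errors = 0.8 * rng.normal(size=100)

stat, p_value = diebold_mariano(model_errors, bench_errors, horizon=1)
print(stat, p_value)
```

The guard on the variance estimate reflects a known limitation of the rectangular kernel, which does not guarantee a positive estimate when the horizon exceeds one.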

Table C.4
Forecasting performance of the three recursively estimated DSGE models with the baseline data filtering used for Table 3 relative to the originally proposed model and data filtering.

                    DNGS 2015                 JPT 2011                  SW 2007
Target variable(s)  Baseline     Original     Baseline     Original     Baseline     Original
One step ahead
Joint               0.943*       0.362        1.164***     0.329        0.944        0.338
                    (0.470***)   (-0.407)     (0.114)      (-0.441)     (0.381)      (-0.256)
GDP growth          0.951        0.575        1.195***     0.516        0.950        0.538
                    (0.044)      (-0.890)     (-0.179***)  (-0.822)     (0.051)      (-0.828)
Inflation           0.992        0.208        1.122**      0.192        0.899**      0.216
                    (0.099**)    (-0.041)     (0.025)      (-0.092)     (0.153)      (-0.132)
Interest rate       0.660***     0.139        0.760***     0.143        0.999        0.085
                    (0.356***)   (0.447)      (0.242***)   (0.410)      (0.101***)   (0.719)
Four steps ahead
Joint               0.772***     0.500        1.027        0.434        0.875***     0.434
                    (0.986***)   (-1.898)     (0.356)      (-1.830)     (0.742***)   (-1.961)
GDP growth          0.918        0.648        1.091**      0.603        0.910*       0.630
                    (0.117)      (-1.025)     (-0.105)     (-1.001)     (0.124)      (-0.989)
Inflation           0.639*       0.323        1.292        0.234        0.796*       0.277
                    (0.257**)    (-0.385)     (-0.094**)   (-0.308)     (0.238***)   (-0.634)
Interest rate       0.476***     0.475        0.700**      0.384        0.776**      0.301
                    (0.609**)    (-0.621)     (0.391**)    (-0.530)     (0.289***)   (-0.390)

Notes: The table shows root mean squared errors (RMSEs), and average log predictive Bayes factors (LPBFs) in parentheses of the baseline data filtering relative to the originally proposed data filtering in Del Negro et al. (2015), Justiniano et al. (2011), and Smets and Wouters (2007), respectively. The columns "Original" show the actual RMSEs and log predictive scores of these benchmarks. Asterisks indicate statistical significance of the "Baseline" relative to the "Original" at the 1% (***), 5% (**), and 10% (*) significance levels in terms of Diebold and Mariano (1995) tests for RMSEs and Amisano and Giacomini (2007) tests for LPSs.

References

Aastveit, K. A., Ravazzolo, F., & Van Dijk, H. K. (2018).
Combined density nowcasting in an uncertain economic environment. Journal of Business & Economic Statistics, 36(1), 131-145.
Adjemian, S., Bastani, H., Juillard, M., Mihoubi, F., Perendia, G., Ratto, M., & Villemot, S. (2011). Dynare: Reference manual, version 4: Dynare working papers 1, CEPREMAP.
Amisano, G., & Geweke, J. (2017). Prediction using several macroeconomic models. The Review of Economics and Statistics, 99(5), 912-925.
Amisano, G., & Giacomini, R. (2007). Comparing density forecasts via weighted likelihood ratio tests. Journal of Business & Economic Statistics, 25(2), 177-190.
Bates, J. M., & Granger, C. W. (1969). The combination of forecasts. Journal of the Operational Research Society, 20(4), 451-468.
Bernanke, B. S., Gertler, M., & Gilchrist, S. (1999). Chapter 21: The financial accelerator in a quantitative business cycle framework. In Handbook of macroeconomics: Vol. 1 (pp. 1341-1393). Elsevier.
Billio, M., Casarin, R., Ravazzolo, F., & Van Dijk, H. K. (2013). Time-varying combinations of predictive densities using nonlinear filtering. Journal of Econometrics, 177(2), 213-232.
Brand, T., & Toulemonde, N. (2015). Automating update of the Smets and Wouters (2003) database: Technical report.
Brooks, S. P., & Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7(4), 434-455.
Burriel, P., & Galesi, A. (2018). Uncovering the heterogeneous effects of ECB unconventional monetary policies across euro area countries. European Economic Review, 101, 210-229.
Carter, C. K., & Kohn, R. (1994). On Gibbs sampling for state space models. Biometrika, 81(3), 541-553.
Casarin, R., Grassi, S., Ravazzolo, F., & van Dijk, H. K. (2015). Parallel sequential Monte Carlo for efficient density combination: The DeCo MATLAB toolbox. Journal of Statistical Software, 68(3), 1-30.
Christensen, I., & Dib, A. (2008). The financial accelerator in an estimated New Keynesian model.
Review of Economic Dynamics, 11(1), 155-178.
Christoffel, K., Coenen, G., & Warne, A. (2011). Forecasting with DSGE models. In The Oxford handbook of economic forecasting (pp. 89-126).
Cogley, T., Primiceri, G. E., & Sargent, T. J. (2010). Inflation-gap persistence in the US. American Economic Journal: Macroeconomics, 2(1), 43-69.
Del Negro, M., Giannoni, M. P., & Schorfheide, F. (2015). Inflation in the Great Recession and New Keynesian Models. American Economic Journal: Macroeconomics, 7(1), 168-196.
Del Negro, M., Hasegawa, R. B., & Schorfheide, F. (2016). Dynamic prediction pools: An investigation of financial frictions and forecasting performance. Journal of Econometrics, 192(2), 391-405.
Delle Chiaie, S. (2009). The sensitivity of DSGE models' results to data detrending: Technical Report 157, Oesterr. Nationalbank Working Paper Series.
Diebold, F. X., & Mariano, R. S. (1995). Comparing predictive accuracy. Journal of Business & Economic Statistics, 13(3), 253-263.
Fagan, G., Henry, J., & Mestre, R. (2005). An area-wide model for the Euro area. Economic Modelling, 22(1), 39-59.
Frühwirth-Schnatter, S. (1994). Data augmentation and dynamic linear models. Journal of Time Series Analysis, 15(2), 183-202.
Geweke, J., & Amisano, G. (2011). Optimal prediction pools. Journal of Econometrics, 164(1), 130-141.
Geweke, J., & Amisano, G. (2012). Prediction with misspecified models. American Economic Review, 102(3), 482-486.
Gilchrist, S., & Mojon, B. (2018). Credit risk in the Euro area. The Economic Journal, 128(608), 118-158.
Gorodnichenko, Y., & Ng, S. (2010). Estimation of DSGE models when the data are persistent. Journal of Monetary Economics, 57(3), 325-340.
Hall, S. G., & Mitchell, J. (2007). Combining density forecasts. International Journal of Forecasting, 23(1), 1-13.
Hamilton, J. D. (2018). Why you should never use the Hodrick-Prescott filter.
The Review of Economics and Statistics, 100(5), 831-843.
Holton, S., & d'Acri, C. R. (2018). Interest rate pass-through since the Euro area crisis. Journal of Banking & Finance, 96, 277-291.
Hoogerheide, L., Kleijn, R., Ravazzolo, F., Van Dijk, H. K., & Verbeek, M. (2010). Forecast accuracy and economic gains from Bayesian model averaging using time-varying weights. Journal of Forecasting, 29(1-2), 251-269.
Justiniano, A., Primiceri, G. E., & Tambalotti, A. (2011). Investment shocks and the relative price of investment. Review of Economic Dynamics, 14(1), 102-121.
Kolasa, M., & Rubaszek, M. (2015). Forecasting using DSGE models with financial frictions. International Journal of Forecasting, 31(1), 1-19.
Koop, G., & Korobilis, D. (2012). Forecasting inflation using dynamic model averaging. International Economic Review, 53(3), 867-886.
Koop, G., & Korobilis, D. (2013). Large time-varying parameter VARs. Journal of Econometrics, 177(2), 185-198.
McAlinn, K., Aastveit, K. A., Nakajima, J., & West, M. (2019). Multivariate Bayesian predictive synthesis in macroeconomic forecasting. Journal of the American Statistical Association, 1-19.
McAlinn, K., & West, M. (2019). Dynamic Bayesian predictive synthesis in time series forecasting. Journal of Econometrics, 210(1), 155-169.
Pettenuzzo, D., & Ravazzolo, F. (2016). Optimal portfolio choice under decision-based model combinations. Journal of Applied Econometrics, 31(7), 1312-1332.
Raftery, A. E., Kárný, M., & Ettler, P. (2010). Online prediction under model uncertainty via dynamic model averaging: Application to a cold rolling mill. Technometrics, 52(1), 52-66.
Smets, F., & Wouters, R. (2003). An Estimated Dynamic Stochastic General Equilibrium Model of the Euro Area. Journal of the European Economic Association, 1(5), 1123-1175.
Smets, F., & Wouters, R. (2007). Shocks and frictions in US business cycles: A Bayesian DSGE approach. American Economic Review, 97(3), 586-606.
Timmermann, A. (2006). Forecast combinations.
In Handbook of economic forecasting: Vol. 1 (pp. 135-196). Elsevier.
Wolters, M. H. (2015). Evaluating point and density forecasts of DSGE models. Journal of Applied Econometrics, 30(1), 74-96.