Ethiopia Today: Seasonal-to-Interannual Variability of Ethiopia/Horn of Africa Monsoon. Part II: Statistical Multimodel Ensemble Rainfall Predictions

By Lamb, Peter J
Proquest LLC

ABSTRACT

An ensemble-based multiple linear regression technique is developed to assess the predictability of regional and national June-September (JJAS) anomalies and local monthly rainfall totals for Ethiopia. The ensemble prediction approach captures potential predictive signals in regional circulations and global sea surface temperatures (SSTs) two to three months in advance of the monsoon season. Sets of 20 potential predictors are selected from visual assessments of correlation maps that relate rainfall with regional and global predictors. Individual predictors in each set are utilized to initialize specific forward stepwise regression models to develop ensembles of equal number of statistical model estimates, which allow quantifying prediction uncertainties related to individual predictors and models. Prediction skill improvement is achieved through error minimization afforded by the ensemble.

For retroactive validation (RV), the ensemble predictions reproduce well the observed all-Ethiopian JJAS rainfall variability two months in advance. The ensemble mean prediction outperforms climatology, with mean square error reduction (SSClim) of 62%. The skill of the prediction remains high for leave-one-out cross validation (LOOCV), with the observed-predicted correlation r (SSClim) being +0.81 (65%) for 1970-2002. For tercile predictions (below, near, and above normal), the ranked probability skill score is 0.45, indicating improvement compared to climatological forecasts. Similarly high prediction skill is found for local prediction of monthly rainfall total at Addis Ababa (r = +0.72) and Combolcha (r = +0.68), and for regional prediction of JJAS standardized rainfall anomalies for northeastern Ethiopia (r = +0.80). Compared to the previous generation of rainfall forecasts, the ensemble predictions developed in this paper show substantial value to benefit society.

(ProQuest: ... denotes formulae omitted.)

1. Introduction

Rainfall is the most important climate element affecting the Ethiopian population (96.6 million). The main rainy season (called Kiremt) occurs during June-September (JJAS), when the northern two-thirds of the country receives 65%-95% of Ethiopia's total annual rainfall (Segele and Lamb 2005) and produces 85%-95% of its food (Degefu 1987). Because all Ethiopian agricultural activities and resulting crop production are dependent on the distribution and amount of JJAS rainfall, its seasonal prediction is of great importance for agricultural planning and socioeconomic disaster mitigation.

Several previous studies have examined aspects of the intraseasonal-to-interannual variability of Ethiopian rainfall. These studies have considered the following: Kiremt onset, cessation, and resulting growing season variability (Segele and Lamb 2005); the abrupt latitudinal rainfall changes involved and their representation in climate model simulations (Riddle and Cook 2008; Segele et al. 2009c); associated large-scale atmospheric circulation characteristics and global sea surface temperature patterns (Beltrando and Camberlin 1993; Shanko and Camberlin 1998; Segele et al. 2009a; Berhane et al. 2014); and connections to the Indian monsoon (Camberlin 1997; Vizy and Cook 2003). Also investigated have been the hydrological responses to rainfall (Conway 2000), temporal trends and decadal shifts of the rain (Seleshi and Zanke 2004; Bowden and Semazzi 2007; Cheung et al. 2008; Williams et al. 2012; Viste et al. 2013; Berhane et al. 2014), and seasonal spring (Belg) and Kiremt rainfall predictability (Gissila et al. 2004; Korecha and Barnston 2007; Block and Rajagopalan 2007; Diro et al. 2008; Korecha and Sorteberg 2013; Jury 2013; Nicholson 2014).

Segele et al. (2009b, hereinafter Part I) used wavelet analysis to identify temporal and spectral characteristics of the seasonal-to-interannual variability of 5-day average Ethiopian JJAS rainfall, and to present a time-frequency quantification of teleconnections between the rainfall and large-scale atmospheric circulation and global sea surface temperature (SST) patterns. The results linked Ethiopian monsoon rainfall variations principally with seasonal-to-interannual time-scale atmospheric circulation patterns that involve fluctuations in the major components of the monsoon system, including the monsoon trough, tropical easterly jet (TEJ), and Somali low level jet (SLLJ). Part I thus provided the physical basis for understanding the mechanisms of Ethiopian rainfall variability and identifying regional rainfall drivers. The current study builds on that foundation by developing new statistical methods for assessing the predictability of monthly-to-seasonal Ethiopian rainfall on national and local scales, as a step toward an operational seasonal prediction capability for Ethiopia.

On interannual time scales, monsoon rainfall variability now is widely recognized as being governed by slowly varying surface boundary conditions, such as SSTs, land surface albedo, and soil moisture. For Africa, the role of basin- and global-scale SST anomalies for such rainfall variability received considerable attention over the last 35 years (e.g., Lamb 1978a,b; Folland et al. 1986, 1991; Lamb and Peppler 1992; Barnston et al. 1996; Ward 1998; Mutai and Ward 2000; Camberlin et al. 2001; Giannini et al. 2003; Segele et al. 2009a; Part I; Ndiaye et al. 2011). Using diagnostic relationships developed in those studies, several other investigations further assessed the predictability of Ethiopian Belg (Diro et al. 2008) and Kiremt (Gissila et al. 2004; Block and Rajagopalan 2007; Korecha and Barnston 2007; Jury 2013; Nicholson 2014) seasonal rainfall. The possible role of land surface conditions for African rainfall was noted first by Charney (1975) and has been investigated in many subsequent studies (e.g., Charney et al. 1977; Charney and Shukla 1981; Xue and Shukla 1993; Clark and Arritt 1995; Clark et al. 2001).

The surface boundary focus of the present Ethiopian study is SST. However, the El Niño-Southern Oscillation (ENSO)-related "predictability barrier" in Northern Hemisphere spring (e.g., Goswami and Shukla 1991; Webster and Yang 1992; Webster et al. 1998) can pose a major challenge to providing seasonal rainfall forecasts two or more months in advance in the tropics (Goddard et al. 2001; Korecha and Barnston 2007). Also, because the effects of slowly evolving global SST variations on Ethiopian rainfall must be transmitted through changes in local and regional atmospheric circulations and SST patterns, development of monthly-to-seasonal Ethiopian rainfall predictions can be enhanced by identifying local and regional predictors and aggregating their effects into ensemble-based statistical prediction schemes.

The present Kiremt predictability investigation involves 2-3-month lead time forecasts of local, regional, and national rainfall for Ethiopia. Individual predictions employ a set of 20 predetermined regional and global predictor variables to construct equal-number ensembles of initial statistical model estimates that capture potentially valuable and unique signals from every variable in the 20-member predictor set. Final models are selected objectively from the 20 initial models after a series of tests are applied that address regression assumptions and statistical significance requirements. The averaging of the final ensembles of statistical estimates is shown to improve cross-validation skill compared to a traditional, single-model-based technique. The ensemble provides a novel approach to quantify the envelope of uncertainty created by the model building process, reflecting the predictive uncertainties associated with individual predictors and models. Cross-validation skill improvement is achieved through error minimization afforded by the ensemble.

2. Data and methodology

Accurate prediction of all-Ethiopian Kiremt rainfall would provide valuable national-scale information. Stakeholders would benefit from accurate monthly rainfall predictions at local and regional levels, especially for highly populated areas and those with large interannual rainfall variability. To achieve a high level of accuracy and a measure of forecast uncertainty, ensemble-based statistical prediction tools were developed and assessed for 1) all-Ethiopian standardized Kiremt rainfall anomalies, 2) August total rainfall at Addis Ababa (the most populated city in the African Union) and Combolcha (center of the 1984 Ethiopian drought), and 3) northeastern Ethiopia standardized JJAS rainfall anomalies. Addis Ababa and Combolcha have daily rainfall records, with Combolcha's high interannual rainfall variability challenging traditional single-model-based seasonal forecast methods (e.g., Gissila et al. 2004; Korecha and Barnston 2007). Northeastern Ethiopia was selected for regional prediction because of the region's susceptibility to drought and high rainfall variability (e.g., Lanckriet et al. 2015).

a. Data

This study utilizes sets of Ethiopian rainfall (predictand) and large-scale atmospheric and global SST (predictor) data for 1970-99 that were described in Part I and now have been extended through 2000-02. For seasonal prediction at a national level, a standardized all-Ethiopian JJAS rainfall index originally was constructed for 1970-99 using daily totals from 100 rain gauge stations (Fig. 1, dots). The present extension of that index for 2000-02 employed seasonal totals for 52 of those stations (Fig. 1, squares) analyzed in Korecha and Barnston (2007; data obtained from A. G. Barnston in 2008). For local prediction examples, August monthly rainfall totals for 1970-99 were used for Addis Ababa and Combolcha (Fig. 1). For regional prediction, 11 stations from three regional states (Amhara, Afar, and Tigray) were used to construct a standardized regional JJAS rainfall index for 1970-99 for northeastern Ethiopia (Fig. 1, stars). This index was extended for 2000-02 using the seasonal total rainfall of seven northeastern Ethiopia stations from Korecha and Barnston (2007) (Fig. 1, collocated squares/stars).

The set of atmospheric predictors for 1970-99 was constructed from monthly NCEP-NCAR reanalysis data (50°N-40°S, 30°W-90°E and 2.5° latitude × 2.5° longitude spatial resolution; Kalnay et al. 1996). These predictors included daily-averaged fields of mean sea level pressure (MSLP), geopotential height, temperature, horizontal wind, vertical velocity, and specific humidity at standard pressure levels. SST predictors for 1970-99 were obtained from the Met Office Hadley Centre global monthly SST dataset (HadISST1; 1° latitude × 1° longitude resolution; Rayner et al. 2003). Both these datasets were extended to 2002.

b. Prediction design

In Part I, wavelet analysis identified the temporal and spectral characteristics of the intraseasonal-to-interannual variability of 5-day average JJAS Ethiopian rainfall, and their strong contemporary teleconnections with large-scale atmospheric circulation and global SST patterns. Although variability on quasi-biennial (QB; 5%) and ENSO (2%) time scales accounted for much less rainfall variance than the annual mode (66%), they modulated the season-long persistence of major regional monsoon components and their remote teleconnection linkages. However, time-lagged relationships between regional rainfall and such large-scale variables generally do not remain strong or statistically significant (e.g., DelSole and Shukla 2002; Segele et al. 2009a, their Fig. 14), which limits their predictive potential.

The present study therefore develops and applies an ensemble-based multiple linear regression technique to predict 1) standardized JJAS rainfall anomalies for all of Ethiopia for 1970-2002, 2) monthly rainfall totals for Addis Ababa and Combolcha for 1970-99, and 3) regional standardized rainfall anomalies for northeastern Ethiopia for 1970-2002 (Fig. 1). This approach assumes that slowly evolving global SST variations affect Ethiopian rainfall through changes in more regional and local atmospheric circulation and SST anomalies, even when the influential ENSO state is unpredictable or its phase has minimal effect (e.g., DelSole and Shukla 2002; Nicholson 2014). The methodology involves three distinct steps: 1) building a set of potential atmospheric circulation and SST predictors from a series of correlation maps; 2) constructing initial ensemble model members by specifying each potential predictor as the first predictor in a multiple regression model; and 3) selecting final ensemble model members based on statistical significance tests.

1) STEP 1: BUILD A SET OF POTENTIAL PREDICTORS

Guided by Part I, the full three-dimensional state of the regional atmosphere and global SST was represented by a set of potential March predictors of Kiremt (JJAS) rainfall. Raw atmospheric predictors consisted of gridpoint values for 50°N-40°S, 30°W-90°E of geopotential height (Φ), temperature (T), horizontal wind (u, v), vertical velocity (w), and specific humidity (q) at standard pressure levels, and MSLP. To accommodate potential interaction effects between these predictors, additional second-order predictors were derived by their cross multiplication, that is, Φu, Φv, Tu, Tv, uv, uw, and vw at 12 pressure levels from 1000 to 100 hPa, and qu and qv at eight levels from 1000 to 300 hPa. Also, following Part I, the horizontal wind components and vertical velocity for 1000-100 hPa within 30°N-20°S, 30°-50°E were used to construct zonal and meridional vertical cross sections of uw and vw. Raw global (60°N-60°S) SST predictors for March were used in their original form. Each gridpoint variable was standardized. Correlations were computed at all grid points and levels and, following Gissila et al. (2004), Block and Rajagopalan (2007), Korecha and Barnston (2007), and Nicholson (2014), coherent areas of highly significant correlation were selected and the average values of individual variables within the coherent areas were taken as candidate predictors. Here, it was required that the correlations remain statistically significant at the 5% level according to a nonparametric bootstrap test for 1970-90 and 1970-99. In addition, the physical significance of the correlations was assessed for all selected predictors. Representative samples of predictors are discussed in section 3a. The number of candidate predictors for a given predictand was limited to a maximum of 20. These predictors were then used for both retroactive/retrospective verification (RV) and leave-one-out cross validation (LOOCV) approaches. In RV, the 20 selected March predictors were used to generate all-Ethiopian Kiremt (JJAS) rainfall predictions for the 1990-2002 verification period, based on 1970-89 model development data. In the LOOCV, 20 March predictors were used for each year during 1) 1970-2002 for national (all-Ethiopian) and northeastern regional predictions of seasonal rainfall anomalies and 2) 1970-99 for local monthly rainfall prediction at Addis Ababa and Combolcha, based on model development data for all other years in the corresponding training period.

The above predictors may not be mathematically independent. However, the problem of multicollinearity that arises in a given predictor matrix, owing to lack of independence between levels and variables, was evaluated for individual regression models during the final model selection stage, as described in step 3 below.

2) STEP 2: CONSTRUCTION OF MULTIMODEL ENSEMBLE SET

Model development used a forward selection stepwise multiple linear regression procedure (S-PLUS 2013) that began with base models containing only one predictor from the preselected set and an intercept. Other preselected predictors then were added individually in turn, until no further improvement resulted according to the Akaike information criterion (AIC; Venables and Ripley 1997, 218-222). This process was repeated until each of the Np preselected predictor variables for a given prediction/verification year had been used as a first model parameter, giving Np individual prediction equations. This ensures that a potential predictive signal uniquely associated with a predictor is not unduly discarded. The forward selection procedure was necessary because the alternative backward elimination stepwise approach starts from a full (complex) model and does not guarantee that the final model will contain a particular predictor as its first parameter.
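The step 2 construction can be illustrated with a minimal Python sketch. It is not the S-PLUS procedure used in the study. The function names (`ols_rss`, `aic`, `forward_from`, `build_initial_ensemble`), the normal-equation solver, and the Gaussian form of AIC (stated only up to an additive constant) are assumptions made for illustration. Each candidate predictor is forced in as the first term of its own model, and further predictors are added only while AIC decreases.

```python
import math

def ols_rss(columns, y):
    """Residual sum of squares of an OLS fit of y on `columns` plus an intercept.
    Assumes the predictors are not exactly collinear (no singularity check)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gaussian elimination with partial pivoting.
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(k + 1, p):
            f = A[r][k] / A[k][k]
            for c in range(k, p + 1):
                A[r][c] -= f * A[k][c]
    # Back substitution for the regression coefficients.
    b = [0.0] * p
    for k in reversed(range(p)):
        b[k] = (A[k][p] - sum(A[k][c] * b[c] for c in range(k + 1, p))) / A[k][k]
    return sum((y[i] - sum(X[i][j] * b[j] for j in range(p))) ** 2 for i in range(n))

def aic(rss, n, k):
    """Gaussian AIC up to an additive constant; k counts fitted coefficients."""
    return n * math.log(rss / n) + 2 * k

def forward_from(first, predictors, y):
    """Forward selection that starts from `first` and keeps adding the predictor
    with the lowest AIC until no addition lowers AIC further."""
    n = len(y)
    chosen = [first]
    best = aic(ols_rss([predictors[first]], y), n, 2)
    while True:
        trials = []
        for name in predictors:
            if name not in chosen:
                cols = [predictors[c] for c in chosen + [name]]
                trials.append((aic(ols_rss(cols, y), n, len(cols) + 1), name))
        if not trials or min(trials)[0] >= best:
            return chosen
        best, name = min(trials)
        chosen.append(name)

def build_initial_ensemble(predictors, y):
    """One model per candidate predictor, each seeded with that predictor."""
    return {first: forward_from(first, predictors, y) for first in predictors}
```

Here `predictors` is a hypothetical dictionary mapping predictor names to equal-length lists of standardized values, and `y` is the rainfall predictand for the same years. The returned dictionary holds Np predictor lists, one per seed predictor, which would then be screened as in step 3.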

3) STEP 3: SELECTION OF FINAL MODEL MEMBERS

Final multimodel ensemble members were selected from candidate models identified in step 2, based on predictor independence, model assumption validity, model coefficient statistical significance, and overall model significance. Several procedures were applied to ensure that the multiple regression assumption of independent predictors was not seriously violated. Correlations between all predictors were determined for the entire time series, and models for which correlation magnitudes between any two predictors exceeded 0.5 were discarded. Further scrutiny for multicollinearity used the variance inflation factor (VIF; Tamhane and Dunlop 2000, p. 417) to eliminate remaining models with VIFs > 10. Remaining models were considered for further analysis only when the null hypotheses of normality, zero autocorrelation, and homoscedasticity were not rejected at the 5% (α = 0.05) significance level according to the following tests: the Shapiro-Wilk normality test (Steinskog et al. 2007), the Durbin-Watson test for serial independence (Wilks 2006, 192-193), and the Breusch-Pagan test for conditional heteroscedasticity (Breusch and Pagan 1979). A model then was admitted to a final multimodel ensemble set only when its individual regression coefficients, as well as the overall model fit, were statistically significant at the 5% level according to an F test (e.g., Draper and Smith 1981, 31-33; Tamhane and Dunlop 2000, 407-410). In the above significance tests, if the cost of type I error is much larger (smaller) than expected, the α value can be adjusted to a smaller (larger) value. The cost of type II error needs to be considered simultaneously when adjusting α. Following other regression-based prediction development (e.g., Hastenrath et al. 1995; DelSole and Shukla 2002; Saunders and Lea 2005; Block and Rajagopalan 2007; Korecha and Barnston 2007), nonparametric tests were not performed for regression coefficient significance because of the considerable computational requirements associated with the large number of regression models involved.
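The two multicollinearity screens in step 3 can be sketched as follows. This is an illustrative Python fragment, not the study's code: the function names are hypothetical, and the VIFs are obtained from the diagonal of the inverse predictor correlation matrix, which is a standard identity equivalent to 1/(1 - R²) from regressing each predictor on the others. The residual diagnostics (Shapiro-Wilk, Durbin-Watson, Breusch-Pagan) and the F tests are not reproduced here.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (intended for small matrices)."""
    p = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(p)]
           for i, row in enumerate(matrix)]
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(aug[r][k]))
        aug[k], aug[piv] = aug[piv], aug[k]
        d = aug[k][k]
        aug[k] = [v / d for v in aug[k]]
        for r in range(p):
            if r != k:
                f = aug[r][k]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[k])]
    return [row[p:] for row in aug]

def vifs(columns):
    """Variance inflation factors: diagonal of the inverse correlation matrix."""
    p = len(columns)
    R = [[pearson(columns[i], columns[j]) for j in range(p)] for i in range(p)]
    inv = invert(R)
    return [inv[j][j] for j in range(p)]

def passes_collinearity_screen(columns, max_abs_r=0.5, max_vif=10.0):
    """Reject a model if any predictor pair has |r| > 0.5 or any VIF > 10."""
    p = len(columns)
    for i in range(p):
        for j in range(i + 1, p):
            if abs(pearson(columns[i], columns[j])) > max_abs_r:
                return False
    return all(v <= max_vif for v in vifs(columns))
```

The pairwise check runs first so that nearly collinear predictor sets are rejected before the correlation matrix is inverted.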

c. Forecast verification

Comprehensive model evaluation was performed by comparing predicted and observed rainfall values for mutually exclusive/independent verification periods, using goodness-of-fit and relative/absolute error measures with supporting estimates of reliability (Willmott et al. 1985; Legates and McCabe 1999, hereinafter LM99). Mean absolute error (MAE) and root-mean-square error (RMSE) statistics quantified prediction errors. To assess the models' goodness of fit or relative error, the Pearson's product-moment correlation coefficient, coefficient of efficiency [E2; Eq. (2) of LM99], index of agreement [d2; Eq. (3) of LM99], modified index of agreement [d1; Eq. (4) of LM99], and modified coefficient of efficiency [E1; Eq. (5) of LM99] were all calculated. Willmott et al. (1985) and LM99 explain E2, d2, d1, and E1, while DelSole and Shukla (2002) and Wilks (2006, 280-281) further discuss E2.
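For orientation, the deterministic measures named above can be computed as in the following Python sketch. The formulas are the commonly quoted forms of the coefficient of efficiency and index of agreement with exponents 2 and 1; the function name and dictionary keys are hypothetical, and LM99 remains the authoritative source for the exact definitions used in the study.

```python
import math

def verification_metrics(obs, pred):
    """MAE, RMSE, E2, E1, d2, and d1 for paired observed/predicted values."""
    n = len(obs)
    obar = sum(obs) / n                      # observed mean (baseline)
    abs_err = sum(abs(o - p) for o, p in zip(obs, pred))
    sq_err = sum((o - p) ** 2 for o, p in zip(obs, pred))
    abs_dev = sum(abs(o - obar) for o in obs)
    sq_dev = sum((o - obar) ** 2 for o in obs)
    # "Potential error" terms used by the index of agreement.
    pot1 = sum(abs(p - obar) + abs(o - obar) for o, p in zip(obs, pred))
    pot2 = sum((abs(p - obar) + abs(o - obar)) ** 2 for o, p in zip(obs, pred))
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "E2": 1.0 - sq_err / sq_dev,    # coefficient of efficiency
        "E1": 1.0 - abs_err / abs_dev,  # modified coefficient of efficiency
        "d2": 1.0 - sq_err / pot2,      # index of agreement
        "d1": 1.0 - abs_err / pot1,     # modified index of agreement
    }
```

With these forms, a forecast equal to the observed mean in every year gives E2 = E1 = 0, and a perfect forecast gives a value of 1 for all four relative measures.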

The aforementioned deterministic model evaluation metrics were complemented by the ranked probability score (RPS), a commonly used ensemble forecast performance measure (Mason and Mimmack 1992; Wilks 2006, 299-302; Block and Rajagopalan 2007), defined as

...

where J is the number of prediction categories, and Ym and Om are cumulative predicted and observed probabilities through category m, respectively. The RPS was converted into a ranked probability skill score (RPSS) by computing it relative to climatological probabilities, where

...

Three equiprobable categories (terciles) were used, and were obtained by ranking and grouping predictand time series into the bottom (below normal), middle (near normal), and top (above normal) thirds.
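Because the formulae are omitted in this copy, the following Python sketch uses the textbook forms as an assumption: RPS as the sum over categories of the squared difference between cumulative forecast and cumulative observed probabilities, and RPSS as one minus the ratio of forecast RPS to climatological RPS. The function names, the tercile limits passed as `lower` and `upper`, and the use of ensemble member counts as forecast probabilities are all illustrative choices rather than the study's documented procedure.

```python
def category(value, lower, upper):
    """0 = below normal, 1 = near normal, 2 = above normal."""
    if value <= lower:
        return 0
    return 1 if value <= upper else 2

def ensemble_probabilities(members, lower, upper):
    """Fraction of ensemble members falling in each tercile category."""
    counts = [0, 0, 0]
    for m in members:
        counts[category(m, lower, upper)] += 1
    return [c / len(members) for c in counts]

def rps(forecast_probs, observed_category):
    """Ranked probability score for one forecast (0 is perfect)."""
    score, cum_f, cum_o = 0.0, 0.0, 0.0
    for m, p in enumerate(forecast_probs):
        cum_f += p                                      # cumulative forecast probability
        cum_o += 1.0 if m == observed_category else 0.0  # cumulative observation
        score += (cum_f - cum_o) ** 2
    return score

def rpss(forecasts, observed_categories, clim_probs=(1 / 3, 1 / 3, 1 / 3)):
    """Skill relative to equiprobable climatology (1 is perfect, 0 is no skill)."""
    rps_f = sum(rps(f, o) for f, o in zip(forecasts, observed_categories))
    rps_c = sum(rps(clim_probs, o) for o in observed_categories)
    return 1.0 - rps_f / rps_c
```

Positive RPSS values indicate that the categorical forecasts are closer to the observed terciles than a constant one-third probability in each category.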

Because the aforementioned measures follow no known distribution, no standard parametric approach can determine the reliability and significance; therefore, significance and confidence intervals were assessed through two nonparametric bootstrap methods that permit any data distribution to generate statistics (Efron and Gong 1983; Willmott et al. 1985; Solow 1985; LM99). First, and following Part I, the reliability of each sample statistic (Mason and Mimmack 1992; Nicholls 2001) was assessed by calculating its confidence intervals, using the bias-corrected and accelerated (BCa) bootstrap technique of Efron and Tibshirani (1993, p. 178). To establish confidence interval accuracy, the BCa method was applied to sample statistics computed from 5000 bootstrap samples of N pairs of observed/predicted rainfall time series members. Each of the N pairs was obtained by reshuffling and randomly choosing, with replacement, single-pair values at a time from the N pairs of observed/predicted rainfall time series members. In the second bootstrap approach, an achieved significance level (ASL)-the probability of exceeding the observed correlation by chance-was obtained for correlations computed from the same 5000 bootstrap samples of N pairs of observed/predicted rainfall time series members (Efron and Tibshirani 1993, p. 203). Each observed-predicted correlation was ranked among the bootstrap sample correlations to determine the fraction of the random correlations that were at least as large as the observed correlation.
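The two resampling ideas can be sketched in Python under simplifying assumptions. The confidence interval below is a plain percentile interval, not the BCa interval used in the study, and the ASL null distribution is built by resampling observed and predicted values independently so that their pairing is broken; both choices, along with the function names and the fixed random seed, are assumptions made for illustration.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either sample has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def bootstrap_percentile_ci(obs, pred, stat, n_boot=5000, alpha=0.05, seed=0):
    """Percentile confidence interval from resampling (obs, pred) pairs."""
    rng = random.Random(seed)
    n = len(obs)
    reps = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # pairs drawn with replacement
        reps.append(stat([obs[i] for i in idx], [pred[i] for i in idx]))
    reps.sort()
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

def achieved_significance_level(obs, pred, n_boot=5000, seed=0):
    """Fraction of chance correlations at least as large as the observed one."""
    rng = random.Random(seed)
    n = len(obs)
    r_obs = pearson(obs, pred)
    hits = 0
    for _ in range(n_boot):
        o = [obs[rng.randrange(n)] for _ in range(n)]   # independent draws
        p = [pred[rng.randrange(n)] for _ in range(n)]  # break the pairing
        if pearson(o, p) >= r_obs:
            hits += 1
    return hits / n_boot
```

Any of the verification statistics can be passed as `stat`, provided it accepts the observed and predicted sequences in that order.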

3. Prediction of Kiremt rainfall

a. Examples of March predictors for all-Ethiopia Kiremt rainfall

This section documents the physical importance of certain selected atmospheric and SST predictors for Kiremt rainfall prediction. Figures 2 and 3 (atmospheric circulation) and Fig. 4 (SST) document the March predictors that were most strongly correlated with standardized all-Ethiopian Kiremt rainfall for the 1970-90 and 1970-99 periods. The discussion below focuses on regions of strong correlations indicated by red arrows in Figs. 2-4.

Zonal temperature advection at 100 hPa (Tu100) over southern Europe during March correlated strongly with Kiremt rainfall, with achieved significance levels (ASLs) of ≤1% according to a nonparametric bootstrap test for both the 1970-90 and 1970-99 periods (Fig. 2, top). The association between north-south variations in zonal westerlies and warm lower stratospheric midlatitude air is manifested by strong correlations. The correlation magnitude decreased appreciably (from +0.50 to +0.40) at 150 hPa, although similar correlation patterns continue in the zonal wind at pressure levels greater than 250 hPa. The positive correlation indicates a likelihood of enhanced Kiremt rainfall with increased Tu100 over southern Europe.

Another potential predictor involving the mass field was found over the western equatorial Indian Ocean in association with the meridional flux of geopotential height at 500 hPa (Φv500) during March (Fig. 2, middle). The main correlation signal was linked to the meridional wind (v500), but including variability in the midtropospheric tropical geopotential height field increased the correlation with Kiremt rainfall. Perturbation in the northerly wind occurs in response to the east-west (mainly) oscillation of a weak tropical high over Africa and the eastward passage of subtropical westerlies, which break the weak zonal midtropospheric easterlies over the equatorial Indian Ocean. The positive v500-rainfall correlations indicate that the strengthening of equatorial southerlies over the western Indian Ocean during March is likely to enhance Ethiopian rainfall in summer.

In the lower troposphere, meridional moisture flux in the western Indian Ocean in March is associated strongly with Ethiopian Kiremt rainfall. March meridional moisture flux is affected by a transitory subtropical anticyclone over the Arabian Peninsula, which brings moisture for the short rains in Ethiopia from mid-February to mid-May (locally called Belg) when it moves eastward over the Arabian Sea. In association with the eastward displacement of the subtropical high/anticyclone to the Arabian Sea (usually in response to a southward intrusion of westerlies), the dry continental northeasterlies are replaced by maritime northeasterly to easterly humid air flowing into Ethiopia from the Arabian Sea. The negative association between rainfall and meridional moisture flux at 925 hPa (qv925) indicates that increased (decreased) northerly moisture flux over the northern Indian Ocean during March can lead to enhanced (reduced) Ethiopian seasonal rainfall during JJAS.

Predictors in the horizontal wind and vertical velocity fields include 500-200-hPa zonal and meridional winds and vertical velocity. In the upper troposphere, the product of 200-hPa u and w (uw200) during March possessed a strong negative correlation with Kiremt rainfall over the northeastern Arabian Sea. This correlation arises from the passage of a ridge-trough system that perturbs a normally weak zonal and subsiding flow over the northern Indian Ocean. The negative correlation indicates that increasing westerlies/ascending motion (easterlies/descending motion) during March likely lead to reduced (enhanced) Ethiopian Kiremt rainfall. Nicholson (2014) also found a strong correlation between Horn of Africa rainfall and zonal wind at 200 hPa for May. The 200-hPa vertical velocity association with Ethiopian rainfall strengthens at 300 hPa with its footprint extending farther west to the western Indian Ocean (Fig. 3, middle).

In the middle troposphere, the zonal and meridional wind for March showed strong correlation with Ethiopian rainfall off the coast of Kenya (Fig. 3, bottom). The predictability signal was obtained from both the zonal and meridional components, with the zonal and meridional wind interaction shifting the region of influence near the African coast compared to Fig. 2 (middle). The negative correlation indicates that the simultaneous strengthening of northerlies (southerlies) and westerlies (easterlies) during March over the western Indian Ocean can enhance Ethiopian rainfall during JJAS.

Much of Ethiopian Kiremt rainfall predictability from SST is generated from tropical Pacific SST, where there are strong statistically significant correlations for both the 1970-90 and 1970-99 periods. Previous studies also found the northern tropical Pacific to be a source of Ethiopian rainfall predictability (e.g., Block and Rajagopalan 2007; Korecha and Barnston 2007; Nicholson 2014). After demonstrating the circulation patterns associated with the correlations, a physical basis for retention of these predictors is confirmed and a total of 20 predictors were selected for modeling of all-Ethiopian Kiremt rainfall (Table 1). These predictors were used for both RV and LOOCV approaches.

b. Retroactive verification of all Ethiopian predictions using March predictors

For RV of all-Ethiopian Kiremt rainfall predictions, 20 individual regression models (with intercepts) initially were developed for the 1970-89 training period, following the multimodel ensemble construction (step 2) described in section 2b. The numbers of model coefficients (not counting the intercepts) stepped into the initial regression models were 2 (5% of models), 3 (65%), and 4 (30%). The final multimodel ensemble set (11 models) was identified as described in step 3 (section 2b) using model statistics for the 1970-89 training period (regression coefficients and residuals, numbers of coefficients, multiple R2, VIF, and correlations between predictors).

Model performance is summarized in Fig. 5, which displays the correlations between model-predicted and observed standardized all-Ethiopian Kiremt rainfall anomalies for the training (1970-89) and independent verification (1990-2002) periods for the initial 20 and final 11 models (Fig. 5b). All 20 models reproduced well the observations for the training period (R2 > 0.77), but their cross-validated skill decreased significantly for the independent 1990-2002 period, with 55% of models having less than 50% explained variance and RMSE increasing by 39%-193% (Fig. 5a). Figure 5b shows the regression equations for the final 11 selected models that objectively satisfied the assumptions of the linear regression technique and attained statistical significance at the 5% level (section 2b). All models have a maximum of four nonintercept coefficients. With the exception of the vw925 predictor, which appeared in 7 of the 11 final models as the second parameter, the other predictors are fairly uniformly distributed with ≤4 out of 11 frequency of occurrence in the ensemble. Comparison of Figs. 5a and 6 shows that significant prediction improvement resulted from averaging the ensemble members, consistent with findings of Krishnamurti et al. (2000) and Bohn et al. (2010) for dynamic multimodel ensemble forecasting. This benefit primarily arises from cancelling (reinforcing) disagreements among (common features between) ensemble members (Wilks 2006, p. 235).

Figure 6 and Table 2 document the RV performance using the mean and median of the final 11-member model ensemble. To highlight the skill advantages that arise from the ensemble averaging, the median column in Table 2 gives the median values of the statistics under the observed versus predicted column but obtained for the final 11 regression models individually prior to ensemble averaging. Pearson's correlations between observed and predicted ensemble means (+0.84) and medians (+0.79) for the 1990-2002 verification period have ASLs of <1% according to a nonparametric bootstrap test (Fig. 6). The Spearman's rank correlation increased slightly for the ensemble mean (+0.87) but remained nearly the same for the median (+0.78), both with similarly determined ASLs ≤1%, indicating strong monotonic association between the ranking of ensemble forecasts and observations. Predictions have similar variability as the observations, as evidenced by their standard deviations (Table 2, Fig. 6 inset). Absolute prediction error statistics (MAE and RMSE) are 46%-59% of the observed standard deviation. All relative error measures in Table 2 indicate that the model ensemble mean satisfactorily reproduced the observed standardized rainfall anomalies. The prediction showed conventional skill score (SSClim; Wilks 2006, p. 281) improvement over 1990-2002 climatology (62% MSE reduction) and persistence (87%). For tercile predictions (below, near, and above normal), the ranked probability skill score (RPSSClim) exceeded that of climatology (+0.55). Further, by comparing the ensemble mean statistics with the median average values for the individual regression model counterparts, there emerges a clear increase in skill for all statistics for the ensemble mean: increased prediction variance, reduced absolute errors, and much increased goodness-of-fit statistics.
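The ensemble summaries and the climatology-referenced skill score can be illustrated with a short Python sketch. It assumes that the skill score takes the usual MSE form, one minus the ratio of forecast MSE to the MSE of a constant forecast equal to the verification-period observed mean; the function names and the layout of `members` (one list of yearly predictions per model) are hypothetical.

```python
import statistics

def ensemble_summary(members):
    """Year-by-year ensemble mean and median across the member models."""
    by_year = list(zip(*members))
    means = [statistics.fmean(vals) for vals in by_year]
    medians = [statistics.median(vals) for vals in by_year]
    return means, medians

def mse(obs, pred):
    """Mean square error of paired observed/predicted values."""
    return statistics.fmean([(o - p) ** 2 for o, p in zip(obs, pred)])

def skill_vs_climatology(obs, pred):
    """MSE skill score relative to the observed mean of the same period."""
    clim = statistics.fmean(obs)
    return 1.0 - mse(obs, pred) / mse(obs, [clim] * len(obs))
```

Because squared error is convex, the MSE of the ensemble mean can never exceed the average MSE of the individual members, which is one way to see why averaging tends to raise the skill score.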

Figure 7 documents the reliability of the above statistics in the context of empirically derived frequency distributions of bootstrapped samples and corresponding confidence intervals. Because the RPSSClim [skewness b1 = -0.3 and kurtosis b2 = +3.2, defined in Tamhane and Dunlop (2000, p. 118) and computed from sample replicates used in Fig. 7] and (especially) the Pearson's correlation r (b1 = -0.7, b2 = +4.3) are negatively skewed and leptokurtic compared to the standard normal distribution, the BCa method makes appropriate adjustments to correct the skewness bias in the estimation of their confidence intervals. The modified index of agreement d1 also is negatively skewed and leptokurtic compared to the standard normal distribution (b1 = -0.7, b2 = +3.6), while MAE is positively skewed (b1 = 0.3) and mesokurtic (b2 = +3.0).

To examine the cross-validated performance of the ensemble prediction as a function of ensemble sizes, the number of predictors, and hence initial ensemble model sizes, was varied from five to the maximum available [...] variance of the prediction also decreases as the ensemble size increases, as shown by the dashed line in Fig. 8. However, the ensemble standard deviation for the final ensemble is still higher than the observed standard deviation (Table 2). It can also be seen from the figure that an increase in the number of predictors may not increase the ensemble size, because some of the models associated with the added predictors do not make it to the final ensemble set as they fail to satisfy preset conditions of statistical significance and regression requirements. In general, diversifying the predictor set can increase the likelihood of capturing regional atmospheric predictive signals that improve the prediction of Kiremt rainfall.

c. Leave-one-out cross validation of all Ethiopian predictions using March predictors

All-Ethiopian Kiremt rainfall predictability also was assessed for 1970-2002 using March predictors via the LOOCV approach described in section 2b. The procedures described in section 2b were followed for model construction (step 2) and final ensemble model member selection (step 3). After the passage of all statistical tests, 7-14 models were accepted as the final ensemble set from the initial 20 models (Table 2). In addition to the intercept, the initial models had 3-6 predictor coefficients, with the majority of the models having 5 coefficients. The maximum number of nonintercept coefficients allowed in the final ensemble set was five, because reducing the number of regression parameters to four led to a single model for 1974 (Table 3) that did not satisfy the regression assumptions and statistical significance requirements.
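The leave-one-out idea can be sketched in a few lines: for each verification year, the model is refit on all other years and then used to predict the withheld year. The sketch below is a simplification under stated assumptions; it uses a single hypothetical predictor and ordinary least squares, whereas the paper builds multi-predictor stepwise models and screens them before forming the final ensemble.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def loocv_predictions(x, y):
    """Predict each year from a model trained on every other year."""
    preds = []
    for k in range(len(x)):
        x_train = x[:k] + x[k + 1:]      # withhold year k from the training set
        y_train = y[:k] + y[k + 1:]
        intercept, slope = fit_line(x_train, y_train)
        preds.append(intercept + slope * x[k])
    return preds

# Hypothetical March predictor values and JJAS rainfall anomalies.
march_index = [0.2, -1.1, 0.7, 1.5, -0.4, 0.9, -1.3, 0.1]
jjas_anomaly = [0.1, -0.5, 0.2, 0.6, -0.3, 0.5, -0.4, 0.0]
print([round(p, 2) for p in loocv_predictions(march_index, jjas_anomaly)])
```

Because the withheld year never enters the fit, each prediction is out of sample, which is why the number and composition of accepted models can differ from year to year.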

The mean and median of the final ensemble model members were used as verification metrics for the forecasts. Figure 9 and Table 3 document the performance of the LOOCV approach for the predictions of standardized all-Ethiopian Kiremt rainfall anomalies. The Pearson's correlations between the observed and predicted ensemble mean and median are +0.81 and +0.75, respectively (ASLs <1% according to a nonparametric bootstrap test), for 1970-2002. The Spearman's rank correlation was +0.81 for the ensemble mean and +0.77 for the median.

Table 4 provides additional prediction verification statistics (section 2c) for the final multimodel ensemble shown in Table 3 (last column). The performance statistics under the median column were obtained from individual regression estimates. For verification statistics that require pairwise observed-predicted time series, only the least common number of models in the 33 verification years (7 for 1973 and 1982) was used so that no data pairs are excluded or repeatedly used in cross validating individual regression estimates. This issue arises in the LOOCV approach because the number of final models (and hence the number of predicted values) varies from year to year (Table 3). To be as objective as possible in selecting the least common number of models, seven models were randomly sampled (without replacement) for all verification years with larger final model ensembles and were used for cross validation. As seen from Table 4, the LOOCV ensemble mean prediction proved more skillful than the individual regression estimates for all verification measures including prediction variance (standard deviation), SSClim, RPSSClim, r, and E1. In particular, the RPSSClim skill for the individual regression estimates was far inferior to the skill obtained for the ensemble mean. To ensure that this low skill was not an artifact of a one-time sampling issue, RPSSClim was computed for 10 more samples obtained for different initializations of the random number generator. In all cases, the median of the RPSSClim for the individual regression estimates was less than 0, indicating worse than climatology skill.
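The subsampling step described above can be sketched as follows. The values are hypothetical, and one detail is an assumption the text does not state: the j-th sampled member of each year is strung together into the j-th pseudo-model series. The statistic shown is MAE; any paired observed-predicted measure could be substituted.

```python
import random
from statistics import median

def subsample_members(preds_by_year, seed=0):
    """Sample, without replacement, the least common number of members in every year."""
    rng = random.Random(seed)
    k = min(len(members) for members in preds_by_year.values())
    sub = {year: rng.sample(members, k) for year, members in preds_by_year.items()}
    return sub, k

def mae(pred, obs):
    """Mean absolute error."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def median_member_score(preds_by_year, obs_by_year, score, seed=0):
    """Median of a score computed separately for each sampled member series."""
    sub, k = subsample_members(preds_by_year, seed)
    years = sorted(obs_by_year)
    obs = [obs_by_year[y] for y in years]
    scores = [score([sub[y][j] for y in years], obs) for j in range(k)]
    return median(scores)

# Hypothetical member predictions; the number of accepted models differs by year.
preds_by_year = {
    1970: [0.10, 0.20, 0.30],
    1971: [-0.20, -0.10, 0.00, 0.15],
    1972: [0.40, 0.50, 0.60, 0.70, 0.80],
}
obs_by_year = {1970: 0.20, 1971: -0.10, 1972: 0.60}

# Repeating with several seeds checks that a result is not a one-time sampling artifact.
for seed in range(3):
    print(seed, round(median_member_score(preds_by_year, obs_by_year, mae, seed), 3))
```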

The prediction variability for the ensemble mean is close to the observed variation, with the prediction standard deviation being close to 74% of the standard deviation of the observations (Table 4). The MAE (0.18 standardized units) and RMSE (0.23 standardized units) are smaller than the observed standard deviation (0.39 standardized units). The LOOCV prediction approach showed substantial reduction in MSE compared to climatology (65%) and persistence (86%). Generally, the skill of the LOOCV remains close to the skill obtained for the RV approach and provided skillful all-Ethiopian Kiremt rainfall forecasts with the average regional atmospheric and global SST conditions observed in March (e.g., r = +0.81, d1 = 66%, and RPSS = 45%).
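For readers unfamiliar with the ranked probability skill score, a minimal sketch follows. It assumes that tercile probabilities are taken as the fraction of ensemble members falling in each category and that the reference forecast assigns one-third to each tercile; the paper's own procedure is described in its section 2c. The thresholds and member values are hypothetical.

```python
def tercile_category(value, lower, upper):
    """0 = below normal, 1 = near normal, 2 = above normal."""
    if value < lower:
        return 0
    if value <= upper:
        return 1
    return 2

def ensemble_probs(members, lower, upper):
    """Category probabilities as the fraction of members in each tercile (assumption)."""
    counts = [0, 0, 0]
    for m in members:
        counts[tercile_category(m, lower, upper)] += 1
    return [c / len(members) for c in counts]

def rps(probs, obs_cat):
    """Ranked probability score: squared differences of cumulative probabilities."""
    cum_f = cum_o = total = 0.0
    for k, p in enumerate(probs):
        cum_f += p
        cum_o += 1.0 if k == obs_cat else 0.0
        total += (cum_f - cum_o) ** 2
    return total

def rpss(prob_forecasts, obs_cats):
    """Skill relative to a climatological forecast of equal tercile probabilities."""
    clim = [1.0 / 3.0] * 3
    rps_f = sum(rps(p, c) for p, c in zip(prob_forecasts, obs_cats))
    rps_c = sum(rps(clim, c) for c in obs_cats)
    return 1.0 - rps_f / rps_c

# Hypothetical tercile bounds and ensemble members for three years.
lower, upper = -0.43, 0.43
members_by_year = [[-0.9, -0.5, -0.6, 0.1], [0.0, 0.2, 0.5, -0.1], [0.6, 0.9, 0.3, 0.7]]
observed = [-0.7, 0.1, 0.8]
forecasts = [ensemble_probs(m, lower, upper) for m in members_by_year]
obs_cats = [tercile_category(o, lower, upper) for o in observed]
print(round(rpss(forecasts, obs_cats), 2))
```

Positive values indicate improvement over climatological probabilities, and negative values indicate forecasts worse than climatology, as reported above for the individual regression estimates.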

d. Leave-one-out cross validation for local and regional predictions using March predictors

The LOOCV approach was applied on finer temporal and spatial scales for the prediction of August monthly rainfall totals at Addis Ababa and Combolcha (Fig. 1) and standardized JJAS rainfall anomalies for northeastern Ethiopia based on atmospheric and SST predictors observed in March. Tables 5-7 list the 20-predictor sets used for local and regional predictions. Following previous studies (e.g., Gissila et al. 2004; Block and Rajagopalan 2007; Korecha and Barnston 2007; Nicholson 2014), these predictors were selected based on a careful assessment of maps of predictand-predictor correlations. All selected predictors have statistically significant correlations with ASLs ≤5% according to a nonparametric bootstrap test for both the 1970-90 and 1970-99 periods. The predictors for Addis Ababa (Table 5) include upper tropospheric air temperature, meridional moisture flux over the northern Arabian Sea, SST over the Bay of Bengal, and products of zonal and meridional winds with vertical velocities across the southern oceans, most of which are strongly correlated with the performance of the Belg rains in Ethiopia and the development of the monsoon systems in summer. For Combolcha (Table 6), lower-tropospheric meridional wind and specific humidity over the central Indian Ocean and SST over the southern Pacific possessed strong predictive signals, while for regional predictions (Table 7) midtropospheric zonal and meridional wind over the western Indian Ocean and vertical velocity over West Africa and/or the Gulf of Guinea in March are major predictors of JJAS rainfall anomalies for northeastern Ethiopia.

The procedures presented in section 2b (steps 2 and 3) were followed for model construction and final ensemble model member selection for local predictions for Addis Ababa and Combolcha for 1970-99 and regional prediction for northeastern Ethiopia for 1970-2002. In all cases, models with a maximum of only five nonintercept coefficients were allowed in the final ensemble set.

For Addis Ababa, the final ensemble models range from 5 for 1984 to 14 for 1972, with the majority of the models having 4-5 nonintercept coefficients (Table 8). Training period degrees-of-freedom-adjusted R² (coefficient of multiple determination) varied from 56% to 88% (from 61% to 91%) for all verification years. The LOOCV ensemble mean overestimated the three driest years (1987, 1972, and 1975) by 51-91 mm and underestimated the wettest three years by 18-80 mm (Fig. 10). However, differences between predicted and observed rainfall amounts were less than 50 (33) mm for 77% (73%) of the 1970-99 verification years. The observed and LOOCV ensemble means for 1970-99 differed by <5 mm, with the year-to-year variability of the ensemble mean being 86% of the observed standard deviation. The RPSSClim and all MSE-based skill scores (SSClim, SSPers, E1, d1, and d2) showed improved forecast skill relative to the 1970-99 climatology and persistence (Table 9). The Pearson's correlations between observed and predicted ensemble mean and median are +0.72 and +0.68, respectively (ASLs <1% according to a conventional nonparametric bootstrap test), which decreased to +0.64 and +0.58, respectively, when the maximum number of nonintercept coefficients was reduced to four. In the latter case, however, the final ensemble sizes were small, ranging from 3 or 4 models for many verification years to 12 models for 1972, which can affect the quantification of prediction uncertainty for years with small ensemble sizes. Compared to the median values of individual regression estimates, there is a substantial improvement from the ensemble averaging, as evidenced by a smaller predicted-minus-observed mean difference, higher prediction variance, and smaller MAE and RMSE values. In addition, the ensemble mean provided higher skill values (Table 8) for all verification metrics including RPSSClim, SSClim, and E1 for five randomly selected model ensembles for all years except 1984, which only had five ensemble members.

For the prediction of August rainfall total for Combolcha, from 7 (for 1974) to 17 (for 1986) final ensemble members were selected from the initial 20 models (Table 10). In addition to the intercept, the initial models stepped in 3-6 predictors, with the majority of the models having four regression coefficients. The LOOCV ensemble mean (median) forecasts (Fig. 11) for the final ensembles produced an observed-predicted Pearson's correlation of +0.68 (+0.63), both with ≤1% ASLs according to a nonparametric bootstrap test. The Spearman's rank correlation decreased for both the ensemble mean (+0.63) and median (+0.56), indicating a reduced monotonic association between ensemble-mean predicted and observed time series. Reducing the maximum number of coefficients from five to four did not change the correlation skills, but lowered the ensemble sizes to between 5 (for many verification years) and 12 for 1996. The observed and prediction ensemble means have comparable values (Table 11), but the predicted year-to-year variability was 19% lower than the observed standard deviation. The prediction outperformed climatology and persistence, with a 45% (76%) reduction of the MSE obtained from 1970-99 climatology (persistence). Note that RPSSClim and E1 are not statistically different from zero. Despite the ensemble's poor RPSSClim and E1 skill values, the ensemble averaging showed modest improvements over the median values of individual regression estimates (excluding RPSSClim and E1), with percentage skill increases of 2%-37% for relative error measures, and reductions in absolute error measures of 5%-10%.

For regional JJAS rainfall prediction for northeastern Ethiopia, from 6 (for 1986) to 13 (for 1994) final model ensembles were selected. The majority of models again had 4-5 coefficients, but none of the final ensemble models had more than five nonintercept coefficients (Table 12). The Pearson's correlations between observed and predicted ensemble mean and median are +0.80 and +0.75, respectively (Fig. 12; ASLs <1% according to a conventional nonparametric bootstrap test), while the Spearman's rank correlation counterpart for the ensemble mean (median) is +0.76 (+0.75). The Pearson's correlation skill values for the ensemble mean and median decreased slightly to +0.78 and +0.73, respectively, when the maximum number of nonintercept coefficients was reduced from five to four. However, the small change in correlation values was accompanied by significant reductions in ensemble sizes (4 and 5 for many years and 11 for 1982). The small ensemble sizes in turn affect the quantification of prediction uncertainty. Although the ensemble mean overestimated the driest four years and underestimated the wettest two anomalies, the prediction exhibited sufficiently high year-to-year variability (90% of the observed standard deviation). Unlike for Combolcha, the RPSSClim and all MSE-based skill scores (SSClim, SSPers, E1, d1, and d2) showed improved forecast skill relative to the 1970-2002 climatology and persistence (Table 13). Compared to the median values of individual regression estimates, the ensemble mean showed 20%-22% decreases in MAE and RMSE and 7%-62% increases in SSClim, SSPers, r, E1, d1, and d2 for six randomly selected model ensembles for all years except for 1986, which had only six ensemble model members. Thus, the ensemble-based statistical technique provided substantively high skill scores for regional prediction of JJAS standardized rainfall anomalies two months in advance of the onset of Kiremt in northeastern Ethiopia.

4. Summary and discussion

This study evaluates the predictability of Ethiopian Kiremt monthly to seasonal rainfall at local, regional, and national levels based on the climate system causation of Kiremt rainfall variability over Ethiopia. Time-scale analysis in Part I showed the contemporaneous linkages of Ethiopian rainfall variability with both atmospheric and sea surface temperature (SST) forcing during June-September (JJAS). The monsoon's response to both slowly varying SST variability and higher-frequency regional atmospheric changes is found to be the primary climate system basis for assessing Ethiopian seasonal and monthly rainfall predictability one to three months in advance.

A pool of predictors (20) deemed relevant to the rainfall was selected by careful assessments of a series of historical correlation maps that relate rainfall with individual regional atmospheric and global SST variables for March. Regression models were constructed using a forward stepwise model-fitting procedure, for which each selected predictor was specified as a first model parameter along with an intercept to ensure that a potential predictive signal uniquely associated with a predictor is not unduly discarded. This produces an ensemble of models. Forecast skill was assessed using several verification metrics.
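The seeding idea summarized above can be sketched as follows. This is a simplified illustration, not the authors' procedure: the data are synthetic, the entry rule (a minimum fractional reduction in the residual sum of squares, `min_gain`) is a hypothetical stand-in for the statistical significance tests used in the paper, and the screening of models into a final ensemble is omitted.

```python
import random
from statistics import mean, median

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(rows, y, cols):
    """Least-squares fit of y on an intercept plus the predictors in cols; returns (coef, sse)."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    coef = solve(xtx, xty)
    sse = sum((t - sum(b * v for b, v in zip(coef, d))) ** 2 for d, t in zip(design, y))
    return coef, sse

def seeded_forward_stepwise(rows, y, seed_col, max_terms=5, min_gain=0.05):
    """Forward selection that always keeps seed_col as the first predictor."""
    cols = [seed_col]
    _, sse = fit_ols(rows, y, cols)
    while len(cols) < max_terms and sse > 1e-12:
        trials = [(fit_ols(rows, y, cols + [j])[1], j)
                  for j in range(len(rows[0])) if j not in cols]
        if not trials:
            break
        best_sse, best_j = min(trials)
        if (sse - best_sse) / sse < min_gain:   # stand-in entry rule (assumption)
            break
        cols.append(best_j)
        sse = best_sse
    coef, _ = fit_ols(rows, y, cols)
    return cols, coef

def predict(model, x_new):
    """Evaluate a fitted (cols, coef) model for one new predictor vector."""
    cols, coef = model
    return coef[0] + sum(b * x_new[c] for b, c in zip(coef[1:], cols))

# Synthetic example: 30 years, 6 candidate predictors, two of which carry signal.
rng = random.Random(0)
rows = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(30)]
y = [1.0 + 2.0 * r[0] - 1.5 * r[3] + rng.gauss(0, 0.1) for r in rows]

models = [seeded_forward_stepwise(rows, y, s) for s in range(6)]   # one model per seed
x_new = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
preds = [predict(m, x_new) for m in models]
print(round(mean(preds), 2), round(median(preds), 2))
```

Seeding each model with a different predictor yields as many models as predictors, and the spread of `preds` gives a simple view of the uncertainty associated with individual predictors and models.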

In the retroactive verification (RV) approach, the ensemble prediction for 1990-2002 reproduced well the observed all-Ethiopian Kiremt rainfall variability two months in advance, with a Pearson's correlation [mean-square skill score over climatology (SSClim)] of +0.84 (62%). The leave-one-out cross validation (LOOCV) for 1970-2002, which is a fairer test of the predictive capability of the models, has a Pearson's correlation (SSClim) of +0.81 (65%). For probability forecasts of below normal, near normal, and above normal, the ensemble mean showed improvement compared to climatological forecasts, with a ranked probability skill score (RPSSClim) of 0.45 for the LOOCV approach. Results of LOOCV-based localized predictions of August rainfall at Addis Ababa and Combolcha for 1970-99 showed that the predictions captured the relative interannual variability well. The correlations between observed and ensemble means for Addis Ababa (+0.72) and Combolcha (+0.68) were high. Consistent with decreasing Spearman's rank correlation, the RPSSClim skill is low, especially for Combolcha (+0.12), and not statistically significant at the 5% level. For regional prediction of JJAS standardized rainfall for northeastern Ethiopia, strong linear correlation (+0.80) was found.

There is a marked improvement in the performance of the current empirical ensemble prediction method compared to prediction skill found in previous studies. For example, the correlation skill derived herein between predicted and observed standardized all-Ethiopian Kiremt rainfall anomalies was +0.81 (+0.84) for the LOOCV (RV) for models initiated from average March atmospheric and SST conditions. This can be compared to Korecha and Barnston (2007), who found a correlation skill of +0.64 (+0.51) between predicted and observed standardized all-Ethiopian Kiremt rainfall anomalies for the LOOCV (RV) verification approach for models developed from March-May SSTs (for improvements of 27%-65%). For regional northwestern Ethiopia prediction, Block and Rajagopalan (2007) found cross-validated correlation (RPSSClim) skill of +0.69 (+0.39), based on local polynomial regression models developed using March-May predictors. The current ensemble prediction technique has about 16%-44% gain in explained variance compared to the corresponding R² in these studies. In addition, the ensemble prediction provided longer lead times compared to the above studies (March versus May). However, Nicholson (2014) found a high correlation skill of +0.75 for LOOCV-based July-September rainfall predictions for the larger Horn of Africa region encompassing Ethiopia and Sudan. Although enlargement of the domain could have reduced the rainfall variability [e.g., negative 1996 rainfall anomaly for the larger Horn of Africa in Fig. 12 of Nicholson (2014) versus the locally wet all-Ethiopian Kiremt in Figs. 6 and 8] and affected the comparison of results, the current ensemble-based prediction still shows improved skill at longer lead times (March versus May) and finer spatial scales, with 8%-9% gain in explained variance compared to the corresponding R² of 56% (for LOOCV) and 62% (for RV) in Nicholson (2014).
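The percentage comparisons with Korecha and Barnston (2007) and Block and Rajagopalan (2007) can be checked with simple arithmetic. The pairing of correlations below is our reading of the text rather than something the paper states explicitly: the improvement in correlation is taken as a ratio, and the gain in explained variance as a difference of squared correlations expressed in percentage points.

```python
def pct_increase(r_new, r_old):
    """Relative increase in correlation skill, in percent."""
    return 100.0 * (r_new / r_old - 1.0)

def explained_variance_gain(r_new, r_old):
    """Difference in explained variance (r squared), in percentage points."""
    return 100.0 * (r_new ** 2 - r_old ** 2)

# All-Ethiopian predictions versus Korecha and Barnston (2007).
print(round(pct_increase(0.81, 0.64), 1))              # LOOCV correlations
print(round(pct_increase(0.84, 0.51), 1))              # RV correlations
print(round(explained_variance_gain(0.84, 0.51), 1))   # RV explained variance

# Northeastern Ethiopia (this study) versus northwestern Ethiopia (Block and Rajagopalan 2007).
print(round(explained_variance_gain(0.80, 0.69), 1))
```

Under this reading the results come to roughly 27% and 65% for the correlation improvements and about 16 and 45 percentage points for the explained-variance gains, close to the quoted ranges of 27%-65% and 16%-44%.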

The RV (using a model developed for 1970-89 for upcoming season prediction) and LOOCV (using a model developed for all years up to the current year for upcoming season prediction) approaches can be applied for real-time predictions of all-Ethiopian Kiremt rainfall anomalies a few days after the end of March, as the necessary initialization data (real-time reanalysis, SST) are available 5-10 days after the end of the month. The predictors selected in this study may not pose a problem in the near-term climate. However, the predictor pool should be reevaluated every few years to account for 1) predictors that remain skillful but have decreasing frequency, 2) predictors that are decreasing in skill, 3) predictors that were not included in the selected sets that are starting to show additional skill, and 4) changes in the likelihood functions due to climate change. Updating the RV and LOOCV prediction models using the entire dataset, with follow-up real-time forecast evaluation, could realistically lead to successful real-time operational use of the approaches. The simultaneous use of both approaches builds confidence in the value of the prediction if they yield similar forecasts. Moreover, to our knowledge, the methodology of building observational ensembles for statistical prediction is unique. The high-quality local and national prediction capability should have significant beneficial societal implications. In particular, the forecasting of seasonal anomalies at regional and national scales and monthly rainfall totals at specific localities with usable skill could play a key role in risk management to help minimize the damaging effects of recurring droughts in Ethiopia.

Acknowledgments. This research was supported by the NOAA Cooperative Institute for Mesoscale Meteorological Studies (CIMMS). The computations for this project were performed at the OU Supercomputing Center for Education and Research (OSCER) at The University of Oklahoma. The primary research data (1970-99) were provided by the National Meteorological Agency of Ethiopia in the early part of this research. Ethiopian seasonal rainfall station data for 2000-02 were kindly provided by Dr. A. G. Barnston. The comments and suggestions of three anonymous reviewers are appreciated. In particular, Reviewer 1's detailed analysis of the manuscript greatly improved the final product. The first three authors express their deep and lasting gratitude to the fourth author, the late Peter J. Lamb, who dedicated much of his life to investigation of climate variability in Africa. Without his support and expert assistance in preparing this manuscript, it would not have been possible.

Source: http://insurancenewsnet.com/

Friday, May 8, 2015
