Generalised block bootstrap and its use in meteorology

,


Introduction
In the last decades the bootstrap methodology has become more and more widespread in different areas of statistical applications. See e.g. Chernick (2011) for a review of possible areas from spatial models to financial data and data mining, where bootstrap may be used. In this paper we focus on the effect of the serial dependence, naturally arising in many time series data. The bootstrap samples must match the dependence within the data, so the block bootstrap is the suggested method for bootstrapping time series. Hall et al. (1995) investigate this approach in some detail, including suggestions for selecting the optimal block size. In an earlier paper Rakonczai et al. (2014), we investigated the possibilities for using the block bootstrap methods for checking the validity of the copula models. In this paper we present an improvement to the classical block bootstrap methodology, which is especially relevant in our applications.
In Section 2, we first briefly review the elements of stationary time series. In the bivariate case, the vector autoregression (VAR) process is one of the most important models, becoming popular first in the area of econometrics (Sims (1980)). For recent applications of VAR models in meteorology, see for example Hill et al. (2014), Norrulashikin et al. (2015) or Farook and Kannan (2016). We briefly present the main properties of VAR models, which are used in the sequel and present the notations.
In Section 3, we introduce the concept of copulas, the most convenient objects for analysing the dependence structures among variables. Their history goes as back as Hoeffding (1940), but their applications are much more recent. However, they have spread very quickly to the most important areas -for a recent analysis in meteorology, see Cong and Brady (2012). Most of these works use different parametric copula models, but we are more interested in testing for possible changes in the dependency structure of our temperature data, so also introduce the most recent approaches in testing homogeneity of such models, which are based on the empirical copula process.
Section 4 is devoted to the bootstrap resampling method, including the block bootstrap approach, which is suitable for the case of serial dependent observations. Here we introduce a generalisation, which helps overcoming the problem that originally the block size was supposed to be a natural number. In our approach the block size is a random variable, but it contains the original block bootstrap as a special case. Due to this small variance in the sample size, it overcomes the problem of extensive random error in case of the so-called stationary bootstrap (Politis and Romano (1994)).
In Section 5 we apply our approach to the gridded temperature data base of E-OBS, which is a product of the EU-FP6 project ENSEMBLES (Haylock et al. (2008)). Here we use the daily mean temperature data from the 0.5 grade-grid. Our focus is on checking for possible changes in the dependence pattern between the grid point of Budapest and some other grid points within the Carpathian Basin. We show that in some cases there is a significant deviation from homogeneity of the first and second part of the data. The conclusion summarizes our findings and gives some interesting open questions.

Vector autoregression (VAR) processes
We call the d-dimensional series {X t } t∈Z = (X 0 , X ±1 , X ±2 , . . .) a time series if its elements are d-dimensional random vectors, which are usually not independent from each other. Here we consider t as the time. Let us assume that the random variables have finite second moments. The time series {X t } t∈Z is weakly stationary (later we just call it stationary), if for all t, if neither the mean function E(X t ) nor the covariance matrix Cov(X t+s , X t ) depends on t for all s ∈ Z. Stationarity is an important property, it means that the time series is translation invariant.
One of the most frequently applied time series models are the so-called autoregressive (AR) processes and their multidimensional counterparts, the vector-autoregressive (VAR) models. In the following, we define the VAR(p) process and give its main properties in two dimensions as this is necessary to our applications.
The time series {X t } t∈Z = {(X 1,t , X 2,t ) T } t∈Z is called a zero-mean two-dimensional VAR(p) process if where A 1 , . . . , A p are 2×2 parameter matrices and the {ε t } t∈Z independent innovation process is a two-dimensional white noise with E(ε t ) = 0 = (0, 0) T and Cov(ε t ) = C symmetric positive definite covariance matrix. The VAR(p) process is stationary if the roots of the P (x) = det(I 2 − A 1 x − . . . − A p x p ) characteristic polynomial lie outside the unit circle. Any VAR(p) process can be rewritten as a VAR(1) process in the following way: This representation is more convenient in calculating the autocovariances. An equivalent condition for stationarity is that all the eigenvalues of the coefficent matrix A are smaller than one in modulus. In this case the time series is causal -it has an infinite moving average representation in the form For the remainder of this section we assume that X t is stationary. Let us denote with Γ X (h) = E(X 1+h X T 1 ) the autocovariance function of the process X t . Γ X (h) is a 2 × 2 matrix valued function, the symbols γ i,j (h) stand for its elements. We denote with It is easy to see that for 1 ≤ h ∈ Z, the autocovariances can be calculated by Γ Y (h) = A h Γ Y (0). The powers of the matrix A can easily be computed using the spectral decomposition. Lastly, we need the autocovariance matrix of the original process, and by the construction, it is the upper left 2 × 2 submatrix of Γ Y (h).
In the applications we will use the covariance matrix of the sample mean. The following asymptotic result will be crucial in our investigations: if where tr(·) denotes the trace of a matrix.
It is important to check if the chosen time series model is adequate. If the model fits well, the fitted residuals should behave as a realisation of a white noise process. The hard part is to check whether the residuals are independent, thus there is no serial dependence. There are several methods for verifying this problem, the most standard is the Ljung-Box test, which tests whether a specified group (usually the first 10-20 lags) of autocorrelations is different from zero. Another often applied serial correlation test is the Breusch-Godfrey test. A more recent multidimensional approach was published in the paper Kojadinovic and Yan (2011), where the test was based on the empirical copula process. The main ideas and the concept of this test stems from Genest and Rémillard (2004).
For further details about time series see for example Brockwell and Davis (2013) or Shumway and Stoffer (2011).

Copulas and their goodness-of-fit
Let X = (X 1 , . . . , X d ) T be a random vector with joint distribution function F X (x) = F X 1 ,...,X d (x 1 , . . . , x d ) and marginal distribution functions . Sklar's theorem claims that there exists a copula C, a distribution over the d-dimensional unit cube, with uniform margins, such that Moreover the copula C is unique if the marginal distribution functions are continuous. This construction allows the investigation of the dependence structure without specifying the marginal distributions. In the recent literature various families of copulas have been introduced, for an overview and examples see e.g. the introductory textbook of Nelsen (2007).
In this paper we focus on testing the homogeneity of copulas, motivated by the question whether the climate change has also an effect on the dependence between pairs of observations. If this change is indeed observable, then it may have a substantial effect on the spatial structure of temperature anomalies, worth for further meteorological investigations. So we do not have to go into the parametric inference, as we are just interested in the homogeneity analysis.
Let us suppose we have two independent samples of R d -valued vectors. The first sample is X 1 , . . . , X n and the second one is Y 1 , . . . , Y m . Formally we intend to test the hypothesis that the dependence structure of the two copulas has arisen from the same copula C 0 . The most obvious way for testing the homogeneity of two copulas is to consider multidimensional χ 2 approaches, but in this case we need to discretize the data, losing valuable information. In order to avoid its use, we can follow the approach of Rémillard and Scaillet (2009), who have developed a method for this problem. Their approach is based on the empirical copula, defined for the first sample as where u ∈ R d and U i denotes the d-dimensional vector of the rank based pseudo-observations: U i = U i,n = (U i1,n , . . . , U id,n ), where n refers to the size of first sample and U ij,n = n n+1 F j (X ij ). For illustrations see Figure 1. Similarly, based on the pseudo-observations V i of the second sample, we can define the empirical copula C 2,m (u).
The proposed tests for checking the homogeneity of two samples are based on functionals of the empirical process: where the asymptotic properties of the statistic can be based on the limit of the empirical copula processes. There are two different kind of approaches investigated in Genest et al. (2006): the Cramér-von Mises type statistic S n,m = 1 0 (κ n,m (t)) 2 dt, and the Kolmogorov-Smirnov type statistic T n,m = sup 0≤t≤1 |κ n,m (t)|. As the second approach is considered to be generally less powerful, we based our inference on the statistic K * = 1 is an appropriately fine division of the interval (0, 1). After some calculations, the Cramér-von Mises test statistic can be written in the following form (see Rémillard and Scaillet (2009)): where u ∨ v = max(u, v).
As the limit distribution of the above statistic is not distribution-free, a simulation algorithm is needed to get critical values. The algorithm is the following: 1. Generate an n-element bootstrap sample from the first of the observation vectors -see its details in the next section -, and compute the statistic S n,m , based on this and the original first sample; 2. Repeat the above steps as many times as needed to an accurate estimation of the p-values. The distribution function G of the bootstrap statistic can be considered as a good approximation of the test statistic under H 0 , so we used 1 − G(S n,m ) as the estimate of the p-value.

Bootstrap methods
The bootstrap is a usually computer-intensive, resampling method for estimating the distribution of a statistic of interest. The concept of the bootstrap was introduced in the classical article by Ben Efron (Efron (1979)) and since then, it has become one of the most widely used Monte Carlo methods in a number of aeras of applied sciences.

Bootstrap for i.i.d. data
Let X n = (X 1 , . . . , X n ) T be a sequence of independent, identically distributed (i.i.d.) random variables with unknown common univariate distribution F and let T n = t n (X n ; F ) be a statistic (like the sample mean X).
Our main purpose is to approximate the unknown distribution of T n or its function of interest, for example its standard deviation (the standard error) or some quantiles.
The basic bootstrap method (mostly referred as i.i.d. bootstrap) is the following. For a given X n , we draw a random sample X * m = {X * 1 , . . . , X * m } of size m (usually m = n) with replacement from X n . Therefore, the common distribution of the X * i 's is given by the empirical distributionF n = n −1 n i=1 δ X i , where δ z is the probability measure having unit mass at z. In the next step, we define the bootstrap version of the statistic T n : T * m,n = t m (X * m ;F n ). By repeating this procedure, we can approximate the unknown distribution G n by its bootstrap counterpart G * n . In most of the cases the distribution of G * n cannot be determined explicitely, but it can be approximated by simulation.

Block bootstrap methods
In our case we are interested in the effect of serial dependence on the homogeneity tests and on modelling in general, for example on the covariance matrix of our estimators. If the data are dependent then the estimates based on i.i.d. bootstrap methods my not be consistent.
In the presence of serial dependence, one of the most commonly used methods is the so-called block bootstrap, see Lahiri (2003) for details. In this paper, we generalise the circular block bootstrap (CBB) which can be defined as follows. First, we wrap the data X 1 , . . . , X n around a circle, i.e., define the series Y t = X t mod(n) (t ∈ N), where mod(n) denotes division "modulo n". This means that X k = Y k = Y k+n = Y 2k+n = . . . for all k ∈ {1, 2, . . . , n}. For some m, let i 1 , . . . , i m be a uniform sample from the set {1, 2, . . . , n}. Then, for a given block size b, we construct n = m · b (n ≈ n) pseudo-data: where j = 1, . . . , b and k = 1, . . . , m.
Finally, let us calculate the function of interest, for example the bootstrap sample mean as follows: Y * n = Y * 1 +...+Y * n n . Block length plays an important role in the process, and it is not trivial to determine its optimal value. Politis and White in their article Politis and White (2004) suggest an "automatic" block length selection algorithm (its correction was published in Patton et al. (2009)) -but the practical applications of this method are far from obvious as there are parameters in it, which have to be chosen.
We used a similar approach in our paper Rakonczai et al. (2014). Our idea was that we tried to find the best block size by fitting a model and then checking the variance of X with the help of the block bootstrap. The block size was determined as the b, for which the estimated trace of the covariance matrix was the nearest to the one got by the fitted VAR model: where Cov * (X In the literature, simulations are naturally based on integer block sizes. But using the block length of (4), the estimated trace of covariance may be not be close enough to the theoretical trace of covariance. The same is true for other methods for block size determination. This may cause substantial bias, as in our case the relative difference between subsequent values of tr(Cov * (X * b )) can be quite large, especially for small b. This can be overcome by the following generalisation of the block bootstrap methodology.
In case of b > 1, b ∈ R, let the generalised block bootstrap sample be defined as follows. Let k be a random integer between 1 and the sample size n, and again, let us wrap the sample around the circle. The bootstrap blocks are either of length b or b : denotes the upper, and b the lower integer part of b. At last, we put the blocks together. This procedure ensures that for integer-valued b the new definition coincides with the traditional one, so this is indeed a generalisation. In the applications (Section 5) we show the clear advantages of this approach. Actually all of the relevant algorithms for finding the optimal block size can easily be adapted to find a solution in this generalised sense. In our case, instead of (4), we simply solve the equation In the same way as the circular block bootstrap sample, our generalized bootstrap sample is not a stationary process, conditional on the original sample. The block bootstrap sample is in only one case conditionally stationary: when the block lengths follow a geometric distribution, independent from each other (Politis and Romano (1994)).
The covariance matrix Cov * (X * b ) can be explicitly calculated. Let us denote with L 1 , L 2 , . . . the block sizes -they are random variables independent from each other with common conditional distribution 1 P * where J i |X n follows a Bernoulli distribution with parameter p = b − b . Let N be the random variable, which gives the number of blocks with block size b . If we have N , we can calculate the number of blocks with block size b , we denote it with g(N ). So g(N ) = n−N · b b . Let us denote the remainder block size with r(N ), we can calculate it from the others: It can be seen that the conditional covariance matrix of the bootstrap mean can be calculated the following way: At last, we have to mention that the Politis and White algorithm actually gives a real and not an integer as the optimal block size -this could be used without any rounding by our proposed method. The main problem is that the algorithm gives an extremely large block length, making meteorologically no sense.

Applications
The used observations are the 63 years of daily temperature data of the European Climate Assessment (E-OBS, http://www.ecad.eu). The methodology of deriving the data for the grid points has been published in Haylock et al. (2008), where this database has been used extensively for climate analysis. We have worked with the part of the 0.5-grade grid -available for whole Europe and northern Africa -which lies in the Carpathian Basin. Figure 2 depicts the used grid points. For later reference, we chose the grid point next to Budapest, one grid point in the neightborhood of the Hungarian capital and four further grid points lying far from Budapest, in different directions. The quality of the data has been evaluated e.g. in Hofstra et al. (2009), and it turned out to be reliable for most of Central Europe. As we have used the grid points, belonging to the Carpathian Basin, this validates our results.
As we intend to use models, suitable for stationary data, first the stationarity had to be ensured. We have first subtracted the smoothed daily averages from the observations. The smoothing was made by loess regression, Figure  Figure 2: The map of the Carpathian Basin with the used grid points.
3a.) and Figure 3b.) depict the daily averages and standard deviations of the 63 years' data and the smoothing regression line for grid point Budapest. It turned out that the second-order stationarity is still far from being true (in winter the variances were substantially larger than in summer), so we have divided the observations with the smoothed estimated standard deviation for the given day:x t,n = x t,n − m t s t wherex t,n is the standardized value for day t in year n, based on the original observation x t,n for the same day and the smoothed average m t and smoothed standard deviation s t . Figure 3c.) shows the original daily observations and the standardized data between January 1, 2010 and December 31, 2012 for grid point Budapest.
In order to reduce the effect of the outliers of our results, we have finally computed the ten-days averages of thex values. There is a slight but significant upward linear trend in the data, but we did not remove it, as one of our main aims was to detect the changes in the dependencies of the investigated sites -and these should be based on the original (standardised) deviations, as constructed above.
In the next step, we examined the fixed grid point Budapest paired with other grid points of the database. Using the Akaike information criterion, Daily observations Standardized data Loess regression Figure 3: a.) The daily averages (annual cycle) and the smoothing loess regression; the daily standard deviations (annual cycle) and the smoothing loess regression; c.) the original and the standardized data for grid point Budapest.
we chose the lags of the most appropriate among vector autoregressions to model our data pairs. Despite the adjusted R 2 values being rather low (around 10%), the Ljung-Box Q-test, the Breusch-Godfrey test and the test of Kojadinovic and Yan (2011) could not detect the presence of further serial dependence that has not been included in the VAR model. Table 1 contains these results. Our main goal is to detect if there is a significant change in the dependence structure of the data. We separated the pairs of points into two partsthe first part corresponds to the first 31.5 years' observations and the second part to the second 31.5 years' observations. For five selected pairs of grid points, we wanted to test the null hypothesis if the copula of first half of the sample is equal to the copula of the second half of the sample. Table 2 and Figure 4 depict the optimal block lengths obtained from solving equation (5). The second column of Table 2 and the red line of Figure 5 show the trace of the covariance matrices of the mean, calculated from the fitted VAR(1) models, multiplicated by 1186 -the half of the original sample size. We can see on the left figure that the trace of the covariance matrix of the mean is not monotone in the neighborhood of the optimum, so we have to be cautious. The trace function (black line in Figure 5) is always continuous, but not necessarily differentiable, resulting from the construction of our generalized block bootstrap method. Generally, we noticed, that as the block sizes tend to be smaller, the trace function is nearer to be monotonic. This phenomenon can be explained by the expansion of Cov * (X * b ) in formula (6): if the block size is relatively small compared to the sample size, then the first and second terms are much more dominant over the third part -which contains the effect of the remainder block size. We got pretty small, 6.51 optimal block size for the pairs Budapest & Zaránd Mountains and 20.43 for Budapest & Sopron. In case of the five selected pairs, as many as 7 iterations were always enough to solve equation (5). We have to mention that there exist some pairs of grid points -especially at the southern part of the Carpathian Basin -, for which equation (5) is not solvable.
The last step was conducting the copula homogeneity test described in Section 3. Using the optimal block size, we generated boostrap samples via the generalized block bootstrap method for the first half of the sample. With the empirical copulas of these boostrap samples and the empirical copula of the second part of the original sample, we can calculate the test statistic S n,m . In our case n = m = 1186. In order to get accurate p-values, we used 2000 repetitions. Table 3 contains the results of the homogeneity test.

Budapest & Apatovac
Block size Trace of the covariance matrix of the mean Generalized block bootstrap Estimated VAR(1) model Figure 4: The trace of the covariance matrix of the mean for two selected grid points (first half of the sample).
The dependence structure proved to be different in the first 31.5 years at the two pairs Budapest & Apatovac and Budapest & Zaránd Mountains. Figure  5 depicts the standarized observations and their copula of the pair Budapest & Apatovac. We can see that preudo-observations are apparently somewhat different, and the test also detected deviance between the two copulas.

Conclusions
We can summarize our results as follows.
There are two major points to be mentioned. First, we proposed a sim-  ple generalisation of the block bootstrap methodology, which fits naturally to the existing algorithms, and which helps to overcome the problem of discreteness in the usual block size. Second, we have found some significant changes in the dependence structure between the standardised temperature values of pairs of stations within the Carpathian Basin. The direction of this change may be worth for investigating, as this may lead to a better understanding of the processes of our climate. The proposed generalized block bootstrap method can easily be applied to any other problem, where the block size plays an important role, as all block length determining algorithms give a real number as estimated block size.
It is an interesting question, to which models the proposed block size determining method can be successfully applied. We have checked by simulation that the method is reasonably stable for the VAR models, presented in the paper. For nonlinear time series we might need more observations to fit, which is similarly reliable as the one, presented in our paper.