Comparison of hidden and observed regime-switching autoregressive models for (u,v)-components of wind fields in the Northeast Atlantic

Several multisite stochastic generators of zonal and meridional components of wind are proposed in this paper. A regime-switching framework is introduced to account for the alternation of intensity and variability that is observed in wind conditions due to the existence of different weather types. This modeling blocks time series into periods in which the series is described by a single model. The regime-switching is modeled by a discrete variable that can be introduced as a latent (or hidden) variable or as an observed variable. In the latter case a clustering algorithm is used before fitting the model to extract the regime. Conditionally on the regimes, the observed wind conditions are assumed to evolve as a linear Gaussian vector autoregressive (VAR) model. Various questions are explored, such as the modeling of the regime in a multisite context, the extraction of relevant clusterings from extra-variables or from the local wind data, and the link between weather types extracted from wind data and large-scale weather regimes derived from a descriptor of the atmospheric circulation. We also discuss the relative advantages of hidden and observed regime-switching models. For artificial stochastic generation of wind sequences, we show that the proposed models reproduce the average space-time motions of wind conditions, and we highlight the advantage of regime-switching models in reproducing the alternation of intensity and variability in wind conditions.


Introduction
In this section, we present the context of our work and then the data used to compare the proposed Markov-switching autoregressive models.

Introduction
Stochastic weather generators have been used to generate artificial sequences of small-scale meteorological data with statistical properties similar to the dataset used for calibration. Various wind condition generators at a single site have been proposed in the literature; see (Brown, Katz, and Murphy, 1984; Flecher, Naveau, Allard, and Brisson, 2010; Ailliot and Monbet, 2012). However, few models have been introduced in a multisite context (Haslett and Raftery, 1989; Bessac, Ailliot, and Monbet, 2015). Artificial sequences of wind conditions provided by stochastic weather generators enable risk assessment in impact studies; see, for instance, (Hofmann and Sperstad, 2013). Here we propose a multisite generator for Cartesian components of surface wind. As far as we know, only a few models have been proposed to simulate time series of Cartesian coordinates of wind $\{u_t, v_t\}$ (Hering, Kazor, and Kleiber, 2015; Hering and Genton, 2010; Ailliot, Monbet, and Prevosto, 2006; Wikle, Milliff, Nychka, and Berliner, 2001; Fuentes, Chen, Davis, and Lackmann, 2005). Except in (Hering, Kazor, and Kleiber, 2015), these models are designed for short-term wind prediction and not for the generation of artificial conditions of $\{u_t, v_t\}$. Consequently, they are not focused on reproducing the statistics we are interested in, namely, the marginal distribution of $\{u_t, v_t\}$ and its spatiotemporal dynamics. In (Hering, Kazor, and Kleiber, 2015) a stochastic generator for multiple temporal and spatial scales is proposed. The proposed Markov-switching vector autoregressive model enables reproduction of many spatial and temporal features; however, complex dependencies between intensity and direction remain hard to model.
In the Northeast Atlantic, the spatiotemporal dynamics of the wind field is complex. This area is under the influence of an unstable atmospheric jet stream, whose large-scale fluctuations induce local alternations between periods with high wind intensity and strong temporal variability, and less intense and variable periods. Scientists have proposed describing the North-Atlantic atmospheric dynamics through a finite number of preferred states, namely, weather regimes or weather types (Vautard, 1990). However, introducing regime-switching in the modeling of local wind, as we propose in this paper, enables us to better reproduce the spatiotemporal characteristics observed in the wind data. In practice, describing a time series by regimes involves a partitioning into time periods in which the series is homogeneous and can be described by a single model. In this paper, we propose various vector autoregressive (VAR) models with regime-switching. One of the challenges is to achieve a regime-switching that is physically consistent and that enables appropriately describing the local observation by a VAR model. To this end, we introduce several frameworks of regime-switching and compare them in terms of simulation of wind data.
Depending on the availability of good descriptors of the current weather state, regime-switching can be introduced with either observed or latent regimes. Regimes are said to be observed when they are identified a priori, before the modeling of the local dynamics. In this case, clustering methods are run on adequate variables to obtain relevant regimes: either the local variables or extra-variables characterizing the large-scale weather situation, such as descriptors of the large-scale atmospheric circulation (Bardossy and Plate, 1992; Wilson, Lettenmaier, and Skyllingstad, 1992) or variables enabling the separation into dry and wet states (Richardson, 1981; Flecher, Naveau, Allard, and Brisson, 2010).
For wind models, the wind direction can be considered since it is a good descriptor of synoptic conditions. In (Gneiting, Larson, Westrick, Genton, and Aldrich, 2006), the wind direction is used both to extract regimes and to parameterize the predictive distribution. In this paper, we propose a priori clusterings based on both large-scale and local variables.
When the regimes are said to be latent, they are introduced as a hidden variable in the model. This framework is more complex from a statistical point of view, and the conditional distribution of wind given the regime has to be simple and tractable. Hidden Markov models (HMMs) have been widely used for meteorological data (Zucchini and Guttorp, 1991; Hughes, Guttorp, and Charles, 1999; Thompson, Thomson, and Zheng, 2007). Hidden Markov-switching autoregressive (MS-AR) models are a generalization of HMMs allowing temporal dynamics within the regimes (Hamilton, 1989). Models with regime-switching improve on the modeling of wind intensity time series by classical autoregressive-moving-average (ARMA) models; see (Ailliot and Monbet, 2012), where the wind speed is modeled at one site. Here we propose a hidden MS-AR model and compare it with several models with observed regime-switching.
To the best of our knowledge, no comparison between observed and latent regime-switching has been proposed in the field of stochastic generators of wind conditions. In (Pinson, Christensen, Madsen, Sorensen, Donovan, and Jensen, 2008), a comparison is presented in terms of wind prediction between models with hidden regimes and models driven by observed regimes. In this work, we compare both kinds of models in a simulation framework.
In the multisite context, the regime can be either common to all sites (i.e., scalar; see (Ailliot, Thompson, and Thomson, 2009)) or introduced as a site-specific regime (Wilks, 1998; Kleiber, Katz, and Rajagopalan, 2012; Khalili, Leconte, and Brissette, 2007; Thompson, Thomson, and Zheng, 2007), which enables one to account for a wide range of space-time dependencies. However, a site-specific regime appears to be computationally challenging (Wilks, 1998). We will show that the choice of a regional regime is reasonable when a homogeneous area is selected.
The paper is organized as follows. MS-AR models are introduced in Section 2, and their inference is described in cases of both observed and latent regime-switching. The question of a regional regime is addressed in Section 3. In Section 4, we introduce and discuss different sets of a priori regimes obtained by clustering. In Sections 6 and 7, respectively, we discuss the advantages of the proposed models and highlight the differences between observed and latent regime-switching models.

Wind data
The data under study are zonal (west-east) and meridional (north-south) surface wind components $\{u_t, v_t\}$ at 10 meters above sea level extracted from the ERA-Interim dataset produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). The dataset can be freely downloaded from the URL http://data.ecmwf.int/data/ and used for scientific purposes.
We focus on gridded locations between latitudes 46.5°N and 48°N and longitudes 6.75°W and 10.5°W (15×7 grid points; see Figure 1). The dataset we have extracted consists of 32 December-January blocks of wind data from December 1979 to January 2011 sampled every 6 hours. Further, the statistical inference is based on the assumption that the 32 December-January blocks of wind components are 32 independent realizations of the same stationary process, a reasonable assumption given the strong interannual variability of the wintertime atmospheric dynamics at such a local scale. The training dataset is then composed of 32 independent blocks, and each block has 4 × 62 observations. In order to study the relevance of using common regimes for all the locations, a spatial hierarchical clustering has been used to choose a homogeneous area (see Figure 1). The clustering is run on the process of moving standard deviation of wind speed, which is described more precisely in Section 6. This process is a good descriptor of the temporal characteristics of wind time series (see Figure 4), and it is computed as the standard deviation of wind speed over nine consecutive time steps (i.e., two days). The dendrogram associated with the clustering suggests the use of four clusters.
The wind components are power-transformed with an exponent $\alpha$, where $\{U_t\}$ and $\{\Phi_t\}$ respectively denote wind speed and wind direction. In practice, $\alpha$ is chosen empirically equal to 1.5. This transformation has proven helpful in modeling the distribution of the wind components (Ailliot, Bessac, Monbet, and Pene, 2015).
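As an illustration of the moving standard deviation used for the spatial clustering, the following minimal sketch computes the moving mean and moving standard deviation of wind speed over nine consecutive 6-hourly time steps. The function name `moving_mean_std`, the use of the unbiased estimator (`ddof=1`), and the synthetic input series are assumptions made for the example; they are not taken from the original study.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def moving_mean_std(speed, window=9):
    """Moving mean and standard deviation over sliding windows of a 1-D series.

    Returns two arrays of length len(speed) - window + 1 (no padding).
    """
    speed = np.asarray(speed, dtype=float)
    windows = sliding_window_view(speed, window)   # shape (T - window + 1, window)
    # ddof=1 (unbiased estimator) is an assumption of this sketch.
    return windows.mean(axis=1), windows.std(axis=1, ddof=1)

# Synthetic stand-in for one December-January block (4 x 62 = 248 time steps).
rng = np.random.default_rng(0)
u = rng.normal(size=248)
v = rng.normal(size=248)
speed = np.hypot(u, v)                             # wind speed from (u, v)
mov_mean, mov_std = moving_mean_std(speed)
```

Each of the 32 blocks would be processed separately so that no window overlaps two different winters.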
2 Markov-switching vector autoregressive models
In this section, we introduce the proposed models and discuss their parameter estimation in cases of both observed and latent regimes.

The models
In this paper, we consider the following class of models. Let $S_t$ be a discrete Markov chain with values in $\{1, \dots, M\}$ describing the current weather type as a function of time $t$. Conditionally on the weather type, the observed wind conditions are modeled as a vector autoregressive model. Given the current value of $S_t$, the observation $Y_t$ is written as

$$Y_t = A_0^{(S_t)} + A_1^{(S_t)} Y_{t-1} + \dots + A_p^{(S_t)} Y_{t-p} + \left(\Sigma^{(S_t)}\right)^{1/2} \epsilon_t,$$

where $Y_t \in \mathbb{R}^{2K}$ represents the observed power-transformed wind components $\{u_t, v_t\}$ at the $K$ locations, $A_0^{(i)} \in \mathbb{R}^{2K}$ is the intercept, $A_1^{(i)}, \dots, A_p^{(i)}$ and $\Sigma^{(i)}$ are $2K \times 2K$ matrices, and $\epsilon$ is a Gaussian white noise of dimension $2K$. Conditional independencies between $S$ and $Y$ are described by a directed acyclic graph (DAG), which for $p = 1$ has the edges $S_{t-1} \to S_t$, $S_t \to Y_t$, and $Y_{t-1} \to Y_t$ (see (Durand, 2003) for additional information about DAGs). In this model, the regime $S$ can be latent or observed; both cases are discussed, respectively, in Sections 3 and 4. The parameter estimation of the model can be performed by maximum likelihood, but in a different way in each framework.
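To make the generative mechanism concrete, the following minimal sketch simulates a regime sequence and the associated observations from a Markov-switching VAR($p$) model. The function `simulate_ms_var`, the zero initial conditions, the use of a Cholesky factor as the square root of $\Sigma^{(i)}$, and all numerical values are assumptions chosen for illustration; they do not correspond to fitted parameters.

```python
import numpy as np

def simulate_ms_var(Pi, A0, A, Sigma, T, rng):
    """Simulate (S_t, Y_t) from a Markov-switching VAR(p).

    Pi    : (M, M) transition matrix, rows sum to 1
    A0    : (M, d) regime-specific intercepts
    A     : (M, p, d, d) regime-specific autoregressive matrices
    Sigma : (M, d, d) regime-specific innovation covariances
    """
    M, p, d, _ = A.shape
    chol = np.linalg.cholesky(Sigma)    # one square-root factor of Sigma per regime
    s = np.zeros(T + p, dtype=int)      # regime 0 is used as the initial state
    y = np.zeros((T + p, d))            # first p rows are zero initial conditions
    for t in range(p, T + p):
        s[t] = rng.choice(M, p=Pi[s[t - 1]])
        mean = A0[s[t]].copy()
        for k in range(p):
            mean += A[s[t], k] @ y[t - 1 - k]
        y[t] = mean + chol[s[t]] @ rng.standard_normal(d)
    return s[p:], y[p:]

# Toy example with M = 2 regimes, d = 2 components, and order p = 2.
rng = np.random.default_rng(1)
Pi = np.array([[0.95, 0.05], [0.10, 0.90]])
A0 = np.zeros((2, 2))
A = np.zeros((2, 2, 2, 2))
A[0, 0] = 0.8 * np.eye(2)               # calm regime: strong persistence
A[1, 0] = 0.5 * np.eye(2)               # volatile regime: weaker persistence
Sigma = np.stack([0.1 * np.eye(2), 1.0 * np.eye(2)])
s, y = simulate_ms_var(Pi, A0, A, Sigma, T=500, rng=rng)
```

In the setting of the paper, $d = 2K$ and the parameters would be replaced by their estimates.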
For both kinds of models, covariates can be included. The easiest way is to include them in the intercept parameter $A_0$ or in transitions between regimes. Transitions between regimes can be parametrized with a covariate (when regimes are latent, a parameterization with an extra covariate is given in (Hughes and Guttorp, 1994) and with the studied variable in (Ailliot, Bessac, Monbet, and Pene, 2015), and in (Vrac, Stein, and Hayhoe, 2007) when regimes are defined a priori). In the context of multisite models, the choice of the covariate of non-homogeneous transitions is delicate.
We do not discuss this topic here and consider only homogeneous transition models.
To avoid overparameterization of the conditional models, we first work with a reduced dataset.

In the following, all the proposed models are fitted on the subset of sites (1, 6, 10, 13, 18); the extension to a wider region is left for future studies.

Estimation by maximum likelihood
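First, let us suppose that the complete set of observations $(y_1, \dots, y_T, s_1, \dots, s_T)$ is available, which is the case in Section 4. Assume that $s_0$, $y_{-1}$ and $y_0$ are observed. Then the complete log-likelihood, associated with an autoregressive order $p = 2$ (we choose $p = 2$ according to a previous work (Ailliot, Bessac, Monbet, and Pene, 2015)), is written as

$$\log(L(\theta; y_1^T, s_1^T)) = \log(L(\theta^{(Y)}; y_1^T \mid s_1^T)) + \log(L(\theta^{(S)}; s_1^T)), \qquad (3)$$

with

$$\log(L(\theta^{(S)}; s_1^T)) = \sum_{i,j=1}^{M} n_{i,j} \log \pi_{i,j},$$

where $\theta = (\theta^{(Y)}, \theta^{(S)})$, $\theta^{(Y)}$ gathers the parameters of the conditional VAR models, $\theta^{(S)}$ is the transition matrix $\Pi = (\pi_{i,j})$ of the Markov chain $S$, $y_1^T = (y_1, \dots, y_T)$, and $s_1^T = (s_1, \dots, s_T)$. Let us denote by $n_{i,j}$ the number of occurrences of the event $\{S_{t-1} = i, S_t = j\}$,

$$n_{i,j} = \sum_{t=1}^{T} \delta_{\{s_{t-1} = i\}} \, \delta_{\{s_t = j\}},$$

where $\delta$ is the Kronecker symbol, and by $n_i$ the total number of occurrences of the regime $i$: $n_i = \sum_{t=1}^{T} \delta_{\{s_t = i\}}$. The optimal estimates of the autoregressive matrices and of the other parameters of $\theta^{(Y)}$ are computed in each regime $i$ from the observations assigned to that regime; they involve in particular $\hat{\mu}^{(i)} = \frac{1}{n_i} \sum_{t \in \{t \mid s_t = i\}} y_t$, the empirical mean of $Y$ in regime $i$.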

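Concerning the Markov chain $S$, the associated maximum likelihood estimator is

$$\hat{\pi}_{i,j} = \frac{n_{i,j}}{n_{i,\cdot}}, \qquad n_{i,\cdot} = \sum_{j=1}^{M} n_{i,j}.$$

When only observations of the process $Y$ are available and the realizations of $S$ are not given a priori, as in Section 3, one inference method is to use the expectation-maximization (EM) algorithm, which is commonly run to estimate the parameters of models with latent variables by maximum likelihood. Since $S$ is not observed, the EM algorithm aims at maximizing the incomplete log-likelihood function based on the observations $Y$:

$$\log(L(\theta; y_1, \dots, y_T)) = \log \sum_{(s_1, \dots, s_T) \in \{1, \dots, M\}^T} L(\theta; y_1, \dots, y_T, s_1, \dots, s_T).$$

The likelihood does not decrease through the iterations of the algorithm, and under regularity conditions the resulting sequence converges to a stationary point of the likelihood, which provides an approximation of the maximum likelihood estimator of $\theta$.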
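For the observed-regime case, the estimation can be sketched as follows. The transition matrix is obtained from normalized transition counts, and the VAR parameters are obtained in each regime by ordinary least squares on the observations assigned to that regime, which is the standard conditional maximum likelihood solution for a Gaussian VAR. This is an assumption of the sketch and may differ in form from the closed-form expressions used by the authors; the function names and the synthetic data are hypothetical.

```python
import numpy as np

def transition_mle(s, M):
    """Row-normalized transition counts n_ij / n_i. for a regime sequence s."""
    counts = np.zeros((M, M))
    np.add.at(counts, (s[:-1], s[1:]), 1)
    # Assumes every regime is left at least once (no empty row).
    return counts / counts.sum(axis=1, keepdims=True)

def var_mle_by_regime(y, s, M, p=2):
    """Least-squares fit of Y_t on (1, Y_{t-1}, ..., Y_{t-p}) within each regime."""
    T, d = y.shape
    # Rows [1, y_{t-1}, ..., y_{t-p}] for t = p, ..., T-1.
    X = np.hstack([np.ones((T - p, 1))] + [y[p - k:T - k] for k in range(1, p + 1)])
    target, regime = y[p:], s[p:]
    params = {}
    for i in range(M):
        Xi, Yi = X[regime == i], target[regime == i]
        B, *_ = np.linalg.lstsq(Xi, Yi, rcond=None)     # shape (1 + p*d, d)
        resid = Yi - Xi @ B
        params[i] = {
            "A0": B[0],
            "A": [B[1 + k * d:1 + (k + 1) * d].T for k in range(p)],
            "Sigma": resid.T @ resid / len(Yi),
        }
    return params

# Synthetic check: same AR coefficient in both regimes, different noise levels.
rng = np.random.default_rng(2)
T, d, M = 4000, 2, 2
s = (rng.random(T) < 0.3).astype(int)   # i.i.d. regimes, enough for a check
y = np.zeros((T, d))
scale = np.array([0.2, 1.0])
for t in range(2, T):
    y[t] = 0.6 * y[t - 1] + scale[s[t]] * rng.standard_normal(d)
Pi_hat = transition_mle(s, M)
fit = var_mle_by_regime(y, s, M)
```

With 32 independent blocks, the design matrix and the transition counts would be built block by block and then stacked, so that no regression row or transition spans two winters.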
The EM algorithm cycles through two steps: the expectation step (E-step) and the maximization step (M-step) (Dempster, Laird, and Rubin, 1977; Wu, 1983). The E-step is performed through forward-backward recursions (see (Hamilton, 1990) for hidden MS-AR models) that enable one to compute the smoothing probabilities $P(S_t = i \mid y_1, \dots, y_T; \theta)$. At the M-step, the optimal expressions of the parameters of $\theta^{(Y)}$, given in (4), (5), and (6), are used. In each regime $i$, however, each observation $y_t$ is weighted by the smoothing probability $P(S_t = i \mid y_1, \dots, y_T; \theta)$.

The transition matrix is estimated from the quantities $P(S_{t-1} = i, S_t = j \mid y_1, \dots, y_T; \theta)$, which are derived at the E-step.
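The E-step can be sketched with scaled forward-backward recursions. The sketch below assumes that the conditional densities $p(y_t \mid y_{t-1}, \dots, y_{t-p}, S_t = i)$ have already been evaluated and stored in an array `dens`; the function name, the argument `init` (distribution of the first regime), and the random inputs are hypothetical and serve only to illustrate the recursions.

```python
import numpy as np

def forward_backward(Pi, init, dens):
    """Scaled forward-backward recursions for a Markov-switching model.

    Pi   : (M, M) transition matrix
    init : (M,) distribution of the first regime
    dens : (T, M) conditional densities p(y_t | past observations, S_t = i)
    Returns smoothing probabilities (T, M), pairwise smoothing probabilities
    (T-1, M, M), and the incomplete log-likelihood.
    """
    T, M = dens.shape
    alpha = np.zeros((T, M))
    scale = np.zeros(T)
    a = init * dens[0]
    scale[0] = a.sum()
    alpha[0] = a / scale[0]
    for t in range(1, T):                       # forward (filtering) pass
        a = (alpha[t - 1] @ Pi) * dens[t]
        scale[t] = a.sum()
        alpha[t] = a / scale[t]
    beta = np.ones((T, M))
    for t in range(T - 2, -1, -1):              # backward pass
        beta[t] = Pi @ (dens[t + 1] * beta[t + 1]) / scale[t + 1]
    gamma = alpha * beta                        # P(S_t = i | y_1, ..., y_T)
    xi = (alpha[:-1, :, None] * Pi[None] * (dens[1:] * beta[1:])[:, None, :]
          / scale[1:, None, None])              # P(S_t = i, S_{t+1} = j | y_1, ..., y_T)
    return gamma, xi, np.log(scale).sum()

rng = np.random.default_rng(3)
Pi = np.array([[0.9, 0.1], [0.2, 0.8]])
dens = rng.random((50, 2)) + 0.1                # arbitrary positive densities
gamma, xi, loglik = forward_backward(Pi, np.array([0.5, 0.5]), dens)
```

At the M-step, the rows of `gamma` would serve as observation weights in each regime, and the transition matrix would be updated as `xi.sum(axis=0)` normalized by its row sums.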
In this paper, we use AP-MS-VAR C to denote the a priori regime-switching model associated with the clustering C, and we use H-MS-VAR to denote the hidden regime-switching model.

Regime definition in a multisite context
When the current weather state is not estimated a priori, it is introduced as a latent variable. Hidden MS-AR models have been used for $\{u_t, v_t\}$ at a single site (Ailliot, Bessac, Monbet, and Pene, 2015); here we propose an extension of this model when the process $\{u_t, v_t\}$ is multisite. In a multisite context, the regime can be site-specific or common to all stations.
Here, the assumption of a common regional regime is investigated, and we show that this assumption is acceptable when the considered area is homogeneous. The homogeneous single-site MS-AR model introduced in (Ailliot, Bessac, Monbet, and Pene, 2015) for $\{u_t, v_t\}$ with $M = 3$ regimes and an autoregressive order $p = 2$ has been fitted at each site. The most likely regimes associated with the data are extracted from the estimation procedure of H-MS-VAR models described in the previous section. At each time, the regime corresponds to $\arg\max_{i \in \{1, \dots, M\}} P(S_t = i \mid y_1, \dots, y_T)$; see (Zucchini and MacDonald, 2009). In order to properly compare the regimes, they are ordered according to the increasing value of the determinant of the matrix $\Sigma^{(i)}$. The intuition for sorting regimes according to the determinant of $\Sigma^{(i)}$ is that we expect the innovations to be more volatile, and consequently $\Sigma^{(i)}$ to have greater eigenvalues, in cyclonic weather regimes. Conversely, we expect to observe innovations that are more persistent in time in calm weather regimes; this is associated with smaller eigenvalues of $\Sigma^{(i)}$. The spatiotemporal coherence of the regimes of each of the 18 sites is checked and reveals a strong homogeneity that motivates using a regional regime in this area.
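The regime decoding and ordering described above can be sketched as follows. The function `order_regimes` and the toy inputs are hypothetical; the sketch assumes that the smoothing probabilities and the innovation covariance matrices have already been estimated.

```python
import numpy as np

def order_regimes(Sigma, smooth_prob):
    """Relabel regimes by increasing det(Sigma) and decode the most likely regime.

    Sigma       : (M, d, d) innovation covariance matrices
    smooth_prob : (T, M) smoothing probabilities P(S_t = i | y_1, ..., y_T)
    """
    _, logdet = np.linalg.slogdet(Sigma)    # log-determinants, numerically safer
    order = np.argsort(logdet)              # order[k] = old label of new regime k
    decoded = smooth_prob[:, order].argmax(axis=1)
    return order, decoded

Sigma = np.stack([2.0 * np.eye(2), 0.5 * np.eye(2), 1.0 * np.eye(2)])
smooth_prob = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.8, 0.1]])
order, decoded = order_regimes(Sigma, smooth_prob)
```

Sorting on the log-determinant gives the same ordering as sorting on the determinant and avoids underflow when the dimension $2K$ is large.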

The sequences of regimes are compared in Figure 2, where time series of a posteriori regimes and wind speed are depicted. The last two regimes are less coherent from one site to another. This effect is partly explained by the fact that these regimes are less persistent in time, especially the third one (see Table 1). Moreover, we can notice an eastward propagation of wind events, the darkest regimes often being observed at western stations (station 1) prior to eastern sites (10 and 18). The bottom panel of Figure 2, which depicts the sequence of regimes associated with the model fitted on the set of all locations with a regime common to all locations, reveals that this regional regime is coherent with the local ones, although it is less persistent. Indeed, when fitting the model to several stations, the regime has to embed some spatial heterogeneity that is likely to decrease the temporal persistence.

In Figure 3, probabilities of occurrence of a given regime conditional on the simultaneous occurrence of the same regime at site 10 are depicted for all sites. In each picture, conditional probabilities should be compared with the reference value given at location 10, which is 1 by construction. The first regime has the best spatial coherence, and the third regime, which is the least persistent regime, is the least coherent spatially. The ranges of values of these probabilities indicate a satisfying consistency between the regimes across sites. At each site, the physical interpretation of each regime is similar.
Indeed, the first regime corresponds mainly to anticyclonic conditions with easterly winds and a slowly varying intensity (the variance of the innovation of the AR model is lower than in the two other regimes, and the first AR coefficient is larger; see Table 1). The two other regimes correspond to cyclonic conditions with westerly winds and a higher temporal variability in the intensity (see Figure 4). These two regimes are discriminated mainly by the temporal variability, which is higher in the third regime. Moreover, the wind direction, not depicted here, slightly differs: from southwesterlies in the second regime to northwesterlies in the third regime. In Figure 4, we can notice that wind conditions with weak temporal variability observed in the first regime are associated with weak values of the moving mean and variance processes, whereas more volatile periods in the second and third regimes are associated with higher values of these processes. To the best of our knowledge, few statistics enable us to characterize the alternation associated with regime-switching.
These two processes of moving mean and standard deviation make it possible to characterize the alternation of variability associated with the observed regime-switching and will be used in the following sections.
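The conditional probabilities of regime co-occurrence displayed in Figure 3 can be estimated with a short computation. In the sketch below, the function `cooccurrence_prob`, the array layout (one column of decoded regimes per site), and the toy data are assumptions made for illustration.

```python
import numpy as np

def cooccurrence_prob(regimes, ref, regime):
    """Estimate P(S_t at each site = regime | S_t at site `ref` = regime).

    regimes : (T, n_sites) integer array of decoded regimes, one column per site
    """
    at_ref = regimes[:, ref] == regime          # times when the reference site is in `regime`
    return (regimes[at_ref] == regime).mean(axis=0)

# Toy example with three sites; column 0 plays the role of the reference site.
regimes = np.array([[0, 0, 1],
                    [0, 0, 0],
                    [1, 1, 1],
                    [0, 1, 0],
                    [2, 2, 2],
                    [0, 0, 0]])
prob = cooccurrence_prob(regimes, ref=0, regime=0)
```

The entry of `prob` at the reference site equals 1 by construction, which matches the reference value mentioned for location 10.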
Coefficients of the autoregressive process $Y$ in each regime and the transition matrix at each site are comparable and spatially coherent (see Table 1). Other criteria such as the average field of $\{u_t, v_t\}$ in each regime and the distribution of $\{\Phi_t\}$ in each regime were also explored and suggest similarities between regimes at all locations.
The assumption of a regional regime seems appropriate in the considered area and is thus kept for the modeling of the multisite wind in the following.
A priori regimes can be derived from a clustering run either on the variables under study or on extra-variables: the former leads to weather states that are more appropriate to the local data, while the latter can provide more meteorologically consistent regimes, for example with more information about the large-scale situation. In this subsection, we propose three clusterings, which differ by the clustering method and/or by the variables used to derive the a priori regimes.

As a first clustering, we use a classification into four large-scale weather regimes that is commonly used in climate studies to characterize the wintertime atmospheric dynamics over the North Atlantic / European sector (Michelangeli, Vautard, and Legras, 1995; Cassou, 2008; Najac, 2008). These regimes are the positive and negative phases of the North Atlantic Oscillation (NAO+ and NAO−), the Atlantic Ridge (AR), and the blocking regime (BL).
To derive these regimes, we use the same methodology as in (Cattiaux, Douville, and Peings, 2013). We perform a k-means clustering on the 3,607 daily-mean maps of 500 mb geopotential height (Z500) anomalies (i.e., mean-corrected fields) over the North Atlantic / European sector (90°W-30°E / 20-80°N) corresponding to days of December, January, and February 1981-2010.
Daily Z500 data are downloaded from the ERA-Interim archive. In order to reduce the computational time, the k-means algorithm is performed on the first ten principal components (PCs) of the Z500 anomalies time series. These PCs are time series corresponding to the projections of the Z500 anomalies onto the empirical orthogonal functions (EOFs), which are eigenvectors of the spatial covariance matrix of the Z500 field. Such a decomposition enables extraction of the main modes of variability of the spatiotemporal process; here, the first ten EOFs explain 90% of the total variance.
Finally, the obtained daily classification is converted to a four-times-daily classification by repeating the same regime for the four time steps of each day, a reasonable approach given the smoothness of the Z500 both in time and space. In the following, we denote this clustering $C_{Z500}$.
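The chain of operations leading to $C_{Z500}$ (projection onto the leading EOFs, k-means on the principal components, conversion to a four-times-daily sequence) can be sketched as follows. This is a simplified illustration on synthetic data: the functions `pca_scores` and `kmeans` are hypothetical, the k-means is a plain Lloyd iteration with random initialization, and no latitude weighting of the grid points is applied, which an actual EOF analysis would typically include.

```python
import numpy as np

def pca_scores(fields, n_components):
    """Project mean-corrected fields (n_days, n_gridpoints) onto the leading EOFs."""
    anomalies = fields - fields.mean(axis=0)
    U, sv, Vt = np.linalg.svd(anomalies, full_matrices=False)
    explained = (sv**2 / np.sum(sv**2))[:n_components].sum()
    return U[:, :n_components] * sv[:n_components], explained

def kmeans(X, k, rng, n_iter=100):
    """Plain Lloyd iterations started from k randomly chosen points."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Keep the previous center if a cluster happens to be empty.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

rng = np.random.default_rng(4)
fields = rng.normal(size=(300, 50))              # synthetic stand-in for daily Z500 maps
scores, explained = pca_scores(fields, n_components=10)
daily_labels, _ = kmeans(scores, k=4, rng=rng)
six_hourly_labels = np.repeat(daily_labels, 4)   # same regime for the 4 steps of a day
```

In practice the k-means would be restarted from several initializations and the best partition retained, since Lloyd iterations only reach a local optimum.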

Derivation of observed regimes from the local variables: $C_{EOF(u,v)}$ and $C_{Diff(u,v)}$
To derive observed regimes from local wind variables, one can first use a k-means clustering procedure similar to the one used for $C_{Z500}$. However, while $C_{Z500}$ provides persistent regimes in which the conditional model satisfyingly describes $\{u_t, v_t\}$, local regimes resulting from such a k-means clustering are not persistent enough to reliably estimate the conditional VAR model. Consequently, in this subsection, we perform the local clustering via a hidden Markov model with Gaussian emission probabilities.
The hidden structure of the Markov chain provides more stable regimes than a k-means clustering. It corresponds to an H-MS-VAR model with VAR models of order $p = 0$. The EM algorithm is used to perform the clustering, and the number of regimes is set to three. This number provides the most physically relevant local regimes; a greater number of regimes indeed leads to less discriminative regimes in terms of local wind conditions (not shown).
Then two sets of descriptors of the data (i.e., local variables) are proposed. The first partition, denoted $C_{EOF(u,v)}$, is obtained by clustering the time series associated with the first two EOFs of the anomalies of $\{u_t, v_t\}$, which explain 94% of the total variance. The second partition involves descriptors of the conditional distribution $p(Y_t \mid Y_{t-1})$, in order to find a clustering that may be better adapted to the description of the conditional distribution by an autoregressive model. A simplified way to describe the dynamics is to consider a bivariate process of temporal increments of the wind components. This set of variables enables construction of regimes that discriminate well the temporal variability of the wind. We denote by $C_{Diff(u,v)}$ this second local clustering.

Analysis of the proposed clusterings
The proposed clusterings are compared through various analyses. We seek a clustering that is physically meaningful and appropriate in terms of conditional autoregressive models. For a proper comparison, for all clusterings, we decide to order regimes from the most persistent to the least persistent. This is done according to the determinant of the matrix $\Sigma^{(i)}$.

First visual comparison
Sequences of regimes from the proposed clusterings are shown in Figure 5. The top panel shows that $C_{Z500}$ has very persistent regimes. This result is expected because it describes the alternation between the preferred states of the large-scale atmospheric dynamics, whose typical time scale is a few days. One can see that the less volatile wind conditions are associated with the BL and AR phases, whereas the most variable wind conditions occur during the two NAO phases; see Figure 10. The three bottom panels correspond to local clusterings. For all of them, the first regime is associated with the less volatile conditions with weakest intensity, whereas the second and third regimes are associated with more variable conditions. One can also notice that the second regime often leads to the third one (which is confirmed by the transition probabilities between regimes) and that this second regime is most of the time associated with rises in wind speed intensity.
In Figure 6, the average fields corresponding to each regime of the four clusterings are plotted. The top row highlights the difficulty of discriminating local wind features when using regimes defined from a large-scale circulation variable. While the AR and NAO+ regimes of $C_{Z500}$ are associated with strong local wind signatures (as described in Subsection 4.1), the BL and NAO− regimes have a weaker discriminatory power on the local wind data. This issue was also observed in (Najac, 2008).
Since different descriptors are used, $C_{Diff(u,v)}$ and $C_{EOF(u,v)}$ lead to very different results.
$C_{EOF(u,v)}$ leads to the most physically consistent regimes: a northeasterly regime, a northwesterly one, and a southwesterly one, which are flows corresponding to several of the large-scale weather regimes. The last two regimes are associated with stronger intensities. From the derivation of this clustering, one naturally finds regimes that correspond to the main mean patterns of variability of the fields.
The regimes of $C_{Diff(u,v)}$ have less persistence, which complicates their meteorological interpretation. The first regime corresponds to periods of weak wind intensities. The last two regimes are southwesterly regimes with different intensity from one to the other.

The regimes R1 of $C_{EOF(u,v)}$, $C_{Diff(u,v)}$ and H-MS-VAR are close to one another; this is also seen in Table 3 and is in agreement with the average fields of these regimes displayed in Figure 6. The second axis opposes the regimes R2 of H-MS-VAR and $C_{Diff(u,v)}$ to the regimes R3, which is also an opposition between persistent and less persistent regimes. Most of these similarities between the regimes are also seen in Table 2 through the logarithm of the covariance of the innovations and the percentage of time spent in each regime. The regime AR from $C_{Z500}$ seems more difficult to associate with other regimes. The regime R3 from $C_{EOF(u,v)}$ is associated with the weather regime NAO+, which coincides with Table 3.

Quantitative analysis
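Quantitative criteria are considered in order to complete this analysis. The optimal value of the complete log-likelihood of the model is generally a good measure of the statistical relevance of a model. The complete log-likelihood, given in (3), evaluated at the maximum likelihood estimator $\hat{\theta}$ of $\theta$, is written in the case of observed regime-switching as the sum of the two following terms:

$$\log(L(\hat{\theta}^{(Y)}; y_1, \dots, y_T \mid s_1, \dots, s_T)) = -\frac{1}{2} \sum_{i=1}^{M} n_i \left( \log\det\hat{\Sigma}^{(i)} + 2K \, (1 + \log 2\pi) \right)$$

and

$$\log(L(\hat{\theta}^{(S)}; s_1, \dots, s_T)) = \sum_{i,j=1}^{M} n_{i,j} \log\frac{n_{i,j}}{n_{i,\cdot}},$$

where $n_{i,\cdot} = \sum_{j=1}^{M} n_{i,j}$.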
Note that the first term is a function of the total time spent in each regime and the associated determinant of the covariance matrix of innovation (notice that the one-step-ahead forecast error is linked to this quantity). The longer the time spent in a regime with a weak determinant of covariance of innovation, the greater the log-likelihood (see Table 2). The maximal log-likelihood of $\theta^{(S)}$ is equal, up to the factor $T$, to the opposite of the empirical conditional entropy of $S_t$ given $S_{t-1}$. The conditional entropy is classically used as a quality measure of clustering. In prediction, the weaker the entropy, the stronger the predictability of $S_t$ given $S_{t-1}$. More generally one tends to minimize this measure.
Because of the range of values of the log-likelihood of $\theta^{(Y)}$, that of $\theta^{(S)}$ contributes little to the complete log-likelihood. If the complete log-likelihood is used to select models, the persistence of the Markov chain therefore has a low impact. BIC indexes are also given in Table 2, where $\mathrm{BIC} = -2 \log L + N_p \log(N_{obs})$, with $L$ the likelihood of the model, $N_p$ the number of parameters, and $N_{obs}$ the number of observations. The BIC index enables one to consider a compromise between the goodness of fit and the number of parameters of a model. BIC indexes cannot be directly compared between the classes of a priori and of latent regime-switching models. However, the BIC indexes of these two classes of models can be compared with that of the unconditional VAR model, since it is a particular case.
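The quantities discussed above can be computed with a few lines of code. The following sketch evaluates the maximized transition log-likelihood together with the empirical conditional entropy, and a BIC index. The function names are hypothetical, and the parameter count in `n_params_ms_var` assumes unconstrained intercepts, autoregressive matrices, and covariance matrices; the models fitted in the paper may use a reduced parameterization, in which case the count would differ.

```python
import numpy as np

def transition_loglik_and_entropy(s, M):
    """Maximized transition log-likelihood and empirical conditional entropy."""
    n = np.zeros((M, M))
    np.add.at(n, (s[:-1], s[1:]), 1)
    row = n.sum(axis=1, keepdims=True)
    mask = n > 0                                    # convention 0 * log(0) = 0
    ratio = np.divide(n, row, out=np.ones_like(n), where=mask)
    loglik = np.sum(n[mask] * np.log(ratio[mask]))
    joint = n / n.sum()
    entropy = -np.sum(joint[mask] * np.log(ratio[mask]))
    return loglik, entropy

def n_params_ms_var(M, d, p):
    """Free parameters of an unconstrained M-regime VAR(p) in dimension d."""
    per_regime = d + p * d * d + d * (d + 1) // 2   # intercept, AR matrices, covariance
    return M * per_regime + M * (M - 1)             # plus free transition probabilities

def bic(loglik, n_params, n_obs):
    """BIC = -2 log L + N_p log(N_obs)."""
    return -2.0 * loglik + n_params * np.log(n_obs)

s = np.array([0, 0, 0, 1, 1, 0, 0, 1, 1, 1])
loglik, entropy = transition_loglik_and_entropy(s, M=2)
n_par = n_params_ms_var(M=3, d=10, p=2)             # 5 sites x 2 components
```

The number of observed transitions, here `len(s) - 1`, links the two returned quantities: the log-likelihood equals minus that number times the conditional entropy.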
The model AP-MS-VAR $C_{Diff(u,v)}$, which exhibits the best likelihood, performs the most accurately among the AP-MS-VAR models in reproducing the moving average and moving variance processes; see Section 6. Besides, in terms of BIC indexes, the smallest value among the AP-MS-VAR models is that of AP-MS-VAR $C_{Diff(u,v)}$, and it is also greater than that of the VAR model. In the following, the VAR model with shifts defined by $C_{Diff(u,v)}$ is kept for further comparisons with the H-MS-VAR model in simulation; see Section 6. We choose this model, although it is not the most physically meaningful, because it leads to better results according to our criterion.

Link between large-scale weather regimes and local ones
In this section, we quantitatively compare the large-scale regimes described by $C_{Z500}$ with the local ones. To this end, we compute the joint probability of occurrence of large-scale regimes ($C_{Z500}$) and local regimes (successively $C_{EOF(u,v)}$, $C_{Diff(u,v)}$ and H-MS-VAR; see Table 3).
For the three clusterings, the local regimes seem to appear in preferential large-scale weather regimes. The strongest link with $C_{Z500}$ is found for $C_{EOF(u,v)}$: the first regime coincides mainly with BL, the second one with AR, and the third one with NAO+. These results are not surprising because regimes of $C_{EOF(u,v)}$ are also easier to interpret physically. However, the association is not systematic: for instance, the second regime is observed not only during AR conditions but also during NAO+ conditions. Note that NAO− conditions split rather equiprobably among the three local regimes.
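The joint probabilities of occurrence of large-scale and local regimes can be estimated from two simultaneous label sequences with a simple contingency table. The function `joint_probability` and the toy sequences below are hypothetical and only illustrate the computation.

```python
import numpy as np

def joint_probability(a, b, n_a, n_b):
    """Empirical joint distribution of two simultaneous label sequences."""
    table = np.zeros((n_a, n_b))
    np.add.at(table, (a, b), 1)     # count simultaneous occurrences
    return table / table.sum()

# Toy example: 4 large-scale regimes (rows) against 3 local regimes (columns).
large_scale = np.array([0, 0, 1, 2, 3, 3, 1, 0])
local = np.array([0, 0, 1, 2, 1, 1, 1, 0])
table = joint_probability(large_scale, local, n_a=4, n_b=3)
```

Dividing each column of `table` by its sum would give the distribution of large-scale regimes conditional on each local regime.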
The regimes of H-MS-VAR and of $C_{Diff(u,v)}$ are more difficult to link with large-scale regimes.
The fact that they are less persistent than the $C_{EOF(u,v)}$ ones may explain why their joint occurrences with $C_{Z500}$ are weaker. As previously said, H-MS-VAR regimes are driven mainly by the conditional autoregressive model in the sense of the likelihood, which results in a more difficult physical interpretation. Some links can nevertheless be made: for both H-MS-VAR and $C_{Diff(u,v)}$, the second regime coincides mainly with NAO+, and to a lesser extent the first regime is connected to BL.

Comparison in simulation of the multisite wind models
In this section, we compare the models VAR(2), AP-MS-VAR $C_{Diff(u,v)}$, and H-MS-VAR in simulation. Regarding the marginal distributions reproduced by H-MS-VAR, that of $\{u_t\}$ is satisfyingly described, whereas the $\{v_t\}$ one is less accurately described. Results in (Ailliot, Bessac, Monbet, and Pene, 2015) are slightly more satisfying because of non-homogeneous transitions between regimes. The description of this distribution by AP-MS-VAR $C_{Diff(u,v)}$ is also satisfying and not shown here. Concerning the temporal dependence, the regime-switching models are the most able to accurately reproduce the autocorrelation functions of both $\{u_t\}$ and $\{v_t\}$. All the models tend to behave similarly in reproducing the correlation of $\{u_t\}$. However, the VAR model tends to underestimate the dependence of $\{v_t\}$ between 2 and 5 days, and the regime-switching models improve the description of this dependence.

The space-time correlation function of the multivariate process $\{u_t, v_t\}$ and its simulated replicates reveals that both models satisfyingly reproduce the general shape of this function and especially the non-separable and anisotropic patterns; see Figure 9. The non-separability, which is reflected in the asymmetry around the vertical axis at lag 0, is captured by the proposed models.
To study patterns at an instantaneous time scale, we focus on the ability of the models to reproduce the alternation of temporal variability. Indeed, the alternation of different weather states induces an alternation in the intensity and temporal variability of wind. In Figure 10, the moving standard deviation of wind speed around its moving mean at the central site 10 is depicted as a function of its moving mean. Observations reveal a higher variability when the intensity is high, although a high variability may also be associated with weaker values when the moving window overlaps the transition time. Models with regime-switching enable the reproduction of more temporal variability associated with moderate and high intensity of wind, which is not captured by an unconditional VAR model. For instance, the regime-switching models reproduce high variability around 5 and 10 m s$^{-1}$, which corresponds to transitions between weather states. This is ensured by the alternation, driven by a Markov chain, of periods associated with different parameters of the conditional model.
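The asymmetry of the space-time correlation around lag 0 can be illustrated with an empirical lagged cross-correlation between two sites. In the sketch below, the function `lagged_corr` and the synthetic series (a "western" series and an "eastern" series that follows it with a delay of two time steps) are assumptions made for the example; they do not reproduce the data of Figure 9.

```python
import numpy as np

def lagged_corr(x, y, max_lag):
    """Empirical correlation between x_t and y_{t+h} for h = -max_lag, ..., max_lag."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    T = len(x)
    lags = np.arange(-max_lag, max_lag + 1)
    corr = np.array([np.mean(x[max(0, -h):T - max(0, h)] * y[max(0, h):T - max(0, -h)])
                     for h in lags])
    return lags, corr

rng = np.random.default_rng(5)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.8 * x[t - 1] + rng.standard_normal()       # AR(1) "western" series
y = np.roll(x, 2) + 0.1 * rng.standard_normal(2000)     # "eastern" series, delayed by 2 steps
lags, corr = lagged_corr(x, y, max_lag=5)
```

A maximum located at a positive lag, rather than at lag 0, is the signature of a propagation from the first site to the second, which a separable space-time correlation model cannot represent.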

Diagnostics similar to those in Figure 4 indicate that the distributions of the moving standard deviation and the moving mean within each simulated regime of C Diff (u,v) and of H-MS-VAR are clearly distinct from one regime to the other, which indicates characteristic behaviors of these two simulated processes within each regime (not shown). Moreover, the behavior in each simulated regime is close to the observed one.

7 Discussions and perspectives
In Section 3, we compare site-specific regimes with common regional regimes. We conclude, according to mainly qualitative criteria, that for this dataset the use of a regime common to all locations is reasonable. To go one step further, one could set up a likelihood-ratio test to quantify more precisely to what extent the assumption of a regional regime, as opposed to a site-specific regime, is acceptable.
In this paper we have introduced observed and latent regime-switching frameworks, and we have shown that both types of regime-switching models have various advantages. Models with observed switchings may account for relevant regimes that correspond to characteristic meteorological conditions in Europe. The choice of the clustering method and of the descriptors of the data is crucial, as discussed in Subsection 4.2, where a k-means clustering led to regimes that were irrelevant in terms of estimation of the associated conditional model.
The hidden regime-switching framework seems to overcome this limitation by providing regimes that are driven by the conditional distribution and are therefore adapted to the estimation. When considering hidden regime-switching models, however, the estimation procedure may become challenging when sophisticated marginal models are considered. The extracted regimes are driven mainly by the local data and the proposed conditional distribution, and consequently they might be less physically interpretable than regimes derived from other clusterings. Nevertheless, in this study we saw that, for the proposed model and the studied dataset, the associated regimes were not physically inconsistent. Moreover, the use of hidden regime-switching models saves the effort of choosing an appropriate a priori observed clustering.
Concerning the proposed observed regime-switching models, there seems to be a trade-off between physically interpretable regimes and a good description of the conditional model by a VAR, as highlighted in Section 4 when comparing the AP-MS-VAR C Diff (u,v) and AP-MS-VAR C EOF (u,v) models. Indeed, we have chosen AP-MS-VAR C Diff (u,v) because it provides the best BIC index, despite the fact that C Diff (u,v) is less physically interpretable. This highlights the difficulty of finding relevant regimes that are adapted to the description of the data by conditional vector autoregressive models.
The proposed hidden regime-switching model seems to resolve this trade-off by providing more interpretable regimes than those of C Diff (u,v) together with a similar description of temporal patterns.
The improvement in BIC of AP-MS-VAR C Diff (u,v) with respect to the unconditional VAR is 4%, whereas the improvement of H-MS-VAR is 15.3%.
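For readers who wish to reproduce this kind of comparison, the short Python sketch below computes the BIC in its standard form and a relative improvement with respect to a reference model. The exact definition of the percentage is not stated here, so the formula in `relative_improvement` (difference divided by the absolute reference BIC) is an assumption, and both function names are hypothetical.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 log L + k log n (lower is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def relative_improvement(bic_reference, bic_model):
    """Percentage decrease in BIC relative to the reference model (assumed definition)."""
    return 100.0 * (bic_reference - bic_model) / abs(bic_reference)
```

Because the penalty term grows with the number of parameters, a regime-switching model improves the BIC only if its gain in likelihood outweighs the additional parameters in each regime.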
Future work may involve investigating reduced parameterizations of the autoregressive coefficients and of the covariance matrices of the innovations, which would help adapt the model to a larger dataset. Indeed, the number of parameters is already high with the small dataset under consideration, and attempts to use parametric shapes for the parameters reveal that a substantial effort will be needed to obtain consistent results. Furthermore, when looking at the autoregressive matrices, one generally sees privileged predictors depending on the regime, a situation that motivates the use of constrained matrices in each regime.
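One simple way to impose such constraints, given here only as a sketch, is to fix a zero pattern for the autoregressive matrix and to estimate each row by least squares on the allowed predictors. The function name `fit_constrained_var1`, the restriction to a zero-mean VAR(1), and the Boolean `mask` encoding are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def fit_constrained_var1(x, mask):
    """Least-squares fit of x[t+1] = A x[t] + noise with A[i, j] = 0 where mask[i, j] is False.

    x    : array of shape (n_times, d), assumed zero-mean
    mask : Boolean array of shape (d, d); True marks an allowed predictor
    """
    x = np.asarray(x, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    past, future = x[:-1], x[1:]
    d = x.shape[1]
    a = np.zeros((d, d))
    for i in range(d):
        cols = np.flatnonzero(mask[i])     # predictors allowed for component i
        if cols.size:
            coef, *_ = np.linalg.lstsq(past[:, cols], future[:, i], rcond=None)
            a[i, cols] = coef
    return a
```

In a regime-switching setting, the same fit could be applied separately to the transitions observed within each regime, each with its own mask, which reduces the number of free parameters per regime.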