POLICY RESEARCH WORKING PAPER 1553
Spatial Correlations in Panel Data
A correction for spatial correlation in panel data.
John Driscoll
Aart Kraay
The World Bank
Policy Research Department
Macroeconomics and Growth Division
December 1995
Summary findings
In many empirical applications involving combined time-series and cross-sectional data, the residuals from
different cross-sectional units are likely to be correlated with one another. This is often the case in applications in
macroeconomics and international economics where the cross-sectional units may be countries, states, or regions
observed over time. "Spatial" correlations among such cross-sections may arise for a number of reasons, ranging
from observed common shocks such as terms of trade or oil shocks, to unobserved "contagion" or "neighborhood"
effects which propagate across countries in complex ways.
Driscoll and Kraay observe that the presence of such spatial correlations in residuals complicates standard
inference procedures that combine time-series and cross-sectional data since these techniques typically require the
assumption that the cross-sectional units are independent. When this assumption is violated, estimates
of standard errors are inconsistent, and hence are not useful for inference. And standard corrections for spatial
correlations will be valid only if spatial correlations are of particular restrictive forms.
Driscoll and Kraay propose a correction for spatial correlations that does not require strong assumptions
concerning their form - and show that it is superior to a number of commonly used alternatives.
This paper - a product of the Macroeconomics and Growth Division, Policy Research Department - is part of a larger
effort in the department to study international macroeconomics. Copies of the paper are available free from the World Bank,
1818 H Street NW, Washington, DC 20433. Please contact Rebecca Martin, room N11-059, telephone 202-473-9065,
fax 202-522-3518, Internet address rmartinl@worldbank.org. December 1995. (28 pages)
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about
development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The
papers carry the names of the authors and should be used and cited accordingly. The findings, interpretations, and conclusions are the
authors' own and should not be attributed to the World Bank, its Executive Board of Directors, or any of its member countries.
Produced by the Policy Research Dissemination Center
Spatial Correlations in Panel Data
John Driscoll
Department of Economics
Brown University, Box B
Providence RI 02912
jd@econ.pstc.brown.edu
and
Aart Kraay
The World Bank
1818 H Street NW
Washington DC 20433
akraay@worldbank.org
We would like to thank John Campbell, Greg Mankiw, Matthew Shapiro and especially Gary Chamberlain and Jim
Stock for helpful comments and suggestions. Financial support from the Earle A. Chiles Foundation (Driscoll) and the
Social Sciences and Humanities Research Council of Canada (Kraay) during work on earlier drafts of this paper is
gratefully acknowledged.
1 Introduction
Economists are frequently faced with the problem of drawing inferences from data sets
which combine cross-sectional and time-series data. In such situations, it has become standard
practice to base inferences on techniques which pool the cross-sectional and time-series
dimensions in some way. For such techniques to be valid, it must be the case that the error terms
are not correlated across different cross-sectional units, either contemporaneously or at leads and
lags. This condition is directly analogous to the usual requirement that the residuals from
different observations in a single cross-sectional regression be independent of each other. If this
condition is not met, estimates of standard errors will be inconsistent, and will not be useful for
inference.
This paper begins with the observation that in many applications, especially in
macroeconomics and international economics, the assumption of independent cross-sectional units
is inappropriate. While it may be reasonable to assume that cross-sectional units are independent
when they are households or individuals chosen according to a well-designed sampling scheme
from a large population, this assumption becomes less tenable when the cross-sectional units are
countries or regions. Countries or regions are likely to be subject to observable and unobservable
common disturbances which will cause the residuals from one cross-section to be correlated with
those of another. We will refer to such cross-sectional correlations as "spatial correlations."
Spatial correlations may arise for a number of reasons. For example, in applications in
which real GDP growth rates are the dependent variable, various channels of interdependence
such as trade, capital flows or policy coordination mechanisms will induce cross-country
correlations in GDP growth rates.¹ Unless the regressions of interest include right-hand side
variables which correctly specify these channels of interdependence, the residuals from these
regressions will be correlated across countries. Similarly, in studies of capital flows to developing
countries, common external shocks such as US interest rates, or else unobserved contagion effects
(sometimes dubbed "tequila" effects in the aftermath of the Mexican peso crisis) can cause residuals
from capital flows regressions to be correlated across countries.
¹ See Kraay and Ventura (1995) for a discussion of the roles of trade and capital mobility in the synchronization of GDP
growth rates across countries. Ades and Chua (1993) and Easterly and Levine (1995) provide empirical evidence that
policies tend to be correlated among neighbours, leading to correlations of growth rates over long horizons.
A number of standard corrections for spatial correlations exist, all of which require strong
assumptions regarding the form of the spatial correlations. For example, it is common to include
time dummy variables in pooled time-series, cross-sectional regressions to capture the effect of
common disturbances. This technique is the appropriate correction for spatial correlation only if
one assumes that the contemporaneous correlations between any pair of cross-sectional units are
equal, and the lagged cross-sectional correlations are zero. Unfortunately, such strong
restrictions on the form of the spatial correlations are unlikely to be correct in most applications.
For example, different countries may react differently to common disturbances, or contagion
effects may spread across countries only after a lag. When the structure of the spatial correlations
is misspecified in this way, the properties of the resulting estimator are in general unknown.
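As a stylized illustration of how restrictive the time-dummy assumption is, consider an error with a single common shock. The symbols $\eta_t$, $v_{it}$, $\lambda_i$ and $\sigma_\eta^2$ are introduced here purely for illustration. All variables are assumed to have mean zero, $\eta_t$ is assumed serially uncorrelated with variance $\sigma_\eta^2$ and uncorrelated with the $v_{it}$, and the $v_{it}$ are assumed uncorrelated across both i and t.

```latex
% Case 1: every unit responds identically to the common shock
\varepsilon_{it} = \eta_t + v_{it}
\;\Longrightarrow\;
E[\varepsilon_{it}\varepsilon_{jt}] = \sigma_\eta^2 \quad (i \neq j),
\qquad
E[\varepsilon_{it}\varepsilon_{j,t-s}] = 0 \quad (s \neq 0)

% Case 2: unit-specific responses to the same shock
\varepsilon_{it} = \lambda_i \eta_t + v_{it}
\;\Longrightarrow\;
E[\varepsilon_{it}\varepsilon_{jt}] = \lambda_i \lambda_j \sigma_\eta^2 \quad (i \neq j)
```

In the first case the contemporaneous covariance is the same for every pair of units, which is the situation a time dummy is designed for. In the second case the covariance differs by pair, and a component of the form $(\lambda_i - \bar{\lambda})\eta_t$, with $\bar{\lambda}$ the cross-sectional average loading, would be left in the residual after time dummies are included.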
Since it is not desirable to impose restrictions on the form of the spatial correlations, it is
less clear how to proceed. One alternative is to attempt to parametrically estimate the full
unrestricted matrix of spatial correlations for use in a feasible generalized least squares (FGLS)
procedure. This procedure, which is a variant of the Seemingly Unrelated Regressions (SUR)
technique, will only be effective in a limited set of applications. To see why this is so, suppose
that there are N cross-sectional units and T time-series observations. The NxN matrix of
contemporaneous cross-sectional correlations has N(N+1)/2 free parameters to be estimated using
the NT available observations. Thus, in order to obtain reliable estimates of the matrix of spatial
correlations, it must be the case that T>>(N+1)/2. However, in many cross-country applications
using annual data, there are many more countries in the sample of interest than there are time-
series observations, so this approach will be infeasible.
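The counting argument can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the two panel sizes are assumptions chosen for the example, not figures from the paper.

```python
def sur_parameter_count(n_units: int) -> int:
    """Free parameters in a symmetric N x N contemporaneous covariance matrix."""
    return n_units * (n_units + 1) // 2


def observations_per_parameter(n_units: int, n_periods: int) -> float:
    """Ratio of the NT available observations to the N(N+1)/2 free parameters."""
    return (n_units * n_periods) / sur_parameter_count(n_units)


# Hypothetical panel sizes: a small regional panel and a broad cross-country panel.
for n_units, n_periods in [(5, 40), (100, 25)]:
    print(
        n_units,
        n_periods,
        sur_parameter_count(n_units),
        round(observations_per_parameter(n_units, n_periods), 2),
    )
```

The ratio equals 2T/(N+1), so it falls below one whenever T is smaller than (N+1)/2, which is the case for the second hypothetical panel.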
In this paper we propose an alternative correction for spatial correlation. Building on the
non-parametric heteroskedasticity and autocorrelation consistent (HAC) covariance matrix
estimation technique of Newey and West (1987) and Andrews (1991), we show how this
approach can be extended to a panel setting with cross-sectional dependence, in addition to serial
correlation and heteroskedasticity. We present very weak conditions on the form of the cross-
sectional and time-series dependence under which a simple variant on the Newey and West
estimator yields consistent estimates of standard errors. In particular, we can obtain consistent
estimates of standard errors in the presence of arbitrary contemporaneous cross-sectional
correlations, as well as lagged cross-sectional correlations which are restricted to become small
only as the time interval separating the two observations becomes large. This very general
structure is likely to encompass most forms of spatial correlations encountered in practice.
Our results on consistency are based on asymptotic theory which requires the time
dimension, T, to tend to infinity. Thus, our results will only be relevant for panel data sets in
which the time dimension is reasonably large (our Monte Carlo simulations suggest that a value of
T=20 or T=25 is the minimum). However, our results do not place any restrictions on the size of
the cross-sectional dimension, N, and we can even allow the extreme case in which N tends to
infinity at any rate relative to T. This implies that our techniques, in contrast to SUR, will be
applicable in situations such as cross-country panel data sets where the number of countries is
very large.
The rest of this paper proceeds as follows. In Section 2, we first develop the intuitions for
our results using a simple ordinary least squares example. We then provide a formal statement of
our result, using a mixing random field structure to characterize the permissible extent of cross-
sectional and time-series dependence. Since this structure is somewhat unfamiliar, we provide
some examples of forms of cross-sectional dependence which satisfy the conditions we impose.
In Section 3, we consider the finite-sample properties of our estimator using Monte Carlo
evidence, and find that our non-parametric estimator performs significantly better than common
alternatives such as time dummies or SUR. Section 4 concludes.
2 Consistent Covariance Matrix Estimation with Spatial Dependence
2.1 Preliminary Discussion
In order to develop the intuition for the results of this paper, consider the following simple
bivariate linear panel regression:
$$ y_{it} = x_{it}\beta + \varepsilon_{it}, \qquad i = 1, \dots, N, \quad t = 1, \dots, T, \qquad \{E[\varepsilon_{it}\varepsilon_{js}]\} = \Omega \tag{1} $$
To obtain an estimate of $\beta$, it is common practice to pool the cross-sectional and time-series
observations and apply OLS to the full set of NT observations. If the errors are independently
and identically distributed (i.e. if $\Omega = \sigma^2 I_{NT}$), this will yield consistent estimates of $\beta$ and its
standard errors. However, in the presence of spatial correlations, $\Omega$ is no longer diagonal. In this
case, although the OLS estimator of $\beta$ is still consistent, the OLS standard errors will be
inconsistent, and hence will not be useful for inference.
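We can write the OLS estimator of $\beta$ in the usual way as follows:
$$ \sqrt{T}\,(\hat{\beta}_{OLS} - \beta) = \frac{\dfrac{1}{\sqrt{T}} \sum_{t=1}^{T} \dfrac{1}{N} \sum_{i=1}^{N} x_{it}\varepsilon_{it}}{\left\{ \dfrac{1}{NT} \sum_{t=1}^{T} \sum_{i=1}^{N} x_{it}^2 \right\}} \tag{2} $$
To simplify the above expression, denote the term in brackets in the denominator of (2) as $Q_T$,² and
define
$$ h_t = \frac{1}{N} \sum_{i=1}^{N} x_{it}\varepsilon_{it} \tag{3} $$
² For the purposes of this illustrative example, we can assume that the $x_{it}$ are constants and that $Q_T \to Q > 0$ as $N, T \to \infty$.
Substituting into the expression for the OLS estimator, we obtain
$$ \sqrt{T}\,(\hat{\beta}_{OLS} - \beta) = \frac{1}{Q_T} \frac{1}{\sqrt{T}} \sum_{t=1}^{T} h_t \tag{4} $$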
This change of variables is useful because it reduces the original panel data estimation problem to
a simple time-series estimation problem. In other words, by defining a cross-sectional average $h_t$
at every point in time, we have "collapsed" the cross-sectional dimension of the problem to a
single time-series observation by averaging over the N cross-sectional units in each period.
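A minimal sketch of this change of variables follows. It assumes the panel is stored as T rows of N values; the function name and the nested-list layout are illustrative choices, not taken from the paper.

```python
from typing import List


def cross_sectional_averages(x: List[List[float]], e: List[List[float]]) -> List[float]:
    """Return h_t = (1/N) * sum_i x[t][i] * e[t][i] for each period t.

    x and e are T x N nested lists (one inner list per time period).
    """
    h = []
    for x_t, e_t in zip(x, e):
        n_units = len(x_t)
        h.append(sum(x_it * e_it for x_it, e_it in zip(x_t, e_t)) / n_units)
    return h


# Two periods, two units: the panel collapses to a series of length T = 2.
x = [[1.0, 2.0], [3.0, 4.0]]
e = [[0.5, -0.5], [1.0, 1.0]]
print(cross_sectional_averages(x, e))
```

Whatever the value of N, the output has one entry per period, which is why the remaining problem is a pure time-series one.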
Since OLS estimates of $\beta$ will be consistent even in the presence of spatial correlations,
our main concern is with obtaining consistent estimates of the variance of the OLS estimator.
Using the above notation, we can write this variance in terms of the $h_t$ as
$$ V_T = \frac{1}{Q_T^2} \cdot \frac{1}{T} \sum_{t=1}^{T} \sum_{s=1}^{T} E[h_t h_s] \equiv \frac{S_T}{Q_T^2} \tag{5} $$
The main intuition of the paper is as follows. Given appropriate conditions on the $h_t$, we can
apply standard time-series non-parametric covariance matrix estimation techniques such as those
employed by Newey and West to obtain a consistent estimate of $S_T$, and hence of $V_T$. These
conditions (known as "mixing conditions" in the standard time-series literature) place restrictions
on the autocovariances of the $h_t$, requiring the dependence between $h_t$ and $h_{t-s}$ to become small as
the time interval separating them, s, becomes large. Imposing restrictions on the autocovariances
of $h_t$ will amount to placing restrictions on the contemporaneous and lagged spatial dependence in
the residuals, $E[\varepsilon_{it}\varepsilon_{j,t-s}]$, since the autocovariances of the sequence $h_t$ are a weighted average of
these covariances, i.e.
$$ E[h_t h_{t-s}] = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} x_{it} x_{j,t-s} E[\varepsilon_{it}\varepsilon_{j,t-s}] \tag{6} $$
In this paper, we show that only very weak restrictions on the form of the spatial correlations are
required to ensure that $h_t$ satisfies the regularity conditions necessary for consistent estimation of
$S_T$. In particular, we can permit arbitrary contemporaneous correlations, and we require only that
lagged cross-sectional dependence declines at a particular rate as the time separation becomes
large. As in Newey and West (1987), our asymptotic results rely on a large time dimension.
However, we do not need to restrict the size of the cross-sectional dimension, which can tend to
infinity at any rate relative to T.
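The procedure described in this subsection can be sketched for the bivariate regression without an intercept. This is an illustrative implementation under stated assumptions, not the authors' code: the Bartlett weights 1 - s/(m+1) are those of Newey and West (1987), the lag truncation `max_lag` is left to the user, OLS residuals stand in for the unobserved errors, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

Panel = List[List[float]]  # T rows (periods), each holding N unit observations


def pooled_ols_slope(x: Panel, y: Panel) -> float:
    """Pooled OLS slope for y_it = x_it * beta + e_it (no intercept)."""
    sum_xy = sum(xi * yi for x_t, y_t in zip(x, y) for xi, yi in zip(x_t, y_t))
    sum_xx = sum(xi * xi for x_t in x for xi in x_t)
    return sum_xy / sum_xx


def bartlett_long_run_variance(h: List[float], max_lag: int) -> float:
    """Estimate S_T from the series h_t with Bartlett weights 1 - s/(max_lag + 1)."""
    n_periods = len(h)
    s_hat = sum(v * v for v in h) / n_periods  # lag-0 autocovariance
    for s in range(1, max_lag + 1):
        gamma_s = sum(h[t] * h[t - s] for t in range(s, n_periods)) / n_periods
        s_hat += 2.0 * (1.0 - s / (max_lag + 1)) * gamma_s
    return s_hat


def spatial_hac_standard_error(x: Panel, y: Panel, max_lag: int) -> Tuple[float, float]:
    """Return (beta_hat, standard error) using cross-sectional averages h_t."""
    n_periods, n_units = len(x), len(x[0])
    beta_hat = pooled_ols_slope(x, y)
    # h_t = (1/N) * sum_i x_it * residual_it, the sample analogue of h_t
    h = [
        sum(xi * (yi - beta_hat * xi) for xi, yi in zip(x_t, y_t)) / n_units
        for x_t, y_t in zip(x, y)
    ]
    q_hat = sum(xi * xi for x_t in x for xi in x_t) / (n_units * n_periods)
    # Estimated variance of sqrt(T) * (beta_hat - beta): S_T / Q_T^2
    v_hat = bartlett_long_run_variance(h, max_lag) / q_hat ** 2
    return beta_hat, math.sqrt(v_hat / n_periods)
```

The cross-sectional dimension enters only through the averaging step that builds `h`, so the same code applies for any N. Setting `max_lag` to zero ignores lagged dependence but still allows arbitrary contemporaneous correlation across units.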
We use a mixing random field structure to characterize the permissible extent of spatial
and temporal dependence. As mixing random fields are somewhat unfamiliar in the econometrics
literature, we briefly present the necessary intuitions here, and relegate the details to the appendix.
Random fields are simply random variables with multiple indices. For example, returning to
Equation (3), we can define the random field $h_{it} = x_{it}\varepsilon_{it}$, indexed by i and t. In the standard
univariate time-series literature, a time series is described as "mixing" if the dependence between
two random variables $x_t$ and $x_{t-s}$ becomes small as the time interval separating them, s, becomes
large. In this paper, we will analogously describe a random field as being "mixing" if the
dependence between $h_{it}$ and $h_{j,t-s}$ becomes small as the time interval s becomes large, for any pair
of cross-sectional observations i and j.³ In this way, the standard time-series definition of mixing
corresponds to the special case where i=j. Finally, the "size" of a mixing random field is defined as the rate at
which the dependence between two observations must decline as a function of the distance
between them.
This particular definition of a mixing random field has the extremely useful property that
the cross-sectional averages of this random field, $h_t$ (as defined in Equation (3)), form a univariate
mixing sequence of the same size as the underlying random field. This is true for any value of N
(the size of the cross-sectional dimension), including the limiting case where $N \to \infty$. If we impose
the restriction that the $h_{it}$ form a mixing random field of the appropriate size, then $h_t$ will be a
mixing sequence of the same size, and we can directly apply standard time-series covariance
matrix estimation techniques to obtain an estimate of $S_T$ in Equation (5). Thus, our results
amount to a simple extension of the Newey and West estimator, which may be viewed in the
above context as the case in which N = 1.
³ This definition of mixing departs from the standard definitions in the random field literature in that it treats the
cross-sectional and time-series dimensions asymmetrically. Typically, mixing restrictions would require the dependence
between $h_{it}$ and $h_{j,t-s}$ to become small as the Euclidean distance $d = ((i-j)^2 + s^2)^{1/2}$ between these two random variables
becomes large. This is an unattractive property of standard definitions of mixing random fields for two reasons. First, in
our panel data applications, it precludes canonical forms of cross-sectional dependence such as equal contemporaneous
cross-unit correlations. To see why this is so, notice that the distance between $h_{it}$ and $h_{jt}$ is simply $|i-j|$ according to the
above definition. Standard definitions of mixing would then rule out equal cross-sectional correlations between any $h_{it}$
and $h_{jt}$, since this correlation will not decline as $|i-j|$ becomes large. The second problem is that in order to impose the
restriction that observations "far apart" in the cross-sectional ordering be approximately uncorrelated, it is necessary to
know what the cross-sectional ordering is. This is problematic, since unlike in the time dimension, in most cases there is
no natural ordering in the cross-sectional dimension.
2.2 Results
In this section, we present our main result, which is simply a generalization and
formalization of the discussion of the previous section. The theorem is stated in terms of a broad
class of Generalized Method of Moments estimators, of which the OLS case discussed above is an
example.
Theorem
Consider the class of GMM models identified by a p×1 vector of
orthogonality conditions $E[\psi(\theta_0, z_{it})] = 0$, where $\theta_0 \in \Theta$ is an a×1 vector of
parameters with $a \le p$, $\Theta$ is a compact subset of $\mathbb{R}^a$, $z_{it}$ is a k×1 vector of data, and
denote $z_t = (z_{1t}', \dots, z_{Nt}')'$ and $h_t = h(\theta, z_t) = N^{-1} \sum_{i=1}^{N} \psi(\theta, z_{it})$. Suppose further that
(1) $z_{it}$ is an α-mixing random field of size $2(r+\delta)/(r+\delta-1)$, as defined in the
Appendix;
(2) (a) $\psi(\theta, z_{it})$ is continuously differentiable in $\theta$ and measurable in $z_{it}$;
(b) $E[|\psi(\theta, z_{it})|^{4(r+\delta)}]$ [...]
[...] $|P[F_1 \cap F_2] - P[F_1]P[F_2]|$
A mixing random field is defined as follows:
Definition: A random field is mixing of size r/(r-1), r > 1, if for some $\lambda > r/(r-1)$,
$\alpha(s) = O(s^{-\lambda})$.
¹¹ Random field structures have been developed extensively in the statistics literature. See Rosenblatt (1970), Deo
(1975), Bolthausen (1982), and Bulinskii (1988). Some economic applications include Wooldridge and White (1988),
Quah (1990), and Conley (1994).
¹² It is straightforward to extend these definitions and the results which follow to φ-mixing random fields by defining
φ-mixing coefficients in the usual way.
This definition of mixing departs from the more standard α-mixing structures on random fields in
that it treats the cross-sectional dependence differently from the time-series dependence. Most
definitions of mixing¹³ restrict the dependence in both dimensions symmetrically, requiring the
dependence between two observations to decline as either the distance in the cross-sectional
ordering becomes large, or as the time separation becomes large (see, for example, Quah (1990)).
This restriction on the dependence across units is required to deliver $(NT)^{1/2}$ asymptotic normality
for double sums over i and t of the $\varepsilon_{it}$, just as in the one-dimensional case restrictions on the
temporal dependence are required to deliver $T^{1/2}$ asymptotic normality for appropriately
normalized sums.¹⁴
The definition of mixing presented here, however, does not restrict the degree of cross-sectional
dependence. Instead, we only require the dependence between $\varepsilon_{it}$ and $\varepsilon_{j,t-s}$ to be small
when s is large, for any value of i and j. This is a desirable property, since it will not preclude
canonical forms of cross-sectional dependence, such as factor structures in which cross-sectional
units may be equicorrelated in a given time period or grouped structures in which observations are
correlated according to possibly unobservable group characteristics. This greater permissible
cross-sectional dependence comes at the cost that it will not be possible to obtain $(NT)^{1/2}$
asymptotics for double sums over i and t of the $\varepsilon_{it}$. However, we do not require this as we rely
exclusively on $T^{1/2}$ asymptotics for this double sum.
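A small calculation suggests why averaging over units cannot deliver the faster rate under a factor structure. The sketch assumes a one-factor error, eps_it = loading_i * f_t + v_it, with Var(f_t) = 1, f_t uncorrelated with v_it, and v_it uncorrelated across units with a common variance. The function name and the numerical values are hypothetical.

```python
from typing import List


def variance_of_cross_sectional_average(
    loadings: List[float], idiosyncratic_var: float
) -> float:
    """Variance of (1/N) * sum_i eps_it under the one-factor assumptions above.

    The average equals mean(loadings) * f_t + mean_i(v_it), so its variance is
    mean(loadings)**2 + idiosyncratic_var / N.
    """
    n_units = len(loadings)
    mean_loading = sum(loadings) / n_units
    return mean_loading ** 2 + idiosyncratic_var / n_units


# Equal loadings of 0.5 (equicorrelated units) and unit idiosyncratic variance.
for n_units in (10, 100, 1000):
    print(n_units, variance_of_cross_sectional_average([0.5] * n_units, 1.0))
```

Only the idiosyncratic part shrinks as N grows. The common-factor part is bounded below by the squared average loading, so under these assumptions the cross-sectional average does not become more precise at rate N, and only the time dimension can be relied on for the asymptotics.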
A useful property of this random field structure is that the sequence of cross-sectional
averages of the $\varepsilon_{it}$ forms a univariate α-mixing sequence, as summarized in the following lemma:
¹³ See Doukhan (1994) for an extensive survey of mixing in random fields and in other contexts.
¹⁴ For such random fields, $(NT)^{1/2}$ asymptotics typically require N and T to go to infinity at the same rate, suggesting that
in finite sample applications, the cross-sectional and time-series dimension must be roughly equal for asymptotic
approximations to be plausible. For example, Quah (1990) has the restriction that T=iN. We do not require this
restriction in our asymptotic theory.
Lemma
Suppose that $\varepsilon_{it}$ is an α-mixing random field of size r/(r-1), r > 1. Then
$$ h_t = \frac{1}{N} \sum_{i=1}^{N} \varepsilon_{it} $$
is an α-mixing sequence of the same size as $\varepsilon_{it}$ for any N.
Proof
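The proof is simply a matter of verifying that $h_t$ satisfies the definition of univariate mixing.
Define $\mathcal{G}_{-\infty}^{t} = \sigma(h_s \mid s \le t)$ [...] $\sup_{G_1 \in \mathcal{G}_{-\infty}^{t},\, G_2 \in \mathcal{G}_{t+s}^{\infty}} |P[G_1 \cap G_2] - P[G_1]P[G_2]|$. Now we claim that $\mathcal{G}_{-\infty}^{t} \subseteq \mathcal{F}_{-\infty}^{t}$
and $\mathcal{G}_{t+s}^{\infty} \subseteq \mathcal{F}_{t+s}^{\infty}$. Given this claim, we have $\alpha_h(s) \le \alpha(s)$ $\forall s$, and hence $\alpha_h(s)$ converges at least as
quickly as $\alpha(s)$. Thus the sequence $h_t$ is mixing of the same size as $\varepsilon_{it}$.
To verify the claim, note that $h_t: \Omega \to \mathbb{R}^1$ is a Borel function of $\{\varepsilon_{it} \mid i = 1, \dots, N, \dots\}$, and hence is
$\sigma(\varepsilon_{it} \mid i = 1, \dots, N, \dots)$-measurable, i.e. $h_t^{-1}(\mathcal{B}) \subseteq \sigma(\varepsilon_{it} \mid i = 1, \dots, N, \dots)$ where $\mathcal{B}$ is the σ-algebra
generated by the Borel sets. Thus by definition $\sigma(h_t) = \sigma(h_t^{-1}(\mathcal{B})) \subseteq \sigma(\varepsilon_{it} \mid i = 1, \dots, N, \dots)$. Finally,
note that $\mathcal{G}_{-\infty}^{t} = \sigma(\cup_{s=-\infty}^{t} \sigma(h_s))$ and $\mathcal{F}_{-\infty}^{t} = \sigma(\cup_{s=-\infty}^{t} \sigma(\varepsilon_{is} \mid i = 1, \dots, N, \dots))$, and so the claim is verified.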
This lemma is useful, as it permits us to move from restrictions on temporal and spatial
dependence in the random field to simple mixing restrictions on the univariate sequence of cross-sectional
averages, $h_t$.
Proof of Theorem
To prove consistency and asymptotic normality of the GMM estimator, we will verify the
conditions in Hamilton (1994), Proposition 14.1. Consistency of the covariance matrix estimator
will follow from the arguments of Newey and West (1987). To verify consistency of the GMM
estimator (Hamilton, Proposition 14.1, Condition (a)), we need only verify conditions for the
consistency of extremum estimators (for example, Amemiya (1985), Theorem 4.1.1). Conditions
A and B of Amemiya (1985), Theorem 4.1.1 follow immediately from the compactness of $\Theta$ and
Assumption 2(a). Condition C of this theorem requires the minimand in the GMM problem to
converge uniformly in $\theta \in \Theta$. This condition will be satisfied if the sequence $h_t = h(\theta, z_t)$ obeys a
LLN for all $\theta \in \Theta$. Since $\psi$ is a measurable function of $z_{it}$, it is a mixing random field of the same
size as $z_{it}$, by an argument similar to the one used to prove Lemma 1. By Lemma 1, $h_t$ is a
univariate α-mixing sequence of size $2(r+\delta)/(r+\delta-1) > r/(r-1)$. Thus, to apply the McLeish
(1975) LLN (see White (1984), Theorem 3.47) for α-mixing sequences of size r/(r-1), we need
only verify that $h_t$ has finite $(r+\delta)$th moments. However, to prove consistency of the covariance
matrix estimator, we will require the stronger moment condition that $E[|h_t|^{4(r+\delta)}]$ [...] 0 (15)
uniformly in a as $T \to \infty$. To verify this, observe first that the mixing property of $h_t$ allows us to
bound the autocovariances of $h_t$ in the usual way. That is, for s > 0, we have $|E[h_t h_{t-s}]| < a(s)\Delta$
where $a(s) = O(s^{-(1+\delta)})$ (White (1984), Corollary 6.16). This corollary requires $h_t$ to be an α-mixing
sequence of size $(2+2\eta)/\eta$, $\eta > 0$ with $E[|h_t|^{2+2\eta}]$ [...]
[...] (22)
This final term converges to zero by the assumption that $m(T) = o(T^{1/3})$.
Consistency of the first term follows immediately from the consistency of the OLS
estimator and the final paragraph in Newey and West (1987).
Spatial Correlations with Factor Structure Representations
The following corollary to the theorem verifies the claim made in Section 2.3 that it is
possible to obtain consistent covariance matrix estimates in the presence of spatial correlations
which have a factor structure representation.
Corollary
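Suppose that $y_{it} = x_{it}\beta + \varepsilon_{it}$, with $\varepsilon_{it} = f_t'\lambda_i + v_{it}$, $f_t = (f_{1t}, \dots, f_{Mt})'$ and $x_{it} = g_t'\kappa_i + u_{it}$,
$g_t = (g_{1t}, \dots, g_{Pt})'$ where $\lambda_i$ and $\kappa_i$ are M×1 and P×1 vectors of uniformly bounded
constant factor loadings and M and P are finite constants. Suppose further that
$f_{mt} \perp f_{nt}$ and $f_{mt} \perp v_{it}$ $\forall m, n$, $m \ne n$ and $\forall t$, and that $g_{mt} \perp g_{nt}$ and $g_{mt} \perp u_{it}$ $\forall m, n$,
$m \ne n$ and $\forall t$, and that $E[f_{mt}] = E[g_{mt}] = 0$ and $E[f_{mt}^2] = E[g_{mt}^2] = 1$ $\forall m, t$. Suppose
further that $\forall i, j, t$ and m
(1) (a) $(f_t', g_t')'$ is an α-mixing sequence of size $2(r+\delta)/(r+\delta-1)$ for r > 1
and some $\delta > 0$;
(b) $E[v_{it}] = E[u_{it}] = 0$ and $v_{it} \perp v_{j,t-s}$ and $u_{it} \perp u_{j,t-s}$ for $s \ne 0$;
(2) (a) $E[x_{it}\varepsilon_{it}] = 0$;
(b) $E[|f_{mt} x_{it}|^{4(r+\delta)}] < \infty$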