WPS3723
Determinants of City Growth in Brazil
Daniel da Mata*, Uwe Deichmann, J. Vernon Henderson,
Somik V. Lall, and Hyoung Gun Wang
* DIRUR, Instituto de Pesquisa Econômica Aplicada (IPEA), Brasilia
Development Research Group, The World Bank, Washington DC
Department of Economics, Brown University, Providence, RI
Abstract
In this paper, we examine the determinants of Brazilian city growth between 1970 and 2000. We
consider a model of a city, which combines aspects of standard urban economics and the new
economic geography literatures. For the empirical analysis, we construct a dataset of 123
Brazilian agglomerations, and estimate aspects of the demand and supply sides as well as a
reduced form specification that describes city sizes and their growth. Our main findings are that
increases in rural population supply, improvements in inter-regional transport connectivity and
education attainment of the labor force have strong impacts on city growth. We also find that
local crime and violence, measured by homicide rates, impinge on growth. In contrast, a higher
share of private sector industrial capital in the local economy stimulates growth. Using the
residuals from the growth estimation, we also find that cities that better administer local land use
and zoning laws have higher growth. Finally, our policy simulations show that diverting transport
investments from large cities toward secondary cities does not provide significant gains in terms
of national urban performance.
World Bank Policy Research Working Paper 3723, September 2005
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the
exchange of ideas about development issues. An objective of the series is to get the findings out quickly,
even if the presentations are less than fully polished. The papers carry the names of the authors and should
be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely
those of the authors. They do not necessarily represent the view of the World Bank, its Executive Directors,
or the countries they represent. Policy Research Working Papers are available online at
http://econ.worldbank.org.
Acknowledgements
This paper is a product of a joint research program between the World Bank and the Instituto de Pesquisa
Econômica Aplicada (IPEA), Brasilia. This research has been partly funded by a World Bank research grant
and by the Urban Cluster of the World Bank's Latin America and Caribbean Region, and is also an input to
the World Bank's urban strategy for Brazil. We have benefited from discussions with Carlos Azzoni, Pedro
Cavalcanti Ferreira, Ken Chomitz, Dean Cira, Marianne Fay, Mila Freire, João Carlos Magalhães, Maria da
Piedade Morais, Marcelo Piancastelli, Zmarak Shalizi, Christopher Timmins and Alexandre Ywata de
Carvalho. All errors are the authors'. A preliminary version of the paper was presented at the World Bank/
IPEA Urban Research Symposium in Brasilia (April 2005).
1. BACKGROUND AND MOTIVATION
Why are some cities more successful than their peers? Is the "success" of
individual cities driven by factors mostly external to any city's immediate control
(location, growth in market potential, being a port in a period of national trade growth,
national level decentralization and improved governance), or do individual city policies
and politics influence growth and development? Disentangling the relative contribution
of regional and local efforts is important for understanding the potential of alternate
policy interventions for stimulating growth of cities across the national urban system. At
this time, there is very little research examining the effectiveness of local and national
policy environments on urban growth in developing countries.
Brazil is a highly urbanized country: 80 percent of its population lives in urban
centers and 90 percent of GDP is created in cities. According to estimates by the UN
Population Division for Brazil, all of the population growth expected over the
next three decades will be in cities, and the national urbanization rate is expected to rise
to over 90 percent (UN 2003). This will add about 63 million people to Brazil's cities,
and total urban population will be over 200 million. This population growth is occurring
across the Brazilian urban system (Table 1; see also Lemos et al. 2003). Of the 123 major
urban agglomerations in Brazil, only three were above 2 million people in 1970 versus
ten in 2000. In the middle of the size distribution in 2000, there were 52 agglomerations
with population between 250,000 and 2 million people compared to 25 in 1970. Thus, not
only is the scale of urbanization a major concern, but the distribution of population across
the urban hierarchy will also challenge policy makers to devise appropriate policies for
cities of different sizes. Across the urban system, there will be a need to meet backlogs in
infrastructure, service delivery, and amenity provision, as well as accommodate further
growth.
In addition to population increases across the urban system, fiscal and
administrative decentralization has increased the role of individual cities in attracting
investments and in providing services that are responsive to the needs of local residents.
Brazil is one of the most decentralized developing countries. The 1988 Constitution
established municipalities as the third level of government, and provided states and
municipalities with more revenue raising power and freedom to set tax rates. However,
many local governments have limited administrative and institutional capacity, and have
not been able to effectively use their autonomy to improve service delivery or attract new
investment. A recent study by the World Bank (World Bank 2002) finds that
maximizing urban competitiveness from agglomeration economies and minimizing
congestion costs from negative externalities are key challenges facing national and local
governments in Brazil.
Against this backdrop of rapid population growth and decentralization of
administrative and fiscal responsibilities, it becomes essential to identify what types of
interventions stimulate growth of individual cities. In addition, we want to find out the
consequences of favoring investments in secondary cities on aggregate efficiency and
economic growth. There is an ongoing debate in Brazil's policy circles over whether the largest
agglomerations have become too big, leading to significant negative externalities of
crime, social conflict, and high land costs, and whether policies should be designed to actively
stem the growth of these large agglomerations and favor investments in secondary cities.
It is, however, not clear whether net agglomeration economies in large cities can be offset by
incentives and other measures to divert growth to smaller cities.
In this paper, we consider a model of a city, which consists of a demand side--
what utility levels a city can pay out--and a supply side--what utilities people demand to
live in a city. We estimate aspects of the demand and supply side; and then a reduced
form equation that describes city sizes and their growth. For the empirical analysis, we
construct a dataset of Brazilian agglomerations to examine city growth between 1970 and
2000. Much of the underlying data come from the Brazilian Bureau of Statistics (IBGE)
Population Censuses of 1970, 1980, 1991, and 2000. For the estimation, we make use of
GMM and spatial GMM techniques to correct for endogeneity in the presence of spatially
autocorrelated errors. Our main findings are that increases in rural population supply, and
improvements in inter-regional transport connectivity and education attainment of the
labor force have strong impacts on city growth. Both labor force quality improvements
and base period education attainment matter significantly for growth. In terms of local
characteristics, we find that local crime and violence and a higher representation of public
industrial capital in the city lower city growth rates.
The rest of the paper is organized as follows. Section 2 provides the model and
estimation framework of urban demand and population supply models. The models
presented in this section combine traditional urban modeling with concepts from the new
economic geography literature. In Section 3, we discuss findings from the empirical
analysis and focus our attention on identifying main determinants of city growth. Section
4 provides results from simulations that examine whether investments in secondary cities
stimulate growth. Section 5 concludes.
2. MEASURING CITY GROWTH
In this paper, we examine the local and regional determinants of city growth in
Brazil. Urban growth is represented by both individual city productivity growth and city
population growth, which are different indicators of city "success" and represent two
interconnected dimensions of successful urban growth. However, before we can look at
any individual city's success, we need to understand the broader context, in which the
economy as a whole is changing. Cities from an economic perspective represent the way
modern production is carried out in a country and, as such, reflect what is occurring in the
country as a whole.
Production composition of cities varies by city size, where different types of
goods are best produced in bigger versus smaller cities. If national output composition
changes, altered by changing trade demand or domestic demand that changes with
economic growth, then demand moves away from goods produced in smaller types of
cities and those cities will suffer a setback. Some will falter; others will adjust what they
produce and perhaps upgrade, moving up the urban hierarchy. Which ones adjust well
may depend on "luck", but it may also depend on observable attributes such as education
of the labor force. A better educated labor force may allow for more nimble adjustment
and up-scaling of products produced--what is called the reinvention hypothesis.
Similarly the skill composition of the labor force will vary across cities in systematic
ways, as output composition and skill needs vary. More generally, national productivity
growth comes from productivity growth within cities, which engender the close social-
spatial interactions inherent in innovation, knowledge accumulation and technological
improvements. To understand individual city success, we need to account for the
external, national factors driving urban changes, as well as to understand the sources of
local productivity growth.
At the same time we need to be able to measure when cities are being
"successful" versus less successful and what drives success. Much of success may be
driven by conditions external to the city, as just noted. In addition to demand changes,
changes in national institutions (for example, providing smaller cities with greater
autonomy in local public sector decision making and greater access to fiscal resources)
may make it easier for smaller cities to finance the infrastructure and public sector
services demanded by firms (transport and telecommunications) and by higher skilled
workers (e.g., better schools) and to compete successfully with bigger cities for certain
industries. In terms of city-level conditions, better run cities with more efficient use of
public sector revenues will be more attractive to both firms and migrants. And better run
cities will co-ordinate better with local businesses to help service their needs and make
them more productive. So part of measuring city success is measuring what local
producer and consumer amenities are valued and what cities are better at providing these
amenities.
In related work, Glaeser et al. (1995) examined how urban growth of U.S.
cities between 1960 and 1990 is related to various urban characteristics in 1960, such as
their location, initial population, initial income, past growth, output composition,
unemployment, inequality, racial composition, segregation, size and nature of
government, and the educational attainment of their labor force. They showed that income
and population growth are (1) positively related to initial schooling, (2) negatively
related to initial unemployment, and (3) negatively related to the initial share of
employment in manufacturing. Racial composition and segregation are not correlated
with later city population growth. Government expenditures (except for sanitation) are
also not associated with subsequent growth. However, per capita government debt is
positively correlated with later growth.1
In a long run analysis, Beeson et al. (2001) examine the location and growth of
the U.S. population using county-level census data from 1840 and 1990. They showed that
access to transportation networks, either natural (oceans) or produced (railroads), was an
important source of growth over the period.2 In addition, industry mix (share of
employment in commerce and manufacturing), educational infrastructure, and weather
have promoted population growth.
In a recent paper for developing countries, Au and Henderson (2004) took a
slightly different approach. They modeled and estimated net urban agglomeration
economies for cities in China, which can be postulated by inverted-U shapes of net output
or value-added per worker against city employment. They found that urban agglomeration
benefits are high: real incomes per worker rise sharply with increases in city size from a
low level, level out nearer the peak, and then decline very slowly past the peak. The
inverted-U shifts with industrial composition across the urban hierarchy of cities. Peak
sizes are larger for more service-oriented cities and smaller for manufacturing-intensive
cities. In addition, (domestic) market potential and accumulated FDI per worker have
significant and beneficial effects on city productivity, measured by value-added per
worker. However, the percentage of high school graduates, distances to a major highway and
to navigable rivers, and kilometers of paved road per person have no effects, once market
potential is controlled for.
1They attributed this correlation to higher expected growth, which made it cheaper to borrow, or to
governments investing heavily in infrastructure to serve that growth.
2Transportation network is represented by a group of dummy variables indicating ocean, mountain,
confluence of two rivers, railroads, and canals.
We now describe the model and estimation strategy employed in our analysis.
The data used for the analysis have been produced through a joint research program
between IPEA, Brasilia and the World Bank. Detailed description of the variables and
their sources are provided in Appendix C, and a descriptive overview of Brazilian city
growth is in da Mata et al. (2005). There is no official statistical or administrative entity
in Brazil that reflects the concept of a city or urban agglomeration that is appropriate for
economic analysis. Socioeconomic data in Brazil tend to be available for municípios, the
main administrative level for local policy implementation and management. Municípios,
however, vary in size. In 2000, São Paulo município had a population of more than ten
million, while many other municípios had only a few thousand residents. Furthermore,
many functional agglomerations consist of a number of municípios, and the boundaries of
these units change over time. Our analysis therefore adapts the concepts of
agglomerations from a comprehensive urban study by IPEA, IBGE and UNICAMP
(2002) resulting in a grouping of municípios to form 123 urban agglomerations (Figure
1). Throughout this paper we refer to these units of analysis as agglomerations, urban
areas, or cities.
Model and estimation strategy
The model consists of a demand side--what utility levels a city can pay out--and
a supply side--what utilities people demand to live in a city. We estimate aspects of the
demand and supply side; and then a reduced form equation that describes city sizes and
their growth. In the end the focus is on the last item.
Demand side
The demand side is given by the schedule of utility levels a city can offer workers,
as city size increases. A prime determinant of that is income, I, which consists of wage
income and income from rents and other non-labor sources. In addition, in an indirect
utility function we also have a vector of items, $Q_i$, such as commuting costs, housing
rents, local taxes, and local public services and amenities, so that
$$U_i^D = U(I_i, Q_i) \qquad (1)$$
For wage income there is a wage rate component and then a work effort
component discussed momentarily. The wage rate component comes from value of
marginal productivity relationships, where
$$w_i = w(MP_i, r_i, e_i, N_i) \qquad (2)$$
In (2) r is the rental rate on capital, e is the quality or education level of workers, MP is
market potential reflecting the demand for a city's output and hence the price it receives,
and N is a measure of scale, such as city employment. MP from the new economic
geography and monopolistic competition literature has a specific form with components
we cannot measure. We make two adjustments. First we use "nominal" market potential,
which is simply the distance discounted sum of total incomes of all MCAs in Brazil for
city i, or
$$MP_i = \sum_{j,\, j \neq i} \frac{TI_j}{\tau_{ij}} \qquad (3)$$
$TI$ is total income and $\tau_{ij}$ represents the transport cost between i and j.3 The calculation of
market potential is described in Appendix B, where we use distance as the measure of
transport costs. However, travel times and costs vary by more than distance. For
1968, 1980 and 1995, Brazil has a measure of the transport cost from each city to its state
capital. We divide that variable by distance from the city to the state capital to get a city
specific measure of local transport costs which producers in a city face in selling in the
local region. The variable "inter-city transport costs", $\tau_i$, will be determined by intercity
road infrastructure investment.
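As a concrete illustration of equation (3), the following minimal Python sketch computes nominal market potential as the sum, over all other areas j, of total income divided by the pairwise transport cost. The function name, the toy incomes, and the symmetric cost matrix are hypothetical; the calculation actually used in the paper is the one described in Appendix B and may differ in detail.

```python
import numpy as np


def nominal_market_potential(total_income, transport_cost):
    """Return MP_i = sum over j != i of TI_j / tau_ij for every area i.

    total_income   : length-n sequence of total incomes TI_j (hypothetical data)
    transport_cost : n x n matrix of pairwise costs tau_ij (here: distances);
                     the diagonal is ignored because own income is excluded
    """
    ti = np.asarray(total_income, dtype=float)
    tau = np.asarray(transport_cost, dtype=float)
    n = ti.shape[0]
    if tau.shape != (n, n):
        raise ValueError("transport_cost must be an n x n matrix")

    off_diagonal = ~np.eye(n, dtype=bool)
    if np.any(tau[off_diagonal] <= 0):
        raise ValueError("pairwise transport costs must be positive")

    # Weight 1 / tau_ij for j != i, and 0 on the diagonal.
    weights = np.zeros((n, n))
    weights[off_diagonal] = 1.0 / tau[off_diagonal]
    return weights @ ti


# Toy example with three areas (illustrative numbers only).
incomes = [100.0, 200.0, 300.0]
distances = [
    [0.0, 10.0, 20.0],
    [10.0, 0.0, 5.0],
    [20.0, 5.0, 0.0],
]
market_potential = nominal_market_potential(incomes, distances)
print(market_potential)
```

In the toy example, the second area has the highest market potential because it sits closest to the largest income mass, which is the intuition behind using this variable as a demand shifter.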
The major items from urban theory affecting worker well-being, apart from the
wage rate are rents and commuting costs. Commuting costs are time costs, of which part
will be reflected in lost work time or energy for work, and part in out-of-pocket
commuting costs. So total wage income is a function of both the wage rate and hours and
energy available to work, where the latter will be negatively affected by commuting times.
Housing costs are tricky, since higher housing rents are also reflected in higher non-labor
income earned by landowners.
For demand side estimation, what we know from the data is total income per
worker in each city. We model that as a function of the determinants of the wage rate and
then factors affecting work time/energy and housing rental income. Both are a function of
city size. In sum we estimate:
$$I_i = I^D(MP_i, \tau_i, e_i, N_i) \qquad (4)$$
3The MCAs (Minimum Comparable Areas) are groups of municípios. The detailed description is in
Appendix C.
The scale variable, N, captures three things: scale externality effects on wage
rates, increasing housing rental incomes, and reduced work time/energy. As such its sign
is uncertain--if cities are at a size where the commuting cost aspects of urban living
weigh heavily, at the margin increases in scale could detract from incomes. That will be
the case in our estimation. This is also good for "stability" given that supply curves are
upward sloping: being on the rising part of the "demand curve" can be problematic
and also makes sign interpretations in the city size equation more difficult, as discussed
later.
Population Supply
The population supply relationship we estimate has population supplied to a city
increasing in utility offered per worker, which we approximate by income per worker.
This will tell us the supply elasticity of people to a city. In addition supply is shifted by
attributes, $Z_i$, of the surrounding area--or substitutes of places to work for population in
the area. We have supply to a city of population from nearby rural areas. It is decreasing
in surrounding rural incomes where we use a gravity measure of surrounding rural
incomes, and it is increasing in surrounding rural population supply where again we use a
gravity measure of surrounding rural population. The calculation details are in Appendix
B.
The supply equation is given by
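$$N_i = N^S(U^s(I_i), Z_i), \quad \text{where } \partial N^S/\partial I > 0,\ \partial N^S/\partial Z > 0 \qquad (5)$$
Note the inverse we will use later is
$$I_i = I^S(N_i, Z_i), \quad \text{where } \partial I^S/\partial N > 0,\ \partial I^S/\partial Z < 0. \qquad (6)$$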
City Size Level and Growth Equations
The final estimating equation comes from equating income demand and supply
equations in (4) and (6) and solving for N to get
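$$N_i = N(MP_i, \tau_i, e_i, Z_i), \quad \text{where } \partial N/\partial MP > 0,\ \partial N/\partial \tau < 0,\ \partial N/\partial e > 0,\ \partial N/\partial Z > 0. \qquad (7)$$
Also by differentiating (4) and (6) we can show
$$dN = \frac{-(\partial I^S/\partial Z)\,dZ + (\partial I^D/\partial MP)\,dMP + (\partial I^D/\partial \tau)\,d\tau + (\partial I^D/\partial e)\,de}{\partial I^S/\partial N - \partial I^D/\partial N}. \qquad (8)$$
Note $\partial I^S/\partial Z < 0$. And $\partial I^S/\partial N - \partial I^D/\partial N > 0$ for "stability", where that is helped by the fact
that empirically in Table 2 (discussed momentarily) $\partial I^D/\partial N < 0$.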
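The step from (4) and (6) to (8) can be spelled out as follows. This is only a sketch of the total differentiation, writing $\tau$ for the inter-city transport cost variable and assuming all functions are differentiable; it uses nothing beyond the equilibrium condition that the income a city can pay equals the income people require to live there.

```latex
% Equilibrium: income offered (demand side, eq. 4) equals income required (supply side, eq. 6)
I^D(MP, \tau, e, N) = I^S(N, Z)

% Totally differentiate both sides
\frac{\partial I^D}{\partial MP}\,dMP
+ \frac{\partial I^D}{\partial \tau}\,d\tau
+ \frac{\partial I^D}{\partial e}\,de
+ \frac{\partial I^D}{\partial N}\,dN
= \frac{\partial I^S}{\partial N}\,dN
+ \frac{\partial I^S}{\partial Z}\,dZ

% Collect the dN terms on one side
\left( \frac{\partial I^S}{\partial N} - \frac{\partial I^D}{\partial N} \right) dN
= -\frac{\partial I^S}{\partial Z}\,dZ
+ \frac{\partial I^D}{\partial MP}\,dMP
+ \frac{\partial I^D}{\partial \tau}\,d\tau
+ \frac{\partial I^D}{\partial e}\,de
```

Dividing by the bracketed term, which is positive under the stability condition, gives (8). The sign of each reduced form effect is therefore the sign of the corresponding numerator term.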
3. DETERMINANTS OF GROWTH - DEMAND AND SUPPLY SIDES
Having described the model and estimation strategy in Section 2, we now discuss
the main findings from demand, supply, and city growth models. Results from estimating
the demand side model (equation 4) are presented in Table 2, pooling three years (1980,
1991, and 2000). We focus on the GMM-IV results in column 1, which are from the two-
step efficient GMM in the presence of arbitrary heteroskedasticity and arbitrary within-
state correlation.4 We also give OLS results in column 2. In columns 1 and 2 the scale
measure is total workers in each city. In column 3, population instead of total workers is
used to represent urban scale. The instruments along with statistical test results are listed
in the footnotes. The GMM results of columns 1 and 3 pass specification tests for the
listed variables, and average partial R2's (average partial F's) are .44 and .43 (52.7 and
51.6) respectively, which are relatively strong.5 In column 4, we provide the effects on
outcomes of a one standard deviation increase in covariates. All variables have big
impacts on total income per worker. For average schooling and Ln(market potential), one
standard deviation increases (1.26 and 1.01) increase total income per worker by 37.5%
and 36.5%. Also for Ln(number of workers) and Ln(intercity-transport costs), a reduction
of one standard deviation (-1.13 and -.344) increases total income per worker by 34.4%
and 7.4% respectively. Of course for covariates in log form we already have elasticities.
4The results are almost identical to 2SLS ones. All the GMM estimations in this paper are the two-step
efficient GMM in the presence of arbitrary heteroskedasticity and arbitrary within-state correlation.
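Since the GMM-IV estimates are reported to be almost identical to 2SLS, the logic of instrumenting an endogenous regressor such as city scale can be sketched with a plain two-stage least squares estimator. This is a minimal illustration on synthetic data, not the paper's estimator: it omits the two-step efficient weighting and the within-state correlation adjustment, and all variable names and parameter values are hypothetical.

```python
import numpy as np


def two_stage_least_squares(y, endog, exog, instruments):
    """2SLS coefficients for y = const + exog*b + endog*g + error.

    endog       : regressor(s) correlated with the error
    exog        : regressor(s) assumed uncorrelated with the error
    instruments : excluded instruments for the endogenous regressor(s)
    Returns coefficients ordered as [const, exog..., endog...].
    """
    n = len(y)
    const = np.ones(n)
    X = np.column_stack([const, exog, endog])
    Z = np.column_stack([const, exog, instruments])
    # First stage: fitted values of every regressor from the instrument set.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress y on the first-stage fitted values.
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]


# Synthetic data (hypothetical): the endogenous regressor shares the shock u with y.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)        # excluded instrument
x_exog = rng.normal(size=n)   # exogenous control
u = rng.normal(size=n)        # structural error
x_endog = 0.8 * z + 0.5 * u + rng.normal(size=n)
true_effect = -0.3
y = 1.0 + 0.3 * x_exog + true_effect * x_endog + u

beta_iv = two_stage_least_squares(y, x_endog, x_exog, z)
X_ols = np.column_stack([np.ones(n), x_exog, x_endog])
beta_ols = np.linalg.lstsq(X_ols, y, rcond=None)[0]
print("2SLS:", beta_iv[-1], "OLS:", beta_ols[-1])
```

In this synthetic setup the OLS coefficient on the endogenous regressor is biased toward zero because the regressor and the error share a common shock, while the 2SLS coefficient stays close to the assumed true effect.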
The inter-city transport costs variable is significant although it can be fragile. For
intercity-transport costs we use the 1980 value for years 1980 and 1991; and we use the
1995 value for 2000. We give zero values to Ln(intercity-transport costs) of state capital
cities and add to covariates a dummy variable indicating state capitals. Results for
transport costs to São Paulo are much more fragile and have not been included in the
specifications reported in Table 2.
Finally, note the strong negative scale effects at the margin, suggesting we are on
the downward sloping portion of inverted U's (of income against city size) as we should
be.6 We had no success in estimating a quadratic specification or interacting scale with
the manufacturing to service ratio, to examine interactions between city scale and
industrial composition.
5Partial R2 is a squared partial correlation between the excluded instruments and the endogenous regressor
in question, and the F-test of the excluded instruments corresponds to this partial R2.
6Theory suggests that, under free migration within a country, if particular cities are not at their peak of
inverted U's, they will be to the right of the peak, due to either "stability" conditions in migration-labor
markets or conditions on what constitutes a Nash equilibrium in migration decisions (Au and Henderson,
2004; Duranton and Puga, 2004).
Growth or differenced versions of this equation and the population supply one
have very poor IV results, which is mainly due to a weak instrument problem. For the
growth specifications, we only focus on the final reduced form specification (Table 5).
Results for population supply are provided in Table 3. Again, for the estimation
we pool three years (1980, 1991, and 2000). Columns 1 and 2 give the GMM-IV and then
OLS results. The instruments, listed in the footnote of the table, pass specification tests
and produce strong first-stage regression results. All terms have strong, expected sign
coefficients. In column 1, a 1% increase in a city's total income per capita increases city
population by 2.4%. The gravity measures of surrounding rural population supply and
rural income opportunities have the expected opposite effects with similar magnitudes. A
1% increase in surrounding rural population supply increases city population by 5.9%,
and a 1% increase in surrounding rural income opportunities decreases city population by
5.2%. Thus, city populations are very sensitive to rural population supply and earning
opportunities.
In columns 3-5, we present supply elasticities by year. The coefficients of all
three covariates increase over time, indicating increasing mobility. Population supply to a
city has become more elastic to changes in attributes of the city and nearby rural areas.
However, even in 2000, the elasticity, 2.9, is far from the infinite elasticity that perfect mobility would imply.7
7Under perfect labor mobility, we expect a horizontal population supply curve. All the cities offer the same
utility level, and city sizes are only determined by demand-side factors.
City Size Results
Results for city size from estimating equation (7) are given in Table 4. Column 1
gives GMM-IV results, column 2 OLS, and column 3 the effects of a one standard
deviation increase in covariates on city size. For instruments, we use 1970 values and
time-invariant variables.8 Again the instruments pass specification tests, and show strong
first-stage regression results.
If the reduced form results are indeed from combining demand and supply sides,
we expect the coefficient estimates in Table 4 to be consistent with the imputed values
from the demand side (Table 2) and the supply side (Table 3). The imputed values can be
calculated using (8), such that
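$$c_i = \frac{dN}{dQ} = \frac{\partial I^D/\partial Q}{\partial I^S/\partial N - \partial I^D/\partial N} = \frac{b_i}{1/a_1 - b_4}$$
$$c_j = \frac{dN}{dZ} = \frac{-(\partial I^S/\partial Z)}{\partial I^S/\partial N - \partial I^D/\partial N} = \frac{-a_j}{1/a_1 - b_4}$$
where $c_i$, $c_j$ are reduced form coefficient estimates in Table 4, $b_i$ the demand side of
Table 2, and $a_j$ the supply side of Table 3. The comparison with imputed values, noted in
the footnote, confirms a rough consistency between Tables 2 to 4.9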
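The imputation can be sketched in a few lines of Python. The sketch assumes log-linear demand and supply equations, which is an assumption made here for illustration; under it, a supply shifter enters the reduced form as (a_j / a_1) / (1/a_1 - b_4), so the sign and scaling convention for the supply-side terms may differ from the notation used in the paper. The function name and all coefficient values are placeholders, not the estimates in Tables 2 and 3.

```python
def imputed_reduced_form(demand_coefs, demand_scale_coef,
                         supply_income_elasticity, supply_shifter_coefs):
    """Combine demand- and supply-side coefficients into reduced-form ones.

    Assumed (hypothetical) log-linear forms:
      demand: ln I = sum_k b_k * x_k + b4 * ln N
      supply: ln N = a1 * ln I + sum_j a_j * z_j
    Equating the two and solving for ln N gives
      d ln N / d x_k = b_k / (1/a1 - b4)
      d ln N / d z_j = (a_j / a1) / (1/a1 - b4)
    """
    a1 = supply_income_elasticity
    b4 = demand_scale_coef
    denominator = 1.0 / a1 - b4
    if denominator <= 0:
        raise ValueError("stability requires 1/a1 - b4 > 0")
    implied = {name: b / denominator for name, b in demand_coefs.items()}
    implied.update(
        {name: (a / a1) / denominator for name, a in supply_shifter_coefs.items()}
    )
    return implied


# Placeholder coefficients for illustration only (not the values in Tables 2 and 3).
implied = imputed_reduced_form(
    demand_coefs={"ln_market_potential": 0.4, "ln_transport_cost": -0.2,
                  "avg_schooling": 0.3},
    demand_scale_coef=-0.3,
    supply_income_elasticity=2.0,
    supply_shifter_coefs={"ln_rural_pop_supply": 4.0,
                          "ln_rural_income": -3.0},
)
for name, value in implied.items():
    print(f"{name}: {value:.3f}")
```

A negative scale coefficient on the demand side enlarges the denominator, which dampens every reduced form effect; this is the same mechanism that delivers "stability" in equation (8).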
8The instruments are semi-arid area dummy, port dummy, illiteracy rate (1970), ln(industrial capital per
worker, 1970), ln(distance to state capital)*ln(market pot. agric. land availability, 1970), ln(humidity),
ln(avg. temperature), ln(rural pop. supply, 1970), ln(rural income opportunities, 1970), ln(market potential,
1970), and state capital and time dummies.
9
Variable                         Formula          Imputed [from Tables 2 (3) and 3 (1)]   Table 4 (1)
Ln(market potential)             b1/(1/a1-b4)      0.468                                   2.693
Ln(inter-city trans. costs)      b2/(1/a1-b4)     -0.250                                  -1.395
Average Schooling                b3/(1/a1-b4)      0.381                                   0.220
Ln(rural pop. supply)            -a2/(1/a1-b4)     3.053                                   1.661
Ln(rural income opportunities)   -a3/(1/a1-b4)    -3.468                                  -3.664
Table 4 suggests two things. First, market potential for goods, the rural population
supply, and rural income opportunities have significant effects on city populations with
roughly similar magnitudes. A 1% increase in market potential or in rural population supply
increases city size by 2.7% and 1.7% respectively. In comparison, a 1% decrease in rural
income opportunities would increase city size by 3.7%. Second, intercity-transport costs
and educational attainment (average schooling) are also important, although GMM-IV
results are somewhat fragile.
Growth Results
Next we turn to growth equations, where we difference the reduced form equation
(7). While in principle results should be the same, a differenced equation has three
possible advantages and one drawback. First, a growth formulation allows us to separate
out labor force quality improvements from the effect of education on technology
(knowledge accumulation spillovers). The latter is inferred from the effect on city growth
of base period education levels, in a common specification in the growth literature.
Second, while the levels formulation we estimated passes specification tests, one might
have strong priors that there are time invariant unobservables affecting city size that are
difficult to instrument for; differencing removes these. Third, a growth formulation
allows us conceptually to move beyond the equilibrium static allocation framework used
in the specification to test for growth effects where adjustment processes are involved.
The drawback in differencing equations is that the effects of variables which have small
changes over time may be poorly estimated, given lack of variation in the data.
Table 5-1 shows the GMM-IV and OLS growth results, pooling the 1991-1980 and
2000-1991 differences of equation (7). For instruments, we add to the IV
list of Table 4 ln(distance to São Paulo), ln(transport costs to São Paulo, 1968), and
ln(transport costs to state capital, 1968). All covariates, except changes in rural income
opportunities, have strong and expected sign coefficients. The poor performance of rural
income opportunities is most probably due to the limited variance in the data over time,
as discussed next.
Relative to the levels equation in Table 4, the growth equation coefficients
reported in column 1 are similar for market potential and the change in schooling. However,
results for changes in rural situation variables and transport costs differ in magnitude. For
ln(rural population supply) and ln(rural income opportunities), not only is there little
variation, the two variables are strongly negatively correlated.10 So the high coefficient
on ln(rural population supply) may be picking up some of the effect of ln(rural income
opportunities). For the inter-city transport cost variable, differences over time may be
poorly measured. While we instrument for this variable, the instruments include historical
levels of the same measure, and therefore may be subject to the same measurement
issues. As a result, reductions in inter-city transport costs have a much smaller effect in
the growth estimation. Nevertheless coefficients are consistent in sign with those of the
level equation in Table 4.
In examining the results in Table 5, we focus on column 3. The main difference
between the GMM results in columns 1 and 3 is that we introduce base period population
and manufacturing to service ratios in the latter specification. Controlling for population
allows for dynamic adjustment to steady state levels from the base, and introducing
industrial composition allows for adjustment relative to changes in national output
composition. For results in column 3, the instrument list readily passes the specification
test. First stage regressions for the covariates have average partial R2's and F's of
respectively .52 and 2852, which are strong for differenced covariates. For differenced
intercity-transport costs, we use the difference between 1995 and 1980 for 2000-1991;
and the difference between 1980 and 1968 for 1991-1980.
10The correlation coefficients are -.719 (for 1991-1980) and -.481 (for 2000-1991).
We find that increases in rural population supply and market potential of goods, and labor
force quality improvements (measured by changes in educational attainment), increase the
growth rate of city population. As a new effect, educational attainment in the base period
increases city population growth rates afterwards, confirming spillover effects of
knowledge accumulation. But as noted above, reductions in intercity-transport costs have
a moderate effect on city population growth rate. A 10% decrease in intercity-transport
costs increases city population growth by .9% over a decade. Initial city size has a
negative coefficient, suggesting some conditional convergence in population growth
across cities. Also, cities with high manufacturing ratios in the base period experience
faster growth. We also find that once base period population and industrial composition
are controlled for, state capitals are growing faster than other cities.
In Table 5-2, we introduce two additional local characteristics to the specification
in Table 5-1, column 3. These are (1) the ratio of public industry capital to total industry
capital stock in 1980 (footnote 11) and (2) base period homicide rates. The main difference
between the GMM results in column 3 of Table 5-1 and those in Table 5-2 is that the
statistical significance of the change in market potential drops to 20 percent. Other results
are consistent with those reported in Table 5-1. The GMM results suggest that homicide rates
and a higher share of public industry capital have a detrimental effect on city
growth. For example, a 10% increase in base period homicide rates reduces city growth
by 1.1% over the next decade. The findings on public industrial capital
suggest that public investment in industry tends to crowd out private investment (at least
in the short term), and the potential inefficiency of state enterprises may also deter
economic growth (footnote 12).
11 Total industry capital includes both public and private industry capital stocks. The capital stock data
come from Morandi and Reis (2004). Due to data limitations, we use the capital stock in 1980, which is the
most recent year available.
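The two magnitudes quoted above (the .9% effect of transport costs and the 1.1% effect of homicide rates) can be checked with a short calculation. The sketch below assumes that the dependent variable is the change in ln(population) over a decade and that both covariates enter in logs, so that the effect of a percentage change in a covariate is the coefficient times the change in its log. The function name `growth_effect` is illustrative and not from the paper.

```python
import math

def growth_effect(coef, pct_change):
    """Effect on decade population growth, in percent, of a pct_change
    percent change in a covariate that enters the equation in logs.

    Assumes the dependent variable is the change in ln(population)."""
    return 100.0 * coef * math.log(1.0 + pct_change / 100.0)

# Coefficient on ln(inter-city transport costs), Table 5-1, column 3
transport = growth_effect(-0.089, -10.0)
# Coefficient on ln(homicide / pop) (t-1), Table 5-2, column 1
homicide = growth_effect(-0.115, 10.0)

print(round(transport, 2), round(homicide, 2))
```

Under these assumptions the exact log calculation gives roughly +0.9% and -1.1%, in line with the figures in the text.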
Decomposing City Growth
In Table 6, we decompose the city population growth results of Table 5-1 (3) into
the contributions of each covariate. We focus on the covariates that are statistically
significant. The contribution of each covariate is calculated as its fitted value (the mean
value multiplied by the estimated coefficient) relative to the sum of all the fitted values.
Column 5 shows the overall contributions for all cities. There is a strong negative effect
of city size in the base period (-83.4%). This effect is offset by increases in market
potential (63.8%) and educational attainment (66.7%), along with base period
educational attainment (46.7%), which affects local technology growth.
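The decomposition rule described above can be reproduced from the coefficients and overall means printed in Table 6. The sketch below is only an illustration: the dictionary keys are shorthand labels chosen here, and because the printed means are rounded to three decimals, the shares differ from the published column by a few tenths of a percentage point.

```python
# (coefficient a_i from Table 5-1 column 3, overall mean b_i from Table 6)
covariates = {
    "rural pop. supply":       (9.429, -0.006),
    "market potential":        (1.284, 0.346),
    "schooling (t-1)":         (0.071, 4.568),
    "schooling":               (0.384, 1.208),
    "inter-city transport":    (-0.089, -0.215),
    "state capital dummy":     (0.154, 0.171),
    "ln(population) (t-1)":    (-0.047, 12.339),
    "manu / service (t-1)":    (0.140, 0.406),
}

# Fitted value of each covariate: coefficient times mean
fitted = {name: a * b for name, (a, b) in covariates.items()}

# c is the sum of all fitted values; shares are expressed in percent of c
c = sum(fitted.values())
shares = {name: 100.0 * value / c for name, value in fitted.items()}

for name, share in shares.items():
    print(f"{name:24s} {share:7.1f}")
print(f"c = {c:.3f}")
```

With the rounded inputs, c comes out near 0.698 (Table 6 reports 0.695), and the shares for market potential and base period population are close to the published 63.8% and -83.4%.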
The estimated effects of market potential and technology spillovers support the
new economic geography emphasis on local markets and the endogenous growth
literature emphasis on human capital accumulation. These results are also consistent with
the cross-country findings in Henderson and Wang (2005) (footnote 13). Columns 6 and 7
compare the city growth decompositions of large versus small cities. We find no major
difference in these effects across city size.
12 La Porta and López-de-Silanes (1999) showed that privatization in Mexico in the 1980s and 1990s led to a
significant improvement in firm performance, as profitability increased 24 percentage points and converged
to levels similar to those of private firms.
13 Henderson and Wang (2005) analyze how urbanization in a country is accommodated by increases in
numbers versus population sizes of cities. Using a worldwide dataset on all metro areas over 100,000
population from 1960-2000, they show that market potential, educational attainment, and the degree of
democratization strongly affect growth in both city numbers and individual city sizes.
Robustness Tests: Spatial Dependence
Interaction among cities due to trading and technological linkages is likely to
influence city growth. In the presence of technology spillovers, copycat policy adoption,
and inter-regional transport connectivity, growth in any given city will be related to growth
in other cities in the urban system, and the impact of these spillovers is likely to be higher
among cities that are geographically close to each other. Many of these interactions, however,
are not observed in the data that we have been able to compile, and are thus relegated to
the error term. In the presence of spatial autocorrelation, standard errors from the
city growth estimation are likely to be inaccurate, and the various estimations may be
inefficient.
To address this issue, we test whether the clustered estimation results of Tables 2
to 5-2 are robust to residual spatial dependence. Tests for spatial dependence (Moran's I
and Geary's C) show that there is residual spatial autocorrelation in the error terms. We
therefore employ the GMM methodology of Conley (1999), who
uses weighted averages of spatial autocovariance terms to correct the standard errors of
parameter estimates for possible dependence based on location. This approach is
robust to misspecification of the degree of spatial correlation among the units. In this
nonparametric application, the researcher specifies a cutoff point beyond which spatial
dependence is thought to be unimportant. We use the latitude and longitude of the
agglomeration centroid as coordinate variables. Cutoffs are set to 1.5 standard
deviations of latitude and longitude (10.23 and 8.20), which correspond to 900 miles.
Thus, the weight on the spatial correlation between cities declines linearly with distance
and is zero beyond 1.5 standard deviations of latitude and longitude.
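A weighting scheme of this kind can be sketched as a product of two linearly declining (Bartlett) windows, one per coordinate. The sketch below is an illustration rather than the estimator actually run for the appendix tables: it assumes that 10.23 is the latitude cutoff and 8.20 the longitude cutoff, in degrees, and the function names are hypothetical. The second function shows how such weights enter a weighted sum of cross products of scalar scores, which is the building block of the corrected variance.

```python
def bartlett_weight(coord_i, coord_j, cut_lat=10.23, cut_lon=8.20):
    """Weight for a pair of cities given (latitude, longitude) tuples.

    Declines linearly in each coordinate and is zero at or beyond a cutoff.
    The assignment of 10.23 to latitude and 8.20 to longitude is assumed."""
    d_lat = abs(coord_i[0] - coord_j[0])
    d_lon = abs(coord_i[1] - coord_j[1])
    if d_lat >= cut_lat or d_lon >= cut_lon:
        return 0.0
    return (1.0 - d_lat / cut_lat) * (1.0 - d_lon / cut_lon)

def weighted_score_variance(scores, coords):
    """(1/n) * sum over all pairs of weight_ij * score_i * score_j."""
    n = len(scores)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += bartlett_weight(coords[i], coords[j]) * scores[i] * scores[j]
    return total / n

# Two hypothetical cities far apart: only the own terms contribute.
far = weighted_score_variance([1.0, -2.0], [(0.0, 0.0), (-20.0, -40.0)])
# Two hypothetical cities at the same location: all cross terms count fully.
near = weighted_score_variance([1.0, -2.0], [(0.0, 0.0), (0.0, 0.0)])
print(far, near)
```

When all pairs lie beyond the cutoffs the expression collapses to the usual heteroskedasticity-robust term, which is why the correction matters only where cities are close together.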
Appendix Tables A to D report the two-step spatial GMM and spatial OLS results
which correspond to each specification of Tables 2 to 5-2. In general we find that the
GMM results are robust and the spatial GMM results are very similar to the clustered
ones.
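For reference, the Moran's I statistic used above to test for residual spatial dependence can be computed as follows. This is a minimal sketch with a hypothetical neighbor matrix for four units on a line; the spatial weights actually used for the tests are not reported in this section.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square spatial weight matrix.

    weights[i][j] is the weight between units i and j; diagonal entries
    are ignored."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(weights[i][j] for i, j in pairs)
    cross = sum(weights[i][j] * z[i] * z[j] for i, j in pairs)
    return (n / s0) * (cross / sum(zi * zi for zi in z))

# Hypothetical example: four units on a line, adjacent units are neighbors.
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1.0, 1.0, -1.0, -1.0], line_weights)
alternating = morans_i([1.0, -1.0, 1.0, -1.0], line_weights)
print(clustered, alternating)
```

Residuals that resemble their neighbors give a positive value, and residuals that alternate between neighbors give a negative value.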
Decomposition of City Growth Residuals
We now use the residuals from the GMM estimation in Table 5-2 (1), and
examine whether they have any systematic association with time-invariant local characteristics.
Our main interest is in examining whether local management or governance and inter-industry
linkages are associated with city growth. In principle, an autonomous local government
would actively work to provide local public goods for its constituents, and develop
policies to stimulate growth and manage externalities. For our analysis, we have two
measures of local government efforts: (1) the existence of laws to collect the IPTU (property
tax), and (2) the percentage of the population under land zone laws.
In terms of inter-industry linkages, we expect a clustered or densely populated
region to provide a rich environment for competition and collaboration among firms and
workers in the region, which leads to economic growth. As Saxenian (1994) observed,
regional development is more distinct in a region consisting of many small firms
than in one consisting of a few large firms (footnote 14). A city with a rich set of forward and
backward linkage industries performs better than an enclave, that is, a small pocket of firms.
We measure the density of economic activities by (1) ln(no. firms relative to workers) =
ln(no. formal firms / no. workers in formal firms), and (2) ln(population density).
The basic estimation results from decomposing the residuals of Table 5-2 (1) are
reported in Table 7. The basic structure is that city growth residuals between years (t-1)
and t are affected by city characteristics in year (t-1). However, when data in year (t-1)
are not available, we use the city characteristics in year t, assuming long-lasting
persistence of city characteristics across years. In any case, the estimation results should
be interpreted as associations among contemporaneous variables rather than as causal
relationships.
We find that population growth is higher in cities with better enforcement of land
use and zoning laws: the estimates suggest that city growth is associated with increases
in the percentage of city population under land zone laws (footnote 15). However, we do not
find any statistically significant association between city growth and the existence of laws
to collect the IPTU (property tax). This is most likely because there is almost no variation
in the IPTU collection data: most cities have laws to collect the property tax. A richer set of
inter-industry linkages is also associated with growth: the OLS coefficient for the number of
(formal) firms relative to (formal) workers is statistically significant and has the expected
sign. A higher number of firms relative to workers stimulates competition and
collaboration among firms and workers in a city, and is associated with higher city
growth.
14 Saxenian (1994) examined the different regional economic performances of Silicon Valley in
California and Route 128 in Massachusetts. Dense social networks and an open labor market in Silicon Valley
have facilitated informal communication and collaborative practices, and produced a regional network-based
industrial system. The Route 128 region, in contrast, is dominated by autarkic (self-sufficient)
corporations that internalize a wide range of productive activities. She concluded that this difference in
regional socio-economic structure accounts for the divergent prosperity of the two regional economies, in spite
of their common origins in postwar military spending and university-based research, and even though they
enjoyed roughly the same employment levels in 1975.
15 We obtain a similar result when we use a dummy variable indicating that more than 50% of the population is
under land zone laws.
4. POLICIES FAVORING SECONDARY CITIES
Using the results from the regressions of city growth, let us consider the following
policy experiment. There is considerable policy debate in Brazil over whether investments
should be directed towards secondary cities to stimulate local economic development and limit
the growth of the largest metropolitan areas. However, the impact of such initiatives on
overall economic growth and urban efficiency is unclear.
Suppose the Brazilian government invests in transportation infrastructure in order
to decrease inter-city transport costs. The issue is whether favoring investments in small
cities vis-à-vis large cities increases overall productivity growth, and therefore leads to
higher overall economic growth in Brazil. To make the analysis tractable, we first assume that
the amount of transportation investment needed to reduce inter-city transport costs (per
mile) by one unit is proportional to city population. So a one-unit decrease in inter-city
transport costs for a city of 1 million is assumed to cost the government the same amount as
a one-unit decrease for each of 10 cities of 100,000 people.
In 2000, the largest city, São Paulo, had 17.9 million residents, which is
equivalent to the total population of the 88 smallest cities (Table 8). The total population
of the 7 largest cities is the same as that of the remaining 116 smaller cities (our data
consist of 123 cities). Our assumption implies that the total transportation investment needed
to decrease transport costs by one unit for São Paulo would also reduce transport costs by one
unit for each of the 88 smallest cities, if invested in those cities.
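Under the proportional-cost assumption, matching budgets amounts to matching populations: the comparison group for the n largest cities is the set of smallest cities whose combined population reaches the same total. A minimal sketch of that matching rule follows. The function name and the toy populations are hypothetical, and the paper's own 123-city data are not reproduced here.

```python
def count_matching_smallest(populations, n_largest):
    """Number of smallest cities whose combined population first reaches
    that of the n_largest largest cities, or None if it never does."""
    pops = sorted(populations)
    target = sum(pops[-n_largest:])
    running = 0
    for count, pop in enumerate(pops[:-n_largest], start=1):
        running += pop
        if running >= target:
            return count
    return None

# Hypothetical city populations in thousands
toy_populations = [1000, 100, 100, 200, 200, 400, 500]
n_small = count_matching_smallest(toy_populations, n_largest=1)
print(n_small)
```

In this toy example the five smallest cities together match the population of the largest one, so a given budget buys the same one-unit cost reduction either for the largest city or for each of those five.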
Table 2 (3) describes the determinants of income per worker, in which average
schooling, market potential, city population, and inter-city transport costs affect income
per worker. From this equation, we can calculate the total urban income in Brazil, such that

\[
\text{total urban income} = \sum_{i=1}^{123} \text{income per worker}_i \times \text{no. workers}_i
= \sum_{i=1}^{123} e^{X_i \hat{b}_{GMM}} \times \text{no. workers}_i .
\]
Now suppose the government invests in transportation infrastructure. In Table 8,
we compare the effect on total urban income of investments favoring big cities versus
small cities. The first column is the total urban income relative to the baseline income
when infrastructure investments favor the largest cities, specifically a ½ standard deviation
(.4) decrease in the inter-city transport costs of the largest cities. The baseline income is the
predicted value from Table 2 (3). The second column is the total urban income when the
same amounts are invested in the smallest cities to decrease those cities' transport costs by
the same magnitude (.4). We experiment with several combinations of cities in Table 8.
The simulation results show that there are very small differences in total urban
income from favoring small cities vis-à-vis large cities. These income differences range
from less than 0.1 to about 0.7%p of total urban income in 2000, with most between 0.3 and
0.7%p. The difference is highest when we favor the 104 smallest cities vis-à-vis the two
largest cities (.698%p). These results indicate that there are no major gains in terms of
overall urban income from diverting investments from the largest cities to secondary cities.
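The mechanics of such a comparison can be sketched in a few lines. The example below is a simplified illustration and does not reproduce Table 8: it uses three hypothetical cities, treats the .4 decrease as a change in ln(inter-city transport costs), and applies only the direct effect through the Table 2 (3) coefficient of -0.178, ignoring any induced changes in population or market potential.

```python
import math

COEF_TRANSPORT = -0.178  # ln(inter-city transport costs), Table 2, column 3

def total_urban_income(cities, favored=frozenset(), d_ln_cost=0.0):
    """Sum of income per worker times workers over all cities.

    cities maps a name to (baseline income per worker, number of workers).
    Favored cities have ln(transport costs) changed by d_ln_cost, which
    scales their income per worker by exp(COEF_TRANSPORT * d_ln_cost)."""
    total = 0.0
    for name, (income_per_worker, workers) in cities.items():
        shift = COEF_TRANSPORT * d_ln_cost if name in favored else 0.0
        total += income_per_worker * math.exp(shift) * workers
    return total

# Hypothetical cities: one large city and two small ones of equal total size
toy = {"big": (1000.0, 1000), "small_1": (600.0, 500), "small_2": (600.0, 500)}

baseline = total_urban_income(toy)
favor_big = 100.0 * total_urban_income(toy, {"big"}, -0.4) / baseline
favor_small = 100.0 * total_urban_income(toy, {"small_1", "small_2"}, -0.4) / baseline
print(round(favor_big, 3), round(favor_small, 3))
```

Which option yields the higher total in such an exercise depends entirely on the assumed incomes and city sizes, so the toy numbers say nothing about the ranking in Table 8.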
5. SUMMARY AND CONCLUSIONS
In this paper, we have examined the determinants of Brazilian city growth
between 1970 and 2000. For the analysis, we constructed a dataset of 123
agglomerations, and examined factors that influence wages and labor supply. Our main
findings are the following. (1) Increases in rural population supply are a major driver of
city growth. (2) Inter-regional transport improvements that lead to increases in the market
potential of goods and reduce inter-city transport costs stimulate growth. In fact, we find
that increases in market potential have the strongest impact on city growth. (3)
Improvements in labor force quality and the spillover effects of knowledge accumulation
(measured by initial levels of educational attainment) have strong growth impacts.
In terms of inter regional transport improvements, the Brazilian government has
made significant investments in infrastructure to integrate the national economy and
lower business costs in peripheral regions. Most of the improvements in the road network
occurred between the 1950s and 1980s, leading to significant reduction in transportation
and logistics costs. Castro (2002) measures the benefits of improvements in highway
infrastructure from 1970-1995 as the change in equivalent paved road distance from each
municipality to the state capital of São Paulo, accounting for the construction of the
network as well as the difference in vehicle operating costs between earth/gravel and
paved roads. He shows that transport cost reductions were quite significant for the
Northern region and Central region state of Mato Grosso, with numbers varying from
5,000 to 3,000 equivalent kilometers of paved road. Average reductions fall to the 1,000
km range in the Central region states of Goiás and Mato Grosso do Sul, the southern
states, and the coastal northeastern states. Using this measure, Castro (2002) finds that the
reduction in interregional transport costs was one of the major determinants of both the
expansion of agricultural production to the central regions of Brazil after the 1960s as
well as increases in the country's agricultural productivity.
In terms of city level characteristics, we find that local homicide rates have a
negative impact on city growth rates. In addition, cities with high shares of public
industrial capital also experience slower growth. Thus, there is considerable scope for
local initiatives to reduce the costs imposed by crime and violence, along with local
economic development programs to improve access to finance for small and medium
sized businesses.
Our decompositions of city growth residuals tentatively show that local land use
and zoning enforcement is positively associated with city growth, as is the presence of a
diverse set of inter industry linkages. One of the major limitations in our efforts to
identify the contribution of local characteristics to city growth has been the lack of
longitudinal data, which makes it difficult to establish causal relationships. It would be useful
to get better data on historic land use and zoning regulations, as well as local public
goods, services, and amenities. In further work, we hope to collect additional data on city
level characteristics to better identify their impacts on city growth.
6. REFERENCES
Alesina, A. and D. Rodrik (1994), "Distributive Politics and Economic Growth," The Quarterly
Journal of Economics, 109, 465-490.
Beeson, P., D. DeJong and W. Troesken (2001), "Population Growth in U.S. Counties, 1840-
1990," Regional Science and Urban Economics, 31, 669-699.
Castro, N. (2002), "Transportation Costs and Brazilian Agricultural Production: 1970-1996," Texto
para Discussão - NEMESIS LXVI, Social Science Research Network,
http://ssrn.com/author=243495.
Conley, T. (1999), "GMM Estimation with Cross Sectional Dependence," Journal of
Econometrics, 92, 1-45.
Da Mata, D., U. Deichmann, V. Henderson, S. Lall, and H. Wang (2005). Examining the Growth
Patterns of Brazilian Cities. Mimeo.
Duranton, G. and D. Puga (2004), "Micro-Foundations of Urban Agglomeration Economies," in
J. V. Henderson and J.F. Thisse (eds.) Handbook of Regional and Urban Economics, Vol
4. North-Holland.
Galor, O. and J. Zeira (1993), "Income Distribution and Macroeconomics," Review of Economic
Studies, 60, 35-52.
Glaeser E., J. Scheinkman and A. Shleifer (1995), "Economic Growth in a Cross-Section of
Cities," Journal of Monetary Economics, 36, 117-143.
Henderson, J. V. and H.G. Wang (2005), "Urbanization and City Growth: the Role of
Institutions," Brown University, mimeo.
Henderson, J.V. and C.C. Au (2004), "Are Chinese Cities Too Small?," Brown University,
mimeo.
Hummels, D. (2001), "Toward a Geography of Trade Costs", Purdue University, mimeo.
IPEA, IBGE, and UNICAMP (2002), Configuração Atual e Tendências da Rede Urbana, Série
Configuração Atual e Tendências da Rede Urbana, Instituto de Pesquisa Econômica
Aplicada, Instituto Brasileiro de Geografia e Estatística, Universidade Estadual de
Campinas, Brasilia.
Lemos, M., S. Moro, E. Biazi, and M. Crocco (2003), "A Dinâmica Urbana das Regiões
Metropolitanas Brasileiras," Economia Aplicada, 7, 1:213-244.
Korenman, S. and D. Neumark (2000), "Cohort Crowding and Youth Labor Markets: A Cross-
National Analysis," in D. Blanchflower and R. Freeman, Youth Employment and
Joblessness in Advanced Countries, University of Chicago Press, pp. 57-105.
La Porta, R. and F. López-de-Silanes (1999), "The Benefits of Privatization: Evidence From
Mexico," The Quarterly Journal of Economics, 114, 1193-1242.
Morandi, L. and E. Reis (2004), "Estoque De Capital Fixo No Brasil, 1950-2002," Anais do
XXXII Encontro Nacional de Economia, Proceedings of the 32nd Brazilian Economics
Meeting.
Persson, T. and G. Tabellini (1994), "Is Inequality Harmful for Growth?," American Economic
Review, 84, 600-622.
Saxenian, A. (1994), Regional Advantage: Culture and competition in Silicon Valley and Route
128, Harvard University Press.
United Nations (2003). World Urbanization Prospects.
Weil, D. (2005). Economic Growth, Addison-Wesley.
World Bank (2003). Brazil: Equitable, Competitive and Sustainable - Contributions for Debate.
World Bank, Washington DC.
Source: IPEA, IBGE
Figure 1: Urban Agglomerations by population size
Table 1: City Size Distribution
Population size                  1970          1980          1991          2000
> 5 million                      2             2 (1)         3 (2)         3
2 million - 5 million            1             3             7             7
1 million - 2 million            4             5             5             8
500,000 - 1 million              5             10            15            14
250,000 - 500,000                16            21            23            30
100,000 - 250,000                44            43            44            46
< 100,000                        51            39            26            15
Total number of cities           123           123           123           123
Average size                     350,857       507,242       657,602       788,222
Min                              20,864        41,454        76,816        86,720
Max                              8,139,705     12,588,745    15,444,941    17,878,703
(1) "São Paulo" and "Rio de Janeiro"
(2) "Porto Alegre" is newly added.
Table 2. Demand Side: Determinants of Income Per Worker (notes a, b, c)
(robust standard errors in parentheses)
                                    (1)           (2)           (3)           (4)
                                    GMM-IV        OLS           GMM-IV        Effect of a one-standard-deviation
                                                                              increase in covariate, based on (1)
Average schooling                   0.298***      0.280***      0.271***      0.375
                                    (0.032)       (0.026)       (0.033)
Ln(market potential)                0.363***      0.048**       0.333***      0.365
                                    (0.080)       (0.018)       (0.070)
Ln(no. workers)                     -0.304***     0.005         -0.290***     -0.344
  [ln(population) for (3)]          (0.095)       (0.016)       (0.079)
Ln(inter-city transport costs)      -0.216*       0.016         -0.178*       -0.074
                                    (0.112)       (0.032)       (0.092)
State capital dummy                 0.019         -0.090        0.075
                                    (0.146)       (0.062)       (0.144)
Time dummies                        Yes           Yes           Yes
Observations                        369           369           369
R2                                                0.807
Hansen J statistic
  (overidentification test)         1.593                       1.439
  (p-value)                         (0.661)                     (0.696)
Average of Partial R2               0.435                       0.425
Average of Partial F's              52.67                       51.58
*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. The instruments are semi-arid area dummy, ln(distance to state capital), ln(distance to São
Paulo), manufacturing/service employment ratio (1970), infant mortality (1970), ln(humidity),
average years of schooling (1970), state capital and time dummies.
b. GMM estimates are from the two-step efficient GMM in the presence of arbitrary
heteroskedasticity and arbitrary intra-group (within-state) correlation.
c. OLS regressions are with robust cluster standard errors. We assume the observations may be
correlated within states, but would be independent between states.
Table 3. Population Supply (notes a, b, c)
(robust standard errors in parentheses)
                                                    (1)          (2)          (3)          (4)          (5)
                                                    GMM-IV       OLS          GMM-IV       GMM-IV       GMM-IV
                                                                              (1980)       (1991)       (2000)
Ln(income per capita)                               2.370***     1.813***     1.830***     2.636***     2.886***
                                                    (0.683)      (0.378)      (0.569)      (0.704)      (0.933)
Ln(rural income opportunities: market potential)    -5.151***    -4.152***    -4.821***    -5.316***    -5.624***
                                                    (1.454)      (0.819)      (1.457)      (1.354)      (1.824)
Ln(rural pop. supply market potential)              5.851***     4.878***     5.559***     5.978***     6.317***
                                                    (1.368)      (0.752)      (1.378)      (1.281)      (1.705)
Time dummies                                        Yes          Yes          No           No           No
R2                                                               0.745
Hansen J statistic
  (overidentification test)                         1.909                     1.297        1.148        1.655
  (p-value)                                         (.591)                    (0.730)      (0.765)      (0.647)
Average of Partial R2                               0.657                     0.691        0.644        0.662
Average of Partial F's                              55.50                     34.41        37.48        64.29
*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. The instruments are semi-arid area dummy, ln(distance to São Paulo), ln(market pot. agric. land
availability, 1970), port dummy, ln(per capita capital stock, 1970), southern region and time
dummies.
b. GMM estimates are from the two-step efficient GMM in the presence of arbitrary
heteroskedasticity and arbitrary intra-group (within-state) correlation.
c. OLS regressions are with robust cluster standard errors. We assume the observations may be
correlated within states, but would be independent between states.
Table 4. City Size Equations (notes a, b, c, d)
(robust standard errors in parentheses)
                                    (1)           (2)           (3)
                                    GMM-IV        OLS           Effect of a one-standard-deviation
                                                                increase in covariate, based on (1)
Ln(rural pop. supply)               1.661***      1.216***      1.558
                                    (0.643)       (0.425)
Ln(rural income opportunities)      -3.664***     -1.999***     -3.701
                                    (0.894)       (0.600)
Ln(market potential)                2.693***      1.426**       2.720
                                    (0.916)       (0.586)
Average schooling                   0.220**       0.231**       0.277
                                    (0.091)       (0.106)
Ln(inter-city transport costs)      -1.395***     0.081         -0.480
                                    (0.337)       (0.110)
State capital dummy                 -0.260        1.091***
                                    (0.395)       (0.170)
Time dummies                        Yes           Yes
Observations                        369           369
R2                                                0.801
Hansen J statistic
  (overidentification test)         1.770
  (p-value)                         (.880)
Average of Partial R2               .477
Average of Partial F's              129.47
*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. The instruments are semi-arid area dummy, port dummy, illiteracy rate (1970), ln(industry
capital per worker, 1970), ln(distance to state capital)*ln(market pot. agric. land availability,
1970), ln(humidity), ln(avg. temperature), ln(rural pop. supply, 1970), ln(rural income
opportunities, 1970), ln(market potential, 1970), and state capital and time dummies.
b. GMM estimates are from the two-step efficient GMM in the presence of arbitrary
heteroskedasticity and arbitrary intra-group (within-state) correlation.
c. OLS regressions are with robust cluster standard errors. We assume the observations may be
correlated within states, but would be independent between states.
d. Average of Partial R2 and Partial F's are for average schooling and Ln(inter-city transport
costs). Market potential and gravity measures are almost completely correlated with those in
1970 (Partial R2's are around .99).
Table 5-1. City Size Growth Equation (notes a, b, c)
(robust standard errors in parentheses)
                                                    (1)          (2)          (3)          (4)
                                                    GMM-IV       OLS          GMM-IV       OLS
Ln(rural pop. supply market potential)              9.188***     3.216***     9.429***     3.064***
                                                    (2.309)      (0.892)      (2.410)      (0.631)
Ln(rural income opportunities: market potential)    0.756        0.364        0.358        0.198
                                                    (0.883)      (0.517)      (0.728)      (0.317)
Ln(market potential)                                2.294***     2.860***     1.284**      2.738***
                                                    (0.761)      (0.798)      (0.512)      (0.551)
Average schooling (t-1)                             0.078***     0.021        0.071***     0.021
                                                    (0.021)      (0.014)      (0.013)      (0.012)
Average schooling                                   0.275*       0.067*       0.384***     0.097***
                                                    (0.141)      (0.033)      (0.104)      (0.033)
Ln(inter-city transport costs)                      -0.078**     -0.092**     -0.089***    -0.088**
                                                    (0.035)      (0.037)      (0.026)      (0.037)
State capital dummy                                 0.016        0.080***     0.154***     0.129***
                                                    (0.036)      (0.024)      (0.035)      (0.037)
Ln(population) (t-1)                                                          -0.047***    -0.018*
                                                                              (0.009)      (0.010)
Manu / service (t-1)                                                          0.140***     0.096***
                                                                              (0.027)      (0.019)
Time dummies                                        Yes          Yes          Yes          Yes
Observations                                        246          246          246          246
R2                                                               0.364                     0.403
Hansen J statistic
  (overidentification test)                         5.786                     8.204
  (p-value)                                         (.565)                    (.514)
Average of Partial R2                               .412                      .526
Average of Partial F's                              395.70                    2852.4
*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. For (1), instruments are the IV list of Table 4, ln(distance to São Paulo), ln(transport costs to São
Paulo, 1968), and ln(transport costs to state capital, 1968). For (3), we drop ln(industry capital per
worker, 1970) from (1), and add ln(population, 1970), manu/service ratio (1970), manu/service
ratio(1970)*ln(population, 1970), manu/service ratio(1970)*ln(income per capita, 1970), and
manu/service ratio(1970)*ln(market potential, 1970).
b. GMM estimates are from the two-step efficient GMM in the presence of arbitrary
heteroskedasticity and arbitrary intra-group (within-state) correlation.
c. OLS regressions are with robust cluster standard errors. We assume the observations may be
correlated within states, but would be independent between states.
Table 5-2. City Size Growth Equation (continued) (notes a, b, c)
(robust standard errors in parentheses)
                                                        (1)          (2)
                                                        GMM-IV       OLS
Ln(rural pop. supply market potential)                  5.727**      3.227***
                                                        (2.488)      (0.684)
Ln(rural income opportunities: market potential)        -0.534       0.229
                                                        (0.917)      (0.359)
Ln(market potential)                                    1.546        2.127***
                                                        (1.257)      (0.355)
Average schooling (t-1)                                 0.064***     0.035***
                                                        (0.016)      (0.011)
Average schooling                                       0.323**      0.093**
                                                        (0.138)      (0.034)
Ln(inter-city transport costs)                          -0.082*      -0.059
                                                        (0.043)      (0.036)
State capital dummy                                     0.139***     0.113***
                                                        (0.036)      (0.030)
Ln(population) (t-1)                                    -0.044***    -0.023**
                                                        (0.008)      (0.008)
Manu / service (t-1)                                    0.067**      0.066**
                                                        (0.032)      (0.027)
Ln(homicide / pop) (t-1)                                -0.115***    -0.092***
                                                        (0.033)      (0.025)
Public industry capital / total industry capital        -0.764**     -0.780
  in 1980                                               (0.298)      (0.502)
Time dummies                                            Yes          Yes
Observations                                            245          245
R2                                                                   0.469
Hansen J statistic
  (overidentification test)                             5.549
  (p-value)                                             (.698)
Average of Partial R2                                   .498
Average of Partial F's                                  3014.5
*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. Public industry capital / total industry capital (1980) is assumed to be exogenous by adding it to
the IV list of (3).
b. GMM estimates are from the two-step efficient GMM in the presence of arbitrary
heteroskedasticity and arbitrary intra-group (within-state) correlation.
c. OLS regressions are with robust cluster standard errors. We assume the observations may be
correlated within states, but would be independent between states.
Table 6. Decomposition of City Size Growth
                                          Coef. of      Mean, b_i (note a)                    Decomposition of city growth
                                          Table 5-1                                           (a_i × b_i / c), %
                                          (3), a_i      Total     Large       Small           Total     Large       Small
                                                                  cities (b)  cities (b)                cities (b)  cities (b)
No. cities                                              123       61          62
Ln(city pop)                                            0.226     0.264       0.188
Ln(rural pop. supply market potential)    9.429         -0.006    -0.005      -0.008          -8.5      -6.5        -10.6
Ln(market potential)                      1.284         0.346     0.346       0.345           63.8      62.2        65.5
Average schooling (t-1)                   0.071         4.568     4.773       4.366           46.7      47.4        45.9
Average schooling                         0.384         1.208     1.215       1.201           66.7      65.3        68.2
Ln(inter-city transport costs)            -0.089        -0.215    -0.191      -0.239          2.8       2.4         3.1
State capital dummy                       0.154         0.171     0.344       0.000           3.8       7.4         0.0
Ln(population) (t-1)                      -0.047        12.339    13.172      11.520          -83.4     -86.6       -80.1
Manu / service (t-1)                      0.140         0.406     0.428       0.385           8.2       8.4         8.0
Sum                                                                                           100.0     100.0       100.0
c = Σ_i a_i × b_i: 0.695 (total), 0.715 (large cities), 0.676 (small cities)
a. Means are for 2000-1991 and 1991-1980. For average schooling (t-1), it is for 1991 and 1980.
b. We define large (small) cities as those with greater (less) than median city population in each year.
Table 7. Regression of City Growth Residuals (notes a, b)
(robust standard errors in parentheses)
                                                        (1)
                                                        OLS
Laws to collect property tax                            0.035
                                                        (0.042)
% of pop under land zone law                            0.050***
                                                        (0.014)
Ln(no. formal firms / no. workers in formal firms)      0.046*
                                                        (0.024)
Ln(pop density)                                         0.001
                                                        (0.007)
Small city dummy                                        -0.044***
                                                        (0.015)
Time dummies                                            Yes
Observations                                            245
R2                                                      0.093
*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. Small city dummy has a value 1 if a city has less than median city population in each year.
b. OLS regressions are with robust cluster standard errors. We assume the observations may be
correlated within states, but would be independent between states.
Table 8. Policy Simulation: favoring largest cities versus smallest ones
(½ standard deviation (.4) decrease in inter-city transport costs in 2000)
                                Total urban income relative to the baseline income (%)
Comparison                      Favoring largest        Favoring smallest       Difference
                                cities (a)              cities (b)              (b-a, %p)
1 largest vs. 88 smallest       102.072                 102.763                 0.691
2 largest vs. 104 smallest      103.761                 104.458                 0.698
3 largest vs. 109 smallest      105.227                 105.550                 0.323
4 largest vs. 112 smallest      106.072                 106.413                 0.341
5 largest vs. 113 smallest      106.651                 106.715                 0.064
6 largest vs. 115 smallest      107.020                 107.517                 0.497
7 largest vs. 116 smallest      107.679                 108.033                 0.354
Appendix A. Means and Standard Deviations of Variables (N = 369, 123 cities for 3 years)
Variable                                                        Mean      Standard deviation
Ln(income per worker)                                           6.53      .279
Average schooling                                               5.13      1.26
Ln(market potential)                                            27.3      1.01
Ln(inter-city trans. costs: 1980, excluding state capitals)     .857      .344
Ln(no. workers)                                                 11.5      1.13
Ln(population)                                                  12.4      1.12
Ln(rural pop. supply market potential)                          20.2      .938
Ln(rural income opportunities: market potential)                12.4      1.01
Appendix B. Market potential measures
(1) Basic Market Potential
Market potential of agglomeration i is defined as the sum of its member MCAs' market
potential. Therefore the market potential of agglomeration i in year t is

\[
\sum_{k \in i} \sum_{j=1}^{3659} \frac{y_j(t) \times pop_j(t)}{\left(A\, d_{k_i,j}^{\delta}\right)^{\sigma-1}} ,
\]

where $y_j(t)$ is per capita income of MCA j in year t, and $pop_j(t)$ is the population of MCA j in
year t. $d_{i,j}$ is the distance between MCA i and j (in units of 100 miles). The distance of own
MCA, $d_{i,i}$, is the average distance to the city center, which is equal to
$\frac{2}{3}\sqrt{area/\pi}$. $\sigma$ is assumed to be 2, $\delta$ is 0.3 (0.22
between two port cities), and A is such that $A\, d_{i,j}^{0.3} = 1$ for the smallest land area
city (Au and Henderson, 2004; Hummels, 2001).
(2) Incomes offered in local rural areas competing with own city for local population
The gravity measure of surrounding rural per capita incomes is a market potential measure of
agglomeration i in year t, such that

\[
\sum_{k \in i} \sum_{\substack{j=1 \\ j \notin i}}^{3659} \frac{\text{rural GDP}_j(t) / \text{rural pop}_j(t)}{\left(A\, d_{k_i,j}^{\delta}\right)^{\sigma-1}} .
\]

The MP calculation does not include the rural per capita MCA incomes of the same
agglomeration. All parameters are the same as in (1). Rural GDPs of 1970, 1980, 1985, and 1996
are assigned to 1970, 1980, 1991, and 2000, respectively.
(3) Potential supply of people to the city from local rural areas
The gravity measure of surrounding rural population is also a market potential measure of
agglomeration i in year t, such that

\[
\sum_{k \in i} \sum_{\substack{j=1 \\ j \notin i}}^{3659} \frac{\text{rural pop}_j(t)}{\left(A\, d_{k_i,j}^{\delta}\right)^{\sigma-1}} .
\]

The MP calculation is the same as in (2).
(4) Market potential measure of agricultural land availability
The agricultural land market potential is calculated in the same way as (1), such that

\[
\sum_{k \in i} \sum_{j=1}^{3659} \frac{\text{agri land}_j(t)}{\left(A\, d_{k_i,j}^{\delta}\right)^{\sigma-1}} ,
\]

where $\text{agri land}_j(t)$ is the agricultural area of MCA j in year t. All parameters are the
same as in the previous measures.
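The basic market potential measure in (1) can be sketched as a double sum over member MCAs and all MCAs. The code below is a minimal illustration under stated assumptions: the two MCAs, their incomes, populations and areas are hypothetical; areas are taken to be in squared distance units so that the own distance is comparable with the inter-MCA distances; the exponents are written as `delta` (0.3) and `sigma` (2); and the lower distance exponent of 0.22 for pairs of port cities is not modeled.

```python
import math

def own_distance(area):
    """Average distance to the center of a circular area: (2/3) * radius."""
    return (2.0 / 3.0) * math.sqrt(area / math.pi)

def market_potential(members, mcas, distance, A, delta=0.3, sigma=2.0):
    """Market potential of an agglomeration made up of the MCAs in members.

    mcas maps an MCA id to (per capita income, population, area);
    distance(k, j) returns the distance between two different MCAs."""
    total = 0.0
    for k in members:
        for j, (income, population, area) in mcas.items():
            d = own_distance(area) if j == k else distance(k, j)
            total += income * population / (A * d ** delta) ** (sigma - 1.0)
    return total

# Hypothetical data: two MCAs of equal area with an own distance of 1 unit
area = math.pi * 1.5 ** 2
toy_mcas = {1: (2.0, 100.0, area), 2: (1.0, 50.0, area)}

# Normalize A so that A * d**delta = 1 for the smallest land area MCA
smallest_area = min(a for _, _, a in toy_mcas.values())
A = 1.0 / own_distance(smallest_area) ** 0.3

mp = market_potential([1], toy_mcas, lambda k, j: 8.0, A)
print(round(mp, 3))
```

The rural income, rural population and agricultural land measures in (2) to (4) follow the same pattern with a different numerator, and (2) and (3) additionally skip MCAs that belong to the same agglomeration.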
Appendix C. Data sources and definitions
There is no official definition of "city" or "agglomeration" in Brazil. The lowest administrative
level consists of more than 5000 municípios. However, these vary greatly in size and many
functional economic and population agglomerations consist of a number of municípios. In this
paper, we therefore follow the example of a study of Brazilian urban dynamics by IPEA, IBGE
and UNICAMP (2002). It defined agglomerations based on their place in the urban hierarchy
from "World Cities" (São Paulo and Rio de Janeiro) to subregional centers. For each
agglomeration, this study identified the municípios that were a functional part of the urban area.
The municípios belonging to each agglomeration were then further classified into eight categories
according to how tightly they are integrated in the agglomeration, from "maximum" to "very
weak". The main criteria used in these classifications were centrality, function as a center of
decision making, degree of urbanization, complexity and diversification of the urban areas, and
diversification of services. These were measured by a range of census and other variables such as
employed population in urban activities, urbanization rate, and population density. We modified
this classification slightly by also adding smaller municípios to existing agglomerations if their
population exceeded 75,000 and more than 75 percent of their residents lived in urban areas in
1991, or if they were completely enclosed by an agglomeration.
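The inclusion rule for additional municípios can be written as a simple predicate. This is only a sketch of the stated criteria. The function and argument names are hypothetical, and it assumes that the population threshold refers to the same year (1991) as the urban share.

```python
def qualifies_for_inclusion(population, urban_share_1991, enclosed_by_agglomeration):
    """Return True if a município meets either stated criterion.

    urban_share_1991 is a fraction in [0, 1]. The population year is an assumption.
    """
    large_and_urban = population > 75_000 and urban_share_1991 > 0.75
    return large_and_urban or enclosed_by_agglomeration


print(qualifies_for_inclusion(80_000, 0.80, False))
print(qualifies_for_inclusion(10_000, 0.20, True))
```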
The agglomeration definitions developed by IPEA, IBGE and UNICAMP (2002) are based on
município boundaries valid at the time of the Brazilian Population Census of 1991 and the
Population Count of 1996, while our study captures dynamics from 1970 to 2000. During this
time, many new municípios were created by splitting or re-arranging existing ones. In fact, the
number of municípios increased from 3951 to 5501 during these three decades. To create a
consistent panel of agglomerations for the 1970 to 2000 period, we therefore used the Minimum
Comparable Area (MCA) concept as implemented by IPEA researchers. MCAs group municípios
in each of the four census years so that their boundaries do not change during the study period.
All data have then been aggregated to match these MCAs. The resulting data set represents 123
urban agglomerations that consist of a total of 447 MCAs.
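The two-step aggregation described above (municípios to MCAs, then MCAs to agglomerations) can be sketched as follows. The lookup tables, codes, and population figures are hypothetical. The actual MCA correspondence is the one implemented by IPEA researchers and is not reproduced here.

```python
from collections import defaultdict


def aggregate(values, mapping):
    """Sum values keyed by unit into totals keyed by mapping[unit]; unmapped units are skipped."""
    totals = defaultdict(float)
    for unit, value in values.items():
        if unit in mapping:
            totals[mapping[unit]] += value
    return dict(totals)


# Hypothetical codes: município "m3" was split from "m1" after 1970, so both map to one MCA.
municipio_to_mca = {"m1": "mca1", "m3": "mca1", "m2": "mca2", "m4": "mca3"}
# Hypothetical membership: "mca3" does not belong to any agglomeration.
mca_to_agglomeration = {"mca1": "aggl_A", "mca2": "aggl_A"}

pop_2000 = {"m1": 120.0, "m2": 80.0, "m3": 30.0, "m4": 15.0}
pop_by_mca = aggregate(pop_2000, municipio_to_mca)
pop_by_agglomeration = aggregate(pop_by_mca, mca_to_agglomeration)
print(pop_by_mca, pop_by_agglomeration)
```

Because the município-to-MCA mapping is defined separately for each census year, the same MCA codes can be used to build a panel with fixed boundaries.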
The sources for the majority of data employed in this paper are the Brazilian Bureau of Statistics
(IBGE) Population and Housing Censuses of 1970, 1980, 1991 and 2000. We used the full
Brazilian census counts to get information about total population and housing conditions
(urbanization rate). Other data were collected only for a sample of households. We used this
census sample information for income, industrial composition, education, piped water provision,
and electricity availability. The sample sizes varied across census years (1970: 25 percent; 1980:
25 percent; 1991: 12.5 percent; 2000: 5 percent), but all are representative at the município
level, and thus are also reliable at the MCA level employed in this study. Income figures are
compiled from monthly data, deflated to 2000 Brazilian reais (R$).
The transportation costs (a proxy for transportation connectivity) between each Brazilian
municipality and the nearest State capital, and between each Brazilian municipality and São
Paulo, come from Professor Newton De Castro at the Federal University of Rio De Janeiro and are
available at www.ipeadata.gov.br.
The port and Brazilian region dummies are from the Bureau of Statistics (IBGE) Municipalities
Profile of 1999. Homicide data are from the DATASUS dataset of the Brazilian Ministry of Health.
Local government expenditures are from the Brazilian Treasury dataset of 1991 and 2000. Formal
employment data are from the RAIS dataset of the Brazilian Ministry of Labor. The Morandi and
Reis (2004) capital stock data employed in our analysis come from the Brazilian Economic
Censuses of 1970, 1975 and 1980.
Appendix D. Robustness test for spatial dependence
Table A. Demand Side: Determinants of Income Per Worker (notes a, b)
(standard errors corrected for spatial dependence in parentheses)

                                  (1)           (2)           (3)
                                  Spatial GMM   Spatial OLS   Spatial GMM
Average Schooling                 0.286***      0.280***      0.260***
                                  (0.032)       (0.023)       (0.030)
Ln(market potential)              0.404***      0.048***      0.371***
                                  (0.083)       (0.016)       (0.069)
Ln(no. workers)                   -0.318***     0.005         -0.304***
  [ln(population) for (3)]        (0.113)       (0.018)       (0.092)
Ln(inter-city transport costs)    -0.246**      0.016         -0.218**
                                  (0.122)       (0.024)       (0.102)
state capital dummy               -0.010        -0.090**      0.041
                                  (0.157)       (0.039)       (0.143)
time dummies                      Yes           Yes           Yes
Observations                      369           369           369
Hansen J statistic                0.884                       0.901
  (overidentification test)

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. The instruments are semi-arid area dummy, ln(distance to state capital), ln(distance to São Paulo),
manufacturing/service employment ratio (1970), infant mortality (1970), ln(humidity), average
years of schooling (1970), state capital and time dummies.
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and
longitude (10.23, and 8.20), which correspond to about 900 miles.
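Note b gives the cutoffs but not the weighting scheme used inside them. One common choice in spatial-dependence corrections is a product of triangular (Bartlett) windows in each coordinate, so that pairs of cities receive a weight that declines linearly to zero at the cutoff. The sketch below assumes that kernel purely for illustration. The function name is hypothetical, and the default cutoffs are the values reported in note b (10.23 for latitude, 8.20 for longitude).

```python
def bartlett_weight(lat_i, lon_i, lat_j, lon_j, cut_lat=10.23, cut_lon=8.20):
    """Product of triangular windows in latitude and longitude; zero beyond either cutoff."""
    dlat = abs(lat_i - lat_j)
    dlon = abs(lon_i - lon_j)
    if dlat >= cut_lat or dlon >= cut_lon:
        return 0.0
    return (1.0 - dlat / cut_lat) * (1.0 - dlon / cut_lon)


# Hypothetical coordinates in degrees.
print(bartlett_weight(-23.5, -46.6, -22.9, -43.2))
```

Under this assumed kernel, a pair of observations contributes to the covariance estimate only if the two cities are within both cutoffs, and closer pairs receive more weight.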
Table B. Population Supply (notes a, b)
(standard errors corrected for spatial dependence in parentheses)

                                  (1)           (2)           (3)           (4)           (5)
                                  Spatial GMM   Spatial OLS   Spatial GMM   Spatial GMM   Spatial GMM
Ln(income per capita)             2.539***      1.813***      1.846***      2.771***      3.072***
                                  (0.624)       (0.359)       (0.476)       (0.613)       (0.879)
Ln(rural income opportunities:    -5.536***     -4.152***     -4.873***     -5.638***     -6.040***
  market potential)               (1.445)       (0.830)       (1.285)       (1.334)       (1.849)
Ln(rural pop. supply market       6.231***      4.878***      5.615***      6.313***      6.719***
  potential)                      (1.376)       (0.788)       (1.223)       (1.276)       (1.755)
time dummies                      Yes           Yes           No            No            No
Observations                      369           369           123           123           123
Hansen J statistic                1.355                       1.014         1.463         1.684
  (overidentification test)

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. The instruments are semi-arid area dummy, ln(distance to São Paulo), ln(market pot. agric. land
availability, 1970), port dummy, ln(per capita capital stock, 1970), southern region and time
dummies.
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and
longitude (10.23, and 8.20), which correspond to about 900 miles.
Table C. City Size Equations (notes a, b)
(standard errors corrected for spatial dependence in parentheses)

                                  (1)           (2)
                                  Spatial GMM   Spatial OLS
Ln(rural pop. supply)             1.706***      1.216***
                                  (0.635)       (0.386)
Ln(rural income opportunities)    -3.317***     -1.999***
                                  (0.864)       (0.462)
Ln(market potential)              2.322***      1.426***
                                  (0.660)       (0.468)
Average Schooling                 0.181*        0.231**
                                  (0.099)       (0.112)
Ln(inter-city transport costs)    -1.346***     0.081
                                  (0.280)       (0.083)
State capital dummy               -0.211        1.091***
                                  (0.330)       (0.187)
time dummies                      Yes           Yes
Observations                      369           369
Hansen J statistic                1.659
  (overidentification test)

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. The instruments are semi-arid area dummy, port dummy, illiteracy rate (1970), ln(industry capital
per worker, 1970), ln(distance to state capital)*ln(market pot. agric. land availability, 1970),
ln(humidity), ln(avg. temperature), ln(rural pop. supply, 1970), ln(rural income opportunities,
1970), ln(market potential, 1970), and state capital and time dummies.
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and
longitude (10.23, and 8.20), which correspond to about 900 miles.
Table D-1. City Size Growth Equation (notes a, b)
(standard errors corrected for spatial dependence in parentheses)

                                  (1)           (2)           (3)           (4)
                                  Spatial GMM   Spatial OLS   Spatial GMM   Spatial OLS
Ln(rural pop. supply market       8.894***      3.216***      5.590***      3.064***
  potential)                      (2.078)       (0.703)       (1.790)       (0.639)
Ln(rural income opportunities:    2.300         0.364         -0.700        0.198
  market potential)               (1.834)       (0.389)       (0.738)       (0.271)
Ln(market potential)              1.837         2.860***      3.956***      2.738***
                                  (1.266)       (0.674)       (0.953)       (0.606)
Average schooling (t-1)           0.036         0.021         0.063***      0.021*
                                  (0.027)       (0.013)       (0.016)       (0.012)
Average schooling                 0.115         0.067**       0.604***      0.097***
                                  (0.117)       (0.031)       (0.116)       (0.026)
Ln(inter-city transport costs)    -0.121***     -0.092***     -0.132**      -0.088***
                                  (0.044)       (0.027)       (0.051)       (0.025)
state capital dummy               0.080**       0.080***      0.220***      0.129***
                                  (0.033)       (0.026)       (0.037)       (0.033)
Ln(population) (t-1)                                          -0.057***     -0.018*
                                                              (0.009)       (0.010)
Manu / service (t-1)                                          0.190***      0.096***
                                                              (0.033)       (0.018)
time dummies                      Yes           Yes           Yes           Yes
Observations                      246           246           246           246
Hansen J statistic                3.582                       5.381
  (overidentification test)

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. For (1), instruments are the IV list of Table 4, ln(distance to São Paulo), ln(transport costs to São
Paulo, 1968), and ln(transport costs to state capital, 1968). For (3), we drop ln(industry capital per
worker, 1970) from (1), and add ln(population, 1970), manu/service ratio (1970), manu/service
ratio(1970)*ln(population, 1970), manu/service ratio(1970)*ln(income per capita, 1970), and
manu/service ratio(1970)*ln(market potential, 1970).
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and
longitude (10.23, and 8.20), which correspond to about 900 miles.
Table D-2. City Size Growth Equation (continued) (notes a, b)
(standard errors corrected for spatial dependence in parentheses)

                                  (5)           (6)
                                  Spatial GMM   Spatial OLS
Ln(rural pop. supply market       5.815***      3.227***
  potential)                      (1.779)       (0.655)
Ln(rural income opportunities:    -0.632        0.229
  market potential)               (0.720)       (0.244)
Ln(market potential)              1.257         2.127***
                                  (0.890)       (0.480)
Average schooling (t-1)           0.066***      0.035***
                                  (0.016)       (0.010)
Average schooling                 0.489***      0.093***
                                  (0.092)       (0.024)
Ln(inter-city transport costs)    -0.107**      -0.059**
                                  (0.047)       (0.025)
state capital dummy               0.183***      0.113***
                                  (0.038)       (0.025)
Ln(population) (t-1)              -0.056***     -0.023***
                                  (0.008)       (0.009)
Manu / service (t-1)              0.131***      0.066***
                                  (0.031)       (0.022)
Ln(homicide / pop) (t-1)          -0.105***     -0.092***
                                  (0.031)       (0.023)
Public industry capital /         0.006         -0.780*
  total industry capital in 1980  (0.385)       (0.425)
time dummies                      Yes           Yes
Observations                      245           245
Hansen J statistic                3.945
  (overidentification test)

*** significant at 1% level; ** significant at 5% level; * significant at 10% level.
a. Public industry capital / total industry capital (1980) is assumed to be exogenous by adding it to
the IV list of (3).
b. Coordinate variables are latitude and longitude. Cutoffs are 1.5 standard deviations of latitude and
longitude (10.23, and 8.20), which correspond to about 900 miles.