WPS5878
Policy Research Working Paper 5878
Average and Marginal Returns to Upper
Secondary Schooling in Indonesia
Pedro Carneiro
Michael Lokshin
Cristobal Ridao-Cano
Nithin Umapathi
The World Bank
Development Research Group
&
East Asia Pacific Region
Social Protection Unit
November 2011
Abstract
This paper estimates average and marginal returns to schooling in Indonesia using a
non-parametric selection model estimated by local instrumental variables, and data from
the Indonesia Family Life Survey. The analysis finds that the return to upper secondary
schooling varies widely across individuals: it can be as high as 50 percent per year of
schooling for those very likely to enroll in upper secondary schooling, or as low as −10
percent for those very unlikely to do so. Returns to the marginal student (14 percent) are
well below those for the average student attending upper secondary schooling (27 percent).
This paper is a product of the Development Research Group; and East Asia and Pacific Region, Social Protection Unit. It
is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development
policy discussions around the world. Policy Research Working Papers are also posted on the Web at
http://econ.worldbank.org. The author may be contacted at numapathi@worldbank.org.
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Produced by the Research Support Team
Average and Marginal Returns to Upper Secondary Schooling
in Indonesia
Pedro Carneiro†
Michael Lokshin
Cristobal Ridao-Cano
Nithin Umapathi
JEL Code: J31
Key words: Returns to Schooling, Marginal Return, Average Return, Marginal
Treatment Effect
__________________
† Pedro Carneiro is affiliated with University College London, IFS, Cemmap and Georgetown University;
Nithin Umapathi is an Economist with the Social Protection sector in East Asia and Pacific; Michael Lokshin
is an Advisor, DECRG; and Cristobal Ridao-Cano is a Senior Economist with the Education sector in ECA.
Carneiro and Umapathi gratefully acknowledge the financial support from the Economic and Social
Research Council for the ESRC Centre for Microdata Methods and Practice (grant reference RES-589-28-0001)
and the hospitality of the Development Economic Research Group of the World Bank. Carneiro
gratefully acknowledges the support of ESRC-DFID (grant reference RES-167-25-0124), the European
Research Council through ERC-2009-StG-240910-ROMETA and Orazio Attanasio's ERC-2009 Advanced
Grant 249612 "Exiting Long Run Poverty: The Determinants of Asset Accumulation in Developing
Countries". These are the views of the authors and do not reflect those of the World Bank, its Executive
Directors, or the countries they represent. Correspondence by email to numapathi@worldbank.org
1. Introduction
The expansion of access to secondary schooling is at the center of development
policy in most of the developing world. Analyzing the effects of such expansion requires
knowledge of the impact of education on earnings. In contrast with the standard model,
much of the recent literature on the returns to schooling emphasizes that returns vary
across individuals, and are correlated with the amount of schooling an individual takes
(e.g., Card, 2001, Carneiro, Heckman and Vytlacil, 2011). In terms of the traditional
Mincer equation, $Y = a + bS + u$ (where Y is log wage and S is years of schooling), b is a
random coefficient potentially correlated with S. This has dramatic consequences for the
way we conduct policy analysis.
In this model there is no single average return that summarizes the distribution of
returns to schooling in the population. For example, the individual at the margin between
two levels of schooling may have very different returns from all the infra-marginal
individuals. Standard instrumental variables estimates of the returns to schooling estimate
the Local Average Treatment Effect (or LATE; Imbens and Angrist, 1994), which does
not in general correspond to the return to the marginal person, who is more likely to be
affected by the expansion of secondary schooling than anyone else in the economy.
Furthermore, different policies may affect different groups of individuals.
This paper studies the returns to upper secondary schooling in Indonesia in a setting
where b varies across individuals and it is correlated with S (which in this paper is a
dummy variable indicating whether an individual enrolls in upper secondary school or
not). We find that the return to upper secondary schooling for the marginal person who is
indifferent between going to secondary schooling or not is much lower than that of the
average person enrolled in upper secondary schooling (14.2% vs. 26.9% per year of
schooling).1 Finally, we simulate what would happen if distance to upper secondary
schooling was reduced by 10% for everyone in the sample, and we estimate that the
return to upper secondary schooling for those induced to attend schooling by such an
incentive is 14.2%.
1 The estimated average and marginal returns to upper secondary schooling in Indonesia are 96% and 111%
respectively. Average years of schooling for those who have and who have not enrolled in upper secondary
schooling in Indonesia are 13.133 and 5.341, so the difference between the two is 7.79. We use this number
to convert the total return into an annualized return to schooling.
When evaluating marginal expansions in access to school, the relevant quantities are
the returns and costs for the marginal student, not the returns and costs for the average
student. In spite of the importance of this topic, there are hardly any estimates of average
and marginal returns to schooling in developing countries. Two exceptions using Chinese
data are Heckman and Li (2004) and Wang, Fleisher, Li and Li (2011).
Estimating the extent to which returns vary across individuals is also important if we
want to understand schooling choices, and their relationship with labor market
performance. Studying who is more likely to attend school, and documenting how returns
differ between those who attain high and low levels of education, and across individuals
more generally, is essential for learning about the main incentives and constraints to
school attendance.
We estimate a semi-parametric selection model of upper secondary school attendance
and wages using the method of local instrumental variables (Heckman and Vytlacil,
2005). Our data comes from the Indonesia Family Life Survey. Carneiro, Heckman and
Vytlacil (2011) use a similar model to estimate the returns to college in the US. Although
they examine a different country in a different time period, and a different level of
schooling, they also find that the returns to college vary widely across individuals in the
US, and that the return to college for the marginal student is well below the return to
college for the average student (see also Carneiro and Lee, 2009, 2011).2
These papers document, across very different environments, how important it is to
account for heterogeneity in the returns to schooling. They also show how it is possible to
take exactly the same data which is used to estimate a measure of the return to education
by instrumental variables methods (IV), and extract much more information from it
(allowing us to characterize the heterogeneity in returns across individuals). This can be
done using fairly standard parametric methods for estimating selection models, or using a
more recent non-parametric approach to the same problem.
Vytlacil (2002) shows that the assumptions underlying standard IV estimates of the
effect of a particular program (such as attendance of upper secondary school) are the
same as the assumptions underlying a fairly standard non-parametric selection model, and
thus the two are equivalent. Heckman and Vytlacil (2001a, 2005) explain how to estimate
the model using the method of Local Instrumental Variables. We present estimates from
both parametric and non-parametric models. Both show the importance of heterogeneity.
The former are more precise than the latter, but the parametric model is more restrictive.
2 There are also papers that estimate returns for the average and marginal student but account only
for selection and heterogeneity given by observable variables (ignoring selection on unobservables). One
example is Dearden, McGranahan and Sianesi (2004).
This paper also proposes a methodological innovation. In the presence of multiple
control variables, constructing various average parameters of interest (average returns for
different groups of individuals) using the framework of Heckman and Vytlacil (2005)
requires the estimation of high dimensional conditional densities, which are notoriously
difficult to implement. We use instead a simulation method that avoids this high
dimensional non-parametric estimation problem (in contrast, Carneiro, Heckman and
Vytlacil, 2010, 2011, need to impose restrictive assumptions to reduce the dimensionality
of the problem).
Since schooling is endogenously chosen by individuals, we require (at least) one
instrumental variable for schooling. We propose to use as the instrument the distance
from the community of residence to the nearest secondary school (Card, 1995). Distance
takes the value zero if there is a school in the community of residence. This variable is a
strong determinant of enrolment in upper secondary school. But one could be concerned
that the forces driving the location of schools and parents are correlated with wages,
implying that distance is an invalid instrument. We discuss this problem in detail.
We control for several family and village characteristics, namely father's and
mother's education, an indicator of whether the community of residence was a village,
religion, whether the location of residence is rural, province dummies, and distance from
the village of residence to the nearest health post. Our assumption is that if we take two
individuals with equally educated parents, with the same religion, living in a village
which is located in an area that is equally rural, in the same province, and at the same
distance from a health post, then distance to the nearest secondary school is uncorrelated
with direct determinants of wages other than schooling. We present evidence that this
assumption is likely to hold. In particular, we show that, once these variables are
controlled for, there is no correlation between the distance to the nearest secondary
school and whether the individual ever failed a grade in elementary school, how many
times he repeated a grade in elementary school, and whether he had to work while
attending elementary school. In addition, we show (using a different sample) that our
distance variable is uncorrelated with test scores (Math, Bahasa, Science, and Social
Studies) in elementary school. These are very important dimensions of the pre-secondary
school experience that are measures of early ability and early home environments, and
which we would expect to be correlated with distance to the nearest secondary school if
this variable was endogenously determined.
Our instrumental variable estimates of the returns to schooling are higher than the
returns to schooling for Indonesia found in Duflo (2000) with the qualification that the
dataset, the instrumental variable, and the time period are not the same. Petterson (2010)
finds similar rates of return using the same year, same data but a different sample and a
different instrumental variable.
This paper proceeds as follows. Section 2 discusses the data. Section 3 reviews the
econometric framework. Section 4 presents our empirical results. Section 5 concludes.
2. Data
We use data from the third wave of the Indonesia Family Life Survey (IFLS), fielded
from June through November 2000.3 The IFLS is a household and community level
panel survey that was carried out in 1993, 1997 and 2000. The 2000 sample was
drawn from 321 randomly selected villages, spread among 13 Indonesian provinces
containing 83% of the country's population, and contains over 43,000 individuals. The
sample we use consists of males aged 25-60 who are employed in the labor market and
who have reported non-missing wage and schooling information. We consider only
salaried workers, both in the government and in the private sector. We exclude females
from the analysis because of low labor force participation, and we exclude self-employed
workers because it is difficult to measure their earnings. The dependent variable in our
analysis is the log of the hourly wage. Hourly wages are constructed from self-reported
monthly wages and hours worked per week. The final sample contains 2608 working age
males.
3 For a description of the survey see Strauss, J., K. Beegle, B. Sikoki, A. Dwiyanto, Y. Herawati and F.
Witoelar. "The Third Wave of the Indonesia Family Life Survey (IFLS): Overview and Field Report",
March 2004. WR-144/1-NIA/NICHD. In the appendix we list the main variables we use.
In our empirical model we collapse schooling into two categories: i) completed lower
secondary or below, and ii) attendance of upper secondary or higher. While this division
groups together several levels of schooling, it greatly simplifies the model and is standard
in many studies of the returns to schooling (e.g., Willis and Rosen, 1979). The transition
to upper secondary schooling is of substantial interest in the Indonesian context given its
current effort to expand secondary education. We present both the return to upper
secondary schooling, as well as an annualized version of this parameter which we obtain
by dividing the estimated return by the difference in average years of schooling
completed by those with lower secondary or less and those with upper secondary or
more. Upper secondary schooling corresponds to 10 or more years of completed
education.4 In order to compare our estimates with the rest of the literature (in particular,
Duflo, 2000) in the appendix we also present OLS and IV estimates of returns using a
continuous education variable, corresponding to years of completed schooling.
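As a minimal illustration of the annualization described above, the sketch below divides a total return by the gap in average years of schooling. The inputs are the figures reported in footnote 1 (a total marginal return of 111%, and 13.133 versus 5.341 average years of schooling); the function name `annualize_return` is hypothetical and introduced only for this example.

```python
def annualize_return(total_return, years_upper, years_lower):
    """Divide a total return by the gap in average completed years of schooling."""
    gap = years_upper - years_lower
    return total_return / gap


# Figures from footnote 1: total marginal return of 111%, 13.133 vs. 5.341 years.
marginal_per_year = annualize_return(1.11, 13.133, 5.341)
print(f"{marginal_per_year:.3f}")
```

The result is roughly 0.142 per year of schooling, in line with the annualized marginal return of 14.2% quoted in the introduction.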
We run ordinary least squares (OLS) and IV regressions of log wages on years of
schooling. The control variables we use are dummies for age, indicators of the level of
schooling completed by each of the parents (no education, elementary education,
secondary education; we also have an indicator for unreported parental education), an
indicator for whether the individual was living in a village at age 12, dummies for the
province of residence, an indicator of rural residence, and distance (in Km) from the
community of residence to the nearest community health post.
Our instrumental variable for schooling is the distance from the office of the
community head to the nearest secondary school. The distance is self-reported by the
community head in the Service Availability Roster of the IFLS.5
Table 2 presents descriptive statistics on the main variables used in our analysis. It
shows that individuals with upper secondary or higher levels of education have, on
average, 108% higher wages than those with lower education. They also have 7.778 more
years of schooling. The respondents with an upper secondary education are younger than
those without. They are also more likely to have better-educated parents, to have lived in
towns or cities at age 12, and to live closer to upper secondary schools, when compared
to those with less than an upper secondary education.
4 It is possible to estimate a non-parametric selection model with multiple levels of schooling but the data
requirements to do it are very strong. In particular, one needs one instrumental variable for each transition.
It is not feasible to pursue this with our dataset.
5 We would have liked to use instead the distance between the community of residence in childhood and
the nearest school in childhood. Our hope is that current residence and current school availability are good
approximations (as in Card, 1995). We show below that this measure of distance to school is a good
predictor of upper secondary school attendance.
3. Theoretical Framework
3.1 A Semi-Parametric Selection Model
We consider a standard discrete choice model of schooling, as used in Willis and
Rosen (1979) or Carneiro, Heckman and Vytlacil (2011). Consider a model with two
schooling levels:
$$Y_1 = \alpha_1 + X\beta_1 + U_1, \qquad Y_0 = \alpha_0 + X\beta_0 + U_0 \tag{1}$$
$$S = 1 \ \text{if} \ Z\gamma - U_S > 0 \tag{2}$$
$Y_1$ are log wages of individuals with upper secondary education and above, $Y_0$ are log
wages of individuals without upper secondary education, X is a vector of observable
characteristics that might affect wages, and $U_1$ and $U_0$ are the error terms. Z is a vector of
characteristics affecting the schooling decision.
Equation (2) is a reduced form model of schooling. Agents decide whether to enroll
or not in upper secondary schooling based on the expected net present value of earnings
with and without upper secondary schooling, and costs, which can be financial or
psychic. There can also be liquidity constraints. There is heterogeneity and we expect
agents with the highest returns to upper secondary schooling ($Y_1 - Y_0$) to be more likely
to enroll in higher levels of schooling. Costs and returns to schooling can be arbitrarily
correlated. For a more detailed explanation see Willis and Rosen (1979) or Carneiro,
Heckman and Vytlacil (2011).
It is convenient to rewrite the selection equation as:
$$S = 1 \ \text{if} \ P(Z) > V \tag{3}$$
where $P(Z) = F_{U_S}(Z\gamma)$, $V = F_{U_S}(U_S)$, and $F_{U_S}$ is the cumulative distribution function of $U_S$. V
is distributed uniformly by construction. This is an innocuous transformation given that
$U_S$ can have any density.
Finally, observed wages are:
$$Y = S Y_1 + (1 - S) Y_0 \tag{4}$$
Notice that the return to schooling is
$$Y_1 - Y_0 = \alpha_1 - \alpha_0 + X(\beta_1 - \beta_0) + U_1 - U_0 \tag{5}$$
The return to schooling varies across individuals with different X's and different $U_1$, $U_0$.
We require that Z is independent of ($U_1$, $U_0$) given X, and that Z is correlated with S
(see Heckman and Vytlacil, 2005, for the full set of assumptions). These are the usual IV
assumptions. In practice we use a stronger assumption: (X, Z) is independent of ($U_1$, $U_0$, $U_S$).
This stronger assumption is fairly standard in empirical applications of a selection model
of the type described here. We discuss the advantages of using this stronger assumption
in the empirical section (see also Carneiro, Heckman and Vytlacil, 2011).
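To illustrate how the model in equations (1) to (5) produces different average returns for different groups, the following is a minimal simulation sketch. It is not the estimation procedure used in this paper. All numerical values (the average gain, the strength of selection on gains, the sample size) and the function name `simulate_selection_model` are hypothetical; X is omitted and γ is set to 1 for simplicity.

```python
import random
from statistics import NormalDist, mean


def simulate_selection_model(n=20000, avg_gain=0.10, selection_on_gains=0.05, seed=0):
    """Draw (S, Y1 - Y0, V) from a stylized version of equations (1)-(5)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)    # instrument (hypothetical scale)
        u_s = rng.gauss(0.0, 1.0)  # unobserved cost of enrolling, U_S
        # Individuals with low U_S have high gains: selection on unobserved returns.
        gain = avg_gain - selection_on_gains * u_s + rng.gauss(0.0, 0.02)
        s = 1 if z - u_s > 0 else 0  # equation (2) with gamma = 1
        v = std_normal.cdf(u_s)      # V = F_{U_S}(U_S), uniform by construction
        draws.append((s, gain, v))
    return draws


draws = simulate_selection_model()
ate = mean(gain for _, gain, _ in draws)
att = mean(gain for s, gain, _ in draws if s == 1)
atu = mean(gain for s, gain, _ in draws if s == 0)
mean_v = mean(v for _, _, v in draws)
print(f"ATE={ate:.3f}  ATT={att:.3f}  ATU={atu:.3f}  mean(V)={mean_v:.3f}")
```

Under these assumed parameters the enrolled have higher returns than the non-enrolled, so no single average summarizes the distribution of returns, and the simulated V has mean close to 0.5, as expected for a uniform variable.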
The marginal treatment effect (MTE) is the central parameter of our analysis. In the
notation of our paper it can be expressed as:
$$\begin{aligned}
MTE(x, v) &= E(Y_1 - Y_0 \mid X = x, V = v) \\
&= \alpha_1 - \alpha_0 + x(\beta_1 - \beta_0) + E(U_1 - U_0 \mid X = x, V = v)
\end{aligned} \tag{6}$$
The MTE measures the returns to schooling for individuals with different levels of
observables (X) and unobservables (V), and therefore it provides a simple characterization
of heterogeneity in returns. Heckman and Vytlacil (2005) show how to construct
parameters of interest as weighted averages of the MTE. For example:
$$\begin{aligned}
ATE(x) &= \int MTE(x, v)\, f_{V|X}(v \mid x)\, dv \\
ATT(x) &= \int MTE(x, v)\, f_{V|X}(v \mid x, S = 1)\, dv \\
ATU(x) &= \int MTE(x, v)\, f_{V|X}(v \mid x, S = 0)\, dv
\end{aligned} \tag{7}$$
where ATE(x) is the average treatment effect, ATT(x) is average treatment on the treated,
ATU(x) is average treatment on the untreated (conditional on X=x), and $f_{V|X}(v \mid x)$ is the
density of V conditional on X.
A less standard parameter but equally (if not more) important is the policy relevant
treatment effect (PRTE), introduced in the literature by Heckman and Vytlacil (2001b). It
measures the average return to schooling for those induced to change their enrolment
status in response to a specific policy. Obviously, it depends very much on the policy
being considered. Consider a determinant of enrolment Z, which does not enter directly
in the wage equation. The policy shifts Z from Z=z to Z=z'. The corresponding PRTE is a
weighted average of the MTE:
$$PRTE(x) = \int MTE(x, v)\, f_{V|X}(v \mid x, S(z) = 0, S(z') = 1)\, dv$$
3.2 Estimating the MTE
Assuming that the unobservables in the wage (1) and selection (2) equations are
jointly normally distributed, the MTE can be estimated by a standard switching regression
model (see Heckman, Tobias and Vytlacil, 2001). Assume:
$$(U_0, U_1, U_S) \sim N(0, \Omega) \tag{8}$$
where $\Omega$ represents the variance and covariance matrix. Under this assumption:
$$MTE(x, v) = E(Y_1 - Y_0 \mid X = x, V = v) = (\alpha_1 - \alpha_0) + x(\beta_1 - \beta_0) + \left( \frac{\sigma_{U_S,1}}{\sigma_{U_S}} - \frac{\sigma_{U_S,0}}{\sigma_{U_S}} \right) \Phi^{-1}(P(Z))$$
where $\sigma^2_{U_S}$ denotes the variance of $U_S$, $\sigma_i^2$ the variance of $U_i$ with i = 0, 1, $\sigma_{U_S,i}$ the covariance
between $U_S$ and $U_i$, $\sigma_{i,j}$ the covariance between $U_i$ and $U_j$, and $\Phi$ is the c.d.f. of the
standard normal. Therefore the MTE can be constructed by estimating the parameters
$\alpha_1, \alpha_0, \beta_1, \beta_0, \rho_1, \rho_2$.
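The parametric expression above can be evaluated directly once its parameters are known. The sketch below is a hypothetical illustration: the function name `mte_normal`, its argument names, and all parameter values are assumptions chosen only to show the shape of the curve, and the inverse normal c.d.f. is evaluated at the point v at which the MTE is computed (for an individual at the margin, v = P(Z)).

```python
from statistics import NormalDist


def mte_normal(gain_observables, cov_1s, cov_0s, sd_s, v):
    """MTE under joint normality, evaluated at unobserved resistance v in (0, 1).

    gain_observables stands for (alpha_1 - alpha_0) + x(beta_1 - beta_0).
    """
    slope = (cov_1s - cov_0s) / sd_s
    return gain_observables + slope * NormalDist().inv_cdf(v)


# Hypothetical parameter values, chosen only to show the shape of the curve.
for v in (0.1, 0.5, 0.9):
    print(v, round(mte_normal(0.15, -0.03, 0.02, 1.0, v), 3))
```

With a negative difference between the two covariances, as assumed here, the curve declines in v: individuals who are more likely to enroll (low v) have higher returns. The normality assumption forces the MTE to be monotone in v, which is one sense in which the parametric model is restrictive.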
This model relies on strong assumptions about the distribution of the error terms in
equations (1-2). To relax these restrictions, we use the method of local instrumental
variables that imposes no distributional assumptions on the unobservables of the model
(Heckman and Vytlacil, 2000). In particular, Heckman and Vytlacil (2000) show that:
$$MTE(x, v) = \left. \frac{\partial E(Y \mid X, P)}{\partial P} \right|_{X = x,\, P = v} \tag{9}$$
where,
$$\begin{aligned}
E(Y \mid X, P) &= E[\alpha_0 + X\beta_0 + S(\alpha_1 - \alpha_0) + SX(\beta_1 - \beta_0) + U_0 + S(U_1 - U_0) \mid X, P] \\
&= \alpha_0 + X\beta_0 + P(\alpha_1 - \alpha_0) + PX(\beta_1 - \beta_0) + E(U_1 - U_0 \mid S = 1, X, P)\, P \\
&= \alpha_0 + X\beta_0 + P(\alpha_1 - \alpha_0) + PX(\beta_1 - \beta_0) + K(P)
\end{aligned} \tag{10}$$
(K(P) is a function of P, which can be estimated non-parametrically). Therefore, taking
the derivative of (10) with respect to P:
$$MTE(x, v) = \left. \frac{\partial E(Y \mid X, P)}{\partial P} \right|_{X = x,\, P = v} = X(\beta_1 - \beta_0) + K'(P) \tag{11}$$
V can take values from 0 to 1. However, in practice it is only possible to estimate the
MTE over the observed support of P. In our data the support of P is almost the full unit
interval, so we are able to estimate the MTE close to its full support.
If we had assumed that Z is independent of ($U_1$, $U_0$) given X, instead of full
independence between (Z, X) and ($U_1$, $U_0$), it would be much more difficult to estimate
the MTE with full support. In that case, for each value of X it is only possible to estimate
the MTE over the support of P conditional on X, which in general can be much smaller
than the unconditional support of P (for a detailed discussion see Carneiro, Heckman and
Vytlacil, 2011). The assumption of full independence of (Z, X) and ($U_1$, $U_0$) is fairly
standard and it allows us to use the full support of P.
Equations (10) and (11) can be estimated using standard methods. In particular, we
use the partially linear regression estimator of Robinson (1988) to estimate ($\beta_1$, $\beta_0$). Then
we compute $R = Y - [\alpha_0 + X\beta_0 + PX(\beta_1 - \beta_0)]$. ($\alpha_1$, $\alpha_0$) cannot be identified separately
from K(P). K(P) and K'(P) are estimated using a locally quadratic regression (Fan and
Gijbels, 1996) of R on P. A simple test of heterogeneity in the impact by unobserved
characteristics is a test of whether K'(P) is flat, that is, whether E(Y | X, P) is linear in P. If the
derivative is flat then heterogeneity is not important.
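The last step, a locally quadratic regression of R on P, can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used for the estimates in this paper: the Gaussian kernel, the bandwidth, the toy data, and the function names `solve_linear` and `local_quadratic` are all choices made here for the example.

```python
import math


def solve_linear(a, c):
    """Solve a x = c by Gaussian elimination with partial pivoting."""
    n = len(c)
    m = [row[:] + [c[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def local_quadratic(p_obs, r_obs, p0, bandwidth):
    """Return estimates of K(p0) and K'(p0) from a kernel-weighted quadratic fit."""
    a = [[0.0] * 3 for _ in range(3)]
    c = [0.0] * 3
    for p, r in zip(p_obs, r_obs):
        d = p - p0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel (assumed)
        basis = (1.0, d, d * d)
        for i in range(3):
            c[i] += w * basis[i] * r
            for j in range(3):
                a[i][j] += w * basis[i] * basis[j]
    level, slope, _ = solve_linear(a, c)
    return level, slope


# Noise-free toy data with K(p) = 0.3 + 0.5 p - 0.4 p^2, so K'(p) = 0.5 - 0.8 p.
p_grid = [i / 100 for i in range(1, 100)]
r_vals = [0.3 + 0.5 * p - 0.4 * p * p for p in p_grid]
k_hat, k_prime_hat = local_quadratic(p_grid, r_vals, p0=0.5, bandwidth=0.2)
```

Repeating the fit over a grid of values of p0 traces out K'(P). A curve that is flat in P corresponds to no heterogeneity by unobserved characteristics, while a declining curve indicates that those more likely to enroll have higher returns.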
3.3 Average Marginal Returns to Education
Economic decisions involve comparisons of marginal benefits and marginal costs.
Therefore it is especially interesting to estimate the returns to upper secondary schooling
for those individuals at the margin between enrolling or not. These are the
individuals most likely to change their schooling in response to a change in education
policy.
The definition of who is marginal depends on the policy being considered. This is
made clear in Carneiro, Heckman and Vytlacil (2010, 2011), who consider three
particularly interesting definitions of individuals at the margin:
i) $|P - V| < \varepsilon$, ii) $|Z\gamma - U_S| < \varepsilon$, iii) $\left| \frac{P}{U} - 1 \right| < \varepsilon$.
These correspond to three different marginal policy changes.6
In this paper we estimate the average marginal returns to upper secondary schooling
in Indonesia according to the definition of marginal in ii) above, although we could have
chosen a different one. The MTE provides a general characterization of heterogeneity in
returns and from it we can construct various other parameters.
Carneiro, Heckman and Vytlacil (2010) show how it is possible to write the average
marginal treatment effect (or AMTE, the return for the marginal person) as a weighted
average of the MTE:
$$AMTE(x) = \int MTE(x, v)\, f_{V|X}(v \mid x, |Z\gamma - U_S| < \varepsilon)\, dv \tag{12}$$
3.4 Estimating vs. Simulating the Weights: A New Procedure
So far this section has shown how to recover the MTE from the data, and how to
construct economically interesting parameters as weighted averages of the MTE.
Heckman and Vytlacil (2005) and Carneiro, Heckman and Vytlacil (2010) provide
formulas for the necessary weights in equations 7 and 12, conditional on X. Once these
are applied it is simple to average over the relevant distribution of X, when that is what is
required. In particular, these papers show that:
$$\begin{aligned}
f_{V|X}(v \mid X) &= 1 \\
f_{V|X}(v \mid X, S = 1) &= \frac{1 - F_{P|X}(v \mid X)}{E(P \mid X)} \\
f_{V|X}(v \mid X, S = 0) &= \frac{F_{P|X}(v \mid X)}{1 - E(P \mid X)} \\
f_{V|X}(v \mid X, S(z) = 0, S(z') = 1) &= \frac{F_{P|X}(v \mid X) - F_{P'|X}(v \mid X)}{\int [F_{P|X}(v \mid X) - F_{P'|X}(v \mid X)]\, dv} \\
f_{V|X}(v \mid X, |Z\gamma - U_S| < \varepsilon) &= \frac{f_{P|X}(v \mid X)\, f_{U_S|X}[F^{-1}_{U_S|X}(v \mid X)]}{E[f_{U_S|X}(Z\gamma \mid X)]}
\end{aligned} \tag{13}$$
6 The three policy changes considered are (i) a policy that increases the probability of attending college by
an amount α, so that $P_\alpha = P + \alpha$; (ii) a policy that changes each person's probability of attending college
by the proportion (1 + α), so that $P_\alpha = (1 + \alpha)P$; and (iii) a policy intervention that has an effect similar to
a shift in one of the components of Z, say $Z^k$, so that $Z^k_\alpha = Z^k + \alpha$ and $Z^j_\alpha = Z^j$ for $j \neq k$.
where $f_{P|X}(p \mid X)$ and $F_{P|X}(p \mid X)$ are respectively the p.d.f. and the c.d.f. of P
conditional on X, $f_{U_S|X}(u_S \mid X)$ and $F_{U_S|X}(u_S \mid X)$ are respectively the p.d.f. and the c.d.f.
of $U_S$ conditional on X, and $F_{P'|X}(p \mid X)$ is the c.d.f. of P conditional on X when Z takes
value z'.
In practice it is difficult to implement these formulas since they involve estimation of
conditional density and distribution functions, and X is a high dimensional object in many
applications (there are 28 variables in X in our empirical work). Therefore, Carneiro,
Heckman and Vytlacil (2010, 2011) have aggregated X into an index, namely
$I = X(\beta_1 - \beta_0)$, and proceeded by estimating conditional densities and distributions of P
with respect to I.
There is little theoretical basis for this aggregation which makes it quite unattractive.
In this paper we use an alternative procedure, which avoids making this aggregation, and
sidesteps the problem of estimating a multidimensional conditional density function.
Notice that the selection equation relates S, X, Z, and $U_S$ or V. Using the estimates of
the parameters of the selection equation, it is straightforward to simulate the conditional
densities of V that serve as weights. Once we have these, we just need to apply them to
equations (7) and (12). This simulation procedure is simple, and its steps are described
in detail in the appendix.
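The idea behind simulating the weights can be sketched as follows. This is a simplified illustration under stated assumptions and does not reproduce the steps in the appendix: it takes a list of estimated propensity scores as given, draws V from a uniform distribution, applies the selection rule (3), and tabulates V among the simulated enrollees on a grid of bins. The function names, the number of bins and draws, and the input values are all hypothetical.

```python
import random


def simulate_treated_weights(pscores, n_bins=20, draws_per_person=200, seed=0):
    """Simulate the distribution of V among the enrolled (S = 1) on a grid of bins."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for p in pscores:
        for _ in range(draws_per_person):
            v = rng.random()  # V ~ Uniform(0, 1), independent of (X, Z)
            if p > v:         # selection rule (3): S = 1 if P(Z) > V
                counts[min(int(v * n_bins), n_bins - 1)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no simulated individual enrolls")
    return [count / total for count in counts]


def weighted_average(mte_on_grid, weights):
    """Average an MTE curve (one value per bin) using simulated weights."""
    return sum(m * w for m, w in zip(mte_on_grid, weights))


# Hypothetical inputs: estimated propensity scores and an MTE curve that declines in v.
pscores = [0.2, 0.4, 0.6, 0.8]
weights = simulate_treated_weights(pscores, n_bins=10)
mte_curve = [0.30 - 0.04 * k for k in range(10)]
att_sim = weighted_average(mte_curve, weights)
```

Because the draws are made individual by individual, the simulated weights average over the empirical distribution of X and Z without requiring an estimate of the density of P conditional on a high dimensional X. Weights for the untreated follow by tabulating V among simulated draws with P ≤ V instead.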
4. Empirical Results
4.1 Is Distance to School a Valid Instrument?
To account for the potential endogeneity of the schooling decision we instrumented
schooling with the distance to the nearest secondary school, measured in kilometers.7 In
order for this to be a valid procedure distance to school needs to satisfy two assumptions:
i) distance to school should affect the probability of school enrolment and ii) it should
have no direct effect on adult wages.
7 Distance to the nearest school has been used as an instrument for schooling by Card (1995), Kane and
Rouse (1995), Kling (2001), Currie and Moretti (2003), Cameron and Taber (2004) and Carneiro, Heckman
and Vytlacil (2011).
Condition ii) is controversial if families and schools do not randomly locate across
locations in Indonesia. For example, Carneiro and Heckman (2002) and Cameron and
Taber (2004) show that individuals living closer to universities in the US have higher
levels of cognitive ability and come from better family backgrounds. In fact, it is also true
in Indonesia that those who have better educated parents are located closer to secondary
schools. However, it is possible that school location is exogenous after we account for a
very detailed set of individual and regional characteristics, namely: age (or cohort),
parental education, religion, an indicator for whether the individual was living in a city or
in a village at age 12, an indicator for whether the individual lived in a rural area at age 12,
dummies for the province of residence, and distance to the nearest health post.
One way to investigate the plausibility of this assumption is to check whether
distance to the nearest secondary school is correlated with pre-secondary educational
outcomes of each individual. If there was non-random sorting of families and schools
across locations in such a way that distance to secondary school was correlated with adult
wages, it would surely appear in these variables.
Table 3 examines whether distance to upper secondary school is correlated with
whether an individual ever repeated a grade in elementary school, the number of
repetitions in elementary school (both of which are measures of early school success), and
whether the individual worked while in primary school. If our instrument is valid it
should not be correlated with such early measures. Our results show no apparent
correlation between distance to school and these variables.
In addition, Table 4 examines comprehensive exam scores in math, science, social
studies and Bahasa. The sample used in this table is not exactly the sample used in our
regressions, because it is only possible to gather elementary school test scores for a very
small proportion of individuals in our original sample. Therefore, in the regression
shown in this table, we placed no age or gender restrictions on the sample. Again, we
find no correlation between the distance to school and test scores.8 This evidence
suggests that our empirical strategy is valid.
8 Considering a more restricted sample results in a small number of observations. Our main conclusions are
unchanged, but results are fairly imprecise.
The first column of Table 5 shows that distance to the nearest secondary school is a
strong predictor of enrolment in secondary school. We run a logit regression where the
dependent variable is an indicator taking value 1 if an individual ever attended upper
secondary school and the regressors include distance to the nearest secondary school and
all the control variables mentioned above. The table displays marginal effects of each
variable on the probability of enrolling in upper secondary education. We include
distance to health post as a proxy for location characteristics and, unlike distance to
school, distance to health post does not predict school enrollment. Children of highly
educated parents are more likely to attend upper secondary school than children of
parents with low levels of education. Catholics and Protestants are much more likely to
attend secondary school than Muslims (the omitted category). Children living in small
villages and in rural areas were less likely to attend upper secondary school than those
living in large cities and urban areas.
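The link between a logit coefficient and the marginal effects displayed in the table can be illustrated with a minimal sketch. It assumes a model without interactions, in which the derivative of the enrollment probability with respect to a continuous regressor is the coefficient times p(1 - p). The linear indices below are hypothetical values chosen for illustration, not fitted values from our data; only the coefficient of -0.123 is taken from column 1 of Table 5.

```python
import math


def logistic(index):
    """Logit link: maps a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-index))


def average_marginal_effect(beta, indices):
    """Average derivative of P(S=1) with respect to a continuous regressor.

    For a logit without interactions, dP/dz = beta * p * (1 - p), so the
    average marginal effect is beta times the sample mean of p * (1 - p).
    """
    scale = sum(logistic(i) * (1.0 - logistic(i)) for i in indices) / len(indices)
    return beta * scale


# Hypothetical linear indices for five individuals (illustration only).
example_indices = [-1.5, -0.5, 0.0, 0.5, 1.5]

# -0.123 is the distance coefficient in column 1 of Table 5.
ame = average_marginal_effect(-0.123, example_indices)
print(round(ame, 4))
```

Because p(1 - p) never exceeds 0.25, the average marginal effect is always smaller in absolute value than one quarter of the coefficient.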
In the second column of table 5 we present estimates of a more flexible model where
the impact of distance on secondary school attendance varies with X. In particular, we
interact distance to school with age (which, for a fixed year, also captures cohort),
religion, parental education, and rural residence. It is useful to estimate this richer model
for two related reasons. First, it is much more flexible than the model in column 1.
Second, by allowing the impact of the instrument to vary with the variables in X,
we allow the effect of the instrument to vary more. As a result, the standard errors in the
IV estimates and in the selection model are smaller than if we just use the model in
column 1. Therefore, the basic estimates in this paper will come from this model, while
estimates of the simpler model are presented in the appendix (we discuss them below).
Notice also that table 5 displays p-values for tests of the null hypothesis that distance
to school does not affect upper secondary school attendance. In column 1, we look at the
single coefficient on distance, while in column 2 we perform a joint test on all
coefficients involving distance. In both columns we reject that distance to school does not
determine upper secondary school attendance.
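As a rough check on the single-coefficient test, the p-value of a chi-square statistic with one degree of freedom can be recovered from the complementary error function. The sketch below assumes that the column 1 statistic reported in Table 5 (9.42) has one degree of freedom, which is natural for a test on a single coefficient. The degrees of freedom of the joint test in column 2 are not reported, so that case is not reproduced here.

```python
import math


def chi2_pvalue_1df(statistic):
    """Upper-tail p-value of a chi-square statistic with one degree of freedom.

    A chi-square(1) variable is a squared standard normal, so
    P(X > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


# 9.42 is the chi-square statistic reported in column 1 of Table 5.
p_single = chi2_pvalue_1df(9.42)
print(round(p_single, 4))
```

The result is close to the p-value of 0.0021 reported in the table.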
There is another important reason why condition ii) may be violated. If regions where
schools are abundant are also regions where other infrastructure is abundant, then we
may be confounding the impact of school availability on wages with the impact of other
infrastructure on wages (see the argument in Jalan and Ravallion, 2002). This will be true
unless labor is perfectly mobile, which is unlikely to be the case in Indonesia.
However, we include detailed regional controls in our models which should absorb
much of this variation. Therefore, our argument is that our assumption is valid
conditional on all the controls we have in the model. In addition, we show that removing
these detailed regional controls hardly affects our results, which indicates that this
problem is unlikely to be that important in our setting. Perhaps, as Duflo (2004) argues,
the response of other infrastructure (be it private or public) to school construction and to
a better skilled workforce is very slow.
4.2 Standard Estimates of the Returns to Schooling
To make it easier to compare our data and estimates with those in the literature,
we start by presenting standard OLS and IV estimates of the returns to
schooling. Throughout the paper schooling takes two values: 0 for less than upper
secondary, and 1 for upper secondary or above. We use the log hourly wage in 2000 as
our dependent variable. The full set of controls consists of: age (or cohort), parental
education, religion, an indicator for whether the individual was living in a city or in a
village at age 12, an indicator for whether the individual lived in a rural area at age 12,
dummies for the province of residence, and distance to the nearest health post.
Table 6 presents ordinary least squares (OLS) and IV results.
The annualized OLS estimate of the return to upper secondary schooling is 9%, while the
IV estimate is 12.9%. Recall from table 2 that individuals with upper secondary
schooling or above have on average 13.133 years of schooling, while those with less than
upper secondary have on average 5.341 years of schooling. The difference between the
two groups is 7.792 years of schooling. Using this figure to annualize the returns to upper
secondary education we have an OLS estimate of 9% (=70.5%/7.792) and an IV estimate
of 12.9% (=100%/7.792).9
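The annualization can be reproduced with a few lines of arithmetic. The sketch below uses the means from Table 2 and the total returns quoted in the text (70.5% for OLS and 100% for IV); the variable and function names are ours. With the rounded total of 100%, the IV figure comes out at about 12.8%, so the small gap relative to the reported 12.9% presumably reflects rounding of the total return.

```python
# Mean years of schooling by group, from Table 2.
years_upper = 13.133  # upper secondary or above
years_lower = 5.341   # less than upper secondary
gap = years_upper - years_lower


def annualize(total_return, years):
    """Spread a total return evenly over the difference in years of schooling."""
    return total_return / years


ols_annual = annualize(0.705, gap)  # total OLS return quoted in the text
iv_annual = annualize(1.00, gap)    # total IV return quoted in the text (rounded)

print(round(gap, 3), round(ols_annual, 3), round(iv_annual, 3))
```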
9
Appendix table A1 presents OLS and IV estimates where we use years of schooling as the main
explanatory variable (as opposed to upper secondary schooling). The first column in this table shows
coefficients of an OLS regression of log wages on years of schooling and several controls. The estimated
return to a year of schooling is 9.6%. The second column shows the first stage of the two stage least
squares estimator, i.e., a regression of years of schooling on the instrument and the control variables. It
shows that distance to school is negatively related to schooling attainment. Finally, column 3 shows the IV
estimate of the return to schooling, which is 15.7%.
These estimates are higher than (but of comparable magnitude to) those in Duflo
(2001), although we use more recent data. Petterson (2010) finds a return of 14% using
the same data from the same year but a different sub-sample and instrument.
As in most of the literature, our IV estimates of the return to education are larger than
OLS estimates. Card (2001) suggests that such a finding indicates that returns to
schooling are heterogeneous and the marginal individual induced to enroll in school by
the change in the instrument has a higher return than the average individual. Carneiro,
Heckman and Vytlacil (2011) show that in the case of college attendance in the US, IV
estimates can be above OLS estimates even if the marginal individual has a lower return
than the average. Another reason why IV can exceed OLS is measurement error in
schooling. Although schooling is relatively well measured in the US (Card, 1999) and
other developed countries, that is not necessarily the case in Indonesia.
In appendix table A2 we also present IV estimates of returns for models where we do
not interact the instrument with the variables in X. The point estimate is smaller than the
one in Table A1, and the standard error is larger. Nevertheless, the main pattern remains:
the IV estimate is much higher than the OLS estimate. In a model with heterogeneous
returns, it is not surprising that the instrumental variable estimate is sensitive to the choice of
instrument. For the remainder of the paper, we present a parallel set of results in the
appendix in which we do not interact the instrument with X in the selection equation.10
Finally, in appendix table A3 we present results where we omit all regional dummies from
the model. Our IV estimate is very similar to the ones in tables A1 and A2. This indicates
that regional variation in infrastructure, which is correlated with the availability of
schooling, is unlikely to be driving our results.
OLS and IV estimates hide considerable heterogeneity in returns and, as emphasized
in Heckman and Vytlacil (2005), Heckman, Urzua and Vytlacil (2006), and Carneiro,
Heckman and Vytlacil (2011), it is not clear what economic question is answered by the
IV estimate. In order to further investigate this issue we use the framework of section 3.
10
We do this for two reasons. First, to show that the main patterns in our results are not driven by choosing
the specific way the instrument enters the model. Second, because the first stage F-statistic is higher in the
case where we use a single IV (F=11.34) than when we use multiple IVs (F=3.62) consisting of distance
interacted with different components of X. We will see throughout the paper that using the expanded set of
instruments allows us to get similar results and lower standard errors than when we use a single (but apparently
stronger) instrument.
We estimate parametric (assuming joint normality of (U1, U0, US)) and semi-parametric
versions of the model (the latter impose no assumptions on the joint distribution of (U1,
U0, US)).
4.3 Average and Marginal Treatment Effect Estimates
We start with the semi-parametric model. We construct P as a predicted probability of
ever attending upper secondary school from a logit regression of upper secondary school
attendance on the X and Z variables of section 3. Table 5, discussed above, reports the
coefficients of the logit model. All average derivatives presented in the table have the
expected sign.
It is only possible to identify the MTE over the support of P. Therefore, we need to
examine the density of P for individuals who attend upper secondary school or above and
those who do not. This is done in both panels of Figure 1, which shows the distributions
of the predicted propensity score (P) for these two groups. The supports for these
distributions overlap almost everywhere, although the support at the tails is thin for low
values of P among those with upper secondary school or above. We construct the MTE as
described in Section 2. In order to estimate K(P) we run a local quadratic regression of R
on P, using a Gaussian kernel and a bandwidth of 0.2. The implied MTE(x,v) is computed
by calculating the slope on the linear term of the local quadratic regression.11
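The local quadratic step can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for our estimates: it takes as given a series R (the residualized outcome) and P, fits a kernel-weighted quadratic in (P - p0) with a Gaussian kernel and a bandwidth of 0.2, and returns the coefficient on the linear term as the slope at p0. The function names and the test data are hypothetical.

```python
import math


def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def local_quadratic_slope(p_values, r_values, p0, bandwidth=0.2):
    """Slope at p0 from a kernel-weighted quadratic fit of r on (p - p0)."""
    xtwx = [[0.0] * 3 for _ in range(3)]
    xtwy = [0.0] * 3
    for p, r in zip(p_values, r_values):
        d = p - p0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel weight
        basis = [1.0, d, d * d]
        for i in range(3):
            xtwy[i] += w * basis[i] * r
            for j in range(3):
                xtwx[i][j] += w * basis[i] * basis[j]
    return solve_3x3(xtwx, xtwy)[1]  # coefficient on the linear term


# Hypothetical data: an exactly quadratic K(P) = 0.2 + 0.5 P - 0.3 P^2.
p_grid = [0.05 + 0.01 * k for k in range(91)]
r_grid = [0.2 + 0.5 * p - 0.3 * p * p for p in p_grid]
slope = local_quadratic_slope(p_grid, r_grid, p0=0.4)
print(round(slope, 3))  # the true derivative is 0.5 - 2 * 0.3 * 0.4 = 0.26
```

When the underlying function is exactly quadratic, as in this toy example, the local fit recovers the derivative up to floating-point error whatever the bandwidth. With real data the bandwidth governs the usual trade-off between bias and variance.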
Figure 2 displays the estimated MTE (which we evaluate at the mean values of the
components of X). The MTE is monotonically decreasing for all values of V. Returns are
very high for individuals with low values of V (individuals who are more likely to enroll
in upper secondary school, i.e. those facing low costs). The figure demonstrates substantial
heterogeneity in the return to schooling, which ranges from 34% for individuals with V
around 0.1 to 13% for those with V close to 0.5, and becomes negative for those with
values of V close to 1. The fact that returns are the lowest for individuals who are least
likely to go to school is consistent with a simple economic model where agents sort
themselves based on their comparative advantage.
Unfortunately, the standard errors on our estimated MTE are quite large (standard
errors are estimated using the bootstrap). However, it is still possible to reject that the
11
The coefficients on X in the outcome equations are presented in table A4 in the appendix.
MTE is flat. Table 7 tests whether adjacent segments of the MTE are equal. Take, for
example, the first line of the table. In the first column we show the average value the
MTE takes when X is fixed at its mean and V takes values between 0.1 and 0.2, while the
second column corresponds to values of V between 0 and 0.1. The third column shows
the t-statistic for a test of whether the numbers in the first two columns are equal. We
reject equality in almost all lines of the table at the 5% significance level. Therefore, we
are able to reject that the MTE is flat, even with the large standard error bands shown in
figure 2.
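The comparison of adjacent segments can be sketched as follows. The example averages an MTE curve within ten equal-width intervals of V and takes the difference between each interval and the one before it. The linear curve used here is purely hypothetical and is not our estimate, and the bootstrap step that produces the p-values in Table 7 is not reproduced.

```python
def segment_means(mte, edges, points_per_segment=100):
    """Average mte(v) over each interval between consecutive edges (midpoint rule)."""
    means = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        step = (hi - lo) / points_per_segment
        values = [mte(lo + (k + 0.5) * step) for k in range(points_per_segment)]
        means.append(sum(values) / points_per_segment)
    return means


def adjacent_differences(means):
    """Difference between the mean in interval j+1 and the mean in interval j."""
    return [b - a for a, b in zip(means[:-1], means[1:])]


def hypothetical_mte(v):
    """Illustrative declining MTE; NOT the curve estimated in the paper."""
    return 0.35 - 0.45 * v


edges = [j / 10 for j in range(11)]  # 0, 0.1, ..., 1
diffs = adjacent_differences(segment_means(hypothetical_mte, edges))
print([round(d, 3) for d in diffs])
```

A flat MTE would give differences of zero in every position, which is the null hypothesis examined in Table 7.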
Standard errors improve when we estimate the MTE assuming joint normality of (U1,
U0, US) as shown in Figure 3. The shape of the MTE is declining as in Figure 2, although
the normality assumption does not allow the MTE to have a flat section as in Figure 2 so
the MTE is declining everywhere, again taking negative values for very high values of V.
Table 8 presents average returns to upper secondary schooling for different groups of
individuals. The return to upper secondary school for a random person is 12.3%. The
return for the individuals who were enrolled in upper secondary schooling is considerably
higher, at 26.9%. Had individuals who did not go to upper secondary school
enrolled, they could expect returns of 1.7%. In the parametric case the average
parameters are estimated with the assumption of full support. Estimates of the return to
the marginal student (AMTE) are robust to the lack of full support. The return to the
marginal student is 14.2%, well below the return to the average student in upper
secondary school (26.9%).
Finally, the last line of Table 8 reports the average return for those induced to attend
upper secondary school by a particular policy shift: a 10% reduction in distance to an
upper secondary school. This is the parameter needed to understand the impacts of such
an education expansion. By coincidence, it is remarkably similar to the MPRTE.
In the appendix we show that results are similar but more imprecise when we do not
interact Z and X in the selection equation. This is reassuring, and shows that a more
flexible model improves the precision of our estimates.12
12
See tables A2, A5 and A6, and figures A1, A2 and A3.
5. Conclusion
Indonesia has an impressive record of educational expansion since the 1970s. The
enrollment rates are nearly universal for elementary schooling and are around 75% for
secondary education. There is an ongoing effort to extend universal education attainment
to the secondary level. And although enrollment in secondary education continues to rise
we find striking inequality in returns to education. The individuals who are most likely to
be attracted by educational expansions at the upper secondary level (marginal) have
lower average returns than those already attending upper secondary schooling. In this
paper we document a large degree of heterogeneity in the returns to upper secondary
schooling in Indonesia. We estimate the return to upper secondary education to be 12
percentage points higher (per year of schooling) for the average than for the marginal
student.
Therefore, efforts aimed at educational expansion will attract students with much
lower levels of returns, although the returns are still fairly high for the marginal person,
and therefore further expansions are probably justified. But our estimates also show that
it is probably not optimal to bring everybody into upper secondary education.
Is it possible to reduce such a high degree of inequality in the returns to schooling?
There is a growing body of literature that argues that human capital outcomes later in life
are largely determined early in life (e.g., Carneiro and Heckman, 2003). It is therefore
important for the design of schooling policy to determine whether the inequality in
secondary schooling outcomes can be remedied at earlier stages, for example during
elementary education or even earlier. Alongside the impressive drive to increase the quantity of
education, there should be a renewed emphasis on the quality of education, ensuring a
more relevant learning environment for disadvantaged children and reducing the
inequity in lifetime outcomes. This can be achieved by raising the return for the marginal
student, so that increased equity need not come at a cost in terms of efficiency.
References
Bjorklund, A. and R. Moffitt (1987), “The Estimation of Wage Gains and Welfare
Gains in Self-Selection Models”, Review of Economics and Statistics, 69:42-49.
Cameron, S. and C. Taber (2004), “Estimation of Educational Borrowing Constraints
Using Returns to Schooling”, Journal of Political Economy, part 1, 112(1): 132-82.
Card, D. (1995), “Using Geographic Variation in College Proximity to Estimate the
Return to Schooling”, Aspects of Labour Economics: Essays in Honour of John
Vanderkamp, edited by Louis Christofides, E. Kenneth Grant and Robert
Swindinsky. University of Toronto Press.
Card, D. (1999), “The Causal Effect of Education on Earnings”, Orley Ashenfelter and
David Card, (editors), Vol. 3A, Handbook of Labor Economics, Amsterdam:
North-Holland.
Card, D. (2001), “Estimating the Return to Schooling: Progress on Some Persistent
Econometric Problems”, Econometrica, 69(5): 1127-60.
Card, D. and T. Lemieux (2001), “Can Falling Supply Explain the Rising Return to
College For Younger Men? A Cohort Based Analysis”, Quarterly Journal of
Economics 116: 705-46.
Carneiro, P. and J. Heckman (2002), “The Evidence on Credit Constraints in
Post-secondary Schooling”, Economic Journal 112(482): 705-34.
Carneiro, P. and J. Heckman (2003), “Human Capital Policy”, in Inequality in America:
What Role for Human Capital Policy, J. Heckman, A. Krueger and B. Friedman
eds, MIT Press.
Carneiro, P., J. Heckman and E. Vytlacil (2010), “Evaluating Marginal Policy Changes
and the Average Effect of Treatment for Individuals at the Margin”,
Econometrica.
Carneiro, P., J. Heckman and E. Vytlacil (2011), “Estimating Marginal Returns to
Education”, forthcoming in American Economic Review.
Carneiro, P. and S. Lee (2009), “Estimating Distributions of Potential Outcomes using
Local Instrumental Variables with an Application to Changes in College
Enrolment and Wage Inequality”, Journal of Econometrics.
Carneiro, P. and S. Lee (2011), “Trends in Quality Adjusted Skill Premia in the US: 1960
to 2000”, forthcoming in American Economic Review.
Currie, J. and E. Moretti (2003), “Mother's Education and the Intergenerational
Transmission of Human Capital: Evidence from College Openings”, Quarterly
Journal of Economics, 118:4.
Dearden, L., L. McGranahan, B. Sianesi (2004), “Returns to Education for the ‘Marginal
Learner’: Evidence from the BCS70”, CEEDP 45, Center for the Economics of
Education, London School of Economics and Political Science.
Duflo, E. (2001), “Schooling and Labor Market Consequences of School Construction in
Indonesia: Evidence from an Unusual Policy Experiment”, American Economic
Review, 91(4), 795-813.
Duflo, E. (2004), “The medium run effects of educational expansion: evidence from a
large school construction program in Indonesia”, Journal of Development
Economics, 74, 163-197.
Fan, J. and I. Gijbels (1996), Local Polynomial Modelling and its Applications, New
York, Chapman and Hall.
Griliches, Z. (1977), “Estimating the Return to Schooling: Some Persistent Econometric
Problems”, Econometrica.
Petterson, G. (2010), “Do supply-side education programs targeted at under-served areas work?
The impact of increased school supply on education and wages of the poor and
women in Indonesia”, PhD dissertation (Draft), Department of Economics,
University of Sussex.
Heckman, J. and X. Li (2004), “Selection Bias, Comparative Advantage, and
Heterogeneous Returns to Education: Evidence from China in 2000”, Pacific
Economic Review.
Heckman, J., S. Urzua and E. Vytlacil (2006), “Understanding What Instrumental
Variables Really Estimate in a Model with Essential Heterogeneity”, Review of
Economics and Statistics.
Heckman, J. and E. Vytlacil (1999), “Local Instrumental Variable and Latent Variable
Models for Identifying and Bounding Treatment Effects”, Proceedings of the
National Academy of Sciences, 96, 4730-4734.
Heckman, J. and E. Vytlacil (2001a), “Local Instrumental Variables”, in C. Hsiao, K.
Morimune, and J. Powell, (eds.), Nonlinear Statistical Modeling: Proceedings of
the Thirteenth International Symposium in Economic Theory and Econometrics:
Essays in Honor of Takeshi Amemiya, (Cambridge: Cambridge University Press,
2000), 1-46.
Heckman, J. and E. Vytlacil (2001b), “Policy Relevant Treatment Effects”, American
Economic Review Papers and Proceedings.
Heckman, J. and E. Vytlacil (2005), “Structural Equations, Treatment Effects, and
Econometric Policy Evaluation”, Econometrica, 73(3):669-738.
Imbens, G. and J. Angrist (1994), “Identification and Estimation of Local Average
Treatment Effects”, Econometrica, 62(2):467-475.
Jalan, J. and M. Ravallion (2002), “Geographic poverty traps? A micro model of
consumption growth in rural China”, Journal of Applied Econometrics, 17, 329-346.
Kane, T. and C. Rouse (1995), “Labor-Market Returns to Two- and Four-Year College”,
American Economic Review, 85(3):600-614.
Kling, J. (2001), “Interpreting Instrumental Variables Estimates of the Returns to
Schooling”, Journal of Business and Economic Statistics, 19(3), 358-364.
Robinson, P. (1988), “Root-N-Consistent Semiparametric Regression”, Econometrica,
56(4), 931-954.
Vytlacil, E. (2002), “Independence, Monotonicity, and Latent Index Models: An
Equivalence Result”, Econometrica, 70(1), 331-341.
Wang, X., B. Fleisher, H. Li and S. Li, “Access to Higher Education and Inequality: the
Chinese Experiment”, working paper, Ohio State University.
Willis, R. and S. Rosen (1979), “Education and Self-Selection”, Journal of Political
Economy, 87(5):Pt2:S7-36.
Table 1: Definitions of variables used in the empirical analysis
Variable Definition
Y Log hourly earnings for salaried males
S=1 Ever enrolled in upper secondary school; zero otherwise
X Age, age squared, respondent's religion – Protestant, Catholic and other,
mother's and father's education – elementary, secondary or higher,
distance to the nearest health post in km from the community, rural
residence, province of residence – West Sumatra, South Sumatra, Lampung,
Jakarta, Central Java, Yogyakarta, East Java, Bali, West Nusa Tenggara,
South Kalimantan, South Sulawesi
Z Distance in km to nearest secondary school from community head's office,
interactions of distance with age, age squared, religion, parental education
and rural residence
Table 2: Sample statistics for the treatment groups
Upper secondary or higher Less than upper secondary
N = 1085 N = 1523
Log hourly wages 8.198 7.481
Years of education 13.133 5.341
Distance to school in km 1.053 1.564
Distance to health post in km 0.889 1.079
Age 37.058 38.675
Religion Protestant 0.050 0.022
Catholic 0.029 0.009
Other 0.062 0.043
Muslim 0.860 0.927
Father uneducated 0.130 0.383
...elementary 0.503 0.507
...secondary and higher 0.330 0.061
...missing 0.020 0.037
Mother uneducated 0.201 0.425
...elementary 0.484 0.406
...secondary and higher 0.204 0.022
...missing 0.098 0.133
Rural household 0.240 0.476
North Sumatra 0.057 0.063
West Sumatra 0.047 0.058
South Sumatra 0.048 0.032
Lampung 0.016 0.027
Jakarta 0.181 0.095
Central Java 0.085 0.163
Yogyakarta 0.092 0.054
East Java 0.121 0.180
Bali 0.056 0.038
West Nusa Tenggara 0.050 0.048
South Kalimantan 0.040 0.020
South Sulawesi 0.035 0.035
Source: Data from IFLS3. Sample restricted to males aged 25-60 employed in salaried jobs in government
and private sectors. Hourly wages constructed based on self-reported monthly wages.
Table 3: Regression of elementary education experience on distance to school
Failed grade Number of repeats Worked
Dist to nearest secondary school in km 0.007 0.011 -0.001
(0.007) (0.008) (0.005)
Individual and family controls Yes
Location fixed effect Yes
Number of observations 2,248 2,244 2,250
R2 0.041 0.043 0.043
Note: Sample restricted to males with the repeated grade information non-missing. The individual and
family controls include age, age squared, religion, father's and mother's schooling levels completed,
distance to local health outpost, rural and province dummies. Standard errors are robust to clustering at the
community level. Standard errors are in parentheses with significance at *** p<0.001, ** p< 0.05, * p<0.1
indicated.
Table 4: Regression of comprehensive exam test scores from elementary school on distance to school
Math Bahasa Science Social Studies
Distance to nearest secondary school 0.001 -0.002 -0.004 -0.005
(0.005) (0.005) (0.005) (0.005)
Individual and family controls Yes
Location fixed effect Yes
Number of observations 1,652 1,668 1,621 1,605
R2 0.134 0.187 0.124 0.115
Note: Sample includes everyone with non-missing test scores. Test scores recorded from score cards. The
individual and family controls include age, age squared, religion, father's and mother's schooling levels
completed, distance to local health outpost, rural and province dummies. Standard errors are robust to
clustering at the community level. Absolute t-statistics are in the parentheses with significance at
*** p<0.001, ** p< 0.05, * p<0.1 indicated.
Table 5: Upper school decision model – Average Marginal Derivatives
Coef/(se) Average Derivative/(se)
Dist to sec school in km -0.123*** -0.0300**
(0.040) (0.0127)
Age 0.077* 0.0130
(0.044) (0.0090)
Age Squared -0.096* -0.0162
(0.055) (0.0111)
Protestant 0.730*** 0.1382***
(0.264) (0.0484)
Catholic 1.211*** 0.2123**
(0.395) (0.0890)
Other religions 0.245 0.0552
(0.363) (0.0878)
Fathers education elementary 0.766*** 0.1342***
(0.127) (0.0217)
Father higher education 1.835*** 0.3769***
(0.178) (0.0320)
Mother education elementary 0.443*** 0.0852***
(0.123) (0.0230)
Mother higher education 1.851*** 0.3730***
(0.237) (0.0418)
Rural -0.593*** -0.1143***
(0.110) (0.0276)
Distance to health post in km -0.017 0.0000
(0.040) (0.0083)
Location fixed effect Yes
Test for joint significance of instruments: Chi-square/p-value 9.42/0.0021 22.26/0.051
Note: This table reports the coefficients and average marginal derivatives from a logit regression of upper
school attendance (a dummy variable that is equal to 1 if an individual has ever attended upper school and
equal to 0 if he has never attended upper secondary school but graduated from lower secondary school). Type
of location is controlled for using province dummy variables. A dummy variable for missing parental
education is included in the regressions but not reported in the table. The first column presents coefficients
of the logit where only distance to school is used as an IV. In the second column average derivatives are presented
and instruments include distance to secondary school and interactions with all the Xs. Reference categories
are Muslim, not educated. Standard errors are robust to clustering at the community level. Standard errors
are in parentheses with significance at *** p<0.001, ** p< 0.05, * p<0.1 indicated.
Table 6: Annualized OLS and IV estimates of the return to upper secondary schooling
OLS 2SLS IV estimate
Upper secondary (annualized) 0.090*** 0.129***
(0.005) (0.048)
Age 0.052*** 0.048**
(0.019) (0.020)
Age Squared -0.042* -0.037
(0.023) (0.025)
Protestant 0.182** 0.142
(0.084) (0.104)
Catholic 0.059 0.001
(0.189) (0.202)
Other religions 0.109 0.097
(0.126) (0.125)
Fathers education elementary 0.135*** 0.091
(0.048) (0.070)
Fathers education secondary or higher 0.215*** 0.101
(0.067) (0.153)
Mother's education elementary -0.052 -0.080
(0.048) (0.060)
Mother's education secondary or higher -0.031 -0.128
(0.078) (0.136)
Rural household 0.111** 0.152**
(0.045) (0.068)
Distance to health post in km -0.023 -0.020
(0.018) (0.017)
Location controls YES YES
Number of observations 2,608 2,608
Test for joint significance of instruments: F-stat/p-value 2.22/0.00
R2 0.210 0.190
Note: This table reports the coefficients for OLS and 2SLS IV for regression of log of hourly wages on
upper school attendance (a dummy variable that is equal to 1 if an individual has ever attended upper
school and equal to 0 if he has never attended upper secondary school but graduated from lower secondary
school), controlling for parental education, religion and location. First stage results report average marginal
derivative. Excluded instruments are distance to secondary school and interactions with parental education,
religion and age. Type of location is controlled using province dummies. A dummy variable for missing
parental education is included in the regressions but not reported in the table. Reference categories are
Muslim, not educated. Standard errors are robust to clustering at the community level. Standard errors are
in parentheses with significance at *** p<0.001, ** p< 0.05, * p<0.1 indicated.
Table 7: Testing for heterogeneity in returns: comparing adjacent sections of the semi-parametric MTE
Ranges of US for LATEj (0,0.1) (0.1,0.2) (0.2,0.3) (0.3,0.4) (0.4,0.5) (0.5,0.6) (0.6,0.7) (0.7,0.8) (0.8,0.9)
Ranges of US for LATEj+1 (0.1,0.2) (0.2,0.3) (0.3,0.4) (0.4,0.5) (0.5,0.6) (0.6,0.7) (0.7,0.8) (0.8,0.9) (0.9,1)
Difference in LATEs -0.078 -0.039 -0.013 -0.012 0.00 0.005 -0.014 -0.024 -0.04
p-value 0.00 0.00 0.00 0.00 0.597 0.759 0.005 0.00 0.00
Note: In order to compute the numbers in this table we construct groups of values of Us and average the
MTE within these groups, where each group j is bounded by the lowest and highest values of Us defined for interval j.
Then we compare the average MTE across adjacent groups and test whether the difference is equal to zero
using the bootstrap with 250 replications.
Table 8: Estimates of Average Returns to Upper Secondary Schooling with 95% confidence interval
Parameter Non parametric Estimate Normal selection model
ATT 0.269*** 0.201***
(.069, 0.47) (0.05,0.35)
ATE 0.123* 0.066
(-0.019, 0.266) (-0.029,0.163)
ATU 0.017 -0.029
(-0.236, 0.27) (-0.175,0.116)
MPRTE 0.142***
(.038, 0.246)
PRTE 0.142***
(.038, 0.247)
Note: This table presents estimates of various returns to upper secondary school attendance for the semi-
parametric and normal selection models: average treatment on the treated (ATT), average treatment effect
(ATE), treatment on the untreated (ATU), and marginal policy relevant treatment effect (MPRTE). Returns
to upper school are annualized to show returns for each additional year. Standard errors bootstrapped using
250 repetitions. 95% confidence intervals are in parentheses, with
significance at *** p<0.001, ** p< 0.05, * p<0.1 indicated.
Figure 1: Propensity score (P) support for each schooling group S = 0 and S = 1
[Figure: histograms of the propensity score by treatment status. Panels: less than upper secondary; upper secondary. Horizontal axis: propensity score (0 to 1); vertical axis: fraction.]
Note: P is estimated probability of going to upper secondary school. It is estimated from a logit regression
of upper school attendance on Xs, distance to school, interactions of X and distance to school (See Table
5).
Figure 2: Marginal treatment effect with 90% Confidence Interval – Semi-parametric regression estimates
[Figure: estimated MTE with lower and upper confidence bounds. Horizontal axis: V (0 to 1); vertical axis: MTE.]
Note: To estimate the E(Y1-Y0|X, Us) function we used a partial linear regression of log wages on X and K(P), with a
bandwidth of 0.2. X includes age, age squared, religion, parental education, rural and province dummy variables. 90%
confidence interval constructed using 250 bootstrap repetitions. Values of V on the x-axis.
Figure 3: MTE with 90% Confidence Interval – Parametric normal selection model estimates
[Figure: estimated MTE with upper and lower confidence bounds. Horizontal axis: V (0 to 1); vertical axis: MTE.]
Note: Parametric MTE is estimated using the standard switching regression model.
Appendix
Simulation-based approach for estimating average treatment effects in equations 7 and
12.
Step 1: Estimate MTE(x, v) as described in section 3.
Step 2: For each individual in the sample construct the corresponding P(Z) and take n
draws of V (recall that we assumed that V was independent of X and Z).
Since there are 2608 individuals in the sample this creates a simulated dataset of size
2608*n (we use n=1000). Evaluate the MTE(x,v) for each value of X and each value of
simulated V.
Step 3: In this simulated dataset both X and V are observed for all 2608*n observations. In
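addition, we have estimates of MTE(x,v) for each of them. Therefore it is trivial to
construct the following quantities:
$$\text{ATE} = E[\text{MTE}(X,V)], \quad \text{ATT} = E[\text{MTE}(X,V) \mid P > V], \quad \text{ATU} = E[\text{MTE}(X,V) \mid P \le V],$$
by respectively averaging the MTE for everyone in the simulated sample, for those who
have P>V, and for those with P≤V.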
Step 4: There is one parameter that remains to be estimated: the AMTE. The version of
the AMTE we use in this paper defines marginal individuals as those for whom:
Carneiro, Heckman and Vytlacil (2010) show that this is equivalent to estimating the
average return to schooling for those induced to enroll in upper secondary schooling
when one of the components of Z, say the intercept, changes by a marginal amount. This
is exactly what we do in our simulations: we change the intercept of the selection
equation marginally and we see which members of our simulated dataset change their
schooling decision. Finally, we average the MTE for this group.
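The four steps above can be sketched in a few lines. This is a stylized illustration rather than the code behind our estimates, and it rests on several labeled assumptions: V is drawn from a uniform distribution on [0, 1], the MTE depends on V only (that is, X is held fixed), the MTE curve and the selection indices are hypothetical, and the marginal change in the intercept is approximated by a small finite shift of 0.05.

```python
import math
import random


def logistic(index):
    """Logit link: maps a selection index to a propensity score P."""
    return 1.0 / (1.0 + math.exp(-index))


def hypothetical_mte(v):
    """Illustrative declining MTE; NOT the curve estimated in the paper."""
    return 0.5 - 0.6 * v


def mean(values):
    return sum(values) / len(values)


def simulate_parameters(indices, mte, n_draws=200, shift=0.05, seed=0):
    """Average the MTE over simulated draws of V for each individual."""
    rng = random.Random(seed)
    everyone, treated, untreated, marginal = [], [], [], []
    for index in indices:
        p = logistic(index)
        p_shifted = logistic(index + shift)  # intercept raised slightly
        for _ in range(n_draws):
            v = rng.random()  # assumption: V is uniform on [0, 1]
            effect = mte(v)
            everyone.append(effect)
            if p > v:
                treated.append(effect)
            else:
                untreated.append(effect)
                if v < p_shifted:
                    # Switches into schooling when the intercept changes.
                    marginal.append(effect)
    return {
        "ATE": mean(everyone),
        "ATT": mean(treated),
        "ATU": mean(untreated),
        "AMTE": mean(marginal),
    }


# Hypothetical selection indices for 201 individuals (illustration only).
indices = [-2.0 + 4.0 * i / 200 for i in range(201)]
results = simulate_parameters(indices, hypothetical_mte)
print({name: round(value, 3) for name, value in results.items()})
```

With a declining MTE, the simulated return for the treated exceeds the return for a random person, which in turn exceeds the return for the untreated, and the return for the marginal group lies between the two extremes.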
Table A1: OLS and IV estimates of the return to a year of schooling
OLS First stage IV estimate
Coef se Average Marginal Derivative se Coef se
Years of education 0.096*** 0.005 0.157*** 0.037
Age 0.058*** 0.017 0.027 0.078 0.055*** 0.018
Age Squared -0.047** 0.022 -0.062 0.098 -0.042* 0.022
Muslim
Protestant 0.084 0.082 2.033 0.381 -0.037 0.118
Catholic 0.003 0.152 2.196 0.856 -0.117 0.149
Other religions 0.055 0.121 0.987 0.754 0.002 0.128
Father uneducated
... elementary 0.062 0.048 1.759 0.228 -0.049 0.080
... secondary or higher 0.135** 0.067 3.627 0.312 -0.083 0.144
Mother uneducated
... elementary -0.086* 0.046 1.000 0.216 -0.147** 0.063
... secondary or higher -0.119 0.078 3.173 0.344 -0.316** 0.145
Rural household 0.149*** 0.044 -1.146 0.301 0.234*** 0.073
Distance to health post in km -0.020 0.015 0.037 0.084 -0.015 0.013
Location controls Yes
Dist to nearest sec school -0.298*** 0.102
Number of observations 2,608 2,608
Test for joint significance of instruments: F-Stat/p-value 3.62/0.000
R2 0.260 0.204
Note: This table reports the coefficients for OLS and 2SLS IV for regression of log of hourly wages on
years of schooling controlling for parental education, religion and location. First stage results report
average marginal derivative. Excluded instruments are distance to secondary school and interactions with
parental education, religion, age and distance to health center. Type of location is controlled using province
dummies. Dummy variable for missing parental education is included in the regressions but not reported in
the table. Standard errors are robust to clustering at the community level. Standard Errors are in parentheses
with significance at *** p<0.001, ** p< 0.05, * p<0.1 indicated.
Table A2: IV estimates of the return to a year of schooling without distance and X interactions
IV estimate First stage
coef se coef se
Years of education 0.144*** 0.053
Age 0.056*** 0.017 0.036 0.077
Age Squared -0.043* 0.022 -0.072 0.096
Muslim
Protestant -0.011 0.141 2.050*** 0.380
Catholic -0.091 0.164 2.229** 0.906
Other religions 0.014 0.128 0.839 0.778
Father uneducated
… elementary -0.025 0.102 1.800*** 0.231
… secondary or higher -0.036 0.198 3.525*** 0.316
… education missing -0.034 0.109 0.353 0.444
Mother uneducated
… elementary -0.134* 0.073 0.973*** 0.215
… secondary or higher -0.274 0.185 3.180*** 0.331
… education missing -0.183*** 0.063 0.367 0.301
Rural household 0.215** 0.091 -1.144*** 0.302
Distance to health post in km -0.016 0.013 0.007 0.082
W Java
N Sumatra 0.114 0.088 -0.615 0.500
W Sumatra 0.282** 0.112 -0.704 0.476
S Sumatra 0.137 0.125 0.667 0.476
Lampung -0.044 0.108 0.149 0.477
Jakarta -0.077 0.078 0.752* 0.421
C Java 0.051 0.091 -0.937* 0.498
Yogyakarta -0.303*** 0.100 1.128** 0.570
E Java -0.007 0.066 -0.300 0.411
Bali -0.197 0.159 1.027 0.946
W Nusa Tenggara -0.176 0.107 0.715 0.839
S Kalimantan 0.298*** 0.114 1.726*** 0.540
S Sulawesi 0.032 0.097 0.226 0.702
Dist to nearest sec school -0.244*** 0.072
Number of observations 2,608
Test for joint significance of instruments: F-stat/p-value 11.34/0.00
R2 0.206
Note: This table reports the coefficients for 2SLS IV for regression of log of hourly wages on years of
schooling, controlling for parental education, religion and location. Excluded instruments are distance to
secondary school. Type of location is controlled using province dummies. Dummy variable for missing
parental education is included in the regressions but not reported in the table. Reference categories are
Muslim, not educated. Standard errors are robust to clustering at the community level. Standard Errors are
in the parentheses with significance at *** p<0.001, ** p< 0.05, * p<0.1 indicated.
Table A3: IV estimates of the return to a year of schooling without regional dummies
IV estimate
coef se
Years of education 0.135*** 0.034
Age 0.059*** 0.018
Age Squared -0.046** 0.022
Muslim
Protestant -0.032 0.100
Catholic -0.153 0.154
Other religions -0.109 0.091
Father uneducated
… elementary -0.006 0.077
… secondary or higher -0.004 0.141
… education missing -0.002 0.107
Mother uneducated
… elementary -0.074 0.057
… secondary or higher -0.190 0.131
… education missing -0.156*** 0.060
Rural household 0.227*** 0.072
Distance to health post in km -0.008 0.014
Number of observations 2,608
Test for joint significance of instruments: F-stat/p-value 4.08/0.00
R2 0.22
Note: This table reports the coefficients for 2SLS IV for regression of log of hourly wages on years of
schooling, controlling for parental education, religion and location. Excluded instruments are distance to
secondary school. Type of location is controlled using province dummies. Dummy variable for missing
parental education is included in the regressions but not reported in the table. Reference categories are
Muslim, not educated. Standard errors are robust to clustering at the community level. Standard Errors are
in the parentheses with significance at *** p<0.001, ** p< 0.05, * p<0.1 indicated.
Table A4: Outcome equation: Partial linear regression estimates
Coefficients Standard Errors
Age 0.070* 0.042
Age Squared -0.076 0.051
Protestant -0.022 0.368
Catholic -0.816 0.634
Other religions 0.786* 0.406
Father with elementary education 0.042 0.192
… secondary or higher 0.103 0.675
… education missing 0.425 0.292
Mother with elementary education -0.144 0.156
… secondary or higher -1.570* 0.938
… education missing -0.173 0.170
Rural household 0.288* 0.161
Distance to health post in km -0.016 0.030
N Sumatra 0.333 0.214
W Sumatra 0.177 0.218
S Sumatra 0.233 0.309
Lampung 0.253 0.294
Jakarta -0.248 0.233
C Java 0.071 0.153
Yogyakarta -0.127 0.301
E Java -0.071 0.149
Bali -1.022** 0.478
W Nusa Tenggara -0.267 0.325
S Kalimantan 0.013 0.451
S Sulawesi -0.434 0.274
N Sumatra*P -0.550 0.465
S Sumatra*P -0.134 0.595
C Java*P -0.197 0.415
Yogyakarta*P -0.127 0.602
E Java*P 0.326 0.357
Bali*P 1.660* 0.898
W Nusa Tenggara*P 0.192 0.711
S Kalimantan*P 0.367 0.860
W Sumatra*P 0.465 0.535
Lampung*P -0.993 0.839
Jakarta*P 0.394 0.452
S Sulawesi*P 0.979 0.598
Age*P -0.069 0.097
Age Squared*P 0.124 0.121
Protestant*P 0.130 0.639
Catholic*P 1.171 0.931
Other religions*P -1.261* 0.703
Father with elementary*P 0.053 0.605
Father with secondary/higher*P 0.002 1.280
Father education missing*P -1.322 0.942
Mother with elementary*P 0.187 0.393
Mother with secondary/higher *P 1.977 1.433
Mother education missing*P 0.109 0.458
Rural *P -0.275 0.362
Distance to health post*P 0.037 0.082
Number of observations 2,608
R2 0.080
Note: *** p<0.01, ** p<0.05, * p<0.1. The table presents the coefficients on X and P*X from Robinson's
(1988) double residual semi-parametric regression estimator. The logit estimated pscore (P) enters the
equation nonlinearly according to a non-binding function and is estimated using a Gaussian kernel regression
with bandwidth equal to 0.2.
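
The double residual idea behind this table can be illustrated with a short sketch. It is a simplified illustration, not the estimation code used for the table: the function names are hypothetical, and the smoother is assumed to be a local constant (Nadaraya-Watson) regression with a Gaussian kernel, whereas the exact kernel regression variant used in the paper is not stated here. The regressor matrix `w` stands for the stacked X and P*X columns.

```python
import numpy as np

def kernel_regression(p, target, bandwidth=0.2):
    """Nadaraya-Watson estimate of E[target | P] at each sample point.

    Uses a Gaussian kernel; `target` may be a vector or a matrix
    whose rows line up with the entries of `p`.
    """
    u = (p[:, None] - p[None, :]) / bandwidth
    weights = np.exp(-0.5 * u ** 2)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ target

def robinson_double_residual(y, w, p, bandwidth=0.2):
    """Coefficients on w in the partially linear model y = w @ beta + K(p) + e.

    Step 1: remove the part of y and of each column of w explained by p.
    Step 2: run OLS of the y residuals on the w residuals.
    """
    y_res = y - kernel_regression(p, y, bandwidth)
    w_res = w - kernel_regression(p, w, bandwidth)
    beta, *_ = np.linalg.lstsq(w_res, y_res, rcond=None)
    return beta
```

For the specification in this table one would pass something like `w = np.column_stack([x, x * p[:, None]])`, so that the returned vector holds the coefficients on X followed by those on P*X.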
Table A5: Testing for equality of LATEs over different Intervals of MTE
Ranges of US for LATEj (0,0.1) (0.1,0.2) (0.2,0.3) (0.3,0.4) (0.4,0.5) (0.5,0.6) (0.6,0.7) (0.7,0.8) (0.8,0.9)
Ranges of US for LATEj+1 (0.1,0.2) (0.2,0.3) (0.3,0.4) (0.4,0.5) (0.5,0.6) (0.6,0.7) (0.7,0.8) (0.8,0.9) (0.9,1)
Difference in LATEs -0.078 -0.04 -0.014 -0.012 -0.010 -0.011 -0.012 -0.014 -0.014
p-value 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Note: In order to compute the numbers in this table we construct groups of values of Us and average the
MTE within these groups, where each interval j is bounded by the lowest and highest values of Us defined for it.
Then we compare the average MTE across adjacent groups and test whether the difference is equal to zero
using the bootstrap with 250 replications.
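
The comparison of adjacent intervals described in this note can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the MTE is assumed to be available on a grid of Us values for the point estimate and for each bootstrap replication, and the p-value is computed from a normal approximation with the bootstrap standard error, which is an assumption rather than the procedure documented in the paper.

```python
import numpy as np
from math import erfc, sqrt

def late_by_interval(mte_grid, u_grid, edges):
    """Average the MTE over each interval [edges[j], edges[j+1])."""
    group = np.digitize(u_grid, edges[1:-1])
    return np.array([mte_grid[group == j].mean() for j in range(len(edges) - 1)])

def adjacent_late_test(mte_point, mte_boot, u_grid, edges):
    """Differences LATE_{j+1} - LATE_j and p-values for H0: difference = 0.

    mte_point : (G,) point estimate of the MTE on u_grid
    mte_boot  : (B, G) bootstrap replications of the MTE on u_grid
    """
    point = late_by_interval(mte_point, u_grid, edges)
    diff = point[1:] - point[:-1]

    boot = np.array([late_by_interval(b, u_grid, edges) for b in mte_boot])
    boot_diff = boot[:, 1:] - boot[:, :-1]
    se = boot_diff.std(axis=0, ddof=1)

    # Two-sided p-value under a normal approximation (assumed here).
    p_values = np.array([erfc(abs(z) / sqrt(2.0)) for z in diff / se])
    return diff, p_values
```

With `edges = np.linspace(0, 1, 11)` the function returns nine differences, one for each pair of adjacent intervals as in the table.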
Table A6: Estimates of Average Returns to Upper Secondary Schooling with 95% confidence interval
Parameter Non parametric Estimate Normal selection model
ATT 0.217 0.198**
(-.1, 0.525) (-0.041,0.438)
ATE 0.13 0.065
(-0.06, 0.32) (-0.099, 0.231)
ATU 0.07 -0.028
(-0.227, 0.365) (-0.217, 0.160)
Note: This table presents estimates of various returns to upper secondary school attendance for the semi-
parametric and normal selection models: average treatment on the treated (ATT), average treatment effect
(ATE), treatment on the untreated (ATU), and marginal policy relevant treatment effect (MPRTE). Returns
to upper school are annualized to show returns for each additional year. Standard errors bootstrapped using
250 repetitions. 95% confidence interval in parentheses. Absolute t-statistics are in the parentheses with
significance at *** p<0.001, ** p< 0.05, * p<0.1 indicated.
Figure A1: Propensity score (P) support for each schooling group S = 0 and S = 1
[Figure: Propensity score by treatment status. Histograms in two panels, "less than upper secondary" and "upper secondary"; x-axis: propensity score from 0 to 1; y-axis: fraction.]
Note: P is estimated probability of going to upper secondary school. It is estimated from a logit regression
of upper school attendance on Xs, distance to school, interactions of X and distance to school (See Table
5).
Figure A2: Marginal treatment effect with 90% Confidence Interval - Semi-parametric regression estimates (without distance and Xs interactions)
[Figure: MTE with lower and upper confidence bounds plotted against V; x-axis from 0 to 0.91; y-axis from -0.6 to 1.]
Note: To estimate the E(Y1-Y0|X, Us) function we used a partial linear regression of log wages on X and K(P), with a
bandwidth of 0.2. X includes age, age squared, religion, parental education, rural and province dummy variables. 90%
confidence interval constructed using 250 bootstrap repetitions. Values of V on the x-axis.
Figure A3: MTE with 90% Confidence Interval - Parametric normal selection model estimates
[Figure: MTE with upper and lower confidence bounds; x-axis from 0.01 to 0.91; y-axis from -0.6 to 0.8.]