WPS5527
Policy Research Working Paper 5527
Shrinking Classroom Age Variance Raises
Student Achievement
Evidence from Developing Countries
Liang Choon Wang
The World Bank
Development Research Group
Human Development and Public Services Team
January 2011
Abstract
Large classroom variance of student age is prevalent in developing countries, where achievement tends to be
low. This paper investigates whether increased classroom age variance adversely affects mathematics and science
achievement. Using exogenous variation in the variance of student age in ability-mixing schools, the author finds
robust negative effects of classroom age variance on fourth graders' achievement in developing countries. A
simulation demonstrates that re-grouping students by age in the sample can improve math and science test scores
by roughly 0.1 standard deviations. According to past estimates for the United States, this effect size is similar to
that of raising expenditures per student by 26 percent.
This paper is a product of the Human Development and Public Services Team, Development Research Group. It is part
of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy
discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org.
The author may be contacted at lwang12@worldbank.org.
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Produced by the Research Support Team
Shrinking Classroom Age Variance Raises Student Achievement
Evidence from Developing Countries*
Liang Choon Wang+
JEL Codes: I20, O15.
* I thank Luis Benveniste, Michael Ewens, Deon Filmer, Ha Nguyen, and Adam Wagstaff for comments and
suggestions. The findings, interpretations, and conclusions expressed in this paper are those of the author and do not
necessarily represent the views of the World Bank, its Executive Directors, or the governments they represent.
+ Development Research Group's Human Development and Public Services Team, The World Bank, Mailstop MC
3-311, 1818 H St NW, Washington, DC 20433, USA; email: lwang12@worldbank.org.
1. Introduction
Developing countries often face shortages of infrastructure and teachers when striving for
universal primary school attendance. It is fairly common to see students of diverse ages attending
the same grade level in relatively poor countries (figure 1). Since students in poorer countries
tend to have lower achievement (figure 2), could the larger variance of student age be one of the
factors responsible for their relatively low achievement? As cognitive skills are linked to
economic growth (Hanushek and Woessmann 2008), earnings (Murnane, Willett, and Levy
1995), and productivity (Bishop 1989), understanding how variance of student age affects
achievement can be important for countries pursuing the Millennium Development Goals.
A large variance of student age within the classroom may pose challenges to teachers in
providing instruction appropriate to students with different academic readiness. These
classrooms may also be prone to discipline and behavior problems as students with different
mental maturities interact with each other. As a result, large classroom age variance may impede
learning. Nonetheless, an age heterogeneous classroom may provide a venue for younger
students to learn closely from older students and for older students to gain from helping and
studying with younger peers. Hence, the effect of classroom age heterogeneity on achievement
may be ambiguous.
The lack of resources and the non-enforcement or absence of compulsory schooling laws
in developing countries often lead to students of diverse ages beginning formal schooling at the
same time. Similarly, a successful promotion of universal primary education may also generate a
sudden influx of relatively old first graders into schools. Identifying whether increased classroom
age variance impedes student achievement will permit policy makers to respond with appropriate
strategies to ameliorate its adverse effect, if it exists. For example, principals may consider
grouping students into classrooms on the basis of age and assigning teachers most qualified to
teach the respective groups. If grouping students into different classrooms by age is not feasible,
educators may form students into different age groups within the classroom and tailor instruction
accordingly to minimize the adverse effect of classroom age heterogeneity. To the extent that test
scores influence future earnings and economic growth, appropriate policy responses altering
classroom age heterogeneity can have long-term economic consequences.
Past studies on the effects of age differences between students on their outcomes mostly
focus on how they relate to school entry age policies in developed countries (Bedard and Dhuey
2006; Black, Devereux, and Salvanes forthcoming; Cascio and Schanzenbach 2007; Datar 2006;
Elder and Lubotsky 2009). These studies argue that older students may outperform younger
students in the same grade level because: (a) older students have accumulated more human
capital prior to formal schooling as a result of their greater absolute age; and/or (b) the superior
physical and mental capabilities of older students due to their relative age advantage reinforce
their confidence over time and attract more school inputs at the expense of younger students.
Findings by Elder and Lubotsky (2009), for example, show that absolute age differences explain
the achievement gap better than relative age differences, implying that classroom age variance
may have little negative effect on achievement. Although these results are important for
evaluating school entry age policy in developed countries, they may not be informative for
policy in developing countries where the variance of student age within grade level is
significantly larger and the schooling decision is often complicated by the lack of resources and
accessible schools.
A recent experimental study on the effect of ability grouping on achievement in Kenyan
primary schools shows that reducing ability heterogeneity in classrooms generates an effective
teaching and learning environment that benefits all types of students (Duflo, Dupas, and Kremer
forthcoming). 1 Given the positive correlation between age and achievement, ability grouping
basically narrows the relative age differences within the classroom and Duflo et al.'s
(forthcoming) results imply that decreased classroom age variance should also lead to
achievement gain. Several factors may explain the different implications based on studies of the
effects of school starting age versus the study on the effect of ability grouping on achievement.
First, studies examining the effects of school starting age tend to focus on why school starting
age matters for outcomes and whether it is worthwhile to delay school entry, rather than on
identifying whether increased classroom variance of student age influences achievement. Second,
differences in the education systems studied and identification strategies employed by previous
studies may be responsible for the disparate findings. Third, samples drawn from places where
variance of student ages within grade level is small due to the enforcement of compulsory
attendance laws and entry age policy may not be suitable candidates for examining the effect of
classroom age heterogeneity, as the small variation in student age across a relatively small
number of classrooms or schools might yield imprecise estimates.
This paper uses exogenous variation in the classroom variance of student age in 14
developing countries to examine its effects on student achievement. To utilize variation in
classroom age variance that is arguably exogenous, I employ a school fixed effects estimator and
focus on the variation of student age within ability-mixing elementary schools sampled from two
waves of the Trends in International Mathematics and Science Study (TIMSS). Because ability-
1 Studies on the effects of ability grouping and tracking on student achievement using observational data from
developed countries show mixed findings. Examples include Betts and Shkolnik (1999), Figlio and Page (2002),
Manning and Pischke (2006), and Hanushek and Woessmann (2006).
mixing schools do not assign students into classrooms on the basis of student ability, differences
in student age across classrooms are likely orthogonal to other determinants of student
achievement. 2 Nonetheless, it is difficult to rule out implicit sorting on the basis of age. To
address potential selection bias, I simulate the average and standard deviation of classroom age
that each student would experience if the school assigned students into classrooms on the basis
of age. The variation in classroom age variables not explained by age sorting permits the
implementation of an instrumental variable strategy. More importantly, the large number of
classrooms sampled provides significant variation in classroom age variance to precisely identify
its effect on student outcomes. Although this paper focuses on developing countries, the findings
may also be applicable to schools in developed nations that have large classroom age variances
due to the practice of combination classrooms or the implementation of grade promotion and
retention policy.3
I find that greater classroom age variance lowers fourth graders' achievement in
mathematics and science. For every one month increase in the classroom standard deviation of
student age, average achievement falls by 0.03 standard deviations for both math and science.
However, classroom average age has an insignificant negative effect on achievement. The
negative effect of classroom age variance appears to (weakly) persist as the cohort of fourth
2 An ability-mixing school is one in which the school principal claimed that students were not assigned into different
classrooms based on their ability in mathematics and science.
3 This does not necessarily imply that the current results inform the effects of combination or multi-grade classrooms
on student achievement, since teachers instructing these classrooms usually receive special training and use
pedagogies different to those in traditional single-grade classrooms (see Benveniste and McEwan (2000) for a case
study on multi-grade schools in Colombia). Studies that examine the effects of multi-grade classrooms (e.g., see Sims
[2008]) often face the difficulty of identifying a causal relationship because multi-grade schools and
students attending such schools likely differ in many aspects that are not easy to control for.
graders entered the eighth grade. Similarly, there is weak evidence suggesting that boys and
students above the median age are less affected by classroom age variance. On the other hand,
increased classroom age variance is not associated with behavioral problems that
students encounter in schools. The findings imply that the adverse effect of classroom age
heterogeneity is likely restricted to academic achievement. Finally, a policy simulation
demonstrates that by switching from age mixing to age sorting students, achievement in both
mathematics and science can improve by roughly 0.1 standard deviations. In other words, age
grouping students may help an average school achieve the benefit associated with increasing
expenditures per student by roughly 26 percent according to Sander's (1999) estimate or that of
cutting class size by 2.5 students based on Angrist and Lavy's (1999) estimate. Given the low
administrative cost, age grouping appears to be a cost-effective method to raise average
achievement.
2. Identification Strategy and Econometric Specifications
2.1 The Effect of Classroom Age Heterogeneity on Achievement
Differences in the variance of student age across countries and schools are likely correlated with
other unobserved influences of achievement, such as income, the extent of urbanization, and
educational expenditures. In contrast, differences in the variance of student age across
classrooms within a school are more likely exogenous, if the school does not assign students
and/or teachers into classrooms based on the students' prior achievement. Because students are
essentially randomly assigned into classrooms in ability-mixing schools, it is unlikely that other
determinants of student achievement are systematically correlated with classroom age variance.4
As long as principals do not selectively assign teachers according to the classroom variance of
student age, a school fixed effects estimator will yield a consistent estimate of the effect of
classroom age variance on achievement.
The school fixed effects specification is:
$$
y_{icjk} = \alpha_{jk} + \beta\,\sigma_{cjk} + \gamma_1 a_{icjk} + \gamma_2 a_{icjk}^{2} + \delta\,\bar{a}_{cjk} + x_{icjk}\theta + u_{icjk} \tag{1}
$$
The dependent variable $y_{icjk}$ is the achievement of student i in classroom c of school j in country
k. $\alpha_{jk}$ is a set of school fixed effects, which ensures that I exploit the variation in age variance
across classrooms within schools. $\sigma_{cjk}$ is the standard deviation of student age (in years) for
classroom c, which measures the extent of classroom age heterogeneity. The coefficient of
interest, $\beta$, is expected to be negative if classroom age variance impedes achievement. $a_{icjk}$ is
student i's age measured in years. Because within-grade-level age range is large in developing
countries, the achievement is expressed as a quadratic function of age to account for potential
non-linearity of achievement in age. The average age of students in classroom c is $\bar{a}_{cjk}$. The
coefficient $\delta$ captures the "social" age effect of the classroom on student i. 5 Given the age
dispersion of a classroom, if being present in a relatively "old" classroom hurts the student's
achievement, then $\delta$ is expected to be negative. $x_{icjk}$ is a set of background characteristics and
4 For examples, see Kang (2007) and Wang (2010), which exploit the variation in peer quality across classrooms in
ability-mixing schools to study the effects of peers on student outcomes.
5 This follows Manski's (1993) definition of exogenous social effect, where the group's average predetermined
characteristics include student i's own predetermined characteristics. The measure differs slightly from a typical
peer effect study where the social effect excludes student i's own characteristics. Because own age is separately
included in the regression equation, the current approach only alters the interpretation of the social effect.
teacher characteristics specific to student i. If classroom age variance is exogenously determined,
the exclusion of x should have little effect on the estimate of the coefficient on the classroom standard deviation of age.
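The within-school logic of equation (1) can be illustrated with a minimal sketch. It assumes synthetic data and made-up coefficient values (nothing here comes from TIMSS), omits the background controls x, and does not compute standard errors. The function names `within_demean` and `school_fe_ols` are hypothetical helpers, not part of any library. Subtracting school means from every variable absorbs the school fixed effects, so the remaining variation comes only from differences across classrooms and students within a school.

```python
import numpy as np


def within_demean(values, groups):
    """Subtract each group's mean from its rows (the 'within' transformation)."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = values.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= values[mask].mean(axis=0)
    return out


def school_fe_ols(y, X, school):
    """OLS of y on X with school fixed effects absorbed by demeaning."""
    y_w = within_demean(y, school)
    X_w = within_demean(X, school)
    coef, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    return coef


# Synthetic data (made up for illustration; not TIMSS).
rng = np.random.default_rng(0)
n_schools, n_classes, n_students = 300, 2, 25
n_cls = n_schools * n_classes
classroom = np.repeat(np.arange(n_cls), n_students)
school = classroom // n_classes
spread = rng.uniform(0.3, 1.2, size=n_cls)  # classroom-specific age spread
age = 10.0 + rng.normal(size=classroom.size) * spread[classroom]

# Classroom mean and standard deviation of age (own age included).
mean_age = np.bincount(classroom, weights=age) / n_students
sd_age = np.sqrt(
    np.bincount(classroom, weights=(age - mean_age[classroom]) ** 2)
    / (n_students - 1)
)

# Assumed "true" coefficients on: classroom SD, age, age squared, classroom mean.
true_coef = np.array([-0.36, 0.50, -0.02, -0.05])
X = np.column_stack([sd_age[classroom], age, age ** 2, mean_age[classroom]])
school_effect = rng.normal(size=n_schools)
y = school_effect[school] + X @ true_coef + rng.normal(scale=0.5, size=age.size)

coef = school_fe_ols(y, X, school)
print("estimated coefficient on classroom SD of age:", round(coef[0], 3))
```

Because the simulated school effects are removed by demeaning, the estimate on the classroom standard deviation should land near the assumed value even though school effects are large relative to it.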
2.2 Instrumental Variables
Although principals may claim to mix students of all ability types in classrooms, they may still
implicitly sort students into classrooms by age or assign teachers of different quality based on the
ex-post age distributions of classrooms. For example, a principal may assign a more competent
teacher to a classroom that has slightly younger students or more diverse age groups.
The school fixed effects specification (1) will not adequately address this type of selection bias.
One way to correct for this form of selection bias is to exploit the variation in classroom
age variables that is unrelated to age sorting. I simulate the hypothetical age distribution of a
student's classroom under the assumption that students were sorted into classrooms by age.6 If
age sorting is present, then the actual age distribution of a student's classroom is expected to be
positively correlated with the simulated age distribution and other observables. The variation in
the actual classroom average of age or standard deviation of age not
explained by the simulated one and other observables is likely free of the effect of age sorting.
Specifically, I will generate instrumental variables (IV) for classroom average age and classroom
standard deviation of age using the regression residuals of the following regressions:
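$$
\bar{a}_{cjk} = \hat{\alpha}^{M}_{jk} + \hat{\pi}^{M}\,\bar{a}^{\mathrm{Sim}}_{cjk} + \hat{\lambda}^{M}\sigma_{cjk} + \hat{\gamma}^{M}_{1} a_{icjk} + \hat{\gamma}^{M}_{2} a_{icjk}^{2} + x_{icjk}\hat{\theta}^{M} + \hat{e}^{M}_{icjk} \tag{2}
$$
$$
\sigma_{cjk} = \hat{\alpha}^{SD}_{jk} + \hat{\pi}^{SD}\,\sigma^{\mathrm{Sim}}_{cjk} + \hat{\lambda}^{SD}\bar{a}_{cjk} + \hat{\gamma}^{SD}_{1} a_{icjk} + \hat{\gamma}^{SD}_{2} a_{icjk}^{2} + x_{icjk}\hat{\theta}^{SD} + \hat{e}^{SD}_{icjk} \tag{3}
$$
6 This is done by ranking students in the sample by age and then assigning them into classrooms in the sample.
$\sigma^{\mathrm{Sim}}_{cjk}$ is the simulated classroom standard deviation of age and $\bar{a}^{\mathrm{Sim}}_{cjk}$ is the simulated classroom
average age when students are perfectly sorted into classrooms by age. The coefficients $\hat{\pi}^{M}$ and
$\hat{\pi}^{SD}$ capture the relationship between age sorting and classroom average age and classroom
standard deviation of age, respectively. They indicate whether estimates of $\delta$ and $\beta$ (the coefficients on
classroom average age and on the classroom standard deviation of age) in equation
(1) may suffer from selection bias. By construction, the residuals $\hat{e}^{M}_{icjk}$ and $\hat{e}^{SD}_{icjk}$ obtained from the
regressions are orthogonal to age sorting and other observables, and can be used as instrumental
variables for $\bar{a}_{cjk}$ and $\sigma_{cjk}$ in equation (1).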
This instrumental variable strategy effectively exploits the variation in age distributions
across classrooms for students who are not perfectly matched to their classmates and teachers on
the basis of their age and other observables. The instrumental variables will be highly relevant if
the simulated age distribution and observed characteristics of students and teachers do not have
much explanatory power. This will be the case if the claim of ability-mixing corresponds to the
random assignment of students of different age groups and teachers of different quality into
classrooms. The validity of these instruments rests on the assumption that the extent of non-
random selection and other threats to identification are fully captured by the simulated age
sorting distributions and observables.
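The construction of the simulated age-sorting variables and the residual-based instruments can be sketched as follows. This is an illustration under stated assumptions, not the author's code. It uses made-up ages, keeps the observed classroom sizes within each school, fills classrooms youngest first in order of classroom identifier (an arbitrary choice), and residualizes on the simulated mean and own age only, whereas equations (2) and (3) also include the other classroom variable, age squared, and x. The names `simulate_age_sorting`, `demean_by`, and `residual_instrument` are hypothetical.

```python
import numpy as np


def simulate_age_sorting(age, school, classroom):
    """Simulated classroom mean and SD of age under perfect age sorting.

    Within each school, students are ranked by age and placed, youngest
    first, into classrooms of the observed sizes.
    """
    age = np.asarray(age, dtype=float)
    school = np.asarray(school)
    classroom = np.asarray(classroom)
    sim_mean = np.empty_like(age)
    sim_sd = np.empty_like(age)
    for s in np.unique(school):
        idx = np.flatnonzero(school == s)
        ranked = idx[np.argsort(age[idx], kind="stable")]
        _, sizes = np.unique(classroom[idx], return_counts=True)
        start = 0
        for size in sizes:
            members = ranked[start:start + size]
            sim_mean[members] = age[members].mean()
            sim_sd[members] = age[members].std(ddof=1) if size > 1 else 0.0
            start += size
    return sim_mean, sim_sd


def demean_by(values, groups):
    """Subtract group means (absorbs school fixed effects)."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = values.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= values[mask].mean(axis=0)
    return out


def residual_instrument(actual, controls, school):
    """Residual of `actual` after a school fixed effects regression on `controls`."""
    a_w = demean_by(actual, school)
    c_w = demean_by(controls, school)
    coef, *_ = np.linalg.lstsq(c_w, a_w, rcond=None)
    return a_w - c_w @ coef


# Toy data: two schools with two classrooms each (made-up ages in years).
age = np.array([9.0, 12.0, 10.0, 11.0, 9.5, 10.5, 13.0, 10.0])
school = np.array([0, 0, 0, 0, 1, 1, 1, 1])
classroom = np.array([0, 0, 1, 1, 2, 2, 3, 3])

sim_mean, sim_sd = simulate_age_sorting(age, school, classroom)
actual_mean = np.array([age[classroom == c].mean() for c in classroom])
controls = np.column_stack([sim_mean, age])
z_mean = residual_instrument(actual_mean, controls, school)
print(sim_mean)
```

The residual `z_mean` is, by construction, uncorrelated within schools with the simulated classroom mean and with own age, which is the property the text relies on when using such residuals as instruments.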
2.3 Behavior Outcomes and Attitudes
One concern about early school entry or mixing students of different ages is the
possibility that an age heterogeneous classroom increases the chances of younger students being
bullied, teased, or left out by older students. These behavioral issues may harm students'
self-esteem, which in turn affects their learning outcomes. To assess whether increased classroom
variance of student age may lead to increased behavioral problems, I replace the dependent
variable in regression equation (1) with a set of variables measuring whether students
experienced behavioral problems inflicted by others and whether they find school enjoyable. I
also estimate the effects of classroom age heterogeneity on these measures of behaviors and
attitudes separately for students who are younger than the median age of other fourth graders in
their countries.
2.4 Differential Effects of Classroom Age Heterogeneity on Achievement
Classroom age heterogeneity may have differential effects on students depending on their age
and gender. For example, young students may require more teacher attention and if teachers
tailor instruction to the median or average students, a diverse classroom may have a stronger
negative effect on their achievement than on older students' achievement. Similarly, the effect of
age heterogeneity on achievement may differ depending on gender. For instance, boys are
perhaps more likely to be distracted than are girls in a heterogeneous classroom. It is also possible
that girls are more vulnerable to age heterogeneity. To examine whether there exist differential
effects, I estimate equation (1) separately by student age group and gender.
3. Data
The data used are sourced from the Trends in International Mathematics and Science Study
(TIMSS) in 2003 and 2007. TIMSS provides student-level data on mathematics and science
achievement of fourth graders and eighth graders in a large number of countries.7 In addition to
internationally comparable standardized test scores, TIMSS also collected student surveys,
teacher surveys, and school surveys.
Because TIMSS asked principals whether they grouped students into classrooms on the
basis of ability in mathematics and science, and sampled at least two classrooms from numerous
schools in several countries, I am able to exploit classroom level variance of student age within
each ability-mixing school through a school fixed effects estimator.8 The focus on ability-mixing
schools is important, as principals in these schools are less likely to selectively assign teachers
according to the age distribution of students or the prior achievement of students across
classrooms. Similarly, as age and achievement are positively correlated, the classroom age
distributions of ability-mixing schools are also less likely to be correlated with students' prior
achievement and other determinants of achievement. Since eighth graders tend to attend various
mathematics and science classes with different levels of difficulty and with different sets of
peers, even within schools claiming not to group students based on ability, classroom age
heterogeneity measured in the eighth grade is more likely to be confounded with unobserved factors
and measurement error. Consequently, I focus primarily on fourth graders and only examine
eighth graders to assess whether the effect persists into the eighth grade. I include student data
7 However, countries were not consistently covered across different waves of TIMSS or across grade levels within
each wave of TIMSS. Furthermore, each student was only tested once and individual schools and students were not
followed over time in TIMSS, limiting the use of various estimation techniques.
8 An ability-mixing school is one in which students of different ability levels are mixed in a classroom. I only
include schools that do not group students by ability in math and science classes, based on principals' responses to the
survey.
from 14 countries classified as low and middle income countries by the World Bank in 2007 and
estimate the models using pooled data from TIMSS 2003 and 2007 in most of the analysis.9
Table 1 reports descriptive statistics of the variables used in this study. Since the focus of
this study is on classroom age variance, it is crucial that there is a considerable amount of
variation in the classroom standard deviation of student age and classroom age distributions are
fairly symmetric on average. Indeed, the classroom standard deviation of age has a standard
deviation of 0.21 years (or 2.5 months) and a range of 1.7 years. Furthermore, the average
classroom age skewness is only 0.24, indicating that the extent of asymmetry in classroom age
distributions is reasonably low. Hence, using classroom standard deviation of age as the measure
of age heterogeneity appears sensible. Nevertheless, alternative measures of age dispersion are
also considered to assess whether the estimates are sensitive.
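The two classroom-level summaries discussed above, the standard deviation and the skewness of student age, can be computed as in the short sketch below. The paper does not state which skewness formula it uses, so the moment coefficient based on population moments is an assumption here, as are the made-up ages and the hypothetical function name `classroom_age_summary`.

```python
from statistics import mean, stdev


def classroom_age_summary(ages):
    """Return (SD, skewness) of student ages in one classroom.

    SD uses the sample (n - 1) formula; skewness is the moment coefficient
    m3 / m2**1.5 based on population moments (an assumed definition).
    """
    n = len(ages)
    m = mean(ages)
    m2 = sum((a - m) ** 2 for a in ages) / n
    m3 = sum((a - m) ** 3 for a in ages) / n
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return stdev(ages), skew


# Made-up ages (years) for two classrooms.
symmetric = [9.0, 10.0, 11.0]
right_tail = [9.0, 9.0, 9.0, 13.0]  # one much older student

print(classroom_age_summary(symmetric))
print(classroom_age_summary(right_tail))
```

A classroom with a long right tail of older students has positive skewness, in which case the standard deviation alone would summarize age heterogeneity less well than it does for a roughly symmetric classroom.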
Table 2 verifies the claim that the classroom variance of student age is orthogonal to
other influences of achievement and that teachers are not systematically assigned to students
depending on classroom age heterogeneity. It reports estimates from regressions of each of a set of student
background characteristics and teacher characteristics (the dependent variables) on the
classroom standard deviation of student age and classroom average of student age, after
controlling for school fixed effects, own age, and own age squared. If classroom age variance is
exogenously determined, it should not be correlated with student background characteristics and
teacher characteristics. Except in one instance where parental nativity status is significant at the
10% level, all other predetermined student and teacher characteristics are not significantly
9 These countries are selected because of their development status and their samples of multiple classrooms per
school. Four of these countries are in TIMSS 2003 and thirteen in TIMSS 2007. Three of them classified as low or
middle income countries appear in both waves of TIMSS. I also estimate the models using TIMSS 2003 and TIMSS
2007 data separately. The results are presented in a robustness check section.
correlated with classroom standard deviation of student age. Thus, I am quite confident in the
identification strategy used to estimate the effects of classroom age variance.
However, Table 2 shows that classroom average age is significantly correlated with a few
observables at the 5% or 10% level. In particular, it appears that students in classrooms with
higher average age also tend to have less qualified teachers. This means that there may be some
extent of age sorting and non-random assignment of teachers, which may bias the school fixed
effects estimates. Thus, I will need to rely on the instrumental variable estimator to address
potential selection bias and to make causal inferences about the estimated effects of classroom
average age.
Table 3 presents evidence that there exists some form of age sorting. Column (1) shows
that simulated classroom average age (under age sorting) is positively correlated with the actual
classroom average age. This means that the school fixed effects estimate of classroom average
age effect will likely suffer from selection bias and highlights the need to implement IV
estimation. In contrast, column (2) shows that the actual classroom standard deviation of age
appears largely unaffected by age sorting, as it is not significantly correlated with the simulated
classroom standard deviation of age. Hence, I must rely on IV estimates to make a causal
interpretation of the estimated effects of classroom average age in the following section.
4. Empirical Results and Discussion
4.1 Classroom Age Heterogeneity and Achievement
The regression estimates for math achievement based on equation (1) and its variants are
presented in Table 4. Table 5 reports the estimates for science achievement.
Table 4 shows that classroom standard deviation of age has an adverse effect on
mathematics achievement. The estimated effect is significantly negative in all specifications.
Compared to the simple Ordinary Least Squares (OLS) specification, the country and school
fixed effects specifications tend to show a smaller negative effect of classroom age
heterogeneity, highlighting the bias inherent in a simple cross-country or cross-school analysis.
The school fixed effects specifications without (column 3) and with (column 4) student and
teacher characteristics yield similar estimates of the effect of classroom age heterogeneity,
supporting the claim that classroom age variance within a school is exogenous. 10 Since the
variation in classroom age variance is fairly exogenous, the instrumental variable (IV) estimate is
similar to the school fixed effects estimate. The preferred IV specification (5) shows that for
every one month increase in the classroom standard deviation of age, average math achievement
is expected to fall by 0.03 standard deviations. 11
Table 4 also shows that the estimated effect of classroom average age on achievement is
mostly negative, which means that being placed in an older classroom hurts a student's
achievement. However, the estimated effect is insignificant in the school fixed-effects
specification. The IV estimate shows that the correction for potential selection bias increases the
size of the negative effect of classroom average age, but the estimated effect remains statistically
insignificant. Note that as the instrumental variables have very high partial F statistics, the
estimates are unlikely to suffer from a weak instrumental variable problem.
10 Although the coefficients of student and teacher characteristics are not reported, student and teacher
characteristics are jointly significant in explaining achievement.
11 These numbers, measured per month of age, are obtained by dividing the coefficient estimates by 12.
Table 5 presents estimates for science achievement. Similar to the effect of classroom
standard deviation of age on math achievement, the effect on science achievement is also
significantly negative across various specifications. The school fixed effects specifications and
the preferred IV specification (5) yield similar point estimates of the effect of classroom age
heterogeneity. The preferred IV estimate indicates that for every one month increase in the
classroom standard deviation of age, average science achievement is expected to fall by 0.03
standard deviations. Similarly, the effect of classroom average age is estimated to be negative,
but statistically insignificant.
Because the standard deviation of the classroom standard deviation of student age is 0.21
years (or 2.5 months), the estimated effect size of a one standard deviation increase in classroom
age heterogeneity is roughly -0.075 standard deviations for mathematics and -0.081 standard
deviations for science. To gauge how large these effect sizes are, it is helpful to use past
estimates on the effects of class size reduction and increased school expenditures on achievement
to make a simple comparison (even though some of these estimates were debated). 12 For
example, Sander's (1999) instrumental variable estimate shows that for every one dollar increase
in the spending per student, math achievement in Illinois is predicted to increase by 0.0034
points. Converting this effect size to standard deviation of change in test score with respect to
percentage change in expenditures, the current estimates are roughly equivalent to an increase in
expenditures per student of 23 percent. Similarly, compared to Angrist and Lavy's (1999)
largest instrumental variable estimate of the effect of class size reduction on math achievement in
12 For examples, see Hanushek (1995, 1997), Krueger (2003), and Woessmann (2000) for the debates on the
effectiveness of school resources on student achievement.
Israel, the effect of a one standard deviation decrease in classroom standard deviation of student
age is almost as large as cutting class size by 2 students.
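The effect sizes quoted above follow from simple arithmetic on the reported figures. The worked lines below only restate numbers given in the text and assume that the per-month estimate of 0.03 is a rounded value, which would explain why the science figure implies a slightly larger per-month effect.

```latex
\begin{align*}
0.21 \text{ years} \times 12 &\approx 2.5 \text{ months},\\
\text{math: } -0.03 \times 2.5 &= -0.075 \text{ standard deviations},\\
\text{science: } -0.081 / 2.5 &\approx -0.032 \approx -0.03 \text{ standard deviations per month}.
\end{align*}
```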
In sum, the results show a fairly robust and large adverse effect of classroom age variance
on student achievement. In contrast, the effect of classroom average age is negative but
statistically insignificant, implying that grouping students by age will not significantly benefit
younger students at the expense of older students. The results imply that grouping students by
age can significantly improve test scores without redistributing (much) achievement gain from
older students to younger students.
4.2 Effects on Behavior and Attitude toward Schooling
Having classmates of various ages may increase the incidence of students, especially young
ones, being bullied and shunned by classmates, as well as make schooling experience less
enjoyable.
Columns (1) to (3) of Table 6 report the estimated effect of classroom age heterogeneity
on the likelihood of an average student reporting being bullied, left out of activities, and not liking
school, respectively. Columns (4) to (6) report the estimates for the sample of students at the
median age or younger.13 The top panel presents school fixed effects estimates, and the bottom
panel presents IV estimates. As the preferred IV estimates are statistically insignificant, there is
little evidence suggesting that greater classroom age heterogeneity increases the chances that
students reported being bullied, left out of activities in school, or not liking school. Together with
the estimates reported in the previous section, the results imply that the negative effect of
13
Median age is defined in accordance with the grade-level age distribution of the fourth grader's school. The
estimates are not sensitive to using the grade-level age distribution of the fourth grader's country.
16
classroom age variance is more likely academic specific. Nonetheless, these findings should be
interpreted with caution because the surveys asked students about their experience in school, but
not in class, and it is possible that school level measures are noisily related to classroom level
measures.
4.3 Who Loses More? Effects by Age and Gender
One concern about mixing students with large age differences in the same classroom is its potential
adverse effect on younger students. Table 7 reports the differential effects of classroom age
heterogeneity on achievement of students who are above the median age and students who are at
the median age or below. The point estimates reveal that younger students tend to be more
affected by greater classroom age variance. The differences between old and young students are
larger in science than in math, and the school fixed effects estimates and IV estimates are similar.
For a one month increase in the classroom age standard deviation, the differential effect on the
change in math achievement between young and old students is at most 0.004 standard
deviations. Even though the size of the differential effect is minute, the overall pattern is
consistent with the view that younger students are disadvantaged more than older students in
age-diverse classrooms. Similarly, the estimated effects of classroom average age show that
young students are more hurt when placed in relatively older classrooms, even though both the
school fixed effects and IV estimates are statistically insignificant.
Table 8 presents estimates for boys and girls separately. The coefficient estimates of
classroom standard deviation of age are more negative for girls than for boys, especially in
science. However, similar to the differential effects by age, the differential effects by gender are
also small in magnitude. For a one month increase in the classroom age standard deviation, the
differential effect on the change in math achievement between boys and girls is at most 0.003
standard deviations. The estimated effects of classroom average age on achievement show that
boys appear to be more disadvantaged by being placed with relatively old classmates, but the
estimates are statistically insignificant.
To summarize, the estimates presented in this section show weak evidence that younger
students and girls are more disadvantaged by increased classroom age variance.
4.4 Robustness Checks
4.4.1 Sensitivity to Functional Form of Age
In the analysis presented above, all regressions included age and age squared as explanatory
variables. Table 9 presents estimates from regressions using different orders of age polynomial as
regressors. The estimated effects of classroom standard deviation of age and classroom average
age are fairly insensitive to the assumed functional form of the relationship between
achievement and student's age. The effect of age on achievement is negative in the linear
specification, but increasing at a decreasing rate in the quadratic specification. Since the average
age in the sample is roughly 10.7 years, the quadratic specification is more consistent with
previous literature on the positive relationship between achievement and age. The cubic
specification yields an estimated effect of classroom age heterogeneity similar to that of the quadratic
specification. Hence, the quadratic functional form chosen for the main analysis appears
reasonable.
4.4.2 Alternative Measures of Classroom Age Dispersion
The measure of classroom age heterogeneity used so far has been the standard deviation of
student age. Table 10 shows that the adverse effect of classroom age heterogeneity is robust to
alternative measures of dispersion. Columns (1) and (3) present estimates based on the difference
between the 75th and 25th percentiles of the classroom age distribution, and columns (2) and (4)
report estimates based on the age range (i.e., maximum minus minimum). Since the standard deviation of
the 75th-25th percentile age difference is 0.256 (Table 1), the effect size is roughly 0.07 standard
deviations for both mathematics and science. Similarly, as the standard deviation of age range is
1.092, the effect size is also roughly 0.06 standard deviations. These estimated effect sizes are
similar to the 0.08 obtained using the standard deviation of age as the measure of classroom age
dispersion.
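As an illustration of the three dispersion measures compared above, the following sketch computes them for a single classroom. It is not the author's code: the function name is hypothetical, and the choices of the sample (n-1) standard deviation and the "inclusive" percentile rule are assumptions, since the text does not state which conventions were used.

```python
import statistics


def dispersion_measures(ages):
    """Return three dispersion measures (in years) for one classroom's student ages.

    Assumptions for illustration: sample standard deviation and the
    'inclusive' quantile method of the Python statistics module.
    """
    q1, _, q3 = statistics.quantiles(ages, n=4, method="inclusive")
    return {
        "sd": statistics.stdev(ages),            # standard deviation of age
        "p75_p25": q3 - q1,                      # 75th minus 25th percentile
        "range": max(ages) - min(ages),          # maximum minus minimum
    }


# Hypothetical classroom of five students.
example = dispersion_measures([9.5, 10.0, 10.5, 11.0, 11.5])
print(example)
```

A single very old or very young student moves the range far more than the percentile difference, which is one reason to check that the estimates do not hinge on one particular measure.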
4.4.3 Estimating Using TIMSS 2003 and TIMSS 2007 Separately
The estimates presented above are all based on pooled TIMSS data. For pooling to be
sensible, the point estimates should be similar across both waves of TIMSS. Table 11 shows that
the estimated effects of classroom age heterogeneity remain significantly negative despite the
reduction in sample size. Furthermore, even though only three countries overlap in TIMSS 2003
and TIMSS 2007, the estimated effects of classroom age heterogeneity are not too different
between the two waves of TIMSS. Thus, the estimates are not sensitive to pooling both waves of
TIMSS.
4.5 Does the Effect Persist in the Eighth Grade?
Estimates presented in section 4.3 show weak evidence that older students are less adversely
affected by classroom age variance than younger students in the fourth grade. However, does
classroom age variance continue to impede achievement as students enter higher grade levels?
Since the cohort of fourth graders in TIMSS 2003 attended grade eight in 2007, it is possible to
examine whether eighth-grade classroom age variance continues to affect their achievement.
However, because individuals were not followed over time in TIMSS, I can only compare the
performance of the same grade cohort in countries sampled in both waves of TIMSS, leaving
a sample from three countries.
There are a number of limitations when the same grade cohort is compared in the two
waves of TIMSS. First, the same set of students was not followed over time, and sampling
differences across the two waves of TIMSS make the comparison less reliable. Second, in addition
to sampling variation, individuals most negatively affected by classroom age heterogeneity might
repeat a grade or drop out of school, and hence not be observed in the eighth-grade sample, leading to
potential selection bias against the finding of a negative effect. Third, students who attended
ability-mixing schools in the fourth grade may switch to ability-grouping schools in the eighth
grade, making the comparison less meaningful. Fourth, the structure of eighth-grade courses may
also introduce error in the measure of classroom age heterogeneity, as eighth graders are more
likely to take different courses that vary in difficulty with different sets of peers. Despite these
shortcomings, comparing the effects of classroom age variance on achievement for the same
cohort of students within ability-mixing schools is the only option to gauge whether classroom
age variance continues to impede achievement as students advance to higher grade levels.
Table 12 compares the estimates for fourth graders and eighth graders using data from
Armenia, Latvia, and Lithuania. First, note that although the smaller sample size greatly reduces
the statistical significance of the estimates, the estimated effect of classroom standard deviation
of age on fourth graders' math achievement remains statistically significant at the 10% level
(column 1) and is similar in magnitude to that using the full sample. Columns (3) and (4)
show that the negative effect is much smaller in the eighth grade than in the fourth grade.
Specifically, the negative effect is not statistically significant for science and only significant at
the 15% level for mathematics. Based on the IV estimates, the reduction in the effect size is
roughly 18% for mathematics and 38% for science. Given the many caveats highlighted above
and the smaller sample size, it appears that the negative effect of classroom age heterogeneity on
achievement, especially for mathematics, does persist to some extent.
4.6 Policy Simulation: Age Grouping and Achievement Gain
Given the significant negative effects of classroom age variance on achievement, school
principals may improve student achievement by reducing the extent of age variance within the
classroom through grouping students by age. How large an achievement gain is
feasible if all ability-mixing schools switch to age grouping, given their age distributions and the
point estimates presented above? Does age grouping lead to greater or less inequality between
students of different ages?
4.6.1 Mean Effects of Re-assignment
The preferred point estimates for mathematics and science reported respectively in specification
(5) of Table 4 and Table 5 can be used to simulate the achievement gains attainable by
reassigning students into classrooms on the basis of their age.14 First, I construct the classroom
standard deviation of age and the classroom average of age for each classroom by age grouping
students. Note that by re-grouping students, classroom standard deviation of age shrinks for all
students, but students assigned into a younger classroom will have a lower classroom average
age than those assigned into an older classroom. The reduction in classroom standard deviation
of age leads to achievement gain, but regrouping lowers the achievement of older students, and
increases that of younger students, because the coefficient of classroom average of student age is
negative (even though it is statistically insignificant). Second, the differences in the classroom
standard deviation of age and classroom average of age between the original age mixing scenario
and the new age grouping scenario are then multiplied by the respective point estimates to derive
the net achievement gains for mathematics and science. Finally, averages by country are reported
in Table 13.
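The re-assignment steps described above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions rather than the author's implementation: students within a school are sorted by age and split into the same number of classrooms of (nearly) equal size, the sample standard deviation is used, TIMSS sampling weights are ignored, and the coefficients are the mathematics estimates from specification (5) of Table 4. The function name `predicted_gains` and the example school are hypothetical.

```python
import statistics

# Mathematics point estimates, specification (5) of Table 4.
BETA_SD = -0.366    # coefficient on classroom S.D. of age
BETA_AVG = -0.077   # coefficient on classroom average age


def predicted_gains(classrooms, beta_sd=BETA_SD, beta_avg=BETA_AVG):
    """Predicted gain per student when one school's classrooms are re-grouped by age.

    `classrooms` is a list of lists of student ages (years), one list per
    original classroom; each classroom needs at least two students.
    Gains are returned in order of student age, youngest first.
    """
    # Attach each student's original classroom S.D. and average age.
    students = []
    for ages in classrooms:
        old_sd, old_avg = statistics.stdev(ages), statistics.mean(ages)
        students.extend((age, old_sd, old_avg) for age in ages)
    students.sort(key=lambda s: s[0])  # perfect sorting by age

    # Split the sorted students into the same number of near-equal classrooms.
    base, extra = divmod(len(students), len(classrooms))
    gains, start = [], 0
    for c in range(len(classrooms)):
        size = base + (1 if c < extra else 0)
        group = students[start:start + size]
        start += size
        new_ages = [s[0] for s in group]
        new_sd, new_avg = statistics.stdev(new_ages), statistics.mean(new_ages)
        for _, old_sd, old_avg in group:
            gains.append(beta_sd * (new_sd - old_sd) + beta_avg * (new_avg - old_avg))
    return gains


# Hypothetical school with two identical age-mixed classrooms.
school = [[9.0, 10.0, 11.0, 12.0], [9.0, 10.0, 11.0, 12.0]]
gains = predicted_gains(school)
print(round(statistics.mean(gains), 3), round(gains[0], 3), round(gains[-1], 3))
```

In this toy school every student gains from the lower classroom standard deviation, and the youngest students gain slightly more than the oldest because the coefficient on classroom average age is negative.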
Column (1) of Table 13 shows the standard deviation of age at the grade level for each
country. Column (2) reports the average differences of classroom standard deviation between age
grouping classrooms and age mixing classrooms. Column (3) reports the average differences of
classroom average age between age grouping classrooms and age mixing classrooms. Columns
(4) and (5) present the average predicted achievement gain for mathematics and science,
respectively. The bottom row reports the averages for the sample of countries.
14 Estimates reported in Tables 4 and 5 are used because the differential effects of classroom age heterogeneity
(between boys and girls or between young and old students) are small in magnitude. The simulation is conducted using
the TIMSS 2007 sample.
Table 13 shows that countries with the largest within-grade standard deviation of age also
tend to realize the greatest reduction in classroom standard deviation of age if schools were to
switch from age mixing to age grouping. The reduction in classroom standard deviation of age
ranges from 0.14 years in Russia to 0.63 years in Morocco. The corresponding achievement gain
in mathematics ranges from 0.05 standard deviations in Russia to 0.21 standard deviations in
Morocco. The achievement gain is slightly greater for science, at 0.06 and 0.23 standard
deviations in Russia and Morocco, respectively. Overall, the average reduction in classroom age
standard deviation is 0.26 years, and the average achievement gain is 0.09 standard deviations
for mathematics and 0.10 standard deviations for science. These gains from reassignment are
roughly equivalent to raising expenditures per student by 26 percent in accordance with Sander's
(1999) estimates. Similarly, using Angrist and Lavy's (1999) estimated effect of class size
reduction on achievement as a comparison, these predictions suggest that regrouping students
can bring about an effect equivalent to cutting average class size by approximately 2.5 students.
Figures 3 and 4 plot the predicted achievement gain in mathematics against national
incomes and achievement, respectively. Figure 3 illustrates that poorer countries tend to gain the
most by switching from age mixing to age grouping students. In particular, countries that have
lower average achievement, such as El Salvador and Morocco, are also the ones that will benefit
the most through age grouping students (figure 4). Given that grouping students by age involves
little administrative cost, it is an attractive option to raise achievement, especially for countries
with large age variance, low achievement, and low incomes.
4.6.2 Distributional Effects of Re-assignment
A policy change may be difficult to justify if some students will be significantly disadvantaged
by the change. Since re-assignment will lower the classroom average age for relatively young
students and raise it for relatively old students, the former will gain while the latter will lose
through the (insignificant) negative effect of classroom average age. Similarly, because
classroom age heterogeneity has a slightly more negative effect on students below the median
age, age grouping may benefit them more. Although the gain from reduced classroom age
heterogeneity is likely greater than any loss from having older classmates for students above the
median age, it can be useful to compare achievement gains of the two groups of students to
evaluate the distributional effect of re-assignment. I simulate the achievement gain for students
above the median age and for students at the median age or younger based on estimates reported
in Table 7.
Table 14 summarizes the simulation results by student age group and country. Except in
Tunisia, where re-assignment lowers math achievement of students above the median age by
0.006 standard deviations, all other countries experience achievement gains in math and science
for all students through re-assignment. In addition, the gains are greater for students at the
median age or below than for students above the median age. Therefore, the simulation shows
that age grouping not only improves average achievement, but also reduces achievement
differences between older and younger students.
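The distributional comparison can be illustrated with the age-specific IV estimates for mathematics reported in Table 7. The sketch below is hypothetical in its inputs: the 0.26-year fall in classroom standard deviation is the sample average reported above, while the 0.30-year shift in classroom average age (upward for older students, downward for younger students) is an assumed value chosen only to show the mechanics.

```python
# IV estimates for mathematics by age group (Table 7):
# (coefficient on classroom S.D. of age, coefficient on classroom average age).
TABLE7_IV_MATH = {
    "old": (-0.338, -0.046),    # students above the median age
    "young": (-0.353, -0.089),  # students at the median age or below
}


def group_gain(group, d_sd, d_avg):
    """Predicted math gain for an age group given changes in classroom S.D. and average age."""
    b_sd, b_avg = TABLE7_IV_MATH[group]
    return b_sd * d_sd + b_avg * d_avg


# Assumed changes after re-grouping: S.D. falls by 0.26 years for everyone,
# average age rises by 0.30 years for old students and falls by 0.30 for young ones.
old_gain = group_gain("old", d_sd=-0.26, d_avg=0.30)
young_gain = group_gain("young", d_sd=-0.26, d_avg=-0.30)
print(round(old_gain, 3), round(young_gain, 3))
```

With these illustrative inputs both groups gain, and the younger group gains more, which is the pattern Table 14 reports for most countries.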
5. Conclusions
This paper presents evidence that increased classroom age variance is detrimental to student
achievement in mathematics and science. Using arguably exogenous variation in classroom
variance of age within ability-mixing schools in 14 developing countries, I show that a one-
month increase in the classroom standard deviation of student age will lead to approximately a
0.03 standard deviation reduction in fourth graders' math and science achievement. However, the
effect of classroom average age is statistically insignificant. There is weak evidence suggesting
that younger students and girls are more negatively affected by increased classroom age
heterogeneity. Classroom age variance also appears to impede student achievement as students
progress into the eighth grade. Although classroom age variance hurts academic achievement, it
does not significantly increase the incidence of behavioral problems or make the schooling
experience less enjoyable for students.
The robust negative effect of classroom age variance and the insignificant effect of
classroom average age on student achievement suggest that grouping students by age can lead to
test score improvements. A simulation shows that by switching from age mixing to
age grouping, schools can reduce classroom standard deviation of age by 0.26 years on
average. The corresponding average achievement gain is 0.09 standard deviations in math score
and 0.10 standard deviations in science score. According to Sander's (1999) estimates using U.S.
data, such effect sizes are similar to the effect of raising expenditures per student by 26 percent.
Furthermore, gains are experienced by students of all age groups, but more so for students at the
median age or younger, leading to smaller achievement differences between older and younger
students. Countries that have larger within-grade-level variance of student age and lower average
achievement are the ones that tend to gain the most from age grouping.
Given the low administrative cost, age grouping shows promise as a method to improve
learning outcomes. Nevertheless, since the estimates are based on observational data and it is not
possible to completely rule out the presence of unobserved influences which are correlated with
classroom age distribution, readers should be cautious in attaching a causal interpretation to the
estimates. It will certainly improve the confidence in recommending age grouping to policy
makers in countries with large classroom age variances if randomized controlled experiments can
be conducted to ascertain whether findings reported here stand up to scrutiny.
References
Angrist, Joshua D. and Victor Lavy. (1999). "Using Maimonides' Rule to Estimate the Effect of
Class Size on Scholastic Achievement." Quarterly Journal of Economics, vol.114 (2), pp. 533-
575.
Bedard, Kelly and Elizabeth Dhuey. (2006). "The Persistence of Early Childhood Maturity:
International Evidence of Long-run Age Effects." Quarterly Journal of Economics, vol.121, pp.
1437-1472.
Benveniste, Luis A. and Patrick J. McEwan. (2000). "Constraints to Implementing Educational
Innovations: The Case of Multigrade Schools," International Review of Education, vol.46 (1-2),
pp. 31-48.
Betts, Julian and Jaime Shkolnik. (1999). "Key Difficulties in Identifying the Effects of Ability
Grouping on Student Achievement." Economics of Education Review, vol.19 (1), pp. 243-266.
Bishop, John. (1989). "Is the Test Score Decline Responsible for the Productivity Growth
Decline?" American Economic Review, vol.79 (1), pp. 178-197.
Black, Sandra E., Paul J. Devereux, and Kjell G. Salvanes. (forthcoming). "Too Young to Leave
the Nest? The Effects of School Starting Age." Review of Economics and Statistics.
Cascio, Elizabeth U. and Diane Schanzenbach. (2007). "First in the Class? Age and the
Education Production Function." NBER Working Paper No. 13663.
Datar, Ashlesha. (2006). "Does Delaying Kindergarten Entrance Give Children a Head Start?"
Economics of Education Review, vol.25, pp. 43-62.
Duflo, Esther, Pascaline Dupas and Michael Kremer. (forthcoming). "Peer Effects and the
Impact of Tracking: Evidence from a Randomized Evaluation in Kenya." American Economic
Review.
Elder, Todd E. and Darren H. Lubotsky. (2009). "Kindergarten Entrance Age and Children's
Achievement: Impacts of State Policies, Family Background, and Peers." Journal of Human
Resources, vol.44 (3), pp. 641-683.
Figlio, David and Marianne Page. (2002). "School Choice and the Distributional Effects of
Ability Tracking: Does Separation Increase Inequality?" Journal of Urban Economics, vol.51,
pp. 497-514.
Hanushek, Eric A. (1995). "Interpreting Recent Research on Schooling in Developing
Countries." World Bank Research Observer, vol.10 (2), pp. 227-246.
Hanushek, Eric A. (1997). "Assessing the Effects of School Resources on Student Performance:
An Update." Educational Evaluation and Policy Analysis, vol.19 (2), pp. 141-164.
Hanushek, Eric A. and Ludger Woessmann. (2006). "Does Educational Tracking Affect
Performance and Inequality? Differences-in-differences Evidence Across Countries." Economic
Journal, vol.116 (510), pp. C63-C76.
Hanushek, Eric A. and Ludger Woessmann. (2008). "The Role of Cognitive Skills in Economic
Development." Journal of Economic Literature, vol.46 (3), pp. 607-668.
Kang, Changhui. (2007). "Classroom Peer Effects and Academic Achievement: Quasi-
randomization Evidence from South Korea." Journal of Urban Economics, vol.61, pp. 458-495.
Krueger, Alan. (2003). "Economic Considerations and Class Size." Economic Journal, vol.113,
pp. F34-F63.
Manski, Charles. (1993). "Identification of Endogenous Social Effects: The Reflection Problem."
Review of Economic Studies, vol.60 (3), pp. 531-542.
Manning, Alan and Jorn-Steffen Pischke. (2006). "Comprehensive Versus Selective Schooling
in England & Wales: What Do We Know?" Centre for the Economics of Education (LSE)
Working Paper No. CEEDP006.
Murnane, Richard J., John B. Willett, and Frank Levy. (1995). "The Growing Importance of
Cognitive Skills in Wage Determination." Review of Economics and Statistics, vol.77 (2), pp.
251-266.
Sander, William. (1999). "Endogenous Expenditures and Student Achievement." Economics
Letters, vol.64 (2), pp. 223-231.
Sims, David. (2008). "A Strategic Response to Class Size Reduction: Combination Classes and
Student Achievement in California." Journal of Policy Analysis and Management, vol.27, pp.
457-478.
Wang, Liang Choon. (2010). Three Essays in Labor Economics. Ph.D. Dissertation, University
of California, San Diego.
Woessmann, Ludger. (2001). "New Evidence on the Missing Resource-Performance Link in
Education." Kiel Working Paper No. 1051, Kiel Institute of World Economics.
Figure 1: Variance of Fourth Graders' Age against GDP per Capita (PPP)
[Scatter plot omitted. Vertical axis: Variance of Age within Grade. Horizontal axis: GDP Per Capita (PPP), 2003. Legend: High Income; Low and Middle Income; Fitted values.]
Notes: Data sourced from Trends in International Mathematics and Science Study (TIMSS) 2007 and the
World Development Indicators. 37 economies are included in the sample. The variance of age for United
Kingdom is calculated based on the weighted average of the figures of England and Scotland. The
variance of age for Canada is calculated by the weighted average of Alberta, British Columbia, Quebec,
and Toronto provinces. GDP per capita (PPP) in 2003 is selected so that income matches the time that the
fourth grade cohort in TIMSS 2007 commenced primary education.
Figure 2: Variance of Age against Average Test Score of Fourth Graders
[Scatter plot omitted. Vertical axis: Variance of Age within Grade. Horizontal axis: Average Standardized Score. Legend: High Income; Low and Middle Income; Fitted values.]
Notes: Data sourced from TIMSS 2007 and the World Development Indicators. 37 economies are
included in the sample. The test score for United Kingdom is calculated based on the weighted average of
the figures of England and Scotland. The test score for Canada is calculated by the weighted average of
Alberta, British Columbia, Quebec, and Toronto provinces. GDP per capita (PPP) in 2003 is selected so
that income matches the time that the fourth grade cohort in TIMSS 2007 commenced primary education.
Average standardized score is the average of the standardized international scale mathematics and science
scores.
Figure 3: Predicted Gain in Math Achievement against GDP per Capita (PPP)
[Scatter plot omitted. Vertical axis: Predicted Gain in Math. Horizontal axis: GDP per capita (PPP), 2003. Legend: Low and Middle Income; Fitted values.]
Notes: Author's own calculation using TIMSS 2007 and the World Development Indicators. 13
economies are included in the sample. GDP per capita (PPP) in 2003 is selected so that income matches
the time that the fourth grade cohort in TIMSS 2007 commenced primary education.
Figure 4: Predicted Gain against Actual Average Achievement in Mathematics
[Scatter plot omitted. Vertical axis: Predicted Gain in Math. Horizontal axis: Average Standardized Math Score. Legend: Low and Middle Income; Fitted values.]
Notes: Author's own calculation using TIMSS 2007. 13 economies are included in the sample. Average
standardized math score is the actual scale math score standardized internationally.
Table 1: Descriptive Statistics
Variables Obs. Weighted Mean Mean Std. Dev. Min Max
Student Characteristics
Math 22841 0.101 0.052 0.873 -3.88 2.56
Science 22841 0.013 -0.015 0.827 -3.82 2.57
Classroom Age SD (years) 22841 0.460 0.470 0.210 0.06 1.74
Classroom Age 75th-25th Percentile (years) 22841 0.582 0.585 0.256 0.08 3.25
Classroom Age Range (years) 22841 1.805 1.906 1.092 0.08 7.08
Classroom Age Skewness (years) 22841 0.236 0.241 0.793 -3.06 3.11
Classroom Ave. Age (years) 22841 10.68 10.64 0.397 9.63 12.36
Age (years) 22841 10.68 10.64 0.641 6.17 15.00
Bullied 22841 0.405 0.404 0.491 0 1
Left Out 22841 0.164 0.164 0.370 0 1
Like School 22841 0.841 0.852 0.355 0 1
Native Born 22841 0.833 0.834 0.372 0 1
Parents Native Born 22841 0.884 0.876 0.329 0 1
Speak National Language 22841 0.853 0.857 0.350 0 1
Boy 22841 0.504 0.506 0.500 0 1
Books 22841 0.797 0.802 0.399 0 1
Calculator 22841 0.793 0.795 0.404 0 1
Computer 22841 0.511 0.510 0.500 0 1
Study Desk 22841 0.784 0.784 0.412 0 1
Dictionary 22841 0.778 0.792 0.406 0 1
Teacher Characteristics
Math Teaching Experience (years) 22841 20.00 20.26 11.02 0 50
Math Teacher Certificate 22841 0.686 0.667 0.471 0 1
Major in Math 22841 0.347 0.340 0.474 0 1
Male Math 22841 0.196 0.066 0.248 0 1
Science Teaching Experience (years) 22841 18.56 18.77 11.54 0 50
Science Teacher Certificate 22841 0.652 0.637 0.481 0 1
Major in Science 22841 0.241 0.248 0.432 0 1
Male Science 22841 0.194 0.048 0.214 0 1
Notes: Author's own calculation based on data sourced from TIMSS 2003 and TIMSS 2007. The weighted
means are computed based on TIMSS sampling weights. Only observations with achievement and age
available are included. Mathematics and Science scores reported are the international scale scores
standardized to a standard normal distribution. The mean test scores reported above do not have zero
means because only the subset of ability mixing schools is included. 538 schools are in the sample. See
data appendix for variable construction.
Table 2: Verification of Exogenous Variation in Classroom Age Heterogeneity
Dependent variable            Classroom Age S.D.    Classroom Age Average
Student characteristics
  Native born                  0.001                 0.002
                              (0.038)               (0.039)
  Parents native born         -0.058*               -0.014
                              (0.031)               (0.026)
  Speak national language     -0.003                -0.000
                              (0.037)               (0.029)
  Boy                          0.076                 0.053
                              (0.062)               (0.035)
Home characteristics
  Some books                  -0.022                -0.056**
                              (0.041)               (0.026)
  Calculator                   0.068                -0.059*
                              (0.044)               (0.031)
  Computer                    -0.028                -0.052
                              (0.038)               (0.032)
  Study desk                   0.006                -0.033
                              (0.041)               (0.029)
  Dictionary                  -0.058                -0.019
                              (0.041)               (0.030)
Ave. Math Teacher
  Teaching experience          0.008                -4.432**
                              (2.995)               (2.202)
  Teacher certificate          0.136                -0.190**
                              (0.107)               (0.093)
  Major in math                0.097                 0.042
                              (0.124)               (0.078)
  Male                        -0.081                 0.040
                              (0.123)               (0.102)
Ave. Science Teacher
  Teaching experience         -1.162                -1.288
                              (2.731)               (1.964)
  Teacher certificate          0.130                -0.157*
                              (0.105)               (0.090)
  Major in science             0.068                -0.018
                              (0.115)               (0.089)
  Male                        -0.113                 0.036
                              (0.118)               (0.098)
Notes: Classroom standard deviation of age is the key independent variable. The constant term and
coefficients of age and age squared not reported. Depending on whether the dependent variable is a
student characteristic or a teacher characteristic, either a student non-response indicator or a teacher non-
response indicator is included to control for missing values. Teacher characteristics are averages because
multiple teachers are involved in some cases. Regressions are weighted by the sampling weights. Robust
standard errors clustered by school reported in parentheses. See data appendix for variable construction.
*** p<0.01, ** p<0.05, * p<0.1
Table 3: Simulated Age Distribution and Instrumental Variables
(1) (2)
------ Classroom Age ------
Ave. S.D.
Simulated Classroom Ave. Age 0.021**
(0.010)
Simulated Classroom S.D. Age 0.021
(0.014)
Observations 22841 22841
R-squared 0.922 0.812
Notes: Only students with both mathematics and science test scores available are included. Indicators for
non-responses to student survey and teacher survey are included to control for missing values. Simulated
classroom average age and standard deviation of age are constructed based on the assumption that
students are perfectly sorted by age and assigned into classrooms of equal size. All regressions include
school fixed effects, student and teacher characteristics, and other age variables. Regressions are weighted
by sampling weights. Robust standard errors clustered by schools are reported in parentheses. *** p<0.01,
** p<0.05, * p<0.1
Table 4: Classroom Age Heterogeneity on Mathematics Achievement
(1) (2) (3) (4) (5)
Classroom S.D. Age -1.063*** -0.346*** -0.369*** -0.359*** -0.366***
(0.160) (0.106) (0.102) (0.093) (0.093)
Classroom Ave. Age 0.705*** -0.243*** -0.101 -0.064 -0.077
(0.061) (0.076) (0.074) (0.068) (0.067)
Age 1.432*** 0.791*** 0.856*** 0.634*** 0.631***
(0.206) (0.169) (0.139) (0.127) (0.125)
Age squared -0.071*** -0.041*** -0.044*** -0.033*** -0.033***
(0.010) (0.008) (0.006) (0.006) (0.006)
Fixed Effects No Country School School School
Student and Teacher Characteristics No No No Yes Yes
Instrumental Variables (IV) No No No No Yes
First-stage Summary:
Partial F for S.D. Age IV 66466
- Shea Partial R-squared 0.992
Partial F for Ave. Age IV 21112
- Shea Partial R-squared 0.984
Observations 22841 22841 22841 22841 22841
R-squared 0.182 0.392 0.541 0.586 0.110
Notes: Only students with both mathematics and science test scores available are included. Indicators for
non-responses to student survey and teacher survey are included in specification (4) to control for missing
values. Regressions are weighted by sampling weights. Robust standard errors clustered by schools are
reported in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Table 5: Classroom Age Heterogeneity on Science Achievement
(1) (2) (3) (4) (5)
Classroom S.D. Age -0.774*** -0.394*** -0.391*** -0.388*** -0.388***
(0.147) (0.117) (0.108) (0.102) (0.100)
Classroom Ave. Age 0.678*** -0.160** -0.081 -0.032 -0.045
(0.059) (0.076) (0.067) (0.062) (0.062)
Age 0.858*** 0.493** 0.622*** 0.415*** 0.415***
(0.223) (0.203) (0.167) (0.154) (0.152)
Age squared -0.044*** -0.027*** -0.033*** -0.022*** -0.022***
(0.010) (0.009) (0.008) (0.007) (0.007)
Fixed Effects No Country School School School
Student and Teacher Characteristics No No No Yes Yes
Instrumental Variables No No No No Yes
First-stage Summary:
Partial F for S.D. Age IV 130000
- Shea Partial R-squared 0.996
Partial F for Ave. Age IV 12940
- Shea Partial R-squared 0.975
Observations 22841 22841 22841 22841 22841
R-squared 0.141 0.358 0.517 0.559 0.100
Notes: Only students with both mathematics and science test scores available are included. Indicators for
non-responses to student survey and teacher survey are included in specification (4) to control for missing
values. Regressions are weighted by sampling weights. Robust standard errors clustered by schools are
reported in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Table 6: Classroom Age Heterogeneity on Behaviors and Attitudes
(1) (2) (3) (4) (5) (6)
------- All Students ------- ------- Young Students -------
Bullied Left out Like school Bullied Left out Like school
School FE Results
Classroom S.D. Age 0.021 0.014 -0.042 -0.025 -0.011 -0.017
(0.056) (0.039) (0.035) (0.065) (0.052) (0.050)
Classroom Ave. Age 0.034 0.035 -0.008 0.079* 0.053 -0.047
(0.035) (0.029) (0.025) (0.048) (0.041) (0.044)
Age -0.187 -0.183 0.032 -0.019 0.002 0.356**
(0.119) (0.116) (0.072) (0.200) (0.158) (0.162)
Age squared 0.009 0.009* -0.002 -0.000 -0.000 -0.018**
(0.006) (0.005) (0.003) (0.010) (0.008) (0.008)
Observations 22841 22841 22841 12539 12539 12539
R-squared 0.128 0.093 0.144 0.145 0.108 0.156
IV Results
Classroom S.D. Age 0.028 0.011 -0.040 -0.020 -0.012 -0.011
(0.056) (0.039) (0.035) (0.066) (0.051) (0.050)
Classroom Ave. Age 0.018 0.023 -0.008 0.066 0.042 -0.043
(0.034) (0.029) (0.025) (0.048) (0.040) (0.043)
Age -0.185 -0.184 0.032 -0.020 -0.000 0.358**
(0.118) (0.113) (0.071) (0.195) (0.155) (0.158)
Age squared 0.009 0.009* -0.002 -0.000 -0.000 -0.019**
(0.006) (0.005) (0.003) (0.010) (0.008) (0.008)
First-stage Summary:
Partial F for S.D. Age IV 17589 17589 17589 16927 16927 16927
- Shea Partial R-squared 0.983 0.983 0.983 0.982 0.982 0.982
Partial F for Ave. Age IV 3096 3096 3096 3163 3163 3163
- Shea Partial R-squared 0.949 0.949 0.949 0.944 0.944 0.944
Observations 22841 22841 22841 12539 12539 12539
R-squared 0.004 0.009 0.043 0.004 0.008 0.037
Notes: Only students with both mathematics and science test scores available are included. All
regressions include school fixed effects, a set of student characteristics listed in Table 2, as well as
indicators for non-responses to the student survey. Young students are those at the median age or below.
Median age is defined according to the grade-level age distribution of the school. Regressions are
weighted by sampling weights. Robust standard errors clustered by schools are reported in parentheses.
*** p<0.01, ** p<0.05, * p<0.1
Table 7: Classroom Age Heterogeneity on Achievement by Student Age
(1) (2) (3) (4)
------- Mathematics ------- --------- Science ---------
Old Young Old Young
School FE Results
Classroom S.D. Age -0.325*** -0.341*** -0.331*** -0.383***
(0.121) (0.094) (0.120) (0.101)
Classroom Ave. Age -0.055 -0.076 0.022 -0.098
(0.094) (0.070) (0.086) (0.072)
Age -1.080*** 0.432* -1.859*** 0.329
(0.320) (0.221) (0.375) (0.235)
Age squared 0.038*** -0.021* 0.072*** -0.016
(0.014) (0.011) (0.016) (0.012)
Observations 10301 12539 10301 12539
R-squared 0.624 0.585 0.606 0.555
IV Results
Classroom S.D. Age -0.338*** -0.353*** -0.335*** -0.385***
(0.118) (0.093) (0.115) (0.099)
Classroom Ave. Age -0.046 -0.089 0.020 -0.110
(0.090) (0.069) (0.082) (0.070)
Age -1.082*** 0.429** -1.859*** 0.327
(0.311) (0.216) (0.364) (0.229)
Age squared 0.038*** -0.021* 0.072*** -0.016
First-stage Summary:
Partial F for S.D. Age IV 63814 63299 63814 63299
- Shea Partial R-squared 0.993 0.993 0.993 0.993
Partial F for Ave. Age IV 15145 21491 15145 21491
- Shea Partial R-squared 0.985 0.984 0.985 0.984
Observations 10301 12539 10301 12539
R-squared 0.125 0.101 0.122 0.090
Notes: Only students with both mathematics and science test scores available are included. All
regressions include school fixed effects, a set of student and teacher characteristics listed in Table 2, as
well as indicators for non-responses to student and teacher surveys. Teacher characteristics vary across
subjects. "Old" students are those above the median age of students in the same school and grade level.
Regressions are weighted by sampling weights. Robust standard errors clustered by schools are reported
in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Table 8: Classroom Age Heterogeneity on Achievement by Gender
(1) (2) (3) (4)
------- Mathematics ------- --------- Science ---------
Boy Girl Boy Girl
School FE Results
Classroom S.D. Age -0.321*** -0.346*** -0.339*** -0.372***
(0.111) (0.103) (0.121) (0.108)
Classroom Ave. Age -0.088 -0.045 -0.075 0.006
(0.087) (0.068) (0.081) (0.065)
Age 0.697*** 0.557*** 0.540** 0.341**
(0.168) (0.177) (0.219) (0.168)
Age squared -0.036*** -0.029*** -0.028*** -0.019**
(0.008) (0.008) (0.010) (0.008)
Observations 11547 11293 11547 11293
R-squared 0.604 0.593 0.585 0.558
IV Results
Classroom S.D. Age -0.322*** -0.358*** -0.338*** -0.372***
(0.109) (0.102) (0.117) (0.105)
Classroom Ave. Age -0.106 -0.055 -0.089 -0.009
(0.085) (0.067) (0.079) (0.063)
Age 0.697*** 0.554*** 0.540** 0.341**
(0.164) (0.172) (0.213) (0.164)
Age squared -0.036*** -0.028*** -0.028*** -0.019**
(0.007) (0.008) (0.010) (0.008)
First-stage Summary:
Partial F for S.D. Age IV 57659 64796 57659 64796
- Shea Partial R-squared 0.992 0.993 0.992 0.993
Partial F for Ave. Age IV 16326 24241 16326 24241
- Shea Partial R-squared 0.984 0.985 0.984 0.985
Observations 11547 11293 11547 11293
R-squared 0.124 0.097 0.109 0.088
Notes: Only students with both mathematics and science test scores available are included. All
regressions include school fixed effects, a set of student and teacher characteristics listed in Table 2, as
well as indicators for non-responses to student and teacher surveys. Teacher characteristics vary across
subjects. Regressions are weighted by sampling weights. Robust standard errors clustered by schools are
reported in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Table 9: Sensitivity of Estimates to Functional Forms of Age
(1) (2) (3) (4) (5) (6)
School FE Results
Classroom S.D. Age -0.394*** -0.359*** -0.359*** -0.413*** -0.388*** -0.389***
(0.092) (0.093) (0.093) (0.100) (0.102) (0.102)
Classroom Ave. Age -0.061 -0.064 -0.063 -0.030 -0.032 -0.029
(0.067) (0.068) (0.068) (0.062) (0.062) (0.063)
Age -0.071*** 0.634*** 1.102* -0.068*** 0.415*** 1.656**
(0.011) (0.127) (0.663) (0.012) (0.154) (0.662)
Age squared -0.033*** -0.076 -0.022*** -0.139**
(0.006) (0.064) (0.007) (0.064)
Age cubed 0.001 0.004*
(0.002) (0.002)
Observations 22841 22841 22841 22841 22841 22841
R-squared 0.585 0.586 0.586 0.559 0.559 0.559
IV Results
Classroom S.D. Age -0.401*** -0.366*** -0.366*** -0.412*** -0.388*** -0.389***
(0.092) (0.093) (0.093) (0.098) (0.100) (0.100)
Classroom Ave. Age -0.076 -0.077 -0.073 -0.045 -0.045 -0.041
(0.066) (0.067) (0.067) (0.061) (0.062) (0.062)
Age -0.070*** 0.631*** 1.092* -0.067*** 0.415*** 1.647**
(0.011) (0.125) (0.655) (0.011) (0.152) (0.653)
Age squared -0.033*** -0.076 -0.022*** -0.138**
(0.006) (0.063) (0.007) (0.063)
Age cubed 0.001 0.004*
(0.002) (0.002)
First-stage Summary:
Partial F for S.D. Age IV 65702 66813 66542 130000 130000 130000
- Shea Partial R-squared 0.992 0.992 0.992 0.996 0.996 0.996
Partial F for Ave. Age IV 21231 22275 21622 13015 12940 13575
- Shea Partial R-squared 0.984 0.985 0.985 0.975 0.975 0.976
Observations 22841 22841 22841 22841 22841 22841
R-squared 0.109 0.110 0.110 0.099 0.100 0.100
Notes: Only students with both mathematics and science test scores available are included. All
regressions include school fixed effects, a set of student and teacher characteristics listed in Table 2, as
well as indicators for non-responses to student and teacher surveys. Teacher characteristics vary across
subjects. Regressions are weighted by sampling weights. Robust standard errors clustered by schools are
reported in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Table 10: Sensitivity to Alternative Measures of Classroom Age Heterogeneity
(1) (2) (3) (4)
------- Mathematics ------- --------- Science ---------
School FE Results
75th-25th Percentile Age Difference -0.253*** -0.252***
(0.070) (0.069)
Max-Min Age Difference -0.052*** -0.059***
(0.017) (0.017)
Classroom Ave. Age -0.068 -0.105 -0.039 -0.075
(0.075) (0.069) (0.072) (0.064)
Age 0.679*** 0.672*** 0.470*** 0.453***
(0.128) (0.129) (0.151) (0.155)
Age squared -0.035*** -0.034*** -0.025*** -0.024***
(0.006) (0.006) (0.007) (0.007)
Observations 22841 22841 22841 22841
R-squared 0.586 0.585 0.559 0.558
IV Results
75th-25th Percentile Age Difference -0.256*** -0.249***
(0.069) (0.068)
Max-Min Age Difference -0.053*** -0.060***
(0.017) (0.016)
Classroom Ave. Age -0.052 -0.095 -0.021 -0.066
(0.072) (0.067) (0.070) (0.061)
Age 0.678*** 0.671*** 0.471*** 0.452***
(0.126) (0.127) (0.149) (0.153)
Age squared -0.035*** -0.034*** -0.025*** -0.024***
(0.006) (0.006) (0.007) (0.007)
First-stage Summary:
Partial F for S.D. Age IV 42980 57179 40143 42383
- Shea Partial R-squared 0.993 0.993 0.992 0.991
Partial F for Ave. Age IV 2711 6036 2758 5266
- Shea Partial R-squared 0.945 0.971 0.939 0.960
Observations 22841 22841 22841 22841
R-squared 0.111 0.109 0.100 0.099
Notes: Only students with both mathematics and science test scores available are included. All
regressions include school fixed effects, a set of student and teacher characteristics listed in Table 2, as
well as indicators for non-responses to student and teacher surveys. Teacher characteristics vary across
subjects. The mean and standard deviation of 75th-25th percentile age difference are 0.585 and 0.256
respectively. The mean and standard deviation of max-min age difference are 1.906 and 1.092
respectively. Regressions are weighted by sampling weights. Robust standard errors clustered by schools
are reported in parentheses. *** p<0.01, ** p<0.05, * p<0.1
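The two alternative measures are on different scales, so their raw coefficients are not directly comparable. A minimal sketch of one way to compare them, assuming a one-standard-deviation change in each measure and using the IV coefficients and the standard deviations reported in the notes above (the dictionary keys are illustrative labels, not names from the paper):

```python
# (IV coefficient from Table 10, standard deviation of the measure from the notes)
MEASURES = {
    "p75_p25_math": (-0.256, 0.256),
    "p75_p25_science": (-0.249, 0.256),
    "max_min_math": (-0.053, 1.092),
    "max_min_science": (-0.060, 1.092),
}

# Implied change in standardized test score for a one-S.D. increase in the measure.
effect_per_sd = {
    name: round(coef * sd, 3) for name, (coef, sd) in MEASURES.items()
}

for name, effect in effect_per_sd.items():
    print(f"{name}: {effect:.3f}")
```

Under this scaling, both measures imply a change of roughly -0.06 to -0.07 standard deviations in test scores per one-standard-deviation increase in age heterogeneity.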
Table 11: Estimates by TIMSS 2003 and TIMSS 2007
(1) (2) (3) (4)
------- TIMSS 2003 ------- ------- TIMSS 2007 -------
Mathematics Science Mathematics Science
School FE Results
Classroom S.D. Age -0.538** -0.393* -0.347*** -0.381***
(0.223) (0.223) (0.096) (0.113)
Classroom Ave. Age -0.261 -0.230 0.001 0.011
(0.230) (0.185) (0.062) (0.061)
Age 1.092** 0.829* 0.577*** 0.388**
(0.476) (0.443) (0.124) (0.165)
Age squared -0.053** -0.040** -0.030*** -0.021***
(0.022) (0.020) (0.006) (0.008)
Observations 6487 6487 16354 16354
R-squared 0.511 0.508 0.614 0.578
IV Results
Classroom S.D. Age -0.548** -0.387* -0.343*** -0.383***
(0.224) (0.220) (0.095) (0.111)
Classroom Ave. Age -0.278 -0.230 -0.023 -0.006
(0.227) (0.186) (0.062) (0.060)
Age 1.087** 0.834* 0.577*** 0.387**
(0.467) (0.436) (0.122) (0.163)
Age squared -0.053** -0.040** -0.030*** -0.021***
(0.021) (0.020) (0.006) (0.008)
First-stage Summary:
Partial F for S.D. Age IV 11473 32893 77873 15000
- Shea Partial R-squared 0.989 0.992 0.995 0.998
Partial F for Ave. Age IV 4461 1590 29052 16611
- Shea Partial R-squared 0.973 0.934 0.990 0.986
Observations 6487 6487 16354 16354
R-squared 0.137 0.103 0.105 0.102
Notes: Only students with both mathematics and science test scores available are included. All
regressions include school fixed effects, a set of student and teacher characteristics listed in Table 2, as
well as indicators for non-responses to student and teacher surveys. Teacher characteristics vary across
subjects. Regressions are weighted by sampling weights. Robust standard errors clustered by schools are
reported in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Table 12: Comparison of the Effects in the Fourth Grade and Eighth Grade
(1) (2) (3) (4)
TIMSS 2003 TIMSS 2007
----- Fourth Grade ----- ----- Eighth Grade -----
Mathematics Science Mathematics Science
School FE Results
Classroom S.D. Age -0.424* -0.254 -0.364 -0.171
(0.221) (0.220) (0.243) (0.347)
Classroom Ave. Age -0.258 -0.236 -0.062 -0.234
(0.235) (0.191) (0.230) (0.268)
Age 1.480*** 1.164** 0.192 0.887
(0.465) (0.470) (0.701) (1.023)
Age squared -0.071*** -0.056*** -0.010 -0.033
(0.021) (0.021) (0.023) (0.034)
Observations 6036 6036 4934 4934
R-squared 0.517 0.513 0.266 0.339
IV Results
Classroom S.D. Age -0.435* -0.244 -0.358 -0.092
(0.223) (0.216) (0.243) (0.327)
Classroom Ave. Age -0.282 -0.239 -0.066 -0.262
(0.232) (0.193) (0.226) (0.259)
Age 1.474*** 1.172** 0.197 0.949
(0.454) (0.462) (0.688) (0.982)
Age squared -0.071*** -0.056*** -0.011 -0.035
(0.020) (0.021) (0.023) (0.033)
First-stage Summary:
Partial F for S.D. Age IV 9156 26686 50154 1819
- Shea Partial R-squared 0.989 0.991 0.994 0.959
Partial F for Ave. Age IV 4290 1501 780000 2123
- Shea Partial R-squared 0.973 0.932 1.00 0.963
Observations 6036 6036 4934 4934
R-squared 0.137 0.103 0.053 0.034
Notes: The sample includes Armenia, Latvia, and Lithuania. Only students with both mathematics and
science test scores available are included. All regressions include school fixed effects, a set of student and
teacher characteristics listed in Table 2, as well as indicators for non-responses to student and teacher
surveys. Teacher characteristics vary across subjects. Regressions are weighted by sampling weights.
Robust standard errors clustered by schools are reported in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Table 13: Simulation Age Grouping and Achievement Gains in Grade Four
Average Grouping - Mixing
Grade-level Differences in Classroom Age --- Predicted Gain ---
Country Age S.D. S.D. Ave. Math Science
Armenia 0.490 -0.172 0.000 0.063 0.067
Colombia 1.131 -0.310 0.012 0.113 0.120
El Salvador 1.072 -0.434 0.285 0.137 0.156
Georgia 0.460 -0.167 0.020 0.059 0.064
Kazakhstan 0.530 -0.164 -0.007 0.060 0.064
Latvia 0.448 -0.175 0.009 0.064 0.068
Lithuania 0.420 -0.145 0.009 0.052 0.056
Mongolia 0.913 -0.472 0.419 0.140 0.164
Morocco 1.038 -0.626 0.310 0.205 0.229
Russia 0.498 -0.144 0.007 0.052 0.056
Tunisia 0.680 -0.199 0.011 0.072 0.077
Ukraine 0.473 -0.151 0.024 0.054 0.058
Yemen 1.214 -0.170 0.074 0.057 0.063
Average 0.721 -0.256 0.090 0.087 0.095
Notes: The simulation is based on ability-mixing schools with at least two classrooms sampled in TIMSS
2007 (ability-grouping schools are excluded). Grade level standard deviation of age is the sample
standard deviation of age for the whole country. The point estimates used to construct the predicted gains
are sourced from the IV specification (5) in Table 4 and Table 5.
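As a rough check on how the predicted gains relate to the regression estimates, the sketch below multiplies a country's grouping-minus-mixing differences by the science coefficients on Classroom S.D. Age and Classroom Ave. Age. It assumes a simple linear prediction and takes the coefficients from column (5) of Table 5 (-0.388 and -0.045), which reproduce the reported science gains for the three countries shown; the function name and data layout are illustrative and are not the author's code.

```python
# Science coefficients from Table 5, column (5) (assumed to be the ones used).
BETA_SD = -0.388   # Classroom S.D. Age
BETA_AVE = -0.045  # Classroom Ave. Age

# Grouping-minus-mixing differences from Table 13: (classroom S.D. age, classroom ave. age).
DIFFS = {
    "Armenia": (-0.172, 0.000),
    "El Salvador": (-0.434, 0.285),
    "Mongolia": (-0.472, 0.419),
}


def predicted_gain(d_sd, d_ave, beta_sd=BETA_SD, beta_ave=BETA_AVE):
    """Linear prediction of the change in standardized score."""
    return beta_sd * d_sd + beta_ave * d_ave


gains = {country: round(predicted_gain(*d), 3) for country, d in DIFFS.items()}
for country, gain in gains.items():
    print(f"{country}: {gain:.3f}")
```

The results (0.067, 0.156, and 0.164) match the Science column of Table 13 for these countries.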
Table 14: Simulation Age Grouping and Distributional Effects in Grade Four
Country Age Grouping - Mixing Differences in --------- Predicted Gain ---------
Classroom S.D. Age Classroom Ave. Age ----- Math ------ ---- Science -----
Young Old Young Old Young Old Young Old
Armenia -0.175 -0.159 -0.282 0.326 0.087 0.039 0.098 0.060
Colombia -0.386 -0.210 -0.585 0.662 0.188 0.040 0.213 0.083
El Salvador -0.462 -0.406 -0.602 0.645 0.217 0.108 0.244 0.149
Georgia -0.185 -0.137 -0.276 0.314 0.090 0.032 0.101 0.052
Kazakhstan -0.155 -0.163 -0.261 0.310 0.078 0.041 0.088 0.061
Latvia -0.226 -0.115 -0.245 0.296 0.101 0.025 0.114 0.045
Lithuania -0.138 -0.157 -0.236 0.285 0.070 0.040 0.079 0.058
Mongolia -0.262 -0.350 -0.447 0.503 0.132 0.095 0.150 0.127
Morocco -0.251 -0.555 -0.381 0.428 0.123 0.168 0.139 0.195
Russia -0.130 -0.156 -0.245 0.298 0.068 0.039 0.077 0.058
Tunisia -0.328 -0.033 -0.310 0.385 0.143 -0.006 0.160 0.019
Ukraine -0.176 -0.124 -0.253 0.298 0.085 0.028 0.096 0.047
Yemen -0.288 -0.111 -0.475 0.554 0.144 0.012 0.163 0.048
Average -0.243 -0.206 -0.354 0.408 0.117 0.051 0.133 0.077
Notes: The simulation is based on ability-mixing schools with at least two classrooms sampled in TIMSS
2007 (ability-grouping schools are excluded). Grade level standard deviation of age is the sample
standard deviation of age for the whole country. "Young" students are at the median age or younger;
"Old" students are those above the median age of their school. The point estimates used to construct the
predicted gains are sourced from the IV results in Table 7.
Data Appendix
1. Sample selection
The four countries sampled from TIMSS 2003 (T03) are Armenia, Latvia, Lithuania, and
Moldova. The thirteen countries sampled from TIMSS 2007 (T07) are Armenia, Colombia, El
Salvador, Georgia, Kazakhstan, Latvia, Lithuania, Mongolia, Morocco, Russia, Tunisia, Ukraine,
and Yemen. These countries are selected because the World Bank classified them as low- or
middle-income countries in 2007 and because they sampled multiple classrooms in several schools in
TIMSS. The principals of the sampled schools stated that their students were not grouped into
different classrooms on the basis of their ability in mathematics and science. Schools
that grouped students by ability in only mathematics or only science are also excluded.
Students with missing test scores or age are dropped from the final sample.
2. Variable Construction
a. Standardized test scores
Scaled scores reported by TIMSS are standardized with respect to the standard normal
distribution (within each wave of TIMSS) using the full TIMSS sample.
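A minimal sketch of this kind of within-wave standardization, assuming an unweighted mean and population standard deviation over a list of (wave, scaled score) pairs. The record layout and function name are hypothetical, and any weighting or handling of multiple score values used in the actual construction is not reproduced here.

```python
from statistics import mean, pstdev


def standardize_by_wave(records):
    """Return z-scores computed separately within each TIMSS wave.

    `records` is a list of (wave, scaled_score) pairs (hypothetical layout).
    """
    by_wave = {}
    for wave, score in records:
        by_wave.setdefault(wave, []).append(score)
    # Mean and standard deviation of the scaled scores in each wave.
    stats = {w: (mean(s), pstdev(s)) for w, s in by_wave.items()}
    return [(score - stats[wave][0]) / stats[wave][1] for wave, score in records]


sample = [(2003, 400.0), (2003, 500.0), (2003, 600.0), (2007, 450.0), (2007, 550.0)]
z = standardize_by_wave(sample)
print(z)
```

Each wave's standardized scores then have mean zero and unit standard deviation, so coefficients can be read in standard-deviation units.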
b. Age variables
Age is precise only to the month of birth. All age variables used are measured in
years and based on the variable "asdage" in TIMSS data files. Median age is the
median age of students in the same school and grade level. "Old" means above the median age,
and "young" means at the median age or below.
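The age variables can be sketched as follows, assuming one grade per school, ages already expressed in years, and the sample (n-1) standard deviation for the classroom measure. The choice of estimator and the record layout are assumptions for illustration, not details stated in this appendix.

```python
from statistics import mean, median, stdev

# Hypothetical records: (school_id, class_id, age in years).
students = [
    ("s1", "a", 9.5), ("s1", "a", 10.0), ("s1", "a", 11.5),
    ("s1", "b", 10.0), ("s1", "b", 10.25), ("s1", "b", 10.5),
]


def group_ages(rows, key):
    """Collect ages into lists keyed by key(row)."""
    out = {}
    for row in rows:
        out.setdefault(key(row), []).append(row[2])
    return out


ages_by_class = group_ages(students, lambda r: (r[0], r[1]))
ages_by_school = group_ages(students, lambda r: r[0])

class_sd = {k: stdev(v) for k, v in ages_by_class.items()}    # Classroom S.D. Age
class_ave = {k: mean(v) for k, v in ages_by_class.items()}    # Classroom Ave. Age
school_median = {k: median(v) for k, v in ages_by_school.items()}

# "Young" means at the school median age or below; everyone else is "old".
is_young = [age <= school_median[school] for school, _, age in students]
print(class_sd, class_ave, school_median, is_young)
```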
c. Other dependent variables
"Bullied" is a dummy variable taking the value of 1 if a student was reported to have been
hurt (T03's "as4ghurt" or T07's "asbghurt"), made to do things ("as4gmade" or
"asbgmade"), or teased ("as4gmfun" or "asbgmfun") by other students in school. "Left out"
is a dummy variable taking the value of 1, if a student was ever left out of activities by other
students in school ("as4gleft" or "asbgleft"). "Like school" is a dummy variable taking the
value of 1 if a student agreed with the statement that he/she liked going to school ("as4galbs"
or "asbgalbs").
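The "Bullied" indicator combines three survey items. A minimal sketch, assuming each item has already been recoded to True/False (the raw TIMSS response codes are not described here, and the function and argument names are hypothetical):

```python
def bullied(hurt, made_to_do_things, teased):
    """Return 1 if the student reported any of the three incidents, else 0."""
    return int(bool(hurt) or bool(made_to_do_things) or bool(teased))


# A student who was teased but neither hurt nor made to do things.
example_teased_only = bullied(False, False, True)
# A student who reported none of the three incidents.
example_none = bullied(False, False, False)
print(example_teased_only, example_none)
```

Non-responses would need separate handling; the regressions control for them with non-response indicators rather than through this dummy.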
d. Nativity and language variables
"Native born" is a dummy variable taking the value of 1, if a student was born in the country
(T03's "as4gborn" or T07's "asbgborn"). "Parent native born" is a dummy variable taking
the value of 1, if a student's father or mother was born in the country (T03's "asbgmbrn" and
"asbgfbrn" or T07's "asdgborn"). "Speak national language" is a dummy variable taking the
value of 1, if a student always or almost always speaks the language of test at home (T03's
"as4golan" or T07's "asbgolan").
e. Things available at home
"Some books" is a dummy variable taking the value of 1, if a student was reported to have at
least 11 books at home. "Calculator" is a dummy variable taking the value of 1, if a student
was reported to have a calculator at home. "Study desk" is a dummy variable taking the value
of 1, if a student was reported to have a study desk at home. "Dictionary" is a dummy
variable taking the value of 1, if a student was reported to have a dictionary at home.
f. Teaching characteristics
Teacher's experience is the average years of teaching experience of a student's teachers,
because some students have multiple teachers for each subject.
g. Teaching certificate
Teaching certificate is the average of the binary variable indicating whether a student's
teacher in a subject has the relevant teaching certificate. Average is used because some
students have multiple teachers for a subject.
h. Teacher's major
Teacher's major is the average of the binary variable indicating whether a student's teacher
in a subject majored in the subject during college. Average is used because some students
have multiple teachers for a subject.
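The teacher variables in items f to h share the same construction: a teacher-level value is averaged over all of a student's teachers in a subject. A minimal sketch, assuming an unweighted average and hypothetical input values:

```python
from statistics import mean


def teacher_average(values):
    """Average a teacher-level variable over a student's teachers in one subject."""
    return mean(values)


# Hypothetical student with two mathematics teachers.
experience = teacher_average([4, 10])   # years of teaching experience
certificate = teacher_average([1, 0])   # 1 = holds the relevant teaching certificate
major = teacher_average([1, 1])         # 1 = majored in the subject
print(experience, certificate, major)
```

For the binary variables the result is the share of the student's teachers with the attribute, so it can take values strictly between 0 and 1.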