WPS4824
Policy Research Working Paper 4824
Measuring Subjective Expectations
in Developing Countries
A Critical Review and New Evidence
Adeline Delavande
Xavier Giné
David McKenzie
The World Bank
Development Research Group
Finance and Private Sector Team
January 2009
Abstract
The majority of economic decisions taken by individuals are forward looking and thus involve
their expectations of future outcomes. Understanding the expectations that individuals have is
thus of crucial importance to designing and evaluating policies in health, education, finance,
migration, social protection, and many other areas. However, the majority of developing country
surveys are static in nature and do not contain information on the subjective expectations of
individuals. Possible reasons given for not collecting this information include fears that poor,
illiterate individuals do not understand probability concepts, that it takes far too much time to
ask such questions, or that the answers add little value. This paper provides a critical review
and new analysis of subjective expectations data from developing countries and refutes each of
these concerns. The authors find that people in developing countries can generally understand
and answer probabilistic questions, such questions are not prohibitive in time to ask, and the
expectations are useful predictors of future behavior and economic decisions. The paper
discusses the different methods being tried for eliciting such information, the key
methodological issues involved, and the open research questions. The available evidence
suggests that collecting expectations data is both feasible and valuable, suggesting that it
should be incorporated into more developing country surveys.
This paper--a product of the Finance and Private Sector Team, Development Research Group--is part of a larger effort
in the group to improve survey methodology. Policy Research Working Papers are also posted on the Web at
http://econ.worldbank.org. The authors may be contacted at xgine@worldbank.org and dmckenzie@worldbank.org.
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Produced by the Research Support Team
Measuring Subjective Expectations in Developing Countries:
A Critical Review and New Evidence #
Adeline Delavande, RAND and Universidade Nova de Lisboa
Xavier Giné, World Bank and BREAD
David McKenzie, World Bank, BREAD and IZA
Keywords: Subjective Expectations; Survey Methodology; Development.
JEL Codes: D84; C81; O12.
#
We thank many of the researchers working in this area for generously sharing their experiences and questionnaires,
and Orazio Attanasio for comments on a first draft of this paper. All opinions expressed in this paper are our own,
and do not necessarily represent those of our respective employers.
1. Introduction
Most economic decisions involve uncertainty and are therefore shaped not only by
preferences but also by expectations of future outcomes. For example, if an individual takes an
umbrella to work on a clear day, one may infer that the individual is extremely risk averse, but
the fact would also be consistent with the individual believing that it may rain later in the day.
Due to this fundamental identification problem (Manski, 2004), understanding the expectations
that individuals have is of crucial importance. While expectations questions have
been increasingly used in major surveys in the United States (see Manski, 2004 for a recent
review), most of the major surveys in developing countries are static in nature and do not contain
information on the subjective expectations of individuals. 1
Possible reasons given for not collecting this information include fears that poor illiterate
individuals do not understand probability concepts, that it takes far too much time to ask such
questions, or that the answers are of little value. As data collection is becoming more common, a
few innovative surveys have experimented with eliciting subjective expectations in developing
countries. We review these experiences and conduct new analysis of expectations survey data
collected in several developing countries. The results refute each one of the arguments made for
not collecting expectations information. People in developing countries do appear to understand
probabilistic questions, they are not prohibitive in terms of time to ask, and expectations do
provide meaningful information about economic behavior.
We discuss the different methods used for eliciting such information, the relative costs,
the key methodological issues involved, and the open research questions. The available evidence
suggests that collecting expectations data is both feasible and valuable, suggesting that it should
be incorporated into more developing country surveys.
The remainder of the paper is structured as follows. Section 2 describes and critiques the
different methods which have been used to elicit expectations in developing country surveys.
Section 3 assesses the evidence of whether typical survey respondents in developing countries
are able to understand and answer probabilistic questions, while Section 4 discusses the time and
interviewer requirements for implementing such questions in practice. Section 5 shows that the
1
For example, some of the best known large scale development surveys such as the World Bank's Living Standards
Measurement Surveys (LSMS), ORC-Macro's Demographic and Health Surveys (DHS), and the Indonesian Family
Life Survey (IFLS) do not include any probabilistic measures of expectations.
elicited expectations are useful in predicting the future outcomes of events of interest, and in
predicting economic behaviors. Given all this, Section 6 concludes that these measures should be
used more in developing country surveys.
2. Methods for collecting expectations data
In developing countries, most data collection is done in person, that is, with an
enumerator visiting the home or workplace of the respondent. In this paper we focus on methods
for eliciting expectations that can be collected in the course of a one-on-one interview. 2 These
methods have also been used in online surveys, although the use of such surveys is rare in
developing countries, and is typically restricted only to surveys of highly educated and/or
wealthy individuals. 3
The methods used for eliciting expectations differ depending on whether the researcher is
interested in eliciting a point estimate or the whole distribution.
2.1. Eliciting a point estimate
To elicit a point estimate, Likert scales have been used as well as cardinal scales.
Likert scale
Although Likert scales have been widely used in Psychology to assess the degree to which
respondents agree with a certain statement, they have also been used to elicit the likelihood that an
event will occur. In the context of probability elicitation, such a question may read:
"How likely is it that it will rain tomorrow?"
The responses come from a three, five or seven point scale. The five point scale for a
probabilistic question is "Very likely", "Likely", "Neither likely nor unlikely", "Unlikely" and
"Very unlikely". One problem with the scale is that different respondents may interpret
it differently, and that a response may be correlated with the "true" probability of rainfall, but
2
The interested reader can consult Chapter 10 in Tourangeau, Rips and Rasinski (2000) for a critical comparison of
different modes of data collection.
3
An example is the survey of the top academic achievers from a number of Pacific Islands discussed in Gibson and
McKenzie (2008), which used the percent chance formulation to ask expectations questions about migration using
online and in-person questionnaires.
also with other factors, such as the optimism, education, or gender of the respondent. Given this
concern, interpersonal comparisons may be problematic. McFadden et al. (2005) illustrate this
point with an example of a health status measure that uses a Likert scale ranging from Excellent
to Poor. Sixty-two percent of Danish men, but only 14 percent of French men reported their
health to be "Excellent", yet French men enjoy two more years of life expectancy.
Cardinal (Percent Chance) scale
The wording of the question for the rainfall example using a cardinal scale would be as follows:
"On a scale from 0 to 100, where 0 means no chance, and 100 means certainty, what would you say is the
probability (or percent chance) that it will rain tomorrow?"
As Manski (2004) notes, the use of an explicit probability scale offers several advantages over
Likert scales and other non-cardinal measures. In particular, it allows for interpersonal
comparisons, and responses can be compared with actual observed event frequencies to assess accuracy.
When multiple questions of this nature are asked, the laws of probability can be used to provide a
check on the internal consistency of responses. In later sections of the paper we will assess how
well a percent chance scale conforms to the laws of probability and is predictive of behavior.
Figure 1 provides a first comparison of Likert responses to subjective probabilities in a
developing country context. Delavande and Kohler (2008a) elicit from individuals in Malawi the
probability that they are HIV positive, and also ask them how likely they think it is that they are
HIV positive, with the answer on a Likert scale of No likelihood, low, medium, and high. We see
almost all individuals who say the probability is zero also say there is "no likelihood" on the
Likert scale, and the majority of individuals who think the probability is 0.8 or higher say that
the likelihood is "high" on the Likert scale. However, it is also clear that different individuals
associate the same subjective probability with different Likert grades. For example, 53 percent of
individuals who think the subjective probability they have HIV is 0.6 answer "high" on the
Likert scale, 35 percent answer "medium", 6 percent answer "low", and 6 percent answer "no
likelihood".
Only 5 percent of the sample (130 observations) actually have HIV. 4 It is the case that
both higher Likert scores and higher subjective probabilities are positively associated with
actually being HIV positive. However, it is also the case that conditional on one's Likert score,
those with higher subjective probabilities are marginally more likely to have HIV. Given the
small sample size, it would be useful to compare Likert and subjective probabilities in other
settings to confirm that the subjective probabilities add information above and beyond that
revealed in a Likert score.
2.2. Eliciting a distribution
When the event under study involves several states of nature, the whole distribution has
to be elicited. In developed countries, the most well-known method of eliciting the distribution is
again a percent chance formulation, developed by Dominitz and Manski (1997). They elicit a
cumulative distribution function for income by first asking individuals what the lowest and
highest amount they could earn would be, and then using these answers to define four threshold
levels, Y1, Y2, Y3 and Y4. Respondents are then asked "what is the percent chance that your
income will be less than Y1" and similarly the percent chance it would be less than Y2, less than
Y3, and less than Y4. Interviewers can either prompt respondents to ensure that the answers are
non-decreasing, or the answers can be checked for this to see whether individuals understand
probability.
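The consistency check described here can be sketched in a few lines. This is a minimal illustration under our own assumptions: the function name and the example answers are hypothetical and are not taken from any of the surveys discussed.

```python
def is_valid_cdf(percent_chances):
    """Check elicited "percent chance income is below Y_k" answers.

    Answers are assumed to be listed in order of increasing threshold
    (Y1 < Y2 < Y3 < Y4). A coherent cumulative distribution requires
    every answer to lie in [0, 100] and the sequence to be non-decreasing.
    """
    in_range = all(0 <= p <= 100 for p in percent_chances)
    non_decreasing = all(
        a <= b for a, b in zip(percent_chances, percent_chances[1:])
    )
    return in_range and non_decreasing


# Hypothetical answers for thresholds Y1..Y4 (illustrative values only).
consistent = [10, 35, 70, 95]
inconsistent = [10, 40, 30, 95]  # falls from 40 to 30: violates monotonicity

print(is_valid_cdf(consistent))    # True
print(is_valid_cdf(inconsistent))  # False
```

The same check can be run by the enumerator during the interview (prompting) or afterwards on the data (testing understanding).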
An example of the use of this same percent chance formulation in a developing country is
McKenzie et al. (2007), who ask Tongans their expectations of income if they were to migrate to
New Zealand, and similarly ask Tongan migrants in New Zealand their expectations of income if
they had stayed in Tonga. They find that these questions appear to be understood by their survey
respondents, who have accurate expectations about incomes in Tonga, but underestimate how
much they could earn in New Zealand. The average education level in their sample was 11 to 12
years, with no one having below 8 years of education.
A related method of eliciting a distribution by directly asking about probabilities is found
in Attanasio, Meghir and Vera-Hernandez (2005), who elicit income expectations in Colombia,
and Attanasio and Kaufman (2008) who elicit income expectations of junior high school students
4
The project tested individuals to ascertain their true status, with these subjective probability and Likert questions
asked of individuals before the results of these tests were revealed to them.
in Mexico. In both cases they ask respondents what is the maximum and the minimum amount
they could earn. The enumerator then computes the midpoint M and Attanasio and Kaufman
(2008) then ask:
From zero to one hundred, what is the probability that your earnings at that age will be at least M?
These methods of eliciting distributions directly give probabilistic measures of certain
percentiles of the subjective distribution. However, to move from these percentiles to means,
medians, standard deviations, and other moments of interest requires imposing further
assumptions. In the case of the percent chance formulation used by Dominitz and Manski (1997)
and McKenzie et al. (2007), the authors fit a log-normal distribution to the four percentiles
elicited, and then use this to recover these moments. In both cases the log-normal appears to fit
well for income. Attanasio and Kaufman (2008) assume that the minimum and maximum elicited
are truly the minimum and maximum of the subjective distribution, a point which we will return
to, and then fit a step-wise uniform, bi-triangular, and triangular distribution to their data in order
to recover the moments of interest.
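To illustrate how moments can be recovered from a handful of elicited points on the cumulative distribution, the sketch below fits a log-normal by a simple least-squares regression of log thresholds on the normal quantiles of the stated probabilities. This is one convenient approach and is an assumption on our part; it is not necessarily the fitting criterion used by Dominitz and Manski (1997) or McKenzie et al. (2007). The thresholds and percent chances are hypothetical.

```python
import math
from statistics import NormalDist


def fit_lognormal(thresholds, percent_chances):
    """Fit ln(Y) ~ N(mu, sigma^2) to elicited points (Y_k, P(Y < Y_k)).

    If Y is log-normal, ln(Y_k) = mu + sigma * z_k, where z_k is the
    standard normal quantile of P(Y < Y_k). We estimate mu and sigma by
    ordinary least squares of ln(Y_k) on z_k. Answers of exactly 0 or 100
    have no finite quantile and must be excluded or adjusted beforehand.
    """
    z = [NormalDist().inv_cdf(p / 100.0) for p in percent_chances]
    log_y = [math.log(y) for y in thresholds]
    n = len(z)
    z_bar = sum(z) / n
    ly_bar = sum(log_y) / n
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    sxy = sum((zi - z_bar) * (li - ly_bar) for zi, li in zip(z, log_y))
    sigma = sxy / sxx
    mu = ly_bar - sigma * z_bar
    return mu, sigma


def lognormal_moments(mu, sigma):
    """Mean, median and standard deviation implied by the fitted log-normal."""
    mean = math.exp(mu + sigma ** 2 / 2)
    median = math.exp(mu)
    sd = mean * math.sqrt(math.exp(sigma ** 2) - 1)
    return mean, median, sd


# Hypothetical weekly income thresholds and elicited percent chances.
thresholds = [100, 200, 300, 400]
percent_chances = [10, 40, 75, 95]
mu, sigma = fit_lognormal(thresholds, percent_chances)
print(lognormal_moments(mu, sigma))
```

With only four points, the fit is sensitive to each answer, which is one reason the distributional assumption matters.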
These studies have shown that in some settings, the same percent chance formulations
used in developed countries have been successfully employed in developing countries. However,
in many settings it is felt that simply asking respondents for a probability or percent chance is too
abstract, and visual aids are needed to help them express probabilistic concepts. This commonly
involves asking respondents to allocate stones, balls, beans, or sticks into a number of bins.
An example of this method is given in Luseno et al. (2003) and Lybbert et al. (2007),
who ask pastoralists with little formal education in Ethiopia and Kenya to allocate 12 stones
across three different piles on the ground, one for "above normal," one for "normal," and one
for "below normal," with the number of stones in each pile representing the individual's
prediction about the likelihood that rainfall in the coming long rains season would be in each of
the given states. They found only 16 of the 244 households gave degenerate forecasts in which
all 12 stones were placed in a single pile.
A second example is provided by Hill (2007), who collects expectations from coffee
farmers in Uganda about coffee prices. The respondents were given twenty beans and a handout
marked with three squares of different price categories (less than $0.10 (200 shillings), between
$0.10 and $0.20 (between 200 and 400 shillings), and more than $0.20 (400 shillings)). They
were asked to place beans on the squares in accordance with what they thought was the chance
of that outcome. If the respondent thought one option was very likely they were instructed to put
many beans on the corresponding square, if the respondent thought the option was unlikely they
were instructed to place few beans there.
A notable feature of several of these applications using visual aids is that they never
explicitly tell respondents to interpret the answers as probabilities. A concern then is that if
respondents are asked to allocate the stones or beans to different piles in accordance with how
"likely" they think each state is, there may be interpersonal differences in how "likely" is
interpreted, in a similar manner to Likert scales. For example, one person may use one of ten
stones to conceptualize an unlikely event, whereas another person may use zero or two stones to
again indicate they consider this event unlikely.
In contrast, Delavande and Kohler (2008a), in a survey in Malawi, explicitly link the
number of beans placed in a pile to a probability. Their instructions to the respondents read:
"I will ask you several questions about the chance or likelihood that certain events are going to happen. There are 10
beans in the cup. I would like you to choose some beans out of these 10 beans and put them in the plate to express
what you think the likelihood or chance is of a specific event happening. One bean represents one chance out of 10.
If you do not put any beans in the plate, it means you are sure that the event will NOT happen. As you add beans, it
means that you think the likelihood that the event happens increases. For example, if you put one or two beans, it
means you think the event is not likely to happen but it is still possible. If you pick 5 beans, it means that it is just as
likely it happens as it does not happen (fifty-fifty). If you pick 6 beans, it means the event is slightly more likely to
happen than not to happen. If you put 10 beans in the plate, it means you are sure the event will happen. There is no
right or wrong answer, I just want to know what you think. Let me give you an example. Imagine that we are
playing Bawo. Say, when asked about the chance that you will win, you put 7 beans in the plate. This means that
you believe you would win 7 out of 10 games on average if we play for a long time."
Delavande and Kohler (2008a) use this elicitation method to ask respondents point
estimates about a number of events, ranging from going to market within the next two weeks, to
experiencing a food shortage, and to contracting HIV/AIDS. They do not use it to elicit
subjective distributions, but the same phrasing could easily be applied there. In future research it
would be of interest to compare in an experimental setting the results from eliciting distributions
when stones and beans are explicitly linked to probabilities in this manner, to the results from
simply asking respondents to allocate stones or beans to piles in accordance with how "likely"
they think each outcome is.
In pilot fieldwork in Sri Lanka, one of the authors tried using the percent chance
formulation to ask microenterprise owners their expectations of profits three months into the
future. Respondents, who typically had between 6 and 10 years of education, struggled with this
format, and a decision was made to employ a more visual representation. However, a second
issue which arose was the difficulty respondents had in separating variability in sales due to
seasonality, from the stochastic variability within a given month. The solution that was used was
to move from the more abstract stones or beans, to ask microenterprise owners to think of a fixed
number of businesses just like theirs. De Mel et al. (2008) ended up using the following
formulation:
"Think now about 20 businesses that are JUST LIKE YOURS. The owners have the same age, education,
experience, skill level, commitment and similar locations to you. Think about all the reasons why your profits may
be higher or lower IN DECEMBER. For example, you might have a big customer come along, a family member
could get sick, some inputs may not be available, you could find some inputs more cheaply than usual, etc. Taking
all these different possibilities into account, mark how many of the 20 businesses you think would end up with
PROFITS in DECEMBER in each of the intervals given".
The use of these various forms of visual aids has the advantage of allowing respondents
to better conceptualize probabilities, and by giving a fixed number of stones, beans, balls or
businesses to allocate, ensures that the probabilities add up to one. A practical issue which then
arises is how many stones to use, and how many bins or piles to give respondents to place them
in. Ten and twenty stones appear to be the most common choices in the literature, and are easily
interpreted as probabilities. One hundred stones would be likely to tax the patience of a
respondent. An exception is Luseno et al. (2003) and Lybbert et al. (2007) who use 12 stones,
with three states of nature. The advantage of 12 stones in their setting is that it allows
respondents to answer with a uniform distribution, which would not be possible with 10 stones.
Nevertheless, 20 stones would still allow respondents to give close to a uniform distribution,
while allowing other respondents to express probabilities that are easier to calculate.
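The trade-off between 10, 12 and 20 stones can be checked with simple arithmetic. The sketch below is illustrative only; the function names are ours.

```python
from fractions import Fraction


def stones_to_probabilities(counts):
    """Convert stones (or beans) per bin into exact probabilities."""
    total = sum(counts)
    return [Fraction(c, total) for c in counts]


def can_express_uniform(n_stones, n_bins):
    """An exactly uniform answer needs the stones to divide evenly."""
    return n_stones % n_bins == 0


# Three states of nature, as in the rainfall example.
for n in (10, 12, 20):
    print(n, can_express_uniform(n, 3))
# 10 False
# 12 True
# 20 False

# With 20 stones the closest allocation to uniform is 7, 7, 6.
print([float(p) for p in stones_to_probabilities([7, 7, 6])])
# [0.35, 0.35, 0.3]
```

With 20 stones each stone is worth 5 percentage points, so the nearest-to-uniform answer is off by at most about 3 percentage points per bin.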
The number of bins to use depends on whether the outcome of interest is discrete or
continuous, and on how individualized the support given to each respondent is. Most outcomes
of interest are continuous, such as prices, incomes, and rainfall, and so allowing more bins with
narrower intervals can provide more precision in estimating subjective means. These visual aid
methods typically provide a histogram of the subjective distribution. As with the percent chance
methods, further assumptions are needed to calculate many of the moments of interest. A
common approach is to use the midpoints of intervals to discretize the continuous variable, and
then use the elicited probabilities along with these midpoints to calculate moments of interest. 5
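The midpoint approach can be sketched as follows. The interval edges mirror the seven 100-shilling classes from 100 to 800 described by Hill (2007); the allocation of 20 beans is hypothetical and chosen only for illustration.

```python
import math


def histogram_moments(edges, counts):
    """Subjective mean and standard deviation from a bean histogram.

    edges  : bin boundaries, length len(counts) + 1
    counts : number of beans placed in each bin
    Each bin is represented by its midpoint (the class mark), and the
    share of beans in the bin is treated as its probability.
    """
    total = sum(counts)
    probs = [c / total for c in counts]
    mids = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
    mean = sum(p * m for p, m in zip(probs, mids))
    variance = sum(p * (m - mean) ** 2 for p, m in zip(probs, mids))
    return mean, math.sqrt(variance)


edges = [100, 200, 300, 400, 500, 600, 700, 800]  # shillings
beans = [0, 2, 5, 8, 4, 1, 0]                     # hypothetical, 20 beans

mean, sd = histogram_moments(edges, beans)
print(mean)  # subjective mean price implied by the allocation
print(sd)    # subjective standard deviation
```

Because all mass in a bin is placed at its midpoint, wider bins make the resulting moments coarser, which is the precision argument for using more, narrower intervals.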
The support of the distribution over which the respondent should answer can either be
predetermined or elicited directly from the respondent. Because eliciting the support usually
involves some real time calculations, enumerators using paper and pencil methods are less likely
to make mistakes with a pre-determined support. In addition, if the whole distribution needs to
be compared across respondents, then a pre-determined (common) support is required. However,
if the likelihood of the event is very different across respondents, then a pre-determined support
may jeopardize accuracy if the intervals are too coarse. In the end, pilot testing will be necessary
to refine the support so that no probability mass is allocated to the extremes of the support and
that the grid of the support is fine enough so that the probability mass is distributed across
several grid points.
If the event is discrete, then the support can easily be pre-determined. For example, Giné,
Townsend and Vickery (2008) elicit the probability that the monsoon will start in a given 15-day
period and so the support they use contains all 15-day periods spanning two months prior to the
normal onset and three months afterwards. No respondent gave mass to the first or last period in
the support, again suggesting that they were not constrained by it.
An example of a pre-determined support with a continuous outcome comes from the
panel data on Sri Lankan microenterprises, collected by de Mel et al. (2008). Microenterprise
owners were first asked how much they expected profits to be in the following two months, and
based on their answer they were then given a grid with between 17 and 29 profit range bins. The
bins were wider at the end points and narrower in the middle to cover a wide range of answers
5
For example, Hill (2007, p.40) writes "For each farmer the beans were split into seven 100 shilling intervals from
100 to 800. A common lower and upper limit was placed on the data. The class mark for each of these 100 shilling
classes was taken as the midpoint of the class. The mean was calculated as $\sum_{i=1}^{7} f_i x_i$, where $x_i$ is the
given class mark for class $i$ and $f_i$ is the probability the price would fall into this class."
while distinguishing the distribution more finely around the majority of points. Microenterprise
owners with lower profits were given a lower number of bins. They find that only 17 percent of
owners allocate any mass to the extreme points of the distribution, with most of this coming
from poor owners allocating mass to the chance their profits could be very low or zero,
suggesting that for most owners this predetermined support did not constrain the respondents'
answers.
If the support needs to be elicited, enumerators typically ask for the maximum and
minimum of the support, and compute one or several midpoints in between. For example, Giné
and Klonner (2007) elicit expectations about future fish catches among boat owners in India. They
ask owners what the minimum and maximum number of catches they expect is, and use this to
compute the midpoint M. They then compute two additional midpoints, the first between the
minimum and M and the second between M and the maximum so they end up with a 5 point
support. Similarly Attanasio and Kaufman (2008) ask the maximum and minimum, and then use
this to arrive at a midpoint.
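The real-time calculation an enumerator has to perform for a five-point support of this kind can be sketched as follows. The function name and the example minimum and maximum are hypothetical.

```python
def five_point_support(minimum, maximum):
    """Build a five-point support from an elicited minimum and maximum.

    M is the midpoint of the range; two further points are the midpoints
    of [minimum, M] and [M, maximum].
    """
    m = (minimum + maximum) / 2
    lower_mid = (minimum + m) / 2
    upper_mid = (m + maximum) / 2
    return [minimum, lower_mid, m, upper_mid, maximum]


# Hypothetical respondent: minimum catch of 20 and maximum of 100.
print(five_point_support(20, 100))  # [20, 40.0, 60.0, 80.0, 100]
```

Even arithmetic this simple is a source of error under paper and pencil conditions, which is the argument for a pre-determined support or for precomputed lookup tables.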
At present it seems unlikely that any one particular method of eliciting expectations will
dominate in all survey contexts: some methods will likely work better in some applications,
while others will work better in others. The education level of the participants (and the
enumerators) and the extent to which there are local parallels in games of chance are important
factors in deciding which method to use. Researchers should therefore allow sufficient time in
piloting to contextualize the method to their local context, and to make sure that respondents
appear to understand the concept of probability used in the questions they pose. Nevertheless,
from a methodological perspective it would also be useful for several future studies to randomize
the expectations elicitation method across different subsamples, in order that explicit comparison
of methods may be made.
2.3. What should we expect from asking "what do you expect" in lieu of subjective expectations?
Often the main item of interest is the subjective mean. Eliciting the full subjective
distribution is a necessary first step in obtaining this mean. A commonly used alternative is to
simply ask individuals "what do you expect?" For example, Jensen (2006) and Nguyen (2008)
ask students "how much do you think you will earn" under different scenarios. A common
critique of such approaches is that it is not clear whether individuals answer with a mean, mode,
median, or something else. However, we are not aware of any empirical evidence which
compares the simpler "what do you expect" approach to the more formal methods for eliciting
subjective expectations discussed in this paper in a developing country. 6
We therefore compare the two methods using panel data on Sri Lankan microenterprises,
collected by de Mel et al. (2008), who use the elicited subjective distribution to measure the
uncertainty individuals have about future profits. They elicited expectations about future profits
of the enterprise in two ways. The first was to directly ask the owner "How much do you expect
the profits of your business to be in December?", two months into the future. Secondly, they
elicited expectations of the full subjective distribution of future profits as explained above. We
use the elicited subjective distribution to calculate the subjective mean, median, and mode of
future profits. 7
Figure 2 plots the subjective mean, median and mode against the individual's answer to
the simple "what do you expect" question (hereafter referred to as the "simple expectation") for
the 564 microenterprises answering all questions. Figure 2 shows a strong positive association
between the different measures: the correlation between the simple expectation and the
subjective mean and median is 0.90. Nevertheless, there is considerable scatter around the 45
degree line. Only 19 percent of the simple expectations are within 10 percent of the subjective
mean, only 34 percent are within 25 percent of the subjective mean, and 29 percent differ in
magnitude by 50 percent or more.
In terms of whether the simple expectation gives a mean, median, or mode, we find that
in only 23 percent of cases is it closest to the mean, in 55 percent it is closest to both the median
and the mode, in 11 percent of cases it is closest to the median alone, and in 7 percent of cases it
is closest to the mode alone. Thus the simple expectation is not giving the mean in the majority
of cases.
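The classification used in this comparison can be sketched as below. How ties are treated (for instance, when the subjective median and mode coincide) is our assumption for illustration, and the numerical values are hypothetical.

```python
def closest_statistic(simple, mean, median, mode):
    """Return the statistic(s) closest to the simple expectation.

    A list is returned so that ties are reported explicitly, e.g.
    ["median", "mode"] when the median and mode coincide and are
    jointly closest to the respondent's answer.
    """
    distances = {
        "mean": abs(simple - mean),
        "median": abs(simple - median),
        "mode": abs(simple - mode),
    }
    best = min(distances.values())
    return sorted(name for name, d in distances.items() if d == best)


# Hypothetical respondents (profits in rupees).
print(closest_statistic(5000, 5600, 5000, 5000))  # ['median', 'mode']
print(closest_statistic(5500, 5600, 5000, 4000))  # ['mean']
```

Tabulating these labels across respondents gives shares of the kind reported in the paragraph above.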
6
Engelberg et al. (forthcoming) compare point predictions of GDP growth and inflation with the subjective
probability distributions held by professional forecasters. They find that point predictions are quite close to the
central tendencies of subjective distributions but that the deviations between point predictions and the central
tendencies tend to be asymmetric, with forecasters tending to report point predictions that give a more favorable
view of the economy than do their subjective means/medians/modes. Delavande and Rohwedder (2007) compare
individuals' point expectations about Social Security expected claiming age with their elicited subjective
distributions and find that respondents are more likely to report the median or the mode than the mean of their
distribution when providing a point estimate.
7
The subjective mean is estimated by taking midpoints of each profit interval to discretize the distribution. For the
end points, we added the length of the previous interval to the endpoint. E.g. if the last two intervals were 9,000 to
9,999 in profits, and 10,000 and above, the last interval was assigned a value of 11,000 in calculating the mean. Since
there were few observations in the last intervals, this choice makes little difference to the results.
Thus we have seen that the simple expectation, although correlated with the subjective
mean and median, can differ substantially from them. The key question is then which does a
better job of predicting future outcomes. The subsequent wave of the Sri Lankan Microenterprise
Surveys collected data on December 2005 profits, allowing us to see which method gets closer to
the realized profits. A first measure of fit is obtained by comparing the mean and median
absolute error between the two methods. The mean (median) absolute error is 3798 (2000) when
the simple "what do you expect" question is used, versus 3047 (1852) when the subjective mean taken
from eliciting the full distribution is used. Thus the mean absolute error is 25 percent higher
using the simple question.
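The fit comparison can be sketched as follows. The respondent-level numbers are hypothetical and serve only to show the calculation; the final lines check the 25 percent figure against the mean absolute errors reported in the text.

```python
from statistics import mean, median


def absolute_errors(predictions, realized):
    """Absolute forecast error for each respondent."""
    return [abs(p - r) for p, r in zip(predictions, realized)]


# Hypothetical data for four enterprises (rupees).
realized = [4000, 7500, 3000, 12000]
simple_expectation = [6000, 7000, 8000, 10000]
subjective_mean = [5000, 7000, 4500, 10500]

err_simple = absolute_errors(simple_expectation, realized)
err_subjective = absolute_errors(subjective_mean, realized)

print(mean(err_simple), median(err_simple))          # 2375 2000.0
print(mean(err_subjective), median(err_subjective))  # 1125 1250.0

# Check of the figures reported in the text: 3798 versus 3047.
ratio = 3798 / 3047
print(round(100 * (ratio - 1)))  # 25
```

The median absolute error is less sensitive to a few very large misses than the mean absolute error, so reporting both helps separate typical accuracy from the influence of outliers.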
Table 1 regresses actual December profits on the different elicited expectations. Columns
1 to 3 use the simple "what do you expect" question, and columns 4 to 6 the subjective mean. Both
are positively associated with true profits, but the coefficient is substantially larger for the
subjective mean: for each additional 100 rupees of expected profits according to the simple
expectation, true profits are 26 rupees higher, compared to 47 rupees higher with the subjective
mean. We can reject that either coefficient is one. This is consistent either with attenuation bias,
if the elicited expectation equals the true subjective mean plus some measurement error, or with
those who give very high expectations being too optimistic and those who give very low
expectations being too pessimistic.
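The attenuation argument can be made explicit under textbook classical measurement error assumptions. The notation below is ours and is introduced only for illustration: $y_i$ denotes realized profits, $m_i$ the true subjective mean, and $e_i$ the elicited expectation, with $\varepsilon_i$ and $u_i$ assumed uncorrelated with $m_i$ and with each other.

```latex
\begin{align*}
  y_i &= m_i + \varepsilon_i, \qquad e_i = m_i + u_i, \\
  \operatorname{plim}\hat{\beta}
      &= \frac{\operatorname{Cov}(y_i, e_i)}{\operatorname{Var}(e_i)}
       = \frac{\operatorname{Var}(m_i)}{\operatorname{Var}(m_i) + \operatorname{Var}(u_i)} < 1
       \quad \text{whenever } \operatorname{Var}(u_i) > 0 .
\end{align*}
```

Under these assumptions, a noisier elicitation method (larger $\operatorname{Var}(u_i)$) yields a slope further below one, even if expectations are correct on average.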
Columns 3 and 6 of Table 1 add actual September 2005 profits, collected at the same
time as the expectations questions were asked. They show that the expectations continue to have
predictive power about future profitability, even conditional on current profitability. Part of the
reason for the relatively poor performance of the simple expectation appears to be the presence
of outliers. The standard deviation of the simple expectation is 1.8 times that of the subjective
mean. Eliciting subjective expectations, by forcing microenterprise owners to consider the
likelihood of each different range of profits occurring, seems less prone to outliers than a one-off
casual answer to the simple expectation. Even after we trim the top and bottom 1 percent of the
simple expectation and the subjective mean, the mean squared error from using the simple
expectation is still 1.63 times that using the subjective mean. Hence there is considerable
accuracy gain from eliciting the subjective mean rather than simply asking "what do you
expect".
2.4. Is a self-reported maximum actually a maximum?
As we have discussed in Section 2.2, several of the methods of eliciting the subjective
distribution ask respondents for a maximum and a minimum, and then assume that the
distribution is truly bounded by the elicited maximum and minimum. However, just as simply
asking what you expect is unlikely to give you a statistical mean, we believe that in practice
individuals are unlikely to reply with a statistical maximum or minimum to such questions, but
rather answer with some percentiles on their subjective earnings.
To demonstrate this in practice, we use data from McKenzie et al. (2007) on expectations of
incomes in New Zealand for Tongans in Tonga, and on expectations of income in Tonga for
Tongan migrants in New Zealand. The survey first asked individuals for the lowest and the
highest amounts they thought they could be earning per week if they were in the other country.
The midpoint of this high and low was then used to define four threshold values, and
respondents were asked the percent chance that their income would be less than each of these
four threshold values. It transpires that for 95 individuals, the top threshold value was exactly
equal to their reported highest amount. 8 Table 2 then tabulates these respondents' answers of the
percent chance of being above the self-reported maximum. 9 Only 17 percent say there is 0
percent chance of being above the maximum, with a median of 5 percent and mean of 9 percent
as the chance of being above. There are a few observations with even higher reports. These
results are then consistent with individuals giving the 95th or 90th percentile of their subjective
distribution as the answer when asked for a maximum, and suggest caution in assuming that one
gets a true maximum through a non-probabilistic question.
8
Respondents were asked a maximum and a minimum for the income they could earn. The midpoint of this was
then used to choose which of 26 pre-assigned sets of Y1, Y2, Y3, and Y4 thresholds to ask the respondent about. For
example, if the respondent said a minimum income of $300 and maximum of $500, the midpoint would be $400.
Anyone with a midpoint between $400 and $449 was asked the percent chance their income was less than thresholds
Y1=350, Y2=400, Y3=450, and Y4=500. Thus in this example, the Y4 threshold and maximum would coincide. In
contrast, if they said a maximum of $550, they would still be asked about the same thresholds, but the maximum and
upper threshold would not coincide.
9
Calculated as 100 minus the percent chance of being below this threshold.
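The calculation described in footnote 9, and the summary statistics of the kind reported in Table 2, can be sketched as below. The function names and the sample answers are hypothetical and are not the Tongan survey data.

```python
from statistics import mean, median


def chance_above_maximum(pct_below_top_threshold):
    """Percent chance of exceeding the self-reported maximum:
    100 minus the reported percent chance of being below it."""
    if not 0 <= pct_below_top_threshold <= 100:
        raise ValueError("percent chance must lie between 0 and 100")
    return 100 - pct_below_top_threshold


def summarize_chance_above(pct_below_answers):
    """Return (share answering exactly 0, median, mean) of the chance above the maximum."""
    above = [chance_above_maximum(p) for p in pct_below_answers]
    share_zero = sum(1 for a in above if a == 0) / len(above)
    return share_zero, median(above), mean(above)


# Hypothetical answers: percent chance that income is below the top threshold,
# for respondents whose top threshold equals their self-reported maximum.
answers = [100, 95, 95, 90, 80]
print(summarize_chance_above(answers))
```

A true maximum would imply a chance of zero for every respondent, so any positive median or mean indicates that the reported "maximum" is an upper percentile instead.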
3. Do people in developing countries understand probabilities?
3.1. Do respondents respect basic properties of probabilities?
One important possible reason given for not collecting probabilistic expectations from
survey respondents is a fear that poor illiterate individuals do not understand the concept of
probability. We review here evidence that shows that elicited probabilistic expectations follow
basic properties of probabilities in a developing country context.
Delavande and Kohler (2008a) ask respondents in rural Malawi to allocate 10 beans to
reflect the likelihood of some events occurring. They evaluate whether respondents understand
the concept of probability by asking about two nested events: going to the market within (a) two
days, and (b) two weeks. If respondents understand the concept of probability, they should
provide an answer for the two-week period that is larger than or equal to the one of the two-day
period. 10 A remarkably high number of respondents provided an answer for the event "going to
the market within two days" that was smaller than or equal to their answer for the event "going
to the market within two weeks." Only 19 respondents out of 3,222 (0.6%) violated the property of
the probability of nested events. This high consistency rate is not driven by respondents providing
the same answer to both questions (only 6% of the respondents did). Adding 2 or 3 beans was the
most common response when the length of time increased from two days to
two weeks.
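The nested-events check is simple to implement on bean counts. The sketch below is illustrative: the function name `nesting_summary` and the sample responses are hypothetical, and only the final line uses the counts reported in the text.

```python
def nesting_summary(responses):
    """responses: list of (beans_two_days, beans_two_weeks) pairs.

    Returns (share violating nesting, share giving identical answers).
    A violation is assigning more beans to the shorter horizon than to
    the longer horizon that contains it.
    """
    n = len(responses)
    violations = sum(1 for days, weeks in responses if days > weeks)
    identical = sum(1 for days, weeks in responses if days == weeks)
    return violations / n, identical / n


# Hypothetical responses out of 10 beans, NOT the Malawi data.
sample = [(2, 5), (3, 5), (4, 4), (6, 3)]
print(nesting_summary(sample))

# Violation rate reported in the text, in percent.
print(round(100 * 19 / 3222, 1))  # 0.6
```

Reporting the share of identical answers alongside the violation rate matters because a respondent who repeats the same answer satisfies the property trivially.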
Attanasio, Meghir and Vera-Hernandez (2005) elicit income expectations in Colombia.
They first ask about the minimum and maximum expected household income in the next month.
Using a ruler they then ask respondents the probability that actual income would fall between the
minimum and the mid-point between the minimum and the maximum (group A), or the
probability that actual income would fall between the mid-point and the maximum (group B).
Respondents were randomly allocated into group A or B. To evaluate whether respondents
understand the basic laws of probability, they test the hypothesis that the sum of the averages of the
probabilities answered by group A and B equals 1, and cannot reject it. This leads them to
conclude that one cannot reject the hypothesis that respondents' answers conform to the basic
laws of probabilities.
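One way such a test could be carried out is sketched below. The text does not state which test statistic the authors use, so the two-sample z-statistic here is an assumption, as are the function name `complement_test` and the invented answers.

```python
from math import sqrt
from statistics import mean, variance


def complement_test(prob_lower_half, prob_upper_half):
    """Test H0: mean(group A) + mean(group B) == 1 for two independent groups.

    prob_lower_half: group A answers, P(income between minimum and midpoint)
    prob_upper_half: group B answers, P(income between midpoint and maximum)
    Returns (sum of the two group means, z-statistic).
    """
    total = mean(prob_lower_half) + mean(prob_upper_half)
    # Groups are randomly assigned, so the two sample means are independent
    # and their variances add.
    se = sqrt(variance(prob_lower_half) / len(prob_lower_half)
              + variance(prob_upper_half) / len(prob_upper_half))
    return total, (total - 1) / se


# Hypothetical answers, NOT the Colombian data.
group_a = [0.4, 0.5, 0.6, 0.5]
group_b = [0.5, 0.6, 0.4, 0.6]
total, z = complement_test(group_a, group_b)
print(total, z)
```

Because each respondent answers only one of the two questions, this design tests coherence on average across groups, not for any individual respondent.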
10
Interviewers were instructed to leave the number of beans on the plate after the respondents had responded to the
likelihood of going to the market within two days, thereby ensuring that s/he remembers the answer when answering
about the two-week period in the next question.
Mahajan et al. (2008) use 10 stones as a visual aid to get participants in Orissa, India to
express probabilities and ask individuals the maximum and minimum income their household
would earn in the next agricultural year. Unlike many applications, they do not force individuals
to use only 10 stones, but instead give them 10 stones to express the probability that earnings
would be above the midpoint, and separately 10 stones to express the probability that earnings
would be below the midpoint. 11 They find that only 513 out of the 1945 individuals have
probabilities adding exactly to one. However, the mean sum of probabilities is 1.13, which falls
to 1.06 if they exclude individuals who say the probability that income is below the midpoint (or
above the midpoint) is exactly zero or one. Since the stones only allow individuals to express
probabilities in multiples of 0.1, the fact that the sum of the probabilities is within 0.1 of unity
when no restrictions were put on this is encouraging. Nevertheless, a minority of
respondents report probabilities that sum to something substantially different from
one, indicating that not everyone understood the concept.
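The summary statistics used in this kind of check can be computed as in the sketch below, which assumes each respondent's answer is recorded as a pair of stone counts. The function name `stone_sum_summary` and the sample responses are hypothetical and are not the Orissa data.

```python
def stone_sum_summary(responses, stones_per_question=10):
    """responses: list of (stones_above_midpoint, stones_below_midpoint).

    Returns (share whose probabilities sum exactly to one,
             mean sum over all respondents,
             mean sum excluding anyone answering exactly 0 or 1 to either question).
    """
    k = stones_per_question
    share_exact = sum(1 for a, b in responses if a + b == k) / len(responses)
    mean_all = sum(a + b for a, b in responses) / (k * len(responses))
    interior = [(a, b) for a, b in responses if 0 < a < k and 0 < b < k]
    mean_interior = sum(a + b for a, b in interior) / (k * len(interior))
    return share_exact, mean_all, mean_interior


# Hypothetical stone counts out of 10 per question.
sample = [(5, 5), (10, 10), (6, 5), (7, 4), (0, 10)]
print(stone_sum_summary(sample))
```

Working with integer stone counts and dividing only at the end avoids spurious floating-point mismatches when testing whether a sum is exactly one.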
Finally we note that work by developmental psychologists in developed countries has
found that children as young as five or six are capable of understanding probabilities when visual
aids are provided (see Reyna and Brainerd, 1995 for a review), and that the concept of
probability can be communicated to children in experimental economic games (see Harbaugh et
al. 2002). It therefore seems unreasonable to argue that adults in developing countries are unable
to understand risk and probabilities, or to cite this as a reason for not asking these questions.
3.2. Individuals' expectations and observable characteristics
Researchers seeking to validate elicited expectations in developed countries have
determined that they vary with risk factors in the same way as the actual outcomes do (e.g., Hurd
and McGarry, 1995). Only a small portion of the papers using probabilistic expectations in
developing countries have undertaken a systematic comparison between risk factors or
characteristics and subjective expectations, but the papers which have done so support the view
that people understand probabilities.
For example, Delavande and Kohler (2008a) find that, despite substantial heterogeneity
in beliefs, the median and percentiles of the distribution of beliefs vary with observable
characteristics in the a priori expected direction. For example, respondents' subjective
11
Note that they do not ask the probability that earnings would be exactly the midpoint. If the midpoint is a round
number, it is conceivable there could be mass on this point, especially for wage earnings.
probabilities about experiencing food shortages and the need to rely on family members for
financial assistance in the next 12 months vary meaningfully with respondents' socioeconomic
status (SES). Those who have more education, are married, own relatively more land and have
any savings report lower subjective probabilities of experiencing a food shortage and the need to
rely on financial assistance than their counterparts who have low education, are
divorced/separated/widowed, own little land or do not have savings. Similarly, 1-year, 5-year
and 10-year mortality expectations vary with age, education, HIV status (not known to the
respondents at the time of the survey), number of sexual partners and the time horizon as
expected, e.g., respondents who have more sexual partners (and are thus at higher risk of being
infected with HIV) tend to report a higher probability of dying within the next years (Delavande
and Kohler, 2008a).
Similarly re-assuring associations are found regarding income expectations. For example,
Attanasio, Meghir and Vera-Hernandez (2005) find that the minimum and maximum expected
household incomes increase with education level of the household head and spouse, and with the
household size. McKenzie, Gibson and Stillman (2007) elicit the subjective distribution of
income if respondents from Tonga were to migrate to New Zealand by using a percent chance
format. They find that education is associated with a higher median of the distribution of income.
3.3. Individuals' expectations about future outcomes and past outcomes
Another way to evaluate whether respondents' elicited expectations are accurate is to
compare them with individual past outcomes or historical realizations.
Past outcomes experienced by individuals are found to be correlated with expectations
about future outcomes. For example, in Hill (2006), farmers in Uganda are asked to allocate
twenty beans into three price categories in accordance with what they thought was the likely
occurrence of coffee prices in six months' time. The most recent price received by the farmer is
found to be a strong predictor of both expected price and expected variance: a higher recent price
is associated with a higher expected price and a lower variance. In Malawi, respondents who
went to the market more frequently in the past month report a higher probability of going to the
market in the next 2 days and 2 weeks (Delavande and Kohler, 2008a). In Colombia, Attanasio,
Meghir and Vera-Hernandez (2005) find that households who experienced greater income
volatility in the past 3 years report a wider range of income, as measured by the difference
between the reported maximum and minimum income. Similarly, respondents in Tonga with
higher current income expect to receive a higher income if they were to migrate to New Zealand
(McKenzie, Gibson and Stillman, 2007).
Similarly, historical realizations are correlated with individuals' expectations in various
contexts. In Giné, Townsend, and Vickery (2007), respondents in India are instructed to place 10
stones in different boxes, each representing a time period, according to the likelihood that the
monsoon would start in each period. They find that the lower and upper bounds of the
subjective and historical distributions are remarkably similar. In Malawi, the ordering by region
of the mean and percentiles of the distribution of answers regarding experiencing a food
shortage is consistent with historical regional variation in drought and food shortage, and the
pattern of beliefs about infant mortality matches actual regional variation
(Delavande and Kohler, 2008a).
3.4. The formation of expectations and the role of information
Little is known about how individuals use available information to formulate their
subjective expectations. The fact that demographic characteristics, risk factors and past outcomes
are systematically associated with elicited beliefs suggests that those play a role in the formation
of beliefs. Additional evidence suggests that social networks and neighbours' outcomes matter.
For example, the results in Jensen (2007) are consistent with students making inference about the
return to schooling based on the experience of people in their community. As a result of
economic segregation, students in low income communities may underestimate the returns to
education, and therefore under-invest in education. Delavande and Kohler (2008a) also show
that external mortality-related events are key predictors of own mortality expectations. Among
relatives, parents' fate seems to be an important source of information: respondents who have lost
both parents expect to die sooner. In addition, respondents who have more than 3 relatives
who are sick with or have died from AIDS report higher mortality expectations, especially in the
short term. What happens to other people in one's environment is also relevant: both the number of
known people who died in traffic accidents in the past two years and the number of funerals
attended last month increase mortality expectations. All of this, however, points only to an
association between expectations and contextual outcomes; no causal effects can be concluded.
A few studies have analyzed how providing information influences beliefs. Lybbert et al.
(2003) provide computer-based climate forecasts to pastoralists in southern Ethiopia and northern
Kenya and observe their ex-post beliefs about the rainy season. They find that the beliefs of
pastoralists receiving and believing computer-based forecasts are consistent with updating in
response to forecast information, albeit with a systematic bias toward optimism. Nguyen (2008)
uses a randomized intervention to provide various types of information about the return to
schooling (mean earnings by education, the story of a role model, or both) to students in
Madagascar. Her results suggest that students and their parents are able to process new
information. She finds that informing students about the mean earnings reduces the gap between
perceived and actual returns, and improves test scores. A role model story alone or combined with
information about the mean earnings has a smaller effect. Delavande and Kohler (2008b)
investigate the response of an individual's subjective expectations about being infected with HIV
to information about his/her HIV status. They use a randomized design for HIV testing
(Thornton, forthcoming) and an instrumental variable technique to overcome the potential bias
from the self-selection of respondents into learning their HIV status. They find that
learning about one's HIV-positive status has no impact on medium-term subjective beliefs about
one's own HIV infection. They also show that learning one's HIV-negative status results in
higher subsequent beliefs about one's own infection, as well as larger prediction errors about
one's HIV status.
3.5. Does the forecasting horizon matter?
Another aspect that may affect the accuracy of the beliefs is the horizon of the event.
Table 3 presents some supporting evidence using a sample of 232 boatowners from coastal
Tamilnadu, India. 12 During a survey in the months of November 2005 to January 2006,
boatowners were asked to predict daily catches for the second week of April 2006 as well as for
the week following the survey. Columns 1 through 4 in Panel A of Table 3 regress the rupee value of
average daily catches during the second week of April 2006 against the mean of the elicited
subjective distribution and a constant (column 1), the mean of the elicited subjective distribution
alone (column 2), the mean of the subjective distribution and the catches obtained the day when
12
See Giné and Klonner (2007) for further details.
the distribution was elicited (column 3), and the variables in column 3 plus the average catches
during the previous week (column 4).
As expected, and similarly to Table 1 columns 4-6, the coefficient on the mean of the
subjective distribution is always positive and significant, even in columns 3 and 4 which control
for current and past catches. In addition, when the constant is omitted from the regression, the
point estimate on the mean of the subjective distribution is very close to 1, indicating that
expectations are on average unbiased. This is not true, however, when the constant is included
(column 1).
Columns 5-8 report the same regressions as columns 1-4 but for expectations and
outcomes of the following week's average daily catches. Notice that the coefficient and R-squared
in column (5) are significantly higher than those in column (1), suggesting that
expectations are more accurate when the horizon is (much) shorter. However, column (6)
suggests that boatowners tend to overpredict short-term catches. Perhaps more interestingly, the
coefficient on the mean of the subjective distribution in columns 7 and 8 is no longer significant
once we add current catches. The reason may be that over such a short horizon (one week), the
process is very stationary and so expectations do not add additional explanatory power over and
beyond that of current catches.
Panel B regresses the standard deviation of catches in the second week of April (columns
1-3) and the week following the survey (columns 5-7) on the standard deviation of the elicited
distribution. Interestingly, the coefficient on the standard deviation of the subjective distribution
is always significant, suggesting that the elicited subjective distribution can explain not only the
first but also the second moment. That is, individuals who express more uncertainty about their
average catch also experience more variability in future catches, even conditional on the
variability experienced in present catches.
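For readers unfamiliar with how the first and second moments are obtained from an elicited distribution, the following is a minimal sketch. It assumes the distribution is elicited by allocating stones (or beans) across bins and that all mass in a bin sits at its midpoint; the text does not specify the method used for the boatowner data, and the function name and numbers are hypothetical.

```python
from math import sqrt


def subjective_moments(stones, bin_midpoints):
    """Mean and standard deviation of a subjective distribution elicited
    by allocating stones across bins.

    Assumes all probability mass in a bin is located at the bin midpoint.
    """
    if len(stones) != len(bin_midpoints):
        raise ValueError("one midpoint is needed per bin")
    total = sum(stones)
    if total <= 0:
        raise ValueError("at least one stone must be allocated")
    probs = [s / total for s in stones]
    mean = sum(p * m for p, m in zip(probs, bin_midpoints))
    var = sum(p * (m - mean) ** 2 for p, m in zip(probs, bin_midpoints))
    return mean, sqrt(var)


# Hypothetical allocation of 10 stones over three catch-value bins (rupees).
mean_catch, sd_catch = subjective_moments([2, 5, 3], [1000, 2000, 3000])
print(mean_catch, sd_catch)
```

The midpoint assumption is only one of several possible choices; fitting a parametric distribution to the bin probabilities would give somewhat different moments.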
4. How burdensome is it to ask these questions?
Apart from the concern that individuals may not understand expectations questions, the other main
reason commonly given for not asking expectations questions is that these questions are
too costly in terms of the quality of enumerators and survey time. Enumerators need to be at
least high school graduates with an understanding of the concept of probability. These are not
hard requirements to meet since enumerators are typically educated and have basic knowledge of
mathematics and statistics.
In any event, the quality of the enumerators may still affect the accuracy with which
probabilities are elicited: some enumerators may do a better job explaining expectations
questions than others. We check this using the Sri Lanka microenterprise dataset of de Mel et al.
(2008). 19 enumerators were used by the survey firm (AC Nielsen Lanka) to interview 587
microenterprise owners spread over 24 geographic areas, with multiple enumerators used per
geographic area. We regress the absolute value of the difference between the realization and
subjective mean of profits against enumerator fixed effects and geographical dummies. The
enumerator fixed effects are jointly insignificant (p-value=0.43) when the mean from the
subjective distribution is used as expectation, but they are jointly significant (p-value=0.03)
when the "what do you expect" question is used instead. The F-test of enumerator effects in the
simple expectation question becomes insignificant once outliers are removed, so perhaps some of
the enumerators are better than others at identifying and double-checking implausible answers. This is
the only example we are aware of testing for enumerator effects in expectations questions, so it
would be useful for future research to further examine this issue.
In terms of survey time, eliciting a point estimate may take from 2 to 5 minutes, while
eliciting a full distribution can take up to 10 minutes if the support also needs to be elicited. But
because there is a fixed cost of explaining the question first, subsequent elicitations can be
obtained much more quickly. Despite the fact that eliciting expectations can take some time, our
experience has been that interviewees respond well to such questions. If visual aids are used, the
questions provide a nice break from more tedious survey questions. Indeed, item non-response
on the expectations questions in all of the datasets we have collected has been negligible.
It is true that most of the surveys which have elicited expectations in developing
countries have been specialized surveys with relatively small sample sizes, leading some to
wonder if these questions can be scaled up into large-scale nationally representative surveys.
However, several recent surveys have demonstrated that it is feasible to include such questions in
large surveys. For example, Delavande and Kohler (2008a) include a large number of
expectations questions in a survey of over 3,000 adults in Malawi, and Attanasio and Kaufmann
(2008) include them in a survey of 23,000 young adults in Mexico.
A final practical issue is whether respondents should be provided with monetary incentives for
their answers. 13 The fear is that respondents will not provide accurate enough answers if not
rewarded properly. Even if one believes that they should be rewarded, it is not clear how, since
the scoring rule may affect the elicitation process itself. One of the first studies to reward
respondents is Grisley and Kellogg (1983) who surveyed farmers in northern Thailand about
their price, yield and net income expectations at harvest time. Farmers were rewarded based on
their accuracy, yet critics thought that the scoring rule used could induce farmers to misrepresent
their beliefs, since risk aversion could influence their answers. Nelson and Bessler (1989) addressed
this concern by showing that for the untrained subject, that is, one who is unfamiliar with the
questions and methods, as most respondents would be, the scoring rule does not matter. In any
event, the question of whether rewarding elicitations leads to more accuracy has not yet been
settled, since Grisley and Kellogg (1983) did not randomize rewards. None of the recent studies
reviewed here provides monetary rewards during the elicitation process. As the next
section shows, even without payment, the answers received from such questions appear
reasonable.
5. Do subjective expectations help predict economic decisions and future outcomes?
We have shown that there are ways of eliciting subjective expectations in developing
countries, that individuals do understand probabilities, and that asking these questions need not
be prohibitive in terms of survey time. The key question is then whether there are demonstrated
benefits from including such questions.
An important rationale for eliciting subjective expectations is that they include
information not captured by the standard list of variables commonly collected, and as such, are
useful for predicting economic decisions and future outcomes of interest. Economic theory
certainly gives plenty of examples of cases where expectations should matter for predicting
behavior. For example, investment decisions should depend on the expectations of the return and
riskiness of the project, while classic theories of migration such as Sjaastad (1962) and Harris
and Todaro (1970) assume that expectations of incomes and employment probabilities in the
migrant destination should be important determinants of the migration decision. However, it is
ultimately an empirical question as to whether the expectations themselves yield any additional
13
This debate is much more prevalent in measurement of risk aversion.
predictive power beyond that offered by correlates of expectations, such as socioeconomic
characteristics and current levels of the outcome of interest. In this section we summarize the
results of several studies and provide additional evidence which together do suggest that
expectations are useful in predicting behavior and future outcomes.
5.1. Do subjective expectations predict future outcomes? Are they accurate?
The first question of interest is whether the expectations about future events elicited from
people in developing countries are informative about the likelihood of these future events
occurring, and if so, how accurate these expectations are. The available evidence seems to
suggest that the expectations are indeed informative of future events, with people more likely to
experience an event typically giving a higher elicited expectation of such an event occurring. The
accuracy of these expectations has been found to be high for reasonably common, regularly
occurring events, but lower for rare or unexperienced events.
An early example showing reasonably accurate expectations of common events is found
in work by Ravallion (1987), who asked rice traders in Bangladesh whether they thought the
price would go up, go down, or stay the same, and the amount of change. Expectations were
found to track prices fairly well on average, although there was a tendency to overestimate price
changes. More recently, reasonably accurate expectations have been found in several agricultural
contexts. For example, Luseno et al. (2003) find pastoralists in Ethiopia and Kenya have
reasonably accurate perceptions of rainfall and Lybbert et al. (2005) show that these same
pastoralists update their expectations when they receive external climate forecasts. In a more
complicated elicitation, Santos and Barrett (2006) elicit both expectations for rainfall next year,
and of herd size next year, and show that Ethiopian herders' expectations of herd size vary with
rainfall in a way which matches well the nonstationary herd dynamics that herd history data
would suggest.
However, recent studies have also shown that in a context of incomplete information,
expectations can be quite inaccurate, even if they predict future outcomes. Jensen (2006) finds
school children in rural Dominican Republic significantly understate the returns to education,
which he attributes in part to adults with education migrating to cities, leaving children in rural
areas with little information about the earnings and jobs that education brings. A related finding
is seen in McKenzie et al. (2007), who show that Tongans dramatically underestimate the
earnings possible when migrating to New Zealand. Delavande and Kohler (2008a) elicit
expectations about HIV/AIDS in Malawi. Individuals were tested for HIV at the time of the
survey, and many did not know their HIV status at the time of the expectations questions. They
do find that the elicited expectation of having HIV/AIDS varies with socioeconomic factors in a
way that matches the relationship with true status, and that individuals who were later revealed to
be HIV positive are more likely to expect that they have it. Nevertheless, they find that the
subjective expectation of being infected with HIV remains low for those who are later found to
have been infected. In the same study, they show that individuals dramatically overestimate the
likelihood of a rare event (death in the next year or next five years) occurring.
5.2. Do subjective expectations predict economic behaviors?
Several studies provide evidence that subjective expectations do affect economic
behavior. Some of the most striking evidence is found in studies which look at perceptions about
the returns to education. The first of such studies is Jensen (2006), who finds that while the
measured returns to schooling in the Dominican Republic are high, the returns perceived by
students are extremely low. He finds that when students at a randomly selected subset of schools
were informed of the returns estimated from earnings data, their perceived returns increased
when re-interviewed 4 to 6 months later, and that four years later they had completed on average
0.20 years more schooling. He finds further that the effects are non-existent for poorer
households, but large for students from households with above the median income per capita.
Nguyen (2008) uses a similar approach in Madagascar, and finds providing statistics on actual
returns reduces the dispersion in perceived returns, and improves test scores, particularly for
those underestimating the returns. Related evidence is also seen in Attanasio and Kaufmann
(2008) who find expected returns, perceived employment probabilities, and earnings risks to be
significant predictors of college and high school attendance choices in Mexico, but only for
richer individuals.
Agriculture is the other main area where expectations have been elicited in the
developing country context, and the existing literature has found expectations to be associated
with agricultural decisions. Hill (2007) finds that Ugandan farmers' expectations about future
coffee prices are significantly associated with the share of labor allocated to coffee. Her
respondents were asked to allocate twenty beans into three price categories in accordance with
what they thought was the likely occurrence of coffee prices in six months' time. The more beans
the farmer placed in the bottom category (corresponding to a negative return), the less labor the
farmer had allocated to coffee, conditional on a host of farmer and land characteristics: a
difference of 50 percent in the subjective expectation of a negative return is associated with a 10
percentage point reduction in the share of labor allocated to coffee. She interprets this as
evidence that downside risk is influencing behavior under uncertainty, but as the specification
does not also include a measure of the average expected return, the measure used reflects both
the mean and variance of expected prices.
Giné et al. (2008) use stones with bins to elicit expectations from Indian farmers about
when the monsoon will start. Using the mean of the subjective distribution, they find that farmers
who believe the monsoon is likely to start later are more likely to plant later, less likely to
replant, have purchased a lower share of total production inputs before the onset of the monsoon,
and are more likely to buy weather insurance even after controlling for a wide range of farmer
characteristics, including proxies for risk aversion and discount rates.
Bellemare (2008) shows how eliciting expectations can help explain the institution of
reverse share tenancy in Madagascar, where poorer landlords enter share-cropping agreements
with richer tenants, in contrast to standard share-cropping agreements where richer and
presumably less risk-averse landlords share some of the risk encountered by poorer tenants.
Bellemare hypothesizes that the risk of land expropriation will depend on the form of tenancy.
The landlords were given 20 tokens, and asked to distribute them between two boxes, where one
box represents a state of world where the landlord lost her claim to the land as a result of the
contract signed, and the other box one where she kept her claim to the land. The landlord did this
for both a sharecropping and a fixed rent contract, allowing calculation of the subjective
expectations of the landlord of asset loss under the two alternative contracts. The results show
that the expected chance of losing their claim to land is on average greater with a fixed rent
contract, and that the larger the gap in expectations between the two contract types, the more
likely the landlord is to enter a sharecropping contract.
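The token allocation described above maps directly into a subjective probability of asset loss under each contract, and the difference between the two gives the gap in expectations. The sketch below is illustrative only; the function names and token counts are hypothetical and are not taken from Bellemare (2008).

```python
def loss_probability(tokens_in_loss_box, total_tokens=20):
    """Subjective probability of losing the land claim, from a two-box token allocation."""
    if not 0 <= tokens_in_loss_box <= total_tokens:
        raise ValueError("token count must lie between 0 and total_tokens")
    return tokens_in_loss_box / total_tokens


def expectation_gap(loss_tokens_fixed_rent, loss_tokens_sharecropping, total_tokens=20):
    """Gap in the subjective probability of loss: fixed rent minus sharecropping."""
    for tokens in (loss_tokens_fixed_rent, loss_tokens_sharecropping):
        if not 0 <= tokens <= total_tokens:
            raise ValueError("token count must lie between 0 and total_tokens")
    # Difference the integer counts first to avoid floating-point rounding.
    return (loss_tokens_fixed_rent - loss_tokens_sharecropping) / total_tokens


# Hypothetical landlord: 6 of 20 tokens in the "lose claim" box under fixed
# rent, 2 of 20 under sharecropping.
print(loss_probability(6), loss_probability(2), expectation_gap(6, 2))
```

A positive gap under this sign convention means the landlord perceives fixed rent as the riskier contract for her claim to the land.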
Finally, in a migration context, McKenzie et al. (2007) use the percent chance
formulation to elicit expectations in Tonga of what income could be earned in New Zealand.
They find that, even conditional on current income and employment in Tonga, individuals who
expect to earn more in New Zealand are more likely to apply to work in New Zealand.
Moreover, the effect is large in size: for a given income level and employment status in Tonga,
an additional NZ$100 per week in median expected income in New Zealand is associated with a
10 percentage point increase in the likelihood of applying for migration. A move from the mean
25th percentile of expected earnings (NZ$140) to the mean 75th percentile of expected earnings
(NZ$230) is thus associated with a 19 percentage point increase in the probability of applying for
migration.
In many of the existing studies, the research question of interest has been whether first
moments predict behavior. However, there are many economic models where other moments of
the distribution are also important. As soon as one moves from risk neutrality to risk aversion,
individuals will also care about second moments in making their decisions. An example where
eliciting the whole subjective distribution allows the authors to examine the impacts of both first
and second moments is provided by Giné and Klonner (2007). They study the amount of credit
that auctioneers give to their client boatowners after they have adopted a new fishing vessel.
Since the boatowner's ability is uncertain, and because contracts between the auctioneer and the
boatowner are not exclusive, the amount of debt that an auctioneer is willing to extend should
depend positively on the mean of the prior distribution but negatively on its variance. By directly
eliciting the auctioneers' prior distributions, the authors find that indeed the amount of debt
relates to these two moments as theory predicts.
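To make the role of first and second moments concrete, the sketch below computes the mean and variance of a subjective distribution elicited by allocating stones (or tokens) across bins. It is a minimal illustration rather than the procedure used in any of the studies cited: the function name, the bin midpoints, and the stone counts are hypothetical, and we assume that all probability mass in a bin sits at its midpoint.

```python
def subjective_moments(bin_midpoints, stones):
    """Return (mean, variance) of an elicited distribution.

    Assumption for this sketch: the probability attached to each bin is the
    share of stones placed in it, and that mass sits at the bin midpoint.
    """
    if len(bin_midpoints) != len(stones):
        raise ValueError("need one stone count per bin")
    total = sum(stones)
    if total <= 0:
        raise ValueError("at least one stone must be allocated")
    probs = [count / total for count in stones]
    mean = sum(p * x for p, x in zip(probs, bin_midpoints))
    variance = sum(p * (x - mean) ** 2 for p, x in zip(probs, bin_midpoints))
    return mean, variance


# Hypothetical respondent: 10 stones spread over three bins.
midpoints = [10, 20, 30]
stone_counts = [2, 6, 2]
mean, variance = subjective_moments(midpoints, stone_counts)
print(f"mean={mean:.2f}, variance={variance:.2f}")
```

Two respondents can report the same mean while spreading their stones very differently, which is why eliciting the whole distribution is needed to test predictions involving the variance.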
These studies demonstrate that subjective expectations can predict economic behavior in
developing countries. However, the existing studies do not allow us to say which measure of
subjective expectations predicts behavior better. That is, is behavior guided more by the median,
the mean, the mode, or other moments of the subjective earnings distribution? The answer to this
is likely to depend on the behavior under study, and on the degree of skewness in the subjective
earnings distribution. Nevertheless, we can examine how sensitive the results of two such studies
are to alternative measures of expectations.
Table 4, panel A examines the sensitivity of the finding of McKenzie et al. (2007) to
different moments and quantiles of Tongans' subjective distributions of earnings in New
Zealand. Column 1 repeats the specification used in that paper, showing that higher median
expected earnings are associated with a greater likelihood of migration. Columns 2 through 6
show the results when the mean, 10th, 25th, 75th and 90th percentiles are used instead of the
median. The point estimates are very similar across these specifications, and are always
statistically significant, in each case indicating that a NZ$100 increase in earnings expected at
this point in the distribution is associated with a 9-11 percentage point increase in the likelihood
of applying to migrate, conditional on current income and employment in Tonga. The level of
significance and pseudo R-squared are greater for larger quantiles of the distribution, suggesting
that it is the "upside" expectations that have the greatest predictive power. Columns 7 and 8 add
the standard deviation of the expected earnings distribution. This has a positive sign, which is
significant at the 10 percent level when used with the median, but not quite significant at the 10
percent level using the mean. Since most of the risk of earnings with migration is upside
potential compared to the low earnings in Tonga, this shows that as well as the average earnings,
the upside potential perceived helps predict migration decisions.
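For readers who wish to run this kind of sensitivity check on their own data, the sketch below shows one simple way to recover quantiles from a binned subjective distribution. It is only an illustration: the function name, bin edges, and counts are hypothetical, and the assumption that mass is spread uniformly within each bin is ours. McKenzie et al. (2007) describe how the quantiles used in Table 4 are actually calculated, which may differ from this approach.

```python
def subjective_quantile(bin_edges, stones, q):
    """q-th quantile of a binned subjective distribution.

    Assumption for this sketch: probability mass is spread uniformly
    within each bin, so the CDF is linear between bin edges.
    """
    if len(bin_edges) != len(stones) + 1:
        raise ValueError("need exactly one more edge than bins")
    if not 0 <= q <= 1:
        raise ValueError("q must lie between 0 and 1")
    total = sum(stones)
    if total <= 0:
        raise ValueError("at least one stone must be allocated")
    target = q * total
    cumulative = 0
    for i, count in enumerate(stones):
        if count > 0 and cumulative + count >= target:
            # Interpolate linearly inside the bin that contains the quantile.
            fraction = (target - cumulative) / count
            lower, upper = bin_edges[i], bin_edges[i + 1]
            return lower + fraction * (upper - lower)
        cumulative += count
    return bin_edges[-1]


# Hypothetical respondent: weekly earnings bins and 10 stones.
edges = [0, 100, 200, 300, 400]
stones = [2, 4, 3, 1]
for q in (0.10, 0.25, 0.50, 0.75, 0.90):
    print(f"quantile {q:.2f}: {subjective_quantile(edges, stones, q):.1f}")
```

Each quantile obtained this way can then be entered in turn as the expectations measure in the regression of interest, as is done in the columns of Table 4.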
In Table 5, the same specifications are run using data from Giné et al. (2008). Column 1
repeats the specification used in the paper, showing that a mean expectation of a later onset of the
monsoon is correlated with a lower likelihood of replanting. Columns 2 through 6 show the
results when the median, 10th, 25th, 75th and 90th percentiles are used instead of the mean, and
again show the point estimates are of similar size and significance across specifications.
6. Conclusions
Despite the importance of expectations for understanding and predicting the majority of
economic decisions, expectations questions are rarely included in developing country surveys.
This paper has shown through the lens of the existing literature and our own experiences that
these questions can be profitably included in developing country surveys in a variety of contexts.
One should not use as an excuse fears that poor, illiterate individuals do not understand
probabilities, since there is now sufficient evidence to suggest that they can. The questions do
not take a prohibitive amount of time, and they can play a strong role in predicting future
outcomes and future behaviors. This is not to argue that we should ask expectations about every
conceivable future outcome, or that asking expectations will always be justified at the margin
compared to other questions of interest. But given that expectations, together with preferences,
are the foundation of economic decision-making, we do argue that such questions should be
asked much more often than they currently are.
References
Attanasio, Orazio and Katja Kaufmann (2008) "Educational choices, subjective expectations,
and credit constraints", Mimeo.
Attanasio, Orazio, Costas Meghir, and Marcos Vera-Hernández (2005) "Elicitation, Validation,
and Use of Probability Distributions of Future Income in Developing Countries," Paper
prepared for the 2005 Econometric Society Meeting.
Bellemare, Marc (2008) "Insecure Land Rights and Reverse Share Tenancy in Madagascar",
Mimeo. Duke University.
Delavande, Adeline and Hans-Peter Kohler (2008a) "Subjective Expectations in the Context of
HIV/AIDS in Malawi", Working Paper, University of Pennsylvania.
Delavande, Adeline and Hans-Peter Kohler (2008b) "HIV Testing and Infection Expectations in
Malawi," Working Paper, University of Pennsylvania.
Delavande, Adeline and Susann Rohwedder (2007) "Eliciting Subjective Probabilities about
Social Security Expectations," Mimeo. Rand.
De Mel, Suresh, David McKenzie and Christopher Woodruff (2008) "Returns to capital: Results
from a randomized experiment", Quarterly Journal of Economics, 123(4): 1329-1372.
Dominitz, Jeff and Charles Manski (1997) "Using Expectations Data to Study Subjective Income
Expectations", Journal of the American Statistical Association 92(439): 855-67.
Engelberg, Joseph, Charles Manski and Jared Williams (forthcoming) "Comparing the Point
Predictions and Subjective Probability Distributions of Professional Forecasters," Journal of
Business and Economic Statistics.
Gibson, John and David McKenzie (2008) "The Microeconomic Determinants of Emigration and
Return Migration of the Best and Brightest: Evidence from the Pacific", Mimeo. World Bank.
Giné, Xavier and Stefan Klonner (2007) "Technology Adoption with Uncertain Profits: The
Case of Fibre Boats in South India", Mimeo. World Bank.
Giné, Xavier, Robert Townsend, and James Vickery (2008) "Rational Expectations? Evidence
from Planting Decisions in Semi-Arid India".
Grisley, W. and E.D. Kellogg (1983) "Farmers' Subjective Probabilities in Northern Thailand:
An Elicitation Analysis" American Journal of Agricultural Economics 65: 74-82
Harris, John and Michael Todaro (1970) "Migration, Unemployment and Development: A Two-
Sector Analysis", American Economic Review 60(1): 126-142.
Harbaugh, William, Kate Krause and Lise Vesterlund (2002) "Risk Attitudes of Children and
Adults: Choices Over Small and Large Probability Gains and Losses", Experimental
Economics 5(1): 53-84.
Hill, Ruth Vargas (2006) "Coffee Price Risk in the Market: Exporter, Producer and Trader Data
from Uganda", Mimeo. IFPRI.
Hill, Ruth Vargas (2007) "Using Stated Preferences and Beliefs to Identify the Impact of Risk on
Poor Households", Mimeo. IFPRI.
Jensen, Robert (2006) "The Perceived Returns to Education and the Demand for Schooling",
Mimeo Brown University.
Luseno, Winnie K., John G. McPeak, Christopher B. Barrett, Getachew Gebru and Peter D.
Little (2003), "The Value of Climate Forecast Information for Pastoralists: Evidence from
Southern Ethiopia and Northern Kenya," World Development, vol. 31, no. 9, pp. 1477-1494
Lybbert, Travis, Christopher B. Barrett, John McPeak and Winnie K. Luseno (2007), "Bayesian
herders: asymmetric updating of rainfall beliefs in response to external forecasts", World
Development 35(3): 480-497.
Mahajan, Aprajit, Alessandro Tarozzi, Joanne Yoong, and Brian Blackburn (2008) "Bednets,
Information, and Malaria in Orissa", Mimeo. Stanford University.
Manski, Charles (2004) "Measuring Expectations", Econometrica 72(5): 1329-76
McFadden, Daniel L., Albert C. Bemmaor, Francis G. Caro, Jeff Dominitz, Byung-Hill Jun,
Arthur Lewbel, Rosa Matzkin, Francesca Molinari, Norbert Schwarz, Robert J. Willis and
Joachim K. Winter (2005) "Statistical Analysis of Choice Experiments and Surveys"
Marketing Letters 16 (3-4): 183-196.
McKenzie, David, John Gibson and Steven Stillman (2007) "A land of milk and honey with
streets paved with gold: Do emigrants have over-optimistic expectations about incomes
abroad?" World Bank Policy Research Working Paper No. 4141.
Nelson, Robert G. and David A. Bessler (1989) "Subjective Probabilities and Scoring Rules:
Experimental Evidence" American Journal of Agricultural Economics 71(2): 363-369.
Nguyen, Trang (2008) "Information, Role Models and Perceived Returns to Education:
Experimental Evidence from Madagascar", Mimeo MIT.
Norris, Patricia E. and Randall A. Kramer. (1990) "The Elicitation of Subjective Probabilities
with Applications in Agricultural Economics" Review of Marketing and Agricultural
Economics 58(2-3):127-147.
Ravallion, Martin (1987) Markets and Famine, Clarendon Press, Oxford University.
Reyna, Valerie and Charles Brainerd (1995) "Fuzzy-trace Theory: An Interim Synthesis."
Learning and Individual Differences, 7: 1-75
Santos, Carlos and Christopher Barrett (2006) "Heterogeneous wealth dynamics: on the roles of
risk and ability", Mimeo. Cornell University.
Sjaastad, Larry (1962) "The Costs and Returns of Human Migration", Journal of Political
Economy 70(5): 80-93.
Thornton, Rebecca (forthcoming) "The Demand for and Impact of Learning HIV Status: Evidence
from a Field Experiment", American Economic Review.
Thornton, Rebecca and David Lam (2007) "Measuring Subjective Life Expectancy in Developing
Countries: The Case of Malawi and South Africa", Working Paper, University of Michigan.
Tourangeau, Roger, Lance J. Rips and Kenneth Rasinski (2000) The Psychology of Survey
Response New York NY and Cambridge, UK: Cambridge University Press.
Figure 1: Subjective Probability Vs Likert Response
[Figure not reproduced. Vertical axis: proportion answering Likert scale with given category (0 to 1). Horizontal axis: subjective probability of being infected with HIV/AIDS (0 to 1). Series: No likelihood, Low, Medium, High, Don't Know.]
Source: Malawi data from Delavande and Kohler (2008a).
Figure 2: Comparison of Simple "What do You Expect" and Mean, Median and Mode of
Subjective Distribution of Future Profits for Sri Lankan Microenterprises
[Figure not reproduced. Horizontal axis: simple question, "what do you expect?" (0 to 60000). Vertical axis runs from 0 to 60000. Series: mean of subjective distribution, median of subjective distribution, mode of subjective distribution, 45 degree line.]
Source: Sri Lankan microenterprise data from De Mel et al. (2008).
Table 1: Does the subjective mean do better than asking "what do you expect" in predicting the future?
Dependent Variable: December 2005 profits of Sri Lankan microenterprises
(1) (2) (3) (4) (5) (6)
Simple expectation ("How much do you expect?") 0.257*** 0.522*** 0.201***
(0.0527) (0.0467) (0.0590)
Mean of subjective profits distribution 0.468*** 0.825*** 0.370***
(0.0715) (0.0429) (0.0797)
Actual September 2005 profits 0.119* 0.133**
(0.0621) (0.0635)
Constant 3419*** 3235*** 2757*** 2617***
(374.6) (350.4) (379.1) (336.8)
Observations 487 487 482 487 487 482
R-squared 0.204 0.243 0.198 0.249
Notes:
Robust standard errors in parentheses, *** p<0.01, ** p<0.05, * p<0.1
Top and Bottom 1% of true profits in December 2005 trimmed to reduce sensitivity to outliers.
Table 2: Percent Chance of Earning Income above Maximum
Results from Tongans asked income expectations
Percent chance    # of observations    % of observations
0                 16                   17
2                 2                    2
5                 30                   32
10                29                   31
15                6                    6
20                7                    7
25                1                    1
30                1                    1
35                1                    1
40                1                    1
50                1                    1
TOTAL             95
Table 3: Does the forecasting horizon matter?
Dependent Variable: Daily fish catches in second week of April (columns 1-4) and daily fish catches next week (columns 5-8)
Panel A: Mean Daily catches in second week of April Daily catches next week
(1) (2) (3) (4) (5) (6) (7) (8)
Mean of subjective catches distribution 0.12*** 1.05*** 0.209*** 0.205*** 0.66*** 1.631*** -0.082 -0.095
(0.042) (0.037) (0.036) (0.035) (0.112) (0.072) (0.086) (0.086)
Actual catches on last day fishing 0.292*** 0.216*** 0.573*** 0.524***
(0.027) (0.044) (0.038) (0.056)
Mean catches on last week fishing 0.08** 0.055
(0.037) (0.047)
Constant 770.8*** 454.5*** 456.7*** 520.1*** 308.0*** 310.0***
(30.7) (39.3) (39.1) (52.0) (36.1) (36.1)
Observations 232 232 232 232 156 156 156 156
R-squared 0.035 0.355 0.369 0.184 0.679 0.682
Panel B: Standard Deviation
Std. Dev of subjective catches distribution 0.509*** 2.371*** 0.432*** 0.411** 2.003*** 0.797***
(0.119) (0.104) (0.079) (0.171) (0.140) (0.136)
Std. Dev of catches last week fishing 0.224*** 0.164**
(0.061) (0.064)
Constant 568.5*** 445.8*** 304.7*** 152.2***
(30.4) (31.8) (26.1) (27.0)
Observations 232 232 124 156 156 124
R-squared 0.073 0.692 0.229 0.036 0.569 0.318
Absolute value of t statistics in parentheses
* significant at 10% ; ** significant at 5%; *** significant at 1%
Table 4: How sensitive are migration decisions to the expectations metric used?
Marginal effects for a probit for the likelihood of applying to migrate under the Pacific Access Category
                                               (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
Median of subjective distribution              0.103                                                             0.097
                                               (2.19)**                                                          (2.04)**
Mean of subjective distribution                           0.115                                                             0.097
                                                          (2.44)**                                                          (2.02)**
10th percentile of subjective distribution                           0.102
                                                                     (1.76)*
25th percentile of subjective distribution                                      0.103
                                                                                (1.94)*
75th percentile of subjective distribution                                                 0.102
                                                                                           (2.47)**
90th percentile of subjective distribution                                                            0.092
                                                                                                      (2.67)***
Standard deviation of subjective distribution                                                                    0.087      0.073
                                                                                                                 (1.85)*    (1.54)
Current income in Tonga                        0.036      0.029      0.055      0.047      0.025      0.020      0.027      0.027
                                               (0.41)     (0.33)     (0.63)     (0.54)     (0.28)     (0.22)     (0.31)     (0.31)
Current employment in Tonga                    -0.058     -0.056     -0.061     -0.060     -0.054     -0.048     -0.047     -0.045
                                               (0.45)     (0.43)     (0.47)     (0.46)     (0.42)     (0.37)     (0.36)     (0.35)
Observations                                   128        128        128        128        128        128        128        128
Pseudo R-squared                               0.037      0.045      0.026      0.030      0.046      0.053      0.062      0.061
Notes:
T-statistics in parentheses, *, **, and *** indicate significance at the 10%, 5% and 1% levels respectively
Mean and quantiles are for the unconditional earnings distribution. See McKenzie et al. (2007) for details of how these are calculated.
Table 5: How sensitive are replanting decisions to the expectations metric used?
Marginal effects for a probit for the likelihood of having replanted in the last 10 years
(1) (2) (3) (4) (5) (6) (7) (8)
Median of subjective distribution -0.404*** -0.397***
(0.071) (0.071)
Mean of subjective distribution -0.466*** -0.458***
(0.075) (0.076)
10th percentile of subjective distribution -0.352***
(0.049)
25th percentile of subjective distribution -0.365***
(0.054)
75th percentile of subjective distribution -0.334***
(0.073)
90th percentile of subjective distribution -0.327***
(0.071)
Standard deviation of subjective distribution 0.348 0.323
(0.249) (0.253)
Individual definition of monsoon -0.008 -0.008* -0.009* -0.008* -0.009 -0.009 -0.008 -0.009*
(0.005) (0.005) (0.005) (0.005) (0.005) (0.005) (0.005) (0.005)
Observations 960 960 960 960 960 960 960 960
Pseudo R-squared 0.08 0.08 0.07 0.08 0.07 0.06 0.08 0.08
Notes:
Robust standard errors in parentheses, * significant at 10% ; ** significant at 5%; *** significant at 1%
Mean and quantiles are for the unconditional onset of monsoon subjective distribution.
Data from Giné et al. (2008).