WPS3748
The Construction and Interpretation of Combined
Cross-Section and Time-Series Inequality Datasets
Joseph F. Francois
Tinbergen Institute (Erasmus University Rotterdam) and CEPR
Hugo Rojas-Romagosa
Tinbergen Institute (Erasmus University Rotterdam)
Abstract: The inequality dataset compiled in the 1990s by the World Bank and extended by
the UN has been both widely used and strongly criticized. The criticisms raise questions
about conclusions drawn from secondary inequality datasets in general. We develop
techniques to deal with national and international comparability problems intrinsic to such
datasets. The result is a new dataset of consistent inequality series, allowing us to explore
problems of measurement error. In addition, the new data allow us to perform parametric
non-linear estimation of Lorenz curves from grouped data. This in turn allows us to estimate
the entire income distribution, computing alternative inequality indexes and poverty
estimates. Finally, we have used our broadly comparable dataset to examine international
patterns of inequality and poverty.
Keywords: Income distribution datasets, inequality trends, Lorenz curve estimation, poverty
estimation
JEL codes: D31, C80, O15
World Bank Policy Research Working Paper 3748, October 2005
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of
ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are
less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings,
interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent
the view of the World Bank, its Executive Directors, or the countries they represent. Policy Research Working Papers
are available online at http://econ.worldbank.org.
We acknowledge support from the EU research and training network (RTN) "Trade,
Industrialization, and Development," as well as research support from DFID and the World
Bank.
Address correspondence to: J. Francois, Tinbergen Institute, Erasmus University Rotterdam,
Burg Oudlaan 50-H8-18, 3000DR Rotterdam, NETHERLANDS.
Email: francois@few.eur.nl.
Data are available at www.intereconomics.com/francois/data.html.
Non-technical Summary
The empirical study of cross-country inequality benefits from, but is also limited by, the
heterogeneity and vast amount of available data. The inequality dataset compiled in the
1990s by the World Bank and extended by the UN has been both widely used and strongly
criticized. The criticisms raise questions about conclusions drawn from secondary inequality
datasets in general. We develop techniques to deal with national and international
comparability problems intrinsic to such datasets. The result is a new dataset of consistent
inequality series, allowing us to explore problems of measurement error. Our approach yields
six main inequality series that can readily be used in empirical tests and within these series
the implicit measurement error has been reduced.
In addition, the new data allow us to perform parametric non-linear estimation of Lorenz
curves from grouped data. This in turn allows us to estimate the entire income distribution,
computing alternative inequality indexes and poverty estimates. Working with these data, we
introduce improvements to existing methods for estimating Lorenz curves from grouped
data. Furthermore, using the resulting estimated Lorenz curves one can estimate Atkinson
indexes, which are a complement to the information provided by the Gini coefficient. We
find that in roughly a third of the cases both indexes report different inequality trends and
thus, the use of both indexes is advisable in order to obtain robust conclusions about income
inequality.
Finally, we have used our broadly comparable dataset to examine international patterns of
inequality and poverty. A first conclusion is that between-country inequality variation is
more significant than within-country. This suggests that country specific characteristics have
a bigger role in explaining inequality levels than time trends. However, we also find that
within-country inequality is still important and there are significant time trends in our series.
Therefore, we reject the "glacial change" hypothesis that inequality does not vary
significantly over time. For the specific case of OECD countries, we clearly detect a U-shape
pattern that confirms the "U-turn" hypothesis recently flagged by Atkinson. For developing
countries the cross-country pattern is less clear, but it suggests a decrease in inequality for
most of the analyzed period, with a slight increase in the 1990s. Country-specific time trends
are diverse and it is difficult to spot precise trends. The choice of income concept, of basic or
extended series, and the use of pooled data may produce different results. Nevertheless, this
variety of choice emphasizes the richness of our inequality dataset, which is not limited to a
single series and provides wider information from which to draw conclusions. With respect
to poverty, we find a decline in the poverty ratios in most of the countries covered by our
sample. The only (though admittedly quite significant) exception is the poverty experience in
the African continent.
Overview
There is a sizeable literature regarding the interaction between income inequality and other
economic variables, such as growth, poverty, trade and economic policy. Beginning with Kuznets
(1955), the theoretical work has steadily grown, and recently there has been a surge of interest in the topic,
reflected in a new wave of publications (Atkinson, 1997). Yet, the study of income inequality has
been seriously limited by data constraints. The introduction of a cross-country inequality dataset by
the World Bank (Deininger and Squire, 1996) has complemented the recent literature and has itself
launched a series of influential econometric studies.
While at the core of most recent work in the area, the structure of the inequality dataset
compiled by Deininger and Squire, henceforth DS, recently has been criticized by Atkinson and
Brandolini (2001), henceforth AB. These criticisms also extend implicitly to the recent extension of
the DS data in the World Income Inequality Database (WIID). AB forcefully argue for the need to
assess the mechanical use of such "secondary" datasets and to deal more systematically with the
measurement problems involved. In this paper we do this, focusing on the empirical and theoretical
difficulties related to income inequality measurement, analyzing the characteristics of secondary
datasets, and developing a methodological approach for reducing the measurement error problems
common to inequality information.
Substantial difficulties arise in the empirical measurement of inequality. The most basic is the
lack of an institution and agreed procedures that can assure data quality and consistency. In other
words, there is no equivalent to the United Nations System of National Accounts, which provides
macroeconomic statistics that are constructed by national agencies and are reasonably consistent
over time and countries. In the absence of such an institution, some organizations have constructed
"secondary" datasets, of which the best known are DS, the World Income Inequality Database
(UNU/WIDER-UNDP, 2000) and the Luxembourg Income Study (LIS). These datasets compile
available national inequality statistics and perform quality assessments of all the data observations.
This has been an important first step towards the creation of internationally comparable inequality
time series. The Deininger and Squire dataset combines a large number of inequality observations
for the entire world, with each observation classified following three quality criteria. More recently,
the World Income Inequality Database (WIID) has extended and updated the DS dataset, using
similar quality criteria. (Throughout this paper, we use the larger compilation of data provided by the
WIID as our main inequality data source.)
Beyond quality criteria issues, there are additional problems that increase the measurement error
present in national series and in international inequality comparisons. In particular, national
inequality statistics generally include observations that differ on concepts measured (e.g. expenditure,
gross and net income), reference units (e.g. household, person, family) and/or sources. Subsequently
we refer to these three distinctive characteristics as the inequality data definitions and we consider an
inequality series to be consistent when these definitions are comparable for all observations.
Although some countries have relatively extensive and consistent time series, the general rule is that
inequality observations are sparse and differ on definitions over time. Hence, to create relatively
extensive inequality time series that can be used in econometric studies, it is often necessary to
assume the comparability of some of the definitions to handle the problem of sparsity. Deininger
and Squire have assumed that all definitions are broadly comparable and used their "high quality"
observations to construct the most consistent inequality time series for each country. However, they
caution about the potential problems of this comparability assumption and as an alternative they
advise the use of dummy variables to adjust and account for different definitions. Using this
approach, they generated a single inequality series for a large number of countries, which has since
featured prominently in empirical research. While convenient, these simplifying
assumptions (i.e. the complete comparability of definitions and sources), introduce false patterns and
noise into the data. Furthermore, as AB have stressed, the use of dummy variables is not an
adequate solution to this problem.
In this paper, we build on previous efforts to overcome known limitations with secondary
inequality datasets. In particular, we assemble a combined inequality dataset based on a consistent
grouping methodology of heterogeneous observations from existing secondary datasets. This yields
a new cross-section and time-series dataset that we use to examine comparability problems, and to
then revisit recent estimates of the relationships between income distribution and other
macroeconomic variables. Our approach yields six main inequality series that can readily be used in
empirical tests and within these series the implicit measurement error has been reduced. We also
explore conceptual issues of measurement. There are important theoretical considerations with
regard to inequality measurement. While there are several indicators that measure inequality, there is
no consensus in favor of any particular index.1 We use non-linear parametric estimation of Lorenz
curves to approximate the entire income distribution, and then use these estimates to calculate the
Gini coefficient, four different Atkinson indexes, and poverty rates.
1 A comprehensive survey of the topic can be found in Cowell (2000).
We also use our broadly comparable dataset to examine international patterns of inequality and
poverty. A first conclusion is that between-country inequality variation is more significant than
within-country. This suggests that country specific characteristics have a bigger role in explaining
inequality levels than time trends. However, we also find that within-country inequality is still
important and there are significant time trends in our series. Therefore, we reject the "glacial
change" hypothesis that inequality does not vary significantly over time. For the specific case of
OECD countries, we clearly detect a U-shape pattern that confirms the "U-turn" hypothesis recently
flagged by Atkinson (2003). For developing countries the cross-country pattern is less clear, but it
suggests a decrease in inequality for most of the analyzed period, with a slight increase in the 1990s.
Country-specific time trends are diverse and it is difficult to spot precise trends. The choice of
income concept, of basic or extended series, and the use of pooled data may produce different results.
Nevertheless, this variety of choice emphasizes the richness of our inequality dataset, which is not
limited to a single series and provides wider information from which to draw conclusions. With
respect to poverty, we find a decline in the poverty ratios over time in most of the countries covered
by our sample. The only (though admittedly quite significant) exception is the poverty experience in
the African continent.
The paper is organized as follows. Section 2 explores difficulties involved in dealing with
inequality data. In Section 3 we assess comparability criteria and discuss the resulting
assumptions needed in order to consistently group different definitions and sources. In Section 4 we
estimate the Lorenz curves and Atkinson indexes from grouped income data, and also poverty ratios.
Working with the resulting dataset, in Section 5 we compare it with the DS and WIID series, and
also compare the results provided alternatively by the Gini and the Atkinson indexes. In Section 6
we then explore how international and inter-temporal inequality has changed over time within our
dataset. We conclude in Section 7.
2. Problems when dealing with inequality data
We divide the tasks involved in building a cross-country inequality dataset into two main groups.
The first group includes data compilation and quality control. These issues are relatively well
addressed by existing datasets. The second group includes those issues that are not yet convincingly
tackled: the inter-temporal and international comparability and consistency of inequality data.
2.1 Secondary datasets
A "secondary" dataset is a summary of national information that is drawn from household income
studies and micro-datasets produced by national surveys. The two most used datasets are the
Deininger and Squire (DS) dataset and the World Income Inequality Database (WIID). The WIID was
itself built on the DS dataset, which it has expanded with newly
available information. Thus, the WIID is the largest and most exhaustive compilation of inequality data
available. It provides up to 5067 data observations, for different definitions, coverage and quality
ratings. Therefore, we take it as our starting point and main source of information.
The secondary datasets provide two important advantages. They compile most of the available
inequality data into one source, and they check for the quality of each observation. The quality
controls used to filter information from the primary to the secondary datasets eliminate unreliable
data and inequality observations that are not representative of the whole country. Deininger and
Squire (1996) used three quality controls:
"The statistics were selected by requiring that they be from national household surveys
for expenditure or income, that they be representative of the national population, and
that all sources of income or expenditure be accounted for."
The WIID quality ratings are very similar to those of DS. However, there are some important
differences.2 In particular, the WIID considers as reliable data some of the observations that did not
have a clear reference to the primary source, while DS did not consider these observations in their
"high quality" dataset.3 The second important difference is the inclusion of observations based on
monetary income, which is not used in DS because it does not account for all sources of income.4
Finally, the WIID does not accept observations with a missing income concept, which implies that we do not consider
some of the DS observations. The reliable data ratings of the WIID are labeled as OKIN and, out of a
total of 150 countries, we have OKIN data for 141 countries.
2 These quality criteria differences introduce some divergences between our dataset and DS, which are not
accounted for by the comparability assumptions we use later.
3 As explained later, the inclusion of these observations significantly increases the number of inequality data points
in the 1960s and 1970s. Although the measurement accuracy may be reduced, it provides a valuable extension in the time
series (Barro, 2000).
4 We justify the inclusion of these observations in the following section. Mainly, the data included are from rich or
middle-income countries, for which non-monetary income is not expected to be significant.
The main difference between DS and WIID, however, is that the latter does not identify a single
time series for each country. Instead, the researcher has the full available information and has a wide
range of series to choose from. The disadvantage is that there is no clear indication on how to use or
join inequality observations with different definitions and/or sources.
Finally, it is important to mention that another source of inequality data, which we do not use, is
from the micro datasets provided by the Luxembourg Income Study (LIS) and the Living
Standards Measurement Study of the World Bank. We do not consider their use here because their
coverage, in terms of time and countries, is very limited. In addition, they are usually difficult to
access and obtaining summary statistics is very time consuming in an already burdensome process.5
2.2 Definition inconsistency
We follow the WIID and classify each data observation into six characteristics: concept measured,
reference unit, area coverage, population coverage, data sources and quality ratings. Technically there
are other distinctive characteristics that may significantly alter the inequality values, such as survey
methods, sample characteristics, the income items included (e.g. imputed rents for owner-occupied
housing, insurance premia, interest and dividends) and the time period considered.6 Nevertheless,
since our goal is to obtain a reliable cross-country dataset, we do not deal with these measurement
issues, which are generally country-specific, and we focus instead on the broad characteristics of the
inequality observations. Thus, our resulting dataset diminishes the measurement error embedded in
inequality data, but does not entirely eliminate this problem.
From the six main characteristics, we follow DS and select only data that cover the entire
population and have national coverage. Moreover, we use the quality criteria provided by the WIID
and select only the observations labeled as OKIN. After this first filtering of the information, we
are left with three characteristics: concept measured, reference unit and source. Since there are
multiple combinations of concepts and reference units, and usually more than one source per
country, we have what AB refer to as a "bewildering variety of estimates". That is, a number of
generally discontinuous series, with differences in one or two definitions and usually from different
sources. The problem can be better understood by looking at Table 1, which presents data for Chile.
5 Nonetheless, these data sets may be a preferable source of information for single country inequality analysis or
limited cross-country analysis.
6 Some of these issues can induce substantial measurement errors of their own. For example, imputed rents can
represent a significant share of household income in some countries and, if they are included in the household survey, they can
create an important source of distortion in the comparability of different inequality observations. Ravallion and Chen
(1997) and Atkinson and Bourguignon (2000) discuss these points further.
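The first filtering step described above can be sketched in a few lines. This is only an illustration: the record layout and the field names (`area`, `population`, `quality`) are hypothetical and are not the actual WIID variable names, and the sample records are invented.

```python
from typing import Dict, List

# Hypothetical record layout; field names are illustrative assumptions,
# not the variable names used in the WIID files.
Observation = Dict[str, object]


def first_filter(observations: List[Observation]) -> List[Observation]:
    """Keep observations with national coverage, full population
    coverage and an OKIN quality rating."""
    return [
        obs for obs in observations
        if obs["area"] == "national"
        and obs["population"] == "all"
        and obs["quality"] == "OKIN"
    ]


# Invented records, used only to exercise the filter.
sample = [
    {"country": "X", "year": 1990, "area": "national",
     "population": "all", "quality": "OKIN"},
    {"country": "X", "year": 1991, "area": "urban",
     "population": "all", "quality": "OKIN"},
    {"country": "X", "year": 1992, "area": "national",
     "population": "all", "quality": "NOOK"},
]
kept = first_filter(sample)
```

After this filter, the remaining observations still differ in concept measured, reference unit and source, which is the comparability problem discussed in the text.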
For this particular country we have seven different series, two income concepts (gross income
and gross monetary income), two recipients or reference units (person and household) and six
sources. However, the number of series and definitions involved can be larger in other countries. In
total, there are five different concepts and as many as nine different income recipients.7 Additionally
each data point provides the Gini coefficient and sometimes distribution shares in quintiles or
deciles of population. In Table 1 the observations with distribution shares are indicated by the data
in boxes. Moreover, the Gini coefficient can be given by the primary source or directly estimated
from the distribution shares when available.
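As an illustration of how a Gini coefficient can be obtained from quintile or decile shares, the sketch below applies the standard trapezoid approximation to the Lorenz curve. This is a generic textbook calculation and not necessarily the procedure used by the primary sources or by the WIID; because it treats income as equally distributed within each group, it gives a lower bound on the true Gini. The function name and the example shares are hypothetical.

```python
from typing import Sequence


def gini_from_shares(shares: Sequence[float]) -> float:
    """Trapezoid-rule Gini (0-100 scale) from the income shares of
    equal-sized population groups, ordered from poorest to richest."""
    total = float(sum(shares))
    n = len(shares)
    cum_prev = 0.0
    area = 0.0  # area under the piecewise-linear Lorenz curve
    for share in shares:
        cum_next = cum_prev + share / total
        area += (cum_prev + cum_next) / (2 * n)
        cum_prev = cum_next
    return 100.0 * (1.0 - 2.0 * area)


# Invented quintile shares (percent of total income).
example_gini = gini_from_shares([5, 10, 15, 20, 50])
```

With these invented shares the approximation gives a Gini of about 40. Perfectly equal shares give 0.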
These characteristics of the inequality data leave several questions to be answered. Can we mix
different definitions of income concepts and income recipients? Can we mix different sources? If
yes, how do we mix them? Which data observation should we choose when there is more than one
available for a given year? How many series should we analyze per country? In order to have an
inequality dataset that can be readily used for empirical research we must answer these questions,
which are relevant if we want to analyze country time series, as well as cross-country analysis. Given
that especially in developing countries there are not many inequality observations, we must
formulate assumptions regarding the combination of the different definitions in order to obtain at
least one series per country.
2.3 Deininger and Squire approach
DS assume that all definitions are broadly comparable and instead, focus on the quality of the
observations. Thus, they freely mix the different definitions, regardless of income concept or
reference unit. However, they acknowledge the potential measurement errors that this approach may
cause and recommend the use of dummy variables to deal with the problem.
This strategy allows them to present a single time series for each country, which is very
convenient for empirical studies. Moreover, when there is more than one observation per year that
satisfies their three quality criteria, they choose the observation that is most consistent with the rest
of the series. In other words, they try to maintain the same income concept, recipient unit and
source when possible. Their final inclusion criterion is that the observation originates from an
official publication. In the last column of Table 1, we present their "high-quality" data observations,
which are labeled as DS-accept. For this particular country, they mix household and person
reference units and three different sources. The 1994 observation they use is not included in our
series because it does not properly define the type of income it uses. Because of this missing
information the WIID rates the observation as not reliable (NOOK), and we do not use it.8
7 The concepts are gross income, gross monetary income, net income, net monetary income and expenditure. The
most common reference units are person, household, household per capita, household equivalent, family and family
equivalent.
Nevertheless, the grouping procedure of DS has been strongly criticized by AB. Using a sample
of OECD countries, they show how inconsistent such constructed series can be. In many cases the
constructed DS-accept series significantly modify the level and even the trend of some inequality
series, in comparison with series that use only consistent income concepts. Furthermore, AB
demonstrate for OECD countries, that the use of dummy variables is not enough to render some
definitions comparable. In particular this is the case for net and gross income, as well as for income
based and expenditure based observations.9
DS defend the grouping of net and gross income by assuming that in developing countries,
where there is not enough data to compare both definitions, redistribution is not important and thus,
gross and net income are comparable. Yet, AB stress the inconsistencies this mixing yields for many
OECD countries for which both types of income are available. Compared with the Luxembourg
Income Study (LIS), which adjusts inequality data to make them internationally comparable within the
OECD countries (Atkinson, et al., 1995), AB find that the rankings provided by DS are very
different from those of the LIS. Similarly, for the case of expenditure-based and income-based
observations, DS acknowledge that both concepts are significantly different. To correct for this
problem, they suggest the use of a fixed adjustment to render both concepts comparable. In their
dataset they find that expenditure-based observations are on average 6.6 points below income-based
Gini coefficients. However, this particular value is conditional on the sample they analyze. We use
the same procedure and compare the inequality levels of both concepts only for those countries
where both are available, but in our sample we find that the average difference is three points.10
Therefore, using the fixed value proposed by DS instead of our sample estimate raises the levels of
inequality for the countries in our sample by around 3.6 points on average. We conclude that the estimate of the true difference
can be unreliable and the use of fixed adjustments introduces arbitrary noise in an already
problematic dataset. In short, we agree with AB and conclude that the use of fixed adjustments is
not enough to reconcile both definitions.
8 The rest of the MIDEPLAN observations are not DS-accept because there was no clear reference to
the primary source. However, the WIID considers these data to be reliable, which provides more observations to be used.
9 In the next section, we use a bigger sample and reach similar conclusions.
10 The sample difference is given by the larger compilation provided by the WIID and by the differences in some of
the quality criteria we explained before. In our case, we can directly compare 19 countries that have both income and
expenditure information, of which 58% belong to the OECD.
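The comparison behind these numbers can be sketched as follows. The sketch assumes that the gap is computed as the difference between each country's mean income-based Gini and its mean expenditure-based Gini, averaged over countries that report both concepts; the exact averaging used by DS and in this paper may differ. The function name and the data are hypothetical, and the data were chosen so that the gap equals three points. Only the 6.6 value is taken from the text.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

Obs = Tuple[str, str, float]  # (country, concept, gini)


def mean_income_expenditure_gap(observations: List[Obs]) -> float:
    """Average, over countries reporting both concepts, of the mean
    income-based Gini minus the mean expenditure-based Gini.
    Raises statistics.StatisticsError if no country has both."""
    by_country: Dict[str, Dict[str, List[float]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for country, concept, gini in observations:
        by_country[country][concept].append(gini)
    gaps = [
        mean(c["income"]) - mean(c["expenditure"])
        for c in by_country.values()
        if c["income"] and c["expenditure"]
    ]
    return mean(gaps)


# Invented observations; country C has no expenditure data and is skipped.
data = [
    ("A", "income", 40.0), ("A", "income", 42.0), ("A", "expenditure", 36.0),
    ("B", "income", 30.0), ("B", "expenditure", 29.0),
    ("C", "income", 50.0),
]
sample_gap = mean_income_expenditure_gap(data)

DS_FIXED_ADJUSTMENT = 6.6  # value reported by DS for their own sample
excess = DS_FIXED_ADJUSTMENT - sample_gap  # extra points added by the DS value
```

Because the estimated gap depends on which countries happen to report both concepts, a single fixed adjustment carries the sampling choices of one dataset into another.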
The treatment of different reference units by DS is also problematic. DS collapse the numerous
definitions into two categories: person-based and household-based observations. After this rough
grouping, they compare both definitions and conclude that they are not significantly different.
However, not all the person-based definitions are comparable, nor are all the household-based ones.
This raises additional questions about their comparability assumptions.
Table 1: Chile, Gini series with different definitions, 1968-1996
Income concept: Gross Income | Monetary Gross Inc.
Recipient: Household Person Household
Source: UN Fields SH Mideplan Paukert IADB SH DS accept
Series: (1) (2) (3) (4) (5) (6) (7) (8)
1968 45.64 44.00 45.64
1969
1970
1971 46.00 46.00
1972
1973
1974
1975
1976
1977
1978
1979
1980 53.21 53.21
1981 53.46
1982 56.98
1983 54.49
1984 55.85
1985 54.91
1986 55.69
1987 56.72
1988 54.50
1989 57.88 57.88
1990 54.70 53.18 55.65
1991 55.38
1992 52.19 50.70 52.00 53.08
1993 50.00
1994 55.58 57.42 56.49
1995
1996 56.37 57.24
Note: Observations in boxes represent data with grouped data information
Source: WIID, version 1, Sept. 2000
We use two examples to highlight in detail the main problems involved when grouping
heterogeneous series. In Figure 1 we plot two inequality series for Spain that differ in the concept
measured (gross income versus expenditure). The two series have different levels and no significant
trend. However, DS use two gross income observations to expand the expenditure series and this
alters significantly the inequality conclusions. First, the DS series has an apparent time trend, with a
considerable decrease in inequality from the 1970s to the 1980s. Secondly, the combination of both
concepts significantly changes the inequality levels. In particular, the gross income series for the
1980s yields inequality observations of around 8 points higher than the DS-accept values. Such a
dramatic level variation substantially changes the country's international inequality ranking, as we
show in Section 5.
In Figure 2 we plot two series for Mexico that differ in their reference unit (household and
person), but have the same gross income concept. Both series have distinctive time trends and
significantly different levels in the last years. DS combine both reference units and again, freely
mixing different definitions changes the inequality results. In particular, for the 1980s, the levels of
both series differ by around 8 points, and this introduces an important change to the
apparent inequality pattern. Furthermore, DS report an increase in inequality from 1977 to 1989,
while the consistent gross-household series shows the opposite result. Finally, the DS dataset fails to
indicate any time trend at all for the entire period, while the gross-household series has, at least, a
decreasing trend.
Although such examples are not widespread, they certainly introduce noise in the data that
increases the measurement error and may affect the overall empirical results. While the corrections
proposed by DS may sometimes work, on other occasions they may distort the data further or leave
the discrepancies unaltered. In our last example from Mexico, DS do not recommend any correction
for reference unit differences and the problem shown above persists.
Figure 1: Spain, Gini coefficient series
[Line chart. Series: DS accept, Gross income, Expenditure. Vertical axis: Gini coefficient, 15 to 40. Horizontal axis: 1965 to 1995.]
Figure 2: Mexico, Gini coefficient series
[Line chart. Series: DS accept, Gross-household, Gross-person. Vertical axis: Gini coefficient, 35 to 60. Horizontal axis: 1950 to 1995.]
For the case of Spain, the use of the particular 6.6 adjustment value reduces the inconsistencies, but
in other cases it does not help.
In summary, although the DS dataset was a very important step forward in the study of
inequality data, it has significant limitations that increase the measurement error and may seriously
alter the empirical results of those studies that use this inequality dataset. In particular, though it is
quite reasonable to use the kind of quality control considerations they introduced, the evidence
suggests that grouping different definitions to create single country time series is unreliable and the
use of fixed adjustments does not correct this problem.
3. Controlling for definition comparability and series grouping
Since there are several concepts and reference units, we still would like to have a methodology for
collapsing the various series available for each country. This is necessary to obtain consistent and
comparable series that can be used in cross-country and time series studies. To collapse further the
existing series provided by the WIID, we take three steps. First, we group those series which have
the same definitions (concept and reference unit). The second step is to make a comparability
analysis and judge which definitions and conflicting sources can be mixed and how. The final step is
to construct the national series, using consistent groupings and standard procedures based on these
results.
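The first of these steps, grouping series that share the same concept and reference unit, can be sketched as below. The tuple layout and the source labels S1 to S4 are placeholders chosen for illustration and do not correspond to actual WIID sources.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each series is described by (source, concept, reference_unit).
# The source labels are placeholders, not actual WIID sources.
Series = Tuple[str, str, str]


def group_by_definition(
    series: List[Series],
) -> Dict[Tuple[str, str], List[str]]:
    """Map each (concept, reference unit) pair to the sources sharing it."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for source, concept, unit in series:
        groups[(concept, unit)].append(source)
    return dict(groups)


example = [
    ("S1", "gross income", "household"),
    ("S2", "gross income", "household"),
    ("S3", "gross income", "person"),
    ("S4", "expenditure", "household"),
]
grouped = group_by_definition(example)
```

Here four series collapse into three definition groups. Any group holding more than one source still needs a rule for years in which the sources overlap.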
3.1 Grouping series with the same definitions
It is straightforward to group those series with identical definitions. The main difficulty in this first
step is to deal with different sources. In some cases, we can have a year where two sources report
observations with the same definitions. This is the case for Chile in 1992, shown in series 4 and 6 of
Table 1. If both sources have series that can be analyzed11 we run the same comparability tests as
below. If this is not possible, we choose observations using the following preference ordering:
1. Observations with income share information
2. LIS data
3. DS accept data
4. The source with the longest time coverage.
Since one of our main purposes is to compare the Gini coefficient with alternative indexes, we
need the income share information to construct such indexes. The Luxembourg Income Study (LIS)
is a project that has created a micro-database of social and economic information. It has been used to
explicitly compare cross-country inequality information and thus presents data adjusted for such
purposes. Finally, the last two ordering preferences ensure consistency within the series and with the
considerations previously applied by DS.
11 As explained below, this requires that both series have a common sample of at least three observations in a time
span of five or more years.
In our example for Chile, we have two conflicting sources: MIDEPLAN and IADB. Using the
decision criteria stated before we prefer the MIDEPLAN observation, since it provides income
share information. After doing this first grouping, we have collapsed seven series into three (see
Table 2).
We use the same standard procedure with all the countries to collapse series with the same
definitions. However, this first step is insufficient, since many countries are still left with several
series (e.g., Sweden has up to 14), and further grouping procedures are necessary.
3.2 Comparability analysis
Li et al. (1998) have compared values of the Gini coefficient with different definitions for countries
and years where estimates are available. However, this procedure is biased towards the sample of
countries with available data. A more satisfactory procedure is to compare observations available for
the same country and the same year, as was done by DS.
The existing literature does not offer a consistent and standard comparison procedure. DS limit
themselves to comparing the average difference between different definitions. In the case of income
and expenditure, they try to check the correlation of the differences with some explanatory variables.
There are problems with this approach. For two series to be comparable, and thus interchangeable
when one data observation is missing, we need much more than average differences. We need two series
that have a very similar trend and level. If the level is not comparable, we need the
difference between both series to be relatively constant over time; only in such cases does it
make sense to freely mix different series. In other cases, grouping definitions that are not
comparable can seriously alter the level and/or trend of the series. Adjusting data in this way at best
increases the measurement error and at worst can invalidate the empirical results.
In addition, to clearly single out what we are comparing, the series should differ in only one of
the definitions. For example, if a Gini coefficient series for net income and household seems
comparable with a series of gross income and person, it could be because indeed gross and net are
comparable or because net and gross are different but a combination of income concepts and
reference units produce the similarities.
Table 2: Chile, grouped Gini series, 1968-1996
Income concept:   Gross Income   Gross Income   Monetary Gross Inc.
Recipient:        Household      Person         Household
Source:           3s             3s             SH                    DS accept
Series            (1)            (2)            (3)                   (4)
1968              45.64          44.00          -                     45.64
1969              -              -              -                     -
1970              -              -              -                     -
1971              46.00          -              -                     46.00
1972              -              -              -                     -
1973              -              -              -                     -
1974              -              -              -                     -
1975              -              -              -                     -
1976              -              -              -                     -
1977              -              -              -                     -
1978              -              -              -                     -
1979              -              -              -                     -
1980              -              53.21          -                     53.21
1981              -              53.46          -                     -
1982              -              56.98          -                     -
1983              -              54.49          -                     -
1984              -              55.85          -                     -
1985              -              54.91          -                     -
1986              -              55.69          -                     -
1987              -              56.72          -                     -
1988              -              54.50          -                     -
1989              -              57.88          -                     57.88
1990              54.70          53.18          55.65                 -
1991              -              55.38          -                     -
1992              52.19          50.70          53.08                 -
1993              -              50.00          -                     -
1994              55.58          -              57.42                 56.49
1995              -              -              -                     -
1996              56.37          -              57.24                 -
Note: "3s" refers to three different primary sources
Source: WIID, version 1, Sept. 2000
3.3 Comparability criteria
To analyze different definitions and sources, we use the following procedure:
· Use only Gini data for the same country and the same year when they differ in only one of
the definitions (concept or reference unit) and have a common sample of at least three
observations in a time span of at least five years.
· Estimate the simple correlation between both series. If the correlation is negative we
conclude that the series are not comparable.
· Check if both series are normally distributed and run hypothesis tests for equal mean and
equal variance (i.e. a t-test and an F-test). If the variance is significantly different (at a 5%
significance level) then we conclude that the series are not comparable. If the mean is
significantly different, we test if there is a constant difference between them.12
· When the series are positively correlated and have a similar variance, they move in the same
direction over time. If, however, the means are not equal, we use the average difference between
the series.13 Furthermore, we check for differences of one, three and five points in the means
(values that have been reported as the average difference between series with different
concepts).
· To complement the hypothesis of equal mean and variance, we run OLS regressions on the
equation $S_1 = \beta (S_2 + c) + u$, where $S_i$ represents series $i$ and $u$ the error term. When the
means of $S_1$ and $S_2$ are not the same, $c$ is the average difference of the series; otherwise $c = 0$.
We run a Wald coefficient test to check the null hypothesis that $\beta = 1$. To check how
sensitive the series are to absolute differences in the mean, we also test the null hypothesis
when $c$ is ±1, 3 and 5. Note that in this case the inclusion of $c$ is equivalent to the use of a
fixed adjustment or the use of a dummy variable.
· In summary, we consider two series to be comparable when they have a positive correlation,
a variance that is not significantly different, and we cannot reject the null hypothesis that $\beta = 1$
when $-1 \le c \le 1$. For other values of $c$ we consider the series to be comparable, but with a
constant absolute difference between them. In this last case, we must add $c$ to make the
series compatible.
Through this comparison procedure we attempt to assure that both series have statistically the
same time trend and the same level (or an absolute constant difference). This helps ensure that when
freely mixing two series we do not alter the trend or level of the resulting series.14
3.4 Comparability assumptions
Once we have paired comparable series and followed the previous procedure, we can study which
definitions can be mixed. The results of this analysis provide the basis for establishing the
comparability assumptions we use later, which allow us to consolidate series and reduce the number
of definition combinations available for each country. In total we have 179 comparable pairs for 38
countries: 14 OECD countries with 107 pairs and 24 developing countries with 72
pairs. The results for all the comparable definitions are summarized in Table 3.
12For the few cases where the series fail the test of being normally distributed, we use the ANOVA F-statistic to
test for equality of means and the Levene and the Brown-Forsythe methods to test for the equality of the variances.
Again, if the variance is significantly different we conclude that the series are not comparable.
13We round the values to the closest integer to simplify the procedure. In some cases we need to use half points in
order for the series to be comparable.
14This procedure was also employed to compare some series that differed only in the source.
In the table, the first column shows which variables are being compared. The next two columns
indicate the number of pairs compared and the percentage that belongs to OECD countries. The
next column is very important, since it shows the percentage of series that are not comparable,
either because there is a negative correlation between the series or because they fail the equal
variance test. The columns labeled c ± 1 and c ± 2 show the percentage of comparable series when
the absolute average difference is less than one and two, respectively. The following column shows
the sample average difference and the next column reports the percentage of series that are
comparable when this average difference is applied. The final column indicates the decision
regarding the comparison of definitions.
In Table 3 we show all the 11 possible comparisons. We analyze below the six most relevant
pairs and we use them to illustrate how we reach the final decision regarding the comparability
assumptions. The remaining five couples follow the same procedure and we just mention the final
decision. Three main considerations were taken into account when deciding which series could be
comparable. The first criterion was the percentage of non-comparable couples. A high percentage
indicates that the considered series had different trends and hence, provide a very bad substitute for
a missing series. The second criterion is the percentage of comparable series when no fixed
adjustment is applied and when the average difference is applied. These percentages show how well
series can be mixed with or without a fixed adjustment. Finally, we prefer combinations with large
samples and, in some cases, the percentage of OECD observations is also relevant for the analysis.
Income and Monetary Income. In this case, we have four developing countries and the USA, which
offer seven comparable pairs. Of these, 43% are comparable series with an average difference ($c$)
smaller than ±1 and 71% are in the range $|c| < 2$. Almost one third of the pairs are comparable
when using the average difference of -2. Moreover, all the pairs are positively correlated and have an
equal variance.
This result is theoretically consistent, since monetary income excludes own-produced
consumption and it should report a higher level of inequality. Since all the series are in principle
comparable it seems reasonable to freely mix both definitions. Nonetheless, we are uncertain of
which fixed adjustment (average difference) to apply. If we use the average difference of -2 only
29% of the series are comparable. Another inconvenience is the small sample of only seven
observations. Thus, we are uncertain about this comparison couple.
Table 3: Comparability results for all definitions
                               Sample         Comparability results
Definitions compared           Obs   OECD     Not          c ±1   c ±2   Average   Percentage    DECISION
                                              comparable                 c         comparable
Income vs Monetary Income        7    43%       0%          43%    71%     -2         29%        uncertain
Income vs Expenditure           19    58%      42%          16%    32%      3         26%        no
Gross vs Net                    36    78%      31%          19%    28%      3         39%        no
Person vs HH per capita          8     0%       0%          88%    88%      0         88%        yes
Household vs HH equivalent      36   100%      25%          19%    28%      4         36%        no
Household vs Person/HHpc        23    26%       9%          61%    61%      0         61%        uncertain
Household vs Family              8    75%       0%          50%    63%      2         38%        uncertain
Household vs Family eq.         18   100%       6%          28%    44%      5         17%        no
Household eq. vs Family eq.     16   100%      13%          81%    88%      0         81%        yes
Family vs Person                 2    50%       0%         100%   100%      1        100%        uncertain
HHe/Fe vs Person/HHpc            6    83%      17%          67%    83%      0         67%        uncertain
Notes: The OECD column corresponds to the percentage of observations from these countries.
The Not Comparable column presents the percentage of observations with a negative correlation and/or different variance.
The c±1 column shows the percentage of comparable pairs with a fixed adjustment (c) of +1 or -1.
The next column is the equivalent when the ±2 range is used.
The Average c column reports the mean difference between definitions for all observations.
The following column shows the percentage of comparable pairs when this average c is used.
The last column reports the comparability result for each pair of definitions.
It is important to note that DS did not take into consideration data observations defined for
monetary income. They argue that the consumption of own-produced goods is an important source
of revenue for poorer households and that not taking this kind of consumption into account can skew
the indicator towards more inequality. However, our results suggest that in some cases a fixed
adjustment can render both definitions comparable. In particular, we mix both definitions only in
rich and middle-income countries15, for which one does not expect this kind of consumption to be
important. Therefore, we find it reasonable to include data based on monetary income for these
countries and by doing so we can expand the available number of observations.
15Australia, Brazil, Costa Rica, Hong Kong, Panama, Russia, United States and Venezuela.
On the other hand, the example provided by DS to exclude monetary income is not compelling.
Although later they conclude, as we do, that income-based and expenditure-based observations are not
readily comparable, in their paper they actually compare the monetary income and expenditure observations
for Greece in 1974, as a way to associate income and monetary income. Since for this year the
difference between both series is roughly six points, they conclude that both definitions cannot be mixed.
However, when applying our comparability analysis to Greece, we conclude that both series are
indeed comparable when adding three points to the expenditure series.
Income and Expenditure. In this case 42% of the observation couples are not comparable at all, i.e.,
they have a negative correlation or significantly different variance. Moreover, if we were to use a
fixed adjustment to compare series, only 26% are comparable for the average difference of 3 points.
If we were to use the average difference found in DS of 6.6, then only 11% of the series are
comparable. In other words, in almost nine out of ten cases this particular fixed adjustment
significantly alters the level and/or trend of the series, overwriting information contained in the
original data.
Since expenditure does not take into account income that is saved, we expect it to give less
unequal values. Moreover, expenditure information can take into account income smoothing by
borrowing or lending. Thus, we also expect lower inequality values from expenditure surveys.16
However, we find it highly problematic to freely mix expenditure-based and income-based
inequality observations. Given the very high percentage of non-comparable cases, the most likely
possibility is that both series are providing different inequality information. Therefore, we conclude
that it is not reasonable to mix income and expenditure definitions, not even when using fixed
adjustments.
16In developing countries expenditure surveys are prevalent since many households do not know their actual
income or their knowledge is incomplete. This can be explained by the presence of significant own-produced
consumption and of temporary and/or irregular monetary income sources. Therefore, it is easier to survey their consumption
(expenditure) and this has become a common practice in poor countries.
Gross and Net Income. For this case, almost a third of all the series couples are not comparable.
The average difference is three points, but only 39% of the series are comparable when such a
fixed effect is applied. Although this is not as clear a case as the previous one, we also reject
comparing gross and net income. One argument used in DS to compare both definitions is that in
developing countries the difference should not be big, assuming that their redistribution systems
have a small impact on incomes. However, of the eight comparisons that come from non-OECD
countries, 25% are not comparable, 50% are comparable with $|c| < 1$ and 38% are comparable
when the average difference of three points is applied. Therefore, even though the sample is still small
and all the non-OECD countries involved are middle-income countries, there is not much evidence
that gross and net income are equivalent.
On the other hand, grouping series with net and gross income does make a significant impact in
OECD series. In particular, AB show in detail how damaging the combination of both definitions is
to the information on levels and trends contained in the original series for these rich countries.
Mixing different reference units. Grouping person and household per capita is probably the clearest
case in favor of mixing definitions. In 88% of the cases we can freely mix both series. In addition,
we do not need to adjust for any fixed effects. Although the sample of eight is small, the evidence is
strong. Therefore, we assume that both series are comparable and we evaluate this combined
definition (person-household per capita) with other income recipient definitions.
On the other extreme, comparing household and household equivalent observations does not
seem reasonable. A quarter of the observations are not comparable and when the average difference
of four points is applied only 36% of the series are comparable.
The comparability of household with person/household per capita is an uncertain one. The
non-comparable percentage is relatively small, but only in 61% of the cases are the definitions
comparable. In fact, household and person are the most common reference units, so this particular
comparability assumption is very important; we deal with it later.
The remaining five comparability results regarding reference units are shown in Table 3. In
summary, of the eight couples of reference unit definitions that were tested, only one was clearly
non-comparable. These results suggest that one may mix reference units in many cases, but not in all.
3.5 Grouping the data
Using the results of the previous section we can directly group those series with comparable
definitions and reduce the number of series per country. However, since the three main concepts
(net income, gross income and expenditure) cannot be mixed, we inevitably have more
than one inequality series for some countries. Moreover, we associate each concept with the most
common reference unit and this union creates our three main resulting series: gross income-household,
net income-household and expenditure-person.
The presence of three series may seem inconvenient when conducting empirical research, but
different income concepts may offer different information about inequality behavior and using more
than one concept can increase the available information that we can use. For example, evaluating
both gross and net income inequality measures provides important information concerning the
redistribution policies of some national governments. In addition, trade theory makes direct
predictions about gross, not net factor incomes.
On the other hand, separating series because of different reference units is not very compelling.
Any differences arising from dissimilar reference units are mostly explained by demographic factors.
When the size of the household changes across income classes, one can expect different
inequality results from household and individual information. In addition, the number of adults in
different income classes can produce divergences in the inequality results. However, these
demographic factors vary across countries.
An additional advantage of our comparability analysis is that it offers country-specific
information. For those countries where we conducted the tests, we have an indication of whether
specific national series with different definitions may be comparable or not. This information is used
to group series in that particular country, even if the overall analysis resulted in a verdict of
non-comparable definitions. For instance, in a country where gross and net income are comparable with a
fixed adjustment, we can use this information to expand one or both series.
Finally, we have to take a decision concerning those definitions that yield an uncertain result.
The approach we take is to have two broad types of series. First, we construct "basic" series in
which we are confident of the definition groupings used. Hence, we only mix definitions for which
we have strong evidence that they may be comparable and/or definitions that are comparable for
that particular country.
We then construct "extended" series. These series have more observations, but we use definition
groupings that are less reliable and we combine definitions for which we are uncertain about their
comparability. For example, the most common "basic" series is the data based on gross income and
household definitions, and this series is mostly complemented with observations based on monetary
income and person definitions to create the "extended" series.
In summary, to finally collapse the remaining series, we perform the following standard
procedure:
· Group those reference units for which we are certain about their comparability
· Use the country specific information to further group series. In particular, to expand the
most common series: gross income-household, net income-household and expenditure-
person. This includes grouping series with a fixed adjustment, when the evidence supports
this type of comparability assumption.17
· Take advantage of the LIS data (only available for OECD countries) to adjust the series in
those cases in which they are comparable. E.g., if we have a net income-household series
from both the LIS and another source, and both series are comparable with a fixed
adjustment, then we adjust the series to have the levels reported by the LIS data.
· Finally, we use definitions for which we are uncertain about their comparability to create
the "extended" series.
In brief, for each country we can have one or more basic series relating to gross income, net
income or expenditure and in some cases, extended series, which expand these series by including
uncertain definition comparisons.
Even though we apply the same procedure for each country and in every case we try to be as
objective as possible, we face the same dilemma expressed by DS: "decisions concerning the
inclusion or exclusion of certain observations are always based on some judgment and arbitrariness".
In our case, our results can be replicated using the WIID database.18 This allows other researchers
to review our procedure and make their own changes if necessary.
Using our previous example for Chile, from an initial number of seven series, we can collapse
the inequality information into two "basic" series (see Table 4). Here we used the fact that for Chile
the household and person series are comparable with an adjustment of 1.5 points. This allows us to
expand the gross income-household series from 6 to 15 observations.19
17 When some observations are adjusted to make them consistent with the rest of the series, we have a problem
with the Atkinson indexes (which we calculate later). Generally, the discrepancy between Gini coefficients with different
definitions is not the same as for the Atkinson indexes. To solve this problem we use two standard approaches. A
direct approach is applied when we have three or more comparable Atkinson observations; in that case we directly apply
the average difference. The indirect approach is to estimate the average difference between the Gini and the particular
Atkinson index for both series and then adjust the Atkinson index by the difference between both averages.
18 Our inequality dataset is posted on the web and there we present the country-specific adjustments made and the
GAMS code used later (www.intereconomics.com/francois/data.html).
When compared with the DS accept series the levels are not significantly different, but in this
example, we find different time trends. In section 4 we compare in detail the inequality trends
reflected in our dataset against those present in the DS dataset.
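The Chilean fixed adjustment can be illustrated with a short Python sketch. It is only an illustration: the function name `extend_series` is hypothetical, the rule that observations already present in the base series take priority over adjusted ones is an assumption, and the data are a subset of the household and person values listed in Table 2.

```python
def extend_series(base, other, adjustment):
    """Fill gaps in `base` with observations from `other` shifted by `adjustment`.

    Both series are {year: Gini} dictionaries. ASSUMPTION: where both series
    report a year, the original base observation is kept.
    """
    merged = {year: round(value + adjustment, 2) for year, value in other.items()}
    merged.update(base)
    return dict(sorted(merged.items()))


# Subset of the gross income series for Chile (household and person).
household = {1968: 45.64, 1971: 46.00, 1990: 54.70}
person = {1980: 53.21, 1981: 53.46, 1989: 57.88, 1990: 53.18}

combined = extend_series(household, person, adjustment=1.5)
print(combined[1980], combined[1989], combined[1990])  # 54.71 59.38 54.7
```

The adjusted values for 1980, 1981 and 1989 reproduce the entries of the combined household/person series in Table 4.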
3.5 Characteristics of the three main series
When our grouping methodology is applied to all the countries, we still have several series, especially
in OECD countries. However, the three main series have comprehensive world coverage and can be
readily used for empirical analysis. The main statistics of these series are presented in Table 5.
Although all the series have a smaller sample and coverage than the DS-accept series, we still
have a satisfactory representation. Moreover, the extended gross-household series is fairly
comparable in number of observations and OECD representation to the DS-accept series.
The net-household series seems to be better suited to analyze OECD countries, while the
expenditure-person series consists of a majority of developing countries. The gross income series is
better balanced between rich and poor countries. These sample differences between concepts can
also be observed in the average series length. OECD countries have longer series and this is
reflected in the average number of observations in the net income series. In contrast, developing
countries and the expenditure-person series have shorter series, and in many cases just one or two
observations.
Additionally, we have at least one of the three main series for 145 countries. This is an
improvement with respect to the 115 country coverage of the DS dataset. The share data is also
sparser than the Gini coefficient observations and again, only the net income series has almost the
same number of share data and Gini observations, due to the better quality of the OECD inequality
data.
3.6 Series length and panel data analysis
Another important issue concerns the length of the series we choose to analyze. Given the sparse
amount of inequality information in many countries, once we have created the basic and extended
series, some countries end up with only one or two data points per series. Consequently, when we
analyze inequality trends and when we compare our dataset with DS, we use only countries with at
least one series with three or more observations, in a time span of at least five years. This allows us
to study cross-country inequality trends and use panel data analysis. This set of countries with long
series includes 80 countries and the main characteristics are reported in Table 6. In the Appendix we
present the summary statistics of this subset of countries with longer series.
19 Note that for international comparisons we only use the first series. The second series is only used when
analyzing inequality in Chile.
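The selection rule for the long-series subset can be written as a simple filter. The sketch below is a hedged illustration: the function name and the two example countries are hypothetical, and the time span is assumed to be measured as the last year minus the first year.

```python
def has_long_series(series_by_name, min_obs=3, min_span=5):
    """True if at least one series has `min_obs` or more observations
    spanning `min_span` or more years (ASSUMPTION: span = last - first year)."""
    for observations in series_by_name.values():
        years = sorted(observations)
        if len(years) >= min_obs and years[-1] - years[0] >= min_span:
            return True
    return False


# Hypothetical countries with made-up Gini values.
dataset = {
    "Country A": {"gross-household": {1980: 40.1, 1985: 41.0, 1990: 42.3}},
    "Country B": {"expenditure-person": {1992: 35.0, 1994: 36.2}},
}
long_panel = [name for name, series in dataset.items() if has_long_series(series)]
print(long_panel)  # ['Country A']
```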
Table 4: Chile, final Gini series, 1968-1996
Income concept:   Gross Income        Gross Income
Recipient:        Household/Person    Person
Source:           3s                  3s              DS accept
Series            (1)                 (2)             (3)
1968              45.64               44.00           45.64
1969              -                   -               -
1970              -                   -               -
1971              46.00               -               46.00
1972              -                   -               -
1973              -                   -               -
1974              -                   -               -
1975              -                   -               -
1976              -                   -               -
1977              -                   -               -
1978              -                   -               -
1979              -                   -               -
1980              54.71               53.21           53.21
1981              54.96               53.46           -
1982              58.48               56.98           -
1983              55.99               54.49           -
1984              57.35               55.85           -
1985              56.41               54.91           -
1986              57.19               55.69           -
1987              58.22               56.72           -
1988              56.00               54.50           -
1989              59.38               57.88           57.88
1990              54.70               53.18           -
1991              56.88               55.38           -
1992              52.20               50.70           -
1993              51.50               50.00           -
1994              55.58               -               56.49
1995              -                   -               -
1996              56.37               -               -
Source: WIID, version 1, Sept. 2000
Table 5: Characteristics of the three main inequality series
                   Gross Income      Net Income        Expenditure       Total for the
                   Household         Household         Person            three series      DS-accept
                   Gini     Share    Gini     Share    Gini     Share    Gini     Share
                            data              data              data              data
BASIC series
Countries            49       38       27       25       69       63      145      126       115
Observations        427      326      288      241      189      159      904      726       693
Average obs.       8.71     8.58    10.67     9.64     2.74     2.52     6.23     5.76      6.03
OECD countries      29%      39%      70%      72%       1%       2%      23%      27%       17%
EXTENDED series
Countries            95       70       47       43       85       75      227      188       115
Observations        634      445      433      376      254      205     1321     1026       693
Average obs.       6.67     6.36     9.21     8.74     2.99     2.73     5.82     5.46      6.03
OECD countries      17%      23%      43%      44%       1%       1%      16%      19%       17%
Table 6: Characteristics of the three main inequality series,
countries with three or more observations
                   Gross Income      Net Income        Expenditure       Total for the
                   Household         Household         Person            three series      DS-accept
                   Gini     Share    Gini     Share    Gini     Share    Gini     Share
                            data              data              data              data
BASIC series
Countries            38       34       20       20       22       21       80       75        66
Observations        413      322      279      235      127      106      819      663       634
Average obs.      10.87     9.47    13.95    11.75     5.77     5.05    10.24     8.84      9.61
OECD countries      37%      47%      75%      75%       5%       5%      38%      43%       29%
EXTENDED series
Countries            57       47       32       31       30       28      119      106        66
Observations        580      420      407      354      182      146     1169      920       634
Average obs.      10.18     8.94    12.72    11.42     6.07     5.21     9.82     8.68      9.61
OECD countries      26%      32%      50%      52%      13%      11%      29%      32%       29%
Our full dataset, which includes countries with only one or two observations, can be used to
conduct cross-country analysis for specific years. It also allows for decade or five year averages that
can be used as a panel database. Since there is a majority of expenditure-person series for developing
countries in the full dataset, this can give a better representation than the longer series.
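Period averages of the kind mentioned above can be computed with a few lines of Python. This is a sketch under assumptions: the function name is hypothetical and periods are aligned to calendar multiples of the chosen width (1980-1984, 1985-1989, and so on), which the text does not specify. The input values are the Chilean person series for 1980-1989 from Table 2.

```python
from collections import defaultdict
from statistics import mean


def period_averages(series, width=5):
    """Average a {year: Gini} series over consecutive periods of `width` years.

    ASSUMPTION: periods start at calendar multiples of `width`.
    """
    buckets = defaultdict(list)
    for year, gini in series.items():
        buckets[year - year % width].append(gini)
    return {start: round(mean(values), 2) for start, values in sorted(buckets.items())}


chile_person = {1980: 53.21, 1981: 53.46, 1982: 56.98, 1983: 54.49, 1984: 55.85,
                1985: 54.91, 1986: 55.69, 1987: 56.72, 1988: 54.50, 1989: 57.88}
print(period_averages(chile_person))  # {1980: 54.8, 1985: 55.94}
```

Passing `width=10` yields decade averages with the same function.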
In summary, we have six main series: three basic series, with consistent definition comparability
and three extended series with less reliable comparability assumptions. In constructing these series,
we have used the recommendations made by Atkinson and Brandolini (2001):
"The use of simple dummy variable adjustments for data differences is not appropriate.
Over time, the net and gross income distributions may behave differently, as may the
distributions for households and for families. It is necessary to piece together
information from different sources, informed by an awareness of their relative strengths
and weaknesses. All of this points to the need for a blend of quantitative and qualitative
analysis, and the avoidance of mechanical use of the (secondary) datasets."
4. Alternative inequality indexes
We now turn our attention to a different topic. In this section we estimate Lorenz curves from
grouped income data. This allows us to re-estimate Gini coefficients, compute the Atkinson indexes
and estimate headcount poverty ratios.
Thereafter, we can use these estimates to conduct some tests. First, we compare our Gini
estimates, which are drawn from the Lorenz curve, with the Gini coefficients reported by the
primary sources. Secondly, we test if the inequality information provided by the Gini coefficient is
similar to that offered by the Atkinson indexes. The latter can be considered a test of the
robustness of the Gini coefficient as an inequality indicator.
4.1 Inequality measurement
Most of the inequality observations provided by the secondary datasets are given by Gini
coefficients. Formally, this inequality measure is given by:
$$
G = 1 + \frac{1}{n} - \frac{2}{n^2 \bar{y}} \left( y_1 + 2 y_2 + \dots + n y_n \right) = 1 + \frac{1}{n} - \frac{2}{n^2 \bar{y}} \sum_h h \, y_h \qquad (1)
$$
where $y_h$ is the income of household $h$ and income is arranged so $y_1 \ge y_2 \ge \dots \ge y_n$; $n$ is the number
of households and $\bar{y}$ is the average income. Although this inequality index is widely accepted and
used, there are many other inequality indexes and there is no theoretical prerogative to prefer any.
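Equation (1) translates directly into code. The following Python sketch assumes household-level incomes are available as a list (the paper itself works from grouped data, so this is only an illustration of the formula); the function name `gini` is an assumption.

```python
def gini(incomes):
    """Gini coefficient from equation (1), with incomes ranked in descending order."""
    y = sorted(incomes, reverse=True)          # y_1 >= y_2 >= ... >= y_n
    n = len(y)
    y_bar = sum(y) / n                         # average income
    weighted_sum = sum(h * y_h for h, y_h in enumerate(y, start=1))
    return 1 + 1 / n - 2 * weighted_sum / (n ** 2 * y_bar)


print(gini([1, 1, 1, 1]))  # 0.0 for a perfectly equal distribution
print(gini([0, 1]))        # 0.5 when one of two households holds all income
```

Sorting in descending order matters: the largest income receives the smallest rank weight, so reversing the order would change the sign of the rank term.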
Inequality is associated with the variance of the income distribution and this creates two basic
measurement complications. First, as with any distribution, it is not a single-valued variable. Second,
even when the concepts of Lorenz-dominance and Generalized Lorenz-dominance (Shorrocks, 1983)
are widely accepted as ways to impartially rank two different distributions, in many cases the Lorenz-
curve intersects at least once, and this method yields an incomplete ranking of distributions.
To solve both problems, inequality indexes are used to rank distributions in these indeterminate
cases and to provide a single-valued variable that can be used in empirical models. However, since all
inequality indexes have a specific method to weight and rank incomes from different levels, there is
no objective inequality index and any inequality indicator has built-in social preferences. Moreover,
many inequality measures are implicitly based on a social welfare function.20
In particular, when the Lorenz curves intersect, different indexes can provide different inequality
information and this makes the choice of the index important for the results. For instance, the Gini
coefficient is more sensitive to changes in the middle of the income distribution and it is less
sensitive to movements at the extremes. On the contrary, the family of Atkinson indexes is precisely
more sensitive to changes at the extremes and thus, it is a very convenient complement to the Gini
coefficient.
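Formally, the Atkinson index ($A$) is defined as:
$$
A = \begin{cases}
1 - \left[ \dfrac{1}{n} \sum_h \left( \dfrac{y_h}{\bar{y}} \right)^{1-\varepsilon} \right]^{\frac{1}{1-\varepsilon}} & \text{if } \varepsilon \neq 1 \\[2ex]
1 - \prod_h \left( \dfrac{y_h}{\bar{y}} \right)^{\frac{1}{n}} & \text{if } \varepsilon = 1
\end{cases}
\qquad (2)
$$
where the level of sensitivity is conveniently provided by the inequality aversion parameter ($\varepsilon$),
which defines each Atkinson index.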
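Equation (2) can be sketched in Python as follows. As with the Gini example, this assumes household-level incomes in a list rather than the grouped data used in the paper; the function name `atkinson` and the argument name `eps` (the inequality aversion parameter) are illustrative. Incomes must be strictly positive for aversion values of one or more.

```python
import math


def atkinson(incomes, eps):
    """Atkinson index from equation (2) for inequality aversion parameter `eps`."""
    n = len(incomes)
    y_bar = sum(incomes) / n
    relative = [y_h / y_bar for y_h in incomes]
    if eps == 1:
        # Product of (y_h / y_bar)^(1/n), computed in logs for numerical stability.
        return 1 - math.exp(sum(math.log(r) for r in relative) / n)
    power_mean = sum(r ** (1 - eps) for r in relative) / n
    return 1 - power_mean ** (1 / (1 - eps))


# Two households with incomes 1 and 4: the index rises with the aversion parameter.
for eps in (0.5, 1, 2):
    print(eps, round(atkinson([1, 4], eps), 4))
```

For incomes 1 and 4 the index equals 0.1, 0.2 and 0.36 for aversion parameters of 0.5, 1 and 2, which shows how larger values put more weight on the lower end of the distribution.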
To illustrate the differences between both inequality measures, in Figure 3 we plot the Lorenz
curves for Bulgaria in 1978 and 1996.21 Both curves intersect once and this points to an important
change in the distribution of income. For instance, in 1996 both the lowest and highest deciles
increased their income share22, while intermediate deciles experienced a relative decrease. The Gini
coefficient, however, did not change (26.5). In contrast, the Atkinson index with an inequality
aversion parameter of one, decreased more than a point (from 11.5 to 10.4), reflecting the gain of
the lowest income quintile against medium income households.
Therefore, by using both the Gini coefficient and the Atkinson indexes we can be more certain
about variations in the whole income distribution. If both inequality measures move in the same
direction our conclusions are more robust. If both measures behave differently this is an indication
20Dalton (1920), Kolm (1969) and Atkinson (1970).
21The curves are estimated using the technique we describe below.
22This can be observed by a steeper curve in these population segments.
that the choice of a particular inequality index is important, since the weighting assigned to different
parts of the distribution is relevant. Thus, the information given by both indexes is complementary
and provides a better understanding of how income inequality is behaving.
We use four different values of the inequality aversion parameter ε (0.5, 1, 1.5 and 2) to obtain more information on the inequality
trends. It is known that for values above one, the Atkinson index is very sensitive to abnormally low
incomes (Cowell, 1995). With this in mind, we estimate the four values to have a broader picture of
how ε affects the levels and trends of inequality. The most commonly used values of ε are 0.5 and 1
(Atkinson, et al., 1995; Burniaux, et al., 1998). In the macro literature, the conceptually equivalent risk
aversion parameter is estimated to be less than one.
Figure 3: Bulgaria, Gross-household series, Lorenz curves for two different years
[Figure: Lorenz curves for 1978 and 1996. Horizontal axis: % of accumulated population; vertical axis: % of accumulated income.]
Finally, some studies do not rely entirely on indexes and use share data directly to assess
inequality behavior. Indeed, one can compare the ratio of the first and fifth quintile or the extreme
deciles to obtain inequality information. These ratios provide information on the gap between the
richest and poorest households of the population and can be used to assess inequality dispersion.
However, this method has some important drawbacks. The most relevant is that it does not
consider the distribution within income shares. The lowest quintile income share can remain
unchanged, even if the poorest individuals are worse off. On the other extreme, the highest quintile
share can also remain constant even when the richest individuals are much better off. Such intra-share
changes in inequality are not measured by these kinds of ratios, but are taken into account in an
indicator like the Atkinson index. Moreover, there is no clear indication of which ratios are to be
used and employing the extreme shares does not assure that we are comparing poor and rich
individuals, since many poor people can be represented by middle shares in countries with
widespread poverty. Finally, the ratios completely ignore the behavior of the middle-income
households. It can be the case that the ratio of the extreme quintiles is unchanged while the middle
income shares are diminishing and thus, the distribution of income is being polarized.
For the reasons listed above, we do not use such ratio measurements in this paper and instead
focus on inequality indexes that provide information for the whole income distribution. To obtain
information about poverty and how poor individuals are faring with respect to the rest of society, we
can directly estimate the extent of poverty from the Lorenz curve. This estimation is more useful
than just assuming that a particular share is representative of the poor and provides us with better
information in order to assess poverty.
Additionally, it is also of practical importance to know the actual shape of the Lorenz curve,
which can be directly used to assess and compare inequality. Although this information cannot be
directly employed in econometric models, it provides useful information for country-specific
inequality analysis and greater detail on the actual inequality experience of each country.
Finally, we could use our country results to estimate the world's Lorenz curve. However, this
task has additional limitations (i.e. lack of inequality data in many countries) that require further
assumptions, which exceed the scope of this paper. This estimation has already been done by Sala-i-
Martin (2002a, b) and we do not expect our estimations to alter the results found in these papers.23
4.2 Parametric estimation of the Lorenz Curve from grouped data
To construct inequality measures from grouped income data we must first obtain the Lorenz curve.
There are two approaches to obtain the Lorenz curve from grouped data: simple interpolation and
methods based on parameterized Lorenz curves. As explained by Datt (1998) the second method is
preferred for its relative accuracy. Parametric estimation implies choosing a specific functional form
and then estimating the underlying parameters. After the parameters are obtained, the Lorenz curve
can be easily calculated. Nevertheless, in order to be considered as a legitimate Lorenz curve a
functional form must comply with certain conditions.
23Although our country-specific Lorenz curve may be better estimated, in general the trend of reduced global
inequality driven by high growth rates in China, can hardly be offset by such estimation improvements.
If p is the cumulative proportion of population and L(p) is the cumulative income share of group
p, L(p) is a valid Lorenz curve if and only if:
$$L(p) \geq 0 \quad \text{for } p \in (0,1) \tag{3}$$
$$L(0) = 0 \tag{4}$$
$$L(1) = 1 \tag{5}$$
$$L'(0^{+}) \geq 0 \tag{6}$$
$$L''(p) \geq 0 \quad \text{for } p \in (0,1) \tag{7}$$
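Conditions (3) to (7) can be verified numerically for any candidate curve. The sketch below is a finite-difference check on a grid, not an analytical proof; the function name `is_valid_lorenz`, the grid size and the tolerance are all illustrative assumptions.

```python
def is_valid_lorenz(L, steps=1000, tol=1e-9):
    """Grid-based check of conditions (3)-(7) for a candidate curve L on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    values = [L(p) for p in grid]
    # Conditions (4) and (5): the curve starts at 0 and ends at 1.
    if abs(values[0]) > tol or abs(values[-1] - 1.0) > tol:
        return False
    # Condition (3): cumulative income shares are never negative.
    if any(v < -tol for v in values):
        return False
    # Condition (6): the first increment approximates the slope at the origin.
    slopes = [values[i + 1] - values[i] for i in range(steps)]
    if slopes[0] < -tol:
        return False
    # Condition (7): convexity, i.e. increments never decrease.
    return all(slopes[i + 1] - slopes[i] >= -tol for i in range(steps - 1))

print(is_valid_lorenz(lambda p: p ** 2))    # convex candidate
print(is_valid_lorenz(lambda p: p ** 0.5))  # concave candidate, violates (7)
```

A check of this kind only detects violations at the grid points, so a finer grid is advisable when the curve behaves badly very close to zero.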
There is a large literature concerning Lorenz curve estimation and there are many proposed
functional forms. Some models are better suited for specific distributions and others perform better
on typical distributions. However, given that income distributions can differ widely across countries
and time, no functional form is uniquely preferable. To deal with this fact, we use the most popular
functional forms and for each case, we choose the one that gives the best fit. The functional forms
applied at this stage, and the parameter constraints that assure a valid Lorenz curve, are discussed
below.
The General Quadratic Lorenz curve.24 In this model the Lorenz curve is given by:
$$L_{GQ}(p) = -\frac{1}{2}\left[ bp + e + \left( mp^{2} + np + e^{2} \right)^{1/2} \right] \tag{8}$$
where e = −(a + b + c + 1); m = b² − 4a and n = 2be − 4c. The parameters to be estimated are then: a, b
and c. In order for (8) to represent a valid Lorenz curve we must have: m < 0, e < 0, c ≥ 0 and a + c ≥ 1.
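As a rough illustration of equation (8) and its parameter constraints, the following sketch evaluates the General Quadratic curve. The function names and the parameter values (a = 0.9, b = −1.0, c = 0.2) are hypothetical and chosen only because they satisfy the constraints; they are not estimates from the paper.

```python
import math

def gq_terms(a, b, c):
    """Derived terms e, m and n of the General Quadratic model."""
    e = -(a + b + c + 1.0)
    m = b * b - 4.0 * a
    n = 2.0 * b * e - 4.0 * c
    return e, m, n

def gq_is_valid(a, b, c):
    """Parameter constraints stated in the text for a valid Lorenz curve."""
    e, m, _ = gq_terms(a, b, c)
    return m < 0 and e < 0 and c >= 0 and a + c >= 1

def gq_lorenz(p, a, b, c):
    """Equation (8)."""
    e, m, n = gq_terms(a, b, c)
    return -0.5 * (b * p + e + math.sqrt(m * p * p + n * p + e * e))

a, b, c = 0.9, -1.0, 0.2  # hypothetical values
print(gq_is_valid(a, b, c))
print([round(gq_lorenz(p / 10, a, b, c), 4) for p in range(11)])
```

With e < 0 the curve passes through the origin, and with a + c ≥ 1 it reaches one at p = 1, which is why these constraints are imposed before the fit is accepted.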
Pareto Family of Lorenz Curves. A group of functional forms has been derived from the well-
known classical Pareto Lorenz curve. The main difference between these models is the number of
parameters employed.
24 Villaseñor and Arnold (1989).
· P0: Classical Pareto. This functional form is given by:
$$L_{P0}(p) = 1-(1-p)^{\alpha} \tag{9}$$
A valid Lorenz curve is obtained when: 0 < α ≤ 1.
· P1: Ortega et al. (1991):
$$L_{P1}(p) = p^{a}\left[ 1-(1-p)^{\alpha} \right] \tag{10}$$
where the necessary conditions for a valid Lorenz curve are: 0 < α ≤ 1 and a ≥ 0. If a = 0 then
P1 reduces to P0.
· P2: Rasche et al. (1980). Here we have:
$$L_{P2}(p) = \left[ 1-(1-p)^{\alpha} \right]^{\beta} \tag{11}$$
where the necessary conditions are: 0 < α ≤ 1 and β ≥ 1.
· P3: Sarabia et al. (1999). Combining P1 and P2 they propose:
$$L_{P3}(p) = p^{a}\left[ 1-(1-p)^{\alpha} \right]^{\beta} \tag{12}$$
where 0 < α ≤ 1, a ≥ 0 and β > 0 assure a valid Lorenz curve.
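The nesting of the Pareto family can be made explicit in code. In this sketch the parameter names `alpha`, `a` and `beta` are labels chosen for illustration (the exponent of 1 − p, the exponent of p and the outer exponent, respectively), and the numerical values are hypothetical.

```python
def lorenz_p3(p, alpha, a=0.0, beta=1.0):
    """Three-parameter form; the defaults a=0 and beta=1 give the classical Pareto."""
    return p ** a * (1.0 - (1.0 - p) ** alpha) ** beta

def lorenz_p0(p, alpha):
    return lorenz_p3(p, alpha)

def lorenz_p1(p, alpha, a):
    return lorenz_p3(p, alpha, a=a)

def lorenz_p2(p, alpha, beta):
    return lorenz_p3(p, alpha, beta=beta)

# Hypothetical parameter values: P1 with a = 0 and P2 with beta = 1 collapse to P0.
p, alpha = 0.3, 0.6
print(lorenz_p0(p, alpha), lorenz_p1(p, alpha, 0.0), lorenz_p2(p, alpha, 1.0))
print(lorenz_p3(p, alpha, a=0.5, beta=1.5))
```

Because each model nests the previous one, the extra parameters can only improve the fit, at the cost of a more demanding non-linear estimation.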
Kakwani and Podder (1973) suggest the following functional form to estimate the Lorenz curve:
$$L_{KP}(p) = p^{\kappa} \exp\left( \eta (p-1) \right) \tag{13}$$
A valid Lorenz curve is obtained when 1 < κ < 2 and η > 0.
The Beta model.25 This Lorenz curve is given by:
$$L_{B}(p) = p - \theta p^{\gamma} (1-p)^{\delta} \tag{14}$$
25Kakwani and Podder (1976) and Kakwani (1980).
where θ, γ and δ are the parameters of the model to be estimated and we need for a valid curve that:
θ > 0, 0 < γ ≤ 1 and 0 < δ ≤ 1. However, in many cases LB(p) fails the non-negativity condition (3) even when the
parameters have the right values. This is an important shortcoming of the Beta model, but we
consider it here since it is one of the best performers (Datt, 1998) and the negative values it
produces in the lower tail of the distribution can be easily detected.
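The negative values in the lower tail are indeed easy to detect by scanning the fitted curve on a grid. The sketch below assumes the Beta form L(p) = p − θ p^γ (1 − p)^δ; the function names and the parameter values are hypothetical and were picked so that one set misbehaves and the other does not.

```python
def lorenz_beta(p, theta, gamma, delta):
    """Beta model with parameters theta, gamma and delta."""
    return p - theta * p ** gamma * (1.0 - p) ** delta

def negative_range(theta, gamma, delta, steps=1000):
    """Grid points in (0, 1) at which the Beta curve takes negative values."""
    grid = [i / steps for i in range(1, steps)]
    return [p for p in grid if lorenz_beta(p, theta, gamma, delta) < 0.0]

# Both parameter sets respect theta > 0, 0 < gamma <= 1 and 0 < delta <= 1.
bad = negative_range(0.7, 0.9, 0.5)
good = negative_range(0.5, 1.0, 0.5)
print(len(bad), max(bad) if bad else None)
print(len(good))
```

In the first case the curve dips below zero for the poorest few percent of the population even though every parameter lies in its admissible range, which is the shortcoming described above.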
Sarabia et al. (1999) propose a four-parameter functional form to correct for the Beta model
problem. We refer to this below as the BS model:
$$L_{BS}(p) = p^{a+\beta}\left[ 1-\lambda(1-p)^{\alpha} \right]^{\beta} \tag{15}$$
where 0 ≤ λ ≤ 1, β ≥ 1, 0 < α ≤ 1 and a ≥ 0 assure that LBS(p) is a valid Lorenz curve.
4.3 Estimation and selection of the Lorenz curve model
In total we have seven different functional representations for estimating the Lorenz curve. Some of
these can be linearized to use ordinary least squares estimation, but others cannot. Therefore, we
employ a non-linear estimation program using the General Algebraic Modeling System (GAMS)
software package to test the seven parametric models. We also check if each model complies with
the conditions to be taken as a valid Lorenz curve. When more than one model yields a valid Lorenz
curve we use the standard procedure adopted in the literature and choose the model that yields a
lower sum of squared residuals.
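The selection rule can be illustrated with a small stand-in for the non-linear estimation. The sketch below is not the GAMS program: it replaces the non-linear solver with a coarse grid search restricted to the valid parameter region, uses only three of the seven models, and fits them to hypothetical grouped data. All names and grids are assumptions made for the example.

```python
import itertools
import math

# Hypothetical grouped data: cumulative population and income shares.
P = [0.2, 0.4, 0.6, 0.8]
L_OBS = [0.07, 0.19, 0.35, 0.57]

def frange(start, stop, step):
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n + 1)]

# Each entry: (curve, parameter grids that respect the validity constraints).
MODELS = {
    "P0": (lambda p, al: 1 - (1 - p) ** al,
           [frange(0.01, 1.0, 0.01)]),
    "P2": (lambda p, al, be: (1 - (1 - p) ** al) ** be,
           [frange(0.05, 1.0, 0.05), frange(1.0, 3.0, 0.05)]),
    "KP": (lambda p, ka, et: p ** ka * math.exp(et * (p - 1)),
           [frange(1.01, 1.99, 0.02), frange(0.05, 3.0, 0.05)]),
}

def ssr(curve, params):
    """Sum of squared residuals between fitted and observed cumulative shares."""
    return sum((curve(p, *params) - l) ** 2 for p, l in zip(P, L_OBS))

def fit(curve, grids):
    best = min(itertools.product(*grids), key=lambda th: ssr(curve, th))
    return best, ssr(curve, best)

results = {name: fit(curve, grids) for name, (curve, grids) in MODELS.items()}
chosen = min(results, key=lambda name: results[name][1])
for name, (params, value) in results.items():
    print(name, [round(x, 2) for x in params], round(value, 6))
print("selected:", chosen)
```

Restricting the search to the admissible region means that every candidate is a valid Lorenz curve by construction, so the comparison of squared residuals is made only among valid models.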
In our view, this non-linear estimation of parametric models is an improvement with respect to
existing software. The POVCAL software (Chen et al., 1998) only estimates linearized models of the
General Quadratic and Beta models. In some cases both models fail to provide a valid Lorenz curve
and in addition, this software does not correct for Beta models that generate negative values at the
bottom of the distribution.26 Furthermore, our GAMS-based program calculates the underlying
income distribution associated with the estimated Lorenz curve. Using this information we are able
26 Nevertheless, we use POVCAL to estimate the cumulative income shares L(p) when these are not provided
directly by the source and instead, the grouped data is presented by income classes or the income data is associated with
mean income and/or upper limit values.
to directly estimate the Gini coefficient, the Atkinson index for the four different values of ε, and
poverty ratios. Nonetheless, not all the series present in the WIID database have grouped data
information and thus, we have fewer estimated Gini and Atkinson indexes than the number of Gini
coefficients provided by primary sources. This limits the analysis but provides additional information
not present in the source Gini coefficients.
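Once a Lorenz curve has been fitted, both indexes follow from it directly. The sketch below is a simplified illustration rather than the GAMS program: it splits the population into 100 equal units, recovers each unit's income relative to the mean from the increments of L(p), and uses the trapezoid rule for the Gini coefficient. The curve L(p) = p² is a hypothetical example.

```python
import math

def relative_incomes(L, units=100):
    """Income of each population unit relative to the mean, from increments of L."""
    return [units * (L((i + 1) / units) - L(i / units)) for i in range(units)]

def gini(L, units=100):
    """One minus twice the area under the Lorenz curve (trapezoid rule)."""
    area = sum(0.5 * (L(i / units) + L((i + 1) / units)) / units
               for i in range(units))
    return 1.0 - 2.0 * area

def atkinson(L, epsilon, units=100):
    """Atkinson index computed from the unit incomes implied by L."""
    rel = relative_incomes(L, units)
    if epsilon == 1:
        return 1.0 - math.exp(sum(math.log(r) for r in rel) / units)
    power = 1.0 - epsilon
    return 1.0 - (sum(r ** power for r in rel) / units) ** (1.0 / power)

curve = lambda p: p ** 2  # hypothetical fitted Lorenz curve
print(round(100 * gini(curve), 2))
print([round(100 * atkinson(curve, e), 2) for e in (0.5, 1.0, 1.5, 2.0)])
```

For this example curve the exact Gini coefficient is one third, so the printed value gives a sense of the approximation error introduced by using 100 units.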
4.4 Poverty estimation
It is straightforward to conduct poverty headcount analysis once the entire income distribution is
estimated and this procedure has the advantage of not relying on the strong assumption that the
poor people are well represented by the lowest quintile or decile. To estimate poverty ratios, we use
the official World Bank absolute poverty lines of one and two dollars a day (Ravallion et al., 1991).
The income levels are taken from the PPP-adjusted GDP values of the Penn World Tables version
6.1.27
However, the use of GDP data as an income indicator is problematic. First, inequality data is
drawn from household surveys and there is a substantial discrepancy between the national income
reported from these household surveys and that from national accounts data. The difference is
mainly explained because GDP not only includes private consumption, but also private investment
and government spending. Secondly, the poverty lines were calculated using mean consumption
levels in poor countries and therefore, include only the most basic consumption needs and they do
not take into account public services or investment.
Following these considerations, there are two main approaches to estimate absolute poverty.
The World Bank (Chen and Ravallion 2001, 2004) uses consumption and inequality data both drawn
from household surveys. On the other hand, we follow Sala-i-Martin (2002a, b) and estimate poverty
rates using inequality data from household surveys, but per capita income from national accounts
data. This latter approach allows us a larger series. In addition, it indirectly accounts for governmental
expenditure and other non-private consumption sources of goods and services for the poor.
The two methods can yield significantly different poverty estimates for a given poverty line. Yet,
in recent articles (Chen and Ravallion, 2004; and Ravallion, 2004) it is shown that both methods
produce very similar results when the World Bank method uses the $1/day poverty line and the
other method uses $2/day. Moreover, Ravallion (2001) finds that, with the exception of the
27The poverty lines were reported in 1985 values and the PWT data is in 1996 dollars. Thus, the equivalent annual
income of $1/day is $532 and for $2/day is $1064 (Sala-i-Martin, 2002a, b).
transition economies of Eastern Europe and the Former Soviet Union, growth rates of national
accounts measures are not systematically different from growth rates of household survey measures.
To sum up, using a $2/day poverty line we obtain poverty estimates roughly equivalent to the
$1/day absolute poverty based on consumption and we expect that this equivalence does not change
over time.
Formally, the headcount poverty ratio is defined as the number of individuals with an income
below the poverty line in relation to the total population:
$$PR_z = \frac{\int_0^{z} I(y)\,dy}{\int_0^{\infty} I(y)\,dy} \tag{16}$$
where z is the poverty line and I(y) is the distribution function of income y.
We estimate the poverty ratio using a GAMS program similar to the one used in our previous
section. Nonetheless, since we are now primarily interested in the lower tail of the income
distribution, we select the model that provides a valid Lorenz curve that fits best the lower quintile
of the distribution. Moreover, we use a discrete version of the previous formula. For instance, we
divide the population into a thousand units q and estimate the income of each unit using the formula:28
$$I(q) = GDPpc \cdot IS(q) \cdot 1000 \tag{17}$$
where GDPpc is gross domestic product per capita and IS(q) is the income share of unit q. The poverty
ratios are given by the sum of the number of units with an income below the two poverty lines
($1/day and $2/day), divided by the total population. The total number of poor can easily be
obtained by multiplying the poverty ratio by the total population. With a similar procedure, it is also
straightforward to estimate other poverty indexes, such as the poverty gap and the Foster-Greer-
Thorbecke index.
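The discrete procedure can be sketched as follows. The function name, the example curve L(p) = p² and the GDP per capita of 2000 are hypothetical; the annual poverty lines of 532 and 1064 are the values given in footnote 27.

```python
def headcount_ratio(L, gdp_pc, poverty_line, units=1000):
    """Share of population units whose income from equation (17) is below the line."""
    poor = 0
    for q in range(units):
        share = L((q + 1) / units) - L(q / units)  # IS(q), income share of unit q
        income = gdp_pc * share * units            # equation (17)
        if income < poverty_line:
            poor += 1
    return poor / units

curve = lambda p: p ** 2  # hypothetical fitted Lorenz curve
for line in (532.0, 1064.0):
    print(line, headcount_ratio(curve, 2000.0, line))
```

Multiplying the resulting ratio by total population gives the number of poor, and the same loop can accumulate the income shortfall of each poor unit to obtain a poverty gap.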
28The formulas to integrate the poverty ratios are complicated by the non-linearity of most of the Lorenz curve
models. However, with a thousand units we have a three digit approximation of the real value.
5. Evaluating the data
On the basis of the previous two sections we have constructed a new inequality dataset. Unlike the
DS series, we have several inequality series for each country (gross-household, net-household and
expenditure-person), five inequality measures (Gini coefficient and four Atkinson indexes) and two
types of series based on their reliability (basic and extended).29 To assess the implementation of our
comparability assumptions, we begin by comparing our dataset with that of DS. We also check how
our estimated Gini coefficients fare with respect to the source information provided in the WIID.
Finally, we examine whether the Gini coefficient and the Atkinson index yield similar results.
5.1 Differences between our series and DS
In general, we want to know if inequality levels and time trends are significantly different when
moving from one dataset to the other. Inequality levels are important for cross-country comparisons,
while time trends provide information on specific country inequality behavior and are relevant when
analyzing pooled data.
A first comparison shows that for the 427 observations of the basic gross income-household
series (which is the most similar series to that of DS), there is a coincidence in country and year for
348 observations with respect to the DS-accept series. This represents 81.5% of the total number
of observations for the basic series. Moreover, the correlation between both series is high at
0.97. In the case of the extended gross income-household series the percentage of coincidence is
70% and the correlation is also 0.97. Therefore, even when our series and the DS-accept series
are highly correlated and share many observations for the same year and country, there is also an
important percentage of observations not included or shared by both series. These dissimilarities can
produce significant divergences in the level and trend information provided by both inequality
datasets.
5.2 Level differences
AB find serious level differences when comparing the DS dataset with the LIS information. Since
the LIS was conducted explicitly to render OECD inequality data comparable, it is a reliable
indicator on which to compare the inequality levels for OECD countries. Thus, we follow AB and
29The summary statistics for the Atkinson indexes and our own Gini estimates are given in Table 5, under the
column: share data.
compare our dataset with the LIS information for a single year (1991 or the closest available). The
results are shown in Table 7.
Table 7: OECD country rankings, Gini Gross-household series for one year
Rank  LIS                    Gini    Basic series    Gini    DS-accept       Gini
   1  Finland                29.61   Finland         29.61   Finland         26.11
   2  Netherlands            30.59   Netherlands     30.59   Belgium         26.92
   3  Sweden (1992)          31.11   Sweden          31.11   Canada          27.65
   4  Germany (1983)         31.37   Germany         31.37   Great Britain   27.80
   5  Norway                 31.81   Norway          31.81   Netherlands     29.38
   6  Belgium (1992)         31.95   Belgium         31.95   Germany         31.37
   7  Denmark (1992)         33.20   Denmark         33.20   Sweden          32.44
   8  France (1984)          34.91   France          34.91   Denmark         33.20
   9  Canada                 35.08   Canada          35.08   Norway          33.31
  10  Great Britain (1986)   36.18   Great Britain   36.18   France          34.91
  11  USA                    39.15   USA             39.15   USA             37.94
Note: In the LIS ranking we state the year of the inequality observation when it is not 1991.
For this specific year our observations are exactly the same as the LIS and thus we have the
same country ranking. However, the ranking provided by the DS series is very different. This is
evident in the low inequality reported for Canada and Great Britain, and the higher inequality in
Sweden and Norway. The results are similar when evaluating net income rankings. In Table 8 our
basic series is almost identical to that of LIS, the only difference being the observation for Italy,
which is very low in the LIS series. On the other hand, the DS accept series once more provides a
completely different ranking. For example, Spain has a very low inequality since DS use expenditure
information for this country. Once more, Great Britain and Canada have surprisingly low positions
and Sweden and Norway very high ones.
One can argue that a one-year ranking check is not adequate, since a single uncharacteristic
observation can alter the results. Thus, we also rank the OECD countries by average Gini
coefficients for a five year period: 1983-1987. We choose this period since it provides the most LIS
observations possible. The results for the gross-household series are given in Table 9 and those for
net-household in Table 10.
Table 8: OECD country rankings, Gini Net-household series for one year
Rank  LIS                    Gini    Basic series    Gini    DS-accept       Gini
   1  Finland                26.11   Finland         26.11   Spain           25.91
   2  Belgium (1992)         26.92   Belgium         26.92   Finland         26.11
   3  Italy                  27.12   Norway          28.80   Belgium         26.92
   4  Norway                 28.80   Sweden          29.16   Canada          27.65
   5  Sweden (1992)          29.16   Germany         29.36   Great Britain   27.80
   6  Germany (1983)         29.36   Netherlands     29.38   Netherlands     29.38
   7  Netherlands            29.38   Denmark         29.96   Germany         31.37
   8  Denmark (1992)         29.96   Spain           30.60   Italy           32.19
   9  Spain (1990)           30.60   Canada          31.47   Sweden          32.44
  10  Canada                 31.47   France          31.94   Denmark         33.20
  11  France (1984)          31.94   Italy           32.19   Norway          33.31
  12  Australia (1989)       32.85   Australia       32.85   France          34.91
  13  Great Britain (1986)   33.29   Great Britain   33.29   Australia       37.32
  14  USA                    35.24   USA             35.24   USA             37.94
Note: For Spain the LIS observation refers to family equivalent from an extended series.
In the LIS ranking we state the year if it is not 1991.
Again our basic series for gross income ranks the OECD in a very similar way as the LIS.
Nevertheless, the DS series once more yields an unsuitable ranking. Great Britain is again very highly
positioned and Norway too low. When turning to net income, some differences appear between our
data and the LIS with respect to Finland, Great Britain and Australia. The difference is justified in
these cases by the existence of more observations in our basic series than those in the LIS dataset.
We can draw two conclusions from this example of OECD countries. First, our series are
compatible with the LIS and the few divergences are justified by more observations present in our
series. The second conclusion is that the DS accept series yields some rankings that are very hard to
justify and can only be explained by the inconsistent use of different income concepts.
Table 9: OECD country rankings, Gini Gross-household series, average for 1983-1987
Rank  LIS             Gini    Basic series    Gini    DS-accept       Gini
   1  Belgium         26.22   Belgium         26.22   Belgium         26.22
   2  Finland         30.10   Norway          29.44   Great Britain   27.14
   3  Norway          30.36   Finland         30.30   Netherlands     28.94
   4  Sweden          30.77   Sweden          30.77   Finland         29.34
   5  Germany         31.78   Netherlands     31.69   Sweden          31.30
   6  Netherlands     32.94   Germany         31.78   Norway          31.69
   7  Denmark         33.15   Denmark         33.15   Germany         31.78
   8  Canada          34.28   Canada          34.67   Canada          32.67
   9  France          34.91   France          34.91   Denmark         33.15
  10  New Zealand     35.00   New Zealand     35.48   France          34.91
  11  Great Britain   36.18   Great Britain   36.18   New Zealand     35.48
  12  Australia       36.50   USA             38.93   USA             37.20
  13  USA             39.23   Australia       39.09   Australia       39.09
Note: The New Zealand LIS observation refers to family.
We only have an extended series for Finland.
Table 10: OECD country rankings, Gini Net-household series, average for 1983-1987
Rank  LIS             Gini    Basic series    Gini    DS-accept       Gini
   1  Finland         26.19   Belgium         26.22   Belgium         26.22
   2  Belgium         26.22   Norway          26.87   Great Britain   27.14
   3  Netherlands     28.35   Sweden          28.64   Netherlands     28.94
   4  Norway          28.35   Netherlands     28.94   Finland         29.34
   5  Sweden          28.64   Finland         29.34   Sweden          31.30
   6  Germany         29.36   Germany         29.36   Norway          31.69
   7  Canada          31.21   Canada          29.87   Germany         31.78
   8  Denmark         31.30   Denmark         31.30   Canada          32.67
   9  Australia       31.49   Great Britain   31.84   Denmark         33.15
  10  France          31.94   France          31.94   Italy           33.80
  11  Italy           32.78   Italy           33.80   France          34.91
  12  Great Britain   33.29   USA             34.93   USA             37.20
  13  USA             35.24   Australia       36.24   Australia       39.09
In the case of developing countries, there is no equivalent to the LIS that we can use as a
benchmark to compare datasets. However, the differences in levels between our basic series and the
DS accept series persist. When ranking OECD countries some differences were produced by the
loose interchange of net and gross income data present in the DS series. For developing countries
the source of divergence in levels is produced by mixing expenditure and gross income data. Since
expenditure data is significantly lower than gross income data, this alters the inequality levels
between countries.
The use of fixed adjustments can help to render the DS series comparable to the LIS. If
provisions are made to adjust for income differences some of the divergences shown before
disappear. However, there is still an element of arbitrariness in the procedure. How much should we
adjust the series? By each country's average difference or by the overall average? The decision of which
fixed adjustment to apply affects the outcome and the resulting country rankings. Thus, the
previous results support the idea that mixing different income concepts can lead to misleading
conclusions and is an important limitation for inequality cross-country studies.
5.3 Time trend differences
While the inclusion of fixed adjustments to the series addresses level-differences, it does not help
with time trends. In particular, when the time trend of series with different definitions is not the
same, mixing the series creates a whole new trend. In many cases, the use of a fixed adjustment does
not correct for this problem. Therefore, we proceed to compare the time-trend differences between
our basic and extended series, with the DS accept series. To do so, we regress the Gini coefficient
against time for those series that have five or more observations. We use two equations:30
$$G_i = \beta_1 + \beta_2 t_i + u_i \tag{18}$$
$$G_i = \beta_1 + \beta_2 t_i + \beta_3 t_i^{2} + u_i \tag{19}$$
The first regression tests for any linear time trends and the second for quadratic trends. To do a
valid comparison of our series with the DS-accept series we do not take into consideration
information that was not available to DS (data after 1996 and studies published after this year). We
limit the analysis to the three main series: gross-household, net-household and expenditure-person.
In total we study 44 basic series, 29 extended series and 43 DS-accept series with five or more
observations, where 45% of the series come from OECD countries. To have a statistically
significant trend, we must find that β₂ is significantly different from zero and/or that β₂ and β₃ are
jointly different from zero. In total, 25% of the basic series have a different time trend than the DS-
accept series. The figure is over 66% in the case of the extended series.
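The two trend regressions can be sketched with ordinary least squares on a single country series. Everything in this example is an assumption made for illustration: the Gini series is hypothetical, the helper names are ours, and the F statistic for the joint null of no trend is only computed, not compared with a critical value.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * z for x, z in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def trend_f_stat(years, gini, degree):
    """Fit a polynomial time trend and return coefficients and the F statistic."""
    t = [yr - years[0] for yr in years]  # rescaled time keeps X'X well conditioned
    X = [[ti ** d for d in range(degree + 1)] for ti in t]
    b = ols(X, gini)
    fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
    ssr_u = sum((g - f) ** 2 for g, f in zip(gini, fitted))
    mean = sum(gini) / len(gini)
    ssr_r = sum((g - mean) ** 2 for g in gini)  # restricted model: no trend
    f_stat = ((ssr_r - ssr_u) / degree) / (ssr_u / (len(gini) - degree - 1))
    return b, f_stat

# Hypothetical U-shaped Gini series.
years = [1970, 1975, 1980, 1985, 1990, 1995]
gini = [32.0, 30.0, 29.0, 29.5, 31.0, 33.0]
print(trend_f_stat(years, gini, 1))  # linear trend, equation (18)
print(trend_f_stat(years, gini, 2))  # quadratic trend, equation (19)
```

In a series of this shape the linear regression finds almost no trend while the quadratic term is positive and the joint F statistic is large, which is the pattern behind the U-shape cases discussed in this section.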
30Given the small number of observation per country, we do not pursue a time series analysis.
We can also analyze only those series with ten or more observations. Although the number of
series decreases, the results remain the same. In this case we have 30 basic, 15 extended series and
20 DS-accept series. We find that 23% of the basic series and 73% of the extended series have a
different time trend than the DS-accept series. The OECD proportion of observations remains
close to half (47%). These comparisons show significant time trend differences between both
datasets, which suggests dissimilar conclusions regarding inequality variation.
As an example we present the case of Sweden. There are two main sources: LIS and SAS for
both the net and gross-household series. However, when conducting our comparability analysis both
sources are rendered not comparable and thus, we do not mix them. Instead, we choose the LIS
series, which has fewer observations than the SAS series, but has a larger time span and provides
better cross-country comparisons, as explained earlier. In Figure 4 we plot our two basic series and
the DS-accept series. The first two DS observations are taken from the LIS series but the remaining
are from the SAS series.
Figure 4: Sweden, Gini coefficient, basic series and DS-accept series
[Figure: Gini coefficient (vertical axis, 20 to 40) by year (horizontal axis, 1963 to 1993) for the Net-Household, Gross-Household and DS accept series.]
The first conclusion is that the levels are different. This was already analyzed in the previous
section. More interesting, the DS-accept series fails to show any time trend at all.31 From our basic
31This could be corrected by a fixed adjustment of the net income series. However, DS do not find any significant
difference between net and gross income, and thus, do not apply any adjustment in this case.
series it is clear that there is a trend, in particular a U-shape pattern. This also is corroborated by
significant coefficients in the quadratic regression.
Another compelling example is Canada. In Figure 5 we present two of our basic series and the
DS-accept series. The gross-household series has been adjusted by the LIS data and thus, has a
different level from the DS-accept series until the 1988 observation. After this year the series diverge:
DS take some observations that fail the WIID quality criteria, whereas we use the LIS observation
for 1991. Although the last parts of the two series show different time trends, there is no overall time
trend in either series. However, when using our basic net-household series, we have significant
coefficients for the quadratic form regression, i.e., we spot again the U-shape pattern. This is a clear
example of the limitations of using a single inequality series per country. Moreover, the dummy
variable solution does not work here, since a fixed adjustment does not solve the inconsistency and
the time trend does not change.
Figure 5: Canada, Gini coefficient, basic series and DS-accept series
[Figure: Gini coefficient (vertical axis, 20 to 40) by year (horizontal axis, 1965 to 1995) for the Net-Household, Gross-Household and DS accept series.]
There are many other cases in the data that show different levels and time trends between our series
and the DS-accept. Most of the differences are given by the grouping approach used; not only by the
income concepts and reference units, but also by incompatible sources. On the other hand, there are
some differences that are provided by the quality labels used. In the example of Canada we rejected
some observations used in DS. Sometimes it is the other way around and we include observations
not labeled as high quality by DS, but accepted as reliable data by the WIID.32
Overall, the differences are significant and it is clear that the decision regarding how to group
different definitions is essential when analyzing inequality data. Furthermore, the use of dummy
variables or fixed adjustments does not solve the problem satisfactorily. (In a recent paper,
Deininger and Squire 2002 continue to recommend this practice.) Again, we agree with AB and
conclude that researchers employing inequality data should be careful when using the DS-accept
series. On the other hand, our dataset does comply with the recommendations of AB:
"A secondary dataset should be a consolidation of earlier work, with multiple
observations for the same country and the same date being justified by differences in
source, in definition, or in methods of calculation."
5.4 Compatibility of the source and our estimated Gini coefficients
We turn our attention to our own estimated Lorenz curves. It is a straightforward exercise to obtain
Gini coefficients and Atkinson indexes once the Lorenz curve is estimated.33 With this information
we first test how our estimated Gini coefficients compare to the source information. Afterwards, we
analyze if the Atkinson indexes do effectively convey different or additional information on
inequality than the Gini coefficient alone.
In order to compare our Gini estimates and the source coefficients we conduct the same
comparability tests done above. Again we limit our analysis to the three main series: gross-household,
net-household and expenditure-person. We obtain 87 comparable series, of which 85% are fully
comparable with an average difference (c) of zero; 92% are fully comparable with |c| ≤ 1 and 95%
with |c| ≤ 2. The average difference between series is 0.15 and the average difference weighted
by the number of observations is 0.17. These results show a very good estimation of the Gini
coefficients from our constructed Lorenz curves.
32 A noteworthy example is the data presented by Paukert (1973). Most of his data is not accepted by DS because it
lacks a clear reference to the primary source. However, the WIID accepts all his observations. Since this source provides
information from the 1960s it expands many country series and thus, may alter the time trends for some countries. For
instance, Barro (2000) also uses observations that do not pass this "primary source" quality test.
33 Using the chosen parametric equation we can construct the whole income distribution and the Lorenz curve. In
our case, we use centuples to do so. The resulting inequality indexes do not change significantly if a lower unit is used.
Moreover, of the three non-comparable estimations, two of them can be explained by
inconsistent source information for one particular year. For example, the expenditure-person series
of Estonia has a Gini estimate of 39.47 for 1993 while the source value is 31.52 (see Table 11).
This single observation renders the series incompatible.
Table 11: Estonia, Expenditure-Person series, share data and Gini coefficients
Year   Gini (source)   Gini est.   Quintile 1   Quintile 2   Quintile 3   Quintile 4
1992   35.82           35.79       0.0702       0.1879       0.3490       0.5691
1993   31.52           39.47       0.0624       0.1717       0.3245       0.5386
1995   36.63           36.60       0.0691       0.1835       0.3423       0.5623
Note: The quintile columns report accumulated income shares.
Source: WIID and DS datasets
Yet, a closer inspection of the accumulated share data reveals a source inconsistency. All the
quintile accumulated income shares are higher for 1992 and 1995 than for 1993, which implies lower
inequality in those two years, but the source Gini coefficients are also higher in both years. This
result is contradictory and may reflect a typo in the source data.34 On the other hand, our Gini
estimate is consistent with the share information.
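The inconsistency can be checked directly from the shares in Table 11. The following is a minimal sketch, assuming a simple trapezoidal (piecewise-linear) Lorenz curve rather than the parametric estimation used in this paper; the function name is hypothetical. Linear interpolation understates the Gini, so the levels differ from the estimates in Table 11, but the ordering across years is what matters here.

```python
def gini_from_cumulative_shares(cum_shares):
    """Trapezoidal Gini (in percent) from cumulative income shares of equal-sized groups.

    cum_shares holds the cumulative shares of every group except the last,
    e.g. the four quintile values of Table 11; the end points 0 and 1 are added here.
    This is an illustrative approximation, not the parametric method of the paper.
    """
    points = [0.0] + list(cum_shares) + [1.0]
    width = 1.0 / (len(points) - 1)  # population share of each group
    # Area under the piecewise-linear Lorenz curve.
    area = sum(width * (lo + hi) / 2.0 for lo, hi in zip(points, points[1:]))
    return 100.0 * (1.0 - 2.0 * area)


# Cumulative quintile shares for Estonia, copied from Table 11.
estonia = {
    1992: [0.0702, 0.1879, 0.3490, 0.5691],
    1993: [0.0624, 0.1717, 0.3245, 0.5386],
    1995: [0.0691, 0.1835, 0.3423, 0.5623],
}
gini_by_year = {year: gini_from_cumulative_shares(s) for year, s in estonia.items()}

for year in sorted(gini_by_year):
    print(year, round(gini_by_year[year], 2))
```

Under this approximation the 1993 shares yield the largest Gini of the three years, which agrees with the direction of our estimate and not with the source value.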
In general, our estimates are very close to the source information and we can be assured of the
quality of our estimated Gini coefficients. This also provides confidence in the values of the
Atkinson indexes that also use the constructed Lorenz curves. Li et al. (1998) state that the
estimation methods vary across different sources and therefore, the use of one standard technique
can minimize this problem. Consequently, our main series include our own estimations when there
is income share information, and the source data when there is no way to estimate the Gini
coefficient and the Atkinson indexes. This procedure introduces a distinctive characteristic to our
inequality dataset.35
34From equations (4) and (5) we know that the Lorenz curve for 1992 and 1995 dominates that of 1993 for the
whole middle section of the distribution. Thus, the lower Gini value given by the source could only be justified by
significant differences at the extremes of the distribution in 1993. However, given that the Gini attaches more weight to
the middle of the distribution this possibility seems very unlikely.
35Nevertheless, we retain both sources for the Gini coefficient, and this allows us to test the robustness of our results.
5.5 Comparing Gini- and Atkinson-based indexes
We turn to the inequality results provided by the two measures we have. First, we check the
behavior of the Atkinson index for each value of ε. Afterwards we compare the Atkinson and the
Gini data. Throughout the section, we only use our estimated Gini coefficients. In this way we have
the same sample as the Atkinson index and both indexes are derived from the same estimation
technique.
Atkinson index results
As expected, the level and variance of the Atkinson index increase with ε. The higher the inequality
aversion, the higher the values of the index and the more sensitive it is to changes in the distribution (see
Table 12). The overall values of the net income series are lower than those of the expenditure series
because the former has more OECD countries in its sample and the latter more developing
countries. In fact, from the Gini coefficient values we know that both the level of inequality and its
variance are smaller among OECD countries than among developing countries.
Table 12: Basic series statistics for different indexes
                     Gross-household      Net-household      Expenditure-person
                     Mean   Variance      Mean   Variance      Mean   Variance
Atkinson ε = 0.5    14.27     6.92        8.59     2.64       12.72    10.82
Atkinson ε = 1      25.96    17.31       16.9      8.00       22.63    29.66
Atkinson ε = 1.5    36.9     35.57       25.53    22.68       30.64    50.23
Atkinson ε = 2      46.63    65.71       34.91    69.50       37.27    69.23
Gini coefficient    39.9     10.39       31.38     5.81       38.53    18.96
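The way the index responds to the inequality-aversion parameter can be illustrated with the standard Atkinson formula. This is a minimal sketch: the group incomes are hypothetical and are not taken from our dataset, and the function name is our own.

```python
import math


def atkinson(incomes, epsilon):
    """Atkinson index (in percent) for strictly positive incomes and aversion epsilon >= 0."""
    n = len(incomes)
    mean = sum(incomes) / n
    if epsilon == 1.0:
        # Limit case: the equally distributed equivalent income is the geometric mean.
        ede = math.exp(sum(math.log(y) for y in incomes) / n)
    else:
        ede = (sum(y ** (1.0 - epsilon) for y in incomes) / n) ** (1.0 / (1.0 - epsilon))
    return 100.0 * (1.0 - ede / mean)


# Hypothetical incomes, one value per equal-sized population group.
incomes = [2.0, 4.0, 6.0, 10.0, 28.0]
values = [atkinson(incomes, eps) for eps in (0.5, 1.0, 1.5, 2.0)]

for eps, value in zip((0.5, 1.0, 1.5, 2.0), values):
    print(eps, round(value, 2))
```

For any unequal distribution the index rises with the aversion parameter, because a higher value gives more weight to the lowest incomes.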
Differences between the Atkinson and the Gini
In the following sections we analyze only the two middle Atkinson indexes. Furthermore, we only
consider series with five or more observations. First we explore how the two inequality indexes rank
a sample of countries and then we analyze the time-trend information that both indexes provide.
Level differences. In Table 13 we rank a sample of 13 countries by the three inequality indexes. We use
the average for the whole 1980s decade. In general, the Atkinson ranking is very similar to that given
by the Gini coefficient. The only significant difference is in the ranking of Canada and Bangladesh
for the last index (ε = 1.5). Thus, although the Atkinson index yields lower levels, the
relative position of each country changes little.
Table 13: Rankings based on basic gross-household series, 1980s average
Rank  Gini coefficient  Atkinson ε = 1  Atkinson ε = 1.5
1 Bulgaria 22.66 Bulgaria 8.79 Bulgaria 12.59
2 Germany 31.38 Germany 16.78 Germany 25.62
3 Canada 33.94 Spain 18.84 Spain 27.57
4 Spain 34.27 Canada 19.40 Bangladesh 29.09
5 Japan 34.49 Japan 19.61 Japan 29.28
6 Korea 36.19 Korea 21.20 Korea 30.13
7 Bangladesh 37.52 Bangladesh 21.71 Canada 32.79
8 Australia 38.40 Australia 23.75 Australia 34.64
9 United States 38.45 United States 25.76 Hong Kong 36.17
10 Hong Kong 42.40 Hong Kong 26.71 United States 40.19
11 Bahamas 43.15 Bahamas 32.89 Colombia 50.48
12 Colombia 51.16 Colombia 37.89 Bahamas 50.81
13 Brazil 56.98 Brazil 45.45 Brazil 57.96
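The similarity of the rankings in Table 13 can be summarized with a Spearman rank correlation. The sketch below uses the country orderings from the table for the Gini coefficient and the Atkinson index with aversion parameter 1.5; the helper name is hypothetical, and the shortcut formula applies because there are no ties.

```python
# Country orderings copied from Table 13 (lowest to highest inequality).
gini_order = ["Bulgaria", "Germany", "Canada", "Spain", "Japan", "Korea", "Bangladesh",
              "Australia", "United States", "Hong Kong", "Bahamas", "Colombia", "Brazil"]
atk15_order = ["Bulgaria", "Germany", "Spain", "Bangladesh", "Japan", "Korea", "Canada",
               "Australia", "Hong Kong", "United States", "Colombia", "Bahamas", "Brazil"]


def spearman_from_orders(order_a, order_b):
    """Spearman rank correlation between two orderings of the same items (no ties)."""
    rank_b = {country: i for i, country in enumerate(order_b)}
    n = len(order_a)
    d2 = sum((i - rank_b[country]) ** 2 for i, country in enumerate(order_a))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


rho = spearman_from_orders(gini_order, atk15_order)

# Change in rank position when moving from the Gini to the Atkinson ordering.
shifts = {c: atk15_order.index(c) - gini_order.index(c) for c in gini_order}
largest_movers = sorted(shifts, key=lambda c: abs(shifts[c]), reverse=True)[:2]

print(round(rho, 3), largest_movers)
```

The correlation is above 0.9, and the two countries that move most are Canada and Bangladesh, in line with the discussion of the table.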
However, the Atkinson index provides additional information about income distribution. In Table
14 we rank the 12 countries that have a basic net-household series with more than five observations.
The general ranking does not change much from index to index, but there are some interesting
cases. For example, Sweden has relatively high inequality under the Atkinson indexes and Italy
relatively low inequality. A closer inspection of the share information shows that the difference arises
in the lowest quintile, where Italy has an average income share of 8.2% and Sweden of 7.4%.
Therefore, Italy has a lower Atkinson level when ε = 1.5. Yet the middle quintiles are very similar
and the share of Sweden's highest quintile is around four points lower. This explains Sweden's lower
inequality when ε = 1 or when we use the Gini coefficient.
Table 14: Rankings based on basic net-household series, 1980s average
Rank  Gini coefficient  Atkinson ε = 1  Atkinson ε = 1.5
1 Romania 23.37 Romania 9.04 Romania 13.74
2 Poland 25.06 Norway 9.74 Poland 14.29
3 Norway 26.84 Poland 9.87 Norway 17.20
4 Sweden 27.44 Netherlands 12.50 Netherlands 18.89
5 Netherlands 27.44 Germany 13.26 Germany 19.86
6 Germany 28.58 Sweden 13.67 Italy 20.29
7 Taiwan 29.09 Taiwan 13.82 Taiwan 21.05
8 Italy 30.18 Italy 14.11 Sweden 21.93
9 Finland 30.94 Finland 16.88 Finland 25.24
10 Great Britain 33.76 Great Britain 17.01 Great Britain 25.39
11 United States 34.43 United States 21.67 United States 34.08
12 Mexico 46.45 Mexico 31.42 Mexico 41.79
Time trend differences. To analyze the time trend information provided by both inequality measures,
we again use only the three main series for those countries with five or more observations, and we
estimate equations (16) and (17) once more. We have a sample of 60 sets of observations, 23 of them
representing OECD countries. For ε = 1 the Atkinson index yields a different time trend from that of
the Gini coefficient in 27% of the cases. When ε = 1.5 the difference increases to 31%. If we
restrict the sample to series with ten or more observations the results are similar: 31% for ε = 1 and 28%
for ε = 1.5. Therefore, we can conclude that the Atkinson index gives a different time trend in
roughly a third of all cases. This is a substantial divergence and confirms that changes in different
parts of the income distribution can lead the two indexes to report different inequality
results.
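The comparison of trends can be sketched as follows. This is only an illustration: it assumes that the linear specification is a regression of the index on a time variable, covers the linear trend only, and uses hypothetical data and function names. The critical value is supplied by the user from a t table for the relevant degrees of freedom.

```python
import math


def trend_sign(years, values, t_crit):
    """Return +1, -1 or 0 for a significantly rising, falling or flat OLS linear trend."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values)) / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(years, values))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    if se == 0.0:
        return 0 if slope == 0.0 else (1 if slope > 0 else -1)
    t_stat = slope / se
    if abs(t_stat) < t_crit:
        return 0
    return 1 if t_stat > 0 else -1


# Hypothetical series for one country (not taken from the dataset).
years = [1980, 1981, 1982, 1983, 1984, 1985]
gini = [30.0, 31.0, 32.2, 32.9, 34.1, 35.0]
atkinson_index = [15.0, 14.0, 16.0, 14.5, 15.5, 15.0]

T_CRIT = 2.776  # two-sided 5% critical value for 4 degrees of freedom
gini_trend = trend_sign(years, gini, T_CRIT)
atkinson_trend = trend_sign(years, atkinson_index, T_CRIT)
trends_differ = gini_trend != atkinson_trend
```

Applying such a check to every country series and counting the cases where the two signs differ gives a share of the kind reported above.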
In summary, although there is not much variation in the ranking given by both inequality
indexes, there are important differences when we analyze the time trend information. The Atkinson
index does convey additional information about the extremes of the distribution and as such is, in
our view, a useful resource when analyzing inequality data.
6. International and intertemporal patterns of inequality
As before, in this section we use our three main series and in addition, we use our own Gini
estimates when share data are available. First we summarize the characteristics of the basic and
extended series and we end the section by analyzing how inequality varies across and within
countries.
6.1 Descriptive information
In the Appendix we present the summary statistics for our six series: three basic and three extended.
It is important to remember that the net income series are more representative of OECD countries,
the expenditure series of developing countries, and the gross income series have a balanced sample
of both groups. The disparities in the sampled countries of each series make it difficult to
compare the results when different income concepts are used. Nonetheless, a simple examination of
the tables shows that the gross income series have a higher mean and standard deviation on average.
The standard deviation is very similar for the net income and expenditure series, while the latter have
a higher mean on average.
We also present for each series the results of a simple ANOVA analysis, which shows the
percentage of variation represented by between and within country changes. The results are
homogeneous and for each series between-country variation represents between 80% and 90% of
total variation. This suggests that inequality levels are more important than inequality trends, a
conclusion also reached by Li, et al. (1998). However, in our case the within-country variation is also
significant and thus, we find evidence for the weaker hypothesis they test, i.e., that inter-temporal
shifts in inequality are modest compared with international differences.
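The between and within shares can be obtained from a standard one-way decomposition of the total sum of squares. The following is a minimal sketch under that assumption; the toy data and the function name are hypothetical and do not reproduce the figures in the Appendix.

```python
def between_within_shares(series_by_country):
    """Split the total sum of squares into between-country and within-country shares."""
    all_values = [v for values in series_by_country.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)
    between = 0.0
    within = 0.0
    for values in series_by_country.values():
        country_mean = sum(values) / len(values)
        # Between: distance of the country mean from the grand mean, weighted by observations.
        between += len(values) * (country_mean - grand_mean) ** 2
        # Within: dispersion of the observations around their own country mean.
        within += sum((v - country_mean) ** 2 for v in values)
    total = between + within
    return between / total, within / total


# Toy example with two hypothetical countries.
toy = {"A": [1.0, 3.0], "B": [5.0, 7.0]}
between_share, within_share = between_within_shares(toy)
print(between_share, within_share)
```

In the toy example the country means (2 and 6) are far apart relative to the spread within each country, so most of the variation is attributed to the between component.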
6.2 International patterns
Using again the two time trend equations, we run random-effect regressions on the six series. Here
we want to find if there are any inequality patterns that are common to countries and groups. The
results are presented in Table 15.
Table 15: Random-effect regressions of Gini series
                Linear trend                          Quadratic trend
            β2   std. error  p-value     β2   std. error  p-value     β3   std. error  p-value     obs.  countries
Basic series
G - hh    0.005    0.016      0.75     -0.312   0.101      0.002     0.005   0.002      0.002       412    38
N - hh   -0.013    0.018      0.46     -0.316   0.074      0.000     0.005   0.001      0.000       279    20
E - p    -0.057    0.027      0.03     -0.311   0.128      0.017     0.004   0.002      0.045       123    21
Extended series
G - hh   -0.017    0.015      0.26     -0.430   0.086      0.000     0.006   0.001      0.000       573    52 a/
N - hh   -0.004    0.016      0.80     -0.344   0.093      0.000     0.005   0.001      0.001       376    31 b/
E - p    -0.063    0.023      0.01     -0.252   0.118      0.034     0.003   0.002      0.105 c/    173    28
Notes: a/ Uses the joint series SUN/RUS. The results do not change if the series are separated.
b/ Uses the joint series CSK/CZE and CSK/SVK. The results do not change if the series are separated.
c/ We reject the null hypothesis that both coefficients are zero at a 99% confidence level.
First, there is no linear pattern in the gross and net income series, while the expenditure series has a
significantly decreasing trend. However, all six series do present significant quadratic trends. In
particular, all the series reveal a U-shape pattern with a significantly negative β2 and a positive β3.
These results support the idea of what Atkinson (2003) labeled the "U-turn" pattern. This is
represented by a decrease of inequality after the Second World War and a turning point around the
1980s when inequality began to increase again.36 In particular, for the gross and net series the turning
point is around the late 1970s and early 1980s. Although Atkinson finds this pattern for OECD
countries, our results suggest that it may represent a broader phenomenon.37
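The turning point implied by a quadratic trend follows from setting the derivative of the fitted curve to zero. The sketch below assumes the quadratic specification has the form of a constant plus a linear and a squared time term, and it returns the turning point in units of the time index; converting it to a calendar year requires the origin of the time variable, which is not restated here. The function name is hypothetical.

```python
def turning_point(linear_coef, quadratic_coef):
    """Time index at which linear_coef * t + quadratic_coef * t**2 reaches its extremum."""
    if quadratic_coef == 0:
        raise ValueError("no turning point without a quadratic term")
    return -linear_coef / (2.0 * quadratic_coef)


# Coefficients of the basic gross-household series as reported in Table 15.
t_star = turning_point(-0.312, 0.005)

# The published coefficients are rounded to three decimals, so the result is only indicative.
t_star_low = turning_point(-0.312, 0.0055)
t_star_high = turning_point(-0.312, 0.0045)
print(round(t_star, 1), round(t_star_low, 1), round(t_star_high, 1))
```

Because the quadratic coefficient is small, a change of half a unit in its last reported digit moves the implied turning point by several periods, so values computed from the rounded table entries should be read with care.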
Additionally, the values of the quadratic coefficients produce a different U-pattern for the
extended expenditure series. It has a prominent decrease in inequality and the turning point is in the
mid-1990s. These divergent results suggest that the sample of countries may be behaving differently,
since expenditure series represent mainly developing countries. Therefore, to analyze this point
further we divide the series by OECD and developing countries (non-OECD). The results are
shown in Table 16.
Table 16: Random effects regressions: Gini, OECD and developing countries separated
                      Linear trend                          Quadratic trend
                  β2   std. error  p-value     β2   std. error  p-value     β3   std. error  p-value     obs.  countries
Non-OECD sample
G-hh basic      -0.022   0.027      0.40     -1.012   0.536      0.060     0.015   0.008      0.050 a/    224    22
E-p basic       -0.056   0.027      0.04     -0.316   0.130      0.016     0.004   0.002      0.043       120    20
G-hh extended   -0.025   0.023      0.26     -1.358   0.541      0.013     0.018   0.008      0.021       340    35 b/
E-p extended    -0.061   0.024      0.01     -0.219   0.144      0.132     0.002   0.002      0.231 a/    151    24
OECD sample
G-hh basic       0.030   0.018      0.10     -0.265   0.074      0.000     0.005   0.001      0.000       188    16
E-p basic        0.014   0.018      0.43     -0.267   0.074      0.000     0.005   0.001      0.000       200    15
G-hh extended   -0.008   0.019      0.69     -0.256   0.083      0.002     0.004   0.001      0.002       233    17
E-p extended     0.015   0.018      0.41     -0.261   0.071      0.000     0.004   0.001      0.000       220    18
Notes: a/ We cannot reject the null hypothesis that both coefficients are zero.
b/ Uses the joint series SUN/RUS. The results do not change if the series are separated.
For the case of OECD countries the results are very robust. For the four analyzed series there is
no linear trend but a quadratic U-pattern. On the other hand, the series for developing countries
present different results. The gross-household basic series do not have any significant trend, while
the extended series presents the familiar U-pattern. Moreover, both expenditure series have
36Li, et al. (1998) fail to find any significant time trend. However, they only use the linear approach.
37He explains this inequality behavior by a decrease in governmental intervention and the increase of more liberal
economic policies.
decreasing linear trends and the expenditure basic series has a significant U-pattern. When the
estimated regression curves are plotted38 (Figure 6), we observe that the series for the developing
countries are mainly decreasing in the period. The two series with a significant quadratic trend have
their turning point late in the period: for the gross extended series it is 1987 and for the
expenditure basic series, 1991. This generates a trend that decreases through most of the period and
increases slightly at the end.
Therefore, we can conclude that the OECD countries present a clear U-pattern time trend, with
a turning point around the late 1970s. In addition, for developing countries inequality has been
mainly decreasing in the period, with a slight increase in the 1990s. Thus, although inter-country
inequality is more variable, within-country trends are also significant.
Figure 6: Estimated random-effects regression curves
[Figure: fitted Gini values (vertical axis, 30 to 65) against year (1945 to 1995). Legend entries: OECD: G-hh extended; Non-OECD: G-hh extended; Non-OECD: E-p basic; Non-OECD: E-p extended.]
6.3 Country-specific inter-temporal patterns
Inequality changes over time have important policy implications. For instance, if inequality is stable
over time, then economic growth has a direct impact on poverty reduction. Moreover, the particular
level of inequality determines how much poor households benefit from countrywide growth.
Conversely, significant shifts in inequality can offset the impact of growth on poor households,
signal important socioeconomic changes and strengthen the importance of redistributive policies.
38We only plot the OECD gross-household extended series, since the other series are very similar. The linear trend
of the developing countries expenditure-person extended series is very similar to that of the basic series and thus, it is
not plotted either.
We focus now on country specific inequality trends. Again we use a linear and a quadratic time
trend to test for inter-temporal variations in inequality, but now we use fixed-effects estimations to
capture the individual coefficients for each country. Moreover, as some countries have additional series
with different reference units (e.g. household equivalent), we incorporate these series into the pooling
to obtain more country observations.39
In the Appendix (Tables A-7 and A-8) we present the fixed-effects regressions for the basic and
extended gross-household series. For the basic series 24% had a significant linear trend, 24% a
quadratic trend and 13% both. In total, 61% of the countries had some kind of time trend. For the
extended series the figures increase to 32%, 29% and 26%, bringing the total to 87%. The differences
between both series can be accounted for by the increase in the individual country observations
provided by the extended series.
Using panel data provides more observations per country and thus, a better approximation of
the inequality trend. For each country we perform pooled regressions using all its basic series and
then assess whether there is any significant trend at the 5% significance level. Together with the results of the
basic gross-household series, the results of the panel data estimations for all the series are presented
in the Appendix (Table A-9). In some cases the panel data regressions confirm the results of the
gross-household series, in others they provide other trends or produce a trend that was not present
before. It is important to highlight that 71% of the countries for which we regressed the pooled data
have some kind of significant time trend.
The different results can be a consequence of several factors: the number of series in each
country, the increased number of observations provided by the panel data analysis and the
difference in definitions (income concept and/or recipient unit). Basically, given the data we are
working with, there are many ways to analyze any individual country. The presence of series with
different definitions, the basic series and the extended series and the Atkinson indexes provide a
richer source of information from where to draw inequality conclusions. What seems to be evident
is that many countries present inequality time series with some significant trends and thus, within-
country inequality variations are indeed important.
39Note that we are not mixing different reference units in a same series, but using distinctive series, each with a
different reference unit.
6.4 Poverty results
We estimate poverty ratios for the three basic extended series.40 In general, poverty has been
declining in most of the countries and as expected, there is no absolute poverty in OECD countries.
In many Asian and Latin American countries the ratios have been declining (e.g. China and India)
and have become zero for some (e.g. Indonesia and Thailand). The exception is the African
continent, where the ratios remain high. These results are consistent with Sala-i-Martin
(2002a, b), who used a similar estimation technique. In Figure 7 we show the poverty ratios for the
$2/day poverty line for China, Mexico and Thailand.
These poverty results may seem surprising, especially the lack of poverty in some Southeast
Asian countries. However, one must keep in mind that the poverty lines are analytical constructs
that show minimum living standards in poor countries and do not reflect any relative poverty. For
some countries there may be poverty as defined by national standards, though it disappears when an
international absolute poverty line is used. Furthermore, the interrelation between growth and
inequality is crucial to understand poverty reduction. To clarify this point, we have done some
poverty numerics that show the relationship between income shares, GDP per capita and poverty
ratios.
40Since some of the observations in these series have been adjusted for comparability reasons, we have some share
data that do not correspond with the associated Gini values. Thus, we have some observations that present a lower or
higher inequality than the non-adjusted data. There is no easy way to correct for this problem. However, since most of
the adjustments were performed in OECD countries, the poverty results are largely unaffected, since there is no
absolute poverty in these countries. The three exceptions are the gross-household series for Brazil, Chile and Mexico.
For Brazil we did not use the adjusted data. Chile has higher poverty and Mexico lower poverty for the adjusted data
than expected otherwise.
Figure 7: $2/day poverty ratios for selected countries
[Figure: poverty ratio (vertical axis, 0 to 70) against year (1950 to 1995). Series: China; Mexico; Thailand.]
Using equation (15), for any given income share, we can establish the minimum GDP per capita
needed to cross the poverty line. The income shares are determined by the underlying income
distribution of each data point. For illustration purposes we obtain the average income shares for
the three extended series and estimate the minimum GDP per capita that assures that the poverty
line is crossed.41 Figure 8 shows that the required GDP per capita is higher for the gross-household
series.
41We are implicitly assuming that growth does not change income inequality, although this is a controversial point
(Dollar and Kraay, 2002).
Figure 8: Minimum GDP per capita to cross $2/day poverty line at different poverty ratios
[Figure: minimum GDP per capita (vertical axis, $0 to $7,000) for poverty ratios below 20%, 10%, 5% and 1%. Series: G-hh extended; E-p extended; N-hh extended.]
This is consistent with the fact that the gross-household series has higher inequality than the net and
expenditure series. Moreover, the figure shows the percentage of the population that lives below the poverty line
of $2/day. For example, with a GDP per capita of at least $3000, the poverty ratio is below 20%
when inequality is measured by expenditure or net income. Similarly, with a GDP per capita of
at least $5000 there is less than 1% of absolute poverty. However, these minimum total income
requirements can vary when the country has an extreme income distribution. Additionally, for a
poverty line of $1/day, the minimum GDP per capita is exactly half of the values shown in the
figure.
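The logic of these poverty numerics can be sketched with a generic property of Lorenz curves: the income at population quantile p equals the mean income times the slope of the Lorenz curve at p. The example below is not necessarily the exact form of equation (15). It assumes that mean income can be proxied by GDP per capita, and it uses a hypothetical one-parameter Lorenz curve L(p) = p**k instead of our estimated curves.

```python
def min_mean_income(poverty_line, headcount, lorenz_slope):
    """Smallest mean income at which the person at quantile `headcount` reaches the poverty line.

    lorenz_slope(p) must return the derivative of the Lorenz curve at p.
    """
    return poverty_line / lorenz_slope(headcount)


# Hypothetical Lorenz curve L(p) = p**K, used only for illustration.
K = 2.0


def slope(p):
    return K * p ** (K - 1.0)


line_2_dollars = 2.0 * 365  # annual value of the $2/day line
line_1_dollar = 1.0 * 365   # annual value of the $1/day line

required_2 = {h: min_mean_income(line_2_dollars, h, slope) for h in (0.20, 0.10, 0.05, 0.01)}
required_1 = {h: min_mean_income(line_1_dollar, h, slope) for h in (0.20, 0.10, 0.05, 0.01)}

for h in (0.20, 0.10, 0.05, 0.01):
    print(h, round(required_2[h]), round(required_1[h]))
```

Since the required mean income is proportional to the poverty line for a fixed distribution, halving the line halves the requirement, and lower target poverty ratios demand higher mean income.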
Finally, we present the specific case of Thailand. In Figure 9 we plot the poverty ratio for $2/day,
the GDP per capita levels and the Gini coefficient for the gross-household series. The Gini
coefficient has been relatively stable over the period, with an increase in the late 1980s and
beginning of the 1990s. Nevertheless, there have been high rates of GDP per capita growth in the
same period and this has allowed a sharp decrease in the poverty ratio. For 1996 the poverty ratio is
zero, when the GDP per capita was above $7000. The data is consistent with our poverty numerics:
with a GDP of around $3000 the poverty ratio was above 20% and in 1996 the poverty ratio is zero,
when the GDP per capita was above $6000.
Figure 9: Thailand: poverty ratios, Gini coefficients and GDP per capita
[Figure: years 1950 to 2000 on the horizontal axis; left axis 0 to 70, right axis $0 to $8,000. Series: PR $2 (poverty ratio at $2/day); Gini; GDPpc (GDP per capita).]
7. Conclusions
The empirical study of cross-country inequality benefits from, but is also limited by, the
heterogeneity and vast amount of available data. Some of these limitations can be overcome by
filtering the data with quality criteria. A complementary step is to make comparability assumptions
and group data with different definitions. However, we believe we have convincingly shown that it is
not a good idea (or sound practice) to collapse the whole available information set of a particular
country into a single income inequality time series. We have demonstrated that using different
income concepts in the same series may seriously affect inequality levels and time trends. Likewise,
mixing some recipient units may also significantly alter the series. There are already important
measurement errors implicit in most of the inequality data and freely mixing different concepts and
reference units only adds more noise to the data. Furthermore, using dummy variable adjustments
does not always correct for this problem and in some cases, increases the data distortions. Thus, the
single time series approach followed in the recent literature should not be continued, and the recent
body of literature based on those data should be reassessed.
As an alternative, we propose the use of more than one series per country, where each series is
characterized by a different income concept and/or reference unit. Although an individual country
may have several series, there are three series for which there is considerable world-wide coverage:
gross-household, net-household and expenditure-person. Moreover, we have generated two main sets
of series, based on the reliability of the comparability assumptions followed. The "basic" series uses
only strong comparability assumptions, while the "extended" series allows for less reliable
assumptions, but has longer time series. In sum, this approach yields six main inequality series that
can readily be used in empirical tests and within these series the implicit measurement error has been
reduced.
We have also introduced improvements to existing methods for estimating Lorenz curves from
grouped data. The approach is more extensive than the often-used POVCAL software. The Gini
estimations obtained from the income share data are highly satisfactory and statistically comparable
to the coefficients reported by the primary sources. Furthermore, using the resulting estimated
Lorenz curves one can estimate Atkinson indexes, which are a convenient complement to the
information provided by the Gini coefficient. We find that in roughly a third of the cases both
indexes report different inequality trends and thus, the use of both indexes is advisable in order to
obtain robust conclusions about income inequality.
Finally, we have used our broadly comparable dataset to examine international patterns of
inequality and poverty. A first conclusion is that between-country inequality variation is more
significant than within-country. This suggests that country specific characteristics have a bigger role
in explaining inequality levels than time trends. However, we also find that within-country inequality
is still important and there are significant time trends in our series. Therefore, we reject the "glacial
change" hypothesis (Li, et al., 1998) that inequality does not vary significantly over time. For the
specific case of OECD countries, we clearly detect a U-shape pattern that confirms the "U-turn"
hypothesis of Atkinson (2003). For developing countries the cross-country pattern is less clear, but it
suggests a decrease in inequality for most of the analyzed period, with a slight increase in the 1990s.
Country-specific time trends are diverse and it is difficult to spot precise trends. The choice of
income concept, basic or extended series and the use of pooled data may produce different results.
Nevertheless, this variety of choice emphasizes the richness of our inequality dataset, which is not
limited by a single series and provides wider information from which to draw conclusions. With
respect to poverty, we find a decline in the poverty ratios in most of the countries covered by our
sample. The only (though admittedly quite significant) exception is the poverty experience in the
African continent.
References
Atkinson, A.B. (1970). "On the Measurement of Inequality," Journal of Economic Theory, 2: 244-63.
Atkinson, A.B. (1997). "Bringing Income Distribution in from the Cold", Economic Journal, 107: 297-
321.
Atkinson, A.B. (2003). "Income Inequality in OECD Countries: Data and Explanations." Revised
version of the paper presented at the CESifo conference on "Globalization, Inequality and Well-
Being" in Munich, November 8-9, 2002.
Atkinson, A.B. and F. Bourguignon (2000). "Income Distribution and Economics," in Handbook of
Income Distribution, Vol. 1, edited by A.B. Atkinson and F. Bourguignon. Elsevier, Amsterdam.
Atkinson, A.B. and A. Brandolini (2001). "Promise and Pitfalls in the Use of 'Secondary' Data-Sets:
Income Inequality in OECD Countries," Journal of Economic Literature, 39: 771-99.
Atkinson, A.B., L. Rainwater and T.M. Smeeding (1995). Income Distribution in OECD countries.
Evidence from the Luxembourg Income Study. OECD, Paris.
Barro, R. (2000). "Inequality and Growth in a Panel of Countries," Journal of Economic Growth, 5(1): 5-
32.
Burniaux, J., T. Dang, D. Fore, M. Förster, M. Mira D'Ercole and H. Oxley (1998). "Income
Distribution and Poverty in Selected OECD Countries," OECD, Economics Department,
Working Paper No. 189.
Chen, S., G. Datt and M. Ravallion (1998). "POVCAL: A Program for Calculating Poverty Measures
from Grouped Data," Policy Research Department, World Bank.
Chen, S. and M. Ravallion (2001). "How Did the World's Poorest Fare in the 1990s?" Review of Income
and Wealth, 47: 283-300.
Chen, S. and M. Ravallion (2004). "How Have the World's Poorest Fared Since the Early 1980s?"
mimeo, World Bank.
Cowell, F.A. (1995). Measuring Inequality. 2nd Ed. Harvester Wheatsheaf, Hemel Hempstead
Cowell, F.A. (2000). "Measurement of Inequality," in Handbook of Income Distribution, Vol. 1, edited by
A.B. Atkinson and F. Bourguignon. Elsevier, Amsterdam.
Dalton, H. (1920). "The Measurement of the Inequality of Incomes," Economic Journal, 30(119): 348-361.
Datt, G. (1998). "Computational Tools for Poverty Measurement and Analysis," Food Consumption
and Nutrition Division, International Food Policy Research Institute, FCND Discussion Paper
No. 50.
Deininger, K. and L. Squire (1996). "A New Data Set Measuring Income Inequality," World Bank
Economic Review, 10(3): 565-91
Deininger, K. and L. Squire (2002). "Revisiting Inequality: New Data, New Results," The Egyptian
Center for Economic Studies, Distinguished Lecture Series No. 18.
Dollar, D. and A. Kraay (2002). "Growth Is Good for the Poor," Journal of Economic Growth, 7(3):
195-225.
Kakwani, N. (1980). "On a Class of Poverty Measures," Econometrica, 48(2): 437-46.
Kakwani, N. and N. Podder (1973). "On the Estimation of Lorenz Curves from Grouped
Observations," International Economic Review, 14(2): 278-92.
Kakwani, N. and N. Podder (1976). "Efficient Estimation of the Lorenz Curve and Associated
Inequality Measures from Grouped Observations," Econometrica, 44(1): 137-148.
Kolm, S. (1969). "The Optimal Production of Social Justice," in Public Economics, edited by J.
Margolis and H. Guitton. Macmillan, New York.
Kuznets, S. (1955). "Economic Growth and Income Inequality," American Economic Review, 45:
1-28.
Li, H., L. Squire and H. Zou (1998). "Explaining International and Intertemporal Variations in
Income Inequality," Economic Journal, 108: 26-43.
Ortega, P., G. Martín, A. Fernández, M. Ladoux and A. García. (1991). "A New Functional Form
for Estimating Lorenz Curves," Review of Income and Wealth, 37(4): 447-52.
Paukert, F. (1973). "Income Distribution at Different Levels of Development: A Survey of
Evidence," International Labour Review, 108(2): 97-125.
Rasche, R.H., J. Gaffney, A.Y. Koo and N. Obst (1980). "Functional Forms for Estimating the
Lorenz Curve," Econometrica, 48: 1061-62.
Ravallion, M. (2001). "Growth, Inequality and Poverty: Looking Beyond Averages," World
Development, 29: 1803-15.
Ravallion, M. (2004). "Pessimistic on Poverty?" The Economist, April 10.
Ravallion, M. and S. Chen (1997). "What Can New Survey Data Tell Us about Recent Changes in
Living Standards in Developing and Transitional Economies?" World Bank Economic Review, 11:
357-82.
Ravallion, M., G. Datt and D. van de Walle (1991). "Quantifying Absolute Poverty in the Developing
World," Review of Income and Wealth, 37: 345-361.
Sala-i-Martin, X. (2002a). "The Disturbing 'Rise' in Global Income Inequality," NBER, Working
Paper 8904.
Sala-i-Martin, X. (2002b). "The World Distribution of Income (Estimated from Individual Country
Distributions)," NBER, Working Paper 8933.
Sarabia, J.-M., E. Castillo and D.J. Slottje (1999). "An Ordered Family of Lorenz Curves," Journal of
Econometrics, 91: 43-60.
Shorrocks, A. (1983). "Ranking Income Distributions," Economica, 50: 3-17.
UNU/WIDER-UNDP (2000). World Income Inequality Database, Version 1.0, 12 September 2000.
Villaseñor, J.A. and B.C. Arnold (1989). "Elliptical Lorenz Curves," Journal of Econometrics, 40(2): 327-
38.
APPENDIX
Table A-1: Summary statistics, Gross-Household Basic series
Country Obs. Mean St. dev. Max Min Max-Min Coverage
AUS 12 37.88 3.75 44.11 31.82 12.29 68 ~ 96
BEL 3 28.22 3.13 31.81 26.11 5.71 85 ~ 92
BGD 10 35.20 2.77 38.50 29.00 9.50 63 ~ 86
BGR 29 23.18 3.67 34.41 17.83 16.58 63 ~ 96
BHS 11 44.34 4.71 53.61 38.74 14.88 70 ~ 93
BRA 17 57.85 2.68 65.05 53.46 11.59 60 ~ 87
CAN 18 33.60 1.09 35.04 31.39 3.66 65 ~ 91
CHL 18 55.01 3.55 59.63 46.40 13.23 68 ~ 96
CHN 4 35.20 13.83 55.80 26.60 29.20 53 ~ 75
COL 7 51.69 1.90 54.52 49.24 5.28 70 ~ 92
CRI 5 47.52 3.02 51.40 44.69 6.72 61 ~ 83
DEW 8 32.07 2.16 35.56 29.40 6.16 73 ~ 84
DNK 6 36.46 4.69 41.27 30.98 10.30 78 ~ 92
ESP 5 33.19 3.24 36.30 27.77 8.53 65 ~ 91
FRA 7 43.10 6.08 49.00 34.72 14.28 56 ~ 84
GBR 4 33.87 5.44 40.38 28.40 11.97 69 ~ 95
HKG 8 44.24 3.61 52.00 39.68 12.32 71 ~ 96
IND 4 41.17 5.51 47.75 34.34 13.41 56 ~ 75
JPN 23 36.38 1.90 41.49 33.27 8.22 62 ~ 90
KOR 9 35.94 2.14 39.85 33.98 5.87 65 ~ 88
LKA 8 42.70 4.96 47.95 35.65 12.30 53 ~ 87
MEX 11 52.29 6.42 62.28 42.90 19.38 50 ~ 96
MYS 6 50.63 1.87 52.83 48.30 4.53 67 ~ 84
NLD 10 31.16 1.61 33.37 28.40 4.97 77 ~ 97
NOR 10 31.80 3.20 37.50 27.22 10.29 62 ~ 96
NZL 13 34.12 3.19 40.11 29.23 10.88 73 ~ 97
PAK 10 34.54 1.96 38.65 32.38 6.27 63 ~ 88
PER 4 50.08 3.69 55.00 46.43 8.57 71 ~ 97
PHL 11 47.61 2.64 51.45 43.61 7.83 56 ~ 97
POL 5 30.96 0.90 32.20 30.07 2.13 86 ~ 92
PRI 3 50.30 1.50 51.98 49.12 2.86 69 ~ 89
SGP 7 40.76 1.88 43.23 37.88 5.35 73 ~ 93
SWE 5 31.76 3.03 36.96 29.02 7.93 67 ~ 92
THA 11 46.88 3.57 53.53 42.90 10.63 62 ~ 96
TTO 4 46.04 4.22 51.64 41.49 10.15 58 ~ 81
TUR 3 50.81 6.12 56.26 44.20 12.06 68 ~ 87
TWN 31 31.15 1.71 34.60 28.82 5.78 64 ~ 97
USA 53 38.21 1.79 42.72 35.34 7.38 44 ~ 97
Average 10.87 40.21 3.50 45.52 35.81 9.71
Overall 413 38.94 9.39 65.05 17.83 47.22 44 ~ 97
Between-country variation: 89%; countries: 38
Within-country variation: 11%; OECD countries: 14 (37%)
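The between/within split reported under Table A-1 can be approximated from the table's own columns. The sketch below assumes a standard sum-of-squares decomposition and assumes that "St. dev." is the sample standard deviation (n - 1 denominator); this section of the paper states neither, so both are illustrative assumptions. The names `TABLE_A1`, `parse_rows` and `variation_shares` are hypothetical helpers, and the data are the Obs., Mean and St. dev. columns of Table A-1.

```python
# Country code, observations, mean Gini, standard deviation (from Table A-1).
TABLE_A1 = """
AUS 12 37.88 3.75
BEL 3 28.22 3.13
BGD 10 35.20 2.77
BGR 29 23.18 3.67
BHS 11 44.34 4.71
BRA 17 57.85 2.68
CAN 18 33.60 1.09
CHL 18 55.01 3.55
CHN 4 35.20 13.83
COL 7 51.69 1.90
CRI 5 47.52 3.02
DEW 8 32.07 2.16
DNK 6 36.46 4.69
ESP 5 33.19 3.24
FRA 7 43.10 6.08
GBR 4 33.87 5.44
HKG 8 44.24 3.61
IND 4 41.17 5.51
JPN 23 36.38 1.90
KOR 9 35.94 2.14
LKA 8 42.70 4.96
MEX 11 52.29 6.42
MYS 6 50.63 1.87
NLD 10 31.16 1.61
NOR 10 31.80 3.20
NZL 13 34.12 3.19
PAK 10 34.54 1.96
PER 4 50.08 3.69
PHL 11 47.61 2.64
POL 5 30.96 0.90
PRI 3 50.30 1.50
SGP 7 40.76 1.88
SWE 5 31.76 3.03
THA 11 46.88 3.57
TTO 4 46.04 4.22
TUR 3 50.81 6.12
TWN 31 31.15 1.71
USA 53 38.21 1.79
"""


def parse_rows(text):
    """Turn the whitespace-separated rows into (code, n, mean, sd) tuples."""
    rows = []
    for line in text.strip().splitlines():
        code, n, mean, sd = line.split()
        rows.append((code, int(n), float(mean), float(sd)))
    return rows


def variation_shares(rows):
    """Return (between_share, within_share) of the total sum of squares.

    Assumption: sd is the sample standard deviation, so the within-country
    sum of squares for one country is (n - 1) * sd**2.
    """
    total_n = sum(n for _, n, _, _ in rows)
    grand_mean = sum(n * mean for _, n, mean, _ in rows) / total_n
    within_ss = sum((n - 1) * sd ** 2 for _, n, _, sd in rows)
    between_ss = sum(n * (mean - grand_mean) ** 2 for _, n, mean, _ in rows)
    total_ss = within_ss + between_ss
    return between_ss / total_ss, within_ss / total_ss


rows = parse_rows(TABLE_A1)
between, within = variation_shares(rows)
print(f"N = {sum(r[1] for r in rows)}, between = {between:.1%}, within = {within:.1%}")
```

Under these assumptions the shares land close to the split reported in the table. Small differences are to be expected, because the inputs are rounded to two decimals.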
Table A-2: Summary statistics, Net-Household Basic series
Country Obs. Mean St. dev. Max Min Max-Min Coverage
AUS 8 37.23 6.01 44.00 31.04 12.96 68 ~ 96
BEL 4 27.08 0.96 28.39 26.11 2.29 79 ~ 92
CAN 12 30.03 1.97 34.30 26.60 7.70 71 ~ 94
DEW 7 29.98 2.35 33.56 27.40 6.16 63 ~ 83
DNK 3 30.79 1.01 31.48 29.63 1.85 76 ~ 92
FIN 10 30.43 2.35 33.93 26.37 7.55 77 ~ 91
FRA 3 30.42 2.23 31.85 27.85 4.00 79 ~ 84
GBR 32 32.79 2.57 38.38 26.23 12.15 61 ~ 95
IRL 3 38.70 1.28 39.86 37.32 2.54 73 ~ 87
ITA 25 35.19 4.58 42.00 28.78 13.22 48 ~ 95
MEX 13 50.94 5.95 58.06 40.90 17.16 50 ~ 96
NLD 16 29.17 2.14 32.40 24.66 7.74 75 ~ 94
NOR 15 28.39 2.75 34.50 24.22 10.29 62 ~ 96
POL 16 26.97 4.40 34.19 18.85 15.34 76 ~ 97
POR 4 37.09 2.51 40.36 34.25 6.11 73 ~ 91
ROM 9 26.76 3.21 31.26 22.88 8.38 89 ~ 97
SVK 11 20.96 3.91 30.60 17.73 12.87 58 ~ 97
SWE 5 28.73 2.41 32.61 26.44 6.17 67 ~ 92
TWN 30 30.11 1.66 33.60 27.82 5.78 64 ~ 97
USA 53 34.22 1.78 38.72 31.34 7.38 44 ~ 97
Average 13.95 31.80 2.80 36.20 27.82 8.38
Overall 279 32.17 6.25 58.06 17.73 40.33 44 ~ 97
Between-country variation: 77%; countries: 20
Within-country variation: 23%; OECD countries: 15 (75%)
Table A-3: Summary statistics, Expenditure-Person Basic series
Country Obs. Mean St. dev. Max Min Max-Min Coverage
BGD 3 30.22 2.98 33.64 28.23 5.41 89 ~ 96
CIV 5 38.67 1.97 41.20 36.64 4.56 85 ~ 95
ESP 3 33.41 1.37 34.90 32.18 2.71 74 ~ 91
EST 5 36.72 1.82 39.47 34.57 4.90 92 ~ 98
GHA 5 34.64 1.63 36.73 32.73 4.00 88 ~ 97
GIN 3 42.54 3.73 46.84 40.36 6.48 91 ~ 95
HUN 3 21.86 4.41 26.96 19.24 7.71 93 ~ 97
IDN 13 34.37 1.72 37.71 31.68 6.03 64 ~ 96
IND 33 32.56 2.17 37.48 29.10 8.38 51 ~ 97
IRN 5 43.23 1.41 45.45 41.88 3.57 69 ~ 84
JAM 9 41.07 2.96 45.58 36.47 9.11 71 ~ 96
JOR 4 39.38 3.80 44.21 36.33 7.87 80 ~ 97
LKA 3 34.46 4.32 38.80 30.15 8.65 87 ~ 95
MRC 3 39.27 0.24 39.53 39.09 0.44 84 ~ 99
NGA 5 43.71 5.07 50.60 36.93 13.67 86 ~ 97
PER 3 43.70 1.22 45.11 43.00 2.11 86 ~ 94
POL 3 30.10 2.75 32.66 27.20 5.47 92 ~ 96
PHL 4 42.66 2.46 46.06 40.68 5.38 85 ~ 97
THA 3 30.10 2.75 32.66 27.20 5.47 89 ~ 98
TUN 3 43.77 2.45 46.20 41.31 4.90 65 ~ 90
TZA 5 42.61 1.45 44.00 40.15 3.85 69 ~ 93
ZMB 4 45.04 9.66 59.01 38.15 20.86 91 ~ 96
Average 5.77 37.46 2.83 41.13 34.69 6.43
Overall 127 36.97 6.00 59.01 19.24 39.77 51 ~ 99
Between-country variation: 79%; countries: 22
Within-country variation: 21%; OECD countries: 1 (5%)
Table A-4: Summary statistics, Gross-Household Extended series
Country Obs. Mean St. dev. Max Min Max-Min Coverage
AUS 15 39.14 4.22 44.22 31.82 12.40 68 ~ 98
BEL 3 28.22 3.13 31.81 26.11 5.71 85 ~ 92
BGD 10 35.20 2.77 38.50 29.00 9.50 63 ~ 86
BGR 29 23.18 3.67 34.41 17.83 16.58 63 ~ 96
BHS 11 44.34 4.71 53.61 38.74 14.88 70 ~ 93
BRA 21 58.02 2.46 65.05 53.46 11.59 60 ~ 96
BRB 3 47.76 0.79 48.27 46.85 1.42 51 ~ 79
CAN 18 33.60 1.09 35.04 31.39 3.66 65 ~ 91
CHL 18 55.01 3.55 59.63 46.40 13.23 68 ~ 96
CHN 16 32.42 7.12 55.80 24.36 31.44 53 ~ 92
COL 11 53.95 5.20 64.53 47.83 16.70 64 ~ 94
CRI 12 47.17 2.95 53.54 43.90 9.64 61 ~ 95
DEW 8 32.07 2.16 35.56 29.40 6.16 73 ~ 84
DNK 15 33.71 3.96 41.27 28.29 12.98 63 ~ 95
DOM 4 47.07 3.55 51.00 43.29 7.71 76 ~ 92
ECU 3 47.00 7.94 53.00 38.00 15.00 68 ~ 94
ESP 5 33.19 3.24 36.30 27.77 8.53 65 ~ 91
EST 6 31.97 7.23 37.75 21.00 16.75 88 ~ 96
FIN 3 35.61 10.17 47.35 29.47 17.88 62 ~ 98
FRA 7 42.98 6.97 52.09 34.72 17.36 56 ~ 84
GBR 15 30.22 3.63 40.38 27.20 13.18 64 ~ 95
GTM 4 56.10 5.18 59.56 48.40 11.16 79 ~ 89
HKG 9 44.74 3.69 52.00 39.68 12.32 65 ~ 96
HND 7 55.04 3.98 61.88 50.00 11.88 68 ~ 93
IND 5 39.54 6.01 47.75 33.00 14.75 56 ~ 75
JPN 23 36.38 1.90 41.49 33.27 8.22 62 ~ 90
KOR 9 35.94 2.14 39.85 33.98 5.87 65 ~ 88
LKA 8 42.70 4.96 47.95 35.65 12.30 53 ~ 87
MEX 11 52.29 6.42 62.28 42.90 19.38 50 ~ 96
MYS 7 50.33 1.89 52.83 48.30 4.53 67 ~ 89
NGA 3 40.77 8.87 51.00 35.18 15.82 59 ~ 82
NLD 11 32.41 4.41 44.89 28.40 16.49 62 ~ 91
NOR 11 32.25 3.37 37.50 27.22 10.29 63 ~ 91
NZL 13 34.12 3.19 40.11 29.23 10.88 73 ~ 97
PAK 11 34.77 2.00 38.65 32.38 6.27 63 ~ 88
PAN 7 54.54 4.50 58.92 47.46 11.46 70 ~ 97
PER 6 52.77 6.25 63.95 46.43 17.52 61 ~ 97
PHL 11 47.61 2.64 51.45 43.61 7.83 56 ~ 97
POL 5 30.96 0.90 32.20 30.07 2.13 86 ~ 92
PRI 4 49.15 2.61 51.98 45.68 6.30 63 ~ 89
ROM 4 28.52 1.94 31.20 27.10 4.10 89 ~ 94
RUS 7 32.16 6.58 40.01 25.90 14.11 88 ~ 98
SUN/RUS 10 30.27 6.21 40.01 24.52 15.49 80 ~ 98
SGP 7 40.76 1.88 43.23 37.88 5.35 73 ~ 93
SLV 4 51.10 2.29 53.00 48.40 4.60 65 ~ 95
SUN 4 26.00 1.24 27.54 24.52 3.02 80 ~ 89
SVK 5 22.60 1.28 24.50 21.50 3.00 89 ~ 93
SWE 6 33.14 4.34 40.06 29.02 11.04 63 ~ 92
THA 11 46.88 3.57 53.53 42.90 10.63 62 ~ 96
TTO 5 45.80 3.69 51.64 41.49 10.15 57 ~ 81
TUR 3 50.81 6.12 56.26 44.20 12.06 68 ~ 87
TWN 31 31.15 1.71 34.60 28.82 5.78 64 ~ 97
UKR 8 27.96 5.08 34.43 21.82 12.61 80 ~ 97
USA 53 38.21 1.79 42.72 35.34 7.38 44 ~ 97
VEN 12 43.88 3.26 49.63 37.68 11.95 62 ~ 97
YUF 9 33.40 1.82 37.68 31.84 5.84 63 ~ 90
ZAF 3 61.34 2.09 63.00 59.00 4.00 90 ~ 95
Average 10.09 40.34 3.86 46.02 35.25 10.77
Overall 580 38.97 9.96 65.05 17.83 47.22 44 ~ 98
Between-country variation: 87%; countries: 57
Within-country variation: 13%; OECD countries: 15 (26%)
Table A-5: Summary statistics, Net-Household Extended series
Country Obs. Mean St. dev. Max Min Max-Min Coverage
AUS 8 37.23 6.01 44.00 31.04 12.96 68 ~ 96
BEL 4 27.08 0.96 28.39 26.11 2.29 79 ~ 92
BGR 6 34.10 2.34 37.10 30.98 6.13 92 ~ 97
CAN 12 30.03 1.97 34.30 26.60 7.70 71 ~ 94
CHN 4 35.33 6.54 43.00 28.40 14.60 78 ~ 95
CSK 10 21.73 2.46 26.99 18.49 8.51 58 ~ 88
CZE 11 22.03 3.15 27.93 18.84 9.09 58 ~ 97
CSK/CZE 19 21.89 2.85 27.93 18.49 9.45 58 ~ 97
DEW 7 29.98 2.35 33.56 27.40 6.16 63 ~ 83
DNK 6 32.07 1.59 33.79 29.63 4.16 76 ~ 95
EST 7 37.94 2.31 41.02 33.80 7.21 92 ~ 98
FIN 10 30.43 2.35 33.93 26.37 7.55 77 ~ 91
FRA 3 30.42 2.23 31.85 27.85 4.00 79 ~ 84
GBR 32 32.79 2.57 38.38 26.23 12.15 61 ~ 95
GRC 3 35.89 4.82 41.30 32.06 9.24 74 ~ 88
HUN 14 23.31 1.52 25.79 20.36 5.43 62 ~ 98
IRL 3 38.70 1.28 39.86 37.32 2.54 73 ~ 87
ITA 25 35.19 4.58 42.00 28.78 13.22 48 ~ 95
MEX 13 50.94 5.95 58.06 40.90 17.16 50 ~ 96
NLD 16 29.17 2.14 32.40 24.66 7.74 75 ~ 94
NOR 15 28.39 2.75 34.50 24.22 10.29 62 ~ 96
POL 16 26.97 4.40 34.19 18.85 15.34 76 ~ 97
POR 4 37.09 2.51 40.36 34.25 6.11 73 ~ 91
ROM 9 26.76 3.21 31.26 22.88 8.38 89 ~ 97
SVK 11 20.96 3.91 30.60 17.73 12.87 58 ~ 97
CSK/SVK 19 20.71 2.15 24.81 17.73 7.08 58 ~ 97
SWE 12 29.41 1.92 32.70 26.44 6.26 67 ~ 96
TWN 30 30.11 1.66 33.60 27.82 5.78 64 ~ 97
UKR 8 27.46 5.08 33.93 21.32 12.61 80 ~ 97
USA 53 34.22 1.78 38.72 31.34 7.38 44 ~ 97
YUG 8 33.48 6.54 45.57 27.32 18.24 90 ~ 97
YUG/YUF 9 32.21 7.22 45.57 22.00 23.57 78 ~ 97
Average 12.72 30.75 3.22 35.86 26.44 9.41
Overall 407 30.41 6.94 58.06 17.73 40.33 44 ~ 98
Table A-6: Summary statistics, Expenditure-Person Extended series
Country Obs. Mean St. dev. Max Min Max-Min Coverage
BGD 6 33.46 4.37 39.19 28.23 10.96 73 ~ 96
BGR 6 25.10 2.34 28.10 21.98 6.13 92 ~ 97
CAN 6 22.10 1.15 23.60 20.70 2.90 78 ~ 92
CIV 5 38.67 1.97 41.20 36.64 4.56 85 ~ 95
DEW 3 23.25 0.41 23.68 22.88 0.80 73 ~ 83
EGY 5 36.18 5.52 42.00 28.94 13.06 59 ~ 95
ESP 10 25.34 1.41 26.98 22.59 4.39 74 ~ 96
EST 5 36.72 1.82 39.47 34.57 4.90 92 ~ 98
GHA 5 34.64 1.63 36.73 32.73 4.00 88 ~ 97
GIN 3 42.54 3.73 46.84 40.36 6.48 91 ~ 95
GRC 3 34.60 1.15 35.35 33.28 2.07 74 ~ 88
HUN 3 21.86 4.41 26.96 19.24 7.71 93 ~ 97
IDN 13 34.37 1.72 37.71 31.68 6.03 64 ~ 96
IND 33 32.56 2.17 37.48 29.10 8.38 51 ~ 97
IRN 5 43.23 1.41 45.45 41.88 3.57 69 ~ 84
JAM 9 41.07 2.96 45.58 36.47 9.11 71 ~ 96
JOR 4 39.38 3.80 44.21 36.33 7.87 80 ~ 97
LKA 6 32.34 3.92 38.80 27.38 11.42 63 ~ 95
MRC 3 39.27 0.24 39.53 39.09 0.44 84 ~ 99
NGA 5 43.71 5.07 50.60 36.93 13.67 86 ~ 97
PAK 10 31.46 0.82 32.43 29.89 2.55 69 ~ 96
PER 3 43.70 1.22 45.11 43.00 2.11 86 ~ 94
PHL 4 42.66 2.46 46.06 40.68 5.38 85 ~ 97
POL 4 28.82 3.41 32.66 24.96 7.70 86 ~ 96
SGP 4 37.55 2.99 40.95 33.70 7.25 78 ~ 93
THA 4 42.97 2.57 46.20 40.56 5.65 89 ~ 98
TUN 5 42.61 1.45 44.00 40.15 3.85 65 ~ 90
TZA 4 45.04 9.66 59.01 38.15 20.86 69 ~ 93
UGA 3 37.67 4.14 40.87 33.00 7.87 89 ~ 93
ZMB 3 46.54 3.06 49.75 43.65 6.10 91 ~ 96
Average 6.07 35.98 2.77 39.55 32.96 6.59
Overall 182 34.78 6.89 59.01 19.24 39.77 51 ~ 99
Between-country variation: 85%; countries: 30
Within-country variation: 15%; OECD countries: 4 (13%)
Table A-7: Fixed-effects regressions on Gini Gross-Household Basic series
Linear trend (columns 2-4) Quadratic trend (columns 5-9)
Country Coef.1 Coef.2 p-value Coef.1 Coef.2 p-value Coef.3 p-value
AUS 23.97 0.356 0.000 21.89 0.470 0.518 -0.001 0.874
BEL -9.80 0.839 0.120 302.82 -12.933 0.548 0.151 0.522
BGD 30.44 0.152 0.180 56.32 -1.598 0.118 0.028 0.086
BGR 16.56 0.180 0.002 33.69 -0.817 0.042 0.014 0.013
BHS 58.22 -0.347 0.001 34.06 0.946 0.462 -0.017 0.314
BRA 57.75 0.003 0.976 53.40 0.283 0.580 -0.004 0.580
CAN 31.45 0.060 0.474 41.29 -0.533 0.465 0.009 0.414
CHL 43.58 0.269 0.002 -6.58 2.928 0.000 -0.034 0.000
CHN 65.73 -1.357 0.000 95.00 -4.761 0.000 0.083 0.000
COL 57.06 -0.144 0.196 16.03 2.147 0.219 -0.030 0.189
CRI 51.11 -0.115 0.462 58.71 -0.693 0.567 0.010 0.631
DEW 29.16 0.082 0.398 34.00 -0.202 0.698 0.004 0.580
DNK 58.77 -0.560 0.010 239.21 -9.306 0.051 0.104 0.066
ESP 28.53 0.138 0.314 -9.09 2.406 0.015 -0.032 0.021
FRA 58.77 -0.577 0.000 49.46 0.198 0.756 -0.014 0.218
GBR 16.28 0.463 0.001 17.21 0.411 0.761 0.001 0.969
HKG 35.70 0.220 0.057 90.89 -2.641 0.018 0.036 0.010
IND 33.01 0.363 0.066 -6.17 4.195 0.001 -0.085 0.001
JPN 38.45 -0.066 0.355 39.37 -0.126 0.791 0.001 0.899
KOR 34.70 0.038 0.740 5.96 1.878 0.079 -0.028 0.084
LKA 43.96 -0.043 0.644 63.58 -1.731 0.000 0.031 0.000
MEX 61.03 -0.258 0.000 51.09 0.575 0.015 -0.013 0.000
MYS 51.30 -0.021 0.914 -5.49 3.587 0.082 -0.056 0.080 a/
NLD 25.38 0.128 0.332 1.40 1.225 0.486 -0.012 0.532
NOR 38.90 -0.193 0.022 43.66 -0.482 0.338 0.004 0.561
NZL 18.86 0.379 0.001 3.63 1.136 0.286 -0.009 0.475
PAK 37.65 -0.099 0.299 46.45 -0.670 0.493 0.009 0.557
PER 61.09 -0.243 0.065 124.15 -3.663 0.062 0.043 0.081 a/
PHL 48.63 -0.031 0.580 50.18 -0.146 0.647 0.002 0.714
POL 28.64 0.050 0.932 -365.52 17.268 0.488 -0.188 0.489
PRI 54.22 -0.109 0.564 75.98 -1.383 0.497 0.018 0.530
SGP 37.59 0.078 0.616 44.82 -0.292 0.874 0.005 0.840
SWE 39.99 -0.220 0.106 74.55 -2.234 0.052 0.028 0.078 a/
THA 39.67 0.196 0.009 33.23 0.587 0.286 -0.005 0.474
TTO 48.93 -0.101 0.516 12.34 3.103 0.004 -0.062 0.003
TUR 71.20 -0.618 0.001 89.36 -1.715 0.559 0.016 0.708
TWN 30.19 0.025 0.635 49.33 -1.037 0.007 0.014 0.006
USA 36.84 0.049 0.041 40.29 -0.307 0.000 0.006 0.000
R-square 0.93 R-square 0.96
F-statistic 128.4 F-statistic 87.5
Note: a/ We reject the null hypothesis that both coefficients are zero at a 95% confidence level.
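As a rough illustration of how trend coefficients of this kind can be obtained, the sketch below fits a separate intercept and polynomial time trend for every country in one pooled least-squares regression. It is not the authors' code: the exact specification, the definition of the time index and the computation of the p-values are not given in this section, so the function `country_trends`, the country codes "AAA" and "BBB" and the data are all hypothetical.

```python
import numpy as np


def country_trends(countries, t, gini, degree=1):
    """Pooled OLS with a country-specific intercept and polynomial trend.

    Returns {country: array([constant, coef on t, coef on t^2, ...])}.
    degree=1 gives a linear trend, degree=2 a quadratic trend.
    """
    countries = np.asarray(countries)
    t = np.asarray(t, dtype=float)
    gini = np.asarray(gini, dtype=float)
    names = sorted(set(countries.tolist()))
    k = degree + 1
    # Block design matrix: columns [1, t, ..., t^degree] for each country,
    # filled only on that country's rows and zero elsewhere.
    X = np.zeros((len(gini), len(names) * k))
    for j, name in enumerate(names):
        mask = countries == name
        for p in range(k):
            X[mask, j * k + p] = t[mask] ** p
    beta, *_ = np.linalg.lstsq(X, gini, rcond=None)
    return {name: beta[j * k:(j + 1) * k] for j, name in enumerate(names)}


# Hypothetical data: AAA follows an exact linear trend, BBB an exact U shape.
t_obs = np.arange(1, 7)
countries = ["AAA"] * 6 + ["BBB"] * 6
t = np.concatenate([t_obs, t_obs])
gini = np.concatenate([30 + 0.5 * t_obs, 50 - 0.2 * t_obs + 0.01 * t_obs ** 2])

linear = country_trends(countries, t, gini, degree=1)
quadratic = country_trends(countries, t, gini, degree=2)
print(linear["AAA"], quadratic["BBB"])
```

Because every country has its own block of regressors, the point estimates coincide with those from separate country-by-country regressions. Only the standard errors, which are not computed here, depend on pooling.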
Table A-8: Fixed-effects regressions on Gini Gross-Household Extended series
Linear trend (columns 2-5) Quadratic trend (columns 6-12)
Country Coef.1 Coef.2 SE p-value Coef.1 Coef.2 SE p-value Coef.3 SE p-value
AUS 23.30 0.381 0.084 0.000 22.72 0.411 0.576 0.477 0.000 0.007 0.959
BEL -9.80 0.839 0.613 0.172 302.82 -12.933 21.865 0.555 0.151 0.240 0.529
BGD 30.44 0.152 0.129 0.239 56.32 -1.598 1.037 0.124 0.028 0.016 0.091
BGR 16.56 0.180 0.064 0.005 33.69 -0.817 0.408 0.046 0.014 0.006 0.014
BHS 58.22 -0.347 0.115 0.003 34.06 0.946 1.308 0.470 -0.017 0.017 0.322
BRA 56.87 0.029 0.077 0.710 57.52 -0.009 0.375 0.980 0.001 0.005 0.918
CAN 31.45 0.060 0.096 0.530 41.29 -0.533 0.741 0.472 0.009 0.011 0.422
CHL 43.58 0.269 0.099 0.007 -6.58 2.928 0.672 0.000 -0.034 0.008 0.000
CHN 42.93 -0.275 0.073 0.000 82.95 -3.378 0.312 0.000 0.050 0.005 0.000
COL 56.07 -0.056 0.087 0.517 122.65 -3.918 0.757 0.000 0.052 0.010 0.000
CRI 53.49 -0.165 0.088 0.061 55.89 -0.311 0.481 0.519 0.002 0.007 0.760
DEW 29.16 0.082 0.110 0.458 34.00 -0.202 0.529 0.703 0.004 0.007 0.587
DNK 44.44 -0.253 0.097 0.010 11.97 1.597 0.512 0.002 -0.025 0.007 0.000
DOM 32.21 0.352 0.251 0.162 94.29 -2.766 3.530 0.434 0.038 0.043 0.377
ECU 24.65 0.532 0.146 0.000 147.15 -6.789 9.820 0.490 0.097 0.130 0.456
ESP 28.53 0.138 0.156 0.377 -9.09 2.406 1.003 0.017 -0.032 0.014 0.023
EST -64.50 1.942 0.442 0.000 -478.19 18.907 15.869 0.234 -0.173 0.162 0.286
FIN 45.03 -0.281 0.068 0.000 83.28 -2.527 0.420 0.000 0.030 0.006 0.000
FRA 60.48 -0.621 0.126 0.000 49.34 0.307 0.646 0.635 -0.017 0.012 0.147
GBR 17.76 0.419 0.099 0.000 26.33 -0.100 0.571 0.861 0.007 0.008 0.359
HKG 42.31 0.066 0.109 0.545 86.00 -2.399 0.669 0.000 0.033 0.009 0.000
HND 71.37 -0.384 0.143 0.008 87.04 -1.326 2.404 0.582 0.013 0.033 0.695
IND 29.03 0.505 0.195 0.010 -10.87 4.523 1.158 0.000 -0.091 0.026 0.001
JPN 38.45 -0.066 0.082 0.417 39.37 -0.126 0.484 0.794 0.001 0.007 0.901
KOR 34.70 0.038 0.129 0.770 5.96 1.878 1.083 0.084 -0.028 0.016 0.089
LKA 43.96 -0.043 0.107 0.685 63.58 -1.731 0.405 0.000 0.031 0.007 0.000
MEX 61.03 -0.258 0.061 0.000 51.09 0.575 0.239 0.016 -0.013 0.004 0.000
MYS 53.10 -0.082 0.160 0.609 21.58 1.801 1.320 0.173 -0.027 0.019 0.153
NGA 61.83 -0.679 0.166 0.000 106.89 -4.661 8.039 0.562 0.073 0.147 0.621
NLD 43.75 -0.265 0.095 0.005 74.78 -2.073 0.459 0.000 0.024 0.006 0.000
NOR 39.76 -0.213 0.085 0.013 45.20 -0.557 0.466 0.233 0.005 0.007 0.456
NZL 18.86 0.379 0.130 0.004 3.63 1.136 1.083 0.295 -0.009 0.013 0.483
PAK 38.24 -0.114 0.102 0.264 48.40 -0.785 0.915 0.392 0.010 0.014 0.462
PAN 49.02 0.129 0.127 0.310 116.42 -3.341 1.046 0.002 0.042 0.013 0.001
PER 68.50 -0.398 0.096 0.000 87.21 -1.586 0.577 0.006 0.016 0.008 0.039
PHL 48.63 -0.031 0.063 0.627 50.18 -0.146 0.325 0.653 0.002 0.005 0.719
POL 28.64 0.050 0.661 0.940 -365.52 17.268 25.311 0.496 -0.188 0.276 0.497
PRI 46.40 0.086 0.154 0.577 29.63 1.200 1.092 0.273 -0.017 0.016 0.305
ROM 52.21 -0.488 0.844 0.563 1160.15 -46.240 37.861 0.223 0.472 0.390 0.228
SGP -0.31 0.612 0.174 0.001 100.66 -3.791 2.321 0.103 0.047 0.025 0.058
SLV 37.59 0.078 0.178 0.659 44.82 -0.292 1.866 0.876 0.005 0.023 0.843
SVK 14.44 0.170 0.962 0.860 -889.92 37.884 60.082 0.529 -0.393 0.626 0.531
SWE 44.23 -0.321 0.120 0.008 71.53 -2.073 0.802 0.010 0.026 0.012 0.029
THA 39.67 0.196 0.085 0.021 33.23 0.587 0.559 0.294 -0.005 0.008 0.482
TTO 46.70 -0.035 0.142 0.804 14.78 2.940 0.985 0.003 -0.059 0.019 0.003
TUR 71.20 -0.618 0.218 0.005 89.36 -1.715 2.981 0.565 0.016 0.042 0.713
TWN 30.19 0.025 0.059 0.677 49.33 -1.037 0.391 0.008 0.014 0.005 0.007
UKR 45.79 -0.388 0.229 0.092 273.83 -10.567 2.579 0.000 0.112 0.028 0.000
USA 36.84 0.049 0.027 0.072 40.29 -0.307 0.087 0.000 0.006 0.002 0.000
VEN 38.15 0.147 0.090 0.103 59.10 -1.046 0.466 0.025 0.016 0.006 0.010
YUF 41.62 -0.207 0.132 0.118 45.51 -0.467 0.902 0.605 0.004 0.014 0.772
R-square 0.92 R-square 0.96
F-statistic 106.8 F-statistic 92.96
Note: a/ We reject the null hypothesis that both coefficients are zero at a 95% confidence level.
Table A-9: Time trends, Gross-Household Basic series and panel data of all definitions
Gross-Household Fixed-effects Gross-Household Fixed-effects
Linear Quad. Linear Quad. Series Linear Quad. Linear Quad. Series
AUS 0.36 0.30 5 JPN U 1
BEL n.l. KOR 1
BGD 3 LKA U 3
BGR 0.18 U 0.21 2 MEX -0.26 inv U -0.19 3
BHS -0.35 -0.30 1 MYS inv U inv U 1
BRA 0.11 2 NGA n.a. n.a. 1.19 1
CAN 4 NLD 0.17 4
CHL 0.27 inv U 0.28 inv U 2 NOR -0.19 -0.22 4
CHN -1.36 U U 3 NZL 0.38 0.37 1
CIV n.a. n.a. 1 PAK 2
COL 2 PAN n.a. n.a. 1
CRI -0.18 2 PER U U 2
CSK n.a. n.a. -0.19 1 PHL 3
CZE n.a. n.a. U 1 POL 0.40 U 3
DEW U 7 PRI n.l.
DNK -0.56 inv U 6 ROM n.a. n.a. 1.86 1
EGY n.a. n.a. -0.42 1 RUS n.a. n.a. 1
ESP inv U -0.15 4 SGP 2
EST n.a. n.a. 3 SVK n.a. n.a. -0.24 U 1
FIN n.a. n.a. -0.18 U 3 SWE U U 4
FRA -0.58 -0.49 U 3 THA 0.20 0.19 3
GBR 0.46 0.23 U 4 TUN n.a. n.a. 1
GHA n.a. n.a. -0.41 1 TTO inv U n.l.
HKG U 0.40 1 TUR -0.62 n.l.
HND n.a. n.a. 1 TWN U U 2
HUN n.a. n.a. U 2 UKR n.a. n.a. U 2
IND inv U -0.68 2 USA 0.05 U 0.50 U 4
IRN n.a. n.a. 1 VEN n.a. n.a. 1
ITA n.a. n.a. -0.21 2 YUF n.a. n.a. 1
JAM n.a. n.a. inv U 1 YUG n.a. n.a. 1
Notes: "n.l." marks countries that have no series with more than five observations.
"n.a." marks countries without a Gross-Household Basic series.
"U" marks a significant quadratic trend in which coefficient 2 is negative and coefficient 3 is positive.
"inv U" marks a significant quadratic trend in which coefficient 2 is positive and coefficient 3 is negative.
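The labelling rule in these notes can be written as a small function. The sketch below is an interpretation rather than the authors' procedure: the 5% threshold and the `joint_reject` override (meant to mirror the "a/" flag in Table A-7) are assumptions inferred from the tables, and `trend_shape` is a hypothetical name. The example inputs are the quadratic-trend coefficients and p-values reported for CHN, CHL, KOR and MYS in Table A-7.

```python
def trend_shape(coef2, p2, coef3, p3, alpha=0.05, joint_reject=False):
    """Classify a quadratic trend as "U", "inv U" or "" (no label).

    coef2, p2: coefficient on the linear term and its p-value.
    coef3, p3: coefficient on the quadratic term and its p-value.
    alpha: assumed significance threshold (not stated in the notes).
    joint_reject: assumed override for rows flagged "a/", where the joint
        null that both coefficients are zero is rejected.
    """
    significant = (p2 < alpha and p3 < alpha) or joint_reject
    if not significant:
        return ""
    if coef2 < 0 and coef3 > 0:
        return "U"
    if coef2 > 0 and coef3 < 0:
        return "inv U"
    return ""


# Quadratic-trend columns of Table A-7.
chn = trend_shape(-4.761, 0.000, 0.083, 0.000)
chl = trend_shape(2.928, 0.000, -0.034, 0.000)
kor = trend_shape(1.878, 0.079, -0.028, 0.084)
mys = trend_shape(3.587, 0.082, -0.056, 0.080, joint_reject=True)
print(chn, chl, repr(kor), mys)
```

With these assumptions the four examples reproduce the Gross-Household labels shown for the same countries in Table A-9.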