THE WORLD BANK ECONOMIC REVIEW
Volume 19, 2005. Published by Oxford University Press for the International Bank for Reconstruction and Development / THE WORLD BANK. Editor: Jaime de Melo, University of Geneva.
Top Indian Incomes, 1922–2000

Abhijit Banerjee and Thomas Piketty

This article presents data on the evolution of top incomes and wages for 1922–2000 in India using individual tax return data. The data show that the shares of the top 0.01 percent, 0.1 percent, and 1 percent in total income shrank substantially from the 1950s to the early to mid-1980s but then rose again, so that today these shares are only slightly below what they were in the 1920s and 1930s. This U-shaped pattern is broadly consistent with the evolution of economic policy in India: the period from the 1950s to the early to mid-1980s was one of "socialist" policies, whereas the subsequent period, starting with the rise of Rajiv Gandhi, saw a gradual shift toward more probusiness policies. Although the initial share of the top income group was small, the fact that the rich were getting richer had a nontrivial impact on the overall income distribution.
Although the impact is not large enough to fully explain the gap observed during the 1990s between average consumption growth shown in National Sample Survey–based data and the national accounts–based data, it is sufficiently large to explain a nonnegligible part of it (20–40 percent).

Abhijit Banerjee is professor of economics in the Department of Economics at the Massachusetts Institute of Technology; his email address is banerjee@mit.edu. Thomas Piketty is directeur d'études at École des Hautes Études en Sciences Sociales, Campus Paris-Jourdan; his email address is piketty@ens.fr. The authors are grateful to Tony Atkinson, Amaresh Bagchi, Gaurav Datt, Govinda Rao, Martin Ravallion, T. N. Srinivasan, Suresh Tendulkar, and two anonymous referees for useful discussions; to Sarah Voitchovsky for excellent research assistance; and to the MacArthur Foundation for financial support. The complete series is available online in the working paper version of this article (Banerjee and Piketty 2004; www.cepr.org/pubs/new-dps/dp_papers.htm).

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 1, pp. 1–20. doi:10.1093/wber/lhi001. © The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.

This article presents data series on top incomes and wages in India during 1922–2000 based on individual tax return data. It uses tabulations of tax returns published annually by the Indian tax administration to compute the shares of the top 1 percent of the distribution of total income and the top 0.5 percent, 0.1 percent, and 0.01 percent. It does the same for the wage distribution. The analysis does not go below the top 1 percent because incomes below this level are largely exempt from taxation in India.

The series begins in 1922, when the income tax was created in India, and thus enables examination of the impact of the Great Depression and World War II on inequality. Of particular interest is the period starting in the 1950s, at the beginning of India's experiment with socialism. This experiment was officially suspended in 1991 with the beginning of the liberalization process, which continued through the 1990s. One explicit goal of the socialist program was to limit the economic power of the elite in the context of a mixed economy. The tax data offer an opportunity to look at the extent to which this program, with its well-known deficiencies, succeeded in its distributional objectives. This is an important part of the assessment of this period, and it offers a window into the broader question of the role of policy in affecting the distribution of income and wealth in a developing economy.

With much of the economic activity in developing economies outside the formal sector, it is not obvious that there is a lot that policy, especially tax policy, can affect. Yet the results are consistent with an important role for policy in shaping the distribution of income. In particular, there is evidence of a substantial decline in the share of the elite during the years of socialist planning and a comparable recovery in the postliberalization era. However, the rebound seems to start significantly before the official move toward liberalization.

Given that these results are likely to be controversial, it is worth emphasizing that there are several obvious problems with using tax data, not least because of tax evasion. These are discussed at some length. Although the results appear to be robust, they are not intended to be definitive but rather to provide a point of departure on an important question about which very little is known, primarily because of data limitations.
There are good reasons to suspect that the usual sources of information on income distribution in India, such as consumer expenditure surveys, are not particularly effective at picking up the very rich. This is in part because the rich are such a small share of the population and in part because they are much more likely to refuse to cooperate with the time-consuming process of responding to a consumer expenditure survey.¹

1. See, for example, Szekely and Hilgert (1999), who look at a large number of Latin American household surveys and find that the 10 highest incomes reported in surveys are often not much larger than the salary of an average manager in the given country at the time of survey. For a systematic comparison of survey and national accounts aggregates in developing countries, see Ravallion (2001).

Although there is no hard evidence that the rich are indeed being undercounted in India (the Indian consumer expenditure surveys do not, for example, report refusal rates by potential income category), one reason to suspect that this is the case comes from what has been called the Indian growth paradox of the 1990s. According to the standard household expenditure survey conducted by the National Sample Survey (NSS), real per capita growth in India during the 1990s was fairly limited. This conclusion stands in sharp contrast with the substantial growth measured by national accounts statistics (NAS) over the same period. This puzzle has attracted considerable attention in recent years,² and it has been widely suggested that it might simply be that a large part of the growth went to the very rich. However, there has been no attempt to directly quantify this possibility.³ The tax data permit taking a useful step in this direction by putting bounds on the extent to which the growth gap can be explained simply as undercounting of the very rich. The analysis here concludes that it can explain between 20 percent and 40 percent of the puzzle.
Although this is not negligible, it leaves the bulk of the puzzle unaccounted for, largely because the share of the rich in total income is still relatively small. This suggests that there probably is some deeper problem with the way either the NSS or the National Statistical Office (which generates the NAS) collects its data.⁴

2. See, for example, Datt (1999), Ravallion (2000), World Bank (2000), and Sundaram and Tendulkar (2001). Recently released data from the 1999–2000 NSS round have revealed that growth was larger than expected during the 1990s and that poverty rates did decline over this period, contrary to what most observers believed on the basis of pre-1999–2000 NSS rounds (Deaton and Dreze 2002; Deaton 2003a, b). However, the overall growth gap between NSS data and NAS data still appears substantial, even after this correction (see table 2). The existence of this growth discrepancy was already a subject of inquiry in India during the 1980s (see, for example, Minhas 1988 and Minhas and Kansal 1990), but the gap observed during the 1990s appears to be substantially larger than the gap during previous decades. For a broader, international perspective on the survey and national accounts debate, see Deaton (2003c).

3. Sundaram and Tendulkar (2001) find that the NSS–NAS gap is particularly important for commodities that are more heavily consumed by higher income groups, thereby providing indirect evidence for the explanation based on rising inequality.

4. See Bhalla (2002) for a negative view of the NSS approach. For more balanced discussions of the relative merits of survey and national accounts aggregates in developing economies, see Ravallion (2001) and Deaton (2003c).

The next two sections briefly outline the data and methodology (section I) and present the long-run results (section II). Section III discusses the potential problems with this evidence, and section IV uses the evidence to shed some light on the Indian growth paradox of the 1990s. Section V concludes.

I. DATA AND METHODOLOGY

The tabulations of tax returns published each year by the Indian tax administration in the All-India Income-Tax Statistics series constitute the primary data source used in this study. The first year for which income data are available is 1922/23 and the last year is 1999/2000.⁵

5. References to the relevant All-India Income-Tax Statistics (AIITS) publications are given in the working paper version of this study (Banerjee and Piketty 2004, table A0). Financial years run from April 1 to March 31 in India, so that 1922/23 refers to the period running from April 1, 1922, to March 31, 1923, for example. AIITS publications refer to assessment years (the year in which the income is assessed), whereas this study refers to income years (the year in which the income was earned). Thus, for example, AIITS 1923/24 contains the data on income year 1922/23. AIITS 2000/01 data (income year 1999/2000) were not yet available when the study was updated, so the income year 1999/2000 data for top incomes were obtained by inflating the 1998/99 data by the nominal 1999–2000/1998–99 per tax unit national income growth rate. This approximation probably leads to an underestimate of top income growth, but as there was no large NSS round for 1998/99, it was easier to make the comparison with 1999/2000 as the end point.

Due to the relatively high exemption levels, the number of taxpayers in India has always been rather small. The proportion of taxable tax units was around 0.5–1 percent from the 1920s to the 1980s and then rose sharply during the 1990s to 3.5–4 percent by the end of the decade, following the large increase in top nominal incomes (figure 1).⁶ Therefore, the long-run series cannot go below the top 1 percent.
6. Throughout the article, "tax units" should be thought of as individuals. All estimates were obtained by summing up tax returns filed by individuals and those filed by "Hindu undivided families" (a group that made up less than 5 percent of the total in the 1990s, down from about 20 percent in the interwar period). The total, theoretical number of tax units was set to be equal to 40 percent of the total population of India throughout the period (see Banerjee and Piketty 2004, table A1, col. 2). This represents a rough estimate of the potential "positive-income population" of India. This is lower than India's adult population (the 15 years old and older population has made up about 60–65 percent of the total population since the 1950s), but it is very close to India's labor force (the labor force has consisted of about 40–45 percent of total population since the 1950s).

FIGURE 1. Proportion of Taxable Tax Units in India, 1922–2000 (percent)
[Figure not reproduced: vertical axis from 0.0 to 4.0 percent; horizontal axis shows income years 1922-23 to 1997-98.]
Source: Authors' computations using tax return data; see Banerjee and Piketty (2004, table A1).

The published tax tabulations report the number of taxpayers and the total income reported by these taxpayers for a large number of income brackets. Standard Pareto extrapolation techniques were used to compute for each year the average incomes of the top percentile (P99–100), the top 0.5 percent (P99.5–100), the top 0.1 percent (P99.9–100), and the top 0.01 percent (P99.99–100) of the tax unit distribution of total income, as well as the income thresholds P99, P99.5, P99.9, and P99.99 and the average incomes of the intermediate fractiles P99–99.5, P99.5–99.9, and P99.9–99.99.⁷

7. The Pareto law is given by $1 - F(y) = (k/y)^{a}$, where $1 - F(y)$ is the fraction of the population with income above $y$, and $k > 0$ and $a > 1$ are the structural Pareto parameters. For a recent use of Pareto extrapolation techniques with similar tax return data, see Piketty (2003) and Piketty and Saez (2003). See also Atkinson (2004) and Dell (2004).

The results for 1999/2000 give a sense of orders of magnitude (table 1). There were almost 400 million tax units in India in 1999/2000 (396.4 million). Based on the national accounts statistics, the average income of those 400 million tax units was around 25,000 rupees (Rs) a year ($3,000 in purchasing power parity, PPP, terms).⁸ To fall into the top percentile (P99), which included about 4 million tax units, required an income of more than Rs 88,000 (around $10,000 in PPP terms). The average income of the bottom half of the top percentile (fractile P99–99.5, about 2 million tax units) was about Rs 99,000 (less than $12,000 in PPP terms). To fall into the top 0.01 percent (about 40,000 tax units) required an income of more than Rs 1.4 million ($160,000 in PPP terms), and the average income above that threshold was more than Rs 4 million ($470,000 in PPP terms).⁹

8. The average income series (see Banerjee and Piketty 2004, table A1, col. 7) was set to be equal to 70 percent of national income per tax unit (the 30 percent deduction is assumed to represent the fraction of national income that goes to undistributed profits, nontaxable income, and so on). The national income series was taken from Sivasubramonian (2000), from which the population series was also taken. Banerjee and Piketty (2004, table A0) also report on other income aggregates based on gross domestic product and national accounts household consumption (both taken from the World Bank's World Development Indicators database, from which the Consumer Price Index series and the PPP exchange rate used in table 1 were also taken) and on NSS household consumption (computed from Datt 1997, 1999 for the 1956–1998 series and Deaton and Dreze 2002, note 24, for the corrected 1999/2000–1993/94 growth rate).

9. To put these numbers in global perspective, consider that India's 1999/2000 P99.99 threshold (about $160,000 in PPP terms) is located midway between U.S. 1998 thresholds for P95 ($107,000) and P99 ($230,000); see Piketty and Saez (2003, table 1). India's 1999/2000 P99.9 threshold (about $34,000 in PPP terms) is well below the U.S. 1998 P90 threshold ($82,000).

TABLE 1. Top Indian Incomes in 1999–2000

Income thresholds (income level)
Percentile   Rupees       US$, market exchange rate   US$, PPP conversion factor
P99             87,633                       2,035                       10,131
P99.5          147,546                       3,427                       17,057
P99.9          295,103                       6,853                       34,116
P99.99       1,383,930                      32,140                      159,992

Average incomes by fractile
Fractile          Number of tax units   Rupees      US$, market exchange rate   US$, PPP conversion factor
Full population           396,400,000      25,670                         596                        2,968
P99–99.5                    1,982,000      98,842                       2,295                       11,427
P99.5–99.9                  1,585,600     216,929                       5,038                       25,079
P99.9–99.99                   356,760     590,488                      13,713                       68,264
P99.99–100                     39,640   4,034,289                      93,690                      466,392

Source: Authors' computations using official tax return data; see Banerjee and Piketty (2004, table A1 and table A2).
Note: U.S. dollar amounts have been computed by applying the average 1999/2000 market exchange rate ($1 = Rs 43.06) and the average 1999/2000 PPP conversion factor ($1 = Rs 8.65) to amounts in current 1999/2000 rupees.

As is the case in other countries, the top of the income distribution in India appears to be very precisely approximated by the Pareto structural form.¹⁰ However, the estimates for the recent period are subject to sampling error: The official tax tabulations were based on the entire population until the early 1990s (as is the case in most Organisation for Economic Co-operation and Development countries),¹¹ but they now seem to be based on uniform samples of all tax returns. Although there is uncertainty about the new sampling procedure, the sampling rate seems to be sufficiently large to guarantee that the estimated trends for top income shares are statistically significant.¹²

10. In the same way as for other countries (see previous references), the extrapolation results were checked and were found to be virtually unaffected by the choice of extrapolation thresholds used to estimate the structural parameters. Pareto coefficients are locally very stable in India, as they are in other countries. Before the 1990s, less than 1 percent of individuals were subject to tax, and the lowest threshold available was used to estimate the top percentile threshold, P99 (given that Pareto coefficients are in practice very stable, the resulting estimates appear to be as precise as estimates for thresholds P99.5 and higher).

11. Or they were based on stratified samples with sampling rates close to 100 percent for top incomes.

Official tax publications also include tabulations of the amounts of income in each category (wages, business income, dividends, interest, and so on) for each income bracket. In particular, there are separate tables for wage earners, by far the largest subgroup. This made it possible to compute separate estimates for top wage fractiles, which could then be compared with the estimates for top fractiles of total income (see later discussion).¹³

II. THE LONG-RUN DYNAMICS OF TOP INCOME SHARES, 1922–2000

The results show that income inequality (as measured by the income shares of those in the top income groups) follows a U-shaped pattern over 1922–2000 (figure 2). The top 0.01 percent income share fluctuates around 2–2.5 percent of total income from the 1920s to the 1950s.
It then gradually falls from about 1.5–2 percent of total income in the 1950s to less than 0.5 percent in the early 1980s. It rises again during the 1980s and 1990s and reaches 1.5–2 percent again during the late 1990s. This means that the average income of the top 0.01 percent of the income distribution was about 150–200 times larger than the average income of the entire population during the 1950s. The difference fell to less than 50 times larger than the average income in the early 1980s, but then rose again to 150–200 times larger during the late 1990s.

12. According to the tax administration statistics division, the sampling rate is about 1 percent and approximately uniform (the official tax publications do not include any precise information about sampling design and rate). Given India's large population, this implies that the estimate for the top 1 percent income share (8.95 percent of total income in 1999/2000; see Banerjee and Piketty 2004, table A4) has a standard error of about 0.04 percent and that the estimate for the top 0.01 percent income share (1.57 percent of total income in 1999/2000; see Banerjee and Piketty 2004, table A4) has a standard error of about 0.08 percent. There is some evidence, however, that the sampling design is changing and that published tabulations were becoming more volatile by the end of the period. In particular, the tabulations for income year 1997/98 (assessment year 1998/99) contain far too many individual taxpayers above 1 million rupees, suggesting that something went wrong in the sampling design that year. The 1997/98 estimates were corrected downward on the basis of 1996/97 and 1998/99 tabulations.

13. Published wage tabulations for income years 1996/97 and 1997/98 appear to suffer from sampling design failures (top wages are clearly truncated in 1996/97, and they are too numerous in 1997/98). The estimates for those two years were corrected on the basis of 1995/96 and 1998/99 data.

The exact turning point is also of some interest. The decline in the share of the top 0.01 percent is relatively rapid until 1974/75 (see figure 3). Then it slows considerably, but there is still a clear downward trend until 1980/81. Then the trend reverses, moving upward throughout the 1980s, reaching a peak in 1988/89. During the 1980s the share of the top 0.01 percent more than doubles, from less than 0.4 percent to more than 0.8 percent. But it then reverses again, and by 1991/92 the share is back below 0.6 percent. Then it takes off, and after 1995/96 it remains in the 1.5–2 percent range.

FIGURE 2. The Top 0.01 Percent Income Share in India, 1922–2000 (percent)
[Figure not reproduced: vertical axis from 0.0 to 3.5 percent; horizontal axis shows income years 1922-23 to 1997-98.]
Source: Authors' computations using tax return data; see Banerjee and Piketty (2004, table A4).

FIGURE 3. The Top 0.1 Percent Income Share in India, 1922–2000 (percent)
[Figure not reproduced: vertical axis from 0.0 to 8.0 percent; horizontal axis shows income years 1922-23 to 1997-98.]
Source: Authors' computations using tax return data; see Banerjee and Piketty (2004, table A3).

A similar (though less pronounced) U-shaped pattern is also observed for the top 1 percent income share, which went from about 12–13 percent during the 1950s to 4–5 percent in the early 1980s and to 9–10 percent in the late 1990s (figure 4). Once again the turning point seems to be around 1980/81, and during the 1980s the share of the top 1 percent also doubles. Then, as with the share of the top 0.01 percent, there is a period of retrenchment that lasts until 1991/92, followed by a renewed upward movement.
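The top-share series discussed above rest on the Pareto extrapolation described in section I (note 7). As a rough illustration of that step only, the sketch below fits the Pareto law $1 - F(y) = (k/y)^{a}$ through two of the 1999/2000 thresholds reported in table 1 and derives the implied average income above a threshold. The function names are hypothetical, and the two-point fit is an assumption made for illustration: the authors work from the bracketed tabulations themselves, which are not reproduced here.

```python
import math

# Illustrative two-point Pareto fit. Function names are hypothetical; the
# inputs are the 1999/2000 thresholds (in rupees) reported in table 1.

def fit_pareto(y_low, p_low, y_high, p_high):
    """Solve 1 - F(y) = (k / y) ** a through two (threshold, fraction-above) points."""
    a = math.log(p_low / p_high) / math.log(y_high / y_low)
    k = y_low * p_low ** (1.0 / a)
    return a, k

def threshold(a, k, p):
    """Income level above which a fraction p of tax units lies."""
    return k * p ** (-1.0 / a)

def mean_above(a, y):
    """Average income above y under the Pareto law (requires a > 1)."""
    return a / (a - 1.0) * y

# P99.9 (top 0.1 percent) and P99.99 (top 0.01 percent) thresholds from table 1.
a, k = fit_pareto(295_103, 0.001, 1_383_930, 0.0001)
print(f"a = {a:.3f}")
print(f"P99.95 threshold (interpolated): Rs {threshold(a, k, 0.0005):,.0f}")
print(f"Implied mean above P99.99: Rs {mean_above(a, 1_383_930):,.0f}")
```

Under these assumptions the fitted coefficient is close to 1.5, and the implied average income above P99.99 falls within roughly 5 percent of the Rs 4.03 million reported in table 1. The coefficient is a local one, so a fit of this kind should only be used near the thresholds from which it was estimated.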
A comparison of these trends reveals another intriguing fact: Although in the 1980s the share of the top 1 percent increases almost as quickly as the share of the top 0.01 percent (see figures 2 and 4), in the 1990s there is a clear divergence between what is happening in the top 0.01 percent and in the rest of the top 1 percent. To confirm that this is the case, the top percentile is broken into four fractiles: P99–99.5, P99.5–99.9, P99.9–99.99, and P99.99–100. During the 1987–2000 period, only those in the top 0.1 percent enjoyed income growth rates faster than the growth rate of GDP per capita (table 2). This contrasts with what is observed when the period includes the 1980s, which shows evidence of above average growth for the entire top percentile (table 3).

Although 1980/81 is clearly the year when the data series turn around, it is not possible to date the true turnaround with as much precision, because the income share of the rich is also affected by short-run, cyclical factors. It may be that the data series puts the turnaround in 1980/81 only because no allowance was made for the deep recession of 1979/80 and 1980/81, which hurt the rich. Thus what appears as a sharp upward trend starting in 1981 may be just a reversion in 1981/82 and 1982/83 to the preexisting trend. Therefore, rather than ascribing the turnaround to a single year, it is ascribed to the early to mid-1980s. The fact that the turnaround is so early makes it hard to attribute it to the formal process of liberalization. Indeed, given the nature of the data, it cannot be entirely ruled out that the driving factor was either a shift in the global economic environment or a part of the natural evolution of a mixed economy.

FIGURE 4. The Top 1 Percent Income Share in India, 1922–2000 (percent)
[Figure not reproduced: vertical axis from 4.0 to 19.0 percent; horizontal axis shows income years 1922-23 to 1997-98.]
Source: Authors' computations using tax return data; see Banerjee and Piketty (2004, table A4).

TABLE 2. Top Income Growth during the 1990s: 1999/2000 versus 1987/88 (percent)

Item                                      Nominal growth   Real growth
Household consumption per capita (NSS)              +242           +19
GDP per capita (NAS)                                +337           +52
Household consumption per capita (NAS)              +304           +40
National income per tax unit (NAS)                  +346           +55
Top income fractiles (tax returns)
  P99–100                                           +392           +71
  P99.5–100                                         +412           +78
  P99.9–100                                         +548          +125
  P99.99–100                                      +1,009          +285
  P99–99.5                                          +331           +50
  P99.5–99.9                                        +317           +45
  P99.9–99.99                                       +393           +71
  P99.99–100                                      +1,009          +285
Consumer price index                                +188

Share of growth gap accounted for by (percent)
  P99–100       20.1
  P99.5–100     17.2
  P99.9–100     12.7
  P99.99–100     8.0

Source: Authors' computations using tax return, NAS, and NSS data; see Banerjee and Piketty (2004, tables A1–A3).

TABLE 3. Top Income Growth in India during the 1980s–1990s: 1999/2000 versus 1981/82 (percent)

Item                                      Nominal growth   Real growth
Household consumption per capita (NSS)              +487           +25
GDP per capita (NAS)                                +700           +70
Household consumption per capita (NAS)              +599           +49
National income per tax unit (NAS)                  +688           +68
Top income fractiles (tax returns)
  P99–100                                         +1,508          +242
  P99.5–100                                       +1,747          +293
  P99.9–100                                       +2,270          +404
  P99.99–100                                      +3,980          +767
  P99–99.5                                          +992          +132
  P99.5–99.9                                      +1,392          +217
  P99.9–99.99                                     +1,698          +282
  P99.99–100                                      +3,980          +767
Consumer price index                                +370

Share of growth gap accounted for by (percent)
  P99–100       39.7
  P99.5–100     33.5
  P99.9–100     19.1
  P99.99–100     9.3

Source: Authors' computations using tax return, NAS, and NSS data; see Banerjee and Piketty (2004, tables A1–A3).
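The real growth figures in tables 2 and 3 can be related to the nominal ones by a simple deflation. The sketch below assumes that real growth is cumulative nominal growth deflated by the consumer price index row of the same table; the source does not spell this out, and the helper name is hypothetical. It checks that assumption against a few rows of table 2.

```python
# Hypothetical helper; the formula is the standard deflation of cumulative
# growth, assumed (not stated in the source) to underlie the real growth column.

def real_growth(nominal_pct, cpi_pct):
    """Real cumulative growth (percent) from nominal growth and CPI growth (percent)."""
    return ((1 + nominal_pct / 100) / (1 + cpi_pct / 100) - 1) * 100

# Consumer price index growth, 1999/2000 versus 1987/88 (table 2).
CPI_1987_2000 = 188

# item -> (nominal growth, published real growth), taken from table 2.
table2 = {
    "Household consumption per capita (NSS)": (242, 19),
    "GDP per capita (NAS)": (337, 52),
    "P99-100": (392, 71),
    "P99.99-100": (1009, 285),
}

for item, (nominal, published) in table2.items():
    computed = real_growth(nominal, CPI_1987_2000)
    print(f"{item}: computed {computed:+.0f}, published {published:+d}")
```

For the rows shown, the computed values agree with the published real growth figures to within about one percentage point. The same holds for table 3 with its consumer price index growth of 370 percent, where small differences are consistent with the published nominal figures being rounded.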
However, the timing of the turnaround is also consistent with the view that there was a structural shift in the Indian economy in the early to mid-1980s. Delong (2001) and Rodrik and Subramanian (2004), based on macro time-series data, date the acceleration in the growth rate of the Indian economy to the early to mid-1980s, rather than the early 1990s. They suggest that this may have to do with a shift of power within the ruling Congress Party toward a more technocratic, probusiness group associated with Rajiv Gandhi, who entered politics in 1981 following his brother's death and became prime minister in 1984. Available macro series also show that the wage share in the private corporate sector has been declining in India since the early to mid-1980s (in contrast to the 1970s, when the profit share was declining; see Nagaraj 2000, figure 7, and Tendulkar 2003, table 14), which is again consistent with the timing of the turnaround proposed here.

Also, although the turnaround was earlier, the data suggest a definite acceleration in the growth of the share of the top 0.01 percent after 1991. Moreover, this contrasts with what is observed for the top 1 percent, suggesting that what happened after 1991 was qualitatively different from what happened before, and even more biased in favor of the ultra-rich.

Finally, there is tentative evidence suggesting that what happened in India over the entire period was not simply a reflection of forces that were affecting countries all over the world. A comparison of India with France and the United States during the 1950s and 1960s shows that India was less egalitarian than the other two countries (figures 5–7). The top 0.01 percent earned a substantially higher share of total income in India. Subsequently, however, top income shares declined continuously in India during the 1960s and the 1970s and fell below the levels in France and the United States during the early 1980s.
That the decline in India occurred mostly during the 1950s to 1970s (rather than during the interwar and World War II period) seems consistent with the interpretation posited by Piketty (2003) and Piketty and Saez (2003) to explain the French and U.S. trajectories: the shocks induced by the Great Depression of the 1930s and World War II were less severe in India,[14] whereas tax progressivity was extremely high in India during the 1950s through the 1970s, perhaps inducing a large impact on capital concentration and pretax income inequality (larger than in France or the United States). Available data do seem to indicate that the fall in top shares observed during this period was due primarily to a fall in top capital incomes.[15]

14. Note that unlike in France, the United States, or the United Kingdom, top income shares in India were actually rising during the Great Depression of the 1930s. Top Indian nominal incomes did decline during the 1930s, but less rapidly than the national income and wage series computed by Sivasubramonian (2000). This probably reflects the fact that India had a very different position than France, the United States, or the United Kingdom in the world division of labor during the 1930s (Indian entrepreneurs might have benefited from the drop in world manufacturing output and raw material prices).

FIGURE 5. The Top 0.01 Percent Income Share in India, France, and the United States, 1913–2000 (percent)
[Figure omitted: line chart with series for India, France, and the United States; vertical axis from 0.0 to 5.0 percent, horizontal axis from 1913-14 to 1998-99.]
Source: Authors' computations using tax return data; see Banerjee and Piketty (2004, table A4) for India; Piketty (2003) for France; Piketty and Saez (2003) for the United States.

FIGURE 6. The Top 0.1 Percent Income Share in India, France, the United States, and the United Kingdom, 1913–2000 (percent)
[Figure omitted: line chart with series for India, France, the United States, and the United Kingdom; vertical axis from 0.0 to 12.0 percent, horizontal axis from 1913-14 to 1998-99.]
Source: Authors' computations using tax return data; see Banerjee and Piketty (2004, table A4) for India; Piketty (2003) for France; Piketty and Saez (2003) for the United States; Atkinson (2004) for the United Kingdom.

FIGURE 7. The Top 1 Percent Income Share in India, France, and the United States, 1913–2000 (percent)
[Figure omitted: line chart with series for India, France, and the United States; vertical axis from 2.0 to 22.0 percent, horizontal axis from 1913-14 to 1998-99.]
Source: Authors' computations using tax return data; see Banerjee and Piketty (2004, table A4) for India; Piketty (2003) for France; Piketty and Saez (2003) for the United States.

Top income shares then rise again in India, following a pattern similar to that in the United States but not in France, where the top shares remain fairly flat during the 1980s and 1990s (the pattern in most other European countries is similar).[16] The share of the very rich in Indian incomes is currently much higher than in Europe. As will be shown next, the rise in top Indian incomes during the recent period was not due to the revival in top capital incomes (the rise in top wages did play a key role, as in the United States).
Although the data do not permit identifying the precise causal channels at work or isolating the impact of globalization, the fact that the rise in income inequality is so concentrated within top incomes seems more consistent with a theory based on rents and market frictions (see, for example, Banerjee and Newman 2003) than with a theory based solely on skills and technological complementarity (in which the rise in inequality in developing countries reflects a low-skilled labor force unable to benefit from globalization; see, for example, Kremer and Maskin 2003).

15. The official tax publications do not provide a complete set of tabulations broken down by income sources, so the point could not be studied in greater detail.

16. A data series on top income shares recently constructed for Germany by Dell (2004) confirms that France is fairly representative of Continental Europe. The United Kingdom appears to be intermediate between Continental Europe and the United States: there is a rise in top shares after the early 1980s, but much less pronounced than in the United States (see Atkinson 2004).

III. MEASUREMENT ISSUES

The presumption so far has been that what has been measured is the actual income share of the rich. There are a number of reasons why this may not be true.

First, despite extensive efforts, it was not possible to determine exactly what changes were made during the 1990s in the procedure for generating the samples used to create the tax tables. From informal conversations with Indian tax officials, it seems that at least in recent years the procedure is more an informal attempt to sample randomly than a precise random sample. To the extent that this increases the risk of the data being clustered, the implication is that the within-sample variance might overstate the precision of the data. This remains a possibility, but for the most part the trends seem quite stable.
Although the results for single years or sets of years may reflect sampling variation, the fact that in every year between 1973/74 and 1992/93 the share of the top 0.01 percent is less than 0.85 percent (and in every year but two it is less than 0.7 percent) and that in 1995/96 and every year after that it is greater than 1.5 percent makes the results appear much more robust. The two intervening years, 1993/94 and 1994/95, show, as might be hoped, shares between 0.7 percent and 1.5 percent for the top 0.01 percent.

A more serious problem is that the surge in top incomes may reflect improvements in the income tax department's ability to measure (and hence tax) the incomes of the wealthy. The tax cuts in the early 1990s might have reduced the incentives among the wealthy for evading taxes. Note, however, that the overall decline in the top marginal rate, though nonmonotonic, was quite moderate: the top marginal tax rate dropped from 50 percent in 1987/88 to 40 percent in 1999/2000 (figure 8). By comparison, the change in the share of the top 0.01 percent was huge: it went up from 0.7 percent in 1987/88 to more than 1.5 percent in 1999/2000. If this entire change is to be explained by a shift in tax rates, the implied elasticity would have to be enormous.

FIGURE 8. The Top 0.01 Percent Income Share and the Top Marginal Income Tax Rate in India, 1981–2000 (percent)
[Figure omitted: line chart with two series, the top 0.01 percent share (left scale, 0.0 to 2.5 percent) and the top marginal rate (right scale, 25 to 75 percent); horizontal axis from 1981-82 to 1998-99.]
Source: Authors' computations using tax return data and tax return law; see Banerjee and Piketty (2004, table A4).

TABLE 4. Top Wage Growth in India during the 1990s: 1999/2000 versus 1987/88 (percent)

Item                                        Nominal Growth   Real Growth
Household consumption per capita (NSS)           +242            +19
GDP per capita (NAS)                             +337            +52
Household consumption per capita (NAS)           +304            +40
National income per tax unit (NAS)               +346            +55
Top wage fractiles (tax returns)
  P99–100                                        +420            +81
  P99.5–100                                      +492           +105
  P99.9–100                                      +551           +126
  P99.99–100                                     +955           +266
  P99–99.5                                       +246            +20
  P99.5–99.9                                     +470            +98
  P99.9–99.99                                    +448            +94
  P99.99–100                                     +955           +266
CPI                                              +188

Source: Authors' computations using tax return, NAS, and NSS data; see Banerjee and Piketty (2004, tables A1, A5, and A6).

Experience elsewhere also suggests that the rise of top incomes can be explained by nontax structural factors (changing social norms, booming economy, international trade, and globalization) rather than by tax changes and increased incentives to report top incomes. The consensus in the United States seems to be that while short-run elasticities can be substantial,[17] the medium- and long-run elasticity of top taxable incomes with respect to top tax rates is probably modest. The rise in top income shares in the United States during 1970–2000 seems to reflect real economic change (rather than pure fiscal manipulation): the top shares started rising well before the Tax Reform Act of 1986, and the rise continued at an even faster pace during the 1990s, despite the 1993 rise in top tax rates (Goolsbee 2000; Piketty and Saez 2003). In China top income shares rose substantially during 1986–2001 (twice as fast as in India), despite the fact that top Chinese income tax rates have remained unchanged since the early 1980s (Piketty and Qian 2004).

17. Reflecting mostly income relabeling or changes in timing of the exercise of bonuses or stock options.

Of course, the effect of tax changes in India could have been reinforced by spectacular improvement in the collection technology (as well as increased incentives on the taxpayer side).
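As a rough check on the claim that the implied elasticity would have to be enormous, the following back-of-the-envelope sketch may help. It is not the authors' computation. It assumes that the elasticity is measured with respect to the net-of-tax rate (one minus the top marginal rate), a common convention that the article itself does not spell out, and it treats the 1999/2000 share as exactly 1.5 percent even though the text says "more than 1.5 percent." The function name `implied_elasticity` is hypothetical.

```python
import math


def implied_elasticity(share_start, share_end, rate_start, rate_end):
    """Log-change elasticity of a top income share with respect to the
    net-of-tax rate (1 - top marginal rate).

    Illustrative definition only; the article does not specify one.
    """
    d_log_share = math.log(share_end / share_start)
    d_log_net_of_tax = math.log((1 - rate_end) / (1 - rate_start))
    return d_log_share / d_log_net_of_tax


# Figures quoted in the text: the top 0.01 percent share rose from 0.7 to
# (at least) 1.5 percent while the top marginal rate fell from 50 to 40 percent.
elasticity = implied_elasticity(0.7, 1.5, 0.50, 0.40)
print(f"Implied elasticity: {elasticity:.1f}")
```

Under these assumptions the elasticity comes out at roughly 4, which would be hard to square with the modest medium- and long-run responses described above. Because 1.5 percent is a floor for the later share, the figure is if anything an understatement.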
There were a number of innovations in tax collection in the 1990s, such as the 1998 introduction of the "one in six rule," which required everyone who satisfied at least one of six criteria (such as owning a car or traveling abroad) to file a tax return.

To further investigate this issue, the exercise was rerun for wages only. Wages are much less subject to tax evasion than is nonwage income, because taxes are typically deducted at the source and employers have a strong incentive to report what they pay, since wage payments are deductible from employers' taxes. Therefore, if better collection made the difference, wage incomes would be expected to have grown much more slowly than other income. The results show that top wages (table 4) increased essentially in step with top incomes (see table 2) during the 1990s, rising 81 percent between 1987/88 and 1999/2000 for the top percentile of the wage distribution compared with 71 percent for the top percentile of the income distribution. This is consistent with the fact that wages as a share of the total income of the top percentile increased somewhat during this period (from 28 percent to 31 percent).

The interpretation that there was a "real" increase in top incomes (especially top wages) in India during the 1990s is also consistent with the evolution of the public sector salary scale. Following a succession of Pay Commissions, including the well-known Fifth Pay Commission, whose recommendations were implemented in 1997, the salaries of central government employees were raised sharply in India during the 1990s (see, for example, Kochar 2003). According to computations for this study (based on published public sector salary scales), the Fifth Pay Commission alone can account for a substantial part of the rise in the number of top income taxpayers in India between 1994 and 1997.
Central government employees made up about 7 percent of all income taxpayers in India in 1994 (fewer than 500,000 central government taxpayers, of a total of about 7 million taxpayers), and following the pay rise they made up almost 30 percent of all taxpayers by 1997 (about 3.2 million central government taxpayers, of a total of 11 million). According to these computations, of the 4 million extra taxpayers recorded between 1994 and 1997, around 2.7 million (almost 70 percent) were central government employees. The very top wage of the central government salary scale was 98,000 Rs (9,000 Rs a month) in 1994 (which was just a little bit above the P99.5 threshold), and it was raised to 360,000 Rs (30,000 Rs a month) in 1997 (which was well above the P99.9 threshold).[18]

However, it does not seem to be the case that public sector wage increases were the primary driver behind the increase in inequality in the 1990s. Most of the rise in top Indian income shares actually took place before 1997, and it is likely that the revised scale put forward by the Fifth Commission was itself a response to the large rise in top private sector wages that had taken place in previous years.[19]

IV. THE GROWTH PARADOX OF THE 1990s

Can the fact that the rich were getting richer help solve what has been called the Indian growth paradox of the 1990s? Table 2 illustrates this paradox: for the period 1987–2000, it compares the growth rate of average consumption as reported in the NSS with the growth rate of average income and consumption from the NAS, as well as the top incomes from the tax returns.
The years 1987/88 and 1999/2000 were chosen because there were large rounds of the NSS surveys in those years, enabling more precise estimates of the NSS–NAS gap.[20] To eliminate the effect of using different deflators, nominal growth performance was compared first and then real growth performance was computed using the same deflator for all the series (the Consumer Price Index, CPI).

18. All these computations on public sector wages were made using the 1994 and 1997 (post–Fifth Commission) central government salary scales published in the "Report of the 5th Central Pay Commission" ("Distribution of Filled Posts in Central Government and Union Territories in Different Scales of Pay, as on 31.3.1994," Government of India Press, New Delhi, 1997) and in the Gazette of India (Special Issue, The First Schedule, Part A, "Revised scales for posts carrying present scales in Group A, B, C and D," Government of India Press, New Delhi, 1997). In 1994, the central government scale ranged from scale 1 (9,000 Rs a month) to scale 62 (750 Rs a month), and all employees in scales 1 to 46 (approximately 500,000 employees) were subject to tax (that is, they had annual incomes greater than 28,000 Rs, which was the base exemption level in 1994, excluding all special deductions). In 1997 the (revised) scale ranged from scale S-34 (30,000 Rs a month, previously scale 1) to scale S-1 (2,550 Rs a month, previously scale 62), and all employees in (revised) scales S-34 to S-3 (approximately 3.2 million people) were subject to tax (that is, they had annual incomes of more than 40,000 Rs, which was the base exemption level in 1997, excluding all special deductions). Note that these numbers include only central government employees, strictly speaking, and that they would need to be scaled up substantially to take other government employees into account. In 1994 there were about 4 million central government employees, and the total number of workers employed by state governments, quasi-government bodies, and local bodies was about 3.5 times as large. In principle the Fifth Pay Commission revised scales also applied to these noncentral government employees, but the salary distribution for these employees could not be found (such a document apparently exists only for the central government).

19. Such a view would be consistent with the fact that the ceiling on private sector executive compensation was repealed as early as 1991.

20. Intermediate NSS surveys were conducted between the two large surveys of 1987/88 and 1993/94 and between the two large surveys of 1993/94 and 1999/2000, but these were based on smaller samples and are considered less reliable. The 1999/2000 per capita consumption estimates come from Deaton and Dreze (2002), who corrected the data for changes in the recall period (all surveys until 1993/94 were conducted with a 30-day recall period, but the NSS has experimented with a 7-day recall period since then).

According to the NSS data, real growth was fairly limited in India during the 1990s: per capita consumption increased by only 19 percent in real terms between 1987/88 and 1999/2000. According to NAS data, however, real growth was more than twice as large: both per capita GDP and national income increased by more than 50 percent in real terms, and per capita household consumption increased by 40 percent. This NSS–NAS gap is referred to as the Indian growth paradox and has been the subject of much discussion in recent years.[21] Table 2 raises the possibility that the very large growth of top incomes during the 1990s might help solve this puzzle.
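The step from nominal to real growth with a common deflator can be sketched in a few lines. The sketch assumes the standard deflation identity, real growth = (1 + nominal growth) / (1 + CPI growth) − 1, which the article does not state explicitly but which reproduces the real growth column of table 2 from its nominal column. The function name `real_growth` and the dictionary layout are hypothetical; the input figures are copied from table 2.

```python
def real_growth(nominal_pct, cpi_pct):
    """Convert cumulative nominal growth (percent) into real growth (percent)
    by dividing the two growth factors.

    Assumed, not confirmed, to match the authors' procedure.
    """
    return ((1 + nominal_pct / 100) / (1 + cpi_pct / 100) - 1) * 100


CPI_GROWTH = 188  # CPI growth between 1987/88 and 1999/2000 (table 2)

# Nominal growth rates in percent, copied from table 2.
nominal = {
    "Household consumption per capita (NSS)": 242,
    "GDP per capita (NAS)": 337,
    "Household consumption per capita (NAS)": 304,
    "National income per tax unit (NAS)": 346,
    "P99-100": 392,
    "P99.99-100": 1009,
}

# Deflate every series with the same price index and round to whole percent.
real = {name: round(real_growth(g, CPI_GROWTH)) for name, g in nominal.items()}
for name, g in real.items():
    print(f"{name}: {g:+d}")
```

Using one deflator for every series means that differences in real growth across rows come only from differences in nominal growth, which is the point of the comparison.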
The average income growth among the top 1 percent of the tax units was 71 percent in real terms between 1987/88 and 1999/2000, which is substantially more than average growth according to the national accounts. Moreover, the higher up in the top 1 percent, the higher the growth (up to 285 percent for the top 0.01 percent).

What fraction of the NSS–NAS gap can be explained by the huge growth performance of very top incomes? Assume that the NSS is unable to record any of the extra growth enjoyed by the top 1 percent (say, the people in the top 1 percent do not report their extra growth to the NSS or do not report anything at all). By the calculations reported here, the top 1 percent share in total consumption was around 8 percent in 1987/88.[22] Since the average income of the top 1 percent increased by 71 percent in real terms between 1987/88 and 1999/2000 according to the tax returns, compared with 19 percent for average NSS consumption, this implies that NSS growth was 3.55 percent less than it would have been without the misreporting: $0.0812 \times (1.71/1.19 - 1) \approx 0.0355$. This implies that the growing incomes among the top 1 percent can explain at most 20.1 percent of the NSS–NAS gap (see table 2): $0.0355 / (1.40/1.19 - 1) \approx 0.201$.[23] This is significant, but it still leaves 80 percent of the puzzle unexplained. The problem lies in the fact that almost all the extraordinary growth was among the top 0.1 percent, and the weight of this group is simply not large enough to have an impact on aggregate statistics of the necessary magnitude. For the rise of inequality to fully explain the NSS–NAS gap, there would have to have been very high income growth at the bottom of the top 1 percent and not simply among those in the top 0.1 percent.

Top income growth can explain a larger proportion of the NSS–NAS gap if the analysis starts in the 1980s.
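The arithmetic behind the 3.55 percent and 20.1 percent figures can be reproduced directly. The sketch below follows the two steps given in the text, using the 8.12 percent baseline share from the article's note 22 and the real growth rates from table 2. The function name `share_of_gap_explained` is hypothetical, and the sketch covers only the top percentile because the baseline shares of the narrower fractiles are not reported in this section.

```python
def share_of_gap_explained(top_share, top_growth, nss_growth, nas_growth):
    """Return (missed_growth, share_of_gap), both as fractions.

    missed_growth: growth the survey fails to record if it captures none of
                   the extra growth of the top group.
    share_of_gap:  missed_growth relative to the NSS-NAS consumption gap.
    """
    missed_growth = top_share * ((1 + top_growth) / (1 + nss_growth) - 1)
    gap = (1 + nas_growth) / (1 + nss_growth) - 1
    return missed_growth, missed_growth / gap


# Inputs from the text: 8.12 percent baseline share of the top 1 percent,
# +71 percent real growth for the top 1 percent, +19 percent for NSS
# consumption, and +40 percent for NAS household consumption.
missed, share = share_of_gap_explained(0.0812, 0.71, 0.19, 0.40)
print(f"Unrecorded growth: {missed:.2%}")
print(f"Share of NSS-NAS gap explained: {share:.1%}")
```

The gap in the denominator is measured with NAS household consumption (+40 percent) rather than GDP, which is the comparison the text uses.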
For instance, under the same assumptions, the top 1 percent can explain almost 40 percent of the cumulative NSS–NAS gap over the 1981–2000 period (table 3). This is because the bottom of the top percentile enjoyed rapid income growth in the 1980s (see figures 2–4). The booming Indian elite of the 1980s and 1990s seems too thin to explain all of the growth puzzle, but large enough to account for a nonnegligible part of it.

21. See previous references. Real growth during the 1990s would be somewhat higher if the GDP deflator were used instead of the CPI, but the NSS–NAS gap would obviously not change.

22. According to estimates for this study (computed with 70 percent of national income as the income denominator), the top percentile income share was 8.12 percent in 1987/88 (see Banerjee and Piketty 2004, table A3).

23. This is in a sense a lower bound, because the 1987/88 top percentile share is being used as the baseline for this computation, and the share was higher for later years.

V. CONCLUSION

The results suggest that the gradual liberalization of the Indian economy did make it possible for the rich (the top 1 percent) to substantially increase their share of total income. However, although in the 1980s the gains were shared by everyone in the top 1 percent, in the 1990s the big gains went only to those in the top 0.1 percent. The 1990s were also the period when the economy was opened. This suggests the possibility that the ultra-rich were able to corner most of the income gains in the 1990s because they alone were in a position to sell what world markets wanted.[24] It would be interesting to see whether in the coming years, as more people position themselves to benefit from world markets, the share of the rich and the ultra-rich stops growing and even begins to shrink. To be able to determine this and to shed light on related issues, more research (and better data) are needed that focus on the rich.

24. The point is that one does not have to be rich on a global scale to be counted among the rich in India and even among the ultra-rich (see table 1). Even those who are paid about as much as an average American make it into the group of the ultra-rich.

REFERENCES

Atkinson, Anthony B. 2004. "Top Incomes in the United Kingdom over the Twentieth Century." Nuffield College, Oxford.
Banerjee, Abhijit, and Andrew Newman. 2003. "Inequality, Growth and Trade Policy." Massachusetts Institute of Technology, Cambridge, Mass., and London School of Economics and Political Science.
Banerjee, Abhijit, and Thomas Piketty. 2004. "Top Indian Incomes, 1922–2000." CEPR Discussion Paper 4632. Centre for Economic Policy Research, London.
Bhalla, Surjit S. 2002. Imagine There Is No Country: Poverty, Inequality and Growth in the Era of Globalization. Washington, D.C.: Institute for International Economics.
Datt, Gaurav. 1997. "Poverty in India 1951–1994: Trends and Decompositions." World Bank, Washington, D.C.
------. 1999. "Has Poverty Declined since Economic Reforms?" Economic and Political Weekly, December 11–17.
Deaton, Angus. 2003a. "Adjusted Indian Poverty Estimates for 1999–2000." Economic and Political Weekly, January 25.
------. 2003b. "How to Monitor Poverty for the Millennium Development Goals." Journal of Human Development 4(3):353–78.
------. 2003c. Measuring Poverty in a Growing World (or Measuring Growth in a Poor World). NBER Working Paper 9822, Cambridge, Mass.
Deaton, Angus, and Jean Dreze. 2002. "Poverty and Inequality in India: A Re-Examination." Economic and Political Weekly, September 7.
Dell, Fabien. 2004. "Top Incomes in Germany, 1880–2000." Campus Paris-Jourdan, Paris.
Delong, J. Bradford. 2001. "India Since Independence: An Analytical Growth Narrative." University of California, Berkeley.
Goolsbee, A. 2000. "What Happens When You Tax the Rich? Evidence from Executive Compensation." Journal of Political Economy 108(2):352–78.
Kochar, Anjini. 2003. "Government, Schooling and Poverty: The Trickle-Down Benefits of Higher Schooling in India." Stanford University, Center for Research on Economic Development and Policy Reform, Palo Alto, Calif.
Kremer, Michael, and Eric Maskin. 2003. "Globalization and Inequality." Harvard University, Cambridge, Mass.
Minhas, B. S. 1988. "Validation of Large Scale Sample Survey Data: Case of NSS Estimates of Household Consumption Expenditure." Sankhya, Series B, 50(3), Suppl.
Minhas, B. S., and S. M. Kansal. 1990. "Firmness, Fluidity and Margins of Uncertainty in the National Accounts Estimates of PCE in the 1980s." Journal of Income and Wealth 12(1).
Nagaraj, R. 2000. "Indian Economy since 1980? Virtuous Growth or Polarization?" Economic and Political Weekly, August 5.
Piketty, Thomas. 2003. "Income Inequality in France, 1901–1998." Journal of Political Economy 111(5):1004–42.
Piketty, Thomas, and Nancy Qian. 2004. "Income Inequality and Progressive Income Taxation in China and India, 1986–2010." Campus Paris-Jourdan, Paris, and Massachusetts Institute of Technology, Cambridge, Mass.
Piketty, Thomas, and Emmanuel Saez. 2003. "Income Inequality in the United States, 1913–1998." Quarterly Journal of Economics 118(1):1–39.
Ravallion, Martin. 2000. "Should Poverty Measures Be Anchored to the National Accounts?" Economic and Political Weekly, August 26–September 2.
------. 2001. "Measuring Aggregate Welfare in Developing Countries: How Well Do National Accounts and Surveys Agree?" World Bank, Washington, D.C.
Rodrik, Dani, and Arvind Subramanian. 2004. The Mystery of the Indian Growth Transition. NBER Working Paper 10376, Cambridge, Mass.
Sivasubramonian, S. 2000. The National Income of India in the Twentieth Century. Delhi: Oxford University Press.
Sundaram, K., and Suresh D. Tendulkar. 2001. "NAS–NSS Estimates of Private Consumption for Poverty Estimation." Economic and Political Weekly, January 13–20.
Szekely, Miguel, and Marianne Hilgert. 1999. "What's Behind the Inequality We Measure: An Investigation Using Latin American Data." Inter-American Development Bank, Washington, D.C.
Tendulkar, Suresh D. 2003. "Organised Labour Market in India: Pre and Post Reform." University of Delhi, Delhi School of Economics.
World Bank. 2000. India: Policies to Reduce Poverty and Accelerate Sustainable Development. Report 19471-IN. Washington, D.C.

Can We Discern the Effect of Globalization on Income Distribution? Evidence from Household Surveys

Branko Milanovic

New data derived directly from household surveys are used to examine the effects of globalization on income distribution in poor and rich countries. The article looks at the impact of openness (proxied by the ratio of trade to GDP) and of direct foreign investment on relative income shares across the entire income distribution. It finds strong evidence that at low average income levels, the income share of the poor is smaller in countries that are more open to trade. As national income levels rise, the incomes of the poor and the middle class rise relative to the income of the rich. The article explains why using the trade to GDP ratio in purchasing power parity terms, as favored by some analysts, is inappropriate in studies of the effect of trade on income distribution.

The effect of globalization on income inequality has received widespread attention in the past decade. Most of it was concentrated on the effects on wage and income inequality in the United States, Western Europe, and other rich countries (Slaughter and Swagel 1997; Dluhosch 1998; Schott 2001; Lejour and Tang 1999).
A second strand of the literature has focused on how globalization affects world income distribution through differences in mean per capita growth rates (Milanovic 2004; Milanovic and Yitzhaki 2002; Melchior and others 2000; Schultz 1998; Sala-i-Martin 2002). Only recently has there been more interest in how globalization affects income distribution within developing economies (Cornia and Kiiski 2002; Lustig and Kanbur 1999; Ravallion 2001; Galbraith and Kum 2002).

Branko Milanovic is lead economist in the Development Research Group at the World Bank and senior associate at the Carnegie Endowment for International Peace; his email addresses are bmilanovic@worldbank.org and bmilanovic@ceip.org. He is grateful to Dimitri Kaltsas, Gouthami Padam, and Prem Sangraula for excellent research assistance. The article has been much improved thanks to the comments of Jaime de Melo, Richard Freeman, James K. Galbraith, Aart Kraay, Martin Ravallion, Peter Rundell, Alan Winters, three anonymous referees, and two anonymous editorial board members. The author is also grateful for comments and suggestions from participants at the Conference on Globalization and Inequality at the Brookings Institution in Washington, D.C., in June 2002; the Massachusetts Avenue Development Seminar at the Center for Global Development in Washington, D.C., in November 2003; and the Conference on Globalization, Poverty, and Inequality at the University of Utah in November 2004. The article was written as part of research project 857-25 financed by a World Bank Research Grant.

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 1, pp. 21–44. doi:10.1093/wber/lhi003. © The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.
There are theoretical models of how trade affects income distribution (Wood 1994, 2000; Benarroch and Gaisford 1997; Kremer and Maskin 2003).[1] Detailed empirical analyses of the effects of economic change, including market reforms and increased international integration, on within-country income distribution are essentially limited to Latin America, however. Harrison and Hanson (1999) and Robertson (2000) study wage inequality in the wake of Mexican trade reforms. Beyer and others (1999) look at a similar issue in Chile. Arbache (1999) studies the effect of market liberalization on sectoral wage dispersion in Brazil. Behrman and others (2003) assess the impact of various policy changes (including trade liberalization and capital account opening) on wage differentials in Latin American countries. But there are relatively few studies of the impact of openness on income distribution in both poor and rich countries, which is the objective of this article.[2]

Two recent studies by World Bank researchers (Lundberg and Squire 2003; Dollar and Kraay 2002) look at the relationship between openness and growth and find conflicting evidence on the relationship between openness and inequality. Lundberg and Squire (1999, 2003) consider growth and inequality to be determined simultaneously. They find that openness, measured by the Sachs-Warner (0–1) indicator, has either no effect or a mild negative effect on equality (that is, it mildly increases inequality).[3] Barro (2000) and Ravallion (2001) find statistically significant nonlinearity in the relationship between openness and inequality, with openness associated with increased inequality in poor countries. In a somewhat different twist, Spilimbergo and others (1999), controlling for countries' endowments in skilled labor, capital, and land, find that openness reduces inequality in capital-rich countries while increasing inequality in countries with abundant skilled labor.
They argue that the effect in capital-rich countries is driven by the reduction of capital rents once domestic capital markets open up, whereas the effect in labor-rich countries is consistent with the Heckscher-Ohlin framework.

Dollar and Kraay (2000, 2002) reach a different conclusion. Using an unbalanced panel covering the same period and similar countries as Lundberg and Squire (2003), they find that openness (defined as exports plus imports as a share of GDP)[4] is positively associated with per capita income growth and that this effect is the same for the bottom income quintile as for the mean; that is, trade has no systematic impact on inequality. Because trade is good for growth, the effects across all income groups are positive and the same, where "same" means that each decile's gain is proportional to its initial income. (The rich benefit more in absolute but not in relative terms.) In a similar vein, Birdsall and Londono (1997, 1998) report no differences in growth in income between the poorest and other quintiles due to trade variables, although initial distributions of land and education do matter. Finally, Li and others (1998), in a sensitivity run of their main model, use the ratio of exports to GDP (a proxy for openness) as an explanatory variable for the Gini coefficient. They also find no statistically significant effect of openness on the Gini coefficient.

1. For a recent review of the evidence on the relationship between trade and poverty and its compatibility with expectations based on theory of international trade, see Winters and others (2004).

2. The terms openness and globalization are used interchangeably in this article.

3. They also find that when growth and inequality are determined simultaneously, openness is a tradeoff variable: its effect is positive for growth and negative for equality.

4. With GDP measured in PPP terms, a feature with important implications, as explained later.
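The sense in which a proportional gain is "the same" for every income group can be illustrated with a small numerical sketch. The quintile incomes and the 20 percent growth rate below are invented for illustration and come from none of the studies cited; the helper names `income_shares` and `gini` are likewise hypothetical. The Gini coefficient is computed with the standard rank-weighted formula for equally sized groups.

```python
def income_shares(incomes):
    """Share of total income received by each group."""
    total = sum(incomes)
    return [x / total for x in incomes]


def gini(incomes):
    """Gini coefficient for equally sized groups (rank-weighted formula)."""
    xs = sorted(incomes)
    n = len(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * sum(xs)) - (n + 1) / n


# Hypothetical mean incomes of the five quintiles, poorest to richest.
before = [5.0, 10.0, 15.0, 25.0, 45.0]
growth = 0.20  # every group's income grows by the same proportion
after = [x * (1 + growth) for x in before]

absolute_gains = [b - a for a, b in zip(before, after)]
print("Absolute gains:", [round(g, 1) for g in absolute_gains])
print("Shares before: ", [round(s, 2) for s in income_shares(before)])
print("Shares after:  ", [round(s, 2) for s in income_shares(after)])
print("Gini before and after:", round(gini(before), 3), round(gini(after), 3))
```

In this toy example the richest group gains nine times as much in absolute terms as the poorest, yet every income share and the Gini coefficient are unchanged, which is why a finding of equiproportional gains is read as no effect on inequality.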
These different findings (summarized in table 1) have generated intense discussion. Dollar and Kraay (2000) address some empirical and methodological differences between their study and that of Lundberg and Squire (1999). A recent study by Ravallion (2004) attempts to uncover the source of the differences in results and to "reconcile" their findings. He is doubtful about both types of findings because the studies depend on fairly noisy data and work with averages only. According to Ravallion, generalizations are difficult because the heterogeneity in countries' underlying conditions is too great.

In any case, cross-country studies yield inconsistent results on the effects of openness on inequality. On the one side, Li and others (1998), Birdsall and Londono (1998), and Dollar and Kraay (2001, 2002) find that openness has no systematic and significant effect on inequality. On the other side, Lundberg and Squire (1999), Barro (2000), and Ravallion (2001) find that openness has a negative effect on equality in poor countries and that in some of the formulations it has a negative effect on the real income of the poor as well. The conclusions run nearly the full gamut, from openness reducing the real income of the poor, to openness raising the income of the poor proportionately less than the income of the rich, to raising both the same in relative terms. Note, however, that there are no results that show openness reducing inequality, that is, raising the real incomes of the poor proportionately more than the incomes of the rich, let alone raising the absolute incomes of the poor by more.

I. THE NEW DATABASE

This article provides additional empirical evidence on how globalization affects income distribution in developed and developing economies, using the newly developed database, World Income Distribution (WYD) (available online at www.worldbank.org/research/inequality/data).
The data are drawn almost entirely from household-level surveys, giving the database two main advantages over earlier income distribution databases such as that of Deininger and Squire (1997) and WIDER (2004).5 One is the ability to define welfare aggregates as well as recipient units consistently across countries and time, and the other is the provision of information on income levels by deciles (or even finer partitions), which can end the reliance solely on such synthetic inequality measures as the Gini coefficient and the Theil index. The ability to look at the entire distribution, at what is happening behind a change in one summary statistic, is crucial for getting a better grasp of the effects of globalization.

WYD is very rich cross-sectionally. It includes 321 surveys with decile data for 95 countries in 1988 and 113 countries in 1993 and 1998. It covers only these three benchmark years, however. All incomes are expressed in international dollars (in purchasing power parity, PPP, terms). The WYD data cover more than 95 percent of world GDP (income) and around 90 percent of world population. Coverage is almost complete for all geographical regions except Africa. For Africa the 1998 data cover more than two-thirds of the population and income, although the proportion is smaller for the 1988 data.

5. Data for about two-thirds of country/years are calculated directly from household surveys, a much higher proportion than in other databases (Deininger and Squire 1997 or WIDER 2004), which depend heavily on published, not necessarily mutually consistent, sources. Additional details about the data sources and surveys are available in Milanovic (2004, chap. 9 and 10) and Milanovic (2002, appendix 1); the data are available online at www.worldbank.org/research/inequality/data.

TABLE 1. Comparison of Various Studies of Openness and Inequality

| Study | Period | Sample, Number of Observations, and Number of Countries | Definition of Openness | Welfare or Inequality Measure (source of data) | Effect of Openness on Inequality |
|---|---|---|---|---|---|
| Lundberg and Squire (2003) | 1960-98 | Unbalanced panel; 5-year intervals; 119 observations; 38 countries | Sachs-Warner measure | Income share of bottom quintile or Gini (Deininger and Squire 1997) | Mildly proinequality |
| Dollar and Kraay (2002) | 1960-99 | Unbalanced panel; 5-year intervals; 285 observations; 92 countries | Trade/GDP in PPP terms | Income share of bottom quintile (mostly WIDER 2004) | Insignificant |
| Ravallion (2001) | 1960-94 | Unbalanced panel; 5-year intervals; 159 observations | Export/GDP in current dollars | Gini from Deininger and Squire (1997) | Proinequality in poor countries |
| Barro (2000) | 1960-90 | Balanced panel; 10-year intervals; 214 observations | Trade/GDP adjusted for country size | Gini from Deininger and Squire (1997) | Proinequality in poor countries |
| Spilimbergo and others (1999) | 1965-92 | Unbalanced panel; 320 observations; 34 countries | Own index of openness adjusted for endowments | Gini from Deininger and Squire (1997) | Proinequality in skilled labor-rich countries; proequality in capital-rich countries |
| Li and others (1998) | 1960-94 | Unbalanced panel; 5-year intervals; 159 observations | Export/GDP in current dollars | Gini from Deininger and Squire (1997) | Insignificant |

Source: Author's compilation.

II. CHANNELS OF INFLUENCE ON THE ENTIRE INCOME DISTRIBUTION AND ESTIMATION ISSUES

By definition, the absolute income level of the ith decile in country j at time t can be written as a function of an inequality index ($I_{jt}$) and mean income of the country ($m_{jt}$).6

(1) $y_{ijt} = f(I_{jt}, m_{jt})$.

The relative income of the ith decile (normalized by the mean) is then7

(2) $y_{ijt}/m_{jt} = g(I_{jt})$.

The level of the inequality index is then assumed to depend on the levels of the following variables:

• Two "standard" globalization variables: openness ($\mathrm{OPEN}_{jt}$), measured as the sum of exports and imports as a share of the country's GDP, and direct foreign investment as a share of GDP ($\mathrm{DFI}_{jt}$);
• Financial depth ($\mathrm{FD}_{jt}$), the ratio of M2 to GDP, introduced on the assumption that greater financial depth should ease the financial constraint on borrowing for education, and thus should help those who are talented but lack resources (see, for example, Li and others 1998); and
• An indicator of democracy ($\mathrm{DEM}_{jt}$), introduced on the assumption that democratization, through the median voter hypothesis, should lead to greater redistribution and a reduction in inequality (Milanovic 2000; literature review in Gradstein and Milanovic 2004).

6. Deciles go from the poorest, 1, to the richest, 10.
7. Moving from equation 1 to equation 2 implies the homogeneity assumption.

The use of the trade to GDP ratio as a measure of globalization has been criticized, although in a somewhat different context, by several researchers (Rodrik 2000; Birdsall and Hamoudi 2002; Lubker and others 2002). There are two key critiques. First, openness is an outcome that governments cannot directly control, not a policy or choice variable such as the tariff level, yet it is often presented, implicitly at least, as a policy. The trade ratio may decline not because the country follows a more closed policy but, for example, because of balance of payments difficulties, as happened when the terms of trade for commodity producers collapsed in the early 1980s (Birdsall and Hamoudi 2002). A failure to consider this exogenous shock could result in falsely ascribing the growth slowdown in the 1980s to decreased "openness." Second, the trade to GDP ratio is often treated as a determinant of growth, whereas causality may run in the opposite direction, from growth to trade.
Both criticisms are valid, but they do not affect the use of openness as a variable to explain inequality, as is done here, where the concern is not with policies (with whether a country follows open policies or not) or with the growth-trade causality. Rather, the concern is solely with how a given level of trade, whether achieved through open or closed policies, affects the distribution of income. Openness is not taken to be a choice variable but is considered only for its possible impact on income distribution.

What of the other variables? Financial depth and democracy are not thought to be linked directly with globalization even though such a view could be plausibly entertained. For example, increased financial depth (increased monetization of the economy) can be regarded as proceeding directly from better integration of a country into the international economy, and democratization can be thought to occur in response to greater international exchanges. However, these two variables are used here as controls for the nonglobalization-related part of the influence on income distribution and as orthogonal to the globalization-proper variables. They are introduced primarily to avoid misspecification of the model.

In this formulation there is no role for income as an explanatory variable. The argument that income affects inequality and should be included on the right side of the equation is based on some variant of the Kuznets-type relationship. Whether or not one subscribes to the Kuznets hypothesis, it is clear that income serves only as a proxy for several structural changes: the movement of labor (from agriculture, where income is more equally distributed, to industry, where it is less equally distributed), educational change (increasing share of highly skilled people and a decreasing education premium), or demographic change (increasing share of the elderly and rising social transfers). These structural changes are all associated with rising GDP per capita.
Once the equation controls for such structural correlates of income as financial deepening and democracy, income plays no additional independent role.

However, the possibility has to be taken into account that the globalization variables will have different effects on the share of a given decile depending on a country's level of development. The simple Stolper-Samuelson theorem implies that increased openness and direct foreign investment would benefit low-skilled workers in poor countries (for caveats and why this may be a special case, see Winters and others 2004, pp. 73 and 97). Thus, in poor countries the signs for the OPEN and DFI variables would be expected to be positive among the bottom deciles (increasing their income shares). In rich countries the situation would be the reverse. Openness would expose low-skilled workers to increased foreign competition, so the signs among the bottom deciles for the OPEN and DFI variables would be expected to be negative. The coefficients of the two globalization variables will therefore vary as a function of the income level of the country.

Ideally, of course, the coefficients should vary as a function of the skill composition of each income decile and each country's income level. However, because the data do not include information on the individual composition of each decile or on the skill composition of people in each decile, the country's income level is interacted with the openness variable. Barro (2000), Ravallion (2001), and Dollar and Kraay (2002) have all used interaction between openness and income. Thus the equation can be written (omitting time subscripts) for each decile:

(3) $y_{ij}/m_j = \beta_{i0} + \beta_{i1}\,\mathrm{OPEN}_j + \beta_{i2}\,m_j + \beta_{i3}(\mathrm{OPEN}_j \cdot m_j) + \beta_{i4}\,\mathrm{DFI}_j + \beta_{i5}(\mathrm{DFI}_j \cdot m_j) + \sum_k \beta_{ik} X_k + e_{ij}$,

where the X's stand for other controls. In the most parsimonious formulation these other controls are financial depth and democracy. The $\beta$ coefficients vary across deciles and are thus subscripted.
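One implication of the interaction term in equation 3 can be made explicit. Writing decile i's coefficient on openness as $\beta_{i1}$ and its coefficient on the openness-income interaction as $\beta_{i3}$ (this notation is a reconstruction, with mean income $m_j$ assumed to be in thousands of PPP dollars), differentiating with respect to openness gives the marginal effect of openness and the income level $m_j^{*}$ at which that effect changes sign:

```latex
\[
\frac{\partial\,(y_{ij}/m_j)}{\partial\,\mathrm{OPEN}_j}
  = \beta_{i1} + \beta_{i3}\, m_j ,
\qquad
m_j^{*} = -\frac{\beta_{i1}}{\beta_{i3}} \quad (\beta_{i3} \neq 0).
\]
```

If $\beta_{i1} < 0$ and $\beta_{i3} > 0$, openness lowers decile i's relative income in countries with mean income below $m_j^{*}$ and raises it above. This is only a sketch of the algebra; it says nothing about how precisely the implied turning point is estimated.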
Ten pooled cross-section regressions, one for each income decile, with the same independent variables are run across all countries. Regressions such as equation 3 can be run independently (with one omitted) or as a simultaneous system (seemingly unrelated regressions) with a constraint.8 The constraint ensures that increases in the shares of some deciles are balanced by decreases in the shares of others. Because of likely autocorrelation of shares (within countries and across years), the regressions are run with robust (Huber/White) standard errors.

8. If the slopes are assumed to be homogenous across countries and the intercepts are "fixed" (different between countries), a fixed-effect (FE) estimator could be used. If this estimator were used, it could be argued that the marginal effects of openness (and other explanatory variables) are the same across countries, so inequality could be determined (by varying the intercept) by other unobservable country-specific effects. This seems reasonable, but because the panel is very short (three observations only) and shares within countries change very slowly, most of the data variability is contained in cross-sectional observations. Thus, the use of the FE estimator yields poor results.

There are two additional problems: endogeneity and robustness of the results to the introduction of other variables. The endogeneity problem may plague both the openness and other right-side variables. Inequality might influence financial depth, democracy, or government spending (introduced later). To adjust in part for endogeneity, all right-side variables are calculated as five-year averages. There is also a substantive reason for that: to reflect the fact that openness or financial depth does not affect income distribution instantaneously.
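A minimal sketch of the five-year averaging follows, assuming (as the descriptive-statistics section specifies) that each window ends in the benchmark year. The function name, the handling of missing years, and the toy trade-to-GDP series are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean

BENCHMARK_YEARS = (1988, 1993, 1998)


def trailing_average(series, end_year, window=5):
    """Mean of `series` over the `window` years ending in `end_year`.

    `series` maps year -> value. Years missing from the series are skipped,
    so the average uses whatever observations exist inside the window
    (an assumption; the article does not say how gaps were handled).
    """
    years = range(end_year - window + 1, end_year + 1)
    values = [series[y] for y in years if y in series]
    if not values:
        raise ValueError(f"no observations in window ending {end_year}")
    return mean(values)


# Hypothetical trade-to-GDP ratios for one country (illustrative numbers only).
toy_openness = {year: 0.60 + 0.01 * (year - 1984) for year in range(1984, 1999)}

# One averaged regressor value per benchmark year.
openness5 = {b: trailing_average(toy_openness, b) for b in BENCHMARK_YEARS}
```

Averaging over a trailing window, rather than a centered one, keeps the regressor dated no later than the income distribution it is meant to explain.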
Endogeneity is also addressed more fully by instrumenting the possibly endogenous variables by their lagged values and other instruments and by using a generalized method of moments (GMM) estimator whose efficiency properties are superior to those of the traditional instrumental variable/two-stage least squares estimators.

The robustness of the results can be questioned because the right-side variables may not include all relevant variables that can affect income shares. As a check on the robustness of the results, the parsimonious formulation is extended by adding government spending as a share of GDP and the real rate of interest. Government spending is expected to be propoor, and the real interest rate, due to the typically high concentration of capital assets in the hands of the income-rich, is expected to be prorich. Control variables are added for regional effects as well because one of the strong results of the inequality literature is that there are regional patterns in income distribution (Latin American and African countries tend ceteris paribus to have more unequal income distributions and Asian countries more equal; see Higgins and Williamson 1999; Fields 2001; Milanovic 1994).9 Broadening the range of the control variables also addresses another potential source of endogeneity: having another omitted variable jointly determine inequality and the right-side variables. Increasing the number of controls on the right side tends to cover most of the bases for such an effect.

III. DESCRIPTIVE STATISTICS

Before globalization and other macro-variables are linked to changes in income distribution, the variables need to be defined more precisely. Income distribution is based on data on annual per capita incomes in PPP dollars of each decile from the 321 surveys and 129 countries in total (with 82 countries being a balanced panel) for the benchmark years 1988, 1993, and 1998. Each decile contains 10 percent of individuals, not households.
The dependent variable is defined as the ratio between decile mean income and country mean income. All right-side variables are calculated as averages over five-year periods rather than single values for 1988, 1993, and 1998, for two reasons. First, the distribution data are only benchmarked in 1988, 1993, and 1998. The surveys used to calculate the decile data might have been conducted before or after the benchmark year (say, 1986 or 1989 rather than 1988).10 Thus the "averaging" for the dependent variable is accompanied by a similar averaging of the controls. Second, even if all the surveys were conducted the same year, there would be some advantage in relating changes in income shares to several years' average share of exports and imports in GDP. This is done to avoid having the results swamped by noise, that is, very short-run changes that cannot have much influence on a sluggish variable like income distribution. Thus openness that is associated with income distribution around 1988 is taken as the average of exports and imports to GDP during the five-year period ending in 1988. The same is done for income distribution in 1993 and 1998. Identical calculations are done for other right-side variables.

9. This point was made by an anonymous referee.
10. Overall, however, more than 70 percent of the surveys are within a year of the benchmark, and more than 90 percent of surveys are within two years of the benchmark.

Mean-normalized average incomes were calculated for each decile in 1988, 1993, and 1998 (table 2). For example, in 1988 the bottom decile's income was 30.7 percent of the mean calculated across all countries and 30.3 percent of the mean calculated across the common-sample countries.11 By 1993 the bottom decile's income had fallen to 23.5 percent of the mean for all countries and 24.4 percent of the mean for the common-sample countries. By 1998 it had declined even further, to 23.3 percent of the mean for both groups.
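The dispersion behind these figures can be summarized by the ratio of the top decile's mean-normalized income to the bottom decile's. The sketch below recomputes that ratio from the first- and tenth-decile values reported in table 2; the dictionary layout and the function name are illustrative choices rather than anything from the original analysis.

```python
# (first-decile, tenth-decile) mean-normalized incomes as reported in table 2.
TABLE2 = {
    "all countries": {
        1988: (0.307, 2.745),
        1993: (0.235, 3.156),
        1998: (0.233, 3.138),
    },
    "balanced panel": {
        1988: (0.303, 2.757),
        1993: (0.244, 2.973),
        1998: (0.233, 3.068),
    },
}


def decile_ratio(first, tenth):
    """Ratio of the tenth decile's average income to the first decile's."""
    return tenth / first


# Round to one decimal place, the precision used in table 2.
ratios = {
    sample: {year: round(decile_ratio(*pair), 1) for year, pair in by_year.items()}
    for sample, by_year in TABLE2.items()
}

for sample, by_year in ratios.items():
    print(sample, by_year)
```

The rounded results agree with the decile-ratio row of table 2: 8.9, 13.4, and 13.5 for all countries, and 9.1, 12.2, and 13.2 for the balanced panel.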
Between 1988 and 1993 the relative incomes of the bottom eight deciles declined, with the largest decline among the poorest deciles, and the relative income of the top two deciles rose, with the greatest increase among the very top. The situation changed between 1993 and 1998, when deciles two through seven gained, whereas the very bottom decile and the top three deciles lost (all in relative terms).

Figure 1 illustrates the recent upsurge in globalization as reflected in the openness variable (ratio of trade to GDP in current dollar terms) and the increased importance of direct foreign investments as a percentage of recipient countries' GDP. There is a sustained increase in the (unweighted) openness ratio from around 70 percent in the mid-1980s to more than 90 percent at the turn of the century. The dollar-weighted share of trade in world GDP (not shown in the figure) increased from 38 percent to 44 percent. The higher unweighted ratio of trade to GDP reflects the fact that trade shares are greater for smaller (and poorer) countries. Even more dramatic was the increase in unweighted foreign direct investments, from less than 1 percent of GDP in the late 1980s to 4 percent in 2000.

Of less interest are the other control variables, financial depth and democracy. Financial depth is measured simply as the ratio of M2 to GDP. Democracy is measured by the democracy variable from the Polity IV database and takes a value from 0 (absence of democracy) to 10 (best).12

11. Each country is one observation regardless of its population size.
12. The database was created by Monty Marshall, Keith Jaggers, and Ted Gurr. The data are available online at www.cidcm.umd.edu/inscr/polity. Democracy is defined as "general openness of political institutions." Financial depth is measured using M2 (the variable 35L..ZF, money plus quasi-money from International Financial Statistics) and nominal GDP.
TABLE 2. Mean-Normalized Average Incomes by Decile (across countries, not weighted for population)

| Decile | All Countries 1988 | All Countries 1993 | All Countries 1998 | Balanced Panel 1988 | Balanced Panel 1993 | Balanced Panel 1998 |
|---|---|---|---|---|---|---|
| First | 0.307 | 0.235 | 0.233 | 0.303 | 0.244 | 0.233 |
| Second | 0.441 | 0.375 | 0.380 | 0.437 | 0.391 | 0.387 |
| Third | 0.539 | 0.476 | 0.482 | 0.535 | 0.495 | 0.491 |
| Fourth | 0.635 | 0.571 | 0.581 | 0.631 | 0.593 | 0.590 |
| Fifth | 0.736 | 0.677 | 0.686 | 0.733 | 0.701 | 0.697 |
| Sixth | 0.855 | 0.804 | 0.810 | 0.853 | 0.831 | 0.821 |
| Seventh | 1.000 | 0.959 | 0.962 | 1.000 | 0.984 | 0.972 |
| Eighth | 1.201 | 1.182 | 1.181 | 1.202 | 1.207 | 1.188 |
| Ninth | 1.541 | 1.566 | 1.552 | 1.548 | 1.580 | 1.553 |
| Tenth | 2.745 | 3.156 | 3.138 | 2.757 | 2.973 | 3.068 |
| Number of countries | 95 | 113 | 113 | 82 | 82 | 82 |
| Decile ratio (a) | 8.9 | 13.4 | 13.5 | 9.1 | 12.2 | 13.2 |

Note: The balanced panel consists of the common sample countries. Deciles are formed based on per capita income or expenditures obtained from household surveys.
a. The ratio of the average income of the tenth decile to the first decile.
Source: Author's computations based on household survey data from the WYD database.

IV. ESTIMATION OF THE REGRESSIONS

Ten-level regressions are estimated, starting with the parsimonious formulation and moving to an extended formulation that includes real rate of interest and government expenditures as a share of GDP.13 Two types of estimation are performed: simultaneous decile estimation and instrumental variable GMM estimation, which instruments openness and government expenditure as a share of GDP by their lagged values and the country's population. The results of the different regressions are quite similar, so only the GMM estimates of the extended formulation are discussed here.14

The regression is an unbalanced panel run across 138 decile shares in 1988, 1993, and 1998 (table 3).15 The Hansen J statistic (test of overidentifying restrictions) is insignificant throughout, so the validity of the instruments cannot be rejected.16 The F-test of excluded instruments (not reported in the table) is highly significant (value greater than 1,200), again implying that the instruments are appropriate.

13. The nominal interest rate is the deposit rate on 12-month deposits as reported in the International Monetary Fund's (IMF) International Financial Statistics (various issues; the variable is 60L..ZF). The real rate is obtained by deflating the nominal rate by the 12-month consumer price index (also as reported in International Financial Statistics). Government expenditures are the sum of central (consolidated accounts), local, and state or provincial government expenditures. The data are from the IMF's Government Financial Statistics.
14. The full results are available online at www.worldbank.org/research/inequality.
15. There are 321 total surveys. Some drop out because they lack other right-side variables. The dropout rate is much lower for the parsimonious formulation, which is run across 201 decile shares. The fact that key results are virtually identical for both samples is reassuring.

FIGURE 1. Average Annual Trade to GDP Ratio (left scale) and Direct Foreign Investment to GDP Ratio (right scale) for a Large Sample of Countries (percent; unweighted, calculated in current dollars)
[Figure: horizontal axis, year (1985 to 2000); left axis, trade/GDP in percent; right axis, DFI/GDP in percent.]
Note: The data on trade shares include nearly all countries in the world (the number ranges from 125 to 150). The data on foreign investment inflows include about 80 countries. Each country/year is one observation.
Source: For trade to GDP ratio, author's calculation based on World Bank data (World Development Indicators and Statistical Information Management and Analysis database); for direct foreign investment to GDP ratio, author's calculations based on UNCTAD (1996, 1997, 2000).

The results are as follows. Increased openness reduces the income shares of the bottom six deciles.
The negative effect of openness is smaller in richer countries, because the interaction term between openness and mean income is positive. Openness would therefore seem to have a particularly negative impact on poor and middle-income groups in poor countries, directly opposite to what would be expected from the standard Heckscher-Ohlin-Samuelson framework. Only when income level (calculated from household surveys) reaches about PPP $7,500 (about the level of Spain and Israel) does openness become a good thing for poor and middle-income groups by raising their share in total income.17

16. Hansen's J statistic is consistent in the presence of heteroscedasticity, whereas its alternative, Sargan's statistic, is not.

TABLE 3. Explaining Mean-Normalized Decile Incomes for 1988, 1993, and 1998 (GMM/instrumental variable estimation)

| Variable | First | Second | Third | Fourth | Fifth | Sixth | Seventh | Eighth | Ninth | Tenth |
|---|---|---|---|---|---|---|---|---|---|---|
| Openness5 | 0.102** | 0.137** | 0.138** | 0.135** | 0.118** | 0.094** | 0.065 | 0.001 | 0.090 | 0.695** |
| | (0.029) | (0.005) | (0.005) | (0.005) | (0.007) | (0.019) | (0.055) | (0.978) | (0.086) | (0.013) |
| Expgdp5 | 0.244** | 0.304** | 0.297** | 0.286** | 0.263** | 0.213** | 0.150** | 0.043 | 0.147** | 1.637** |
| | (0) | (0) | (0) | (0) | (0) | (0) | (0) | (0.148) | (0.014) | (0) |
| Mean income (in PPP $000) | 0.003 | 0.001 | 0.001 | 0.001 | 0.0002 | 0.001 | 0.001 | 0.003 | 0.0009 | 0.001 |
| | (0.587) | (0.841) | (0.861) | (0.861) | (0.971) | (0.829) | (0.741) | (0.331) | (0.889) | (0.969) |
| Openness5 * mean income | 0.014** | 0.020** | 0.019** | 0.019** | 0.017** | 0.013** | 0.008 | 0.001 | 0.019** | 0.092** |
| | (0.049) | (0.014) | (0.016) | (0.011) | (0.012) | (0.033) | (0.106) | (0.8) | (0.04) | (0.028) |
| M2gdp5 | 0.097** | 0.128** | 0.116** | 0.102** | 0.089** | 0.081** | 0.068** | 0.048 | 0.001 | 0.749** |
| | (0.008) | (0.001) | (0.003) | (0.006) | (0.01) | (0.008) | (0.012) | (0.056) | (0.976) | (0.001) |
| DFI5 | 0.003 | 0.001 | 0.002 | 0.003 | 0.005 | 0.007 | 0.008 | 0.008 | 0.007 | 0.037 |
| | (0.596) | (0.833) | (0.737) | (0.622) | (0.441) | (0.265) | (0.148) | (0.08) | (0.239) | (0.335) |
| DFI5 * mean income | 0.003** | 0.003** | 0.002 | 0.002 | 0.002 | 0.001 | 0.0005 | 0.0005 | 0.003** | 0.011 |
| | (0.034) | (0.032) | (0.113) | (0.11) | (0.168) | (0.365) | (0.737) | (0.734) | (0.031) | (0.239) |
| Democracy5 | 0.002 | 0.004 | 0.006 | 0.007** | 0.008** | 0.008** | 0.007** | 0.005 | 0.0005 | 0.043 |
| | (0.646) | (0.288) | (0.102) | (0.037) | (0.015) | (0.007) | (0.006) | (0.083) | (0.901) | (0.056) |
| Rint5 | 0.001** | 0.002** | 0.002** | 0.002** | 0.002** | 0.002** | 0.001** | 0.001** | 0.002** | 0.010** |
| | (0.003) | (0) | (0) | (0) | (0) | (0) | (0.002) | (0.017) | (0.013) | (0) |
| Constant | 0.164** | 0.237** | 0.335** | 0.432** | 0.540** | 0.676** | 0.858** | 1.121** | 1.626** | 4.031** |
| | (0) | (0) | (0) | (0) | (0) | (0) | (0) | (0) | (0) | (0) |
| Hansen J | 0.317 | 0.031 | 0.342 | 0.671 | 0.891 | 0.955 | 1.070 | 0.946 | 1.810 | 0.978 |
| | (0.573) | (0.860) | (0.558) | (0.413) | (0.345) | (0.328) | (0.301) | (0.3306) | (0.178) | (0.323) |
| Number of observations | 135 | 138 | 138 | 138 | 138 | 138 | 138 | 138 | 138 | 138 |
| Centered R2 | 0.3326 | 0.5015 | 0.5248 | 0.5431 | 0.5569 | 0.5491 | 0.5011 | 0.2537 | 0.2491 | 0.5234 |

**Significant at the 1 or 5 percent level.
Note: Columns are income deciles. The dependent variable is the decile mean income/overall mean income. Numbers in parentheses are p-values. Openness and government expenditure as share of GDP are instrumented. GMM calculations are performed using the ivreg2.ado routine developed by Baum and others (2002). A suffix of 5 indicates a five-year average. Government expenditures, openness, and M2 are expressed as a share of GDP (such as 0.3 not 30 percent); DFI/GDP is expressed as a percentage. Real rate of interest is expressed as an annual percentage. Mean income is expressed in 1995 PPP dollars. Regressions are run with robust standard errors.
Source: Author's computations based on household survey data from the WYD database.

How large is the openness effect? Consider a poor country with a mean income of PPP $2,000 per capita and whose second decile's share of income is about 4 percent (an average value in the sample). The second decile's mean per capita income is therefore PPP $800.
An increase from 0.7 to 0.9 in the trade to GDP ratio (an average change between 1985 and 2000) reduces the decile's share of income to about 3.8 percent and its mean per capita income to PPP $760 in absolute terms (of course, absent any other effect, including a change in total income).18

Direct foreign investment is not statistically significant, whether alone or interacted with income.19 Neither is real mean income alone. Financial depth, as expected, increases the income share of the poor and middle class and reduces the share of the top decile. Democracy positively affects the income shares of the middle deciles. An interesting result, this suggests that earlier work failed to detect the effect of democracy on inequality (Bollen and Jackman 1985; Gradstein and others 2001) because democracy affects primarily the income shares of the middle groups while leaving the shares of the top and the bottom deciles unchanged.20 As a consequence, synthetic inequality measures, such as the Gini coefficient, may not show much change.

The real interest rate is shown to be statistically significant throughout and strongly antipoor: It reduces the share of the bottom eight deciles and raises that of the top two. How strong is this effect? The income share of the top decile is about 30 percent. Each percentage point increase in the real rate of interest raises that share by about 0.1 percentage point. In other words, the real income of the rich (assuming total income remains fixed) goes up by one-third of 1 percent. Government expenditure plays a role directly opposite that of high real interest rates. A 10 percentage point increase in the ratio of government expenditure to GDP raises the bottom decile's share of the pie by 0.24 percentage point, about one-tenth of what the bottom decile received on average (see table 2).

Next, the sensitivity of these results to changes in specification is tested.
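The magnitudes quoted in this discussion can be checked with a few lines of arithmetic. The sketch below redoes the three back-of-the-envelope calculations using coefficient magnitudes from table 3. One assumption is made: the second-decile openness coefficient is entered with a negative sign, following footnote 18's statement that the decile ratio goes down. The conversion from mean-normalized income to income share uses the fact that each decile holds 10 percent of individuals. Function and variable names are illustrative.

```python
def openness_effect(beta_open, beta_interaction, mean_income_000, d_open):
    """Change in a decile's mean-normalized income for a change in openness.

    The marginal effect of openness is
    beta_open + beta_interaction * mean income (in thousands of PPP dollars).
    """
    return (beta_open + beta_interaction * mean_income_000) * d_open


# Second decile, table 3 magnitudes; the minus sign on openness is an
# assumption taken from footnote 18 (the decile ratio "will go down").
BETA_OPEN_D2 = -0.137
BETA_INTERACT_D2 = 0.020

mean_income = 2000.0   # PPP dollars per capita
ratio_before = 0.40    # decile mean / overall mean, i.e. a 4 percent share

delta = openness_effect(BETA_OPEN_D2, BETA_INTERACT_D2, mean_income / 1000, 0.9 - 0.7)
ratio_after = ratio_before + delta

share_after_pct = ratio_after * 10        # share in percent = ratio * 10
income_after = ratio_after * mean_income  # holding mean income fixed

# Real interest rate: tenth-decile coefficient of 0.010 per percentage point.
top_share_gain_pp = 0.010 * 10                      # percentage points of share
top_income_gain_pct = top_share_gain_pp / 30 * 100  # relative to a 30% share

# Government spending: first-decile coefficient 0.244, spending up 0.10 of GDP.
bottom_share_gain_pp = 0.244 * 0.10 * 10

print(round(share_after_pct, 2), round(income_after),
      round(top_income_gain_pct, 2), round(bottom_share_gain_pp, 2))
```

Under these assumptions the second decile's share falls to roughly 3.8 percent and its income to roughly PPP $760, in line with the text; the small gap from the rounded figures arises because footnote 18 rounds the marginal effect to 0.1 before multiplying.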
When regional dummy variables (with Western Europe-North America-Oceania, or WENAO, as the reference category) and the five-year average rate of inflation are added, the general quality of the results improves and R2 rises to a rather high average value of 0.7 (table 4). The role of openness becomes sharper: It reduces the share of the bottom seven deciles and raises that of the top two. Its level of significance also becomes very high. The interaction term between openness and income is equally statistically significant almost throughout. The turning point now occurs around PPP $8,000. Government expenditures are still propoor, and the real rate of interest is prorich and a statistically significant predictor of income shares across the entire distribution. As expected from the literature, inflation is strongly significant and negatively affects the income shares of the poor and the middle class.

17. The turning point is somewhat lower using the formulation with financial depth and democracy only. It is present in all cases, however. One should attach much less confidence to the exact income level at which the turn occurs than to its existence. Note that Barro (2000, p. 28) finds the turning point to be around GDP per capita of PPP $13,000 (in 1985 prices).
18. This is obtained as follows. At PPP $2,000, the sensitivity of the decile ratio variable is -0.137 + (0.02 * 2) = -0.097, or about -0.1, which means that with a unitary increase in openness, the decile ratio will go down by 0.1 (see table 3). If openness now increases from 0.7 to 0.9, the effect will be (0.9 - 0.7) * (-0.1) = -0.02, or 0.2 percentage point of income share. So, the effect of a 20 percentage point increase in the trade share will be a decline of 0.2 percentage point in the second decile's income share.
19. Except for the bottom two deciles for the interaction term.
20. For an exception see Tavares and Wacziarg (2001).
The regional dummy variables for Latin America, and Eastern Europe and the former Soviet Union are significant almost throughout, the first being antipoor and the second propoor (compared with the omitted WENAO dummy variable). Both Africa and Asia show lower shares for the middle deciles. Finally, the sensitivity of the results to different definitions of openness is examined and compared with the results of Dollar and Kraay (2001, 2002). Several alternative definitions of openness are used here: the ratio of trade to GDP in constant U.S. dollars converted at market exchange rates (the volume of trade variable kopen in Penn World Table 6.1) and the ratio of exports to GDP in current prices. Both formulations yield the same results as the preferred measure does (trade to GDP in current dollars).21 Perhaps more interesting is to explore the difference between these results and those reported by Dollar and Kraay (2002). Their much-quoted article reaches two important conclusions: first, that the growth rate of the bottom quintile displays on average a unitary elasticity (meaning that the poor's percentage increase in income is the same as the mean increase), and second, that openness does not significantly affect the income share of the bottom quintile. This second finding differs from the results found here. The difference may be due to a difference in the sample (unlikely, however, because the samples include largely the same countries), the period covered (Dollar and Kraay's data go back to the 1970s), or the inequality measure used (that used for this study is better because the deciles are almost all calculated directly from individual household surveys, whereas Dollar and Kraay's quintile data come from a heterogeneous collection of sources, with many of them being extrapolations). The most important difference, however, is in the definition of the openness variable. This study defines openness as the trade to GDP ratio, both expressed in nominal dollar terms. 
Dollar and Kraay define openness rather unusually as the ratio between trade in 1985 dollars and GDP in 1985 international dollars. It is a consistent definition, because both the numerator and the denominator are in international dollars (trade is by definition conducted at international prices), but it is a different indicator from (say) the volume of trade as given in Penn World Table 6.1. There, trade is also in the numerator (in the same 1985 prices), but GDP is calculated in exchange rate dollars of the same year. The difference is important because using PPP dollars to express the denominator significantly increases GDP for poor countries and thus reduces the importance of trade in their GDP.

21. For reasons of space, the results are not reported here, but they are available from the author.

TABLE 4. Explaining Mean-Normalized Decile Incomes for 1988, 1993, 1998: Adding Regional Dummy Variables and Inflation (GMM/instrumental variable estimation)

| Variable | First decile | Second decile | Third decile | Fourth decile | Fifth decile | Sixth decile | Seventh decile | Eighth decile | Ninth decile | Tenth decile |
|---|---|---|---|---|---|---|---|---|---|---|
| Openness5 | 0.149** (0) | 0.199** (0) | 0.200** (0) | 0.199** (0) | 0.176** (0) | 0.143** (0) | 0.096** (0.002) | 0.012 (0.699) | 0.171** (0.001) | 1.032** (0) |
| Expgdp5 | 0.146** (0.013) | 0.187** (0.001) | 0.139** (0.008) | 0.111** (0.028) | 0.079 (0.1) | 0.027 (0.566) | 0.019 (0.668) | 0.053 (0.195) | 0.064 (0.255) | 0.481 (0.148) |
| Mean income (in PPP $000) | 0.010** (0.047) | 0.012** (0.018) | 0.014** (0.004) | 0.015** (0.001) | 0.014** (0) | 0.012** (0.001) | 0.010** (0.001) | 0.004 (0.25) | 0.012** (0.036) | 0.078** (0.002) |
| Openness5 * mean income | 0.019** (0.001) | 0.026** (0) | 0.025** (0) | 0.024** (0) | 0.022** (0) | 0.017** (0) | 0.011** (0.005) | 0.0005 (0.918) | 0.024** (0.005) | 0.123** (0) |
| M2gdp5 | 0.006 (0.854) | 0.007 (0.821) | 0.018 (0.546) | 0.026 (0.394) | 0.030 (0.29) | 0.037 (0.179) | 0.040 (0.116) | 0.041 (0.14) | 0.054 (0.274) | 0.343 (0.081) |
| DFI5 | 0.0006 (0.88) | 0.003 (0.468) | 0.005 (0.161) | 0.004 (0.237) | 0.005 (0.212) | 0.006 (0.135) | 0.007 (0.106) | 0.008 (0.095) | 0.009 (0.129) | 0.038 (0.129) |
| DFI5 * mean income | 0.003** (0.038) | 0.002** (0.04) | 0.002 (0.051) | 0.002** (0.018) | 0.002** (0.02) | 0.001 (0.114) | 0.0005 (0.569) | 0.0005 (0.619) | 0.003** (0.031) | 0.011 (0.058) |
| Democracy5 | 0.003 (0.416) | 0.008** (0.001) | 0.009** (0) | 0.009** (0) | 0.009** (0.001) | 0.009** (0.001) | 0.008** (0.004) | 0.006 (0.052) | 0.001 (0.897) | 0.056** (0.003) |
| Rint5 | 0.001** (0.018) | 0.001** (0.004) | 0.0009** (0.004) | 0.001** (0.002) | 0.001** (0.001) | 0.001** (0.001) | 0.0009** (0.009) | 0.0007 (0.05) | 0.002** (0) | 0.005** (0.005) |
| Linf5 | 0.022** (0.003) | 0.032** (0) | 0.032** (0) | 0.033** (0) | 0.030** (0) | 0.026** (0) | 0.017** (0.005) | 0.008 (0.228) | 0.045** (0) | 0.159** (0) |
| Africa | 0.010 (0.801) | 0.028 (0.464) | 0.062 (0.082) | 0.083** (0.018) | 0.096** (0.005) | 0.099** (0.004) | 0.092** (0.008) | 0.041 (0.308) | 0.083 (0.134) | 0.475 (0.068) |
| Asia | 0.067 (0.091) | 0.026 (0.521) | 0.016 (0.633) | 0.041 (0.208) | 0.063** (0.034) | 0.083** (0.003) | 0.098** (0) | 0.086** (0) | 0.011 (0.761) | 0.301 (0.144) |
| Latin America | 0.082** (0.011) | 0.102** (0.001) | 0.124** (0) | 0.135** (0) | 0.140** (0) | 0.141** (0) | 0.130** (0) | 0.077** (0.001) | 0.063 (0.069) | 0.886** (0) |
| East Europe and former Soviet Union | 0.125** (0.001) | 0.140** (0) | 0.132** (0) | 0.116** (0) | 0.091** (0) | 0.063** (0.003) | 0.024 (0.224) | 0.024 (0.289) | 0.117** (0.002) | 0.572** (0) |
| Constant | 0.323** (0) | 0.480** (0) | 0.618** (0) | 0.739** (0) | 0.846** (0) | 0.966** (0) | 1.098** (0) | 1.244** (0) | 1.336** (0) | 2.298** (0) |
| Hansen J | 3.647 (0.058) | 2.936 (0.086) | 0.903 (0.342) | 0.044 (0.833) | 0.104 (0.746) | 0.432 (0.511) | 1.020 (0.312) | 1.569 (0.2104) | 2.992 (0.083) | 0.382 (0.536) |
| Number of observations | 135 | 138 | 138 | 138 | 138 | 138 | 138 | 138 | 138 | 138 |
| Centered R2 | 0.619 | 0.735 | 0.756 | 0.768 | 0.768 | 0.746 | 0.670 | 0.3407 | 0.444 | 0.725 |

**Significant at the 1 or 5 percent level.
Note: The dependent variable is the decile mean income/overall mean income. Numbers in parentheses are p-values. Openness and government expenditure as share of GDP are instrumented. GMM calculations are performed using the ivreg2.ado routine developed by Baum and others (2002). A suffix of 5 indicates a five-year average. Government expenditures, openness, and M2 are expressed as a share of GDP (such as 0.3 not 30 percent); DFI/GDP is expressed as a percentage. Real rate of interest is expressed as an annual percentage. Mean income is expressed in 1995 PPP dollars. The omitted region is Western Europe, North America, and Oceania (Australia and New Zealand). Regressions are run with robust standard errors.
Source: Author's computations based on household survey data from the WYD database.

When the previous regression is run with the Dollar and Kraay measure of openness,22 the results change in an important way (see table 5). Openness is no longer significant for any decile, nor is the interaction between openness and income. In other words, openness does not matter for income distribution.23 To see why there is a difference in the results obtained by these two measures of openness, consider how they behave when trade in poor countries expands.24 The effect on the Dollar and Kraay measure will be small because the bulk of these countries' GDP will still consist of nontradables that are valued at (high) international prices. But the effect on the trade to GDP ratio in both current and constant prices (at market exchange rates) may be large. Consider India and China. In nominal terms, the trade to GDP ratio in India increased from 16 percent in 1985 to 31 percent in 2000 and in China from 21 percent to 49 percent. In volumes (given by Penn World Table 6.1) the increase was from 19 percent of GDP to 25 percent for India and from 12 percent of GDP to 53 percent for China.
In the PPP terms used by Dollar and Kraay, however, the trade ratio barely budged over the same period, going from 4 percent to 5 percent in India and from 3 percent to 12 percent in China.25 A stark illustration of the difference implied by the use of different measures is shown in figures 2 and 3, where the lowest line is always the Dollar and Kraay measure and the ratios of trade to GDP in current prices and in constant prices move almost in unison.

Now consider the situation in which income inequality rises simultaneously with increases in trade in both countries (as indeed it did). The Dollar and Kraay measure will fail to detect much of a relationship between openness and inequality because the measure is artificially sluggish. With the measure used here, however, an increase in openness will be associated with greater inequality.

22. The measure is calculated, following Dollar and Kraay, from Penn World Table 6.1 as (exports + imports) expressed in local currency and at 1996 constant prices divided by the 1996 exchange rate and then by the 1996 GDP per capita in international dollars (variable rgdpch from Penn World Table 6.1). Similar results are obtained using World Bank data, where trade in current U.S. dollars is divided by the U.S. Consumer Price Index (with 1995 as the base year) and then by the 1995 GDP per capita in international prices. The World Bank measure takes the terms of trade effect into account.

23. The general quality of the regressions goes down, although government expenditures as share of GDP, democracy, and real rate of interest behave about the same as in the earlier regressions.

24. In general, there would be no important difference between the two measures for the rich countries because their GDPs calculated at PPP or market exchange rates are similar. However, even there, differences do appear for countries like Sweden or Germany whose price level is higher than that of the United States (numeraire in the case of PPP calculations).
Here, the bias is in the opposite direction: The ratio of trade to GDP in PPP terms will be higher than the ratio of trade to GDP in current terms.

25. Not even the direction of change is always the same. For example, between 1980 and 2000 Indonesia's openness increased using current values (from 49 percent to 82 percent of GDP) while declining using GDP in PPP terms (from 17 percent to 13 percent).

TABLE 5. Explaining Mean-Normalized Decile Incomes for 1988, 1993, 1998: Changing the Definition of Openness to the Dollar-Kraay Measure (GMM/instrumental variable estimation)

| Variable | First decile | Second decile | Third decile | Fourth decile | Fifth decile | Sixth decile | Seventh decile | Eighth decile | Ninth decile | Tenth decile |
|---|---|---|---|---|---|---|---|---|---|---|
| Open_ppp5 | 0.098 (0.365) | 0.083 (0.482) | 0.058 (0.612) | 0.037 (0.73) | 0.012 (0.906) | 0.021 (0.828) | 0.053 (0.543) | 0.109 (0.099) | 0.087 (0.395) | 0.012 (0.986) |
| Expgdp5 | 0.234** (0.004) | 0.277** (0) | 0.271** (0) | 0.260** (0) | 0.236** (0) | 0.183** (0.002) | 0.117** (0.015) | 0.008 (0.808) | 0.148** (0.037) | 1.448** (0.001) |
| Mean income (in PPP $000) | 0.0001 (0.984) | 0.003 (0.615) | 0.005 (0.456) | 0.005 (0.341) | 0.006 (0.223) | 0.007 (0.119) | 0.007 (0.091) | 0.007** (0.045) | 0.002 (0.72) | 0.039 (0.244) |
| Open_ppp5 * mean income | 0.0138 (0.222) | 0.016 (0.19) | 0.013 (0.28) | 0.010 (0.347) | 0.007 (0.488) | 0.003 (0.78) | 0.002 (0.82) | 0.010 (0.131) | 0.016 (0.133) | 0.033 (0.619) |
| M2gdp5 | 0.1003** (0.004) | 0.115** (0.004) | 0.091** (0.022) | 0.075 (0.051) | 0.063 (0.073) | 0.057 (0.072) | 0.046 (0.092) | 0.034 (0.159) | 0.005 (0.899) | 0.570 (0.015) |
| DFI5 | 0.001 (0.854) | 0.001 (0.824) | 0.005 (0.367) | 0.007 (0.261) | 0.008 (0.166) | 0.010 (0.096) | 0.010 (0.059) | 0.008 (0.063) | 0.004 (0.469) | 0.056 (0.118) |
| DFI5 * mean income | 0.003** (0.028) | 0.003** (0.023) | 0.002 (0.105) | 0.002 (0.111) | 0.002 (0.166) | 0.001 (0.352) | 0.0005 (0.726) | 0.0003 (0.821) | 0.002 (0.103) | 0.010 (0.201) |
| Democracy5 | 0.001 (0.886) | 0.005 (0.249) | 0.007 (0.104) | 0.008** (0.046) | 0.009** (0.025) | 0.008** (0.019) | 0.007** (0.023) | 0.003 (0.248) | 0.002 (0.622) | 0.045 (0.076) |
| Rint5 | 0.001** (0.005) | 0.002** (0) | 0.002** (0) | 0.002** (0) | 0.002** (0.001) | 0.002** (0.001) | 0.001** (0.002) | 0.001** (0.013) | 0.002** (0.017) | 0.009** (0) |
| Constant | 0.127** (0) | 0.190** (0) | 0.283** (0) | 0.379** (0) | 0.491** (0) | 0.633** (0) | 0.824** (0) | 1.108** (0) | 1.652** (0) | 4.327** (0) |
| Hansen J | 0.006 (0.936) | 0.772 (0.379) | 1.062 (0.3026) | 1.285 (0.2569) | 1.399 (0.2369) | 1.412 (0.2347) | 1.460 (0.2269) | 1.213 (0.2707) | 1.601 (0.2057) | 1.450 (0.2285) |
| Number of observations | 135 | 138 | 138 | 138 | 138 | 138 | 138 | 138 | 138 | 138 |
| Centered R2 | 0.3132 | 0.4913 | 0.5117 | 0.5280 | 0.5426 | 0.5368 | 0.4907 | 0.2481 | 0.2418 | 0.5097 |

**Significant at the 1 or 5 percent level.
Note: The dependent variable is the decile mean income/overall mean income. Numbers in parentheses are p-values. Openness and government expenditure as share of GDP are instrumented. GMM calculations are performed using the ivreg2.ado routine developed by Baum and others (2002). A suffix of 5 indicates a five-year average. Government expenditures, openness, and M2 are expressed as a share of GDP (such as 0.3 not 30 percent); DFI/GDP is expressed as a percentage. Real rate of interest is expressed as an annual percentage. Mean income is expressed in 1995 PPP dollars. Regressions are run with robust standard errors.
Source: Author's computations based on household survey data from the WYD database.

FIGURE 2. Different Measures of Openness for China, 1980-2000
[Line chart: trade/GDP share (vertical axis) by year (horizontal axis).]
Source: For the Dollar-Kraay measure, author's calculations based on Penn World Table 6.1; for the trade to GDP ratio in current prices, author's calculations based on World Bank data (World Development Indicators and Statistical Information Management and Analysis database); for the trade to GDP ratio in constant prices, the volume of trade is from Penn World Table 6.1 (the variable kopen).

The key question is, which approach makes more sense?
When the relationship of interest is how important international trade is for income creation and income distribution in a given country, it is the trade to GDP ratio in nominal prices that matters. The role that trade plays in the total income of a country--in people's earnings--depends on how much actual income is generated in trade compared with purely domestic activities. Income distribution is affected by the incomes actually received, not by notional incomes that are ascribed through the imputation of international prices to domestic goods and services. For a barber in India, for example, what matters for his income and for the income distribution in India is the actual local pay received, not how much his output is valued at international prices. For China, surely exports play a role in people's income that is commensurate with the 26 percent share of exports in nominal GDP in 2000, rather than with the 7 percent that exports represent in China's GDP calculated in PPP terms.26

26. Clearly the greater the PPP value of GDP, the better off the average citizen, who can consume more goods and services. This type of PPP comparison is useful for comparing average welfare levels in different countries. But this is not the objective here.

FIGURE 3. Different Measures of Openness for India, 1980-2000
[Line chart: trade/GDP share (vertical axis) by year (horizontal axis).]
Source: For the Dollar-Kraay measure, author's calculations based on Penn World Table 6.1; for the trade to GDP ratio in current prices, author's calculations based on World Bank data (World Development Indicators and Statistical Information Management and Analysis database); for the trade to GDP ratio in constant prices, the volume of trade is from Penn World Table 6.1 (the variable kopen).
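The contrast between the two openness measures can be made concrete with a small numerical sketch. Everything in it is hypothetical: the function names, the price level of 0.25, and the trade and GDP figures are assumptions chosen for illustration, not data from the article. It only shows the mechanism described above, namely that revaluing GDP at international prices scales the trade ratio down by the country's relative price level, so the same trade expansion registers as a much smaller change.

```python
# Hypothetical illustration: none of the numbers below come from the article.

def trade_share_nominal(trade_usd: float, gdp_usd: float) -> float:
    """Trade to GDP ratio with both terms valued at market exchange rates."""
    return trade_usd / gdp_usd


def trade_share_ppp(trade_usd: float, gdp_usd: float, price_level: float) -> float:
    """Trade to GDP ratio with GDP revalued at international (PPP) prices.

    price_level is the domestic price level relative to the numeraire
    country, so PPP GDP = market-rate GDP / price_level. A value below 1
    (typical for poor countries) inflates the denominator.
    """
    gdp_ppp = gdp_usd / price_level
    return trade_usd / gdp_ppp


# Assumed poor country whose price level is one quarter of the numeraire's.
PRICE_LEVEL = 0.25
before = {"trade": 20.0, "gdp": 100.0}
after = {"trade": 40.0, "gdp": 120.0}  # trade doubles, GDP grows 20 percent

nominal_change = (trade_share_nominal(after["trade"], after["gdp"])
                  - trade_share_nominal(before["trade"], before["gdp"]))
ppp_change = (trade_share_ppp(after["trade"], after["gdp"], PRICE_LEVEL)
              - trade_share_ppp(before["trade"], before["gdp"], PRICE_LEVEL))

print(f"nominal ratio rises by {nominal_change:.3f}")
print(f"PPP-based ratio rises by {ppp_change:.3f}")
```

Under these assumed numbers the nominal ratio rises by about 13 percentage points, while the PPP-based ratio rises by about 3, exactly one quarter as much because the price level is held fixed at 0.25.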
In conclusion, the results of the level regressions (to the extent that most of the identification comes from cross-sectional differences) show that rich people in poor countries that trade more tend to control a greater share of overall income than the rich people in equally poor countries that trade less. However, at some rather high level of mean country income--around PPP$8,000 per capita as calculated from household surveys--the situation reverses and more open countries are associated with more equal income distribution (higher income shares for the poor).

V. CONCLUSION

The effects of globalization on income distribution within rich and poor countries are a matter of intense debate. This study examined these effects using new data from household surveys and looking at the impact of openness (trade as a share of GDP) and direct foreign investment (as a percentage of GDP) on the entire income distribution in poor, middle-income, and rich countries. It found rather robust evidence that in countries at very low income levels, it is the rich who benefit from openness, but as income levels rise, the incomes of the poor and the middle class rise proportionately more than the incomes of the rich. Because most of the identification comes from cross-country variability, it cannot be strongly asserted that for any given country, openness makes income distribution worse before making it better, but it can be argued that the poor in poor countries do not seem to be the beneficiaries of greater trade. In other words, in poor countries that are otherwise identical, the income shares of the poor will be less in countries that trade more than in countries that trade less. Direct foreign investment has no effect on income distribution.
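The income level of about PPP$8,000 at which the effect of openness reverses can be checked against the coefficients in table 4. In a regression with an openness term and an openness-income interaction, the marginal effect of openness on a decile's share is the openness coefficient plus the interaction coefficient times mean income, and it is zero where income equals the ratio of the two magnitudes. The sketch below is a back-of-the-envelope check, not the author's computation: it uses the coefficient magnitudes as printed for the first seven deciles and assumes (as the reversal described in the text requires) that the two coefficients have opposite signs.

```python
# Coefficient magnitudes as printed in table 4 for deciles 1 to 7
# (Openness5 and Openness5 * mean income, mean income in PPP $000).
# ASSUMPTION: within each decile the two coefficients have opposite signs,
# so the marginal effect of openness changes sign at |b_open| / |b_inter|.
OPENNESS = [0.149, 0.199, 0.200, 0.199, 0.176, 0.143, 0.096]
INTERACTION = [0.019, 0.026, 0.025, 0.024, 0.022, 0.017, 0.011]


def turning_point(b_open: float, b_inter: float) -> float:
    """Mean income (PPP $000) at which d(share)/d(openness) equals zero."""
    return b_open / b_inter


points = [turning_point(b1, b3) for b1, b3 in zip(OPENNESS, INTERACTION)]

for decile, income in enumerate(points, start=1):
    print(f"decile {decile}: turning point near PPP ${income * 1000:,.0f}")
```

The implied turning points all fall between roughly PPP$7,600 and PPP$8,800, in line with the figure of around PPP$8,000 quoted in the text.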
Democracy raises the income shares of the middle deciles and leaves those of the top and the bottom deciles unchanged (possibly explaining why synthetic measures of inequality such as the Gini coefficient have generally failed to detect an effect of democracy on inequality). Government expenditures increase the income shares of the bottom income groups, and higher real rates of interest--a topic attracting surprisingly little attention--and inflation are prorich. Even the middle classes lose income shares when real interest rates and inflation are high. The use of a trade to GDP ratio in PPP terms is shown to be misleading in studies of the effect of trade on income distribution.

In conclusion, the poorest deciles in poor countries--those who should benefit most from increased trade according to both economic theory and the policy prescriptions of international organizations--appear to be the losers in relative terms. The case for trade as an engine of growth for the poorest of the poor is not completely undermined, however. But the case must be based on trade's impact on average incomes which, if sufficient, might lift the real incomes of the poor as well as those of the rich. The case cannot be made on the basis of trade's favorable or neutral impact on income distribution.

REFERENCES

Arbache, Jorge Saba. 1999. "How Do Economic Reforms Affect the Dispersion and Structure of Wages? The Case of an Industrialising Country Labor Market." Paper presented at the 1999 Royal Economic Society conference, March 29-April 1, University of Nottingham, U.K.
Baldwin, Richard E., and Philippe Martin. 1999. "Two Waves of Globalization: Superficial Similarities, Fundamental Differences." NBER Working Paper 6904. National Bureau of Economic Research, Cambridge, Mass.
Barro, Robert. 2000. "Inequality and Growth in a Panel of Countries." Journal of Economic Growth 5(1):5-32.
Baum, C. F., M. E. Schaffer, and S. Stillman. 2002.
"Instrumental Variables and GMM: Estimation and Testing." Working Paper 545. Boston College, Department of Economics, Chestnut Hill, Mass. Available online at http://fmwww.bc.edu/ec-p/WP545.pdf.
Beck, T., G. Clarke, A. Groff, P. Keefer, and P. Walsh. 2000. "New Tools and New Tests in Comparative Political Economy: The Database of Political Institutions." Policy Research Working Paper 2283. World Bank, Washington, D.C.
Behrman, Jere, Nancy Birdsall, and Miguel Szekely. 2003. "Economic Policy and Wage Differentials in Latin America." Center for Global Development Working Paper 29. Washington, D.C. Available online at www.cgdev.org/publications/?pubid=29.
Benarroch, Michael, and James D. Gaisford. 1997. "Economies of Scale, International Capital Mobility, and North-South Inequality." Review of International Economics 5(3):412-28.
Beyer, Harald, Patricio Rojas, and Rodrigo Vergara. 1999. "Trade Liberalization and Wage Inequality." Journal of Development Economics 59(1):103-23.
Birdsall, Nancy, and Amar Hamoudi. 2002. "Commodity Dependence, Trade and Growth: When Openness Is Not Enough." Center for Global Development Working Paper 7. Washington, D.C. Available online at www.cgdev.org/publications/?pubid=7.
Birdsall, Nancy, and Juan Luis Londono. 1997. "Asset Inequality Matters: An Assessment of the World Bank's Approach to Poverty Reduction." American Economic Review 87(2):32-37.
------. 1998. "No Trade-off: Efficient Growth via More Equal Human Capital Accumulation." In Nancy Birdsall, Carol Graham, and Richard Sobot, eds., Beyond Tradeoffs: Market Reforms and Equitable Growth in Latin America. Washington, D.C.: Inter-American Development Bank and Brookings Institution.
Bollen, K., and R. W. Jackman. 1985. "Political Democracy and the Size Distribution of Income." American Sociological Review 50:438-57.
Bordo, Michael D., Barry Eichengreen, and Douglas A. Irwin. 1999.
"Is Globalization Today Really Different than Globalization a Hundred Years Ago?" NBER Working Paper W7195. National Bureau of Economic Research, Cambridge, Mass.
Cornia, Andrea, and Sampsa Kiiski. 2002. "Trends in Income Distribution in the Post World War II Period: Evidence and Interpretation." WIDER Working Paper 2001/89. United Nations University, World Institute for Development Economics Research, Helsinki. Available online at www.wider.unu.edu/publications/dps/dp2001-89.pdf.
Craft, Nicholas. 2000. "Globalization and Growth in the Twentieth Century." Background paper to World Economic Outlook. International Monetary Fund, Washington, D.C.
Deininger, Klaus, and Lyn Squire. 1997. "Deininger and Squire Data Set: A New Data Set Measuring Income Inequality." World Bank, Washington, D.C. Available online at www.worldbank.org/research/growth/dddeisqu.htm.
Dluhosch, Barbara. 1998. "Globalization and European Labor Markets." CEPR Discussion Paper 1992. Centre for Economic Policy Research, London.
Dollar, David, and Aart Kraay. 2000. "Growth Is Good for the Poor." World Bank, Development Research Group, Washington, D.C.
------. 2001. "Trade, Growth, and Poverty." Policy Research Working Paper 2615. World Bank, Washington, D.C.
------. 2002. "Growth Is Good for the Poor." Journal of Economic Growth 7(3):195-225.
Fields, Gary. 2001. Distribution and Development: A New Look at the Developing World. New York: Russell Sage Foundation and Cambridge, Mass.: MIT Press.
Freeman, Richard B. 1995. "Are Your Wages Set in Beijing?" Journal of Economic Perspectives 9(3):15-32.
Galbraith, James K., and Hyunsub Kum. 2002. "Inequality and Economic Growth: Data Comparisons and Econometric Tests." University of Texas, LBJ School of Public Affairs, University of Texas Inequality Project, Austin.
Gradstein, Mark, and Branko Milanovic. 2004. "Does Liberté = Egalité?
A Survey of the Empirical Evidence on the Links between Political Democracy and Income Inequality." Journal of Economic Surveys 18(4):515-37.
Gradstein, Mark, Branko Milanovic, and Yvonne Ying. 2001. "Democracy and Income Inequality: An Empirical Analysis." Policy Research Working Paper 2561. World Bank, Washington, D.C.
Harrison, Ann, and Gordon Hanson. 1999. "Who Gains from Trade Reform? Some Remaining Puzzles." Journal of Development Economics 59(1):125-54.
Higgins, Matthew, and Jeffrey Williamson. 1999. "Explaining Inequality the World Round: Cohort Size, Kuznets Curve, and Openness." Federal Reserve Bank of New York Working Paper 79. Available online at www.ssrn.com.
IMF (International Monetary Fund). Various years. International Financial Statistics. Washington, D.C.
Kanbur, Ravi. 2000. "Income Distribution and Development." In A. B. Atkinson and F. Bourguignon, eds., Handbook of Income Distribution, vol. 1. Amsterdam: North Holland-Elsevier.
Kremer, Mark, and Eric Maskin. 2003. "Globalization and Inequality." Harvard University, Department of Economics, Cambridge, Mass. Available online at http://post.economics.harvard.edu/faculty/kremer/webpapers.
Lejour, Arjan M., and Paul J. G. Tang. 1999. "The Differential Impact of the South on Wage Inequality in the North." CPB Netherlands Bureau for Economic Policy Analysis, The Hague.
Li, Hongyi, Lyn Squire, and Heng-fu Zou. 1998. "Explaining International and Intertemporal Variations in Income Inequality." Economic Journal 108(446):26-43.
Lubker, Malte, Graham Smith, and John Weeks. 2002. "Growth and the Poor: A Comment on Dollar and Kraay." Journal of International Development 14:555-71.
Lundberg, Mattias, and Lyn Squire. 1999. "Growth and Inequality: Extracting the Lessons for Policymakers." World Bank, Washington, D.C.
------. 2003. "The Simultaneous Evolution of Growth and Inequality." Economic Journal 113(487):326-44.
Lustig, Nora, and Ravi Kanbur. 1999.
"Why Is Inequality Back on the Agenda?" Paper prepared for the Annual World Bank Conference on Development Economics, April 28-30, World Bank, Washington, D.C. Available online at www.worldbank.org/poverty/wdrpoverty/inequality.htm.
Melchior, Arne, Kjetil Telle, and Henrik Wiig. 2000. "Globalisation and Inequality: World Income Distribution and Living Standards, 1960-1998." Studies on Foreign Policy Issues Report 6B:2000. Royal Norwegian Ministry of Foreign Affairs, Oslo.
Milanovic, Branko. 1994. "Determinants of Cross-Country Income Inequality: An Augmented Kuznets' Hypothesis." Policy Research Working Paper 1246. World Bank, Washington, D.C.
------. 2000. "The Median Voter Hypothesis, Income Inequality and Income Redistribution: An Empirical Test with the Required Data." European Journal of Political Economy 16(3):367-410.
------. 2002. "True World Income Distribution, 1988 and 1993: First Calculation Based on Household Surveys Alone." Economic Journal 112(476):51-92.
------. 2004. Worlds Apart: Global and International Inequality 1950-2000. Princeton, N.J.: Princeton University Press.
Milanovic, Branko, and Shlomo Yitzhaki. 2002. "Decomposing World Income Distribution: Does the World Have a Middle Class?" Review of Income and Wealth 48(2):155-78.
Ravallion, Martin. 2001. "Growth, Inequality, and Poverty: Looking beyond Averages." World Development 29(11):1803-15.
------. 2004. "Looking beyond Averages in the Trade and Poverty Debate." Policy Research Working Paper 3461. World Bank, Washington, D.C.
Robertson, Raymond. 2000. "Trade Liberalisation and Wage Inequality: Lessons from the Mexican Experience." World Economy 23(6):827-49.
Rodrik, Dani. 2000. "Comments on 'Trade, Growth, and Poverty' by D. Dollar and A. Kraay." Harvard University, John F. Kennedy School of Government, Cambridge, Mass. Available online at http://ksghome.harvard.edu/~.drodrik.academic.ksg.
Sala-i-Martin, Xavier. 2002.
"The Disturbing 'Rise' of World Income Inequality." NBER Working Paper 8904. National Bureau of Economic Research, Cambridge, Mass.
Schott, Peter K. 1999. "One Size Fits All? Heckscher-Ohlin Specialization in Global Production." NBER Working Paper 8244. National Bureau of Economic Research, Cambridge, Mass.
Schultz, T. P. 1998. "Inequality in the Distribution of Personal Income in the World: How It Is Changing and Why." Journal of Population Economics 11(3):307-44.
Slaughter, Matthew J., and Phillip Swagel. 1997. "The Effect of Globalization on Wages in the Advanced Economies." International Monetary Fund Staff Studies for World Economic Outlook. Washington, D.C.
Spilimbergo, Antonio, Juan Luis Londono, and Miguel Szekely. 1999. "Income Distribution, Factor Endowment and Trade Openness." Journal of Development Economics 59(1):77-101.
Tang, Paul J. G., and Adrian Wood. 1999. "Globalisation, Co-Operation Costs and Wage Inequalities." Online document available at http://ssrn.com/abstract=148169.
Tavares, Jose, and Romain Wacziarg. 2001. "How Democracy Affects Growth." European Economic Review 45(8):1341-78.
UNCTAD (United Nations Conference on Trade and Development). 1996. Handbook of International Trade and Development Statistics. Geneva.
------. 1997. Handbook of International Trade and Development Statistics. Geneva.
------. 2000. Handbook of International Trade and Development Statistics. Geneva.
WIDER (United Nations University, World Institute for Development Economics Research). 2004. World Income Inequality Database. Available online at ww.wider.unu.edu/wiid/wiid.html.
Williamson, Jeffrey G. 1996. "Globalization and Inequality Then and Now: The Late 19th and Late 20th Centuries Compared." NBER Working Paper 5491. National Bureau of Economic Research, Cambridge, Mass.
Winters, Alan, Neil McCulloch, and Andrew McKay. 2004.
"Trade Liberalization and Poverty: Evidence so Far." Journal of Economic Literature 42(1):72-115.
Wood, Adrian. 1994. North-South Trade, Employment and Inequality: Changing Fortunes in a Skill-Driven World. Oxford: Clarendon Press.
------. 2000. "Globalisation and Wage Inequality: A Synthesis of Three Theories." Department for International Development, U.K. Available online at www.ssrn.com.
World Bank. Various years. World Development Indicators. Washington, D.C.

Financing Pharmaceutical Innovation: How Much Should Poor Countries Contribute?

William Jack and Jean O. Lanjouw

A public economics framework is used to consider how pharmaceuticals should be priced when at least some of the research and development incentive comes from sales revenues. Familiar techniques of public finance are used to relax some of the restrictions implied in the standard use of Ramsey pricing. Under the more general model, poor countries should not necessarily cover even their own marginal costs, and the pricing structure is not related to that which would be chosen by a monopolist in a simple way. This framework is then used to examine ongoing debates regarding the international patent system as embodied in the World Trade Organization's Agreement on Trade-Related Aspects of Intellectual Property Rights.

The most contentious issue in the pharmaceutical sector is not whether or how much to support private research. Most observers recognize the major contributions to global health that have come from private research efforts and the fact that price-cost margins supported by the patent system have been pivotal in stimulating that research. Conflicts arise instead over how the financing of research and development (R&D) incentives should be shared among consumers. How much of the total cost should a U.S. retiree, a French worker, or an Ethiopian peasant be expected to contribute?

Today the tensions are clearly on display.
Senior citizen outrage over high drug costs in the United States has driven legislation through Congress to create a massive Medicare prescription drug benefit that will shift costs from pharmaceutical consumers to taxpayers at large. Similar outrage has led to repeated efforts to legalize the importation of lower priced Canadian drugs into the United States. The goal: to shift some of the costs of pharmaceutical research from Americans to Canadians.

William Jack is associate professor in the Department of Economics, Georgetown University; his e-mail address is wgj@georgetown.edu. Jean O. Lanjouw is nonresident senior fellow at the Brookings Institution, nonresident senior fellow at the Center for Global Development, Washington, D.C., and associate professor at the University of California, Berkeley; her e-mail address is jlanjouw@brook.edu. The authors thank Francois Bourguignon, Jennifer Brant, Colleen Chien, Brian Deese, Peter Lanjouw, Michael Kremer, Jamie Love, F. M. Scherer, Scott Stern, Alan Winters, and three anonymous referees for constructive comments on an earlier draft.

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 1, pp. 45-67 doi:10.1093/wber/lhi005 © The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.

In a similar vein, an increasing number of voices support the view expressed by Mark McClellan, then commissioner of the U.S. Food and Drug Administration, in Cancún that the main reason [U.S.] prices are higher is that our country is paying the bulk of the costs of developing new treatments. That's got many Americans angry. . . . I know that many are complacent with the current situation, in which the United States has borne the bulk of costs.
I know it is not clear how to work together internationally to create better ways to share the burden than are provided by our current trade agreements. But it is clear to me that we cannot carry the lion's share of this burden for much longer.1 The goal is, at least implicitly, to shift more of the costs of pharmaceutical research to Europeans and others. These conflicts within and between rich countries reflect the same debate that has been raging for years over drug pricing in the developing world. The heart of the controversy is distributional: Given a desire to support private research, to what extent should industry control of sales in poor countries be part of the effort? The developing economy debate has focused mainly on the minimum standards for patent protection set forth for members of the World Trade Organization (WTO) in the Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS),2 in particular the requirement that member countries offer protection for pharmaceutical innovations. The institutional apparatus embodied by TRIPS is a significant factor in determining the prices at which pharmaceutical products (or licenses to produce them) are sold in different countries. More generally, these prices and license payments are the result of negotiations between governments and firms, within a legal and regulatory framework that seeks to both stimulate appropriate levels of R&D and promote broad access to available drugs. The public debate over the suitability of the international pharmaceutical prices that emerge from this process tends to be polarized between those who focus on the incentive effects and those who concentrate on other social objectives. The purpose of this article is to provide a framework for determining a policy that respects both objectives. 
To this end, the analysis abstracts from most of the legal aspects of the debate and employs well-established techniques of applied public finance to integrate both efficiency and distributional concerns.

Although the analysis here accounts for the need to provide incentives to develop new products, it does not formally address the question of how much incentive should be provided. Furthermore, a variety of policy options can be used to support research and address distributional concerns. The broad composition of these policies is taken as given, and the focus is on how to best structure pharmaceutical prices. Specifically the concern is how the burden of generating any given profit from sales should be shared across countries.

1. See www.fda.gov/oc/speeches/2003/genericdrug0925.html (accessed April 20, 2005).
2. Agreement on Trade-Related Aspects of Intellectual Property Rights, Marrakesh Agreement Establishing the World Trade Organization, Annex 1C. Available online at www.wto.org/english/tratop_e/trips_e/trips_e.htm (accessed April 20, 2005).

The basic principles of optimal pricing presented here are consistent with broadly defined social objectives. These principles have come to be known as Ramsey pricing, after the seminal work of Frank Ramsey (1927).3 The techniques reviewed in this article are not new. Indeed, some of the literature on pharmaceutical pricing has employed them already (see Danzon 1997, 2001). In most of this literature, however, the assumptions that underlie the derivation of standard Ramsey prices are very restrictive. In particular, they require either that concerns about the global distribution of well-being be suitably addressed through other policies or that policymakers have no such concerns.
The characterization here of social objectives, which is general enough to encompass a broad range of distributional preferences, will be familiar to students and practitioners of applied public economics, although it does not appear to have been explicitly employed in formal analyses of international pharmaceutical pricing. Once one departs from the restrictive assumptions imposed by the standard (distributionally neutral) Ramsey pricing model, the pricing rules are replaced by so-called many-person Ramsey rules (Diamond 1975). Ramsey prices calculated in the standard way are a special case.

The analysis here highlights two common prescriptions derived from the standard Ramsey pricing model that are not valid once one allows for more broadly defined social objectives. First is that prices should at least cover marginal costs in each country (that is, countries should pay at least for the direct costs of delivering drugs), a conclusion valid only when distributional concerns are not incorporated into the analysis. Second is that pricing structures should be closely related to those that would arise under monopoly pricing in each country. Again, this does not carry over to optimal policies in the presence of distributionally sensitive objectives. In addition, using a natural formulation of health needs, the standard model prescribes higher prices in countries with greater need for drugs. Rationalizing such a policy when equity is a concern would be difficult.

Section I discusses how the problem examined in this article fits into the more general problem of the provision of incentives for R&D. Section II presents the intuition for and derivation of the many-person Ramsey rule and relates it directly to the determination of reasonable royalty rates on compulsory licenses.
In the light of this analysis, section III reviews the more standard Ramsey pricing rule and illustrates how policy implications drawn from the standard model can differ from those derived from the more general framework. Section IV illustrates the tradeoffs between pricing, welfare, and R&D investment and explores several recent controversies.

3. This work enjoyed a renaissance in the early 1970s with the work of Baumol and Bradford (1970) and Diamond and Mirrlees (1971). The same tools have been extensively applied to the question of public enterprise pricing, as exposited by Bös (1986), for example.

I. FINANCING R&D INCENTIVES

The formulation of public policy regarding pharmaceutical pricing can be usefully thought of as proceeding in three steps: establishing the total amount of resources that society should devote to research, identifying the broad mechanisms by which such funds are raised and allocated, and designing each broad mechanism. Although the focus of this article is the third step, for clarity each part of the process is briefly discussed.

How Big an Incentive to Provide for Innovation

It is widely believed that under a laissez-faire policy, incentives for investment in R&D are severely attenuated and the aggregate amount of R&D investment is far less than what would be socially desirable. This is because the production of new drugs (like many other innovations) exhibits significant increasing returns to scale due to the presence of sometimes enormous fixed costs, and competition leading to marginal cost pricing results in negative profits. Exactly what level of R&D is optimal depends on several factors. In general, the right level of R&D depends on its expected benefits and costs.
The extent to which research investment leads to the discovery of new products, coupled with an evaluation of the benefits of these new products, must be weighed against the value of other goods that could otherwise have been obtained. Making these tradeoffs requires careful attention to such factors as the rate of time preference (because research pays off in the longer run) and the cost of stimulating different levels of innovation. These topics are not pursued further in this article.4

4. See Nordhaus (1969) and Scherer (1972, 2003) for discussion of some of the tradeoffs involved in this decision.

How to Support Innovation

A variety of mechanisms exist for providing R&D incentives in the presence of increasing returns to scale. The patent system promises future profits on sales. Various tax credits and subsidies (such as the orphan drug legislation in the United States) provide returns not so much for successful innovation but for attempts at innovation (that is, at the research stage). Alternatively, some R&D is funded directly by the government through public institutions (such as the U.S. National Institutes of Health).

The relevant distinction here is between research support derived from profits on sales of successful innovations and research support provided through the general tax system. Suppose that total R&D funding is the sum of $R$, funding from sales revenues in excess of marginal costs, and $T$, funding from general taxes. An important finding of Atkinson and Stiglitz (1976) is that under certain conditions, if a government wants to raise a certain amount of revenue with both efficiency and equity objectives in mind, it should implement a suitably designed income tax and avoid taxing the consumption of goods and services separately.5 In the context of pharmaceutical pricing, this would mean that drug prices should be set equal to marginal cost (that is, very close to zero) and that $R$ would be zero.
However, in practice there are many sound reasons why sales revenues in excess of marginal costs might be a useful source of financing. First, the form of research support can affect the productivity of the investment. In particular, support from sales of products gives researchers a larger incentive to find products that address consumer needs than to, say, work on projects that are primarily of scientific interest. Second, it might be politically easier for governments to allow a firm to charge certain prices than to hand over a large sum of tax revenue. Governments can thus maintain support for the overall R&D incentive if at least some of the funding does not come from explicit tax increases.6

More important, even if all other conditions are satisfied, the Atkinson-Stiglitz result requires a sophisticated (nonlinear) worldwide income tax. Clearly, the global institutions are not in place to implement such a tax, for good reasons, including institutional and political constraints and corruption. When equity concerns cannot be directly addressed, it is important to take into account the distributional implications of all policies. Drug pricing may well be one of the more effective tools for redistribution because it targets a basic human need and provides resources in a form more difficult to divert than, say, direct transfers to country governments.

Pricing to Yield R

Having established the size of the research support that society is willing to provide and the division of this incentive between taxes, $T$, and net sales revenues, $R$, the last issue to address is the design of the price structure that will yield $R$, the primary subject of this article. The results will apply to any level of $R$: as the choices made in the first two steps change, the pricing formulas remain unaffected (although the actual prices prescribed will change). That is, the general insights into the structure of prices conditional on $R$ remain informative as $R$ changes.
The design of the price structure, and possible practical limitations on it, will inform and feed back into the decisions made in the first two steps. For example, if certain constraints mean that feasible prices differ from prices that would otherwise be considered optimal, then policymakers might want to shift more financing into taxes and pursue other avenues for attaining distributional goals. Furthermore, the total resources that should be devoted to research may be revised if otherwise desirable pricing policies are unavailable. With this in mind, readers should interpret the findings here as the first input (working backward) in the solution to the three-part problem of finding ways to induce R&D when policymakers care about dynamic efficiency (that is, R&D incentives), static efficiency (such as the distortionary costs of monopoly pricing), and equity.

5. This result is in the context of a static model with no savings. Under an intertemporal model with savings the result indicates the optimality of a (nonlinear) consumption tax. See Atkinson and Stiglitz (1980).
6. Of course, this is not to suggest that prices above marginal cost are not taxes, rather that they are perceived differently by voters.

II. RAMSEY PRICING

The analysis here starts by considering the simplest model with two countries, north and south.7 It also assumes that both countries have the same number of residents and that residents are identical within each country.8 Thus for simplicity each country, $i = n, s$, is assumed to have a single consumer9 with income, $m_i$,10 who chooses between two goods: $x$, which represents all other goods and whose price is normalized to 1, and a drug, $y$, available in country $i$ at a price, $p_i$. Incomes can vary across countries. The consumption preferences of the consumer in country $i$ are represented by a function, $u_i(x, y)$.
The consumers in both countries are assumed to have the same preferences for the two goods. However, in the next section preferences for drugs will be allowed to differ in north and south, due, say, to differences in disease conditions and health needs. These preferences can be used to determine the highest level of utility that the consumer can attain with income $m_i$ and drug price $p_i$. Because there is no saving in this simple model, consumers will spend their entire incomes so that $x + p_i y = m_i$. Then consumer $i$'s maximum utility level (indirect utility function) for any price and income can be written

$$v_i(p_i, m_i) = \max_{x,\,y} \; u_i(x, y) \quad \text{subject to} \quad x + p_i y = m_i. \tag{1}$$

7. The analysis would hold for any number of countries.
8. The assumption that individuals in a given country are identical is clearly contrary to fact and made to highlight the ways pharmaceutical prices should vary across countries. In addition, differential drug pricing within a country is likely to be more difficult to implement than differential pricing across countries, and other means of redistribution are likely to exist within countries that are not available for redistribution across countries.
9. The optimal pricing rules are the same for any distribution of population across countries, although the level of prices would differ.
10. $m_i$ is income net of any contributions, $t_i$, to tax-financed R&D incentives, $T$. Taking the $t_i$ as given, only the distribution of $m_i$ across countries matters, not the distribution of $t_i$. Since the $t_i$ are small relative to the $m_i$, this distinction is of little practical importance.
To formalize the policy objective, drug prices are assumed to be chosen to make some concept of global aggregate well-being as large as possible.11 Extending in a natural way a long tradition of applied public economics for determining country-level tax policies, this goal is described in terms of global social welfare, which depends on the utility of consumers in both countries. That is, social welfare is a function, $W(v_n, v_s)$.12 How much a small increase in the utility of consumer $i$ contributes to social welfare depends on how $W$ is defined and on consumer $i$'s starting level of utility, as well as the utility levels of other consumers. The incremental effect of an increase in consumer $i$'s utility is denoted by $\gamma_i = \partial W/\partial v_i$.

To determine appropriate prices across countries, it is useful to observe that the way in which a price increase in country $i$ affects social welfare can be decomposed into two parts. First, increasing the drug price will generally lower consumer $i$'s utility (assuming the drug was already being consumed). If consumer $i$'s drug consumption was $y_i$ initially and remained at that level after a small increase in the drug price, $\Delta p_i$, the effect of the price increase would be the same as if consumer $i$'s income had been reduced by $\Delta m_i = \Delta p_i y_i$. Thus, knowing what effect a reduction in income would have on consumer utility allows an assessment of the first impact of a price increase in country $i$. This marginal utility of income is typically denoted by $\alpha_i = \partial v_i/\partial m_i$. Studies of behavior under conditions of uncertainty indicate that $\alpha_i$ is very likely to decline as income increases. That is, people get less benefit from an extra dollar as they become richer.13

Second, as discussed earlier, the reduction in consumer utility in country $i$ ($\Delta v_i$) will have an effect on social welfare. The two pieces together give the total effect on social welfare of a change of income in country $i$.
This is denoted $\beta_i$:

$$\beta_i = (\partial W/\partial v_i)(\partial v_i/\partial m_i) = \gamma_i \alpha_i. \tag{2}$$

With the relationship between price and income changes, $\Delta m_i = -\Delta p_i y_i$, changing the drug price in country $i$ would have the following effect on social welfare:

$$\partial W/\partial p_i = (\partial W/\partial v_i)(\partial v_i/\partial p_i) = -\beta_i y_i. \tag{3}$$

11. Exactly who determines priorities at the global level is not addressed here. The purpose is simply to ask how prices should be chosen, given some criterion that is global in nature because it incorporates the well-being of consumers in different countries.
12. This article assumes that competition drives rents in the pharmaceutical industry to zero, so that all net revenue is invested in R&D. More generally, if pharmaceutical companies earn positive rents (that is, profits in excess of the cost of capital), static social welfare can be defined as a function of consumer utilities and profits, where the weight on profits depends on how they are distributed. Allowing for positive profits would affect the choice of the optimal level of R&D incentive, but not the pricing rules conditional on $R$ focused on here.
13. The value of $\alpha$ indicates how a given consumer would view an extra dollar if the consumer were poor compared with if the same individual were rich. A declining $\alpha$ does not necessarily imply that an extra dollar would give every rich consumer less utility than it would give to a poor consumer; that holds only if they derive the same utility from consumption, as assumed in this section.

Thus lowering the price of the drug in country $i$ tends to be most beneficial when $\beta_i$ is large. The value of $\beta_i$ is large when extra income is particularly valuable to the consumer in country $i$ (large $\alpha_i$) and when boosting the well-being of those in country $i$ is viewed as particularly important (large $\gamma_i$). For example, if extra income is more valuable to poor consumers and if south is the poorer country, $\alpha_s > \alpha_n$.
Second, a preference for a more equal distribution of well-being implies that $\gamma_s > \gamma_n$.14 Thus on both accounts one might expect $\beta_s > \beta_n$.

One objective of this article is to bring attention to the fact that the use of public economics in the policy debate on international drug pricing has implicitly assumed a very special (and arguably unrepresentative) case: one in which both $\alpha_i$ and $\gamma_i$, and hence $\beta_i$, are constant across all countries at all incomes.

Derivation of Ramsey Prices

The model described earlier is used to characterize the optimal drug price in each country. It is assumed that the marginal cost of producing and distributing the drug is constant within a country but can vary across countries and is equal to $c_i$ in country $i$.15 Now, suppose that for a given product the innovator firm is permitted to generate sales revenue over costs of $R$. The optimal set of drug prices $p_n$ and $p_s$ are those that give the highest social welfare, $W$, subject to this revenue constraint. It is assumed that profits (or losses) on sales accrue to the firm, either because the firm sells the drug directly or because it controls and sets the terms of licenses to alternative manufacturers. If licensing is compulsory, there must be enough competition among generics manufacturers that they do not profit from sales. It is also assumed that there are no import or sales taxes. Formally, these prices solve the problem

$$\max_{p_n,\,p_s} \; W(v_n, v_s) \quad \text{subject to} \quad \sum_i (p_i - c_i)\,y_i = R. \tag{4}$$

A Lagrange multiplier, $\lambda$, is introduced on the constraint, so that the first-order conditions satisfied at the optimal prices are

$$(\partial W/\partial v_i)(\partial v_i/\partial p_i) + \lambda\,[\,y_i + (p_i - c_i)(\partial y_i/\partial p_i)\,] = 0, \quad i = n, s. \tag{5}$$

The (negative of the) first term is the marginal social cost (that is, the incremental reduction in social welfare) associated with a price increase in country $i$, $MSC_i$. The bracketed part of the second term is the marginal revenue earned from a price increase in country $i$, $MR_i$. The condition then says that for each country

$$MSC_i / MR_i = \lambda. \tag{6}$$

The ratio on the left side is the reduction in welfare as revenue generated by sales in country $i$ increases. Because $\lambda$ is the same for both countries, the marginal social cost of revenue generation is equalized across countries. That is, at the optimum prices, raising an extra dollar from either country should have the same (negative) effect on social welfare: if the marginal welfare loss per unit of net revenue differed between countries, the amount earned from each could be adjusted, keeping aggregate net revenue constant at $R$, while reducing the total social cost.

14. The welfare function is usually assumed to be anonymous, that is, it does not discriminate directly among consumers. But, for example, if one cared less about the well-being of consumers from the south, then all else equal, $\gamma_s$ would tend to be lower than $\gamma_n$. A strong enough preference for northerners could imply a desire to redistribute from the poorer south to the richer north.
15. The marginal cost of delivery could vary within a country. This is ignored here, and it is assumed that the costs of launching a product in an additional country (obtaining marketing approval, advertising to doctors, arranging distribution, and the like) are negligible.

Using equation 3 to substitute $-\beta_i y_i$ for the first term, equation 5 can be rearranged as

$$(\hat{p}_i - c_i)/\hat{p}_i = [(\lambda - \beta_i)/\lambda]\,(1/\eta_i), \tag{7}$$

where $\hat{p}_i$ is the optimal (Ramsey) price in country $i$ and $\eta_i = -(p_i/y_i)(\partial y_i/\partial p_i) > 0$ is the elasticity of demand for the drug in that country.16 It measures how sensitively consumers react by adjusting their consumption of drugs when prices change. The Lagrange multiplier, $\lambda$, measures the social value of relaxing the financing constraint by a dollar, that is, the gain in social welfare that would accrue if the revenue allowed the firm, $R$, was marginally reduced. As discussed earlier, $\beta_i$ measures the increase in social welfare associated with a one-dollar increase in the income of the consumer in country $i$.
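The rule in equation 7 can be made concrete with a small numerical sketch. The Python snippet below is only an illustration under stated assumptions: the values chosen for $\lambda$, $\beta_i$, $\eta_i$, and the marginal cost are hypothetical, the elasticity is treated as a fixed number (as it would be under isoelastic demand), and $\lambda$ is taken as given rather than solved from the revenue constraint. The function names are not from the article.

```python
def ramsey_markup(lam, beta, eta):
    """Markup as a share of price, (p - c) / p, from equation 7."""
    if lam <= 0 or eta <= 0:
        raise ValueError("lam and eta must be positive")
    return ((lam - beta) / lam) / eta


def price_from_markup(cost, markup):
    """Invert (p - c) / p = markup to recover the price p."""
    if markup >= 1:
        raise ValueError("markup as a share of price must be below 1")
    return cost / (1.0 - markup)


# Hypothetical inputs: same elasticity and marginal cost in both countries,
# and a higher social weight on income in the south (beta_s > lam > beta_n).
lam, eta, cost = 1.2, 2.0, 1.0
beta = {"north": 0.8, "south": 1.5}

for country, b in beta.items():
    m = ramsey_markup(lam, b, eta)
    p = price_from_markup(cost, m)
    print(f"{country}: markup share = {m:.3f}, price = {p:.3f}")
```

With these assumed inputs the north markup is positive, while the south markup is negative because the assumed $\beta_s$ exceeds $\lambda$, so the implied south price lies below marginal cost.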
For a given elasticity of demand, the markup as a share of the final price is smaller the larger $\beta_i$ is. As argued, $\beta_i$ might be expected to be larger in a poor country, which would suggest lower markups. In general, though, there is no reason to expect that at the optimum prices, $\beta_i < \lambda$ for all countries $i$ (although it must be the case that $\beta_i < \lambda$ for at least one country for revenue to be nonnegative). That is, the term in brackets could even be negative. This means that Ramsey prices do not necessarily cover marginal costs in all countries, because if the right side of equation 7 is negative, $p_i < c_i$. This possibility is illustrated graphically in the appendix. (Whether such prices can be attained given available policy tools is considered in section IV.)

Finally, note that the condition refers to the markup over cost as a share of price. If the marginal cost of producing and distributing the drug, $c_i$, is the same across countries, a lower markup in country $i$ also implies a lower price in country $i$. If it is not the same, the final price, $p_i$, could be higher in the country with the lower markup.

16. It is assumed that optimal prices, including Ramsey and monopoly prices, are unique. Equation 7 is closely related to Diamond's (1975) many-person Ramsey rule. Despite the mathematical similarity, however, there is a slight difference in interpretation. Diamond's analysis considers the optimal tax rates applied to several different goods, under the constraint that these tax rates must be the same for all individuals. Under the model here there is just one taxed good, the drug, but it is assumed to be possible to apply different tax rates (prices different from marginal cost) to it in different countries, that is, tax rates that depend on who buys the drug.

Ramsey-Reasonable Royalty Rates

The TRIPS agreement allows for the compulsory licensing of patented innovations in some circumstances.
These licenses give manufacturers the right to produce and sell a patented product in the country issuing the license in return for adequate remuneration to the patent holder (TRIPS, articles 31h and 31k). In general, when compensation is due a patentee, it is commonly in the form of royalty payments per unit of sales, and laws requiring compensation are sometimes directly specified in relation to reasonable royalties.17 What is "reasonable" is not clearly stated and is open to interpretation.18 One natural approach to this problem in the international context is to follow the same line of reasoning used here, asking what royalty rates would generate the highest level of social welfare.

Royalty payments are typically defined as a share of the final sale price. Thus, if the royalty rate to be paid to the inventor on sales in country $i$ is, say, $r_i$, the marginal cost of production to the manufacturing firm becomes $c_i + r_i p_i$. If the market is competitive (that is, with several generics producers operating under compulsory license), the price will equal the marginal cost of production, $p_i = c_i + r_i p_i$. In this case

$$r_i = (p_i - c_i)/p_i \tag{8}$$

and the royalty rate in country $i$ will determine the markup as a share of price in that country.19 Thus, given competition, it follows directly that the Ramsey-reasonable royalty rate is defined by equation 7.

17. In assessing damages in infringement cases, for example, 35 U.S.C. §284 provides that damages should be "in no event less than a reasonable royalty for the use made of the invention."
18. See Scherer and Watal (2002) for an interesting discussion of historical practice within countries.
19. Without competition and price controls the price would be higher. Some of the resulting revenue would go to the inventor as royalty payments and some would go to the generics producer as profit.

III. STANDARD RAMSEY PRICES

This section presents the special case of the model illustrated earlier that has been used in the discussion of international drug prices. Two common prescriptions derived from this special model are highlighted to show how they are at odds with those drawn from the more general formulation discussed earlier. The first is that prices should at least cover marginal costs in each country. The second is that the pricing structure should be closely related to what would arise if monopoly prices were charged in each country. The special model also suggests that prices should be higher in countries with a greater need for drugs. This inference is not typically noted.

Consumer Surplus, Ramsey Prices, and "Fair" Prices

The standard approach does not incorporate any concern for the distributional effect of drug prices. There are two interpretations of this treatment. First, it could mean that distribution simply does not matter. This requires assuming that an extra dollar gives the same amount of additional utility to a consumer regardless of that person's starting income level. That is, the marginal utility of income, $\alpha_i$, is constant. Furthermore, it requires that the way any total amount of utility is distributed between the two consumers is not given any particular importance. Formally, a social welfare function is implicitly assumed, and it is utilitarian:

$$W(v_n, v_s) = v_n + v_s. \tag{9}$$

With the utilitarian social welfare function, $\gamma_s = \gamma_n = 1$. Together with the constant marginal utility of income, this means that $\beta_i = \beta$ is also constant.

The second interpretation is that distribution does matter, but that it is expected to be dealt with in other ways. If there are other (unused but nevertheless effective) avenues for bringing all consumers in the world to a similar level of well-being, distributional concerns can be safely ignored when pricing drugs.
In either interpretation, $\beta_i$ can be treated as constant across countries, and the optimal pricing conditions from equation 7 simplify to

$$(\bar{p}_i - c_i)/\bar{p}_i = [(\lambda - \beta)/\lambda]\,(1/\eta_i), \tag{10}$$

where $\bar{p}_i$ is the optimal price and $\beta = \alpha\gamma = \alpha$.20 The term in brackets, $(\lambda - \beta)/\lambda$, is now constant across countries. Thus prices should differ across countries in such a way that the proportional markup as a share of price is inversely related to the elasticity of demand at those prices: the standard result. This rule implies higher prices for those who change their consumption less, and it is efficient because it causes the least distortion to consumption patterns. Finally, it can be argued in this special case that at the optimal prices, $\lambda > \beta$, so that the price in each country is at least as high as $c_i$, the marginal cost of production and distribution there. This is shown graphically in the appendix.

It is tempting to interpret the condition that each country cover at least its own marginal costs as fair. But this can be fair only in a procedural sense. Putting such a condition on prices only emerges as a general policy prescription when concern for equity is ruled out.

20. It is common in the literature on regulation and utility pricing, and in some accounts of international drug pricing, to adopt the maximization of a measure of aggregate consumer surplus as the policy objective. This approach can be reconciled with the one adopted here, with $\alpha_i = \alpha$ for $i = n$ and $s$, and a utilitarian social welfare function.

Comparison with Monopoly Pricing

If a single firm were producing and selling the drug, and if the firm could freely choose prices in separate country markets, it would set prices to maximize net revenue, $R = \sum_i (p_i - c_i)\,y_i$. The first-order condition for this problem is simply

$$y_i + (p_i - c_i)(\partial y_i/\partial p_i) = 0, \tag{11}$$

which is the same as equation 5, except that it does not have a term giving weight to social welfare.
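As a quick numerical check of the monopoly first-order condition in equation 11, the sketch below assumes a linear demand curve $y = a - bp$. This functional form and the parameter values are hypothetical choices made for illustration, not taken from the article. The snippet confirms that at the profit-maximizing price the markup as a share of price equals the inverse of the demand elasticity.

```python
def demand(p, a, b):
    """Assumed linear demand y = a - b * p, truncated at zero."""
    return max(a - b * p, 0.0)


def monopoly_price(a, b, c):
    """Solve y + (p - c) * dy/dp = 0 with dy/dp = -b: p = (a / b + c) / 2."""
    return (a / b + c) / 2.0


def elasticity(p, a, b):
    """Demand elasticity eta = -(p / y) * dy/dp for the linear demand."""
    return p * b / demand(p, a, b)


a, b, c = 10.0, 1.0, 1.0  # illustrative values only
p_m = monopoly_price(a, b, c)
foc = demand(p_m, a, b) + (p_m - c) * (-b)   # left side of equation 11
lerner = (p_m - c) / p_m                     # markup as a share of price

print(f"monopoly price = {p_m}, first-order condition residual = {foc}")
print(f"markup share = {lerner:.4f}, 1/eta = {1.0 / elasticity(p_m, a, b):.4f}")
```

The two printed ratios coincide, which is the inverse-elasticity form of the monopolist's pricing rule for this assumed demand curve.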
The monopolist would thus choose prices $p_i^m$ in different countries to satisfy

$$(p_i^m - c_i)/p_i^m = 1/\eta_i. \tag{12}$$

When compared with equation 10, the relative price-cost markups derived in the special case are proportional to what would be chosen by a price-discriminating monopolist. That is, if $\mu_i = (p_i - c_i)/p_i$ is the markup as a share of price in country $i$,

$$\mu_s^m/\mu_n^m = \eta_n/\eta_s = \bar{\mu}_s/\bar{\mu}_n. \tag{13}$$

These can be compared with the ratio of price-cost markups under the optimal Ramsey prices characterized in equation 7:

$$\hat{\mu}_s/\hat{\mu}_n = [(\lambda - \beta_s)/(\lambda - \beta_n)]\,\mu_s^m/\mu_n^m. \tag{14}$$

If $\beta_s > \beta_n$, the relative markup in country $s$ is smaller than the relative markup that would be chosen by a price-discriminating monopolist.

Factors Determining the Elasticity of Demand

The standard Ramsey and monopoly pricing rules, equations 10 and 12, have been loosely interpreted as requiring prices to be higher in countries with lower demand elasticities, because the elasticity of demand is typically not constant as a function of price. Indeed, when marginal costs are zero, these rules collapse to $\eta_i = (\lambda - \beta)/\lambda$ or $\eta_i = 1$, that is, prices should be set so that the elasticity of demand is equal across countries. In general, this requires differential prices if the relationship between price and elasticity differs by country.

This raises the question of how demand elasticities are likely to vary across countries. This section offers two illustrative answers to this question. Two important dimensions along which countries differ are income and health needs. Consumers in countries in the south are assumed, on average, to be poorer and sicker than consumers in countries in the north.

THE EFFECT OF INCOME ON DEMAND ELASTICITY.
Suppose consumer demand for a drug in country $i$ is a linear function of its price and the consumer's income,

$$y(p_i, m_i) = \begin{cases} y_0 - b\,(p_i/m_i) & \text{if } p_i < y_0 m_i/b, \\ 0 & \text{otherwise,} \end{cases} \tag{15}$$

where $y_0$ is the same for both countries and measures the amount of the drug that would be consumed if it were free and $b$ is a constant. Note that for any price at which demand is positive, demand is increasing in income and decreasing in price. But there is no difference between the countries in terms of their health needs, that is, if consumers in the two countries had the same income and faced the same prices, their demand for the drug would be the same. It is easy to show that the elasticity of demand in country $i$ is

$$\eta_i = (y_0 - y)/y, \tag{16}$$

because $\eta_i = -(p_i/y)(\partial y/\partial p_i) = b\,(p_i/m_i)/y$ and $b\,(p_i/m_i) = y_0 - y$. This elasticity is decreasing in income and increasing in price. These two properties together imply that prices should be higher in the north than in the south, even with distributionally insensitive social objectives. If equity is a concern, the optimal price differential between north and south will increase further.

THE EFFECT OF HEALTH NEEDS ON DEMAND ELASTICITY. Disease prevalence and the susceptibility of people to disease differ markedly across countries, so some pharmaceutical products will be in higher demand in some countries than in others. Demand for pharmaceuticals may also be higher in some countries because alternative medical treatments are unavailable or more costly. Perhaps the easiest way to incorporate these country differences into the Ramsey pricing analysis is to suppose that the utility that consumers in country $i$ obtain from the consumption of a drug, $y$, and other products, $x$, can be written

$$u_i(x, y) = x + v(\theta_i y), \tag{17}$$

where $\theta_i$ measures the value of the drug relative to other consumption in country $i$.21 Assuming a constant elasticity functional form for function $v$, demand elasticities in this case are

$$\eta_i = 1/\theta_i. \tag{18}$$

Countries with greater need for drugs (higher $\theta_i$) have lower demand elasticities.22 The Ramsey pricing rule, equation 7, now becomes

$$(\hat{p}_i - c_i)/\hat{p}_i = [(\lambda - \beta_i)/\lambda]\,\theta_i. \tag{19}$$

This means that the markup is greater, the greater are health needs (as measured by $\theta_i$) and the smaller is $\beta_i$. In the special case of a utilitarian social welfare function and constant marginal utility of income ($\alpha = 1$), so that $\beta_i = \beta$ for all $i$, only the first effect is relevant, and equation 19 prescribes that countries with greater needs should face higher prices. This result again highlights the inadequacy of distributionally insensitive pricing rules, because it is likely that countries with greater need for pharmaceuticals will also be relatively poorer.

21. This specification can be interpreted as follows: Consumers care about the composite good, $x$, and their level of health attainment, $h$, according to $u(x, h) = x + v(h)$. In practice, health attainment is a complicated and not well-understood function of many variables, but here suppose it is a simple linear function of the consumer's consumption of the drug, $h = \theta y$, so that utility is $u(x, y) = x + v(\theta y)$. A simple way to model heterogeneous health needs is to assume that consumers with greater health needs have a higher value of $\theta$, so that the effect of the drug on their health attainment is larger.

IV. POLICY ILLUSTRATIONS

The Ramsey prices described earlier give the highest level of social welfare while allowing an innovative firm to earn a given net revenue. How do these prices compare with the prices expected under various regulatory and patent regimes? This section uses a graphical tool to conveniently compare alternative policy choices with each other and with the benchmark Ramsey prices. The model now allows for many countries.

In many cases a given net revenue target, $R$, can be reached with a variety of international price structures.
Some might result in much of the revenue being raised in richer countries, while others might have it raised in poorer countries. With each set of prices an associated level of global welfare, $W$, is attained.

The shaded area in figure 1 depicts all the combinations of net revenue and welfare $(R, W)$ that can be obtained with different sets of global drug prices. At point A, the sales revenue raised by the firm in excess of marginal costs, $R^m$, is the amount that would be raised by a profit-maximizing monopolist able to separate markets. This maximum net revenue target can be reached only if the price in each country is the monopoly price for that country, $p_i^m$. When monopoly prices are charged in each country, social welfare is $W^m$. At the other extreme, the firm receives no revenue in excess of marginal costs and $R = 0$. There is a set of Ramsey prices associated with this revenue target, and these prices yield the highest level of social welfare, shown as point B.

22. Given the utility function in (17), at any price $p_i$, consumers in country $i$ have a demand for drugs $y_i(p_i; \theta_i)$ that satisfies the condition $\theta_i v'(\theta_i y_i) = p_i$. This condition implies that demand for the drug depends on its price and consumer needs, $\theta_i$, but not on income (admittedly an extreme assumption). The elasticity of demand is then $\eta_i(y_i; \theta_i) = -(1/y_i)\,[v'(\theta_i y_i)/\theta_i v''(\theta_i y_i)]$. In the special case where $v(h) = h^{1-\rho}/(1-\rho)$ for $\rho \in (-1, 1)$, then $-h v''/v' = \rho$ and equation 18 follows.

FIGURE 1. Revenue-Welfare Options
[Figure: welfare $W$ on the vertical axis against net revenue $R$ on the horizontal axis, with points A, B, and C marked and the levels $W^m$ and $R^m$ labeled on the axes.]
Note: All points in the shaded region, representing revenue-welfare pairs $(R, W)$, are attainable. Those on the upper boundary are attained by setting Ramsey prices.

To the right of point B, the revenue target increases, which means prices must rise. Thus the highest attainable welfare falls until point A is reached, with welfare $W^m$.
These maximum welfare levels, reached when Ramsey pricing is used to generate any given net revenue target, are depicted by the bold line.23 There is also a minimum level of welfare that can be reached for a given revenue target. These "worst possible" welfare and revenue combinations form the bottom edge of the shaded area in figure 1. For any given $R$ there are prices that can generate any intermediate level of welfare between these bounds (assuming continuity). Thus all combinations of $R$ and $W$ in the shaded area can be attained by setting different drug prices.

Finally, point C has two interpretations. If the drug is developed, point C is where the drug is priced so high everywhere that demand is zero and consequently welfare is very low. Point C also represents the sales revenue (zero) and welfare that would be obtained if the drug fails to be developed.

Limitations on Implementable Prices: Voluntary Participation

In the discussion of Ramsey pricing thus far it has been assumed that any price could be set in any country. However, policymakers often control prices only indirectly, and their options may be limited by the available policy tools and the ways prices in one country affect prices elsewhere. This section considers what prices could be implemented in practice.

23. If no value is put on the well-being of some countries, their prices can be raised first without affecting the level of social welfare and the bold line would be horizontal at low $R$.

If firms can separate markets, the country-specific monopoly price can be reached in each country by granting firms strong and well-enforced patent rights, in particular rights that are not compromised by compulsory licensing. Prices below the monopoly level can be reached by granting firms patent rights and then regulating price levels. Most developed economies follow this strategy, and price controls come in a wide variety of forms.
Arriving at a price equal to marginal cost can in principle be managed in two ways. First, the price can be directly set at that level through regulation. This, however, assumes that regulators have enough information to determine marginal cost, which is unlikely, and that firms would not respond by refusing to enter the market. Second, the price can be controlled indirectly by allowing entry by competitive generics firms. Given current TRIPS obligations, this could be ensured only through government grant of nonexclusive compulsory licenses. Of course, generics firms would need to enter the market for competition to be effective, which cannot be assumed.24 In many countries, other features of the market or regulation unrelated to patents may limit competitive entry.

What about prices below marginal cost? As seen earlier, Ramsey prices may be less than marginal cost in some countries. Firms may be willing to sell at a loss in limited circumstances. Drug donation programs are an example, although they do not fit the model here precisely because some part of the cost to the firm is offset by tax deductions. One way to implement such prices is to link sales across countries through a bulk purchase arrangement, in the spirit of the large-scale procurement of vaccine by the United Nations Children's Fund. Firms would bid to sell a given quantity of a pharmaceutical at a given uniform unit price to the intermediary, which would then sell the product to individual countries at differentiated Ramsey prices.25 Such systems, however, require an administrative infrastructure that would be costly to expand to a broad range of pharmaceuticals.
Furthermore, richer countries in the arrangement effectively subsidize losses in the poorer ones, creating a clear incentive for firms and richer countries to interact outside of the scheme.26 Even if these difficulties could be overcome, it would be hard for governments to credibly commit now to providing these sales arrangements for future new drugs and therefore difficult to use them to encourage the right level of investment today.

24. In August 2003 an agreement was reached on rules governing the export of drugs under compulsory license to supply countries lacking the ability to manufacture their own (resolving the "paragraph 6" debate). How this agreement is implemented will influence the sources of generic supply for poorer countries and thus competition among suppliers.

25. The United Nations Children's Fund purchases vaccines primarily for distribution in developing areas, so the potential cross-subsidy from richer countries is limited.

26. The obligation sometimes imposed on providers, such as the postal service, to serve unprofitable locations or consumers at subsidized prices to obtain a larger contract is an example of such an arrangement. The implicit subsidy from one group of consumers to another sometimes induces the former to opt out and purchase services from alternative suppliers, such as private postal services.

Thus, as a practical matter, it may not be possible to implement prices lower than marginal costs. Figure 2 shows what this limitation on pricing implies for the options. At point A the Ramsey price in each country is the monopoly price, which is greater than marginal cost, so this point remains feasible. When the net revenue target is zero, welfare-maximizing prices equal marginal cost in all countries only when $\beta_i = \beta$ (see section III and the appendix). On the other hand, if there is a concern for distribution, Ramsey prices may be higher than marginal cost in some countries and lower in others.
If so, restricting prices to be at least as high as marginal cost in all countries would lower welfare in comparison with the welfare attained with Ramsey prices, and point B would no longer be feasible. Figure 2 depicts the top portion of figure 1. The upper boundary of the feasible area in figure 2 now starts at point B′ and indicates the highest level of welfare possible with each revenue target when all countries must cover their marginal costs. It is below the previous boundary, implying less attractive options, until the point where the required net revenue is sufficiently high that Ramsey prices also imply $p_i \ge c_i$ in all countries.

Limitations on Price Differentiation: Arbitrage and Reference Pricing

A second and far more restrictive limitation on pricing is a global uniformity constraint. Completely uniform prices are unlikely, but there are two reasons why it might be difficult for a firm to charge markedly different prices for the same drug in different countries.

FIGURE 2. Revenue-Welfare Options with Constraints
[Figure: welfare $W$ on the vertical axis against net revenue $R$ on the horizontal axis. Three boundaries are labeled: unconstrained prices (starting at B), prices at least marginal cost (starting at B′), and uniform prices (dashed). Point A and the levels $W^m$ and $R^m$ are marked.]
Note: Limitations on prices shrink the feasible set of $(R, W)$ pairs. At $R = 0$, the highest attainable welfare is at B′, below B. Along the bold line, prices must be at least as high as marginal cost, and along the dashed line, prices must be uniform across countries.

The first is arbitrage. The Washington Post described how $18 million worth of reduced-price antiretroviral drugs meant for Africa were diverted back to the European market by black marketeers (October 3, 2002). Competition among intermediaries tends to narrow the differences in prices across countries. Without competition between intermediaries, prices may stay high in the high-priced countries with some part of the profit going to the intermediary.
Either way revenue to the innovative firm falls.27

The second factor pushing global prices together is regulatory practice. In particular, many countries' price control boards refer to prices in other countries when determining their own price ceilings (so-called reference pricing; see Jacobzone 2000).

Figure 2 illustrates how options are constrained and welfare diminished if countries are limited to a uniform price. With this restriction on pricing, the available combinations of welfare and revenue are largely limited to those on the dashed line drawn inside the constrained boundary (in bold).28 The monopoly revenue, $R^m$, is no longer possible because it requires different prices across countries.29

TRIPS

As a result of the TRIPS component of the treaty establishing the WTO, all member countries are expected to grant and enforce 20-year patents on pharmaceutical innovation. Because most rich countries already offer such protection, the main result of TRIPS is to strengthen pharmaceutical patent rights in a group of poorer countries.

Point D in figure 3 indicates a possible "pre-TRIPS" location for a pharmaceutical treating a disease with global incidence (for example, cancer). Firms have patent rights (often with price control) in most countries, with generic competition in some poor countries. Both regulation and competition push firm net revenue below $R^m$, where welfare is also lower than it need be, given $R$, because the global prices that result from the current system of uncoordinated national price regulation and free market pricing are unlikely to correspond to Ramsey prices for any social welfare function.30 The introduction of TRIPS results in a move to one of the points denoted $E_i$, $i = 1, 2, 3$. Stronger patent rights increase the net revenue on drug sales. $R$ grows very little, however, because markets in poor countries are exceedingly small despite having large numbers of people. Lanjouw (2002) estimates, for example, that countries with half the world's population account for less than 2 percent of spending on cardiovascular drugs. For this reason firms often choose not to obtain patents in poor countries even when they are able to (Attaran and Gillespie-White 2001).

27. In practice, costs of intermediation allow for some limited differentiation even when arbitrage is legal (see Ganslandt and Maskus 2004 for evidence within the European Union). Intermediation becomes considerably more difficult if transshipments are illegal.

28. In addition, the set of possible $(R, W)$ points will include some single points below the boundary, because the nonmonotone relationship between profit and price varies across countries.

29. The European Union initiated an enforcement effort to enable firms to differentiate ("tier") prices to the benefit of poor countries without being damaged by illegal arbitrage. It allows firms to register approved tier-priced drugs and mark them with an identifiable logo. Customs authorities are directed to detain marked products as they come into the European Union. See http://europa.eu.int/comm/trade/csc/med08_en.htm (accessed May 26, 2003).

30. See Scherer and Watal (2002) on antiretroviral drugs. Prices may not correspond closely to income levels because price regulations are less effective in poorer countries or because it can be profitable for a firm to target only the upper class in a poor country with a very unequal income distribution.

FIGURE 3. Revenue-Welfare Implications of TRIPS
[Figure: welfare $W$ on the vertical axis against net revenue $R$ on the horizontal axis, with points A, B, B′, D, E1, E2, and E3 marked and the levels $W^m$ and $R^m$ labeled on the axes.]
Note: The effect of TRIPS is to induce a move from point D to a point like either E1 or E2, where welfare is lower and revenue is marginally higher. An alternative would be to move to a point like E3, which allows the same increase in revenue, but with a positive impact on welfare.
TRIPS also changes the structure of contributions so that a greater share of total net revenue comes from sales in poorer countries. Because in some countries prices are higher with TRIPS but nowhere are prices lower, welfare certainly decreases. How much welfare has fallen, however, depends on the social welfare function used. Because relatively poor countries have higher prices as a result of TRIPS, a welfare function with any aversion to inequality would suggest that welfare falls steeply, as indicated by point E1 in figure 3. But if there is little concern for distribution, the welfare fall may be moderate, as indicated by point E2.

Figure 3 illustrates an important question regarding the purpose of TRIPS. If the goal is to increase the relative share of global research costs paid by poor countries to be somehow fair, it seems reasonable to keep net revenue at the pre-TRIPS level of point D by pairing the new patent regime with stronger price control in the rich countries (that is, lower prices). Alternatively, if the purpose is to slightly increase the net revenue received by an innovative firm from a given product, it seems worthwhile to consider whether strengthening patent rights in poorer countries is the best option. There are, after all, many alternatives: Targeting point E3, for example, would move prices in rich countries closer to Ramsey prices. At the most basic level, this means that countries with similar income levels and demand patterns should have similar markups.

Tradeoffs for a product that is specific to developing economies must also be considered. Suppose, for simplicity, that a product, such as a vaccine for malaria, does not yet exist and would have no market outside of developing areas (point C in figure 1).
Clearly, if nonsales revenue sources of research support are enough to have the product invented (that is, if $T$ is large enough), there is no reason to allow a profit margin on sales, and optimal prices would lead to point B. In this situation granting patent rights in developing economies would be damaging to welfare. However, it seems unrealistic to assume that this will always be the case. Indeed, health advocates often stress the enormous gap between the human suffering caused by developing area-specific diseases and the relatively low level of public and philanthropic investment to discover products to treat them.31 That is, point B simply may not be possible; there may need to be some contribution from consuming countries in the form of net revenue for the desired innovation to occur. Without any additional incentive society might remain at point C. Thus, for products specific to the developing world there is some rationale for having patents in the poorer countries.32

V. CONCLUSION

Techniques of modern public finance have been employed to consider how pharmaceutical prices should be set in a global context. In particular, how concern about the extreme inequality in the distribution of world income leads to adjustments to standard pricing prescriptions has been considered. With these adjustments poor countries should not necessarily cover their own marginal costs of drug production and distribution. In particular, poor countries should not necessarily share in any of the costs of R&D. Also, the pricing structure is not related to what would be chosen by a monopolist in a simple (proportional) way. Both of these results are at odds with standard analyses that do not take into explicit account distributional concerns.

Care has been used here to distinguish between general tax sources for financing R&D incentives and sales revenues, although little has been said about what the split between these two sources should be.
It is explicitly recognized that private R&D cannot be treated as free, and although there might be ways of limiting the economic rents earned by pharmaceutical companies (for example, through various contractual mechanisms), in the end the costs of R&D must fall either on taxpayers in general or consumers of the product.

31. See, for example, the reports of the Médecins Sans Frontières Working Group on Drugs for Neglected Diseases at www.accessmed-msf.org/dnd. It is estimated that almost a million children die each year from malaria.

32. See Lanjouw (2002, 2003) for a mechanism that allows different global patent rights for different diseases.

A serious problem that arises with most development assistance programs is within-country targeting. This is avoided by assuming that all individuals in a given country are identical, but in practice there is likely to be a concern that some of the benefits of low drug prices in poor countries might accrue to local elites. These benefits might accrue to them as consumers, and members of the elite might also be able to appropriate some of the benefits of lower (import) prices if they act as intermediaries. Nevertheless it is expected that significant benefits would often reach the poor, especially if low prices are implemented through competition and not by regulation.

Finally, the framework has been used to examine ongoing debates on the international patent system and global pricing. In particular, the very small net revenue increase that TRIPS might afford pharmaceutical companies (thereby strengthening R&D incentives) comes at the cost of shifting a greater share of the burden onto poorer countries. The same increase in incentives could be implemented in an alternative fashion with a positive welfare effect.

APPENDIX

Figure A1 shows that in the special case where $\beta_i = \beta$ optimal prices are such that the price in each country is at least as high as $c_i$, the marginal production and distribution cost. It also shows that when the $\beta_i$'s are not the same across countries, the price in one country may be lower than $c_i$.

FIGURE A1. Marginal Social Costs of Revenue Generation in Countries n and s
[Figure: two panels, Panel A and Panel B, each plotting the marginal social cost curves $dSC_n/dR_n$ and $dSC_s/dR_s$ against net revenue $R$, with the contributions $R_n^*$ and $R_s^*$ marked on the horizontal axis.]

Return to the discussion immediately following equation 5 where the marginal social cost of raising additional net revenue in country $i$ was the ratio of $-(\partial w/\partial v_i)(\partial v_i/\partial p_i)$ to $y_i + (p_i - c_i)(\partial y_i/\partial p_i)$. Using equation 3, this ratio is

$$\frac{MSC_i}{MR_i} = \frac{\beta_i y_i}{y_i + [p_i - c_i][\partial y_i/\partial p_i]}. \tag{20}$$

This ratio is the change in social cost from raising an additional dollar in country $i$, $dSC_i/dR_i$, at a price, $p_i$. At the optimal prices, the marginal social cost of raising revenue is equalized across countries: $dSC_s/dR_s = \lambda = dSC_n/dR_n$. Note that when the price in country $i$ is just equal to production cost $c_i$ there, equation 20 collapses to

$$\frac{MSC_i}{MR_i} = \frac{dSC_i}{dR_i} = \beta_i. \tag{21}$$

Two cases are shown in figure A1. In panel A the length of the horizontal axis represents the total net revenue, $R$, to be raised. The amount raised in country $n$ is measured from the left and that in country $s$ from the right. The horizontal dashed line is the marginal cost, which is assumed to be the same across countries: $c_n = c_s = c$. Generating revenue in either country imposes a social cost. First consider the social cost when revenue is raised in the south. When $p_s = c$, revenue from the south is zero (the right edge of the figure). Equation 21 shows that the marginal social cost of generating an additional unit of revenue at this point is $\beta_s$. The cost increases moving left, as more revenue is generated in country $s$ through increases in $p_s$. The same applies for the north starting from the left edge of the figure. This gives the two marginal social cost curves.
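The equalization condition can be illustrated numerically. The following is a minimal sketch rather than the procedure used in the article: it assumes linear demand $y_i = a_i - b_i p_i$ with $a_i > b_i c$, a common marginal cost $c$, and made-up parameter values, and the names `Country`, `ramsey_price`, `net_revenue`, and `solve_ramsey` are hypothetical. Under these assumptions, setting the ratio in equation 20 equal to $\lambda$ and solving for the price gives a closed form, and bisection on $\lambda$ finds the value at which total net revenue meets the target $R$.

```python
from dataclasses import dataclass


@dataclass
class Country:
    a: float     # demand intercept in the assumed linear demand y = a - b * p
    b: float     # demand slope (assumed, not taken from the article)
    beta: float  # welfare weight beta_i on the country's consumers


def ramsey_price(country, c, lam):
    """Price at which beta * y / (y + (p - c) * dy/dp) equals lam (equation 20)."""
    a, b, beta = country.a, country.b, country.beta
    return (lam * (a + b * c) - beta * a) / (b * (2.0 * lam - beta))


def net_revenue(country, c, p):
    """Sales revenue in excess of marginal cost at price p."""
    return (p - c) * (country.a - country.b * p)


def solve_ramsey(countries, c, target, iters=200):
    """Bisect on lam until total net revenue equals the target.

    With linear demand, each price (and total net revenue) is increasing
    in lam for lam above half the largest beta, so bisection applies.
    The target must be below total monopoly net revenue.
    """
    lo = 0.5 * max(k.beta for k in countries) + 1e-9
    hi = 1e9
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        total = sum(net_revenue(k, c, ramsey_price(k, c, mid)) for k in countries)
        if total < target:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return lam, [ramsey_price(k, c, lam) for k in countries]


if __name__ == "__main__":
    # Illustrative numbers only: identical demand, different welfare weights.
    north = Country(a=10.0, b=1.0, beta=1.0)
    south = Country(a=10.0, b=1.0, beta=2.0)
    lam, (p_n, p_s) = solve_ramsey([north, south], c=2.0, target=5.0)
    print(f"lambda={lam:.4f}  p_n={p_n:.4f}  p_s={p_s:.4f}")
```

With these illustrative numbers the weight on the south is larger, so the computed price in the south falls below marginal cost while the price in the north exceeds it; with equal weights both prices lie above marginal cost.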
In panel A, $\beta_n = \beta_s$, so the curves cross at some point in the middle of the figure, meaning that both north and south contribute to the revenue requirement in amounts $R_n$ and $R_s$, respectively, and prices in each are above marginal cost. In panel B it is assumed that $\beta_n < \beta_s$. Now the two cost curves can cross at a point outside the interval shown, meaning that the net revenue requirement, $R$, should be shared between the two countries by having country $n$ contribute $R_n > R$, while country $s$ receives a subsidy in the amount of $R_s = R_n - R$, by paying less than the marginal cost of production.

REFERENCES

Atkinson, Anthony, and Joseph Stiglitz. 1976. "The Design of Tax Structure: Direct versus Indirect Taxation." Journal of Public Economics 6:55–75.
------. 1980. Lectures on Public Economics. New York: McGraw-Hill.
Attaran, Amir, and L. Gillespie-White. 2001. "Do Patents for Antiretroviral Drugs Constrain Access to AIDS Treatment in Africa?" Journal of the American Medical Association 286(15):1886–92.
Baumol, William, and David Bradford. 1970. "Optimal Departures from Marginal Cost Pricing." American Economic Review 60:265–83.
Bös, Dieter. 1986. Public Enterprise Economics: Theory and Application. New York: North-Holland.
Danzon, Patricia M. 1997. "Price Discrimination for Pharmaceuticals: Welfare Effects in the US and the EU." International Journal of the Economics of Business 4(3):301–21.
------. 2001. "Differential Pricing for Pharmaceuticals: Reconciling Access, R&D, and Patents." Working Paper WG2: 10. World Health Organization, Commission on Macroeconomics and Health, Geneva.
Diamond, Peter. 1975. "A Many-Person Ramsey Tax Rule." Journal of Public Economics 4:335–42.
Diamond, Peter, and James Mirrlees. 1971. "Optimal Taxation and Public Production II: Tax Rules." American Economic Review 61:261–78.
Ganslandt, Mattias, and Keith Maskus. 2004. "The Price Impact of Parallel Imports in Pharmaceuticals: Evidence from the European Union." Journal of Health Economics 23(5):1035–57.
Jacobzone, S. 2000. "Pharmaceutical Policies in OECD Countries: Reconciling Social and Industrial Goals." Labour Market and Social Policy Occasional Paper 40. Organisation for Economic Co-operation and Development, Paris.
Lanjouw, Jean O. 2002. "A Patent Policy for Global Diseases: U.S. and International Legal Issues." Harvard Journal of Law and Technology 16(1):85–124.
------. 2003. "Intellectual Property and the Availability of Pharmaceuticals in Poor Countries." Innovation Policy and the Economy 3:91–130.
Nordhaus, William. 1969. Invention, Growth and Welfare. Cambridge, Mass.: MIT Press.
Ramsey, Frank. 1927. "A Contribution to the Theory of Taxation." Economic Journal 37:47–61.
Scherer, F. M. 1972. "Nordhaus' Theory of Optimal Patent Life: A Geometric Re-interpretation." American Economic Review 62(3):422–27.
------. 2003. "Global Welfare in Pharmaceutical Patent Policy." Princeton University, Woodrow Wilson School of Public and International Affairs, Princeton, N.J.
Scherer, F. M., and Jayashree Watal. 2002. "Post-TRIPS Options for Access to Patented Medicines in Developing Nations." Journal of International Economic Law 5(4):913–39.
Washington Post. 2002. "HIV Drugs for Africa Diverted to Europe." October 3.
World Bank. 2003. World Development Report 2004: Making Services Work for Poor People. New York: Oxford University Press.

Prices and Unit Values in Poverty Measurement and Tax Reform Analysis

John Gibson and Scott Rozelle

Researchers often use unit values (household expenditures on a commodity divided by the quantity purchased) as proxies for market prices when calculating poverty lines and estimating consumer demand equations. Such proxies are often needed because community price surveys in developing economies are either absent or suffer quality problems.
However, using unit values may result in biases due to measurement error and quality effects. In a household survey experiment, information on prices was obtained in three ways: from unit values, from a market price survey, and from the opinions of householders who were shown pictures of items and asked to report the local price. The three sets of price data are used to calculate poverty lines, estimate price elasticities, and analyze marginal tax reforms. There are substantial biases when unit values are used as a proxy for market price, even when sophisticated correction methods are applied. Performance was better for the price opinions of household members. The results highlight the importance of price collection methods and the need to consider the wider costs of having potentially unreliable community-level price data.

John Gibson is a professor in the Department of Economics at the University of Canterbury, New Zealand; his email address is john.gibson@canterbury.ac.nz. Scott Rozelle is a professor in the Department of Agricultural and Resource Economics at the University of California, Davis; his email address is rozelle@primal.ucdavis.edu. The authors are grateful for assistance and helpful comments from Chris Hector, Tim Maloney, Susan Olivia, Berk Özler, Steven Stillman, three anonymous referees, and seminar audiences at Canterbury University and the Northeast Universities Development Consortium Conference. Data for this study were originally collected as part of a World Bank poverty assessment for Papua New Guinea, for which financial support from the governments of Australia, Japan, and New Zealand is gratefully acknowledged.

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 1, pp. 69–97. doi:10.1093/wber/lhi002. © The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.

Prices are important. Economists need good measures of prices to conduct studies for many applications in developing economies. For example, they need matrices of own- and cross-price elasticities of demand for constructing computable general equilibrium models for trade policy analysis (Minot and Goletti 2000). Effective reform of indirect taxation and subsidy regimes requires accurately estimated price elasticities to predict changes in public expenditure and tax revenues as demand changes following subsidy or tax rate shifts (Ahmad and Stern 1991; Laraki 1989). Elasticities also are needed to account for the welfare effects of economic crises because first-order approximations that ignore consumer substitution can greatly overstate welfare losses (Friedman and Levinsohn 2002). Poverty analysts need accurate and timely price data to ensure that poverty lines correspond to the actual change in the cost of living for poor people; this issue has affected recent debates about poverty reduction in India (Deaton 2003).

Surprisingly few studies systematically collect price data, despite their widespread importance. State statistical bureaus in countries such as China, Indonesia, and Pakistan do not collect market price data that can be matched to their rural household income and expenditure surveys. Consequently, in some countries, such as Lao People's Democratic Republic and Pakistan, Poverty Reduction Strategy Papers use poverty estimates based on assumed levels of rural prices. Even research-driven surveys suffer from a lack of price data. The Indonesia Family Life Survey collected a tremendous amount of data from households and communities, including expenditures on 37 food items, but market price surveys were carried out for only 9 foods.
This incomplete information on prices makes it difficult to reliably measure the inflation rate that Indonesian households faced during the Asian economic crisis of the late 1990s and may contribute to the large discrepancy between the poverty increases implied by the survey price data and the increases implied by the official (urban) inflation rates (Beegle and others 1999).

Even in the well-funded and comprehensive Living Standards Measurement Study (LSMS) surveys, there have been problems in gathering prices:

In most previous LSMS surveys, interviewers have collected price data by visiting markets and vendors and asking the price of particular goods. . . . Another possible way to collect prices would be to ask community informants or a sub-sample of household informants about prices. Given how little is known about how to collect data on community-level prices and how many problems there have been in past LSMS studies, it is recommended that both methods be used. (Frankenberg 2000, p. 329; emphasis added)

Community-level prices of the type collected in most LSMS surveys may be unreliable because they are gathered from the wrong market or for the wrong specification of goods or because the prices quoted are not the prices actually paid by local residents (Deaton and Grosh 2000). Indeed, in some LSMS surveys either the market price data have never been released because of quality problems (for example, Tajikistan) or analysts have been forced to discard some of the prices.1

This poor track record for collecting price data may not be surprising. In the rural areas of many developing economies, it is hard for outsiders to find, understand, and study markets. Markets may assemble intermittently, at different places on different days, and often at very early hours. Perhaps because managing the traditional part of the data collection effort (household expenditures) is already logistically difficult, adding another part to the survey (for collecting prices) with its own complications may cause overall survey quality to decline. The problems are likely to be most apparent in countries with poor infrastructure and low population densities, the very places where price policy can be an important tool for government because of the high per capita administrative cost of income interventions.

1. In Côte d'Ivoire, the price of canned tomato paste had to be used as a substitute for all nonfood prices, which were poorly measured (Glewwe 1991).

Without good price data, economists have had to turn to imperfect proxy measures, such as unit values (the ratio of household expenditure on a particular good to the quantity consumed).2 Unit values have recently been used in calculating purchasing power parity exchange rates (Deaton and others 2004), calculating and updating poverty lines (Deaton 2003), assessing household welfare changes from trade liberalization (Nicita 2004b) and economic crises (Friedman and Levinsohn 2002), analyzing indirect tax and subsidy reforms (Deaton and Grimard 1992; Nicita 2004a), and assessing the distributional and nutritional impacts of devaluation (Minot 1998).

In some applications, however, such as demand studies, the use of unit values is believed to give biased results (Deaton 1997). In contrast to market prices, unit values reflect household-specific quality and reporting error effects and are subject to sample selection effects because they are unavailable for nonpurchasing households. Even procedures developed by Deaton (1990) to correct these biases have been shown to produce inaccurate and imprecise results (Gibson and Rozelle 2002). Alternative strategies, such as using more readily available urban price series as proxies for the prices faced by rural households, also may cause bias (Alderman 1988).

I. HOUSEHOLD SURVEY EXPERIMENT

Because these types of problems appear to be pervasive, an experiment was devised during a survey in Papua New Guinea to test three alternative ways of collecting price data: from the unit values implicit in household expenditure data, from a market price survey (conducted by making repeated trips to the market and surveying traders), and from the opinions of household respondents who were shown pictures of various items and asked to report the local price.

The picture-based methodology has several potential advantages over unit value-based approaches. Because it is easy to show pictures to all households and ask for their price estimates, there are likely to be fewer missing observations. More important, any measurement error in these price opinions should not be correlated with actual demands. Finally, biases due to quality effects should be smaller, because everyone sees and is responding to the same picture.

2. In some applications it is also possible to substitute assumptions for data. For example, researchers often use additivity assumptions, such as in the linear expenditure system, to get price elasticities from household budget data, without using any prices. But additive preferences imply that expenditure and own-price elasticities are roughly proportional, forcing a tradeoff between equity and efficiency and leading to recommendations of uniform rates of commodity taxes regardless of the patterns in the data (Deaton 1997).

The prices from the market price survey are used to assess the two price proxies. Although somewhat innocuous, such a preference for relying on market price surveys is not always apparent in the literature (Deaton and Grosh 2000). This article explicitly assumes that prices for well-defined items collected from market surveys using certain sampling rules are the appropriate standard.
Although in some cases there may be reasons to worry about the quality of market prices, three features of the case study country used here increase the reliability of the market price surveys. First, villages are small and almost every village visited had a well-defined market. Second, haggling is uncommon in markets in Papua New Guinea. Moreover, several of the selected products are sold only in trade stores and supermarkets, where transactions always take place at the listed prices. Thus the prices observed by enumerators are likely to be the prices actually faced by households in the survey. Third, there is very little quality variation in many of the foods consumed in Papua New Guinea, which are often branded products with well-defined package sizes. Often a single brand supplies the whole market either because of local monopolies or because a dominant importer controls port and distribution facilities. Even for the foods that are produced and marketed by the informal sector, there is little quality variation within markets (as will be shown), so significant variation in quality between markets seems unlikely.

Although the experiment relates to a single country, the findings may be of wider interest. This appears to be the only systematic attempt to test an idea that was proposed early in the development of the LSMS surveys: to obtain price data by interviewing groups of housewives (Saunders and Grootaert 1980).3 This strategy was never implemented, in part because subsequent LSMS reports were critical of this "novel but risky" idea (Wood and Knight 1985). The main concerns were that such price opinions could be biased by differences in bargaining skill, uncertainty about the reference period (which matters in inflationary environments), and the lack of a representative sample. The experiment reported on here overcomes several of these shortcomings.
It is based on a representative sample of households, each shown a defined specification (a photograph) and asked to report the current price. These price opinions do not vary with observable household characteristics, so the concern about bias due to differences in bargaining skill may be misplaced.

Also, this is one of only two studies to demonstrate empirically the magnitude of the bias from using unit values as proxies for market prices. Surprisingly, despite the widespread reliance on unit values and despite the plea by Deaton (1990), there has never been a "crucial experiment" in which results calculated from market price data are compared with the results from either naive or corrected unit value procedures. The literature seems to include only one article that compares poverty estimates with poverty lines priced with unit values and market prices (Capéau and Dercon 1998).4 The current study goes further by having three types of prices and by looking at the effect on estimated demand elasticities and marginal tax reform calculations as well.

3. This data-collection strategy has recently been used in the Indonesia Family Life Survey, with price opinions collected from key informants (the Ibu PKK women's groups). However, comparisons of those prices with prices collected from market surveys do not seem to be available.

II. DATA COLLECTION

Data for this study come from the Papua New Guinea Household Survey, which was designed and supervised by the authors in 1995 and 1996, with fieldwork taking place over 12 months. The survey covered a random sample of 1,200 households residing in 73 rural clusters (each providing 12 households to the sample), 40 clusters from the capital city (6 households each), and 7 clusters from smaller urban areas (12 households each).

Market prices were collected in each cluster using two different surveys.
The prices of 14 commercially produced food items (such as rice, sugar, and beer) and 9 nonfood items (such as soap and kerosene) were collected from the two main trade stores or supermarkets used by households in the cluster. These prices typically were for a finely defined specification (for example, a 1 kg bag of Trukai brand rice). For four of the foods and one of the nonfood items, the prices covered two different specifications of the same commodity (for example, a bottle of beer and a carton of beer); simple averages of the prices of the two specifications were used. The second market survey collected the prices of 11 locally produced foods from the nearest local market; for 1 food (bananas), prices were collected for two different varieties. Enumerators recorded the price and weight of up to six different lots of each commodity (drawing the sample from different sellers). The market price survey took place on two different days in each cluster; potentially, up to 12 observations are available on the price of each food for a given market.

The unit values were obtained from a closed-interval consumption recall. After an initial interview to signal the start of the consumption recall period, enumerators revisited the households about two weeks later and asked respondents to recall the value and quantity of all purchases, gifts, and own-production since the initial interview. This recall covered 36 categories of food and 20 categories of other frequent expenses.5 The unit values are calculated as the ratio of purchase values to purchase quantities.

The data-collection methods affect the unit values in the survey in two important ways. First, the unit values are for the same period as the market price survey. In contrast, in some LSMS surveys (for example, in Vietnam) the unit values cover a 12-month period, which would weaken any comparison with current market prices. Second, to reduce problems stemming from a failure to understand metric quantities, all households received during the first interview an empty 25 kg sack, with graduations of 1/4, 1/2, and 3/4 marked on the outside to use in recording food volumes. This unit was recommended for bulky root crop staples and was used in more than 90 percent of cases.6 The other main unit used was a simple count of the number of items, recommended for items like coconuts and betelnuts and for livestock. Average volume to weight (and count to weight) conversion factors were established from weighing trials conducted in all regions of the country. Although crude compared with the ideal of weighing all items consumed, these procedures avoid the problem of enumerators and respondents using idiosyncratic conversion factors and so reduce the relevance of the Capéau and Dercon (1998) procedure.

4. In fact, the main point that Capéau and Dercon discuss is not the comparison of market prices and unit values but rather how to collect data on crops for which households have difficulty converting from their traditional units of measure (what the farmer knows) to kilograms (what the economist needs). The data-collection methods in the Papua New Guinea survey make this conversion issue less of a problem.

5. In addition to these short period measures of consumption, the estimate of each household's total expenditure used an annual recall of 31 categories of infrequent expenses and an inventory of durable assets, which provides estimates of the flow of annual services from durables and dwellings.

The picture method data come from price opinions gathered from each household for 15 food items (including beverages) and 3 tobacco products.
Because six of the food items were alternate specifications of a particular food (for example, a bottle and a can of soft drink), the pictures refer to nine categories of food. On average, these nine foods constitute 30 percent of the household's total consumption expenditure, with individual budget shares ranging from 11 percent (sweet potato) to 1 percent (flour, biscuits, and soft drinks).

Central to the enumeration process was a series of 18 high-quality color photographs, taken by professionals, that were shown to respondents. The photos showed each food item in the typical bundle, pile, or package found in markets. For foods where scale was important, a box of matches was included in the photograph (see examples in figure 1 for the four items with the largest budget shares: sweet potato, banana, betelnut, and rice).7 Interviewers were instructed to ask the following question when showing the photographs, which was done at the conclusion of the second visit: "How much does it currently cost to buy a [item] like this in the main market or store in this village or town?" The questions about food were directed to the person in the household who typically buys most of the food, and the questions about drinks, betelnut, and tobacco to the person who makes most of these purchases.

Respondents reported their opinion about the price of what they saw in the photographs, and reported prices were transformed to kilogram prices at the analysis stage, using the actual weights of the items in the photographs.8 Respondents were expected to map a two-dimensional picture into volumes and weights and to form an assessment of quality based on the photograph.9 Pretesting showed that respondents were good at this: Reported prices based on pictures were close to the prices reported when respondents were shown the actual items instead of the picture. Actual products were not used in the main survey because the interview teams would be burdened by carrying bulky products and the same product could not be used simultaneously in different survey locations, so quality variations could be introduced into the price opinions.10

6. The average Papua New Guinea household consumes almost 100 kg of root crops every two weeks, so the sacks were filled several times during the recall period. This should reduce errors due to the relatively coarse graduations used.

7. Full color versions of the pictures in figure 1 can be viewed on the publisher's Web site.

8. No attempt was made to force respondents to report a price in kilograms, which are not widely used in markets in Papua New Guinea. This does not mean that people are unaware of size differences; they just use different terminology. For example, canned fish comes in three sizes and the smallest size (155 g) is known in the local vernacular as "battery" because its shape resembles that of a D-size battery.

9. This was fairly straightforward for trade store products because quantities are indicated on the packaging and were visible in the photographs, and quality is easily known from the brand name. But even for fresh produce the picture conveys quality information. For example, people could tell from the color and size of the individual tubers in which region of Papua New Guinea the sweet potato had been grown.

10. However, analysis suggests that there is little quality variation in the goods used and in the valuations that respondents placed on those goods.

FIGURE 1. Examples of Photographs Used for Eliciting Price Opinions
Source: Papua New Guinea Household Survey 1995/96.

III. UNIT VALUES, PRICES, AND PRICE OPINIONS

The data-collection effort provided three different measures of price (market prices, price opinions, and unit values) for nine foods (sweet potatoes, bananas, rice, betelnuts, flour, biscuits, canned fish, soft drinks, and beer). With surveys that have just one measure of price, analysts are often forced to use unit values even though they are not a direct substitute for genuine price data.11 The survey data were used to answer two questions: Are the problems in using unit values as a measure of price large enough to justify the expense of collecting additional information on prices? If this additional information is needed, do price opinions have smaller problems than unit values?12 Negative answers to both questions would suggest that current procedures using unit values are appropriate. Positive answers to both would suggest that some innovation in data-collection methods is needed along the lines of the photo-guided price opinions. Finally, if additional information on prices is needed but price opinions perform poorly, greater investment may be needed in properly carrying out community price surveys.

This section reports some simple descriptive analyses that may help answer these two questions. To guard against outliers affecting the analysis, the survey forms were reexamined and data entry errors and obvious miscoding (such as kg entered as g) were rectified or removed. Following the rule of Cox and Wohlgenant (1986), unit values and price opinions more than five standard deviations from their respective means were also removed to further reduce outlier effects. This procedure removed 23 unit values (of 4,550) and 25 price opinions (of 9,100), a proportionately greater trimming of the unit values.

Even after outliers were trimmed, the unit values appear to be fairly noisy and biased measures of market prices.
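The five-standard-deviation trimming rule can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `trim_outliers`, the use of the sample standard deviation, and the example values are assumptions, and the rule is applied here to one commodity's series at a time.

```python
from statistics import mean, stdev


def trim_outliers(values, k=5.0):
    """Keep observations within k standard deviations of the series mean.

    Assumption: the sample standard deviation is used; the article does not
    say which variant was applied.
    """
    if len(values) < 2:
        return list(values)
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        # No variation at all, so nothing can be an outlier.
        return list(values)
    return [v for v in values if abs(v - centre) <= k * spread]


# Hypothetical unit values (toea per kg) for one food, with one miscoded entry.
unit_values = [58.0, 60.0, 62.0] * 20 + [6000.0]
trimmed = trim_outliers(unit_values)
print(len(unit_values), "observations before trimming,", len(trimmed), "after")
```

Applied separately to each food's unit values and price opinions, a rule of this kind removes only extreme observations, which is consistent with the small share of records trimmed in the survey data.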
The correlations between household-specific unit values and market prices range between 0.38 and 0.59 for sweet potatoes, bananas, and rice, the three foods with the largest budget shares.13 Examining deviations from the 45-degree line in price plots also demonstrates the low correlations for the major food commodities (figure 2). The correlations for the major food commodities, however, are still higher than those for the six other food commodities (r = 0.37; results not shown).14 Unit values also appear to be biased measures of mean market prices, according to the ratio, $x_{uv}/x_p$. The average unit value overstates the average market price by about 30 percent for sweet potatoes and bananas, the two most common locally produced foods.

11. An exception is Minot and Goletti (2000), who estimate a demand system for 14 foods (in the context of a study of trade liberalization in Vietnam), where unit values are used for 7 of the foods and market prices for the other 7. This use in the same demand system implies a direct substitutability between the two types of price data.

12. Unit values are likely to be collected in many surveys anyway, because of the interest in quantities (for example, for studies of nutrition), so picture prices might reduce problems by substituting for unit values or complementing them by acting as an instrument.

13. These correlations should not be seen as either atypically low or as reflective of the unusual conditions in Papua New Guinea. A comparison of market prices and unit values for 33 items in the 1997/98 Vietnam Living Standards Survey (VLSS) yields an average correlation of only 0.25 (Gibson and others 2002). Using a more restricted set of foods, and data from the 1992/93 VLSS, Deaton and Grosh (2000) report a median correlation of 0.34. A caveat to both comparisons is that the unit values in the VLSS are meant to refer to the previous 12 months, whereas the market prices are from the month when the household was actually surveyed.

14. The correlations with market prices are even lower for the unit values applied to self-produced foods (r = 0.35) and for the unit values for gifts received (r = 0.36). There is also little agreement among the different types of unit values: For households that both purchased and produced either sweet potatoes, bananas, or betelnut, the average correlation between the two types of unit values is only 0.26. For those that both purchased and received gifts, the average correlation is 0.43.

FIGURE 2. Comparisons of Market Prices and Household-Specific Unit Values and Price Opinions
[Scatter plots, one pair of panels per food; horizontal axis: market price; vertical axis: unit value (first panel) or picture price (second panel).]
Sweet potato: unit value r = 0.58, $x_{uv}/x_p$ = 1.26; picture price r = 0.52, $x_{pp}/x_p$ = 0.94
Banana: unit value r = 0.38, $x_{uv}/x_p$ = 1.31; picture price r = 0.48, $x_{pp}/x_p$ = 0.95
Rice: unit value r = 0.59, $x_{uv}/x_p$ = 0.94; picture price r = 0.79, $x_{pp}/x_p$ = 1.01
Note: Prices are in toea per kilogram (130 toea = US$1 in 1996). The 45-degree line shows the points where market prices equal unit values (or picture prices).
Source: Authors' computations based on data from Papua New Guinea Household Survey 1995/96.

Photo-guided price opinions appear to provide a better measure of market prices. For the same households as for the unit value analysis, the scatter plots of market prices and price opinions are distributed more symmetrically around the 45-degree line and the ratio of means of the two price series, $x_{pp}/x_p$, is closer to 1, ranging from 0.94 to 1.01 (see figure 2).
The correlations with market prices range from 0.48 to 0.79 for the three major foods. The average correlation coefficient between price opinions and market prices for the six minor food commodities is also higher, r = 0.64 (compared with r = 0.37 for the unit values).

There are several reasons why price opinions and especially unit values may be imperfect measures of market prices. Both may contain quality effects, although these tend to be small in the data, particularly for the price opinions (see later discussion). The specification of items may differ for the pictures, the market price surveys, and the unit values. But for the foods where a clear comparison is possible, there is no evidence that such a discrepancy between the specifications for the unit value and the market price surveys contributes to the low correlation.15 Finally, both price opinions and unit values are subject to reporting error, and it could be that the errors are greater for unit values.

If all three series (market prices, unit values, and price opinions) are treated as error-ridden measures of true but unknown community prices, the intracluster correlations among each measure can provide an estimate of the "reliability ratio," the proportion of the variance of the observed price series that is not due to measurement error. The intracluster correlation is systematically lower for unit values (after the effects of quality have been purged) than for market prices and price opinions. The average value of the correlation coefficients across the nine foods is only 0.38 for unit values, compared with 0.78 for market prices and 0.65 for price opinions. By this analysis then, unit values are the least reliably measured, although there is imperfect reliability for all the price measures.

Figure 2 suggests that a few households may generate a disproportionate share of the reporting error bias in both price opinions and unit values.
The use of cluster averages can reduce this source of bias and indeed results in improved correlation between unit values and market prices, although the unit values still tend to be noisier measures than the price opinions (table 1, columns 6 and 7). The average correlation of cluster-level unit values and market prices is 0.63, and the average correlation for price opinions is 0.77.16

15. For example, the brand of rice used for the market price survey (Trukai) accounted for 86 percent of rice sales in Papua New Guinea in 1996, and most of those sales were for the specified 1 kg pack size, according to Neville Whitecross of Trukai Industries, Port Moresby. The correlation between unit values and market prices is almost the same for households that report purchasing only 1 kg of rice during the recall period (r = 0.61) as it is for other households (r = 0.57). Thus, even when the pack size for the unit value corresponds to that of the market price survey, there is a low correlation between unit values and market prices, suggesting that reporting errors are important. The intracluster correlation in the rice prices collected from the market survey is 0.82, so variation in the prices charged by different trade stores within each cluster is unlikely to account for the discrepancy. Moreover, this variation in market prices within a cluster would also affect the calculated reliability of the price opinions, so it cannot account for the relatively poor performance of the unit values.

16. The average correlation is no higher (r = 0.63) if a more broadly defined unit value is formed, based on the ratio of the combined value of purchases, net gifts received, and own-production to the combined quantity.

TABLE 1. Descriptive Statistics for Cluster-Level Market Prices, Unit Values, and Price Opinions

                  Mean market  Mean unit  Mean price   Clusters with data on(b)     Correlation with market prices
Product           price(a)     value(a)   opinion(a)   Unit values  Price opinions  Unit values  Price opinions
Sweet potatoes    43.9         59.0       42.5         93           118             0.74         0.74
Bananas           54.2         75.9       51.3         92           118             0.65         0.71
Rice              114.7        107.3      115.5        114          118             0.75         0.93
Flour             143.6        114.9      158.3        95           116             0.43         0.72
Biscuits          444.4        450.0      452.4        112          118             0.50         0.83
Canned fish       432.7        437.0      422.7        115          118             0.42         0.56
Betelnut          510.8        566.0      419.9        107          117             0.63         0.64
Soft drink        272.8        263.3      287.9        100          118             0.73         0.91
Beer              558.3        507.0      586.8        63           116             0.86         0.93

Source: Authors' computations based on data from Papua New Guinea Household Survey 1995/96.
a. Toea per kg, as calculated from cluster-level averages; 130 toea = US$1 in 1996.
b. Of a possible 120 clusters.

Averaging by cluster, however, does not remove the bias that occurs when unit values are used to calculate average market prices (see table 1). On average, the mean price for each food and the mean of the cluster-level unit values for the same food differ by 14 percent (calculated for each food as $|x_{uv} - x_p| / x_p$). In contrast, the average error is only 6 percent for the price opinions. Hence, the conclusion that unit values are more biased measures of average market prices holds even for the cluster-level estimates.

In addition to being a biased and noisy measure of market prices, unit values exhibit a further statistical problem that becomes apparent when the cluster means are formed. A cluster mean unit value is available only when at least one household in a cluster makes a purchase during the recall period. When no households make such a purchase, a sample selection problem occurs. This can be a serious problem for some commodities.
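The bias and coverage figures drawn from table 1 can be reproduced directly from the cluster-level means. The Python sketch below is only an illustrative check: the variable and function names are assumptions introduced here, and the inputs are the published table 1 values rather than the underlying survey records.

```python
# Table 1 values per food: (mean market price, mean unit value,
# mean price opinion, clusters with a unit value, clusters with a
# price opinion). Prices are in toea per kg.
TABLE_1 = {
    "Sweet potatoes": (43.9, 59.0, 42.5, 93, 118),
    "Bananas": (54.2, 75.9, 51.3, 92, 118),
    "Rice": (114.7, 107.3, 115.5, 114, 118),
    "Flour": (143.6, 114.9, 158.3, 95, 116),
    "Biscuits": (444.4, 450.0, 452.4, 112, 118),
    "Canned fish": (432.7, 437.0, 422.7, 115, 118),
    "Betelnut": (510.8, 566.0, 419.9, 107, 117),
    "Soft drink": (272.8, 263.3, 287.9, 100, 118),
    "Beer": (558.3, 507.0, 586.8, 63, 116),
}
TOTAL_CLUSTERS = 120


def mean_proportionate_error(proxy_index):
    """Average over foods of |proxy mean - market mean| / market mean."""
    errors = [abs(row[proxy_index] - row[0]) / row[0] for row in TABLE_1.values()]
    return sum(errors) / len(errors)


unit_value_error = mean_proportionate_error(1)
price_opinion_error = mean_proportionate_error(2)

# Share of the 120 clusters in which each proxy is observed at all.
unit_value_coverage = {food: row[3] / TOTAL_CLUSTERS for food, row in TABLE_1.items()}
price_opinion_coverage = {food: row[4] / TOTAL_CLUSTERS for food, row in TABLE_1.items()}

print(f"unit values: {unit_value_error:.1%}, price opinions: {price_opinion_error:.1%}")
for food in TABLE_1:
    print(f"{food}: {unit_value_coverage[food]:.0%} vs {price_opinion_coverage[food]:.0%}")
```

Run on the table 1 means, this gives an average proportionate error of about 14 percent for unit values and about 6 percent for price opinions, and it shows how unevenly unit values are available across foods.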
For example, in the sample there are only 63 clusters with an average unit value for beer and 92 clusters with one for bananas rather than the expected sample of 120 clusters.17 How serious this sample selection problem would be elsewhere is likely to depend on the length of the survey recall period, with longer recalls allowing more households to record a purchase.18

In contrast to the unit values, the price opinions are much more widely available. For any one food, at most four clusters had missing price opinions for all households. Thus, the method of obtaining opinions about prices rather than just relying on purchase behavior can potentially capture the full range of spatial price variation in a sample.

17. This lack of unit values affects rural areas particularly. For example, a unit value for beer is available for 35 of the 40 clusters in the capital city but in only 28 of the 80 clusters elsewhere. Hence, the spatial distribution of prices may not be measured in a reliable way when unit values are used as the proxy for market prices.

18. However, even without this sample selection issue there is still bias in the unit values. For example, in the 93 clusters where a unit value for sweet potatoes is available, the average market price is 46.8 toea per kg (slightly above the average across all clusters), which is still 20 percent below the mean unit value for those clusters.

IV. THE EFFECTS OF THE ALTERNATIVE PRICE COLLECTION METHODS

This section measures the impact of using the alternative price series as proxies for market prices. First, it examines how using unit values compares with using photo-guided price opinions in estimating the poverty line and various aggregate measures of poverty. Next, the same comparison is made for price elasticity estimates, and implications are drawn for tax policy analysis.

Effects on Poverty Measures

Poverty lines for Papua New Guinea are based on the market prices collected by the survey (World Bank 1999). Specifically, the cost of buying a basket of food that provides 2,200 calories a day was calculated for five regions: the National Capital District, the South Coast, the Highlands, the North Coast, and the New Guinea Islands. Rural and urban areas within each region are combined because the sample usually had only one urban cluster per region and there are no rural clusters in the National Capital District.19 The regional average prices used to calculate the cost of the poverty line basket of foods were calculated from the cluster-level averages of the market prices (see table 1).20

This section follows the same procedures to calculate the food poverty line, substituting the unit values and price opinions in place of the market prices. The unit values and price opinions are first averaged by cluster before the regional averages are calculated. This ensures that clusters with more purchasing households do not receive undue weight in the calculations. The cluster averages also tend to dampen measurement error. The unit values are also purged of quality effects by running within-cluster regressions on a set of household characteristics (see equation 1 for the characteristics included).

One constraint with these exercises is that the poverty line food basket contains 35 foods, but there are only 9 foods with data from both price opinions and unit values. Although these foods contribute almost half the value of the poverty line food basket, the experiments are effectively varying only half of the value of the food poverty line. Thus the measured effect of different price collection methods on estimated poverty may be, if anything, understated.

The regional food poverty lines that result from using market price, unit value, and price opinion data are illustrated in figure 3.
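The averaging sequence described above (household prices to cluster means, cluster means to regional means, regional means to the cost of the food basket) can be sketched as follows. This is a simplified illustration under stated assumptions: the function names, the record layout, and the example prices and basket are hypothetical, and the regression step that purges unit values of quality effects is omitted.

```python
from collections import defaultdict
from statistics import mean


def regional_prices(records):
    """Average prices by cluster first, then average the cluster means by region.

    records: iterable of (region, cluster, food, price) tuples (hypothetical layout).
    Returns a dict keyed by (region, food).
    """
    by_cluster = defaultdict(list)
    for region, cluster, food, price in records:
        by_cluster[(region, cluster, food)].append(price)

    by_region = defaultdict(list)
    for (region, _cluster, food), prices in by_cluster.items():
        by_region[(region, food)].append(mean(prices))

    return {key: mean(cluster_means) for key, cluster_means in by_region.items()}


def food_poverty_line(region, basket, prices):
    """Cost of the basket (food -> kg per year) at the region's average prices."""
    return sum(qty * prices[(region, food)] for food, qty in basket.items())


# Made-up data: three purchasing households in cluster A, one in cluster B.
records = [
    ("Highlands", "A", "sweet potato", 40.0),
    ("Highlands", "A", "sweet potato", 40.0),
    ("Highlands", "A", "sweet potato", 40.0),
    ("Highlands", "B", "sweet potato", 60.0),
]
prices = regional_prices(records)
basket = {"sweet potato": 100.0}  # hypothetical basket, kg per year
line_in_toea = food_poverty_line("Highlands", basket, prices)
print(prices[("Highlands", "sweet potato")], line_in_toea)
```

In this made-up example the regional price is 50 toea per kg, the simple mean of the two cluster means, rather than the 45 toea per kg that pooling all four household reports would give; this is the sense in which clusters with more purchasing households do not receive undue weight.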
When market prices are used, the food poverty line ranges from 235 kina (K) a year in the North Coast region to K626 in the National Capital District, with a population-weighted average of K330.21 Although the existing poverty lines for Papua New Guinea include a nonfood allowance, which is equivalent to between a third and a half of the value of the food poverty line, that is ignored here because price information was gathered only for foods.

The food poverty line is consistently overstated when unit values are used as the measure of price (see figure 3). In the National Capital District, South Coast, and Islands regions unit values overstate the poverty line by a slight margin of 4-10 percent. However, in the other two regions, which contain 70 percent of the population, unit value-based analysis overstates the food poverty line by 16-20 percent. In contrast, the use of photo-guided price opinions creates a smaller bias. The use of price opinions causes the food poverty line to be understated by about 10 percent in the National Capital District and South Coast and to be overstated by 4-11 percent in the other three regions. On average the food poverty line has a proportionate error, $|z_i - z_p| / z_p$ (where z is the food poverty line, p is market prices, and i is unit values or price opinions), of 14 percent with the unit values and 9 percent with price opinions.22

19. An analysis of covariance also showed that urban-rural price differentials within regions were less important than interregional price variations (World Bank 1999).

20. The National Capital District is an exception, with the average price formed directly from the raw prices rather than from the cluster-level prices. This reflects the assumption that there is less need for the average to reflect the spatial distribution of prices within a city than there is in larger geographical regions (World Bank 1999).

21. This is equivalent to US$250 per year and refers to adult-equivalents rather than per capita.

22. The overstatement would be even higher, at 17 percent, if the unit values had not been purged of quality effects.

FIGURE 3. Regional Food Poverty Lines
[Bar chart of the food poverty line, in kina per year, calculated from market prices, unit values, and price opinions for each of five regions: NCD, South Coast, Highlands, North Coast, and Islands.]
Source: Authors' computations based on data from Papua New Guinea Household Survey 1995/96.

When data-collection methods create biased estimates of the poverty line, they also affect measures of poverty rates (table 2). The overstatement of the food poverty line when unit values are used causes an upward bias in poverty measures. Thus, for example, the headcount index is estimated as 28 percent (with a standard error of 2.6 percent) rather than the 22 percent based on market prices,23 and the poverty gap index is estimated as 8.0 percent rather than as 5.9 percent. The differences between unit value estimates and those based on market prices are statistically significant (the t-statistics for the null hypothesis of no difference range from 4.8 to 6.8). This finding that using unit values results in higher poverty measures is consistent with Capéau and Dercon (1998), who conclude that headcount poverty in rural Ethiopia would be overstated by one-fifth if unit values were used instead of other price data.

23. These standard errors correct for weighting, clustering, and stratification, using the program of Jolliffe and Semykina (1999).

TABLE 2. Aggregate Food Poverty Measures for Papua New Guinea, 1996

Poverty line food basket   Headcount    Poverty gap  Poverty severity
calculated from            index        index        index
Market prices              22.0 (2.4)   5.9 (0.9)    2.4 (0.4)
Unit values                28.0 (2.6)   8.0 (1.0)    3.4 (0.6)
Price opinions             23.8 (2.5)   6.8 (1.0)    2.8 (0.5)

Source: Authors' computations based on data from Papua New Guinea Household Survey 1995/96.
Note: Based on the food poverty lines in figure 3. The poverty estimates are in terms of adult-equivalents. The unit values have been purged of quality effects using a regression. Numbers in parentheses are SEs corrected for the effect of clustering, sampling weights, and stratification.

There is also an upward bias associated with the use of price opinions, but the discrepancy is significantly smaller (see table 2). Estimates based on price opinions overstate the headcount poverty measure by only 8 percent (the t-statistic for the null hypothesis of no difference is 2.2). This overstatement is significantly less than when unit values are used (the t-statistic is 4.5 for the test that the overstatement is the same for unit values and price opinions). Clearly, the price opinions provide more accurate measures of poverty in Papua New Guinea, although even the smaller overstatement may be enough to justify the expense of collecting better price data from local stores and markets.

Effects on Price Elasticity Estimates and Indirect Tax Analysis

In developing economies, pricing policy plays the same central role in fiscal policy that income tax and social security policies play in industrial countries (Deaton 1989). The matrix of price elasticities needed to estimate the revenue effects of price reforms can therefore provide fundamental information to governments.24 That makes it important to establish what bias might occur when elasticities are calculated from either unit values or price opinions if estimates from market price surveys are not available.
Attention is focused here on the three major staples (sweet potatoes, bananas, and rice²⁵), which account for more than one-fifth of household consumption expenditures and supply about 45 percent of calories to households. These three foods have some policy significance as well as consumption and nutritional importance, because until recently rice was imported duty-free, whereas all other food imports were subject to tariffs. But following a switch to a value-added tax (VAT), rice is now taxed at the same 10 percent rate as other imported goods. Sweet potatoes and bananas effectively fall outside of the tax net because the farmers and traders who sell them in informal markets are not registered for the VAT.

24. The elasticities are not needed for evaluating the welfare effects of marginal tax and subsidy reforms. The existing demand structure, and some social weights for aggregating the effects across households, provide sufficient information when price changes are small (Ahmad and Stern 1984).
25. All of the other foods and nonfoods are aggregated into a composite fourth commodity in the demand system, and leisure is assumed to be separable from goods demand (an assumption forced by the fact that the survey did not gather data on wage rates).

Eleven clusters have no market price survey data for either sweet potatoes or bananas, so the demand system is estimated on the remaining 109 clusters (containing 1,018 households). This reduced sample highlights one advantage of price opinions: there would be only two clusters with missing data if only the price opinions were used. Of the 109 clusters, only 86 have at least one household purchasing either sweet potatoes or bananas (the total number of purchasing households is around 350). Thus, imputed unit values must be used for the other clusters.
The base model uses market prices and a "share-log" functional form (Deaton 1989):

$$w_i = \alpha_i + \beta_i \ln x + \sum_j \theta_{ij} \ln p_j + \gamma' z + u_i, \tag{1}$$

where $w_i$ is the share of the budget devoted to good $i$, $x$ is total expenditure, $p_j$ is prices, and $z$ is a vector of other household characteristics: (log) household size, the share of the household in seven demographic groups (males and females 0–6 years old, 7–14 years old, and 15–50 years old; women over 50 years old), dummy variables for whether the household head is female or employed in the formal sector, and regional and quarterly dummy variables. An advantage of the functional form in equation 1 is that it treats zero and nonzero consumption in the same way. The analysis of tax and subsidy reform relies on unconditional demand functions because the revenue effect of a tax increase does not depend on whether demand changes take place at the extensive or intensive margins (Deaton 1990). Thus the literature on censored demand systems is not needed here. The price elasticities for equation 1 are given by:

$$\varepsilon_{ij} = (\theta_{ij}/w_i) - \delta_{ij}, \tag{2}$$

where $\delta_{ij}$ is the Kronecker delta (equal to 1 if $i = j$ or to 0 otherwise), and budget shares are evaluated at their mean values.

The most common empirical strategy for using unit values is to simply replace the prices in equation 1 with unit values. Most of the variation in the literature concerns how to deal with the missing unit values and whether to leave unit values at the household level or aggregate them to the cluster level. Two methods are used here:

• UV1, which uses household-specific unit values, with missing unit values replaced by the mean unit value calculated across other households in the same region and season (following Minot 1998).
• UV2, which uses cluster median unit values in place of both household-specific and missing unit values. This follows several studies that use averages, but with the median chosen for its robustness to outliers.
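The two imputation rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the record layout (dictionary keys `cluster`, `region`, `quarter`, `unit_value`) and the function name are hypothetical, "season" is taken to mean the survey quarter, and a cluster or region-quarter cell with no purchasers is simply left as `None` because the text does not say how such cases were filled.

```python
from collections import defaultdict
from statistics import mean, median


def impute_unit_values(households):
    """Return (uv1, uv2), two lists aligned with `households`.

    Each household is a dict with hypothetical keys 'cluster', 'region',
    'quarter', and 'unit_value' (None when the household bought none of
    the good, so that no unit value is observed).
    """
    by_region_quarter = defaultdict(list)
    by_cluster = defaultdict(list)
    for h in households:
        if h["unit_value"] is not None:
            by_region_quarter[(h["region"], h["quarter"])].append(h["unit_value"])
            by_cluster[h["cluster"]].append(h["unit_value"])

    uv1, uv2 = [], []
    for h in households:
        # UV1: keep the household's own unit value; otherwise use the mean
        # over purchasing households in the same region and quarter.
        if h["unit_value"] is not None:
            uv1.append(h["unit_value"])
        else:
            observed = by_region_quarter.get((h["region"], h["quarter"]))
            uv1.append(mean(observed) if observed else None)

        # UV2: every household in a cluster gets the cluster median,
        # whether or not it reported a unit value itself.
        observed = by_cluster.get(h["cluster"])
        uv2.append(median(observed) if observed else None)
    return uv1, uv2
```

The same function would apply unchanged to price opinions (PP1 and PP2) by passing the reported opinion in place of the unit value.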
These same two methods are also applied to the photo-guided price opinions, denoted as PP1 and PP2.

In addition to the simple approach of replacing unobserved prices with some form of unit value (as in UV1 and UV2), estimating equation 1, and obtaining elasticities from equation 2, the analysis uses the two-equation system of Deaton (1990), in which budget shares ($w_{Gic}$) and unit values ($v_{Gic}$) are both functions of the unobserved prices ($p_{Hc}$):

$$w_{Gic} = \alpha^0_G + \beta^0_G \ln x_{ic} + \gamma^0_G \cdot z_{ic} + \sum_{H=1}^{N} \theta_{GH} \ln p_{Hc} + (f_{Gc} + u^0_{Gic}) \tag{3}$$

$$\ln v_{Gic} = \alpha^1_G + \beta^1_G \ln x_{ic} + \gamma^1_G \cdot z_{ic} + \sum_{H=1}^{N} \psi_{GH} \ln p_{Hc} + u^1_{Gic} \tag{4}$$

In addition to the variables previously defined, $f_{Gc}$ is a cluster fixed effect in the budget share for good $G$, $u^0_{Gic}$ and $u^1_{Gic}$ are idiosyncratic errors, $i$ indexes households, $G$ and $H$ index goods, and $c$ indexes clusters.

Deaton's method recognizes that the data are collected on clusters of households that are presumed to face the same market prices. The variation in budget shares and unit values within clusters is used to identify the effect of income and other household characteristics on the quantity and quality demanded. For example, the coefficient $\beta^1_G$ is the elasticity of the unit value with respect to total expenditure (henceforth called the quality elasticity), while the elasticity of quantity demanded with respect to total expenditure is derived from $\beta^0_G$. The first-stage, within-cluster regressions are consistent even in the absence of market prices, which are treated as fixed effects. Any residual variation in unit values (and covariance with budget share residuals) is assumed to reflect measurement error, and the first-stage regression residuals give an empirical estimate of these errors. More specifically, the first-stage residuals, $e^0_{Gic}$ and $e^1_{Gic}$ from equations 3 and 4, contain all the variability around the cluster means of $w_{Gc}$ and $\ln v_{Gc}$ that is not explained by household characteristics, so this residual variability is assumed to reflect measurement error.
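The within-cluster logic of the first stage can be illustrated with a stripped-down version of equation 4. The sketch below is a simplification under stated assumptions rather than the procedure actually used: it keeps only log total expenditure as a regressor (the household characteristics in $z$ are dropped), and the function name and input layout are hypothetical. Demeaning by cluster removes the unobserved prices, which are common to all households in a cluster; the slope is then an estimate of the quality elasticity $\beta^1_G$, and the residual variance corresponds to the measurement error variance discussed above.

```python
import math
from collections import defaultdict


def within_cluster_slope(records):
    """Within-cluster OLS of ln(unit value) on ln(total expenditure).

    records: iterable of (cluster_id, x, v), where x is total expenditure
    and v is the unit value, both strictly positive.
    Returns (beta, residual_variance).
    """
    groups = defaultdict(list)
    for cluster_id, x, v in records:
        groups[cluster_id].append((math.log(x), math.log(v)))

    sxx = sxy = 0.0
    deviations = []
    for obs in groups.values():
        mean_lx = sum(lx for lx, _ in obs) / len(obs)
        mean_lv = sum(lv for _, lv in obs) / len(obs)
        for lx, lv in obs:
            # Deviations from cluster means sweep out anything that is
            # constant within a cluster, including the local price level.
            dx, dv = lx - mean_lx, lv - mean_lv
            sxx += dx * dx
            sxy += dx * dv
            deviations.append((dx, dv))

    # One degree of freedom per cluster mean, plus one for the slope.
    dof = len(deviations) - len(groups) - 1
    if sxx == 0.0 or dof <= 0:
        raise ValueError("not enough within-cluster variation")

    beta = sxy / sxx
    residual_variance = sum((dv - beta * dx) ** 2 for dx, dv in deviations) / dof
    return beta, residual_variance
```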
Results from the first stage of the Deaton procedure are reported in table 3. To compare the quality effects and measurement error properties of unit values and price opinions, equation 4 is estimated with both types of data. The quality elasticities are universally small, ranging from −0.07 to 0.06 for unit values and from −0.04 to 0.01 for price opinions. These small values are consistent with the evidence from Deaton (1990) and Gibson and others (2002) that the quality problems with unit values are less important than the measurement error problems. Moreover, although the quality elasticities are small, they are larger for unit values than for price opinions. On average, the absolute value of the quality elasticities is almost four times as large for unit values as for the price opinions.

The unit values also have higher measurement error variance (variability around the cluster means of $\ln v_{Gc}$ that is not explained by household characteristics in equation 4) than the price opinions for all nine foods. On average, the measurement error variance was four times as large (almost 10 times for soft drink and biscuits) for the unit values as for the price opinions. Finally, the covariance between the errors in the unit value equation and the errors in the budget share equation was also higher for unit values than for price opinions for seven of the nine foods. On average, the covariance in the errors was almost 10 times greater for the unit values, suggesting that the errors in the price opinions are less correlated with actual demands than are the errors in the unit values.

TABLE 3. Quality and Measurement Error Indicators for Unit Values and Price Opinions from the First-Stage Regressions of the Deaton Procedure

                  Quality Elasticityᵃ                  Residual Varianceᵇ              Residual Covarianceᶜ
Product           Unit Values       Price Opinions     Unit Values   Price Opinions    Unit Values   Price Opinions
Sweet potatoes    0.016 (0.039)     −0.040 (0.027)     0.152         0.151             0.042         0.466
Bananas           0.059 (0.055)     0.005 (0.025)      0.334         0.166             7.255         0.131
Rice              0.019 (0.011)*    0.005 (0.007)      0.031         0.011             0.803         0.149
Flour             0.045 (0.037)     0.010 (0.020)      0.121         0.064             0.865         0.233
Biscuits          0.035 (0.026)     0.009 (0.006)      0.111         0.011             0.276         0.018
Canned fish       0.018 (0.020)     0.005 (0.008)      0.074         0.019             0.076         0.022
Betelnut          0.012 (0.038)     0.003 (0.017)      0.260         0.079             0.854         0.381
Soft drink        0.028 (0.021)     0.006 (0.007)      0.071         0.007             0.221         0.027
Beer              −0.074 (0.056)    0.001 (0.012)      0.058         0.011             0.331         0.530

Source: Authors' computations based on data from Papua New Guinea Household Survey 1995/96 and the procedures developed in Deaton (1990).
Note: Numbers in parentheses are standard errors.
*Significant at the 10 percent level.
ᵃThe coefficient $\beta^1_G$ in equation 4.
ᵇCalculated from $e^1_{Gic}$ in equation 4.
ᶜFrom equations 3 and 4, × 1,000.

In the second stage of the Deaton procedure, a between-clusters errors-in-variables regression is applied to the (adjusted) average budget shares and unit values, which have been purged of household characteristics at the first stage.
If it were not for the effect of prices on clusterwide quality variation, the parameters estimated at the second stage would be sufficient for calculating price elasticities. Instead, a separability theory of quality (Deaton 1988) has to be used to identify the price effects at the third and final stage. An important feature of the procedure is that it depends on a large number of clusters (rather than a large number of households) for its consistency properties.

When comparing the own-price elasticity estimates from the five price proxy series and methods (UV1, UV2, PP1, PP2, and the Deaton method) with elasticity estimates based on market prices, both price opinions series (PP1 and PP2) yield the estimates with the least bias (figure 4). The point estimates of the elasticities estimated from photo-guided price opinions (particularly those using the cluster medians, PP2) are close to those of the market price-based estimates. Also, the confidence intervals have a high degree of overlap.

There is less overlap for the two simple unit value procedures, UV1 and UV2, and for the Deaton method (see figure 4). For example, in estimates of the own-price elasticity of demand for sweet potato, the market price-based estimate is −1.33 ± 0.09. When household-level unit values are used (UV1), however, the estimated elasticity is much lower in an absolute value sense (−1.00 ± 0.08). When cluster median unit values are used (UV2), the absolute value of the estimated elasticity is even lower (−0.77 ± 0.10). Moreover, although the Deaton procedure calculates point estimates of the own-price elasticities for sweet potatoes and rice that are relatively consistent with the estimates from market prices, it does a poor job of estimating the own-price elasticity for bananas (giving a point estimate of −2.2 rather than −1.0). There is also considerable imprecision in the Deaton estimates.
The imprecision, however, is not surprising because Deaton's method essentially reduces to a between-clusters regression, and the sample used here does not have many clusters.

FIGURE 4. Own-Price Elasticity Comparisons for Market Prices, Price Opinions, and Unit Values
[Three panels (Sweet Potato, Banana, Rice), each showing the own-price elasticity estimate and its 95% confidence interval* for six estimators: Market Prices; PP1 (missing = reg/qtr mean); PP2 (cluster medians); UV1 (missing = reg/qtr mean); UV2 (cluster medians); Unit Values: Deaton Method.]
*68 percent confidence interval (± 1 standard error) for the elasticities from the Deaton method.
Source: Authors' computations based on data from Papua New Guinea Household Survey 1995/96.

Estimates of cross-price elasticities, also important in indirect taxation analysis, are likewise adversely affected by the use of unit values. Although there are too many cross-price elasticity estimates to display individually, the aggregate bias (AB) can summarize the performance of each method. Let $e$ be the vector of elasticities calculated from the market price data and $\hat{e}$ the corresponding elasticity vector from unit values or price opinions, so that the bias is $\hat{e} - e$, and $AB = (\hat{e} - e)'(\hat{e} - e)$, which is the sum of squared biases. The aggregate bias is calculated for the own-price elasticities alone (AB1) and for the full system of own- and cross-price elasticities (AB2). For both AB1 and AB2 the calculation excludes the results for "other goods," which are simply derived from the other elasticities.
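The aggregate bias is simple enough to compute by hand, but a short sketch makes the definition concrete. The function name is hypothetical; the example uses only the sweet potato own-price point estimates quoted in the text (−1.33 from market prices, −1.00 from UV1, and −0.77 from UV2), so it gives the contribution of that one good rather than the full AB1 reported in table 4.

```python
def aggregate_bias(estimated, benchmark):
    """Sum of squared differences between two elasticity vectors.

    estimated: elasticities from unit values or price opinions
    benchmark: the corresponding elasticities from market prices
    """
    if len(estimated) != len(benchmark):
        raise ValueError("elasticity vectors must have the same length")
    return sum((e_hat - e) ** 2 for e_hat, e in zip(estimated, benchmark))


# Sweet potato own-price elasticity only (one element of the AB1 sum).
market = [-1.33]
print(aggregate_bias([-1.00], market))  # UV1: roughly 0.109
print(aggregate_bias([-0.77], market))  # UV2: roughly 0.314
```

Because the biases are squared, one badly estimated elasticity can dominate the total, which is worth keeping in mind when comparing methods across many cross-price terms.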
With the exception of the Deaton method, where bootstrapping is used, standard errors for AB1 and AB2 are obtained from the delta method.

The aggregate bias in the own-price elasticities is lowest (AB1 = 0.048) when the estimation uses cluster medians of the price opinions (table 4, column 1). When the cross-price elasticities are included in the aggregate bias calculation (AB2), household-specific price opinions perform best (AB2 = 0.904, see table 4, column 2). It is notable that the bias estimates for both procedures using price opinions are less than 35 percent of those for the similar procedure using unit values. Moreover, although neither AB1 nor AB2 is statistically significant when price opinions are used, AB2 is statistically significant (at p < 0.03 or smaller) for all three of the unit value procedures. Similarly, the correlation of the elasticities from price opinions (PP1 and PP2) with the market price elasticities is higher (0.94–0.96) than is the correlation for UV1 and UV2 (0.67–0.80, see table 4, column 3). The Deaton procedure does worst in the aggregate bias calculations, although the standard errors for AB1 and AB2 are also widest with this procedure.²⁶

The bias in the elasticities calculated from naive unit value procedures could affect public policy decisions. An obvious use of the price elasticities is in deciding on the direction of marginal tax reform (Deaton and Grimard 1992).
Social cost-benefit ratios, $\lambda_i$, of a marginal increase in tax on each of the three foods are estimated from:

$$\lambda_i = \frac{w_i^{\epsilon}/\tilde{w}_i}{1 + \dfrac{\tau_i}{1+\tau_i}\left(\dfrac{\theta_{ii}}{\tilde{w}_i} - 1\right) + \displaystyle\sum_{k \neq i} \dfrac{\tau_k}{1+\tau_k}\,\dfrac{\theta_{ki}}{\tilde{w}_i}}, \tag{5}$$

where $\tau_i$ is the tax rate on good $i$ (0.1 for rice and 0 for the others), $\theta_{ki}$ is the log price derivative of the budget share (from equation 1 or 3), and the average budget shares $w_i^{\epsilon}$ and $\tilde{w}_i$ are:

$$w_i^{\epsilon} = \frac{\sum_{m=1}^{M} (x_m/n_m)^{-\epsilon}\, x_m w_{im}}{\sum_{m=1}^{M} x_m} \tag{6a}$$

$$\tilde{w}_i = \frac{\sum_{m=1}^{M} x_m w_{im}}{\sum_{m=1}^{M} x_m} \tag{6b}$$

where $x_m$ and $n_m$ are the total expenditure and size of household $m$, and $\epsilon$ is the coefficient of inequality aversion.²⁷ When market prices are used to estimate $\theta_{ki}$, the highest ratio of social costs to benefits occurs when there is a marginal increase in the tax on sweet potato (λ = 1.47 ± 0.01), followed by a tax on rice (λ = 1.44 ± 0.05), while banana looks like the best candidate for a tax increase (λ = 1.39 ± 0.02) (see table 4).

26. To verify that there was no flaw in the programming, the market prices were passed through the STATA code for the Deaton procedure. The correlation between these elasticities and the market price elasticities reported in figure 4 and table 4 was 0.999.
27. This expression for the cost-benefit ratio of a marginal tax increase is adapted by Deaton (1997) from the more usual one (see, for example, Ahmad and Stern 1984, equation 38) and allows for both quantity and quality responses to tax-induced price changes.

TABLE 4. Summary Comparisons of Estimates Using Market Prices, Price Opinions, and Unit Values

                                                                               Cost-Benefit Ratio of Tax Rise Forᵉ
Data Source and Estimation Methodᵃ   AB1ᵇ            AB2ᶜ            Corrᵈ     Sweet Potatoes      Bananas             Rice
Market prices                                                                  1.47 [3] (0.01)     1.39 [1] (0.02)     1.44 [2] (0.05)
PP1 (missing = reg/qtr mean)         0.089 (0.133)   0.904 (0.503)   0.958     1.46 [3] (0.01)     1.40 [1] (0.01)     1.41 [2] (0.04)
PP2 (cluster medians)                0.048 (0.147)   1.448 (0.874)   0.938     1.45 [2] (0.01)     1.40 [1] (0.02)     1.47 [3] (0.07)
UV1 (missing = reg/qtr mean)         0.369 (0.356)   3.323 (1.444)   0.804     1.49 [3] (0.01)     1.40 [2] (0.02)     1.35 [1] (0.03)
UV2 (cluster medians)                0.653 (0.408)   4.844 (1.553)   0.669     1.48 [3] (0.01)     1.42 [2] (0.02)     1.34 [1] (0.03)
Unit values, Deaton method           1.415 (0.943)   7.775 (3.582)   0.737     1.53 [3] (0.04)     1.34 [1] (0.06)     1.43 [2] (0.08)

Source: Authors' computations based on data from Papua New Guinea Household Survey 1995/96 and the procedures developed in Deaton (1990).
Note: Numbers in parentheses are standard errors derived from the delta method, except for those for unit values estimated using the Deaton method, which are bootstrapped from the second-stage regression using the approach outlined in Deaton (1997). Numbers in brackets are the good's rank in terms of the cost-benefit ratio, $\lambda_i$, where 1 denotes the good with the lowest cost-benefit ratio from a marginal tax increase.
ᵃPP refers to "photo-guided price opinions" and UV to "unit values."
ᵇAggregate bias on the own-price elasticities.
ᶜAggregate bias on own- and cross-price elasticities.
ᵈCorrelation between the elements of the elasticity matrix and the market price elasticities. Calculations exclude the elasticities for "other goods" derived from the adding-up and homogeneity restrictions.
ᵉCalculated from equation 5, using an inequality aversion parameter, $\epsilon$ = 0.5.
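For readers who want to see how the cost-benefit ratios follow from equations 5, 6a, and 6b, the calculation can be sketched as below. This is an illustrative reading of those equations, not the authors' code: the function names and argument layout are hypothetical, `theta[k][i]` is taken to be the derivative of the budget share of good k with respect to the log price of good i, and the own-price term is assumed to use the own-price derivative `theta[i][i]`.

```python
def average_shares(x, n, w, eps):
    """Socially weighted and plain average budget shares (equations 6a, 6b).

    x[m]    total expenditure of household m
    n[m]    size of household m
    w[m][i] budget share of good i in household m
    eps     coefficient of inequality aversion
    Returns (w_eps, w_tilde), each with one entry per good.
    """
    total = sum(x)
    goods = range(len(w[0]))
    w_tilde = [sum(xm * wm[i] for xm, wm in zip(x, w)) / total for i in goods]
    w_eps = [
        sum((xm / nm) ** (-eps) * xm * wm[i] for xm, nm, wm in zip(x, n, w)) / total
        for i in goods
    ]
    return w_eps, w_tilde


def cost_benefit_ratios(theta, w_eps, w_tilde, tau):
    """Cost-benefit ratio of a marginal tax increase on each good (equation 5).

    theta[k][i] derivative of the share of good k w.r.t. the log price of good i
    tau[i]      existing tax rate on good i
    """
    goods = range(len(tau))
    ratios = []
    for i in goods:
        own = tau[i] / (1 + tau[i]) * (theta[i][i] / w_tilde[i] - 1.0)
        cross = sum(
            tau[k] / (1 + tau[k]) * theta[k][i] / w_tilde[i]
            for k in goods
            if k != i
        )
        ratios.append((w_eps[i] / w_tilde[i]) / (1.0 + own + cross))
    return ratios
```

With all tax rates at zero the denominator equals 1, so the ratio reduces to the purely distributional term: goods consumed disproportionately by poorer households have a socially weighted share above the plain average share and therefore a higher cost-benefit ratio.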
But this ranking is preserved by only two of the other estimation methods: price opinions with missing values replaced by regional and quarterly means (PP1) and the Deaton procedure applied to unit values.²⁸ The other two unit value procedures rank rice as the best candidate for tax increases. Hence, using unit values as proxies for market prices in an optimal tax reform exercise might lead policymakers in Papua New Guinea to increase a tax that is not the socially least-cost source of revenue.

Some of the poor performance of the methods that rely on unit values may reflect the sample selection problem of several clusters having no unit value available. Although this is an intrinsic disadvantage of unit value methods, unit values may be more widely available in some settings either because households are more reliant on purchased food or because the consumption recall period is longer. The performance of the cluster-median and Deaton estimators is explored for the subsample of 86 clusters that have unit values available for all three foods (table 5). This change in sample coverage does improve the relative performance of the cluster-median unit values, although the aggregate bias (AB2) is still almost twice as large for unit value-based measures as for those using price opinions (but the difference is no longer statistically significant). The Deaton method also appears to do better on this subsample, with the aggregate bias now statistically insignificant and with a higher correlation with market price elasticities. Thus, unit value methods may not fail as badly as indicated in table 4 and figure 4 if the unit values are available for a wider range of clusters than they are in Papua New Guinea.

However, a trend in the literature is to artificially reduce the number of clusters by redefining them at a broader geographic level. Starting with Gracia and Albisu (1998), several users of the Deaton method have treated regions as a cluster.
For example, Nicita (2004a, b) uses each of the 32 states in Mexico as a cluster, even though there are hundreds of lower-level municipios. Likewise, Kedir (2001) groups households from an unclustered urban survey in Ethiopia into clusters of varying aggregation, even treating Addis Ababa as a single cluster. It is doubtful that the Deaton method can provide reliable elasticity estimates in these circumstances, because it assumes that "households in a single cluster live near one another" (Deaton 1997, p. 73) and it needs a large number of clusters for its consistency properties. Intraregional variation in unit values because of spatial price variation will wrongly be treated as measurement error when clusters are artificially aggregated. Using a single unit value for an entire region will overstate price in villages where market prices are low and understate it in villages where prices are high, so demand differences will be explained by attenuated price differences, usually causing elasticities to be overstated.

28. This finding is sensitive to the value of the inequality aversion parameter used. As $\epsilon$ increases, the equity effects of not taxing sweet potatoes and bananas, which tend to be consumed by the poor, dominate the tax derivative effects, and the rankings are not sensitive to differences in the price elasticities. However, attempts to econometrically estimate $\epsilon$, using the approach of Ravallion and Dearden (1988), suggest that $\epsilon$ is likely to be close to zero in Papua New Guinea.

TABLE 5. Results for the Subsample with Each Cluster Having a Unit Value Available

                                      Price Elasticities of Demand Calculated From
                                      Cluster Medians of
                                      Market Prices    Price Opinions   Unit Values      Deaton Procedure
Own-price elasticity for
  Sweet potatoes                      −1.19 (0.10)     −1.30 (0.11)     −0.90 (0.11)     −2.05 (0.58)
  Bananas                             −1.12 (0.14)     −0.70 (0.16)     −1.34 (0.10)     −2.16 (0.91)
  Rice                                −1.59 (0.33)     −1.77 (0.39)     −1.95 (0.29)     −3.00 (1.86)
Aggregate bias, own-price
  elasticities only                                    0.22 (0.27)      0.26 (0.34)      3.53 (3.08)
Aggregate bias, own- and
  cross-price elasticities                             1.23 (1.32)      2.07 (1.44)      6.88 (4.31)
Correlation with elasticities
  from market prices                                   0.89             0.88             0.95

Source: Authors' computations based on data from Papua New Guinea Household Survey 1995/96 and the procedures developed in Deaton (1990).
Note: Total of 86 clusters, containing 755 households. See table 4 for details on aggregate bias and correlation with elasticities from market prices.

To see the impact of aggregating clusters, the Deaton method was rerun with each of the 19 provinces in Papua New Guinea treated as a cluster. For both sweet potatoes and rice, the estimated own-price elasticities move further from the values estimated when market prices are used, so aggregating seems to impair the Deaton estimator. The effect on the elasticity for rice is especially large; the own-price elasticity is −4.1 when provinces are used as clusters but only −2.3 when the original 109 clusters are used.²⁹ It is not surprising that the elasticity for rice is affected most, because an analysis of variance shows that rice has the highest proportion of within-province variation in market price (0.77) and in quality-purged unit values (0.56).
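The within-province proportion referred to here can be read as the share of the total sum of squares that remains after province means are removed. The sketch below shows that one-way decomposition; it is an assumption about how such a proportion might be computed, since the text does not spell out the analysis of variance, and the function name and input layout are hypothetical.

```python
def within_group_share(values_by_group):
    """Share of total variation that lies within groups.

    values_by_group: dict mapping a group label (for example, a province)
    to a list of cluster-level values (for example, log prices).
    Returns the within-group sum of squares divided by the total sum of squares.
    """
    all_values = [v for values in values_by_group.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)
    total_ss = sum((v - grand_mean) ** 2 for v in all_values)
    if total_ss == 0.0:
        raise ValueError("no variation in the data")

    within_ss = 0.0
    for values in values_by_group.values():
        group_mean = sum(values) / len(values)
        within_ss += sum((v - group_mean) ** 2 for v in values)
    return within_ss / total_ss
```

A share near 1 means that replacing cluster prices with a single provincial value discards most of the price variation, which is the situation in which aggregating clusters is most damaging.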
Thus, if most price variation is typically within regions, the strategy of applying the Deaton method to artificially aggregated clusters is likely to bias elasticity estimates and mislead subsequent analyses.

29. The point estimate of −2.3 differs from that in figure 4 because the elasticities in the figure have controls for region and quarter, but these were not used in the experiment where provinces were treated as clusters.

V. PRACTICALITIES OF PRICE SURVEYS

Is the improvement in data quality from using price opinions instead of unit values worth the additional effort? In Papua New Guinea, price opinions were collected with the aid of a picture in a fairly efficient, timely way. The enumerators collected price opinions for 18 products at the same time that they conducted the rest of the household survey. No additional logistical effort was required other than reminding enumerators to bring their picture albums with them. The typical household spent only about 10 minutes on this block of the survey. Because each cluster included 12 households, the survey team spent about two hours per cluster collecting price opinions. Moreover, enumerators and respondents liked this part of the survey because it provided a break from the normal questioning.

In the Papua New Guinea survey more effort was needed to survey local stores and markets than to obtain the price opinions. The typical community price survey required visits of 15 minutes or less to each of two trade stores. The survey in fresh produce markets, however, was more time consuming, typically taking the enumerator up to an hour to weigh and record the prices of up to six lots of 11 different items. In sparsely populated areas, even more time was spent getting to the market, which was commonly near the community school. In some cases, the nearest market was more than an hour's walk from the village.
Because the survey in the fresh produce market was repeated when the team returned to each community for the consumption recall, the travel and survey times were doubled. In addition, many communities had prohibitions on selling betelnuts in the main market and selling beer in trade stores. Enumerators had to spend additional time traveling to the roadside betelnut markets and to the nearest beer sellers. On average, the total time spent surveying local stores and markets to collect market price data was about four hours per community, or twice the time spent on the price opinions.

In some areas the markets would assemble infrequently (typically one day a week) or start at daybreak and last for only one or two hours. When the market day was missed, survey teams had to spend time and resources to leave a member in the village until the market convened again or send back a team at a later date. None of these timing problems affected the collection of price opinions.

The experience in Papua New Guinea suggests that with a time commitment of 15–20 minutes per household or 3–4 hours per cluster, it would be feasible to collect price opinions for 30–40 items. Thus, this method would be most suitable for multitopic living standards surveys that do not aim to get especially detailed measures of consumption but that do need prices for modeling household behavior. As more prices are required, market price surveys become more attractive because the fixed cost of finding and getting to the market can be spread over the larger number of items whose prices are surveyed. However, the success of market price surveys depends on the ability to find every item in a rural marketplace; more detailed surveys might run into the problem of a large number of missing prices.³⁰ Thus, some consideration of how markets operate may be needed when choosing whether to rely on price opinions or market price surveys.

30. For example, a 1999 survey in Cambodia sought prices for 50 food items in 600 villages, but data were obtained on less than half of the price–village combinations because of items missing from markets (Gibson 2000).

An increase in the detail of a survey can also undermine methods that rely on unit values because of the greater likelihood that entire clusters have no households reporting the purchase of a narrowly specified item. For example, 21 percent of the clusters in the Papua New Guinea survey did not have a unit value for flour, whereas the broader category of cereals had purchasers (and hence unit values) in most clusters. Price opinions are less dependent on actual purchases, so a survey that sought details on many items rather than a few commonly consumed ones would still have information available. For example, 77 percent of the Papua New Guinea sample offered opinions on the price of flour, even though only 30 percent purchased it during the recall period.

This does raise questions about the reliability of the price opinions of households that do not purchase a particular good. In the Papua New Guinea survey there was only a small gap, of 0.06, in the average correlations between market prices and price opinions when the sample was divided by households that purchased each item and households that did not (r = 0.62 for purchasers and 0.56 for nonpurchasers). Nonpurchasers may be relatively well informed about the prices of goods they do not consume because they still observe those prices in stores and the market when they are shopping for other goods. Consistent with this explanation, the item with the largest discrepancy was beer (r = 0.89 for purchasers and 0.75 for nonpurchasers), which is usually sold in less commonly frequented hotels, clubs, and licensed outlets.
Thus, the usefulness of price opinions may also depend on how segmented markets are, which affects the ease of observing prices for items that the household does not usually consume.

VI. CONCLUSION

Cross-sectional household survey data of the kind examined here are increasingly being used as economists try to exploit one of the few data sources in developing areas that can help provide estimates of the demand responses needed for evaluating tax and subsidy reforms. The findings suggest that unit values, whether used in naive or improved estimation procedures, lead to biased estimates of poverty rates and biased estimates of price elasticities. In contrast, price opinions perform better, with both poverty estimates and demand elasticities being closer to the values established from market price surveys.

There are good reasons to expect this better performance from the price opinions. The picture-based method can provide price estimates for a much wider range of households than unit values can, the errors in the estimates are unlikely to be correlated with demand, and the price opinions should have less quality variation because everyone sees the same picture. It may thus be worthwhile to pursue the approach of directly asking households about prices, rather than indirectly obtaining price information from unit values.

Whether relying on price opinions would be better than collecting good measures of prices by surveying local stores and markets depends somewhat on the nature of each survey and on the nature of rural markets in a given country. What is clear is that in many developing economies, for a variety of reasons, the logistics of collecting market prices appear to be so difficult that many surveys do not attempt it, and of those that do, some end up rejecting the data. Consequently, many important analyses of poverty and price and tax policies rely on very imperfect price information.
The findings here should also provide an incentive for others to experiment with methods of gathering price data in rural areas of developing economies. For example, three-dimensional models might be used instead of pictures. A broader experiment could gather price data using price opinions from an informed respondent without an aid (as was done in the Indonesia Family Life Survey), using pictures, and using three-dimensional models to elicit price opinions. It would also be interesting to learn whether certain types of respondents have more informed opinions than other household members. Such comparisons are precluded by the design of this study, which asked only for the opinion from "the most informed" person.

Using pictures or other aids to help gather data from households on their beliefs about existing prices could also provide a way to ask questions about hypothetical prices. Households could be asked how some hypothetical price changes would affect their demand for a pictured item. It would be an interesting experiment to find out how well the direct approach approximated the econometric estimates of price elasticities of demand. Such willingness-to-pay questions could be applied to more than food, with medicines and other health interventions being plausible candidates.31

31. The 1993 LSMS survey in Tanzania used willingness-to-pay questions in the parts of the questionnaire on health and education facilities.

REFERENCES

Ahmad, E., and N. Stern. 1984. "The Theory of Reform and Indian Indirect Taxes." Journal of Public Economics 25(3):259-98.
------. 1991. The Theory and Practice of Tax Reform in Developing Countries. Cambridge: Cambridge University Press.
Alderman, H. 1988. "Estimates of Consumer Price Response in Pakistan Using Market Prices as Data." Pakistan Development Review 27(2):89-107.
Beegle, K., E. Frankenberg, and D. Thomas. 1999. "Measuring Change in Indonesia." Labour and Population Program Working Paper 99-07. RAND, Santa Monica, Calif.
Capéau, B., and S. Dercon. 1998. "Prices, Local Measurement Units and Subsistence Consumption in Rural Surveys: An Econometric Approach with an Application to Ethiopia." Working Paper 98-10. Oxford University, Centre for the Study of African Economies, Oxford.
Cox, T., and M. Wohlgenant. 1986. "Prices and Quality Effects in Cross-Sectional Demand Analysis." American Journal of Agricultural Economics 68(4):908-19.
Deaton, A. 1988. "Quality, Quantity, and Spatial Variation of Price." American Economic Review 78(3):418-30.
------. 1989. "Household Survey Data and Pricing Policies in Developing Countries." World Bank Economic Review 3(2):183-210.
------. 1990. "Price Elasticities from Survey Data: Extensions and Indonesian Results." Journal of Econometrics 44(3):281-309.
------. 1997. The Analysis of Household Surveys: A Microeconometric Approach to Development Policy. Baltimore, Md.: Johns Hopkins University Press.
------. 2003. "Prices and Poverty in India, 1987-2000." Economic and Political Weekly 25(January):362-68.
Deaton, A., and F. Grimard. 1992. "Demand Analysis for Tax Reform in Pakistan." LSMS Working Paper 85. World Bank, Washington, D.C.
Deaton, A., and M. Grosh. 2000. "Consumption." In M. Grosh and P. Glewwe, eds., Designing Household Survey Questionnaires for Developing Countries. Washington, D.C.: World Bank.
Deaton, A., J. Friedman, and V. Alatas. 2004. "Purchasing Power Parity Exchange Rates from Household Survey Data: India and Indonesia." Princeton University, Princeton, N.J.
Frankenberg, E. 2000. "Community and Price Data." In M. Grosh and P. Glewwe, eds., Designing Household Survey Questionnaires for Developing Countries. Washington, D.C.: World Bank.
Friedman, J., and J. Levinsohn. 2002. "The Distributional Impact of Indonesia's Financial Crisis on Household Welfare: A 'Rapid Response' Methodology." World Bank Economic Review 16(3):397-423.
Gibson, J. 2000. "A Poverty Profile of Cambodia, 1999." A Report to the World Bank and the Cambodian Ministry of Planning, Phnom Penh.
Gibson, J., and S. Rozelle. 2002. "Demand Systems with Unit Values: Comparisons with Elasticities from Market Prices." University of Waikato, Department of Economics, Hamilton, New Zealand.
Gibson, J., S. Rozelle, and T. Le. 2002. "Evaluating the Use of Unit Values and Community Prices in Demand Analysis." University of Waikato, Department of Economics, Hamilton, New Zealand.
Glewwe, P. 1991. "Investigating the Determinants of Household Welfare in Côte d'Ivoire." Journal of Development Studies 35(2):307-37.
Gracia, A., and L. Albisu. 1998. "The Demand for Meat and Fish in Spain: Urban and Rural Areas." Agricultural Economics 19(3):359-66.
Jolliffe, D., and A. Semykina. 1999. "Robust Standard Errors for the Foster-Greer-Thorbecke Class of Poverty Indices: SEPOV." Stata Technical Bulletin 51. Stata, College Station, Tex.
Kedir, A. 2001. "Some Issues in Using Unit Values as Prices in the Estimation of Own-Price Elasticities: Evidence from Urban Ethiopia." CREDIT Research Paper 01/11. University of Nottingham, Centre for Research in Economic Development and International Trade, Nottingham, U.K.
Laraki, K. 1989. "Ending Food Subsidies: Nutritional, Welfare, and Budgetary Effects." World Bank Economic Review 3(3):395-408.
Minot, N. 1998. "Distributional and Nutritional Impact of Devaluation in Rwanda." Economic Development and Cultural Change 46(2):379-402.
Minot, N., and F. Goletti. 2000. Rice Market Liberalization and Poverty in Viet Nam. Research Report 114. Washington, D.C.: International Food Policy Research Institute.
Nicita, A. 2004a. "Efficiency and Equity of a Marginal Tax Reform: Income, Quality and Price Elasticities for Mexico." Policy Working Paper 3266. World Bank, Washington, D.C.
------. 2004b. "Who Benefited from Trade Liberalization in Mexico? Measuring the Effects on Household Welfare." Policy Working Paper 3265. World Bank, Washington, D.C.
Ravallion, M., and L. Dearden. 1988. "Social Security in a 'Moral Economy': An Analysis for Java." Review of Economics and Statistics 70(1):36-44.
Saunders, C., and C. Grootaert. 1980. "Reflections on the LSMS Group Meeting." Living Standards Measurement Study Working Paper 10. World Bank, Washington, D.C.
Wood, D., and J. Knight. 1985. "The Collection of Price Data for the Measurement of Living Standards." Living Standards Measurement Study Working Paper 21. World Bank, Washington, D.C.
World Bank. 1999. Papua New Guinea: Poverty and Access to Public Services. Washington, D.C.

Has Distance Died? Evidence from a Panel Gravity Model

Jean-François Brun, Céline Carrère, Patrick Guillaumont, and Jaime de Melo

The estimated coefficient of distance on the volume of trade is generally found to increase rather than decrease through time using the traditional gravity model of trade. This distance puzzle proved robust to several ad hoc versions of the model using data for 1962-96 for a large sample of 130 countries. The introduction of an "augmented" barrier to trade function removes the paradox, yielding a decline in the estimate of the elasticity of trade to distance of about 11 percent over the 35-year period for the whole sample. However, the "death of distance" is shown to be largely confined to bilateral trade between rich countries, with poor countries becoming marginalized.
Jean-François Brun is associate professor of economics at the University of Auvergne; his email address is j-f.brun@u-clermont1.fr. Céline Carrère is assistant professor of economics at Ecole des Hautes Etudes Commerciales, University of Lausanne; her email address is c.carrere@u-clermont1.fr. Patrick Guillaumont is professor of economics at the University of Auvergne; his email address is p.guillaumont@u-clermont1.fr. Jaime de Melo is professor of economics, Department of Political Economy, at the University of Geneva; his email address is demelo@ecopo.unige.ch. All authors are associated with the Center for Studies and Research in International Development (CERDI). This article was submitted in May 2003, and all editorial matters were handled by Alan Winters. This is a revised version of Brun and others (2002). The authors gratefully acknowledge helpful comments from participants at seminars at CERDI and the World Trade Organization and at the Second Research and Training Network Workshop on "Trade, Industrialization, and Development," November 27-28, 2003, in London, as well as comments from three anonymous referees. Jeffrey Bergstrand, Agnès Bénassy-Quéré, Olivier Cadot, Paul Collier, Marcel Fafchamps, Caroline Freund, David Hummels, Jaya Krishnakumar, Maurice Schiff, and Alan Winters gave helpful suggestions on an earlier draft.

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 1, pp. 99-120. doi:10.1093/wber/lhi004. The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.

There is a widespread perception that the current wave of globalization, much like the first, should have led to the "death of distance." Other things equal, globalization should generate a dispersion of economic activity reflecting a decline in transaction costs, especially transport costs. But studies based on the traditional gravity model of international trade (the workhorse for studies on the pattern of trade and the influence of transport costs) do not reach that conclusion. Although some studies based on the gravity model have used direct measures of transport cost barriers to trade,1 most rely on distance as a proxy for transport costs, obtaining an estimated elasticity of bilateral trade with respect to distance in the range [-1.3; -0.8]. However, as will be detailed, when the model is estimated separately for several years, the absolute value of the coefficient almost always increases over time. This is puzzling, because the common perception of globalization is that distance should be becoming less important in international trade, implying decreasing rather than increasing values for the estimated coefficient of distance.

This paradoxical result was initially investigated by Brun and others (1999) in a traditional gravity model framework. Earlier, Leamer and Levinsohn (1995, pp. 1387-88), reviewing the literature on international trade and distance, noted that "the effect of distance on trade patterns is not diminishing over time. Contrary to popular impression, the world is not getting dramatically smaller." They conclude that "dispersion of economic mass is the answer, not a shrinking globe" for this result.

In a recent examination of the paradox Coe and others (2002) review explanations in the literature. One is the exclusion of zero observations from the model, which could bias estimation of the impact of distance over time because of the changing composition of trade. Another is that the traditional gravity model omits what is now being referred to as "multilateral trade resistance." Both explanations are tested here. The paradoxical result remains.
A third, obvious explanation is examined here as well: the possibility of misspecification of the transport cost function due to omitted-variable bias. Each of these potential explanations is examined for a large sample of 130 countries spanning bilateral trade over 35 years.

The contribution of this study comes from the combination of more appropriate estimation techniques and a more thorough treatment of transport costs. The article first considers whether the puzzle still holds when estimation is carried out in a random-effects panel procedure with correction for endogeneity (Hausman and Taylor estimator) and for potential selection bias. The puzzle is found to be a robust result in a large sample.

Next, a better specified gravity equation is considered. To bring it closer to accepted theoretical foundations, it includes a measure of remoteness in the estimates. Most important, it also defines an "augmented" transport cost function that includes indexes of infrastructure, price of oil, and composition of trade as arguments, all of which turn out to be statistically significant along predicted lines and contribute to the solution of the puzzle. To check for robustness, the sample is split into low-income countries and high-income countries.

To anticipate the main results: Once remoteness is included in the model, the increasing impact of distance on trade vanishes, and with the introduction of an augmented transport cost function into the log-linear specification of the gravity model, the absolute value of the elasticity of bilateral trade with respect to distance decreases significantly over time. Furthermore, when the sample is split into low- and high-income countries, the elasticity of bilateral trade with respect to distance reveals no trend for low-income countries' trade, whereas it falls for bilateral trade between high-income countries, a result in accordance with estimates obtained using more reliable customs data on transport cost (available for only a handful of countries). This result may reflect the fact that low-income countries have been marginalized in the current wave of globalization.

1. Some recent studies use direct measures of transport costs in a gravity model. Baier and Bergstrand (2001) use c.i.f.-f.o.b. price ratios over 1970-85 for 16 Organisation for Economic Co-operation and Development countries. Arguing that such comparisons provide limited information (Hummels and Lugovskyy 2003), Hummels (1999, 2001a, b) has studied determinants of freight costs based on more reliable customs data (such as German shipping data over the past 40 years or U.S. customs data at the five-digit SITC level over the period 1974-98). In the same way, Limao and Venables (2001) estimate maritime and land transport costs using 1990 transport cost data for container shipments from Baltimore to 64 destination cities, and Estevadeordal and others (2003) use British maritime transport cost data for a sample of 46 countries over the period 1870-1938.

I. IS THERE A PUZZLE?

The puzzle is found in the results for the traditional gravity model using cross-sectional data. This section examines whether the puzzle remains when panel data for a sample of 130 countries for 1962-96 are used. It also addresses issues raised by alternative treatments of missing observations and by potential selection bias. The bulk of the literature that reaches the conclusion that the coefficient for distance increases in absolute value over time is based on the traditional gravity model.
Estimated in cross-section, this model is a variant of the following log-linear equation:

$$\ln(T_{ij}) = \alpha_0 + \alpha_1 \ln(Y_i Y_j) + \alpha_2 \ln(N_i N_j) + \beta \ln(D_{ij}) + \alpha_3 DUM_{kij} + \varepsilon_{ij}, \tag{1}$$

where $T_{ij}$ is the total volume of trade between country i and country j, which depends on the product of partners' income, $Y_i Y_j$, the product of partners' respective populations, $N_i N_j$, the distance between i and j, $D_{ij}$, and a vector of dummy variables, $DUM_{kij}$. These dummy variables usually capture a common language, a common land border, a common colonizer, the condition of being landlocked, the existence of a free trade area, and sometimes a common currency.

Typically, equation 1 is estimated in a cross-section setting for different years (with or without importer and exporter fixed effects). The puzzle is revealed in a negative impact of distance ($\beta < 0$) that increases in absolute value over time. For example, in a sample of 74 countries for 1965 and 1992, Frankel (1997, table 4.2) obtains $\beta_{65} = -0.48$ and $\beta_{92} = -0.77$. Likewise, Leamer (1993) estimates distance elasticities that did not fall between 1970 and 1985. As pointed out by Coe and others (2002), several other studies have failed to find a declining coefficient for distance over time, and most have found a significant increase in the absolute value of the estimated coefficient. For instance, estimating equation 1 over a sample of 130 countries for 1962 and 1996 yields $\beta_{62} = -0.86$ and $\beta_{96} = -1.34$, representing an increase of the impact of distance on bilateral trade of about 36 percent over 35 years.2

Setting the theoretical underpinnings of the gravity equation aside until the following section, this puzzling result is put to the scrutiny of typical econometric problems. Estimated in cross-section, equation 1 has several shortcomings.
First, because the dummy variables capture only part of the unobservable heterogeneity of country pairs, the remaining unobservable heterogeneity could potentially bias estimates of the coefficient for distance. Second, the typical ordinary least squares estimates may be prone to omitted-variables bias. Third, there is a potential selection bias due to missing values in bilateral trade data.

The use of panel data, with a time dimension in addition to the traditional importer and exporter dimensions, can address the issue of unobservable heterogeneity of country pairs. The usual correction introduces three specific effects: exporter, importer, and time effects (see Matyas 1997; Soloaga and Winters 2001). But the model with three specific effects is only a restricted version of the more general model adopted here, which allows for country-pair heterogeneity (see Cheng and Wall 1999; Egger and Pfaffermayr 2003). These bilateral specific effects are included to capture all unobservable time-invariant characteristics of the bilateral trade relationships that might otherwise be captured by the distance coefficient.

Moreover, because the focus is on the death of distance, panel data with a time dimension allow estimating a time-varying elasticity of trade with respect to distance. Thus, $\beta$ is allowed to change over time but not across countries (differences across countries are permitted in section III, which splits the sample into groups). Finally, a quadratic term is used to allow for the existence of a turning point:

$$\ln(M_{ijt}) = \alpha_0 + \alpha_1 \ln Y_{it} + \alpha_2 \ln Y_{jt} + \alpha_3 \ln N_{it} + \alpha_4 \ln N_{jt} + \beta_0 t + \beta_1 \ln D_{ij} + \beta_2\, t \ln D_{ij} + \beta_3\, t^2 \ln D_{ij} + \varepsilon_{ijt} = Z_1 c_1 + \beta_t \ln D_{ij} + \varepsilon_{ijt}, \tag{2}$$

where t is a time trend, $\varepsilon_{ijt} = \mu_{ij} + v_{ijt}$, $\mu_{ij}$ is a specific bilateral random effect, and $v_{ijt}$ is the idiosyncratic error term with the usual properties.
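The quadratic specification in equation 2 implies a distance elasticity of the form beta_t = beta_1 + beta_2 t + beta_3 t^2. The following minimal sketch evaluates that path using the magnitudes reported in table 1, column 3, signed as the text describes (negative level and linear term, positive quadratic term). The time origin (t = 0 in 1962) and the function names are assumptions made for illustration, not details given in the article.

```python
# Sketch: path of the distance elasticity implied by equation 2.
# Coefficient magnitudes come from table 1, column 3; the signs follow the
# text's description of |beta_t|. BASE_YEAR (t = 0 in 1962) is an assumption.

BASE_YEAR = 1962
B1, B2, B3 = -1.268, -0.0062, 0.00009


def distance_elasticity(t, b1=B1, b2=B2, b3=B3):
    """Return beta_t = b1 + b2*t + b3*t**2, the coefficient on ln(D_ij)."""
    return b1 + b2 * t + b3 * t ** 2


def abs_elasticity(year):
    """Absolute value of the distance elasticity in a given calendar year."""
    return abs(distance_elasticity(year - BASE_YEAR))


if __name__ == "__main__":
    for year in (1962, 1979, 1996):
        # A 10 percent rise in distance lowers trade by roughly 10*|beta_t| percent.
        print(year, round(abs_elasticity(year), 3))
```

Under these assumptions the path starts near 1.27 in 1962 and ends near 1.37 in 1996, close to the 12.7 and 13.8 percent effects the text quotes for a 10 percent increase in distance; small differences can arise from rounding of the published coefficients and from the choice of time origin.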
In this equation, used as a starting point for examining the puzzle, bilateral imports (of i from j) rather than total bilateral trade ($M_{ij}$ instead of $T_{ij}$) are used as the dependent variable, and a random-effects estimation procedure is used to avoid eliminating the coefficient for distance in the equation (the within-transformation in a bilateral fixed-effects model removes variables that are cross-sectional time-invariant). The cost of moving to a random-effects estimation procedure is that some explanatory variables may be correlated with the bilateral random effects, a potential problem addressed later. An instrumental procedure is used to correct for this potential endogeneity (see appendix B).

2. Appendix C reports the evolution of the distance coefficient (equation C1) when equation 1 is estimated for each year with country fixed effects as in the recent literature (Rose and van Wincoop 2001; Anderson and van Wincoop 2003; Feenstra 2003).

The data set covers 130 countries for the years 1962-96, a period that captures most of the current wave of globalization during which transport costs have purportedly fallen. Import trade statistics are taken from the United Nations Commodity Trade Statistics (Comtrade) database.

Although the usual restriction of equal coefficients for origin and destination countries is rejected by the data (especially for the population coefficients), the difference in specification has no effect on the values of the parameters of interest $\beta_1$, $\beta_2$, and $\beta_3$ (table 1, columns 1 and 2).3 The impact of distance on trade increases over time, because $|\beta_t| \approx 1.321 + 0.0052\,t - 0.0001\,t^2$ (table 1, column 2). Thus estimation on panel data including bilateral specific effects does not resolve the puzzle observed in the literature based on cross-section data. However, endogeneity tests find the gross domestic product (GDP) variables to be endogenous, that is, correlated with the bilateral specific effects (see details in appendix B).
It is therefore sensible to take the results corresponding to equation 2 estimated with the instrumental variables estimator proposed by Hausman and Taylor (1981) as typical of the results for a gravity trade model using distance as a proxy for transport costs (table 1, column 3). The overall fit is good (R-squared = 0.61), and all variables have the expected sign and plausible values. As suggested by theory, the elasticity of trade to income is significant and close to unity. The population variables have the expected negative sign, capturing the often observed phenomenon that trade tends to constitute a smaller percentage of GDP for larger countries, as discussed later.

The conclusion from this preliminary inquiry is that the puzzle persists, yielding the following estimate of the evolution of the absolute value of the coefficient for distance:

$$|\beta_t| \approx 1.268 + 0.0062\,t - 0.0001\,t^2.$$

This evolution of $|\beta_t|$ over 1962-96, as estimated in equation 2, is plotted in figure 1. These estimates indicate that a 10 percent increase in distance would reduce bilateral trade by 12.7 percent in 1962 and by 13.8 percent in 1996, or an 8.7 percent increase in the impact of distance over 35 years instead of the expected decrease.

3. Several recent studies use panel data without a time-series dimension but with country fixed effects (for origin countries and destination partners). This estimation procedure forces them to impose identical coefficient values on the income and population variables (Rose and van Wincoop 2001; Coe and others 2002; Anderson and van Wincoop 2003).

TABLE 1. Distance in a Traditional Panel Gravity Model

Columns: 1 = GLS; 2 = GLS; 3 = HT; 4 = HT (with zero values); 5 = HT (with variables for selection bias).

Variable          1                  2                  3                  4                  5
lnYit                                1.000 (98.76)      0.833 (39.80)      0.764 (43.13)      0.808 (40.77)
lnYjt                                1.159 (115.87)     1.218 (54.81)      1.571 (104.42)     1.057 (46.82)
ln(YitYjt)        1.080 (141.55)
lnNit                                0.076 (5.95)       0.049 (2.44)       0.057 (4.90)       0.022 (1.15)
lnNjt                                0.164 (12.90)      0.251 (12.51)      0.554 (40.85)      0.175 (8.81)
ln(NitNjt)        0.121 (13.30)
lnDij             1.320 (51.90)      1.321 (52.17)      1.268 (70.93)      1.309 (76.40)      1.203 (69.48)
t                 0.027 (6.11)       0.027 (6.01)       0.030 (5.14)       0.044 (8.32)       0.011 (1.91)
t.lnDij           0.0052 (9.52)      0.0052 (9.63)      0.0062 (9.10)      0.0043 (6.59)      0.0064 (9.49)
t2.lnDij          0.00010 (22.04)    0.00010 (22.06)    0.00009 (14.76)    0.00009 (16.04)    0.00010 (16.95)
No. observations  171,998            171,998            171,998            216,511            171,998
R-squared         0.58               0.59               0.61               0.59               0.64
Hausman test      401.62 χ2(5)       619.85 χ2(7)
GLS vs. HT                                              9666.27 χ2(8)      13,727.05 χ2(8)    2418.34 χ2(8)

Source: Authors' computations based on data described in appendix A.
Note: Numbers in parentheses are t-statistics. GLS is generalized least squares estimator and HT is Hausman and Taylor (1981) estimator.

FIGURE 1. The Elasticity of Bilateral Trade to Distance, $|\beta_t|$ (vertical axis: $|\beta_t|$; horizontal axis: year). Series plotted: traditional equation (table 1, col. 3); standard equation (table 2, col. 1); augmented equation (table 2, col. 2).
Source: Authors' computations based on data described in appendix A.

How robust is this result? The first concern is the large number of missing values. The sample has a potential of 130 × 129 × 35 = 586,950 observations. Missing values are reported for 71 percent of the potential observations, resulting in 171,998 observations. The database does not distinguish between countries that do not report their trade statistics (missing values) and country pairs with no bilateral trade (zero trade). Some country pairs report missing values at the beginning of the sample period and positive trade at the end.
For them, the missing values can be suspected of being zeroes (perhaps because transport costs are too high at the beginning of the sample period). Excluding these zeroes would bias the results. The missing values are assumed to equal zero if they occur before 1975, provided that positive trade is observed thereafter. As in Frankel (1997, p. 145), zeroes are replaced by $M_{ijt} = 1$. The total number of observations rises by 26 percent. Results under this specification do not change the increasing impact of distance on trade (see table 1, column 4).

The missing values are a source of potential selection bias. Following Nijman and Verbeek (1992), three additional explanatory variables are introduced into equation 2 to correct for the selection bias (see table 1, column 5).4 Although two of these proxies are significant, the estimates for $\beta_1$, $\beta_2$, and $\beta_3$ remain similar.

4. The following variables are added in the equation (coefficient; Student's t): the number of years of presence of the couple ij in the sample (0.04; 31.98); a dummy variable that takes the value 1 if ij is observed during the entire period (0.016; 0.74); and a dummy variable that takes the value 1 if ij is present in t - 1 (0.54; 35.02). Together, these variables have been shown in Monte Carlo experiments by Nijman and Verbeek (1992) to be good proxies for the Heckman correction term for selection bias.

Estimates of the evolution of $\beta_t$ could also be biased because some series may contain a unit root, in which case the estimates in table 1 would be spurious if the relations were not cointegrated. So, a Levin and Lin (1993) unit root test was applied to the series for GDP, population, and bilateral imports. This test rejects, very significantly, the null hypothesis of a unit root for all series.
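The two data treatments described above (filling early missing values and building the three selection proxies) can be sketched in a few lines. This is an illustrative reading of the text, not the authors' code: the data layout (a dictionary from year to import value for one country pair), the function names, and the interpretation of "positive trade is observed thereafter" as any positive value from 1975 onward are all assumptions. The first lines only re-check the sample arithmetic quoted in the text.

```python
# Sketch of the missing-value treatment and the selection proxies.
# All names and the data layout are hypothetical.

FIRST_YEAR, LAST_YEAR, CUTOFF = 1962, 1996, 1975

# Sample arithmetic quoted in the text.
n_countries, n_years = 130, LAST_YEAR - FIRST_YEAR + 1
potential = n_countries * (n_countries - 1) * n_years   # directed pairs x years
missing_share = 1 - 171_998 / potential                 # share of missing cells
increase_with_zeros = 216_511 / 171_998 - 1             # growth after zero fill


def fill_early_zeros(series):
    """Treat pre-CUTOFF gaps as zero trade, recorded as M = 1 (so ln M = 0).

    `series` maps year -> observed import value; missing years are absent.
    Gaps are filled only if positive trade is observed from CUTOFF onward.
    """
    filled = dict(series)
    if any(year >= CUTOFF and value > 0 for year, value in series.items()):
        for year in range(FIRST_YEAR, CUTOFF):
            filled.setdefault(year, 1.0)
    return filled


def selection_proxies(series, year):
    """Return the three proxies for a pair observed in `year`:
    years present in the sample, present in every year, present in year - 1."""
    years_present = len(series)
    always_present = int(years_present == n_years)
    present_lagged = int((year - 1) in series)
    return years_present, always_present, present_lagged


pair = {1980: 5.0, 1981: 6.0, 1985: 2.0}   # made-up import values
filled = fill_early_zeros(pair)
proxies_1981 = selection_proxies(pair, 1981)
```

In this sketch the proxies are computed on the observed sample rather than the zero-filled one, since table 1, column 5 reports the same number of observations as the unfilled regressions.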
Finally, to check that the increasing impact of distance on trade does not capture tendencies in other coefficients, two sets of regressions were estimated: year-by-year regressions with country fixed effects and regressions over three-year subperiods with bilateral effects and the Hausman and Taylor (1981) estimator. The estimated coefficients from 1962 to 1996, plotted in appendix figure C1, show that the increasing impact of distance over time is unaffected. From these robustness checks it can be concluded that the estimated value of $|\beta_t|$ increases over time in the traditional gravity model. The puzzle remains.

II. EXPLAINING THE PUZZLE

The puzzle could be the result of a misspecification of the traditional gravity equation. A more solid theoretical foundation for the gravity equation is found in the currently popular application of the gravity model to aggregate trade between economies assumed to be specialized in differentiated products. As shown by Deardorff (1998) and Anderson and van Wincoop (2003, 2004), utility maximization of an identical constant elasticity of substitution (CES) utility function (over countries) yields the following expression for the cost, insurance, and freight (c.i.f.) value of bilateral imports,

$$M_{ijt} = (Y_{it} Y_{jt} / Y_w)\,(\theta_{ijt} / \tilde{P}_{it} \tilde{P}_{jt})^{1-\sigma}, \tag{3}$$

where $Y_w$ is world income, $\theta_{ijt}$ is bilateral transport costs, $\sigma$ is the elasticity of substitution in the CES utility function, and $\tilde{P}_{it}$, $\tilde{P}_{jt}$ can be interpreted as multilateral trade resistance indexes whose values are given by:

$$\tilde{P}_{it}^{\,1-\sigma} = \sum_j (Y_{it} / Y_w)\,(\theta_{ijt} / \tilde{P}_{jt})^{1-\sigma}. \tag{4}$$

Estimation of the system of equations 3 and 4 raises issues related to estimation of the multilateral trade resistance index and estimation of transport costs. For the trade resistance index, suppose, as in the literature, that estimation proceeds on a cross-section basis.
In that case either data for the price indexes are available (as in Baier and Bergstrand 2001), or $\tilde{P}_i$ and $\tilde{P}_j$ can be estimated directly (from the structural model with nonlinear least squares), following Anderson and van Wincoop (2003). Alternatively, as explained by Anderson and van Wincoop (2003) and Feenstra (2003), country-specific effects can be used to capture the variation in $\tilde{P}_i$ and $\tilde{P}_j$.5

5. See the discussion by Anderson and van Wincoop (2004, p. 712) on the three ways to estimate the theoretical gravity model on a cross-section basis.

Thus, a first method of estimating the gravity equation uses panel data techniques relying on a cross-section specification with a fixed-effects estimator that includes importer and exporter dummy variables (Rose and van Wincoop 2001; Coe and others 2002; Anderson and van Wincoop 2003). However, the preferred approach, used here, relies on a random-effects estimator, with the two dimensions being country-pair and time-specific effects.

Of these two panel techniques the country fixed-effects estimator offers more variability on the distance coefficient because it is estimated year by year. But the risk with this method is that the distance coefficient will capture all unobserved bilateral characteristics that cannot be included in the specification (other than the usual dummy variables for common language, common border, and so on). The country-pair random-effects estimator with a time dimension controls for all the unobserved bilateral specific effects, but at the cost of imposing a trend specification for the evolution of distance. After weighing the tradeoffs, the country-pair random-effects estimator is selected (but for illustrative purposes, appendix C reports the results for year-by-year estimations with country fixed effects).
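One reason pair-specific effects are treated as random rather than fixed in this setting is that demeaning within each country pair removes any regressor that is constant over time, such as ln D_ij, while an interaction of distance with the time trend survives. The toy sketch below illustrates that property only; the pair labels, distances, and five-period panel are made up, and it does not reproduce the authors' estimator.

```python
# Toy illustration of the within (pair fixed-effects) transformation.
# Pair labels and log distances are hypothetical.
from statistics import mean


def within_transform(panel):
    """Subtract each pair's time mean from its series.

    `panel` maps a pair label to a list of values over time.
    """
    return {pair: [v - mean(values) for v in values]
            for pair, values in panel.items()}


periods = list(range(5))                          # t = 0, 1, ..., 4
ln_dist = {("A", "B"): 7.0, ("A", "C"): 8.5}      # time-invariant ln D_ij

# ln D_ij repeated over time, and the interaction t * ln D_ij.
ln_d_panel = {pair: [d for _ in periods] for pair, d in ln_dist.items()}
t_ln_d_panel = {pair: [t * d for t in periods] for pair, d in ln_dist.items()}

demeaned_ln_d = within_transform(ln_d_panel)      # all zeros: not identified
demeaned_t_ln_d = within_transform(t_ln_d_panel)  # still varies within pair
```

After the transformation every entry of `demeaned_ln_d` is zero, so the level coefficient on distance cannot be estimated, whereas `demeaned_t_ln_d` retains within-pair variation.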
The choice of this method requires replacing the multilateral trade resistance indexes with a remoteness measure.6 This implies substituting values of $R_{it}$ and $R_{jt}$ for $\tilde{P}_{it}$ and $\tilde{P}_{jt}$ in equation 3.

Estimation of bilateral transport costs, $\theta_{ijt}$, starts with the standard transport cost function in which distance reflects marginal transport costs. (Later, an augmented version is proposed.) In the standard implementation, used by Hummels (2001a) and Anderson and van Wincoop (2003) among others, transport costs include distance ($D_{ij}$) and a vector of dummy variables for common border ($B_{ij}$) and being landlocked ($L_{i(j)}$). Assuming the standard multiplicative form yields:

$$\theta_{ijt} = (D_{ij})^{\gamma_t}\, e^{\delta_1 B_{ij} + \delta_2 L_i + \delta_3 L_j} \tag{5}$$

with expected signs $\delta_1 < 0$, $\delta_2 > 0$, and $\delta_3 > 0$. Once more, because bilateral specific effects capture the time-invariant characteristics of bilateral trade, equation 5 is estimated without including the dummy variables. The elasticity of transport costs to distance, $\gamma_t$, is assumed to be approximated by a quadratic time trend (t):

$$\gamma_t = \frac{\partial \theta_{ijt} / \theta_{ijt}}{\partial D_{ij} / D_{ij}} = \gamma_1 + \gamma_2\, t + \gamma_3\, t^2. \tag{6}$$

6. As explained, the literature on estimating the theoretical gravity model, notably using multilateral trade resistance indexes, is based on cross-section data. With a time dimension added in the estimation, none of the three suggestions is applicable: Price indexes are not available; the structural estimation method proposed by Anderson and van Wincoop (2003) is not suitable for three dimensions (importer, exporter, and time); and country dummy variables no longer capture the multilateral resistance indexes, which change over time. Hence, the multilateral trade resistance indexes are replaced by a remoteness measure (as in Soloaga and Winters 2001) even if, as Anderson and van Wincoop (2003) show, the functional form of the remoteness variable is not in conformity with the theory and can introduce a bias. However, it is comforting that the year-by-year estimations with country fixed effects (without bias due to the remoteness variables) reported in appendix C yield similar results.

Finally, in estimating a gravity model using panel data with a long time dimension (35 years in this case), it is essential to capture the effects of changes in relative prices, because a normalization of prices to unity can no longer be justified. For a large sample of countries for which representative price indexes are not available, real exchange rate indexes have to be used. As in Soloaga and Winters (2001) and Bayoumi and Eichengreen (1997), among others, the bilateral real exchange rate between i and j, $RER_{ijt}$, is introduced into the equation. Also, $RER_{ijt}$ may be interpreted as a proxy for unobservable movement of multilateral resistance indexes through time.7

Under these working assumptions, estimation of the standard trade barrier function in the gravity model boils down to plugging equation 5 into a modified version of equation 3 that also includes population to proxy Engel effects and, on the supply side, differences in factor endowments (Bergstrand 1989; Frankel 1997; Soloaga and Winters 2001; Coe and others 2002).8 This yields an equation similar to equation 2, except for the inclusion of "remoteness," which takes into account relative transport costs, and the real exchange rate, which takes into account the effects of the evolution in relative prices. This results in:

$$\ln(M_{ijt}) = Z_1 c_1 + \beta_t \ln D_{ij} + \alpha_5 \ln R_{it} + \alpha_6 \ln R_{jt} + \alpha_7 \ln RER_{ijt} + \varepsilon_{ijt} = Z_1 c_1 + Z_2 c_2 + \beta_t \ln D_{ij} + \varepsilon_{ijt} \tag{7}$$

with $\alpha_5 > 0$, $\alpha_6 > 0$, $\alpha_7 < 0$, and, according to equations 3 and 5, $\beta_t = (1 - \sigma)\gamma_t > 0$.

Estimation results for equation 7 are reported in table 2, column 1, and should be compared with those in table 1, column 3.
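Because equation 7 keeps the quadratic trend in the distance elasticity, the year in which $|\beta_t|$ peaks follows directly from the two interaction coefficients. The sketch below uses the magnitudes reported in table 2, column 1 (0.0047 on t.lnDij and 0.00012 on t2.lnDij) and assumes, as an illustration only, that t = 0 in 1962 and that $|\beta_t|$ has the form b1 + b2 t - b3 t^2; the function name is hypothetical.

```python
# Sketch: turning point of |beta_t| = b1 + b2*t - b3*t**2.
# Magnitudes are from table 2, column 1; BASE_YEAR (t = 0 in 1962) and the
# sign pattern are assumptions consistent with the text's description.

BASE_YEAR = 1962
B2_ABS, B3_ABS = 0.0047, 0.00012


def turning_point(b2_abs, b3_abs):
    """Return t* = b2 / (2*b3), where d|beta_t|/dt = b2 - 2*b3*t equals zero."""
    return b2_abs / (2.0 * b3_abs)


t_star = turning_point(B2_ABS, B3_ABS)
turning_year = BASE_YEAR + t_star   # falls within 1981 under these assumptions

if __name__ == "__main__":
    print(round(t_star, 2), int(turning_year))
```

Under these assumptions the peak falls about 19.6 years after 1962, which agrees with the turning point the text dates to 1981 for the standard equation.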
Broadly speaking, coefficient estimates for GDP and population have values similar to those in table 1, while the coefficient for distance continues to be estimated at around 1.3. The coefficient for the real exchange rate ($RER_{ijt}$) has the expected negative sign: An increase of the real effective exchange rate that reflects a depreciation of the importing country's currency against that of the exporting country reduces i's imports from j. Likewise, the remoteness variable is positive and significant: The more remote a pair of countries is from the rest of the world, the more they will tend to trade with each other. Notably, the introduction of bilateral real exchange rates and relative transport costs almost eliminates the trend for $\beta_t$, with an estimated turning point in 1981 (see figure 1). Finally, a 10 percent increase in distance would reduce bilateral trade by 13.4 percent in 1962 and by 13.5 percent in 1996, which is not significantly different. However, the expected decreasing trend fails to be observed over the 35-year period. The puzzle remains. Except for explanations pertaining to the specification of the transport cost function, all possible explanations for the existence of the puzzle that have been raised in the literature have now been exhausted.9

7. We thank an anonymous referee for this observation.

8. As Bergstrand (1989) points out, a negative coefficient estimate for exporter population, $N_j$, can be interpreted as a positive relationship between per capita income and trade (because capital-abundant countries tend to produce and export more). A negative coefficient estimate for importer population, $N_i$, may reflect tastes (however, this coefficient is generally not significantly different from zero, indicating an income elasticity of demand of approximately unity). Note that Coe and others (2002) introduce the population variables as a measure of geography: For larger countries, the cost of trading among themselves rather than with other countries is relatively low compared with the cost for smaller countries. This implies that large countries will tend to trade less than small countries.

TABLE 2. Distance in Standard and Augmented Panel Gravity Models

Columns 1-4
Variable      (1) Standard     (2) Augmented    (3) Standard     (4) Augmented
              All              All              P-P              P-P
lnY_it        0.906 (44.74)    1.039 (71.33)    0.951 (27.27)    1.196 (43.65)
lnY_jt        1.164 (57.87)    1.103 (86.34)    1.352 (31.72)    1.299 (38.39)
lnN_it        0.015 (0.80)     0.072 (5.63)     0.007 (0.18)     0.288 (8.95)
lnN_jt        0.200 (11.04)    0.184 (15.34)    0.350 (8.00)     0.249 (9.15)
lnD_ij        1.333 (70.46)    1.353 (74.37)    1.019 (39.37)    1.059 (41.10)
lnR_it        0.368 (10.74)    0.525 (15.45)    0.419 (6.73)     0.692 (12.21)
lnR_jt        1.909 (59.98)    2.214 (65.72)    1.647 (29.72)    1.877 (33.39)
lnRER_ijt     0.0005 (6.44)    0.0005 (6.02)    0.0007 (5.39)    0.0007 (5.20)
t             0.028 (4.76)     0.063 (10.83)    0.009 (1.28)     0.038 (5.28)
t.lnD_ij      0.0047 (6.80)    0.0034 (4.11)    0.0072 (8.52)    0.0042 (4.56)
t2.lnD_ij     0.00012 (20.84)  0.00003 (2.46)   0.00008 (10.07)  0.0001 (8.90)
lnK_it                         0.088 (6.24)                      0.194 (12.13)
lnK_jt                         0.184 (12.18)                     0.147 (8.35)
lnPF_t                         0.097 (6.70)                      0.007 (0.35)
lnp_jt                         0.216 (21.48)                     0.184 (12.60)
No. obs.      171,998          171,998          57,332           57,332
R-squared     0.64             0.64             0.56             0.58
GLS vs. HT    33,548 χ²(11)    15,410 χ²(15)    900 χ²(11)       628 χ²(15)

Columns 5-8
Variable      (5) Standard     (6) Augmented    (7) Standard     (8) Augmented
              R-R              R-R              P-R              P-R
lnY_it        0.831 (17.76)    0.788 (22.53)    0.852 (14.60)    0.913 (21.21)
lnY_jt        1.005 (23.39)    0.982 (28.44)    1.303 (24.55)    1.236 (35.78)
lnN_it        0.020 (0.54)     0.176 (5.76)     0.191 (3.83)     0.272 (7.47)
lnN_jt        0.179 (4.38)     0.323 (10.27)    0.298 (5.49)     0.292 (8.59)
lnD_ij        0.715 (25.62)    0.716 (24.83)    0.993 (38.29)    0.978 (40.36)
lnR_it        0.127 (1.63)     0.085 (1.05)     0.136 (1.99)     0.061 (0.86)
lnR_jt        0.165 (2.83)     0.169 (2.74)     1.658 (34.47)    1.867 (37.30)
lnRER_ijt     0.0003 (2.00)    0.0003 (2.39)    0.0008 (4.58)    0.0007 (7.77)
t             0.028 (2.64)     0.029 (2.60)     0.028 (3.22)     0.0666 (4.40)
t.lnD_ij      0.0009 (1.05)    0.0048 (3.33)    0.0069 (1.05)    0.0053 (2.06)
t2.lnD_ij     0.00005 (3.79)   0.00005 (3.26)   0.00007 (6.44)   0.00014 (6.44)
lnK_it                         0.0727 (3.78)                     0.0552 (1.96)
lnK_jt                         0.115 (5.02)                      0.159 (7.51)
lnPF_t                         0.408 (19.70)                     0.523 (20.77)
lnp_jt                         0.027 (1.34)                      0.153 (11.72)
No. obs.      57,332           57,332           57,332           57,332
R-squared     0.62             0.64             0.56             0.59
GLS vs. HT    458 χ²(11)       716 χ²(15)       804 χ²(11)       876 χ²(15)

Source: Authors' computations based on data described in appendix A.
Note: Numbers in parentheses are t-statistics. GLS is generalized least squares estimator and HT is Hausman and Taylor (1981) estimator. P-P is bilateral trade between the poorest tercile of countries, and R-R is bilateral trade between the richest tercile.

Several new factors are added to the standard specification of the transportation cost function (equation 5). First, as in Limao and Venables (2001), an index is introduced for the quality of infrastructure in period t, $K_{i(j)t}$, with larger values indicating better infrastructure.10 Also included is the cost of oil, $PF_t$, arguably the main factor affecting the marginal cost of transport. Finally, differential freight costs between primary products and manufactures are considered by including the share of primary products in total exports, $p_{ijt}$, as a proxy.11 Because of data unavailability, $p_{ijt}$ is proxied by $p_{jt}$, the share of primary export products in total exports for country j regardless of destination. Thus, the augmented transport cost function becomes:12

(8) $\theta_{ijt} = (K_{it})^{\rho_1} (K_{jt})^{\rho_2} (PF_t)^{\rho_3} (p_{jt})^{\rho_4} (D_{ij})^{\gamma_1 + \gamma_2 t + \gamma_3 t^2}$

with the elasticity of transport costs to distance given by equation 6, and with the following expected signs: $\rho_1 < 0$, $\rho_2 < 0$, $\rho_3 > 0$, $\rho_4 > 0$.
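To make the link between the augmented transport cost function and a log-linear trade equation explicit, the worked step below takes logs of equation 8, writing transport costs as $\theta_{ijt}$. It assumes, in line with the article's statement that $\beta_t = (1-\sigma)\gamma_t$, that transport costs enter the trade equation raised to the power $(1-\sigma)$; equation 3 itself is not reproduced in this section, so this is a sketch under that assumption rather than the authors' own derivation.

```latex
\begin{align*}
\ln\theta_{ijt} &= \rho_1 \ln K_{it} + \rho_2 \ln K_{jt} + \rho_3 \ln PF_t
                  + \rho_4 \ln p_{jt} + \gamma_t \ln D_{ij},
                  \qquad \gamma_t = \gamma_1 + \gamma_2 t + \gamma_3 t^2, \\
(1-\sigma)\ln\theta_{ijt} &= (1-\sigma)\rho_1 \ln K_{it} + (1-\sigma)\rho_2 \ln K_{jt}
                  + (1-\sigma)\rho_3 \ln PF_t + (1-\sigma)\rho_4 \ln p_{jt}
                  + (1-\sigma)\gamma_t \ln D_{ij}.
\end{align*}
```

Each trade coefficient is thus the corresponding transport cost elasticity scaled by $(1-\sigma)$. With $\sigma > 1$ (assumed here), the signs flip: a factor that lowers transport costs ($\rho < 0$) raises trade, and a factor that raises them lowers trade.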
Inserting equation 8 into equation 7 gives the augmented gravity model:

(9) $\ln(M_{ijt}) = Z_1 c_1 + Z_2 c_2 + \beta_t \ln D_{ij}$
      $+ \; \alpha_8 \ln K_{it} + \alpha_9 \ln K_{jt} + \alpha_{10} \ln PF_t + \alpha_{11} \ln p_{jt} + \varepsilon_{ijt}$

The expected signs in equation 9 are $\alpha_8 = (1-\sigma)\rho_1 > 0$, $\alpha_9 = (1-\sigma)\rho_2 > 0$, $\alpha_{10} = (1-\sigma)\rho_3 < 0$, $\alpha_{11} = (1-\sigma)\rho_4 < 0$.

9. Also addressing the distance puzzle, Coe and others (2002) estimate a nonlinear cross-section gravity model in which they consider the possibility that the error term enters additively in equation 3 instead of as a multiplicative exponential, as implicitly assumed here. They find that when estimated nonlinearly, the coefficient for the distance variable shows a decline between 1975 and 2000, whereas it presents no clear trend when estimated with the standard log-linear specification (as in equation 7). Coe and others recognize, however, that theoretical models are nonstochastic and that they have not formally tested the appropriateness of an additive rather than a multiplicative error structure, which limits the inferences that can be drawn from their comparison of the two estimation procedures.

10. The index constructed by Limao and Venables (2001) from data in Canning (1996) is used. The index is a simple average of four indexes (roads, paved roads, telephone lines, and railways) corrected for density. Appendix A describes how this index was constructed.

11. Including the mode of transport would also be desirable, but that information is not available for such a large sample.

12. The elements in this function capture several barriers to trade beyond transport cost. The phrases trade barrier, transactions costs, and transportation costs are used interchangeably to remind readers that this reduced form goes beyond capturing transport costs. For a similar interpretation of the values taken by the distance coefficients in the standard gravity model, see Rauch (1999).

In comparing results from the augmented specification with those from the standard specification, it should be noted that the standard specification excludes the four variables in the second line of equation 9. The coefficient estimates for the augmented transport cost function in column 2 of table 2 are relatively close in value to those obtained in column 1, even though the estimated values for the remoteness variables are now larger. All the variables included in the augmented transport cost function carry the expected signs and are significant. The coefficient for the price of oil is negative and significant. Likewise, infrastructure improvement significantly increases the volume of trade (see Limao and Venables 2001). Finally, the share of primary products in total exports of j has a significant negative impact on trade, capturing the stylized fact that freight costs are greater for primary products than for manufactures (Hummels 2001a).

Thus, with the augmented trade barrier function, the puzzle disappears. The expected decreasing trend for $\beta_t$ is shown in figure 1. According to the estimates in table 2, column 2, a 10 percent increase in distance would reduce bilateral trade by 13.5 percent in 1962 and by 12.0 percent in 1996, a decrease in the impact of distance of about 11.1 percent over 35 years.

Each of the transport cost function components is also introduced separately in the standard gravity equation to see how much each contributes to solving the puzzle. Oil prices solve much of the puzzle, explaining around 45 percent of the change in the elasticity of bilateral trade to distance. Infrastructure variables are also significant, explaining about 40 percent of the puzzle. Trade composition explains about 15 percent.

Next, the results are tested for sensitivity to the standard misspecification problems of endogeneity, sample selection bias, and nonvarying coefficients.
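The 11.1 percent decline in the impact of distance reported for the augmented model follows from simple arithmetic on the two trade effects quoted in the text. A minimal Python sketch, using only the rounded figures given above (13.5 percent in 1962 and 12.0 percent in 1996); the variable names are illustrative, and the component shares are copied from the text rather than re-estimated:

```python
# Percent fall in bilateral trade from a 10 percent rise in distance,
# as quoted in the text for table 2, column 2.
effect_1962 = 13.5
effect_1996 = 12.0

# Implied absolute distance elasticities (effect per 1 percent of distance).
beta_1962 = effect_1962 / 10
beta_1996 = effect_1996 / 10

# Relative decline in the impact of distance between 1962 and 1996.
relative_decline = (effect_1962 - effect_1996) / effect_1962 * 100

# Shares of the change attributed to each component, as reported in the text.
shares = {"oil price": 45, "infrastructure": 40, "trade composition": 15}

print(f"decline in impact of distance: {relative_decline:.1f} percent")
print(f"sum of component shares: {sum(shares.values())} percent")
```

The three reported shares add up to the whole change, which is consistent with the decomposition being read as approximate contributions of each component.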
To check for sensitivity to potential endogeneity for the infrastructure, remoteness, distance, and population variables, these variables (in addition to those for GDP) are instrumented according to the Hausman and Taylor (1981) method. There is no effect on the evolution of the elasticity of bilateral trade to distance. The results are unaffected when the other coefficients are allowed to change over subperiods of three years (see appendix C, equation C4). The declining trend in the distance coefficient remains unaffected when the three variables used earlier to correct for a potential selection bias are introduced. Finally, the dummy variables for regional agreements added in equation 9 have no effect on the estimated values for $\beta_t$ (even if most of the regional agreements considered have a positive and significant direct impact on bilateral trade).13

The conclusion is thus that correction of some of the obvious misspecifications in the gravity equation that relies on distance as a proxy for transport or trade barriers yields plausible results for the coefficients and also yields a plausible reduction in the barrier to trade represented by distance.

13. Trade agreements taken into account are the European Union, Mercosur, Association of Southeast Asian Nations, Andean Community, West African Economic and Monetary Union, Central African Economic and Monetary Community, Economic Community of West African States, Southern African Development Community, and Common Market for Eastern and Southern Africa.

III. ROBUSTNESS

The gravity model is known to be sensitive to sample and product selection. Evenett and Keller (2002) find that it performs better than the Heckscher-Ohlin model in explaining trade patterns for manufactures, which are largely differentiated products.
It has also been found repeatedly that the standard gravity model derived under the hypothesis of complete specialization in different products performs better for industrial countries than for developing economies (see Feenstra 2003, chap. 5). The suitability of the aggregate gravity model as a description of trade patterns and an approximation of barriers to trade in a heterogeneous sample of countries deserves to be explored further.

The sample is broken down into three groups of equal size with selection according to the income per capita of each bilateral trade partner so that P-P is bilateral trade between the poorest tercile of countries in each time period and R-R between the richest tercile.14 With the standard specification the overall fit is approximately the same for both groups, though coefficient values for the distance and remoteness variables are much larger for P-P bilateral trade than for R-R bilateral trade (table 2, columns 3 and 5). In the standard specification the puzzle remains for both groups, although more pronounced for the P-P group.

Results from the augmented specification for the P-P and R-R groups (columns 4 and 6 of table 2) reveal larger discrepancies across coefficients than those for the standard specification (columns 3 and 5), but larger values are again found for the coefficients for distance and remoteness for the P-P group. The coefficients for infrastructure are significant in both groups but are larger for the P-P group. The price of oil is significant only for R-R trade, whereas the share of primary commodities in trade is significant only for P-P trade.15 Thus the results from the augmented gravity model suggest the death of distance for trade between high-income countries and marginalization for low-income countries.

14. As noted by an anonymous referee, this means that the sample is likely to change depending on relative income growth and then to be endogenous to trade.
However, over the whole period the maximum changes noted in the P-P group of 45 countries are the entry of 4 countries and the exit of 2. For the R-R group of 39 countries, 2 countries entered and 1 exited. These changes represent only 4.7 percent of the P-P sample observations (2740/57,332) and 3 percent of the R-R sample observations (1730/57,332).

15. Again, the introduction of variables checking for selection bias does not affect the evolution of $\beta_t$. However, the coefficient values for these variables are larger for the P-P regressions, which would be consistent with some remaining specification problems.

The results for the R-R group accord with those obtained by Hummels (2001a) for freight rate estimates for U.S. imports at the commodity level for 1974-98. Estimating freight rate costs as a function of weight, distance, commodity fixed effects, and a time trend (and a time trend squared), Hummels finds that the distance coefficient falls over time, but only after containers are introduced (in 1980). Of course, this is only very indirect evidence that an augmented trade barrier function in a gravity equation may capture some of the determinants of transport costs isolated in a more reliable data set, but it is reassuring, nonetheless.

The apparent marginalization of low-income countries is confirmed in another breakdown reporting exports from the poorest tercile of countries to the richest tercile of countries (P-R) (see table 2, columns 7 and 8). As expected, coefficients for exporters are close to the corresponding values in the P-P group and coefficients for importers are close to the corresponding values in the R-R group. The distance trend exhibits an evolution similar to that in the P-P group. In sum, the marginalization of poor countries with respect to trading partners still holds.

That said, the results, especially those for the P-P group, must be interpreted with caution.
First, the diverging evolution of $\beta_t$ between the standard gravity model (see table 2, columns 3 and 5) and the augmented model (columns 4 and 6) in the two samples can be explained by the rate of improvement of the infrastructure index, which is twice as large for the high-income portion of the sample as for the low-income portion. The impact of infrastructure is, in principle, controlled for through the index $K_{i(j)t}$. But at least two variables that have a bearing on the impact of distance are correlated with $K_{i(j)t}$ but are not included in the model. One is time in transit, which is higher for P-P bilateral trade.16 Another is mode of transport, which has changed more for the R-R group. Hence, when $K_{i(j)t}$ is included, $|\beta_t|$ decreases significantly for R-R while remaining unaffected in the P-P sample.

Second, explaining the puzzle for the P-P group probably requires a reconsideration of the specification of the augmented equation. One candidate is bilateral foreign direct investment, which has increased more rapidly for the R-R group and could be correlated with any of the factors independent of distance included in the model. Exchange rate volatility and various omitted trade costs could also be involved. Rose and Van Wincoop (2001) use a gravity model framework to argue that countries with currency unions trade more than three times as much with each other as with other countries. Likewise, Obstfeld and Rogoff (2001, p. 9) find that currency conversion costs and exchange rate uncertainty can boost trade costs. The same applies for informational costs associated with international trade, such as search costs (Rauch 1999), for corruption and imperfect contract enforcement (Anderson and Marcouiller 2002), or for technological changes in transport during the period under analysis (such as containerization, which raises the quality of shipping and lags behind in poorer countries; see Hummels 2001a). Finally, the changing composition of trade over time should be investigated beyond the rough decomposition into primary and manufactured products used here.17

16. Using shipments of manufactures to the United States, Hummels (2001b) estimates the cost of an extra day in transit at 0.5 percent of the value shipped. At equal distance, time in transit is higher for P-P bilateral trade, in part because ships travel routes less frequently.

A final issue is the realism of the standard multiplicative form assumed for "trade frictions" in all gravity models, including this one. In equation 6 the elasticity of transport costs to distance depends only on time. Suppose instead that transport costs have two components, one that is fixed with respect to distance (such as the quality of infrastructure) and one that is variable (such as the price of oil). It can easily be shown that the elasticity of transport costs to distance could increase if the fixed cost component were falling sufficiently faster than the variable cost component (appendix D). As detailed in appendix D, efforts to estimate the resulting highly nonlinear model in the transport cost variables were unsuccessful, preventing exploration of alternatives to the standard multiplicative form for the transport cost function. However, Limao and Venables (2001) conclude that the multiplicative form of their transport cost function fits their data better than the additive form does.

IV. CONCLUSION

Several variants of a panel gravity model were used to address the distance puzzle for a sample of 130 countries over the period 1962-96. The puzzle proved robust to several ad hoc versions of the gravity model, but it was significantly reduced when the gravity model was correctly specified to include remoteness (or an index of multilateral trade resistance).
Adding an augmented trade barrier function (real price of oil, index of infrastructure, and share of primary exports in total bilateral trade) that corrects for the misspecification inherent in the standard representation of transport costs by distance yielded plausible estimates of the expected death of distance.

Despite the many shortcomings associated with gravity-based indirect estimates of transport costs, several intuitively plausible results emerge from the model estimations: an elasticity of trade to income close to unity (as suggested by theory), a significant impact of the real exchange rate on the volume of bilateral trade, and expected significant signs for exporter and importer country characteristics and for the impact of remoteness on the volume of trade. The model produces an estimate of the elasticity of trade with respect to distance that is very close to direct estimates obtained from transport cost data, and the results are consistent with those obtained using more reliable data on U.S. transport costs. The model with the augmented trade barrier function yields a plausible estimate of an 11 percent decrease in the impact of distance on bilateral trade over the 35-year period.

17. Actually, changing the composition of trade toward more distance-sensitive goods can be expected to increase the elasticity of trade to distance. However, Berthelon and Freund (2004) find no support for this argument.

Splitting the sample into three equal-size groups by income per capita revealed significant differences in bilateral trade coefficient estimates for low-income bilateral trade and high-income bilateral trade. The coefficients capturing barriers to trade, including distance, have much higher values for the low-income group (P-P). The puzzle remains for low-income bilateral trade in the standard and the augmented models, while it remains for high-income bilateral trade only in the standard model.
Even though problems of interpretation persist, the results from this sample-splitting procedure are consistent with recent claims that poor countries have been marginalized by the current wave of globalization while rich countries have benefited from a death of distance.

APPENDIX A. DATA SOURCES AND DATA PREPARATION

• $M_{ijt}$: Total bilateral imports, in current U.S. dollars, by country i from country j at date t. UN Comtrade. The original database does not contain any zero entries.
• $Y_{i(j)t}$: GDP of country i (j) at date t, in constant 1995 U.S. dollars. World Bank (1999).
• $N_{i(j)t}$: Total population of country i (j) at date t. World Bank (1999).
• $D_{ij}$: Distance in kilometres between the main city in country i and the main city in country j. Database developed by CVN . Usually, the main city is the capital city, but for some countries the main economic city is considered. The distance used is orthodromic, that is, it takes into account the sphericity of the Earth.
• $B_{ij}$: Dummy variable equals 1 if i and j share a common land border, and 0 otherwise.
• $L_{i(j)}$: Dummy variable equals 1 if i (j) is a landlocked country, and 0 otherwise.
• $K_{i(j)t}$: Infrastructure index, built using four variables from the database constructed by Canning (1996): number of kilometers of roads, paved roads, railways, and number of telephone sets or lines per capita. The first three variables are ratios to the surface area (World Bank 1999) to obtain a density. Each variable is normalized with a mean equal to one. An arithmetic average is then calculated over the four variables. Data for 1996 are extrapolated because 1995 is the final year for the database.
• $PF_t$: World oil price index. International Monetary Fund (IMF) International Financial Statistics database.
• $p_{jt}$: Ratio of primary export products to total exports of country j at date t. UN Comtrade.
• $RER_{ijt}$: Bilateral real exchange rate. IMF International Financial Statistics database.
It is computed as follows:

$RER_{ijt} = \dfrac{CPI_{jt}}{CPI_{it}} \cdot \dfrac{NER_{it/\$t}}{NER_{jt/\$t}},$

where $NER_{i(j)t/\$t}$ is country i's (j's) currency value for US$1 at date t and $CPI_{i(j)t}$ is the consumption price index for country i (j) at date t. If the CPI is not available, the GDP deflator is used. For each pair of countries, the real exchange rate is specified such that its mean over the period is zero.
• $R_{i(j)t}$: Remoteness index defined as the weighted distance to all trading partners of country i: $R_{it} = \sum_j w_{jt} D_{ij}$ for $i \neq j$ and with $w_{jt} = Y_{jt} / \sum_j Y_{jt}$.

APPENDIX B. ESTIMATION METHOD

Write the model as:

(B1) $M_{ijt} = X_{ijt}\varphi + W_{ij}\delta + \varepsilon_{ijt}$ with $\varepsilon_{ijt} = \mu_{ij} + \nu_{ijt}$,

where X represents k variables varying over time, and W represents g variables that are time invariant. Some explanatory variables, such as GDP, are likely to be correlated with the bilateral specific effects, which is confirmed by the $\chi^2$ value for the Hausman test in column 2 of table 1. This test rejects the null hypothesis of no correlation between the bilateral specific effects and the explanatory variables. Hence the generalized least squares (GLS) estimator in columns 1 and 2 of table 1 gives inconsistent estimates.

The instrumental variables estimator proposed by Hausman and Taylor (1981) is used to deal with this issue. Assume that the $X_1$ terms (dimension $k_1$) are exogenous variables, and the $X_2$ terms (dimension $k - k_1$) are endogenous variables (correlated with the random specific effects). Variable $X_2$ includes the income variables, $Y_{it}$ and $Y_{jt}$. Breusch and others (1989) suggest using $[QX_1, QX_2, PX_1, W]$ as instruments, which are then taken within the model. (Here Q is a matrix that expresses deviations from country-pair means, and P is a matrix that averages the observations across time for each country pair.) The resulting estimator is consistent but not efficient because it is not corrected for heteroscedasticity and serial correlation.
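The within (Q) and between (P) transformations used to build the instrument set can be illustrated on a toy panel. The plain-Python sketch below applies both to a single variable observed for two hypothetical country pairs over three periods. The data and the helper names `between` and `within` are illustrative assumptions and are not taken from the article.

```python
from collections import defaultdict

def between(values, pairs):
    """P transform: replace each observation by its country-pair time mean."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, p in zip(values, pairs):
        sums[p] += v
        counts[p] += 1
    return [sums[p] / counts[p] for p in pairs]

def within(values, pairs):
    """Q transform: deviation of each observation from its pair mean."""
    return [v - m for v, m in zip(values, between(values, pairs))]

# Two hypothetical country pairs, three periods each (invented numbers).
pairs = ["A-B", "A-B", "A-B", "C-D", "C-D", "C-D"]
x = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]

px = between(x, pairs)  # pair means repeated over time
qx = within(x, pairs)   # deviations from pair means

# Q and P split each observation into two parts that add back to x.
recombined = [q + p for q, p in zip(qx, px)]
```

Because the Q-transformed values sum to zero within each pair, they carry no pair-level variation, which is the property that lets deviations from pair means serve as instruments even for variables correlated with the bilateral specific effects.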
Following the suggestion of Hausman and Taylor (1981), first-round estimates are used to compute the variance of the specific effects, $\mu_{ij}$, and the variance of the error term, $\nu_{ijt}$ (see, for example, Guillotin and Sevestre 1994). The instrumental variable estimator is then applied to the following transformed equation:

(B2) $[M_{ijt} - (1-\theta)M_{ij.}] = [X_{ijt} - (1-\theta)X_{ij.}]\varphi + (\theta W_{ij})\delta + \{\theta\mu_{ij} + [\nu_{ijt} - (1-\theta)\nu_{ij.}]\},$

where $\theta = \left(\sigma_\nu^2 / [T\sigma_\mu^2 + \sigma_\nu^2]\right)^{1/2}$.

The test proposed by Guillotin and Sevestre (1994) is used to compare the Hausman and Taylor (HT) estimator, $b_{HT}$, and the GLS estimator, $b_{GLS}$. The Hausman statistic is based on:

(B3) $(b_{GLS} - b_{HT})\,[\mathrm{var}(b_{HT}) - \mathrm{var}(b_{GLS})]^{-1}\,(b_{GLS} - b_{HT})'.$

Under the null hypothesis, this test statistic is distributed as a $\chi^2$ with degrees of freedom equal to the dimension of the vector $b_{GLS}$ (k + g), constant excluded. If the calculated statistic is greater than the critical value, the null hypothesis is rejected and the Hausman and Taylor estimator is preferred to the GLS estimator. The values of the $\chi^2$ statistic for that test in table 1 always exceed the critical value, so that the null hypothesis is rejected and the Hausman and Taylor estimator in column 3 of table 1 is preferred to the GLS estimator when GDP variables are instrumented.

APPENDIX C. EVOLUTION OF $|\beta|$ BY SUBPERIOD

(C1) $\ln(M_{ij}) = \alpha_0 + \lambda_i + \kappa_j + \alpha_1 \ln(Y_i Y_j) + \alpha_2 \ln(N_i N_j) + \beta \ln(D_{ij}) + \alpha_3 B_{ij} + \nu_{ij}$

with $B_{ij} = 1$ if i and j share a common land border. Estimated for each year with country fixed effects (corresponding to equation 1).

(C2) $\ln(M_{ijt}) = Z_1 c_1 + \beta \ln(D_{ij}) + \mu_{ij} + \nu_{ijt}$

Estimated by subperiods of three years with bilateral random effects (corresponding to equation 2).

(C3) $\ln(M_{ijt}) = Z_1 c_1 + Z_2 c_2 + \beta \ln(D_{ij}) + \mu_{ij} + \varepsilon_{ijt}$

Estimated by subperiods of three years with bilateral random effects (corresponding to equation 7).
(C4) $\ln(M_{ijt}) = Z_1 c_1 + Z_2 c_2 + \beta \ln(D_{ij}) + \alpha_8 \ln(K_{it}) + \alpha_9 \ln(K_{jt}) + \alpha_{10} \ln(PF_t) + \alpha_{11} \ln(p_{jt}) + \mu_{ij} + \nu_{ijt}$

Estimated by subperiods of three years with bilateral random effects (corresponding to equation 9).

FIGURE C1. Evolution of the Coefficient of Distance $|\beta|$ by Subperiod
[Line chart. Horizontal axis: years, 1962 to 1994. Vertical axis: $|\beta|$, from 0.80 to 1.50. Series: Equation C1, Equation C2, Equation C3, Equation C4.]
Source: Authors' computations based on data described in appendix A.

APPENDIX D. ADDITIVE TRANSPORT COST FUNCTION

Under the standard multiplicative form used in the article, the elasticity of trade costs to distance depends only on time (equation 6). Suppose now that transport costs have two components, one that is fixed with respect to distance, $\theta^F_{ijt}$, and one that is variable, $\theta^V_{ijt}$, that is, $\theta_{ijt} = \theta^F_{ijt} + \theta^V_{ijt}$. In that case equation 8 would be rewritten as:

(D1) $\theta_{ijt} = \underbrace{(K_{it})^{\rho_1}(K_{jt})^{\rho_2}}_{\theta^F_{ijt}} + \underbrace{(PF_t)^{\rho_3}(p_{jt})^{\rho_4}(D_{ij})^{\gamma_1 + \gamma_2 t + \gamma_3 t^2}}_{\theta^V_{ijt}} = \theta^F_{ijt} + \theta^V_{ijt}$

Under the specification in equation D1, it can be shown that equation 6 is now given by:

(D2) $\dfrac{\partial\theta_{ijt}/\theta_{ijt}}{\partial D_{ij}/D_{ij}} = \dfrac{\theta^V_{ijt}\,(\gamma_1 + \gamma_2 t + \gamma_3 t^2)}{\theta^F_{ijt} + \theta^V_{ijt}}.$

From equation D2 it can be seen that the elasticity of transport costs with respect to distance could increase if the fixed cost component were falling sufficiently faster than the variable component. Substituting equation D1 into the modified version of equation 3 yields:

(D3) $\ln(M_{ijt}) = Z_1 c_1 + Z_2 c_2 + \phi \ln\left[(K_{it})^{\rho_1}(K_{jt})^{\rho_2} + (PF_t)^{\rho_3}(p_{jt})^{\rho_4}(D_{ij})^{\gamma_1 + \gamma_2 t + \gamma_3 t^2}\right]$

where $\phi = (1 - \sigma)$. Expression D3 is highly nonlinear.
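The claim that the distance elasticity can rise under the additive form can be checked numerically from equation D2. The Python sketch below holds the distance exponent constant and lets both cost components fall between two periods, with the fixed component falling faster. All numbers are hypothetical and chosen only to illustrate the mechanism; the function name `distance_elasticity` is likewise illustrative.

```python
def distance_elasticity(theta_fixed, theta_variable, gamma):
    """Equation D2: elasticity of total transport costs to distance."""
    return theta_variable * gamma / (theta_fixed + theta_variable)

gamma = 0.3  # hypothetical distance exponent, held constant over time

# Period 1: hypothetical levels of the fixed and variable components.
e1 = distance_elasticity(theta_fixed=1.0, theta_variable=1.0, gamma=gamma)

# Period 2: the fixed component falls by 50 percent,
# the variable component by only 10 percent.
e2 = distance_elasticity(theta_fixed=0.5, theta_variable=0.9, gamma=gamma)

print(round(e1, 4), round(e2, 4))
```

Total transport costs fall in this example (from 2.0 to 1.4), yet the elasticity with respect to distance rises, because the distance-dependent component now accounts for a larger share of the total.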
Attempts to estimate equation D3 failed to yield convergence in the estimates even when estimated only for the R-R group (where errors in the variables measurement problems are likely to be smallest). The only consolation is that Limao and Venables (2001), who have real transport cost data for a sample similar to the one used here, claim that the multiplicative form is supported by their data. REFERENCES Anderson, J. E., and D. Marcouiller. 2002, ``Insecurity and the Pattern of Trade: An Empirical Investiga- tion.'' Review of Economics and Statistics 84(2):342­52. Anderson, J. E., and E. Van Wincoop. 2003. ``Gravity with Gravitas: A Solution to the Border Puzzle.'' American Economic Review 93(1):170­92. ------. 2004. ``Trade Costs.'' Journal of Economic Literature 42(September):691­751. Baier, S. L., and J. H. Bergstrand. 2001. ``The Growth of World Trade: Tariffs, Transport Costs, and Income Similarity.'' Journal of International Economics 53(1):1­27. Bayoumi, T., and B. Eichengreen. 1997. ``Is Regionalism Simply a Diversion? Evidence from the Evolu- tion of the EC and EFTA.'' In T. Ito and A. Krueger, eds., Regionalism versus Multilateral Trade Arrangements. National Bureau of Economic Research East Asia Seminar on Economics, vol. 6. Chicago, Ill.: University of Chicago Press. Bergstrand, J. H. 1989. ``The Generalized Gravity Equation, Monopolistic Competition, and the Factor- Proportions Theory of International Trade.'' Review of Economics and Statistics 71(1):143­53. Berthelon, M., and C. Freund. 2004. ``On the Conservation of Distance in International Trade.'' World Bank Working Paper 3293. Washington, D.C. Breusch, T., G. Mizon, and P. Schimdt. 1989. ``Efficient Estimation Using Panel Data.'' Econometrica 57(3):695­700. Brun, J. F., P. Guillaumont, and J. de Melo. 1999. ``La distance abolie? Crite`res et mesures de la mondialisation du commerce exte´rieur.'' In A. Bouet and J. 
Le Cacheux, eds., Globalisation et politiques e´conomiques: les marges de manoeuvre. Paris: Economica. Brun, J. F., C. Carrere, P. Guillaumont, and J. de Melo. 2002. ``Has Distance Died? Evidence from a Panel Gravity Model.'' CEPR Discussion Paper 3500. Centre for Economic Policy Research, London. Canning, D. 1996. ``A Database of World Infrastructure Stocks, 1950­1995.'' World Bank Economic Review 12(3):529­47. Cheng, I. H., and H. J. Wall. 1999. ``Controlling for Heterogeneity in Gravity Models of Trade.'' Federal Reserve Bank of St. Louis Working Paper 99-010A. St. Louis, Mo. Coe, D. T., A. Subramanian, N. T. Tamirisa, and R. Bhavnani. 2002. ``The Missing Globalization Puzzle.'' IMF Working Paper WP/02/171. International Monetary Fund, Washington, D.C. Deardorff, A. 1998. ``Determinants of Bilateral Trade: Does Gravity Work in a Neoclassical World?'' In J. A. Frankel, ed., The Regionalization of the World Economy. Chicago, Ill.: University of Chicago Press. Egger, P., and M. Pfaffermayr. 2003. ``The Proper Panel Econometric Specification for the Gravity Equation: A Three-way Model with Bilateral Interaction Effects.'' Empirical Economics 28(3):571­80. Estevadeordal, A., B. Frantz, and A. M. Taylor. 2003. ``The Rise and Fall of World Trade, 1870­1939.'' Quarterly Journal of Economics 118(May):359­407. Evenett, S. J., and W. Keller. 2002. ``On Theories Explaining the Success of the Gravity Equation.'' Journal of Political Economy 110 (2): 281­316. Feenstra, R. 2003. ``Increasing Returns and the Gravity Equation.'' Advanced International Trade: Theory and Evidence. Princeton, N.J.: Princeton University Press. 120 , 19, 1 T H E WO R L D B A N K E C O N O M I C R E V I E W V O L . N O . Frankel, J. 1997. Regional Trading Blocs in the World Economic System. Washington, D.C.: Institute for International Economics. Guillotin, Y., and P. Sevestre. 1994. 
"Estimations de fonctions de gains sur données de panel: endogénéité du capital humain et effets de la sélection." Economies et Prévision 116(5):119–35.
Hausman, J. A., and E. Taylor. 1981. "Panel Data and Unobservable Individual Effects." Econometrica 49(6):1377–98.
Hummels, D. 1999. "Towards a Geography of Transport Costs." University of Chicago, Graduate School of Business, Chicago, Ill.
------. 2001a. "Have International Transportation Costs Declined?" Journal of International Economics 54(1):75–96.
------. 2001b. "Time as Trade Barrier." Purdue University, School of Management, West Lafayette, Ind.
Hummels, D., and V. Lugovsky. 2003. "Usable Data? Matched Partner Trade Statistics as a Measure of International Transportation Costs." Purdue University, Department of Economics, West Lafayette, Ind.
IMF (International Monetary Fund). Various years. International Financial Statistics database. Washington, D.C.
Leamer, A. 1993. "The Commodity Composition of International Trade in Manufactures: An Empirical Analysis." Oxford Economic Papers 26(3):350–74.
Leamer, E., and J. Levinsohn. 1995. "International Trade: The Evidence." In G. M. Grossman and K. Rogoff, eds., Handbook of International Economics, vol. 3. New York: Elsevier, North-Holland.
Levin, A., and C. F. Lin. 1993. "Unit Root Tests in Panel Data: New Results." University of California, San Diego, Department of Economics.
Limao, N., and A. J. Venables. 2001. "Infrastructure, Geographical Disadvantage and Transport Costs." World Bank Economic Review 15(3):451–79.
Matyas, L. 1997. "Proper Econometric Specification of the Gravity Model." World Economy 20(3):363–68.
Nijman, T., and M. Verbeek. 1992. "Incomplete Panels and Selection Bias." In L. Matyas and P. Sevestre, eds., The Econometrics of Panel Data. Dordrecht, Netherlands: Kluwer.
Obstfeld, M., and K. Rogoff. 2001. "The Six Major Puzzles in International Macroeconomics. Is There a Common Cause?" NBER Working Paper 7777.
National Bureau of Economic Research, Cambridge, Mass.
Rauch, J. E. 1999. "Networks versus Markets in International Trade." Journal of International Economics 48(1):7–35.
Rose, J., and E. Van Wincoop. 2001. "National Money as a Barrier to International Trade: The Real Case for Currency Union." American Economic Review, Papers and Proceedings 91(2):386–90.
Soloaga, I., and A. Winters. 2001. "How Has Regionalism in the 1990s Affected Trade?" North American Journal of Economics and Finance 12(1):1–29.
World Bank. 1999. World Development Indicators 1999. CD-ROM. Washington, D.C.

Measuring and Explaining the Impact of Productive Efficiency on Economic Development

Ruwan Jayasuriya and Quentin Wodon

A limitation of most empirical cross-country studies that focus on determinants of GDP or GDP growth is that they fail to distinguish explicitly between inputs used in production and conditions that facilitate production. For example, physical capital, human capital, and labor are production inputs, whereas the quality of institutions, macroeconomic stability, and market quality are conditions that facilitate production. This article takes this distinction seriously and uses a stochastic frontier approach to study factors affecting economic performance. A panel data set of 71 countries for the 1980–98 period is used to estimate a production frontier with physical capital, human capital, and labor as inputs. The article also analyzes what drives productive efficiency, using the institutional framework, macroeconomic stability, market quality, and urbanization as possible explanatory factors. Urbanization turns out to be an important determinant, with the rule of law, inflation rate, and market quality also affecting productive efficiency.

Measuring economic performance is an issue not only of academic interest but also of practical concern.
Numerous cross-country studies that use gross domestic product (GDP) levels or growth rates as a yardstick for economic performance have found that conventional factors used to determine output, such as physical and human capital and labor force size, do not fully explain production. Although the results are somewhat sensitive to the model specification, measures of market distortion, macroeconomic environment, political stability, research and development, and the depth of financial markets have all been found to affect economic development (for reviews, see among others Barro and Sala-i-Martin 1995; Sala-i-Martin 1997; Solow 2000; Aron 2000; Easterly 2001).

Ruwan Jayasuriya is a senior associate at PricewaterhouseCoopers LLP; his email address is ruwan.jayasuriya@us.pwc.com. Quentin Wodon is lead poverty specialist in the Poverty Reduction and Economic Management Unit of the Africa Region at the World Bank; his email address is qwodon@worldbank.org. This study was funded by the Research Support Budget at the World Bank. The article benefited from discussions with Christine Fallert Kessides, comments from referees, and suggestions from Alan Winters.

THE WORLD BANK ECONOMIC REVIEW, VOL. 19, NO. 1, pp. 121–140. doi:10.1093/wber/lhi006. © The Author 2005. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oupjournals.org.

I. THE LITERATURE ON PRODUCTIVE EFFICIENCY

The focus in the literature has recently shifted to the quality of public and private institutions and of markets in explaining economic performance in cross-country analyses (Brunetti and others 1998; Hall and Jones 1999; Knack and Keefer 1995; Keefer and Knack 1997).1 Although the institutional framework and market structure of a country measure different aspects, they overlap to a considerable degree. Both can be measured by such factors as the quality of bureaucracy, pervasiveness of corruption, rule of law, risk of appropriation, contract repudiation, political environment, and civil liberties, and both would be expected to have an impact on production and allocation decisions. Market and institutional deficiencies may distort public and private decisionmaking and lead entrepreneurs to undertake wasteful rent-seeking activities that divert time and resources from productive activities, thereby preventing firms from adjusting effectively to technological change. Weak institutions and market structures may result in suboptimal selection and use of inputs. In developing economies, where the potential for industrialization and the potential gains from industrialization are higher, the inability of firms to fully benefit from low-cost access to advanced technology from overseas and better returns to scale (relative to developed economies) may be especially damaging to development.

The macroeconomic environment has also received much attention in studies of economic performance. The inflation rate, and to a lesser extent the black market premium, are widely used as proxies for macroeconomic conditions. Numerous theoretical studies have also focused on the costs of inflation (for surveys, see Briault 1995; Temple 2000). These analyses have shown that businesses and households perform poorly when inflation is high and unpredictable.
Although empirical studies have found some support for the harmful effects of inflation, the evidence is not overwhelming. Inflation rates of 100 percent a year or higher have been found to inhibit economic development, but the impact of moderate inflation is less clear.

Urbanization has largely been omitted in models of economic performance, yet the results reported here show it to have a key positive impact on productive efficiency. Urbanization likely influences productive efficiency through a variety of channels (for a review of the role of cities in development, see World Bank 2002, chap. 6). With the presence of universities, research centers, and many firms, cities thrive on learning and innovation, thereby facilitating spillover effects (Glaeser and others 1992; Adams 2001). Personal contacts remain important in the digital age, and they are easier to maintain in cities (Wheeler and others 2000; Glaeser 1998; Lall and Ghosh 2002). Cities lead to economies of scale, encourage the division of labor, and provide a better environment for matching skills with needs (Quigley 1998; Mills 2000; Ciccone and Hall 1996). Cities also make access to education, health services, and infrastructure easier because costs tend to be lower and competition is greater.

1. Brunetti and others (1998), using firm-level data from a private sector survey in 73 countries to gauge the environment faced by local businesses, find that the institutional framework is crucial in explaining differences in economic performance. Hall and Jones (1999) also find that good institutions and sound policies aid economic development by supporting entrepreneurial activities, capital accumulation, invention, skill acquisition, and technology transfers. Aiming to explain why poor countries are falling behind rather than catching up with wealthy nations, Keefer and Knack (1997) also conclude that deficient institutions and government policies lead to poor performance.
One limitation of most cross-country studies is that they lump together all the independent variables in regressions that focus on the determinants of GDP levels or growth rates. Yet not all independent variables are the same. Variables such as physical capital, human capital, and labor are production inputs, but others such as the quality of institutions, market structures, or macroeconomic management are conditions that facilitate production, not inputs. This article takes this distinction seriously. It estimates a production function that depicts optimal output levels given input use and measures economic performance by the productive efficiency with which optimal output is reached. This framework is used to analyze the determinants of efficiency that facilitate the production process. A range of institutional, macroeconomic, and market quality variables as well as the level of urbanization are explored.

The analysis of productive efficiency dates back to the empirical work of Farrell (1957). Over time two broad approaches have been used in production frontier estimation: deterministic methods and stochastic techniques.2 The deterministic methods, data envelopment analysis and the free disposal hull, apply linear programming techniques to construct a frontier by using a piecewise linear envelope that connects best performers.3 The main advantage of the deterministic methods is that few or no restrictions are imposed on the production technology, but the disadvantage is the inability to disentangle white noise from the inefficiency measures. In the stochastic techniques random shocks are incorporated that account for some of the deviations from the production frontier. Following Aigner and others (1977) and Meeusen and van den Broeck (1977), the first technique (the error components model) assumes that the error term has two components: one for white noise and a nonnegative one for inefficiency (Battese and Coelli 1992, 1995).
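To make the error components idea concrete, the following minimal sketch simulates a composed error, the difference between a symmetric noise term and a nonnegative inefficiency term. It is an illustration only: the half-normal form chosen for the inefficiency term, the standard deviations, the sample size, and all function names are assumptions made for this example and are not taken from the studies cited above.

```python
import random
import statistics


def simulate_composed_errors(n, sigma_v=0.2, sigma_u=0.4, seed=0):
    """Draw n composed errors e = v - u.

    v is symmetric white noise, v ~ N(0, sigma_v^2).
    u is a nonnegative inefficiency term; here it is half-normal,
    |N(0, sigma_u^2)|, which is an assumption made for this illustration.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n):
        v = rng.gauss(0.0, sigma_v)
        u = abs(rng.gauss(0.0, sigma_u))
        errors.append(v - u)
    return errors


def sample_skewness(values):
    """Third standardized moment of a sample."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)


errors = simulate_composed_errors(20000)
mean_error = statistics.fmean(errors)
skew = sample_skewness(errors)
print(f"mean composed error: {mean_error:.3f}")
print(f"skewness: {skew:.3f}")
```

In a simulated sample of this kind the composed error has a negative mean and is skewed to the left, because the inefficiency term can only pull outcomes below the frontier while the noise term is symmetric.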
The second technique applies fixed-effects and random-effects methods to measure efficiency, with the effects being nonnegative (Cornwell and others 1990; Kumbhakar 1990).

2. As noted by an anonymous referee, in the 1960s there was much activity estimating production functions and using the results to assess relative levels of efficiency (see, for example, Arrow and others 1961). Economists were reluctant to introduce noninput variables directly into the production function but would compute the residual and try to explain it through various factors (see, for example, Denison 1964). Covariance analysis, or panel estimation, was used to allow for unobserved country-specific effects, and some researchers regarded the dummy variables as measures of efficiency to be explained.

3. On data envelopment analysis, see Charnes and others (1978), Färe and others (1994), Coelli (1995), Tulkens and Vanden Eeckhaut (1995), and Kumar and Russell (2002). On free disposal hull analysis, see Deprins and others (1984) and Tulkens (1993).

II. METHODOLOGY

This article estimates a production frontier using an extension to panel data of the error components model of Aigner and others (1977) proposed by Battese and Coelli (1992, 1995). Similar to the augmented neoclassical model, the model uses physical capital, human capital, and labor force size as production inputs. The production frontier, given input use, depicts the optimal output level, whereas country-level productive efficiency is measured by comparing actual outcome and optimal outcome. The model estimates the impact on productive efficiency of the institutional framework, macroeconomic stability, market quality index (reliance on market mechanisms in the production process and allocation of resources), and level of urbanization.

Other efforts have been made recently to analyze the role of various factors in determining productive efficiency.
For example, Kumar and Russell (2002) use the data envelopment analysis estimation approach and focus on labor productivity growth for 57 countries over 1965–90. They construct a world production frontier and decompose labor productivity growth into technological change, improvements in efficiency, and capital accumulation. This article uses the stochastic frontier approach instead and focuses on a different outcome, namely, GDP. It calculates productive efficiency by estimating a world production frontier and attempts to explain what drives productive efficiency. Rather than explicitly discussing the impact of technological change, improvements in efficiency, and capital accumulation on country-level production, it focuses primarily on factors driving improvements in efficiency.

The stochastic frontier approach proposed in Battese and Coelli (1992, 1995) and panel data are used to estimate a production possibilities frontier to determine optimal GDP outcomes given input use. The model estimated in this study is discussed in Kumbhakar and Lovell (2000), and a generalized production frontier approach to estimating inefficiency can be found in Kumbhakar and others (1991). Comparing a country's actual GDP outcome with the optimal GDP outcome derived from the production frontier yields a measure of economic performance: productive efficiency. This estimation framework can be used to quantify the impact of the institutional structure, macroeconomic environment, market quality, and urbanization on a country's economic performance in reaching optimal GDP outcomes.

Let $Y_{it}$ represent real GDP for country $i$ at time period $t$, and let $X_{it}$ depict the inputs used in production: physical capital, human capital (years of schooling), and number of workers. The log-log specification is used in the estimation. Incorporating the period variable ($t$) captures the impact of technology improvements.
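The log-log specification with a period term can be sketched as follows. The snippet evaluates the deterministic part of a frontier for the same inputs in two periods and compares a hypothetical actual output with the frontier level. All coefficient values, input levels, and names are hypothetical and chosen only for illustration; they are not the article's estimates, and the simple ratio shown sets aside the random noise term that the stochastic frontier model treats separately.

```python
import math


def frontier_log_gdp(capital, schooling, workers, period, coef):
    """Deterministic part of a log-log production frontier with a period trend."""
    return (coef["constant"]
            + coef["capital"] * math.log(capital)
            + coef["schooling"] * math.log(schooling)
            + coef["workers"] * math.log(workers)
            + coef["period"] * period)


# Hypothetical coefficients (illustrative only, not estimates from the article).
coef = {"constant": 1.0, "capital": 0.4, "schooling": 0.05,
        "workers": 0.6, "period": 0.02}

# The same hypothetical input bundle evaluated in two consecutive periods.
inputs = {"capital": 200.0, "schooling": 5.0, "workers": 8000.0}
optimal_p1 = math.exp(frontier_log_gdp(period=1, coef=coef, **inputs))
optimal_p2 = math.exp(frontier_log_gdp(period=2, coef=coef, **inputs))

# A positive period coefficient shifts the frontier outward over time.
shift = optimal_p2 / optimal_p1

# Ratio of a hypothetical actual output to the period-1 frontier output.
actual_gdp = 4400.0
ratio = actual_gdp / optimal_p1
print(f"frontier shift between periods: {shift:.4f}")
print(f"actual / optimal output: {ratio:.2f}")
```

With a positive period coefficient the same inputs yield a higher optimal output in the later period, which is the outward shift of the frontier that the period variable is meant to capture.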
Over time, technology is expected to improve and cause an outward shift in the production frontier. As a result, the parameter corresponding to the period variable is expected to be positive (and is also used as a robustness test for the stability of the impact of other variables). To enable the production frontier to vary by region, regional dummy variables ($D_{Region}$) are used for Asia, Latin America and the Caribbean, Middle East and North Africa, and North America and Europe, with Africa as the omitted region (see appendix table A.1 for the list of countries by region).4 These regional dummy variables enable testing for the robustness of the findings with and without their inclusion in the specification. The regional dummy variables are also important because, even after controlling for inputs and efficiency, other regional (or country-level) factors may affect GDP.5 For example, as noted by Nelson and Pack (1999), rapid growth in some Asian economies was accompanied by "productive assimilation" or shifts in the size of firms and sectors of specialization, which led to changes in economic structure and higher growth. The production frontier is estimated as follows:

(1) $\ln Y_{it} = \alpha + \beta \ln X_{it} + \beta_t \, \mathrm{Period} + \sum_{Region} \gamma_{Region} D_{Region} + (v_{it} - u_{it})$

$i = \text{Country } 1, \ldots, \text{Country } N$ and $t = 1, \ldots, T$.

The technical inefficiency effects are estimated as:

(2) $u_{it} = \delta_0 + \sum_j \delta_j^{Institutional} Z_{j,it}^{Institutional} + \delta_{INF} \, \mathrm{Inflation}_{it} + \delta_{MKT} \, \mathrm{Market}_{it} + \delta_{URB} \, \mathrm{Urban}_{it} + w_{it}$

$u_{it} = Z_{it}\delta + w_{it}$

Estimating equations 1 and 2 separately leads to biased results (Wang and Schmidt 2002), and thus a one-step procedure is used in the estimation.

The error term in the production frontier presented in equation 1 consists of two components: the random noise term ($v_{it}$) that accounts for random shocks and measurement errors, and the nonnegative term ($u_{it}$) used to measure inefficiency. The $v_{it}$ and the $u_{it}$ terms are assumed to be independent. The $v_{it}$ term is assumed to be iid $N(0, \sigma_v^2)$. The nonnegative $u_{it}$ term that depicts deviation from the optimal (best practice) outcome is assumed to be distributed independently of the factor inputs ($X$), and as modeled in equation 2 is a function of country-specific variables that vary over time. The $u_{it}$ term is obtained from a truncated (at zero) normal distribution with a variance of $\sigma_u^2$, but with means that are a linear function of the observable country-specific variables.6 The production frontier presented in equation 1 is in terms of known input and output variables, whereas the inefficiency terms are assumed to be a function of an unknown vector of coefficients, $\delta$, and a known set of the institutional, macroeconomic, and market quality variables along with urbanization. Indices of bureaucratic quality, prevalence of corruption, and rule of law are used as institutional variables. The inflation rate is used as a proxy for macroeconomic stability. The reliance of a country on market mechanisms in the production process and allocation of resources is proxied using a market quality index.

4. Any geographic grouping is somewhat arbitrary, but the regional groups probably capture some relevant commonalities across countries.

5. The authors are grateful to an anonymous referee for pointing this out.

6. The $u_{it}$ term is assumed to be independently distributed and is obtained from a truncated normal distribution with a mean of $Z_{it}\delta$ and a variance of $\sigma_u^2$ (Battese and Coelli 1995). The derivation of the log-likelihood function using the distributional assumptions on the $u_{it}$ and $v_{it}$ terms and the maximum likelihood estimation approach used in the estimation are discussed in Battese and Coelli (1995).

The productive efficiency measure of country $i$ at time period $t$ is defined as:7

(3) $\mathrm{Efficiency}_{it} = \dfrac{E[Y_{it} \mid X_{it}, D_{Region}, u_{it}]}{E[Y_{it} \mid X_{it}, D_{Region}, u_{it} = 0]}$

$i = \text{Country } 1, \ldots, \text{Country } N$ and $t = 1, \ldots, T$.

The numerator depicts the observed country $i$ outcome in period $t$ at a given level of input use, $X_{it}$. The denominator represents the corresponding optimal (or best practice) outcome for country $i$ in period $t$, which implies no inefficiency ($u_{it} = 0$).

The maximum likelihood estimation method is used to simultaneously estimate parameters of the stochastic frontier (equation 1) and the model for technical inefficiency effects (equation 2).8

7. The conditional expectation of the $u_{it}$ term, conditional on the observed value of $v_{it} - u_{it}$, is used in calculating the efficiency measures (Battese and Coelli 1992, 1995).

8. FRONTIER version 4.1 is used in the estimation (Coelli 1996).

III. DATA

The study uses data for 71 countries for 1980–98. All variables are averaged over five-year intervals (1980–84, 1985–89, 1990–94, and 1995–98) to reduce the impact of short-run fluctuations on the estimated parameters (to capture long-term effects). There are two groups of variables: one for estimating the production frontiers and one for explaining country efficiency in producing output.

The first group of variables consists of real GDP, real domestic capital stock, average years of schooling (a proxy for a country's stock of human capital), and total number of workers. Data on real GDP and total number of workers (a country's employment base) are from Penn World Table 6.0, compiled by Heston and Summers (1996). Real GDP is in constant U.S. dollars at purchasing power parity (PPP) terms (chain index; expressed in international prices, base 1996). The Heston and Summers real GDP measures account for and assign suitable weights for cross-country price differences of various components of GDP, which enables meaningful cross-country comparisons. Real domestic capital stock data are from Kraay and others (2001) in constant U.S. dollars in PPP terms.9 The human capital data are from the educational attainment database compiled by Barro and Lee (2000).
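The period averaging described above can be sketched as follows. The interval boundaries are those stated in the text; the function name and the annual values are hypothetical and serve only to illustrate the calculation. How missing years were treated is not stated in the text, so the sketch simply averages the years that are available, which is an assumption.

```python
from statistics import fmean

# Averaging intervals as stated in the text (the last one covers four years).
PERIODS = [(1980, 1984), (1985, 1989), (1990, 1994), (1995, 1998)]


def period_averages(annual):
    """Average an annual series {year: value} over each interval.

    Intervals with no available years are skipped, so a country can
    contribute fewer than four observations to the panel.
    """
    averages = {}
    for start, end in PERIODS:
        values = [annual[y] for y in range(start, end + 1) if y in annual]
        if values:
            averages[f"{start}-{str(end)[2:]}"] = fmean(values)
    return averages


# Hypothetical annual inflation rates (percent) for one country.
inflation = {1985: 12.0, 1986: 10.0, 1987: 8.0, 1988: 9.0, 1989: 11.0,
             1990: 7.0, 1991: 6.0}
result = period_averages(inflation)
print(result)
```

Averaging in this way smooths year-to-year swings, so that each country contributes at most one observation per interval.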
The second group of variables consists of country-level data on the institutional framework, macroeconomic stability, market quality, and urbanization. Indices of bureaucratic quality, prevalence of corruption, and rule of law are used to proxy a country's institutional framework. Data on these indices are from the International Country Risk Guide (ICRG) published regularly by Political Risk Services (2004; see www.prsgroup.com/icrg/icrg.html). Data on the structure of the economy and use of markets variable, used to measure a country's market quality, are from the Economic Freedom of the World: 2001 Annual Report (Gwartney and Lawson 2001). Data on the inflation rate, used as a proxy for a country's macroeconomic stability, and on urbanization are from the World Development Indicators database (World Bank 2001).

The ICRG indices are subjective assessments by a worldwide network of experts. To ensure coherence and cross-country comparability, the indices are subject to peer review. The bureaucratic quality index measures the strength and expertise of bureaucrats and their ability to manage political alterations without drastic interruptions in government services or policy changes. Higher values of this index indicate greater bureaucratic quality. The rule of law index assesses the strength and impartiality of the legal system and popular observance of the law. Higher values of this index indicate more effective enforcement and greater adherence to the law. The corruption index measures actual or potential corruption within the political system, which distorts the economic and financial environment, reduces government and business efficiency by enabling individuals to assume positions of power through patronage rather than ability, and introduces inherent instability in the political system. Higher values indicate a lower prevalence of corruption.
The three indices, which use different rating systems, have been normalized to take values between 0 and 100, with higher values indicating better outcomes.

Inflation as measured by the consumer price index reflects the annual percentage change in the cost to the average consumer of acquiring a fixed basket of goods and services. A country's market quality is proxied by the structure of the economy and use of markets variable. The share of the public sector in industry and investment, use of price controls, and top marginal tax rates are incorporated in this index. This index has been normalized to take values between 0 and 100, with higher values indicating the existence of more effective market structures. Urbanization data refer to the urban population as a share of the total population.

9. As described in Kraay and others (2001), initial estimates of domestic capital stock are obtained from the Penn World Table data set. Flow data on gross domestic investments are then used to construct time series of capital stock measures valued in constant U.S. dollars at PPP (base year 1990). Although it could be interesting to test for the sensitivity of the results here to alternative choices for the measurement of capital (say, not using PPP-based data), such alternative measures are not available.

Summary statistics indicate that in general, input use increased over the 1980–98 period, although with differences across regions (table 1). The Africa region is the least endowed in physical capital (the stock of physical capital even declined), human capital, and number of workers and thus has the lowest optimal output among all regions. In contrast, the North America and Europe region has high and increasing endowment levels and is thus able to reach the highest optimal output levels.
The Asia, Latin America and Caribbean, and Middle East and North Africa regions in general have also experienced a steady increase in input endowments over time.

Summary statistics indicate that the variables used for the determinants of efficiency improved over the 1980–98 period, although again with differences across regions (table 2). As was the case for input endowments, the Africa region has the smallest magnitudes for the efficiency determinant variables, implying greater potential for efficiency enhancements in the region from improvements in the institutional framework, macroeconomic stability, and market quality and from greater urbanization. The North America and Europe region has high magnitudes for the efficiency-determinant variables and has had steady improvements over time. The Asia, Latin America and Caribbean, and Middle East and North Africa regions generally also experienced steady improvements in the determinants of efficiency variables during this period.

IV. RESULTS

The estimation results can be divided into two broad categories: production frontier estimates and determinants of efficiency. Results are presented for four different specifications to provide tests for robustness. The specifications vary by whether a time trend is used in the production frontier or in the determinants of inefficiency and whether regional dummy variables are included in the production frontier.

Regional differences as captured by regional dummy variables could themselves reflect differences in conditions that facilitate production, such as the bureaucratic quality index or the market quality index.10 For example, if African countries have a lower efficiency level, this may not be because of "Africanness" but possibly because of factors influencing efficiency (for example, lower rule of law index or a higher black market premium). At the same time there may still be real differences in efficiency related to geographic location.
To deal with this issue, the model is estimated with and without regional dummy variables. The parameter estimates for the production frontiers show that a country's physical capital stock and number of workers have a positive and statistically significant affect on GDP levels (table 3). Given the log-log specification for key inputs, the associated parameters represent elasticities. A 1 percentage point increase in the level of capital stock leads to a 0.38­0.43 percentage point increase in GDP . A 1 percentage point increase in the number of workers leads 10. The authors are grateful to an anonymous referee for pointing this out. TABLE 1. Summary Statistics for Production Function Variables, 1980­98 1980­84 1985­89 1990­94 1995­98 Africa region Number of observations 5 10 13 7 GDP (constant 1996 US$ at PPP; billions) Mean 16.71 16.10 14.57 19.50 Capital stock (constant 1990 US$ at PPP; billions) Mean 18.21 12.47 10.52 13.32 Years of schooling Mean 2.29 2.47 2.87 3.75 Workers (thousands) Mean 5,859 5,340 5,869 6,129 Asia region Number of observations 11 13 12 14 GDP (constant 1996 US$ at PPP; billions) Mean 392.33 568.48 786.71 876.24 Capital stock (constant 1990 US$ at PPP; billions) Mean 745.60 1,044.20 1,518.83 1,755.54 Years of schooling Mean 5.46 5.70 6.46 6.69 Workers (thousands) Mean 48,481 95,543 110,871 103,011 Latin America and Caribbean region 129 Number of observations 15 19 18 19 GDP (constant 1996 US$ at PPP; billions) Mean 136.15 126.15 149.14 166.30 Capital stock (constant 1990 US$ at PPP; billions) Mean 200.38 184.84 242.77 278.42 Years of schooling Mean 4.39 4.80 5.17 5.39 Workers (thousands) Mean 7,392 7,026 8,439 9,088 Middle East and North Africa region Number of observations 3 3 5 5 GDP (constant 1996 US$ at PPP; billions) Mean 72.17 86.11 119.44 144.66 Capital stock (constant 1990 US$ at PPP; billions) Mean 55.11 77.99 313.12 115.60 Years of schooling Mean 4.42 4.81 4.87 5.46 Workers (thousands) Mean 5,100 5,856 7,612 8,325 North America and 
Europe region Number of observations 14 17 17 18 GDP (constant 1996 US$ at PPP; billions) Mean 645.23 678.10 760.07 819.12 Capital stock (constant 1990 US$ at PPP; billions) Mean 1,453.91 1,467.97 1,657.27 1,674.34 Years of schooling Mean 7.56 8.06 8.63 9.01 Workers (thousands) Mean 16,392 15,316 16,104 15,937 Source: Heston and Summers (1996); Barro and Lee (2000); Kraay and others (2001); World Bank (2001). 130 , 19, 1 TH E W O R L D B A N K E C O N O M I C R E V I E W V O L . N O . TABLE 2. Summary Statistics for Determinants of Inefficiency, 1980­98 1980­84 1985­89 1990­94 1995­98 Africa region Bureaucratic quality index Number of observations 5 10 13 7 Mean 33.33 47.00 46.67 47.62 Minimum 16.67 20.00 26.67 16.67 Maximum 50.00 66.67 66.67 62.50 Corruption index Number of observations 5 10 13 7 Mean 36.67 43.00 48.46 45.24 Minimum 0.00 0.00 0.00 33.33 Maximum 66.67 66.67 63.33 50.00 Rule of law index Number of observations 5 10 13 7 Mean 23.33 40.00 43.07 61.31 Minimum 16.67 16.67 13.33 37.50 Maximum 33.33 83.33 83.33 75.00 Inflation Number of observations 5 10 13 7 Mean 32.32 37.37 29.05 20.33 Minimum 13.56 2.73 4.35 5.68 Maximum 70.28 155.25 122.19 37.13 Market quality index Number of observations 5 10 13 7 Mean 23.80 22.60 19.77 46.64 Minimum 17.00 13.00 0.00 34.50 Maximum 30.00 31.00 43.00 60.50 Urbanization Number of observations 5 10 13 7 Mean 22.14 27.82 31.86 34.04 Minimum 9.62 10.42 11.72 13.35 Maximum 31.64 39.58 55.40 49.00 Asia region Bureaucratic quality index Number of observations 11 13 12 14 Mean 59.09 59.74 63.33 69.35 Minimum 16.67 16.67 30.00 37.50 Maximum 100.00 100.00 100.00 100.00 Corruption index Number of observations 11 13 12 14 Mean 48.48 50.00 59.44 58.63 Minimum 0.00 0.00 33.33 33.33 Maximum 100.00 100.00 96.67 91.67 Rule of law index Number of observations 11 13 12 14 Mean 56.06 50.26 61.39 78.57 Minimum 16.67 13.33 16.67 50.00 Maximum 100.00 100.00 100.00 100.00 Inflation Number of observations 11 13 12 14 Mean 10.41 7.18 7.25 6.60 
Jayasuriya and Wodon 131 Minimum 3.91 1.15 2.00 0.60 Maximum 18.44 14.76 13.05 20.44 Market quality index Number of observations 11 13 12 14 Mean 34.09 35.23 46.58 51.18 Minimum 17.00 13.00 19.00 20.00 Maximum 53.00 56.00 79.00 92.00 Urbanization Number of observations 11 13 12 14 Mean 45.26 44.08 48.83 52.84 Minimum 15.44 17.92 19.22 20.80 Maximum 85.68 85.34 84.94 100.00 Latin America and Caribbean region Bureaucratic quality index Number of observations 15 19 18 19 Mean 40.00 38.95 43.15 48.68 Minimum 16.67 16.67 16.67 12.50 Maximum 66.67 66.67 66.67 70.83 Corruption index Number of observations 15 19 18 19 Mean 46.67 46.49 50.93 50.66 Minimum 16.67 3.33 26.67 33.33 Maximum 83.33 83.33 80.00 79.17 Rule of law index Number of observations 15 19 18 19 Mean 41.11 42.28 49.07 57.90 Minimum 16.67 16.67 23.33 33.33 Maximum 66.67 66.67 70.00 83.33 Inflation Number of observations 15 19 18 19 Mean 52.82 62.23 56.99 17.20 Minimum 5.81 0.49 8.19 1.25 Maximum 178.46 219.47 432.78 61.41 Market quality index Number of observations 15 19 18 19 Mean 41.47 38.47 46.00 61.08 Minimum 17.00 19.00 0.00 27.50 Maximum 83.00 80.00 74.00 89.50 Urbanization Number of observations 15 19 18 19 Mean 58.73 60.92 63.76 64.58 Minimum 36.02 37.92 38.30 34.15 Maximum 86.00 87.80 89.22 90.65 Middle East and North Africa region Bureaucratic quality index Number of observations 3 3 5 5 Mean 44.44 55.56 55.33 63.33 Minimum 33.33 50.00 43.33 50.00 Maximum 50.00 66.67 80.00 95.83 Corruption index Number of observations 3 3 5 5 Mean 55.55 55.55 56.67 59.17 Minimum 33.33 33.33 46.67 41.67 Maximum 83.33 83.33 80.00 70.83 (Continued) 132 , 19, 1 TH E W O R L D B A N K E C O N O M I C R E V I E W V O L . N O . TABLE 2. 
Rule of law index
  Number of observations           3         3         5         5
  Mean                         38.89     38.89     52.00     78.33
  Minimum                      33.33     33.33     43.33     66.67
  Maximum                      50.00     43.33     60.00     83.33
Inflation
  Number of observations           3         3         5         5
  Mean                         67.40     36.02     12.38     10.79
  Minimum                       8.90      7.32      5.85      4.08
  Maximum                     177.53     81.82     20.63     28.78
Market quality index
  Number of observations           3         3         5         5
  Mean                         11.67     16.33     20.00     27.80
  Minimum                      10.00     11.00     12.00     21.00
  Maximum                      14.00     21.00     30.00     39.00
Urbanization
  Number of observations           3         3         5         5
  Mean                         61.78     63.14     64.18     66.51
  Minimum                      43.84     43.98     44.22     44.80
  Maximum                      89.08     90.00     90.46     90.95
North America and Europe region
Bureaucratic quality index
  Number of observations          14        17        17        18
  Mean                         86.91     86.47     89.41     93.29
  Minimum                      50.00     50.00     56.67     70.83
  Maximum                     100.00    100.00    100.00    100.00
Corruption index
  Number of observations          14        17        17        18
  Mean                         86.90     87.26     85.88     83.33
  Minimum                      50.00     66.67     56.67     62.50
  Maximum                     100.00    100.00    100.00    100.00
Rule of law index
  Number of observations          14        17        17        18
  Mean                         88.10     87.25     91.18     96.53
  Minimum                      50.00     46.67     70.00     87.50
  Maximum                     100.00    100.00    100.00    100.00
Inflation
  Number of observations          14        17        17        18
  Mean                         12.12      5.42      4.66      2.26
  Minimum                       5.05      0.69      2.08      0.79
  Maximum                      22.76     17.19     16.21      6.86
Market quality index
  Number of observations          14        17        17        18
  Mean                         31.29     36.24     48.00     58.00
  Minimum                      10.00     14.00     17.00     40.00
  Maximum                      53.00     72.00     79.00     83.00
Urbanization
  Number of observations          14        17        17        18
  Mean                         71.74     71.73     73.01     75.55
  Minimum                      32.52     41.00     50.58     58.45
  Maximum                      95.64     96.20     96.70     97.15
Source: PRS (2004); Gwartney and Lawson (2001); World Bank (2001).

TABLE 3.
Joint Estimation for the Production Frontier and the Determinants of Inefficiency

                                Model I           Model II          Model III         Model IV
Production frontier
  Constant                      2.8794 (7.70)     3.3083 (8.05)     2.6540 (6.90)     3.0903 (7.65)
  Log(capital stock)            0.4173 (16.12)    0.3813 (14.46)    0.4283 (15.69)    0.3929 (15.24)
  Log(years of schooling)       0.0223 (0.36)     0.0538 (0.97)     0.0009 (0.01)     0.0358 (0.66)
  Log(workers)                  0.5949 (19.71)    0.6137 (21.61)    0.5860 (17.97)    0.6015 (21.40)
  Period                                                            0.0209 (1.55)     0.0148 (1.20)
Dummy variables (Sub-Saharan Africa omitted)
  Asia                                            0.1723 (2.38)                       0.1885 (2.59)
  Latin America and Caribbean                     0.1316 (2.14)                       0.1485 (2.49)
  Middle East and North Africa                    0.4583 (5.51)                       0.4543 (5.67)
  North America and Europe                        0.3606 (4.62)                       0.3736 (4.81)
Determinants of inefficiency
  Constant                      1.5358 (9.96)     1.3380 (8.98)     1.6090 (10.10)    1.3689 (8.78)
  Bureaucratic quality index    0.0828 (0.32)     0.0230 (0.09)     0.0362 (0.14)     -0.0328 (-0.14)
  Corruption index              0.0172 (0.08)     0.2045 (0.94)     -0.0438 (-0.18)   0.1777 (0.88)
  Rule of law index             -0.6373 (-2.51)   -0.4953 (-2.10)   -0.4226 (-1.75)   -0.3338 (-1.48)
  Inflation                     0.2016 (3.04)     0.1150 (1.87)     0.2060 (3.09)     0.1176 (1.82)
  Market quality index          -0.3851 (-2.01)   -0.5895 (-3.11)   -0.2437 (-1.25)   -0.4613 (-2.45)
  Urbanization                  -1.8223 (-7.62)   -1.5690 (-5.48)   -1.8814 (-7.28)   -1.6037 (-5.28)
  Period                                                            0.0650 (1.77)     0.0489 (1.52)
Log likelihood function         107.48            108.51            100.80            101.68
Number of observations          238               238               238               238

Note: Numbers in parentheses are t-statistics.
Source: Authors' estimation based on data described in text.

to a 0.59–0.61 percentage point increase in GDP.
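The elasticity reading of the labor coefficient can be illustrated with a short calculation. The following is a minimal sketch, assuming that the dependent variable is the logarithm of GDP and that all other inputs and the efficiency term are held fixed; the names `workers_coef` and `pct_change_in_gdp` are hypothetical and are introduced only for this illustration.

```python
import math

# Log(workers) coefficients from table 3, keyed by model.
workers_coef = {"I": 0.5949, "II": 0.6137, "III": 0.5860, "IV": 0.6015}


def pct_change_in_gdp(elasticity, pct_change_in_input=1.0):
    """Percent change in output implied by a log-log coefficient.

    Assumes a log-log specification: output scales with input**elasticity
    when everything else is held fixed.
    """
    growth = 1.0 + pct_change_in_input / 100.0
    return (math.exp(elasticity * math.log(growth)) - 1.0) * 100.0


# Effect of a 1 percent increase in the number of workers, by model.
effects = {model: pct_change_in_gdp(b) for model, b in workers_coef.items()}

for model, effect in effects.items():
    print(f"Model {model}: 1% more workers -> {effect:.2f}% more GDP")
```

For small changes the implied effect is approximately the coefficient itself, which is why the coefficients can be read directly as elasticities.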
Human capital (measured by the number of years of schooling) has a positive but statistically insignificant impact on GDP, which is somewhat surprising.11 The regional dummy variables are statistically significant, with each region having a higher production frontier than Africa, the excluded region.12 These results indicate that increasing input use (or factor endowments) is one of the means of reaching higher optimal output levels. The time period variable, when included in the production possibilities frontier, is positive with a significance level (p-value) of 0.1229 in model III and 0.2325 in model IV. These results (especially for model III) point to improvements in technology during the 1980–98 period and indicate that with each additional period (and controlling for other factors) the production possibilities frontier shifts outward by 1.5–2.1 percentage points over the level reached in the previous period.

The analysis of the impact of the institutional framework, macroeconomic stability, market quality, and urbanization on a country's productive efficiency does not incorporate differences in industry mix across countries because of a lack of reliable data.13 Improvements in the rule of law index and the market quality index and a decrease in inflation lead to decreases in inefficiency (see table 3). Inefficiency declines by 1.57–1.88 percentage points for a 1 percentage point increase in urbanization, by 0.42–0.64 percentage point for a 1 percentage point increase in the rule of law index, by 0.38–0.59 percentage point for a 1 percentage point increase in the market quality index, and by 0.12–0.21 percentage point for a 1 percentage point decrease in the inflation rate.
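The ranges quoted above can be traced back to table 3. The sketch below is illustrative only: it takes the magnitudes of the coefficients and t-statistics in the "Determinants of inefficiency" block and keeps, for each variable, the models in which the t-statistic is at least 1.645 in absolute value. That cutoff (`T_CUTOFF`) is an assumption and is not stated in the text; it is used here because it happens to reproduce the reported ranges. The names `table3` and `significant_range` are hypothetical.

```python
# Magnitudes of (coefficient, t-statistic) pairs from the
# "Determinants of inefficiency" block of table 3, models I to IV.
table3 = {
    "Rule of law index": [(0.6373, 2.51), (0.4953, 2.10), (0.4226, 1.75), (0.3338, 1.48)],
    "Inflation": [(0.2016, 3.04), (0.1150, 1.87), (0.2060, 3.09), (0.1176, 1.82)],
    "Market quality index": [(0.3851, 2.01), (0.5895, 3.11), (0.2437, 1.25), (0.4613, 2.45)],
    "Urbanization": [(1.8223, 7.62), (1.5690, 5.48), (1.8814, 7.28), (1.6037, 5.28)],
}

# Assumed threshold (roughly the 10 percent two-sided level); not from the text.
T_CUTOFF = 1.645


def significant_range(rows, cutoff=T_CUTOFF):
    """Smallest and largest coefficient magnitude among models passing the cutoff."""
    kept = [coef for coef, t_stat in rows if abs(t_stat) >= cutoff]
    return min(kept), max(kept)


ranges = {name: significant_range(rows) for name, rows in table3.items()}

for name, (low, high) in ranges.items():
    print(f"{name}: {low:.4f} to {high:.4f}")
```

Under this assumption the rule of law range excludes model IV and the market quality range excludes model III, the two cases with the lowest t-statistics. The lower bound for market quality, 0.3851, appears in the text as 0.38.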
These results suggest that improvements in urbanization, institutional framework, market quality, and macroeconomic stability that lead to better productive efficiency outcomes are another way to boost output levels.14

In recent years indices of bureaucratic quality, corruption, rule of law, and market quality have improved in many countries. The level of urbanization has also increased, and inflation rates have declined. This implies that many countries

11. An anonymous referee suggested including the education variable instead as a determinant of efficiency, but this study follows the literature in keeping education as an input to the production function.

12. Note that the Middle East and North Africa dummy variable is higher than the North America and Europe dummy variable, which would imply that more can be produced in the Middle East and North Africa than in North America and Europe with the same level of inputs. One potential explanation for this apparently counterintuitive finding could be the impact on GDP of oil and tourism in the countries included in the Middle East and North Africa region (although one could argue that these industries also require high levels of inputs).

13. As indicated by an anonymous referee, industry mix may have an impact on productive efficiency because of differences in productivity across sectors. For example, countries with a higher share of GDP in the primary sector may be less efficient, and this will not be captured in the estimation of the frontier. However, the use of urbanization in the analysis of the determinants of efficiency may be considered a proxy for sectoral shares of GDP, because a higher level of urbanization is typically a sign of an economy with a lower emphasis on agriculture (but as mentioned earlier, there may also be other reasons for more efficient production in countries that are more urbanized).

14.
As pointed out by an anonymous referee, productivity (and hence efficiency) and per capita output may themselves have an impact on urbanization. Unfortunately, the authors are unaware of any mechanism to incorporate appropriate instruments to correct for such potential endogeneity using the stochastic frontier approach. This must be left for future work.

have achieved greater productive efficiency. Indeed, summary statistics at the region level on the estimated efficiency measures for models I and II show an increase in efficiency during the period under review (table 4).15 Yet levels of productive efficiency, like input use, vary by region, with the lowest levels in the Africa region and the highest in North America and Europe over the period under review. These results suggest that both high input endowments and greater productive efficiency have played a key role in North America and Europe's success in achieving high output levels, whereas lack of input endowments and comparatively low productive efficiency have contributed to the Africa region's poorer performance. It is heartening to note, however, that the Africa region has experienced strong improvements in productive efficiency over the period under review, with this result being robust to the choice of specification. Although the Africa region has the smallest magnitudes for the efficiency-determinant variables (see table 2), it experienced relatively greater improvements in these variables, with greater impacts on efficiency (see table 3). As a result, the Africa region experienced strong improvements in productive efficiency during the 1980–98 period.

Although the sample of countries used in the estimation is unbalanced, that does not seem to affect the efficiency trend results. A general increase in the levels of efficiency is observed for a majority of the countries in the sample, including those in Africa, for all four periods.
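The regional comparison can be checked directly against the means reported in table 4. The following is a minimal sketch that computes, for models I and II, the change in mean estimated efficiency between 1980–84 and 1995–98 for each region; the names `mean_efficiency`, `gains_by_region`, and `largest` are hypothetical, and the comparison uses regional means only, so it ignores changes in sample composition across periods.

```python
# Mean estimated efficiency by region from table 4: (1980-84, 1995-98).
mean_efficiency = {
    "Model I": {
        "Africa": (32.24, 46.65),
        "Asia": (55.81, 64.70),
        "Latin America and Caribbean": (65.64, 68.52),
        "Middle East and North Africa": (86.19, 91.52),
        "North America and Europe": (85.23, 92.10),
    },
    "Model II": {
        "Africa": (39.92, 55.30),
        "Asia": (61.12, 70.96),
        "Latin America and Caribbean": (73.60, 75.67),
        "Middle East and North Africa": (75.80, 82.83),
        "North America and Europe": (81.82, 91.09),
    },
}


def gains_by_region(by_region):
    """Change in mean efficiency from the first to the last period, by region."""
    return {region: round(last - first, 2) for region, (first, last) in by_region.items()}


gains = {model: gains_by_region(regions) for model, regions in mean_efficiency.items()}

# Region with the largest gain under each model.
largest = {model: max(g, key=g.get) for model, g in gains.items()}

for model in gains:
    print(model, gains[model], "largest gain:", largest[model])
```

In both specifications the Africa region records the largest gain in mean efficiency, even though its level remains the lowest of the five regions.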
For countries that are not in the sample for all four periods, there is no evidence that missing data lead to more omissions of countries with high levels of efficiency in the early periods or of countries with low levels of efficiency in the later periods. Had there been such a bias, the improvements in efficiency could have been due to the unbalanced nature of the panel. Also, although the use of regional dummy variables in some of the specifications means that the standard for comparing efficiency levels is region-specific, this standard is increasing over time because the regional dummy variable is invariant over time and the time trend in the production function is positive. Therefore, the improvement in productive efficiency in Africa is not just a reflection of a scenario in which even the best-performing countries in the region would be doing poorly.

It is worth emphasizing again that the stochastic frontier approach used here requires that the inefficiency terms (u_it) be nonnegative, because the terms measure the deviation from the optimal (best practice) outcome. An alternative estimation method would be to substitute equation 2 into equation 1 and then estimate a single regression using traditional estimation techniques (ordinary least squares). This method would not permit imposing that restriction or recovering efficiency measures, so any comparison of the results with those presented here would need to be done with care. A traditional approach would provide less information because it could not also measure efficiency. Still, it is worth noting that in the traditional approach, the capital stock and labor force variables remain statistically significant, whereas

15. The estimated efficiency measures for models III and IV are available on request.

TABLE 4.
Summary Statistics on the Estimated Efficiency Measures (models I and II)

            Number of observations     Mean   Minimum   Maximum   Standard deviation
Model I
Africa region
  1980–84                        5    32.24     21.62     41.48                 8.64
  1985–89                       10    39.90     21.62     71.51                15.41
  1990–94                       13    36.88     21.30     81.61                15.52
  1995–98                        7    46.65     23.01     83.79                19.54
Asia region
  1980–84                       11    55.81     29.43     91.45                20.46
  1985–89                       13    55.13     27.96     90.63                19.87
  1990–94                       12    59.36     30.75     91.20                20.23
  1995–98                       14    64.70     35.20     94.40                19.36
Latin America and Caribbean region
  1980–84                       15    65.64     36.06     88.90                15.35
  1985–89                       19    66.63     37.45     89.62                15.40
  1990–94                       18    68.66     41.38     87.22                15.17
  1995–98                       19    68.52     40.78     91.22                15.90
Middle East and North Africa region
  1980–84                        3    86.19     79.69     94.35                 7.47
  1985–89                        3    83.04     79.48     89.49                 5.60
  1990–94                        5    81.36     40.88     94.43                22.85
  1995–98                        5    91.52     83.77     95.69                 4.78
North America and Europe region
  1980–84                       14    85.23     74.28     93.81                 6.54
  1985–89                       17    87.86     74.55     95.34                 5.86
  1990–94                       17    89.07     74.50     96.12                 4.93
  1995–98                       18    92.10     81.49     96.72                 4.05
Model II
Africa region
  1980–84                        5    39.92     25.90     51.12                10.89
  1985–89                       10    48.89     25.83     86.76                18.99
  1990–94                       13    44.34     25.58     93.95                17.71
  1995–98                        7    55.30     27.75     95.03                21.61
Asia region
  1980–84                       11    61.12     33.06     94.76                21.31
  1985–89                       13    60.43     31.23     95.38                20.79
  1990–94                       12    65.51     34.34     95.74                21.35
  1995–98                       14    70.96     40.08     97.24                19.48
Latin America and Caribbean region
  1980–84                       15    73.60     40.20     94.02                16.64
  1985–89                       19    74.36     41.71     95.16                16.37
  1990–94                       18    76.37     46.02     93.12                16.25
  1995–98                       19    75.67     45.30     96.31                16.62
Middle East and North Africa region
  1980–84                        3    75.80     67.82     87.42                10.29
  1985–89                        3    71.10     67.56     78.07                 6.04
  1990–94                        5    72.63     36.98     87.58                20.75
  1995–98                        5    82.83     72.40     91.45                 8.38
North America and Europe region
  1980–84                       14    81.82     69.25     93.28                 8.05
  1985–89                       17    85.32     70.33     96.19                 7.36
  1990–94                       17    87.05     70.02     96.96                 6.22
  1995–98                       18    91.09     78.40     97.55                 5.21

Source: Authors' estimation based on data described in text.
education (years of schooling) remains statistically insignificant.16 The ranking of the regional dummy variables also remains the same. The rule of law index, the market quality index, and the urbanization rate are again statistically significant, and inflation is statistically significant only in some of the specifications. The results are thus broadly similar, but the advantage of the approach taken here is the ability to explicitly separate the impact of production inputs from that of efficiency in using the inputs.

V. CONCLUSION

There is an extensive literature on identifying and measuring factors that improve economic performance, as measured by GDP levels and growth rates, using cross-country analyses. In contrast to previous studies, the approach used here estimates a production possibilities frontier that depicts optimal output for different levels of input use and calculates efficiency by comparing actual output levels with optimal outputs. This framework permits studying not only how greater input use increases the optimal output levels that can be reached (according to the production frontier) but also how better conditions that facilitate production can help in reaching these optimal output levels.

As in previous growth studies, the results indicate statistically significant positive relationships between production and levels of physical capital and number of workers employed. The impact of years of schooling is positive in all cases but lacks statistical significance. The production frontier estimation framework shows an impact of the institutional framework, macroeconomic stability, quality of markets, and level of urbanization on productive efficiency. Finally, the results also indicate that average world productive efficiency levels improved during 1980–98. High input endowments and greater productive efficiency played a key role in North America and Europe's success,

16.
The results of the ordinary least squares estimation are available on request.

TABLE A.1. Countries by Region

Sub-Saharan Africa: Botswana; Cameroon; Congo, Rep.; Ghana; Kenya; Malawi; Niger; Senegal; Sierra Leone; Togo; Uganda; Congo, Dem. Rep.; Zambia; Zimbabwe
Asia: Australia; Bangladesh; China; Indonesia; India; Japan; Korea, Rep.; Sri Lanka; Malaysia; New Zealand; Pakistan; Philippines; Singapore; Thailand
Latin America: Argentina; Bolivia; Brazil; Chile; Colombia; Costa Rica; Ecuador; Guatemala; Honduras; Haiti; Jamaica; Mexico; Nicaragua; Panama; Peru; Paraguay; El Salvador; Trinidad and Tobago; Uruguay; Venezuela
Middle East and North Africa: Egypt, Arab Rep.; Iran, Islamic Rep.; Israel; Jordan; Tunisia
North America and Europe: Austria; Belgium; Canada; Switzerland; Denmark; Spain; Finland; France; United Kingdom; Greece; Ireland; Iceland; Italy; Netherlands; Norway; Portugal; Sweden; United States

whereas the Africa region performed poorly due to a lack of input endowments and low productive efficiency. The highest improvement in productive efficiency over time was observed in the Africa region, however, which is promising for the future.

REFERENCES

Adams, J. D. 2001. Comparative Localization of Academic and Industrial Spillovers. NBER Working Paper 8292. Cambridge, Mass.: National Bureau of Economic Research.
Aigner, D. J., C. A. K. Lovell, and P. Schmidt. 1977. "Formulation and Estimation of Stochastic Frontier Production Function Models." Journal of Econometrics 6(1):21–37.
Aron, J. 2000. "Growth and Institutions: A Review of the Evidence." World Bank Research Observer 15(1):99–135.
Arrow, K. J., H. B. Chenery, B. S. Minhas, and R. M. Solow. 1961. "Capital-Labor Substitution and Economic Efficiency." Review of Economics and Statistics 43(3):225–50.
Barro, Robert J., and Jong-Wha Lee. 2000. "International Data on Educational Attainment: Updates and Implications." Harvard University, Department of Economics, Cambridge, Mass.
Barro, Robert, and Xavier Sala-i-Martin. 1995. Economic Growth. New York: McGraw-Hill.
Battese, G. E., and T. J. Coelli. 1992. "Frontier Production Functions, Technical Efficiency, and Panel Data: With Applications to Paddy Farmers in India." Journal of Productivity Analysis 3(1/2):153–69.
------. 1995. "A Model for Technical Inefficiency Effects in a Stochastic Frontier Production Function for Panel Data." Empirical Economics 20(2):325–32.
Briault, C. 1995. "The Costs of Inflation." Bank of England Quarterly Bulletin 35(1):33–45.
Brunetti, A., G. Kisunko, and B. Weder. 1998. "Credibility of Rules and Economic Growth: Evidence from a Worldwide Survey of the Private Sector." World Bank Economic Review 12(3):353–84.
Charnes, A., W. W. Cooper, and E. Rhodes. 1978. "Measuring the Efficiency of Decision Making Units." European Journal of Operational Research 2(6):429–44.
Ciccone, C., and R. E. Hall. 1996. "Productivity and the Density of Economic Activity." American Economic Review 86(1):54–70.
Coelli, T. J. 1995. "Recent Developments in Frontier Modeling and Efficiency Measurement." Journal of Agricultural Economics 39(3):219–45.
------. 1996. "A Guide to FRONTIER Version 4.1: A Computer Program for Stochastic Frontier Production and Cost Function Estimation." CEPA Working Paper 96/07. The Centre for Efficiency and Productivity Analysis, Armidale, Australia.
Cornwell, C., P. Schmidt, and R. C. Sickles. 1990. "Production Frontiers with Cross-Sectional and Time-Series Variation in Efficiency Levels." Journal of Econometrics 46(1/2):185–200.
Denison, E. F. 1964. "Measuring the Contribution of Education (and the Residual) to Economic Growth" and "Reply." In The Residual Factor in Economic Growth. Paris: Organization for Economic Cooperation and Development, Study Group in the Economics of Education.
Deprins, D., L. Simar, and H. Tulkens. 1984. "Measuring Labor-Efficiency in Post Offices." In M. Marchand, P.
Pestieau, and H. Tulkens, eds., The Performance of Public Enterprises: Concepts and Measurement. North Holland: Amsterdam.
Easterly, William R. 2001. The Elusive Quest for Growth: Economists' Adventures and Misadventures in the Tropics. London: MIT Press.
Färe, R., S. Grosskopf, M. Norris, and Z. Zhang. 1994. "Productivity Growth, Technical Progress, and Efficiency Change in Industrialized Countries." American Economic Review 84(1):66–83.
Farrell, M. J. 1957. "The Measurement of Productive Efficiency." Journal of the Royal Statistical Society (A) 120(3):253–82.
Glaeser, E. L. 1998. "Are Cities Dying?" Journal of Economic Perspectives 12(2):139–60.
Glaeser, E. L., H. D. Kallal, J. A. Scheinkman, and A. Shleifer. 1992. "Growth in Cities." Journal of Political Economy 100(6):1126–52.
Gwartney, James, and Robert Lawson, with Walter Park and Charles Skipton. 2001. Economic Freedom of the World 2001 Annual Report. Vancouver, Canada: Fraser Institute. Available online at www.freetheworld.com.
Hall, R. E., and C. I. Jones. 1999. "Why Do Some Countries Produce So Much More Output per Worker than Others?" Quarterly Journal of Economics 114(1):83–116.
Heston, A., and R. Summers. 1996. "International Price and Quantity Comparisons: Potentials and Pitfalls." American Economic Review 86(2):20–24.
Keefer, P., and S. Knack. 1997. "Why Don't Poor Countries Catch Up? A Cross-National Test of an Institutional Explanation." Economic Inquiry 35(3):590–602.
Knack, S., and P. Keefer. 1995. "Institutions and Economic Performance: Cross-Country Tests Using Alternative Institutional Measures." Economics and Politics 7(3):207–27.
Kraay, A., N. Loayza, L. Serven, and J. Ventura. 2001. "Country Portfolios." CEPR Discussion Paper 2974. Centre for Economic Policy Research, London.
Kumar, S., and R. R. Russell. 2002. "Technological Change, Technological Catch-up, and Capital Deepening: Relative Contributions to Growth and Convergence." American Economic Review 92(3):527–48.
Kumbhakar, S. C. 1990. "Production Frontiers, Panel Data, and Time-Varying Technical Inefficiency." Journal of Econometrics 46(1/2):201–12.
Kumbhakar, S. C., and C. A. K. Lovell. 2000. Stochastic Frontier Analysis. Cambridge: Cambridge University Press.
Kumbhakar, S. C., S. Ghosh, and T. McGuckin. 1991. "A Generalized Production Frontier Approach for Estimating Determinants of Inefficiency in U.S. Dairy Farms." Journal of Business and Economic Statistics 9(3):279–86.
Lall, S. V., and S. Ghosh. 2002. "Learning by Dining: Informal Networks and Productivity in Mexican Industry." Policy Research Working Paper 2789. World Bank, Washington, D.C.
Meeusen, W., and J. van den Broeck. 1977. "Efficiency Estimation from Cobb-Douglas Production Functions with Composed Error." International Economic Review 18(2):435–44.
Mills, E. S. 2000. "The Importance of Large Urban Areas--and Governments' Roles in Fostering Them." In S. Yusuf, W. Wu, and S. Evenett, eds., Local Dynamics in an Era of Globalization. Washington, D.C.: World Bank.
Nelson, R. R., and H. Pack. 1999. "The Asian Miracle and Modern Growth Theory." Economic Journal 109(457):416–36.
Political Risk Services. 2004. International Country Risk Guide. East Syracuse, N.Y. Available online at www.prsgroup.com/icrg/icrg.html.
Quigley, J. M. 1998. "Urban Diversity and Economic Growth." Journal of Economic Perspectives 12(2):127–38.
Sala-i-Martin, X. 1997. "I Just Ran 2 Million Regressions." American Economic Review 87(2):178–83.
Solow, R. M. 2000. Growth Theory: An Exposition, 2nd ed. New York: Oxford University Press.
Temple, J. 2000. "Inflation and Growth: Stories Short and Tall." Journal of Economic Surveys 14(4):395–426.
Tulkens, H. 1993. "On FDH Analysis: Some Methodological Issues and Applications to Retail Banking, Courts and Urban Transit." Journal of Productivity Analysis 4:183–210.
Tulkens, H., and P.
Vanden Eeckhaut. 1995. "Non-Parametric Efficiency, Progress and Regress Measures for Panel Data: Methodological Aspects." European Journal of Operational Research 80:474–99.
Wang, H.-J., and P. Schmidt. 2002. "One-Step and Two-Step Estimation of the Effects of Exogenous Variables on Technical Efficiency Levels." Journal of Productivity Analysis 18(2):129–44.
Wheeler, J. O., Y. Aoyama, and B. Wolf, eds. 2000. Cities in the Telecommunications Age: The Fracturing of Geography. New York: Routledge.
World Bank. 2001. World Development Indicators Database. Washington, D.C. Available online at www.worldbank.org/data.
------. 2002. World Development Report 2003: Dynamic Development in a Sustainable World. New York: Oxford University Press.