The Yale Journal of Economics Spring 2014
Volume 2, Issue 2
Staff

Editor-in-Chief: Antonia Woodford
Managing Editor: Moss Weinstock
Associate Editors: Dhruv Aggarwal, Elijah Goldberg, Jimin He
Copy Editors: Jun Hwan Ryu, James Austin Schaefer, Yalun Zhang
Production and Design Editor: Madeline McMahon
Publisher: Brian P. Lei
Board of Advisers: Joseph G. Altonji, Pinelopi K. Goldberg, Samuel S. Kortum, Anthony A. Smith, Jr.

New Haven, CT
Website: http://econjournal.sites.yale.edu/
Table of Contents

Editors’ Note

Alex Dombrowski (University of California, Berkeley)
Returns to Schooling: A College Athlete’s Perspective

Stefano Giulietti (Yale)
Contagion in the Eurozone Sovereign Debt Crisis

Disha Verma (Harvard)
The Socioeconomic Impact of Political Fragmentation in India: What the Rise of Regional Politics Implies for Economic Growth and Development

Dounia Saeme (University of California, Berkeley)
Does the Implementation of Affirmative Action in a Competitive Setting Incentivize Underrepresented Public School Applicants’ Performance? Evidence From São Paulo

Adin Lykken (Yale)
A Question of Intent: Explaining the Performance of Governments in Global Development Projects
This journal is published by Yale College students and Yale University is not responsible for its contents. A full list of references for the papers in this issue is available on our website.
Editors’ Note Another semester in the books. Now a little more than a year old, the Yale Journal of Economics is still finding its footing in the academic community—a process that we knew would require time and, more importantly, effort. Without those, the Journal could never achieve its goal: showcasing original economic research performed by undergraduates. We are thankful for the enthusiasm for our project shown by students around the globe, who have continued to submit insightful and innovative papers; we are excited to publish their research here and in future issues. We are also thrilled to expand the Journal’s online presence via a redesigned website that allows undergraduates’ excellent work to reach an even wider audience. This issue contains five essays written by students at Harvard University, Yale University, and the University of California, Berkeley. In fact, this issue has a bit of a global flavor. To start the issue, Alex Dombrowski examines a key U.S. institution: the National Basketball Association. Specifically, he considers the optimal time for college players to “go pro” by declaring for the NBA draft. Stefano Giulietti turns his gaze towards Europe and its sovereign debt crisis, using credit default swap spreads to measure contagion effects across countries. Disha Verma studies the effects of political fragmentation in India and finds that coalition governments grow faster, undertake more social spending, and incur less debt than majority governments. Dounia Saeme investigates Brazil’s college entrance examination system to see how the implementation of affirmative action incentivizes public school students in São Paulo. Completing the issue, Adin Lykken’s research spans the globe; he analyzes the performance of governments that complete projects financed by the World Bank, and concludes that improving project supervision may be the most effective way to yield better project outcomes. 
We also need to thank the generous donors whose contributions make the Journal possible. In addition to the Yale Department of Economics and the Yale Undergraduate Organizations Committee, Stephen Freidheim ’86 and David Swensen GRD ’80 made substantial contributions that allow us to continue to produce the Journal and distribute it free of charge. Thanks to their support, we have published the Journal two times this academic year, this spring and last fall. Finally, we would like to thank the members of our advisory board for their guidance during the publishing process. Together, we hope to bring the Journal to new heights.
Returns to Schooling: A College Athlete’s Perspective

Alex Dombrowski, University of California, Berkeley [1]

Abstract. Two decades ago, 90% of the National Basketball Association (NBA) college draftees had completed their senior year of college. Today that number is only 30%. College basketball players are making the jump to the NBA earlier on, after their freshman, sophomore, or junior seasons. This paper uses both empirical and theoretical frameworks to study a player’s decision of when to "go pro." I estimate returns to schooling by comparing the earnings of two groups of NBA players: those who went pro out of high school or immediately after freshman year of college from 1989 to 2005 versus those who went pro immediately after freshman year from 2006 to 2012. A new rule was implemented in 2005 that forced players to wait at least one year after their high school graduation before going pro. Thus, 2005 was the last year when players could move directly from high school to the NBA, and the latter group contains players who were forced to complete an additional year of school. I find no significant difference in earnings between the two groups.

Keywords: returns to schooling, optimal stopping, NBA

[1] Alex Dombrowski is a senior at the University of California, Berkeley majoring in mathematics, statistics, and economics. He wrote this senior honors thesis for Professor David Card. He would like to thank Professor Card for his time and support.
1 Introduction

In economics, returns to schooling are commonly thought of as the expected change in earnings due to an additional year of school. For example, an undergraduate considering whether to pursue graduate study may like to know the expected salary difference from those additional years of education. This paper approaches returns to schooling from the perspective of a college athlete. In particular, I consider college basketball players, but as a motivating example and illustration of the widespread nature of this phenomenon, consider the football player Matt Barkley. Barkley, a member of the class of 2013, was a quarterback at the University of Southern California (USC), where he had a phenomenal junior season. Speculators thought he would forgo his senior year at USC to go directly into the National Football League (NFL). Instead, Barkley opted to stay at USC for his senior year, intending to go pro right after. Unfortunately, his senior year performance was not nearly as compelling and he was drafted much lower. His contract was estimated to be worth millions of dollars less than if he had gone pro after junior year. In Barkley’s case, his returns to schooling for that year were negative. [2]

The composition of the NBA draft from 1989 to 2012 shown in Figure 1 illustrates how players have been going pro earlier. In each draft, the 30 teams each select two players, for a total of 60 players moving from college into the NBA. Throughout the early 1990s, around 90% of those drafted were seniors. This percentage has fallen dramatically over the last two decades. In 2012 only 35% of those drafted were seniors.

Figure 1: Draft Composition 1989-2012

So why strategize about when to leave college and go pro? If a collegiate basketball player enters the NBA draft, he forgoes his remaining college eligibility regardless of whether he is drafted. [3] For example, if a freshman enters the draft, he can no longer play college basketball even if he is not drafted.

The paper is laid out as follows: Section 2 gives a brief literature review. Section 3 is an overview of the NBA. Section 4 is a study testing the effect of playing an additional year of college basketball on earnings. Sections 5, 6, and 7 contain the theoretical framework. Section 8 concludes. [4]

[2] There are of course other benefits to finishing senior year (e.g. earning the degree), but here I focus solely on maximizing the player’s earnings as a professional athlete.
[3] If the player does not sign with an agent, has never applied for a previous NBA draft, and withdraws his name from the draft by the deadline, then he can retain his eligibility. However, the National Collegiate Athletic Association (NCAA) rules only allow players to enter the draft once without losing eligibility.
2 Literature Review

There are a few notable papers that discuss contracts and early entry. Li and Rosen (1998) analyze when and why contracts are made early. Winfree and Molitor (2007) analyze returns to schooling for baseball players. In particular, they focus on a recent high school graduate’s decision of whether to go to college or go directly into the Major League. Arel and Tomas (2012) view declaring for the NBA draft as exercising an American-style "put option" early.

[4] All figures throughout this paper are original.
3 Overview of the NBA

The NBA can be viewed as a labor market where each year the 30 teams (firms) hire 60 players (employees) from a pool of players, most of whom live in the United States. The NBA players have a labor union, called the National Basketball Players Association (NBPA), which negotiates rules with the League, which is composed of the commissioner and team owners. Every few years, the League and NBPA form a new Collective Bargaining Agreement (CBA), which lays the foundation for how the NBA is governed. The CBA contains information about how the draft works, how basketball revenue is allocated among the players and the League, and the minimum and maximum amounts a team can pay its players. For example, the 1995 CBA introduced the “rookie pay scale,” which structured how incoming players would be paid. Players drafted in the first round are guaranteed a two-year contract, followed by two one-year team options. Players drafted in the second round are not guaranteed contracts. Also, being picked later in the draft results in lower pay.

The NBA draft is held annually at the end of June. Starting in 1989, the draft instituted a two-round system, in which each team selects one player in each round. This new drafting system is the main reason I focus on draft data beginning in 1989. I constructed a dataset of 1,382 players who were drafted from 1989 to 2012. [5] Data were collected from www.basketball-reference.com and verified using basketball.realgm.com. During the 24 draft years from 1989 to 2012, 53.11% of the 1,382 players drafted were seniors, 14.40% were juniors, 14.26% were international, 9.62% were sophomores, 5.79% were freshmen, and 2.82% were high school seniors.
4 Returns to Schooling: An Additional Year

The 2005 CBA made 2005 the last season in which a high school player could go directly into the NBA. Beginning in 2006, a player had to be one year removed from high school before going into the NBA. [6] Figure 2 shows how the numbers of high school seniors and college freshmen have evolved in drafts dating back to 1989. The dark line falls to zero in 2006 as a result of the new CBA. In 2007, the light line spikes from two to eight. This spike represents the 2006 high school class that was forced to play a year in college, along with a couple of 2006 high school graduates who would have chosen to play their freshman year even without the new rule. Thus, the light line post-2005 can be thought of as a merging of the pre-2006 dark line and the pre-2006 light line.

Figure 2: Number of High Schoolers and Freshmen Drafted

In this study, I explore differences in earnings between the following two groups of players: freshmen and high school seniors drafted in 1989-2005 versus freshmen drafted in 2006-2012. The first group, which I call the “pre-law” group, has 67 players. The second group, the “post-law” group, has 52 players. Of these 119 players, three never played a game in the NBA and were removed from the data. [7] Table 1 provides a detailed comparison of these two groups.

[5] Although there were 30 total teams in 2012, at the beginning of 1989 there were only 27. Consequently, earlier drafts saw fewer than 60 players drafted.
[6] The player doesn’t have to attend college for that year; however, attending college for that year is standard.
[7] Ousmane Cisse (2001 draft), Ricky Sanchez (2005 draft), and Keith “Tiny” Gallon (2010 draft) never played in the NBA.

Table 1: Comparison of Pre-law and Post-law Group Performance

Career Statistics                 1989-2005 (pre-law)              2006-2012 (post-law)
                                  Mean      Median    SD           Mean      Median    SD
Games Played                      548.9     560.0     323.9        193.4     170.5     129.9
Minutes Played                    15740     14250     11557.1      4883      3568      4176.8
Total Points                      7556      5709      6750.9       2201.0    1400.0    2281.2
Total Rebounds                    3129.0    2516.0    2785.2       884.0     601.0     846.6
Total Assists                     1302.0    840.0     1464.4       425.9     156.0     549.6
Field Goal Shooting Percentage    0.4439    0.4540    0.092        0.4454    0.4455    0.0883
3 Point Shooting Percentage       0.2626    0.3130    0.1541       0.2471    0.2985    0.1458
Free Throw Shooting Percentage    0.7007    0.7410    0.15571      0.6858    0.7330    0.1506
Minutes Per Game                  23.80     24.90     9.9822       21.95     21.45     9.3834
Points Per Game                   10.91     9.90      6.4508       9.425     8.250     5.7259
Rebounds Per Game                 4.679     4.200     2.73399      4.015     3.650     2.5015
Assists Per Game                  1.858     1.500     1.5396       1.862     1.200     1.8257
Career Length (years)             9.269     9.000     4.269        3.481     3.000     1.862

From Table 1, we can informally compare the statistics of the two groups. Since the pre-law group has players with longer careers than those in the post-law group, it may be more difficult to compare statistics like points and total rebounds. To compare these, note that the median career length of the pre-law group is three times that of the post-law group. Thus, multiplying the post-law group’s total points by three may give a reasonable number to compare to the pre-law group’s total points. For a finer comparison, use minutes played instead of career length. The median minutes played of the pre-law group is four times that of the post-law group. Thus, numbers could be scaled appropriately by a factor of three to four to make more accurate comparisons.
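As a rough illustration of the scaling just described, the following Python sketch applies the two candidate factors to the medians reported in Table 1. The dictionary and function names are my own and are not part of the original analysis; only the numbers come from Table 1.

```python
# Medians taken from Table 1 (pre-law: 1989-2005, post-law: 2006-2012).
PRE_LAW = {"career_years": 9.0, "minutes": 14250.0, "points": 5709.0, "rebounds": 2516.0}
POST_LAW = {"career_years": 3.0, "minutes": 3568.0, "points": 1400.0, "rebounds": 601.0}


def scale_factor(key):
    """Ratio of the pre-law median to the post-law median for one statistic."""
    return PRE_LAW[key] / POST_LAW[key]


def scaled_post_law(stat, by):
    """Post-law median of `stat`, scaled up by the ratio implied by `by`."""
    return POST_LAW[stat] * scale_factor(by)


# Compare scaled post-law medians with the pre-law medians.
for by in ("career_years", "minutes"):
    for stat in ("points", "rebounds"):
        print(by, stat, round(scaled_post_law(stat, by)), "vs", PRE_LAW[stat])
```

Scaling by career length uses a factor of exactly 3, while scaling by minutes played uses a factor just under 4, so the two choices bracket the range suggested in the text.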
4.1 Analysis of Earnings

Salary data were collected from www.basketball-reference.com and checked against http://www.eskimo.com/~pbender/. [8] Salaries were put into real 2013 dollars using CPI numbers from the Federal Reserve Bank of St. Louis. Figure 3 shows how earnings have evolved throughout this time period. All 116 players are plotted, with each player represented by a line. The light lines are the pre-law players and the dark lines are the post-law players. The figure is meant to give a general sense of how earnings progress as the players gain more years of experience. Notable players Kevin Garnett (1995 draft), Kobe Bryant (1996 draft), and Kevin Durant (2007 draft) are highlighted.

Figure 3: Annual Earnings

Figure 4 is a condensed version of Figure 3. Figure 4 illustrates the difference in earnings between the pre-law and post-law groups. Average earnings in real 2013 dollars are plotted against career year. So for example, the post-law group (dark line) for career year one corresponds to the average earnings of players in that group during their rookie year. In career year 1, the post-law group earned on average $580,000 more than the pre-law group. In career year 2, the post-law group earned on average $480,000 more than the pre-law group. In career year 3, the post-law group earned on average $270,000 more than the pre-law group. In career years 4 and 5, the pre-law group out-earned the post-law group by $270,000 and $1,250,000 respectively. [9]

Figure 4: Average Earnings of Pre-law and Post-law Groups

Draftees often hire agents to negotiate their rookie contracts. The rookie pay scale is not completely rigid: The team can pay between 80% and 120% of the salary specified by the rookie pay scale. However, nearly all contracts end up at 120%. Arel and Tomas (2012) find that, of the players drafted in the first round between 2006 and 2012, 98% had contracts for 120% of the amount specified by the rookie pay scale. Therefore I did not control for the quality of the agent in the analysis. The following two subsections use a two sample z-test and a Mann-Whitney test to determine if the difference in earnings between the two groups is significant.

[8] For the NBA lockouts in 1998 and 2011, I used the full years’ earnings, not the prorated salary.
[9] I don’t analyze career years 6 and 7 in detail because there are fewer than 10 players from the post-law group who played six or more years, and only two who played all seven.
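The conversion of nominal salaries into real 2013 dollars mentioned in this subsection can be sketched as a simple CPI ratio. The paper does not give the index values or the exact series used, so the figures below are placeholders chosen only to make the arithmetic easy to follow, and the function name is my own.

```python
def to_real_dollars(nominal, cpi_in_year, cpi_base):
    """Convert a nominal amount into base-year dollars using a CPI ratio."""
    if cpi_in_year <= 0 or cpi_base <= 0:
        raise ValueError("CPI values must be positive")
    return nominal * cpi_base / cpi_in_year


# Placeholder index values for illustration only; NOT actual CPI figures.
HYPOTHETICAL_CPI = {1996: 150.0, 2013: 225.0}

# A hypothetical $1,000,000 salary paid in 1996, expressed in 2013 dollars.
salary_1996 = 1_000_000
real_2013 = to_real_dollars(salary_1996, HYPOTHETICAL_CPI[1996], HYPOTHETICAL_CPI[2013])
print(real_2013)
```

With these placeholder values the price level rises by half between the two years, so the 1996 salary is worth one and a half times as much in 2013 dollars.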
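4.1.1 Parametric Two Sample Z-Test

The test assumes \(X_1, \dots, X_n \overset{iid}{\sim} N(\mu_X, \sigma^2)\) and \(Y_1, \dots, Y_m \overset{iid}{\sim} N(\mu_Y, \sigma^2)\), where \(\sigma^2\) is estimated by a pooled variance:

\[
s_p^2 = \frac{(n-1)s_X^2 + (m-1)s_Y^2}{m+n-2}
\quad \text{where} \quad
s_X^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2. \tag{1}
\]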
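A minimal Python sketch of a pooled two-sample z-test built on equation (1) follows. The paper does not print the test statistic itself, so the standardized difference in means used here is the textbook form and should be read as an assumption, as should the function names. The two-sided p-value uses a standard normal reference distribution.

```python
import math


def sample_variance(xs):
    """Unbiased sample variance, as in equation (1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def pooled_variance(xs, ys):
    """Pooled variance of two samples, as in equation (1)."""
    n, m = len(xs), len(ys)
    return ((n - 1) * sample_variance(xs) + (m - 1) * sample_variance(ys)) / (n + m - 2)


def two_sided_p(z):
    """Two-sided p-value, 2 * (1 - Phi(|z|)), for a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def pooled_z_test(xs, ys):
    """Return (z, p) for H0: equal means, assuming a common variance.

    The statistic (mean(xs) - mean(ys)) / sqrt(s_p^2 * (1/n + 1/m)) is the
    textbook form; the paper does not state its exact formula.
    """
    n, m = len(xs), len(ys)
    diff = sum(xs) / n - sum(ys) / m
    se = math.sqrt(pooled_variance(xs, ys) * (1.0 / n + 1.0 / m))
    z = diff / se
    return z, two_sided_p(z)


# Toy data, invented for illustration only.
z, p = pooled_z_test([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(z, p)
```

As a consistency check, `two_sided_p(2.12749991)` is approximately 0.0334, which agrees with the career year 1 entry in Table 2.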
In this case, the observations \(X_i\), \(Y_i\) are the annual earnings of each player for some specified year. The observations can reasonably be treated as independent and identically distributed. The histograms in Figure 5 show the distribution of earnings for each group in the first two years. Though the data are not convincingly normal, the next section’s analysis uses the nonparametric Mann-Whitney test and gives very similar results to this two sample z-test.

Figure 5: Earnings in Years 1 and 2 for the Pre-law and Post-law Groups
Pre-law group earnings in year 1 are on the top left, pre-law group earnings in year 2 are on the bottom left, post-law group earnings in year 1 are on the top right, and post-law group earnings in year 2 are on the bottom right.

Table 2 gives the results from this two sample z-test (two-sided) for career years 1 through 5. In each test, the null hypothesis is:

\[
H_0 : \mu_{X_j} = \mu_{Y_j}, \quad j = 1, 2, 3, 4, 5 \tag{2}
\]

where for example \(\mu_{X_j}\) is the mean of the distribution of earnings in career year \(j\) for the pre-law group. The difference in earnings between the two groups during career year 1 is significant at the 5% level. The difference in earnings between the two groups during career year 2 is just shy of being significant at the 10% level. The other tests do not yield significant results.

4.1.2 Nonparametric Mann-Whitney Test
The Mann-Whitney test is nonparametric, which means it does not assume that the observations follow any particular distribution. The test instead ranks the observations from smallest to largest and compares the sum of ranks. The null hypothesis is that the treatment has no effect, where in this case the treatment is the additional year of school the post-law group experienced. Table 3 has the results, which are similar to those in Table 2. Again, the post-law group’s average earnings in career year 1 are significantly higher than the pre-law group’s average earnings in career year 1.
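The ranking procedure described above can be sketched in Python as follows. This version uses the large-sample normal approximation with no continuity or tie correction. The paper does not say which variant or software produced Table 3, so that choice and the function names are assumptions.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney(xs, ys):
    """Return (U for xs, z, two-sided p) using the normal approximation.

    No continuity or tie correction is applied to the variance.
    """
    n, m = len(xs), len(ys)
    ranks = average_ranks(list(xs) + list(ys))
    rank_sum_x = sum(ranks[:n])
    u_x = rank_sum_x - n * (n + 1) / 2.0
    mean_u = n * m / 2.0
    sd_u = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u_x - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, z, p


# Toy data, invented for illustration only.
u, z_mw, p_mw = mann_whitney([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(u, z_mw, p_mw)
```

With samples this small an exact permutation distribution would normally be preferred; the normal approximation is shown only because it matches the form of the statistics and p-values reported in Table 3.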
Table 2: Results of Two Sample Z-Test

Career year    Test statistic    p-value
1               2.12749991       0.03337857
2               1.6017363        0.1092139
3               0.8169752        0.4139426
4              -0.3234246        0.7463737
5              -0.9189319        0.3581312

Table 3: Results of Mann-Whitney Test

Career year    Test statistic    p-value
1              -2.06083935       0.03931837
2              -1.6164613        0.1059946
3              -0.7660718        0.4436336
4              -0.7233642        0.4694561
5              -0.7660718        0.4436336

4.2 Linear Regression Model
The following model is the form for the regressions:

\[
\ln \text{Earnings}_i = \beta_0 + \beta_1 S_i + \beta_2\, \text{Ability}_i + \beta_3\, \text{Draft pick}_i + \beta_4\, \text{Experience}_i + \varepsilon_i \tag{3}
\]

The dependent variable Earnings varies from regression to regression. \(S\) is an indicator variable taking on zero for the pre-law group and one for the post-law group. Ability is measured by rookie year statistics: minutes played, points, assists, and rebounds. Experience is measured by a player’s total career points. Experience can be thought of as long term ability, controlling for whatever the ability regressor doesn’t. Table 4 gives the results.

Table 4 has four regressions. The choice of regressions is unconventional in the sense that all regressions use the same set of regressors, but have different dependent variables. Instead of settling on one measure for earnings, it seemed more appropriate to give several measures. The dependent variables in regressions (1) and (2) are Career Average Yearly Earnings and Year 1 Earnings; dependent variables in regressions (3) and (4) are Earnings in First 2 Years and Earnings in First 3 Years.
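Because the dependent variable in (3) is in logs, each coefficient is read as an approximate proportional change in earnings per unit change in the regressor. The short Python sketch below makes that reading concrete for the Draft Pick coefficient of regression (1). Its magnitude, 0.024, is the value discussed in the text, and the negative sign is inferred from the stated "2.4% decrease". The function name is my own.

```python
import math


def percent_change(beta, delta=1.0):
    """Exact percent change in the outcome when a regressor in a log-linear
    model rises by `delta`, holding the other regressors fixed."""
    return (math.exp(beta * delta) - 1.0) * 100.0


# Draft Pick coefficient in regression (1); sign inferred from the text.
draft_pick_coef = -0.024

one_pick_later = percent_change(draft_pick_coef)         # close to -2.4
ten_picks_later = percent_change(draft_pick_coef, 10.0)  # compounds, not -24
print(round(one_pick_later, 2), round(ten_picks_later, 2))
```

For a one-position change the exact figure is very close to the coefficient times 100, which is why the coefficient is usually quoted directly as a percentage. For larger changes the gap between the approximation and the exact value widens.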
For all regressions, the coefficient of Year of School Dummy is not significant. Thus, in these tests, there is no evidence that one group had significantly different earnings than the other group. The results of regressions (2), (3), and (4) are very similar, mostly because of the rookie pay scale, which was introduced in 1995. [10] Under the new contract system, earnings in years 1 and 2 are highly correlated. [11] In each regression, Draft Pick is significant at 1%. This is not surprising for regressions (2) and (3), because, as a player is drafted later, his pay will decline according to the rookie pay scale. It is not as obvious why Draft Pick is significant in regression (1), where Career Average Yearly Earnings is the dependent variable. The coefficient of Draft Pick in regression (1) is -0.024, which is smaller in absolute value than the corresponding coefficient in the other three regressions. This is consistent with the rookie pay scale’s influence on average career yearly earnings becoming diluted because of the expiration of the contract and room for more variability in earnings in later years.

The regressions highlight the importance of draft pick on average yearly earnings throughout a player’s career. Regression (1) estimates that being drafted one position later leads to a 2.4% decrease in average career earnings, on average. Table 5 summarizes how each group was drafted. A Mann-Whitney test on the draft number for the pre-law and post-law groups yielded a p-value of 0.16. Thus, there is no evidence that draft positions are significantly different between the two groups.

[10] Although this data set of 116 players encompasses the time period 1989-2012, only 3 players were drafted before 1995: Shawn Kemp (1989), Shawn Bradley (1993), and Dontonio Wingfield (1994).
[11] Of the 116 players, 104 played at least 2 years. The correlation between these players’ earnings in years 1 and 2 of their career is 0.996.

5 Optimal Entry

A major concern for college players is when to go pro. Players may play all four years of college basketball or may opt to enter the draft early. Once a player declares for the draft, he can never play college basketball again, regardless of whether he is drafted
0.0002 (0.001)
Assists (Rookie)
0.074⇤⇤⇤ (0.011) 0.00004 (0.00003) 15.919⇤⇤⇤ (0.383)
0.024⇤⇤⇤ (0.004) 0.0001⇤⇤⇤ (0.00001) 14.895⇤⇤⇤ (0.140)
Draft Pick
Total Career Points
Constant
16.650⇤⇤⇤ (0.414)
0.00004 (0.00003)
0.073⇤⇤⇤ (0.012)
0.001 (0.001)
0.001 (0.002)
17.055⇤⇤⇤ (0.460)
0.00004 (0.00003)
0.068⇤⇤⇤ (0.013)
0.001 (0.001)
0.001 (0.002)
0.002 (0.002)
0.001 (0.001)
0.027 (0.360)
(4)
*, **, *** indicates statistical significance at the 10%, 5%, and 1% level, respectively. (1) Career Average Yearly Earnings (log); (2) Year 1 Earnings (log); (3) Earnings in First 2 Years (log); (4) Earnings in First 3 Years (log)
0.001 (0.001)
0.0001 (0.0004)
Points (Rookie)
0.001 (0.002)
0.001 (0.001)
0.001⇤ (0.0005)
Rebounds (Rookie)
0.002 (0.002)
0.001⇤ (0.001)
0.001⇤ (0.001)
0.00002 (0.0002)
Minutes Played (Rookie)
(3) 0.088 (0.319)
(2) 0.091 (0.284)
0.111 (0.104)
Year of School Dummy
(1)
Dependent variable:
Table 4: The Effect of an Additional Year of Schooling on Earnings
Table 5: Summary of Draft Pick Number

Group       Min     1st Qu.   Median   Mean    3rd Qu.   Max
Pre-law     1.00    6.00      13.00    17.98   26.00     56.00
Post-law    1.00    4.00      11.00    14.75   22.00     49.00
to an NBA team or not. A very talented underclassman may want to enter the draft early for many reasons. He may have had an outstanding season or his team may have won the national championship. He could have received a prestigious award like “Most Valuable Player” or may be concerned his performance will be worse during his later years of college. He could also get injured in a later season. There are many cases in which college athletes went pro at the “wrong time.” Hence, we would like to analyze optimal entry.

Consider a college player wanting to go pro. He wants to be drafted high, not low (i.e. in the first round rather than the second). Let:

\(P_H\) := probability of being drafted high,
\(P_L\) := probability of being drafted low,
\(E_H\) := high career earnings (when drafted high),
\(E_L\) := low career earnings (when drafted low),

where \(E_H > E_L > 0\). Consider a player seeking to go pro and assume this player won’t go undrafted. [12] That is, \(P_H + P_L = 1\). Suppose that \(P_H\) and \(P_L\) depend only on that player’s skill level, \(S\). Thus,

\[
P_H = P_H(S) \quad \text{and} \quad P_L = P_L(S), \tag{4}
\]

where these two functions have the property

\[
\frac{dP_H}{dS} > 0 \quad \text{and} \quad \frac{dP_L}{dS} < 0, \tag{5}
\]

since a player’s chance of being drafted high should be monotonically increasing in his skill level. Likewise, a player’s chance of being drafted low should decline as the player’s skill increases. A player’s skill level will vary over time as he progresses through college. So letting \(y\) denote years of college experience we have

\[
S = S(y), \quad y \in [0, 4], \tag{6}
\]

where \(y = 2\), for example, corresponds to 2 years of experience. Let the player have a utility function, \(u(e)\), where \(e\) is career earnings as a professional with \(u'(e) > 0\) and \(u''(e) < 0\). In other words, the player’s utility is increasing and concave in his career earnings. Because the two possible outcomes \(E_H\) and \(E_L\) are fixed and \(u\) is increasing, expected utility and expected career earnings both rise with \(P_H\). Thus, to maximize utility, it is sufficient to maximize expected career earnings, \(E(e)\). Therefore, we have the optimization problem,

\[
\max_y E(e) = \max_y \Big[ P_H(S) \cdot E_H + P_L(S) \cdot E_L \Big] \tag{7}
\]

Using (4) and \(P_H + P_L = 1\), (7) becomes

\[
\max_y \Big[ P_H(S(y))\, E_H + \big(1 - P_H(S(y))\big) E_L \Big] \tag{8}
\]

The solution to (8) involves making \(P_H(S(y))\) as large as possible, which occurs when \(S(y)\) is as large as possible. Let us verify that this is indeed the case. The first order condition says,

\[
\begin{aligned}
\frac{d}{dy}\Big[ P_H(S(y))\, E_H + \big(1 - P_H(S(y))\big) E_L \Big]
&= E_H P_H'(S(y)) S'(y) - E_L P_H'(S(y)) S'(y) \\
&= (E_H - E_L) P_H'(S(y)) S'(y) \\
&= 0
\end{aligned} \tag{9}
\]

By assumption, \(E_H > E_L\) and so \(E_H \neq E_L\). Also by assumption, \(P_H(S)\) is monotonically increasing, and hence \(P_H'(S) \neq 0\) for all \(S\). Therefore,

\[
(E_H - E_L) P_H'(S(y)) S'(y) = 0 \iff S'(y) = 0 \tag{10}
\]

[12] Relaxing this assumption leads to the same conclusion. See Appendix for proof.
Let \(y^*\) be such that \(S'(y^*) = 0\) and \(S''(y^*) < 0\). Then \(y^*\) denotes the number of years of college experience that maximizes skill level.

Claim:

\[
\max_{y_L \le y \le y_R} E(e) = E(e)\big|_{y=y^*} \tag{11}
\]

provided that \(y^*\) is the absolute maximum of \(S(y)\).

Proof: We already know \(S'(y^*) = 0\), which makes \(y^*\) a critical point of \(E(e)\). So we must check

\[
\frac{d^2}{dy^2}\Big[E(e)\Big]\Big|_{y=y^*} < 0 \tag{12}
\]

From (10),

\[
\frac{d}{dy}\Big[(E_H - E_L) P_H'(S(y)) S'(y)\Big]
= (E_H - E_L)\Big[P_H'(S(y)) S''(y) + S'(y) P_H''(S(y)) S'(y)\Big] \tag{13}
\]

Letting \(y = y^*\),

\[
= (E_H - E_L)\Big[P_H'(S(y^*)) S''(y^*) + \big(S'(y^*)\big)^2 P_H''(S(y^*))\Big] \tag{14}
\]

Since by assumption \(S'(y^*) = 0\), this becomes

\[
= (E_H - E_L) P_H'(S(y^*)) S''(y^*) \tag{15}
\]

By assumption, \(E_H > E_L\) and \(S''(y^*) < 0\). Also, \(P_H'(S) > 0\) for all \(S\). So in particular, \(P_H'(S(y^*)) > 0\). Therefore,

\[
(E_H - E_L) P_H'(S(y^*)) S''(y^*) < 0. \tag{16}
\]

From (16) we conclude

\[
\max_y E(e) = E(e)\big|_{y=y^*} \quad \text{for } 0 < y < 4. \tag{17}
\]
Now evaluate expected earnings at the boundary and compare to \(E(e)|_{y=y^*}\). Let \(y_B\) generically denote either \(y_L\) or \(y_R\). Then

\[
\begin{aligned}
& E(e)\big|_{y=y^*} > E(e)\big|_{y=y_B} \\
\iff\; & P_H(S(y^*))\, E_H + \big[1 - P_H(S(y^*))\big] E_L > P_H(S(y_B))\, E_H + \big[1 - P_H(S(y_B))\big] E_L \\
\iff\; & P_H(S(y^*))\, E_H - P_H(S(y^*))\, E_L > P_H(S(y_B))\, E_H - P_H(S(y_B))\, E_L \\
\iff\; & E_L \big[P_H(S(y_B)) - P_H(S(y^*))\big] > E_H \big[P_H(S(y_B)) - P_H(S(y^*))\big]
\end{aligned} \tag{18}
\]

Since \(E_L < E_H\),

\[
\iff P_H(S(y_B)) - P_H(S(y^*)) < 0 \tag{19}
\]

Since \(P_H'(S) > 0\),

\[
\iff S(y^*) > S(y_B) \tag{20}
\]
Therefore to maximize expected career earnings, and hence utility, a player should go pro when his college skills are best. This conclusion is simple. The complication is that a player does not know when his skills will be best. He may have a great freshman year and expect to get better, but actually have worse years later in college. Players don’t know when S(y) is at its maximum, just like stock market agents don’t know when the price of a stock is at its maximum. This naturally leads to random walks and optimal stopping rules, which are considered next.
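The conclusion of this section can be checked numerically. The Python sketch below uses invented functional forms: a concave skill path that peaks at \(y = 2.5\), a logistic map from skill to \(P_H\), and arbitrary earnings levels. None of these come from the paper; they only need to satisfy the assumptions \(E_H > E_L > 0\) and \(P_H'(S) > 0\). It then confirms on a grid that expected earnings and skill are maximized at the same \(y\).

```python
import math

# Hypothetical career earnings (in millions), satisfying E_HIGH > E_LOW > 0.
E_HIGH, E_LOW = 40.0, 5.0


def skill(y):
    """Hypothetical concave skill path on [0, 4] that peaks at y = 2.5."""
    return 10.0 - (y - 2.5) ** 2


def prob_high(s):
    """Hypothetical strictly increasing map from skill to P_H (logistic)."""
    return 1.0 / (1.0 + math.exp(-(s - 8.0)))


def expected_earnings(y):
    """E(e) as in equation (8), using P_L = 1 - P_H."""
    p = prob_high(skill(y))
    return p * E_HIGH + (1.0 - p) * E_LOW


# Evaluate both objectives on a grid over [0, 4] in steps of 0.01.
grid = [i / 100.0 for i in range(0, 401)]
best_for_earnings = max(grid, key=expected_earnings)
best_for_skill = max(grid, key=skill)
print(best_for_earnings, best_for_skill)
```

Any other strictly increasing `prob_high` would give the same maximizer, which is the content of equations (18) through (20).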
6 Stochastic Model

In the previous section, we concluded that a player can optimize expected career earnings by strategically entering the draft when his skill level is highest. Thus we should analyze how skill level moves from year to year and find when it’s likely to be highest.
Let \(S_n\) be skill level after \(n\) years of experience and let \(\{S_n, n = 1, 2, 3, 4\}\) be a random walk defined by

\[
S_{n+1} = S_n + \mu + \varepsilon, \tag{21}
\]

where \(\varepsilon \sim N(0, \sigma^2)\) and the jumps from year to year are independent. The term \(\varepsilon\) allows for variation in the amount of skill gained. \(\mu\) represents a drift and is thought of as the baseline amount of skill gained from year to year. Typically \(\mu > 0\), so we make this assumption in the algebraic solution, though the simulations in section 6.2.3 allow for \(\mu \le 0\). Note that \(S(y)\) is more naturally thought of as being continuous since a player’s skill evolves (perhaps continuously) throughout his career. However, we can consider our discrete time analysis as using only the values \(S(1)\), \(S(2)\), \(S(3)\), and \(S(4)\) of the continuous time \(S(y)\). The probability that skill increases over a year is given by

\[
\begin{aligned}
P(S_{n+1} > S_n) &= P(S_n + \mu + \varepsilon > S_n) \\
&= P(\varepsilon > -\mu) \\
&= 1 - F(-\mu) \\
&= F(\mu)
\end{aligned} \tag{22}
\]

where

\[
F(\mu) = \int_{-\infty}^{\mu} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x}{\sigma}\right)^2} dx. \tag{23}
\]

The conditional expectation and variance are

\[
E(S_{n+1} \mid S_n) = E(S_n + \mu + \varepsilon \mid S_n) = S_n + \mu \tag{24}
\]

and

\[
\mathrm{Var}(S_{n+1} \mid S_n) = \mathrm{Var}(S_n + \mu + \varepsilon \mid S_n) = \sigma^2. \tag{25}
\]
Next, consider the maximum value of the random walk. Define “probability functions” \(P_i\) to be

\[
P_i = P\big(\max\{S_1, S_2, S_3, S_4\} = S_i\big), \quad i = 1, 2, 3, 4. \tag{26}
\]

We would like to find expressions for \(P_i\). For \(\mu = 0\), the expressions are simple. However, for \(\mu > 0\), the expressions are messier. The following solves for \(P_i\) when \(\mu = 0\), gives an outline of the solution for \(P_i\) when \(\mu > 0\), and provides a simulation that illustrates the estimated solution for \(P_i\) for all \(\mu\).
6.1 No Drift (\(\mu = 0\))

Let’s find expressions for \(P_i\), \(i = 1, 2, 3, 4\) when \(\mu = 0\). To simplify the notation, let \(p = F(\mu)\), the probability the random walk goes up. Since \(\mu = 0\), we could use \(p = F(\mu) = F(0) = 1/2\); however, the next subsection generalizes these expressions, so we do not explicitly use 1/2 here. The following lemma illustrates the logic used in this section.

Lemma 6.1. For the walk \(S_{n+1} = S_n + \varepsilon\), \(P(\max\{S_0, S_1, S_2\} = S_1) = p(1-p)\).

Verification:

\[
\begin{aligned}
P(\max\{S_0, S_1, S_2\} = S_1) &= P(S_1 > S_0,\ S_1 > S_2) \\
&= P(S_1 > S_2 \mid S_1 > S_0)\, P(S_1 > S_0) \\
&= P(S_1 > S_1 + \mu + \varepsilon)\, F(\mu) \\
&= P(\varepsilon < -\mu)\, F(\mu) \\
&= (1 - F(\mu))\, F(\mu) \\
&= (1 - p)\, p
\end{aligned} \tag{27}
\]

Now let’s find \(P_1\), the probability that \(S_1\) is the maximum (i.e. the player’s skill is highest after freshman year). \(S_1\) is the maximum in four cases: the walk goes

1. down down down
2. down down up, with the two down steps being larger than the up step
3. down up down, with the first down step being larger than the up step
4. down up up, with the down step being larger than the two up steps
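Thus,

\[
\begin{aligned}
P_1 = (1-p)^3 &+ p(1-p)^2 \cdot P\big(N(0, 2\sigma^2) > N(0, \sigma^2)\big) \\
&+ p(1-p)^2 \cdot P\big(N(0, \sigma^2) > N(0, \sigma^2)\big) \\
&+ p^2(1-p) \cdot P\big(N(0, \sigma^2) > N(0, 2\sigma^2)\big),
\end{aligned} \tag{28}
\]

where we’ve used the fact that \(N(0, \sigma^2) + N(0, \sigma^2) \overset{d}{=} N(0, 2\sigma^2)\). Also, by symmetry, if \(X \sim N(0, \sigma_X^2)\) and \(Y \sim N(0, \sigma_Y^2)\) are independent, then

\[
P(X < Y) = P(X > Y) = 1/2 \quad \forall\ \sigma_X, \sigma_Y. \tag{29}
\]

From (28), using (29) and simplifying gives

\[
\begin{aligned}
P_1 &= (1-p)^3 + \frac{p}{2}(1-p)^2 + \frac{p}{2}(1-p)^2 + \frac{p^2}{2}(1-p) \\
\rightarrow\ P_1 &= 1 - 2p + \frac{3}{2}p^2 - \frac{1}{2}p^3
\end{aligned} \tag{30}
\]

Continuing in this manner, let’s find \(P_2\), \(P_3\), \(P_4\). \(S_2\) is the maximum in two cases: the walk goes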
1. up down up, with the down step being larger than the last up step
2. up down down

Thus,

\[
\begin{aligned}
P_2 &= p^2(1-p) \cdot P\big(N(0, \sigma^2) > N(0, \sigma^2)\big) + p(1-p)^2 \\
\rightarrow\ P_2 &= p - \frac{3}{2}p^2 + \frac{1}{2}p^3
\end{aligned} \tag{31}
\]

\(S_3\) is the maximum in two cases: the walk goes

1. down up down, with the up step greater than the first down step
2. up up down

Thus,

\[
\begin{aligned}
P_3 &= p(1-p)^2 \cdot P\big(N(0, \sigma^2) > N(0, \sigma^2)\big) + p^2(1-p) \\
\rightarrow\ P_3 &= \frac{1}{2}p - \frac{1}{2}p^3
\end{aligned} \tag{32}
\]

\(S_4\) is the maximum in four cases: the walk goes

1. up up up
2. down down up, with the up step larger than the sum of the two down steps
3. down up up, with the sum of the two up steps larger than the down step
4. up down up, with the last up step larger than the down step

So,

\[
\begin{aligned}
P_4 = p^3 &+ p(1-p)^2 \cdot P\big(N(0, \sigma^2) > N(0, 2\sigma^2)\big) \\
&+ p^2(1-p) \cdot P\big(N(0, 2\sigma^2) > N(0, \sigma^2)\big) \\
&+ p^2(1-p) \cdot P\big(N(0, \sigma^2) > N(0, \sigma^2)\big) \\
\rightarrow\ P_4 = \frac{1}{2}p &+ \frac{1}{2}p^3
\end{aligned} \tag{33}
\]
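As a sanity check, let us verify that

$$
P_1 + P_2 + P_3 + P_4 = 1.
\tag{34}
$$

Summing (30)-(33),

$$
\begin{aligned}
&\Big(1 - 2p + \tfrac{3}{2}p^2 - \tfrac{1}{2}p^3\Big) + \Big(p - \tfrac{3}{2}p^2 + \tfrac{1}{2}p^3\Big) + \Big(\tfrac{1}{2}p - \tfrac{1}{2}p^3\Big) + \Big(\tfrac{1}{2}p + \tfrac{1}{2}p^3\Big) \\
&\qquad = 1 + \Big(-2 + 1 + \tfrac{1}{2} + \tfrac{1}{2}\Big)p + \Big(\tfrac{3}{2} - \tfrac{3}{2}\Big)p^2 + \Big(-\tfrac{1}{2} + \tfrac{1}{2} - \tfrac{1}{2} + \tfrac{1}{2}\Big)p^3 \\
&\qquad = 1
\end{aligned}
\tag{35}
$$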
Equations (30)-(33) are only valid for p = 1/2, as assumed in this section. Evaluating these four expressions when p = 1/2 gives

$$
P_1 = 5/16, \qquad P_2 = 3/16, \qquad P_3 = 3/16, \qquad P_4 = 5/16.
\tag{36}
$$

Hence when a player has no drift in skill, the probability his skill will be highest after freshman year is 5/16, after sophomore year is 3/16, after junior year is 3/16, and after senior year is 5/16. The assumption of no drift may hold for some players, but it is more revealing to incorporate drift and build the full model.
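The arithmetic behind (30)-(36) can be checked with exact rational numbers. The sketch below is only an illustration: the function name `drift_free_probabilities` is my own, and the polynomials are copied from (30)-(33) as written above.

```python
from fractions import Fraction


def drift_free_probabilities(p):
    """Evaluate the closed forms (30)-(33) at up-probability p, exactly."""
    p = Fraction(p)
    half = Fraction(1, 2)
    p1 = 1 - 2 * p + Fraction(3, 2) * p**2 - half * p**3  # equation (30)
    p2 = p - Fraction(3, 2) * p**2 + half * p**3          # equation (31)
    p3 = half * p - half * p**3                           # equation (32)
    p4 = half * p + half * p**3                           # equation (33)
    return p1, p2, p3, p4


# Values at p = 1/2, as in (36).
print([str(q) for q in drift_free_probabilities(Fraction(1, 2))])
# ['5/16', '3/16', '3/16', '5/16']

# Identity (34): the four polynomials sum to one for any p, not only p = 1/2.
for trial_p in (Fraction(0), Fraction(1, 3), Fraction(1, 2), Fraction(9, 10), Fraction(1)):
    assert sum(drift_free_probabilities(trial_p)) == 1
```

Using `Fraction` rather than floating point keeps the check exact, so the comparison with 5/16 and 3/16 involves no rounding.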
6.2
Probability Functions with Positive Drift (µ > 0)
We want to find expressions similar to (30)-(33) that relax the assumption of zero drift. The process for constructing the analogue to (30) is the same. That is, there are still the same four ways S1 could be the maximum. However, the second, third, and fourth ways will have a different form. Three problems need to be solved, corresponding to cases 2, 3, and 4 in the derivation of (30):
1. The probability the walk goes down down up, with the two down steps being larger than the up step.
2. The probability the walk goes down up down, with the first down step being larger than the up step.
3. The probability the walk goes down up up, with the down step being larger than the two up steps.

After finding 1, 2, and 3, these values can be substituted in (30) to get the new P1. Similarly, substituting these values (or their complements) into the old expressions for the other Pi will give the new probability functions. We begin with case 2.

6.2.1
Case 2
The task here is to find the probability of the walk going down, up, down, with the first down step being larger than the up step.

Keeping with the same notation, let the probability of an up jump be p = F(µ). Let A be the event the walk goes down, then up, then down. Let B be the event that the down step is larger than the up step. Then

$$
P(AB) = P(A) \cdot P(B \mid A) = p(1-p)^2\, P(B \mid A).
\tag{37}
$$

So we must solve for P(B|A), the probability that the down jump is larger than the up jump, given that the walk went down then up on those first two steps. Let

$$
X = \text{size of the up jump}, \qquad Y = \text{size of the down jump}.
\tag{38}
$$
Figure 6: Distribution of How Sampling is Done
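First, make X and Y into densities by scaling the original N(0, s^2) by the appropriate factor. The density of X is given by

$$
\tilde f_X(x) = \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2s^2}x^2} \cdot \frac{1}{F(\mu)}, \qquad -\mu \le x < \infty.
\tag{39}
$$

The density of Y is given by

$$
\tilde f_Y(y) = \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2s^2}y^2} \cdot \frac{1}{1 - F(\mu)}, \qquad \mu \le y < \infty.
\tag{40}
$$

Since we are concerned with |X + µ| and |Y - µ|, shift the densities so the support is [0, ∞). So the density of X becomes

$$
f_X(x) = \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{s}\right)^2} \cdot \frac{1}{F(\mu)}, \qquad x \ge 0.
\tag{41}
$$

The density of Y is now given by

$$
f_Y(y) = \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{y+\mu}{s}\right)^2} \cdot \frac{1}{1 - F(\mu)}, \qquad y \ge 0.
\tag{42}
$$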
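Using (41) and (42), compute P(X < Y):

$$
\begin{aligned}
P(X < Y) &= \iint_R f_{X,Y}(x, y)\, dA, \qquad R = \{(x, y) : x \ge 0,\ y > x\} \\
&= \iint_R f_X(x)\, f_Y(y)\, dA \\
&= \int_{x=0}^{x\to\infty} \int_{y=x}^{y\to\infty} \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{s}\right)^2} \frac{1}{F(\mu)} \cdot \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{y+\mu}{s}\right)^2} \frac{1}{1 - F(\mu)}\, dy\, dx \\
&= \int_{x=0}^{x\to\infty} \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{s}\right)^2} \frac{1}{F(\mu)} \int_{y=x}^{y\to\infty} \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{y+\mu}{s}\right)^2} \cdot \frac{1}{1 - F(\mu)}\, dy\, dx.
\end{aligned}
\tag{43}
$$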
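The inner integral involves a N(-µ, s^2), which can be shifted and scaled to get a N(0, 1).

$$
\begin{aligned}
&\int_{x=0}^{x\to\infty} \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{s}\right)^2} \cdot \frac{1}{F(\mu)} \cdot \frac{1}{1 - F(\mu)} \left[1 - F\!\left(\frac{x+\mu}{s}\right)\right] dx \\
&\quad = \frac{1}{F(\mu)(1 - F(\mu))} \int_0^\infty \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{s}\right)^2} \cdot \left[1 - F\!\left(\frac{x+\mu}{s}\right)\right] dx \\
&\quad = \frac{1}{F(\mu)(1 - F(\mu))} \left[ \int_0^\infty \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{s}\right)^2} dx - \int_0^\infty \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{s}\right)^2} \cdot F\!\left(\frac{x+\mu}{s}\right) dx \right].
\end{aligned}
\tag{44}
$$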
The first integral can be shifted and scaled. The second we do not solve explicitly:

$$
\frac{1}{F(\mu)(1 - F(\mu))} \left[ F(\mu/s) - \int_0^\infty \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{s}\right)^2} \cdot F\!\left(\frac{x+\mu}{s}\right) dx \right].
\tag{45}
$$

The expression in (45) can be substituted into (37) for P(B|A). As a sanity check on (45), plug in µ = 0 and s = 1. Then (45) reduces to

$$
\frac{1}{(1/2)(1 - 1/2)} \left[ F(0) - \int_0^\infty f(x)\, F(x)\, dx \right] = 4\,(1/2 - 3/8) = 1/2,
\tag{46}
$$

where f is the standard normal density. This is exactly what is expected if µ = 0, since the up jump and down jump come from the same distribution.

6.2.2
Cases 1 and 3
I outline the solution for cases 1 and 3, but do not solve explicitly for them. This is because the next section gives the full approximate probability functions through simulation, which are much more enlightening than the algebraic derivations.

For case 1, we want to solve P(Y + Y > X), where Y + Y denotes the sum of two independent down jumps. First, use convolution to find the density of Y + Y, then set up an integral as in the previous case.

For case 3, we want to solve P(Y > X + X), where X + X denotes the sum of two independent up jumps. Again, use convolution to find the density of X + X, then set up an integral.

6.2.3
Simulated Probability Functions
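Before turning to the plots, the Case 2 expression can be checked numerically. The sketch below compares a trapezoidal evaluation of (45) with a direct Monte Carlo estimate of P(B|A). It rests on assumptions that are mine, not the paper's: F is read as the standard normal CDF applied to µ/s, the infinite integral is truncated at |µ| + 10s, and all function names are hypothetical.

```python
import math
import random


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def case2_closed_form(mu, s, grid=20000):
    """Evaluate (45) with the trapezoidal rule on [0, |mu| + 10 s]."""
    p_up = std_normal_cdf(mu / s)          # probability of an up jump
    upper = abs(mu) + 10.0 * s             # truncation point (assumption)
    h = upper / grid

    def integrand(x):
        return std_normal_pdf((x - mu) / s) / s * std_normal_cdf((x + mu) / s)

    total = 0.5 * (integrand(0.0) + integrand(upper))
    for i in range(1, grid):
        total += integrand(i * h)
    return (p_up - total * h) / (p_up * (1.0 - p_up))


def case2_monte_carlo(mu, s, draws=200_000, seed=0):
    """Estimate P(down step > up step | first step down, second step up)."""
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(draws):
        first = mu + rng.gauss(0.0, s)
        second = mu + rng.gauss(0.0, s)
        if first < 0.0 < second:           # keep only down-then-up paths
            kept += 1
            if -first > second:            # down step larger in magnitude
                hits += 1
    return hits / kept


print(case2_closed_form(0.0, 1.0))         # should be close to 1/2, as in (46)
print(case2_closed_form(0.5, 1.0), case2_monte_carlo(0.5, 1.0))
```

The same rejection-sampling idea extends to cases 1 and 3 by drawing three steps and comparing sums, which avoids the convolution integrals entirely.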
Figure 7 plots the probability functions P1, P2, P3, and P4 for various levels of s. All plots have µ along the x-axis. Note that for each plot in Figure 7, when µ = 0, the probability functions coincide with the numbers from section 6.1: P1 = 5/16 = P4 and P2 = 3/16 = P3. The plots include negative values of drift in order to cover players who may be best at the start of college and then steadily decline throughout their college career.

The top left plot is for a player with s = 0.5. This player has low variation above and beyond his usual drift from season to season. The solid line in the top left plot shows that if this player has µ > 1, it is very likely his skills will be highest after senior year. However, if the player’s drift is between 0 and 1/2, the solid line is much lower, and so the probability his skill level is highest before senior year is more substantial (the sum of the heights of the three non-solid lines). The bottom right plot is for a player with high volatility in his skills above and beyond the season-to-season drift (s = 3). The solid line is still monotonic, yet increases much more slowly. The probability that a high volatility player is best after senior year is less than the probability that a low volatility player is best after senior year for all µ > 0.

Figure 7: Probability Functions
The solid line is P4, the dotted line is P3, the dashed line is P2, and the dotted-dashed line is P1. The top left plot is for s = 0.5. Top right is s = 1. Bottom left is s = 2. Bottom right is s = 3.
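Curves of the kind described for Figure 7 can be approximated with a short Monte Carlo routine. This is a minimal sketch under assumptions of my own, not the procedure used to produce the figure: the walk is taken to be S_{n+1} = S_n + µ + e with e ~ N(0, s^2), freshman-year skill is normalized to zero, and the trial count and seed are arbitrary.

```python
import random


def simulate_peak_probabilities(mu, s, trials=40_000, seed=0):
    """Estimate (P1, P2, P3, P4): the chance that skill peaks after each year."""
    rng = random.Random(seed)
    counts = [0, 0, 0, 0]
    for _ in range(trials):
        level = 0.0                         # skill after freshman year (normalized)
        best_level, best_year = level, 0
        for year in (1, 2, 3):              # sophomore, junior, senior
            level += mu + rng.gauss(0.0, s)
            if level > best_level:
                best_level, best_year = level, year
        counts[best_year] += 1
    return [c / trials for c in counts]


# A coarse table of the four probability functions for low and high volatility.
for s in (0.5, 3.0):
    for mu in (-1.0, 0.0, 0.5, 1.5):
        p1, p2, p3, p4 = simulate_peak_probabilities(mu, s)
        print(f"s={s:<4} mu={mu:>5}: P1={p1:.3f} P2={p2:.3f} P3={p3:.3f} P4={p4:.3f}")
```

At µ = 0 the estimates should land near 5/16, 3/16, 3/16, and 5/16 for any s, which gives a quick consistency check against section 6.1.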
6.3
The Stochastic Model as a Predictive Model
From Figure 7, we could give an estimate as to when the player is most likely to have the highest skills, based on his s and µ. To use Figure 7 as a predictor, we would like to measure a player’s µ and s. I do not pursue this idea here; rather, I suggest it as potential future work. One way is to look at the player’s high school statistics or the player’s game-by-game statistics in his first year of college.

For each year in high school, say, use statistics like the player’s points, rebounds, etc. to construct a proxy for skill level that year. Then for each of the four years, we would have a number that corresponds to skill level. Plot these four skill levels versus years 1, 2, 3, and 4. The best-fit line through these points could give an estimate for µ and s: the slope of the best-fit line would be µ̂ and the sum of squared residuals would be ŝ. For each player, we could construct a µ̂ and ŝ and compare across players. So, for example, an ŝ in the first quartile of all players’ ŝ values would classify that player as a low volatility player, whereas an ŝ in the fourth quartile would classify that player as a high volatility player. The same could be done for the drift values. This would allow us to compare players and predict when a player’s skill level would most likely be highest, relative to other players.
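The best-fit-line idea can be sketched in a few lines. The skill proxies below are invented purely for illustration, and the function name is hypothetical; the calculation is ordinary least squares of skill on year, returning the slope as the drift estimate and the sum of squared residuals as the volatility measure described above.

```python
def fit_drift_and_volatility(skill_by_year):
    """Least-squares line through (year, skill); returns (slope, sum of squared residuals)."""
    n = len(skill_by_year)
    years = list(range(1, n + 1))
    mean_x = sum(years) / n
    mean_y = sum(skill_by_year) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, skill_by_year))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, skill_by_year))
    return slope, ssr


# Hypothetical skill proxies for four high school seasons (made-up numbers).
mu_hat, s_hat = fit_drift_and_volatility([10.0, 12.5, 13.0, 16.5])
print(mu_hat, s_hat)
```

Ranking players by quartile would then only require computing these two numbers for every player in the sample and sorting.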
6.4
Threshold Idea
Section 5 determined that a player can maximize expected career earnings by entering the draft when his skills are the best (i.e., by maximizing S(y)). Section 6.2 showed that if the player has a positive drift, then he is most likely to be best after his senior year, regardless of the actual value of the drift and regardless of the volatility in e. Thus, if a player has positive drift, which I suspect almost all do, we should not expect players to enter the draft before senior year. So why is this not the case? To reconcile this, I suggest that players do indeed want to maximize their success by strategically entering the draft when their skills are best; however, if a player is very talented he may already be good enough to go pro. That is, perhaps there is some “skill threshold” players want to reach before going pro. If they attain this threshold before senior year, then they enter the draft before senior year. For example, a player might reach the threshold after his team wins the national championship, after he receives a prestigious award, or after he averages more than a certain number of points in a season. A player coming off a big year receives widespread attention, which puts him in the spotlight and may make him more likely to go pro. The player may feel like his chances of being drafted are especially good, despite the fact that his skills may indeed improve if he stays in college for an additional year.
7
The Optimal Stopping Problem
The decision to stop playing college basketball and go pro is a stopping problem. Since a player wants to optimize earnings with this decision, it is an optimal stopping problem. Figure 8 shows how, after completing each year in college, a player must choose whether to stop (S) and try to go pro or continue (C) playing in college. Stopping after year i leads to career earnings of e_i.

Figure 8: College Basketball Player’s Decision Tree
Figure 8 captures the player’s dilemma, but it simplifies the problem because it doesn’t consider the probability a player is drafted to the NBA. Not all early entrants are drafted, and so this approach should incorporate the chance of being drafted.[13] Consider the finite state Markov chain in Figure 9 with state space S = {0, 1, 2, 3, 4, d1, d2, d3, d4, NBA, ND}, where
(i) State 0 means the player has 0 years of college experience, 1 means the player has 1 year of college experience, etc.
(ii) dj = declare for the draft with j years of experience, j = 1, 2, 3, 4.
(iii) NBA = player was drafted to the NBA. ND = player was not drafted to the NBA (both absorbing).

Figure 9: NBA Player Markov Chain

[13] Using my data set of 856 players from 1989 to 2012 who opted for early entry, 35.5% went undrafted. Arel and Tomas (2012) find that, from 2006 to 2010, 38% went undrafted.
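The stochastic matrix is

$$
P = \left[\begin{array}{ccccccccccc}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & p_2 & 0 & 0 & q_2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & p_3 & 0 & 0 & q_3 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & p_4 & 0 & 0 & q_4 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & p_1' & q_1' \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & p_2' & q_2' \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & p_3' & q_3' \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & p_4' & q_4' \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{array}\right].
$$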
P has dimension 11 by 11, where the states run across the top and down the side in the order 0, 1, 2, 3, 4, d1, d2, d3, d4, NBA, ND. So, for example, the (1, 2) entry of P gives the probability of moving from state 0 to state 1, which is 1 in Figure 9 since a player must be at least a freshman to declare for the draft. The kth step stochastic matrix (k ≥ 6) has zeros in its first nine columns; its last two columns (NBA and ND) are

$$
P^k = \left[\begin{array}{ccccc}
0 & \cdots & 0 & q_2 p_1' + p_2 q_3 p_2' + p_2 p_3 q_4 p_3' + p_2 p_3 p_4 p_4' & q_2 q_1' + p_2 q_3 q_2' + p_2 p_3 q_4 q_3' + p_2 p_3 p_4 q_4' \\
0 & \cdots & 0 & q_2 p_1' + p_2 q_3 p_2' + p_2 p_3 q_4 p_3' + p_2 p_3 p_4 p_4' & q_2 q_1' + p_2 q_3 q_2' + p_2 p_3 q_4 q_3' + p_2 p_3 p_4 q_4' \\
0 & \cdots & 0 & q_3 p_2' + p_3 q_4 p_3' + p_3 p_4 p_4' & q_3 q_2' + p_3 q_4 q_3' + p_3 p_4 q_4' \\
0 & \cdots & 0 & q_4 p_3' + p_4 p_4' & q_4 q_3' + p_4 q_4' \\
0 & \cdots & 0 & p_4' & q_4' \\
0 & \cdots & 0 & p_1' & q_1' \\
0 & \cdots & 0 & p_2' & q_2' \\
0 & \cdots & 0 & p_3' & q_3' \\
0 & \cdots & 0 & p_4' & q_4' \\
0 & \cdots & 0 & 1 & 0 \\
0 & \cdots & 0 & 0 & 1
\end{array}\right]
$$
The entry of concern in this matrix is the (2, 10) entry, which represents the probability of going from state 1 to state NBA (the (1, 10) entry, for state 0, is identical):

$$
P^k(1, \mathrm{NBA}) = q_2 p_1' + p_2 q_3 p_2' + p_2 p_3 q_4 p_3' + p_2 p_3 p_4 p_4', \qquad (k \ge 6).
\tag{47}
$$
That is, (47) is the probability that a player who just finished freshman year will be in the NBA by the time he graduates. $P^6(1, \mathrm{NBA})$ is the probability he makes it to the NBA after freshman year ($q_2 p_1'$) plus the probability that he makes it to the NBA after sophomore year ($p_2 q_3 p_2'$) plus the probability that he makes it after junior year ($p_2 p_3 q_4 p_3'$) plus the probability that he makes it after senior year ($p_2 p_3 p_4 p_4'$). Hence $P^6(1, \mathrm{NBA})$ is the probability that he eventually makes it to the NBA. Clearly, a player would like this probability to be as high as possible. Since $p_i + q_i = 1$ for i = 2, 3, 4, substitute $q_i = 1 - p_i$ in (47) to get

$$
P^6(1, \mathrm{NBA}) = (1 - p_2)\,p_1' + p_2 (1 - p_3)\,p_2' + p_2 p_3 (1 - p_4)\,p_3' + p_2 p_3 p_4\, p_4'.
\tag{48}
$$
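The absorption probability in (48) can be confirmed by building the chain and raising the matrix to the sixth power. The sketch below is illustrative only. The stay probabilities p2, p3, p4 are invented, the draft probabilities are the rough values the paper reports for early entrants (with 0.3 used for seniors), and I assume q_i' = 1 - p_i'. All function names are hypothetical.

```python
STATES = ["0", "1", "2", "3", "4", "d1", "d2", "d3", "d4", "NBA", "ND"]


def build_transition_matrix(stay, drafted):
    """stay = {2: p2, 3: p3, 4: p4}; drafted = {1: p1', 2: p2', 3: p3', 4: p4'}."""
    idx = {name: i for i, name in enumerate(STATES)}
    P = [[0.0] * len(STATES) for _ in STATES]
    P[idx["0"]][idx["1"]] = 1.0                      # must play a freshman year
    for year in (1, 2, 3):
        P[idx[str(year)]][idx[str(year + 1)]] = stay[year + 1]
        P[idx[str(year)]][idx[f"d{year}"]] = 1.0 - stay[year + 1]
    P[idx["4"]][idx["d4"]] = 1.0                     # seniors always declare
    for j in (1, 2, 3, 4):
        P[idx[f"d{j}"]][idx["NBA"]] = drafted[j]
        P[idx[f"d{j}"]][idx["ND"]] = 1.0 - drafted[j]  # assumes q_j' = 1 - p_j'
    P[idx["NBA"]][idx["NBA"]] = 1.0                  # absorbing states
    P[idx["ND"]][idx["ND"]] = 1.0
    return P


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def mat_pow(P, k):
    result = P
    for _ in range(k - 1):
        result = mat_mul(result, P)
    return result


def eventual_nba_probability(stay, drafted):
    """Right-hand side of (48)."""
    p2, p3, p4 = stay[2], stay[3], stay[4]
    return ((1 - p2) * drafted[1] + p2 * (1 - p3) * drafted[2]
            + p2 * p3 * (1 - p4) * drafted[3] + p2 * p3 * p4 * drafted[4])


stay = {2: 0.9, 3: 0.8, 4: 0.7}                      # hypothetical values
drafted = {1: 0.78, 2: 0.66, 3: 0.54, 4: 0.30}
P6 = mat_pow(build_transition_matrix(stay, drafted), 6)
print(P6[1][9], eventual_nba_probability(stay, drafted))
```

The two printed numbers should agree up to floating-point rounding, and raising P to any higher power leaves the matrix unchanged because every path has been absorbed by step six.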
We can use data from the empirical section to estimate $P^6(1, \mathrm{NBA})$. Getting estimates for $p_i'$ is simple: consider all players who applied to the NBA draft after i years of college and look at how many were drafted into the NBA.[14] This is a good first approximation for $p_i'$. The $p_i$ are more difficult to estimate. This would involve looking at all players who finished i years of college basketball and seeing how many declared for the draft versus how many stayed for another year. Trying to estimate all the college players who did not apply for the draft is challenging since there are many colleges in the nation with many college players on each team. To solve this issue, consider $p_2$ as a function of $p_1'$. That is, $p_2 = p_2(p_1')$. Recall, $p_2$ is the probability that a rising sophomore remains in college for a second year and $p_1'$ is the probability that a rising sophomore makes it to the NBA after applying for the draft. It is reasonable to assume

$$
\frac{dp_2}{dp_1'} < 0.
\tag{49}
$$

[14] Using my data set of 856 domestic players from 1989-2012 who remained early-entry candidates, 78% of freshmen were drafted, 66% of sophomores were drafted, and 54% of juniors were drafted. Also, 81% of high school seniors who applied were drafted. $p_4'$ is difficult to estimate, but is likely to be well below 50%.
That is, as the probability of being drafted increases, the player will be less and less likely to want to remain in college for a second year. We can imagine that $p_2(0) \approx 1$ since the player has no chance of being drafted and so it is likely he will remain in college. Also, $p_2(1) \approx 0$ since the player will definitely be drafted and so he forgoes his second year to apply for the draft. This reasoning makes the analysis of this optimal stopping section more appropriate for an average college player. A very talented player may still be reluctant to try for the draft just because he thinks he will be drafted. The talented player would probably strive for a high draft position, whereas the average player would be happy to be drafted at all. The last thing to notice about $p_2(p_1')$ is its second derivative:

$$
\text{risk neutral} \;\Rightarrow\; \frac{d^2 p_2}{dp_1'^{\,2}} = 0, \qquad
\text{risk averse} \;\Rightarrow\; \frac{d^2 p_2}{dp_1'^{\,2}} < 0, \qquad
\text{risk loving} \;\Rightarrow\; \frac{d^2 p_2}{dp_1'^{\,2}} > 0.
$$
These three properties are summarized by Figure 10. To work with explicit equations, I assume the following functional form of $p_2(p_1')$:

$$
p_2(p_1') = 1 - (p_1')^k, \qquad k > 0.
\tag{50}
$$
Figure 10: Risk Preferences of a Player

If the player is risk loving, k < 1. If the player is risk neutral, k = 1. If the player is risk averse, k > 1. The argument above for reasoning that $p_2$ is a function of $p_1'$ can be used to conclude that $p_3$ is a function of $p_2'$ and $p_4$ is a function of $p_3'$. These other two functions behave exactly the same. Hence (48) becomes

$$
\begin{aligned}
f(k) ={}& \big[1 - (1 - (p_1')^k)\big]\, p_1' + (1 - (p_1')^k)\big[1 - (1 - (p_2')^k)\big]\, p_2' \\
&+ (1 - (p_1')^k)(1 - (p_2')^k)\big[1 - (1 - (p_3')^k)\big]\, p_3' \\
&+ (1 - (p_1')^k)(1 - (p_2')^k)(1 - (p_3')^k)\, p_4'
\end{aligned}
\tag{51}
$$

which, after some algebra, simplifies to

$$
\begin{aligned}
f(k) ={}& p_4' + (p_1' - p_4')(p_1')^k + (p_2' - p_4')(p_2')^k + (p_3' - p_4')(p_3')^k \\
&+ (p_4' - p_2')(p_1' p_2')^k + (p_4' - p_3')(p_1' p_3')^k + (p_4' - p_3')(p_2' p_3')^k \\
&+ (p_3' - p_4')(p_1' p_2' p_3')^k
\end{aligned}
\tag{52}
$$
f(k) in (52) can be thought of as the chance of making it to the NBA eventually, where k is a measure of the player’s risk averseness. Regardless of the values of $p_1'$, $p_2'$, $p_3'$, and $p_4'$, $f(0) = p_1'$ and $\lim_{k \to \infty} f(k) = p_4'$. The data in footnote 14 showed $p_1' \approx 0.78 > p_2' \approx 0.66 > p_3' \approx 0.54 > p_4'$. Using these values in (52) and taking $p_4' = 0.3$ (without loss of generality), Figure 11 depicts f(k). Unlike the first optimal entry model, where we assumed the player was talented (i.e., would be drafted if he were to apply to the draft), a not so talented player cannot be as picky about when he applies to the draft. The average player may have only one shot throughout his college career to make a feasible attempt to go pro. Therefore, if an average player wants to optimize his chances of making it to the NBA, he should be as risk loving as possible. If he thinks he has any shot of making it to the NBA after completing any year of college, then he should drastically reduce the probability he stays in college for another year and strongly consider applying for the NBA draft.

Figure 11: Chances of Making it to the NBA
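The shape described for Figure 11 can be tabulated directly. The sketch below evaluates both the nested form (51) and the expanded form (52) with the draft probabilities quoted in the text (0.78, 0.66, 0.54, and the illustrative 0.3 for seniors). The function names are hypothetical, and the k grid is arbitrary.

```python
def f_nested(k, draft_probs):
    """Equation (51): (48) with p2, p3, p4 replaced by 1 - (p_i')^k."""
    d1, d2, d3, d4 = draft_probs
    a, b, c = d1**k, d2**k, d3**k          # probabilities of leaving after years 1, 2, 3
    return (a * d1 + (1 - a) * b * d2 + (1 - a) * (1 - b) * c * d3
            + (1 - a) * (1 - b) * (1 - c) * d4)


def f_expanded(k, draft_probs):
    """Equation (52): the same function after multiplying out."""
    d1, d2, d3, d4 = draft_probs
    return (d4
            + (d1 - d4) * d1**k + (d2 - d4) * d2**k + (d3 - d4) * d3**k
            + (d4 - d2) * (d1 * d2)**k + (d4 - d3) * (d1 * d3)**k
            + (d4 - d3) * (d2 * d3)**k
            + (d3 - d4) * (d1 * d2 * d3)**k)


DRAFT_PROBS = (0.78, 0.66, 0.54, 0.30)

for k in (0.0, 0.25, 0.5, 1.0, 2.0, 5.0, 20.0):
    print(f"k={k:>5}: f(k)={f_nested(k, DRAFT_PROBS):.4f}")
```

With these inputs f starts at 0.78 when k = 0 and falls toward 0.3 as k grows, which is the sense in which a more risk-loving (smaller k) average player has the better chance of eventually being drafted.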
7.1
Secretary Problem Approach
The classic Secretary Problem from optimal stopping theory can provide another template for when a college basketball player should go pro. The secretary problem is as follows: An employer is looking to hire a single secretary from an applicant pool of n applicants. The applicants can be ranked from best to worst (i.e., there is a best applicant, a second best, etc.). The employer does not know the talent of an applicant until that applicant is interviewed. The employer interviews applicants one by one hoping to find the best applicant. After an interview, the employer can hire that applicant and never get to see the remaining applicants, or reject that applicant forever. The optimal solution involves automatically rejecting the first n/e applicants immediately after the interview, then hiring the first applicant who is better than all of those interviewed so far. If no such applicant exists, the employer hires the last applicant interviewed. This strategy can be shown to be successful in hiring the best applicant in the pool with probability 1/e, as n → ∞.

This solution method can be applied to a college basketball player’s decision about when to make the jump to the NBA. A player can opt to go pro after any one of four years. But after electing to go pro, he cannot try to go pro in later years because he has relinquished his college eligibility. At the end of each year of college basketball the player assesses his “offer,” which can be thought of as his chance of going pro, his potential earnings, and projected overall initial success in the NBA. This is analogous to interviewing an applicant. The player does not necessarily know if his skill level will be higher or lower the following year, or if his offer will be better or worse. The secretary problem solution implies the player should reject the first n/e offers automatically. In the case of the player, n/e = 4/e ≈ 1.47 offers. Therefore, the player should not go pro after freshman year. Instead, he should go pro in the subsequent year that gives him a better offer than the offer he received his freshman year. In the case that his offers after sophomore year and junior year are worse than after freshman year, the player should go pro after senior year.

Using this approach, it can be shown that the player successfully accepts the best offer 46% of the time. To see why, rank the four offers as 1, 2, 3, and 4, where 1 corresponds to the best offer and 4 is the worst. Since there are 24 permutations, all equally likely, directly count the number of cases for which this secretary solution successfully has the player select 1. For example, consider the order 2, 3, 1, 4. This means that the player will receive the best offer following his junior year, second best offer after freshman year, third best offer after sophomore year, and worst offer after senior year.
The solution says to reject the offer after freshman year, then select the first offer that is better than the offer freshman year. Since the offer after sophomore year is worse, it is rejected. Since the offer after junior year is better than the offer after freshman year, the player accepts this offer. He never plays a senior year of basketball and never sees what his offer after senior year would have been. Therefore this method was successful in picking out the best offer. In the 24 cases, the best offer is picked in 11 of them (≈ 46%). In seven cases, the second-best offer is picked, in four cases the third-best offer is picked, and in only 2 cases (1, 2, 3, 4 and 1, 3, 2, 4) the worst offer is picked.

The above example is encouraging since the player would have been rather unlucky if he remained for his senior year. The player of course does not know if his offer after senior year will be better than after his junior year. Coming off a great junior year he may have been tempted to play senior year, thinking that he would get an even better offer after senior year. The method above results in the player stopping at the optimal time.

7.1.1
Nondeterministic Secretary Problem
A final thought introduces more randomness into the player’s decision about when to go pro. In the previous section it was assumed that the number of applicants n was known. However, this does not have to be the case. Letting N be a random variable denoting the number of applicants, the employer would have to find a new optimal stopping rule to optimize the probability of selecting the best applicant (see Presman & Sonin 1973). Likewise for the player, n = 4 is not always the case. The player, for example, may suffer a long-term injury junior year which ends his college career. In that case N took on the value 2, since he only saw two offers, both of which he rejected. Using N also makes sense in the following way: suppose in the previous section we defined "offer" as the event in which the probability of being drafted once a player enters the draft is nonzero. Then the player may not have offers every year. That is, there may be some years where he simply may not be talented enough, in which case if he applied to the draft he would almost certainly not be drafted. Thus N would denote the number of years in which he has a nonzero probability of being drafted.
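For reference, the fixed-n counts quoted in Section 7.1 can be reproduced by brute-force enumeration. The sketch below is an illustration with hypothetical function names. It takes the number of offers n and the number of automatically rejected offers as parameters, so it can be rerun with a smaller n as a crude stand-in for a shortened career; it does not implement the random-N stopping rule of Presman and Sonin.

```python
from collections import Counter
from itertools import permutations


def chosen_rank(order, skip=1):
    """Rank (1 = best) of the offer accepted under the secretary rule.

    order: ranks of the offers in the order they arrive.
    skip:  number of initial offers rejected automatically (must be >= 1).
    """
    benchmark = min(order[:skip])          # best offer seen during the skip phase
    for rank in order[skip:]:
        if rank < benchmark:               # first offer better than the benchmark
            return rank
    return order[-1]                       # otherwise forced to take the last offer


def outcome_counts(n=4, skip=1):
    """Count how often each rank is accepted across all n! orderings."""
    return Counter(chosen_rank(order, skip) for order in permutations(range(1, n + 1)))


counts = outcome_counts()
print(dict(sorted(counts.items())))        # {1: 11, 2: 7, 3: 4, 4: 2}
print(counts[1] / 24)                      # share of orderings in which the best offer is taken
```

The example ordering 2, 3, 1, 4 from the text corresponds to `chosen_rank((2, 3, 1, 4))`, which returns 1.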
8
Conclusion
This paper analyzes when a player should go pro through both empirical and theoretical frameworks. The empirical section estimates returns to schooling by comparing two groups of players: high school seniors and freshmen drafted from 1989 to 2005 versus freshmen drafted from 2006 to 2012. A new law made 2005 the last year that players could go directly into the NBA out of high school, so the latter group contains individuals who were forced to complete an additional year of school. Comparing the earnings of the two groups shows that the “post-law” group earned $580,000 more during their first year in the NBA. However, the regressions, which controlled for ability, experience, and draft position, found no evidence to support a significant difference in earnings. The theoretical section showed that players can optimize their career earnings by going pro when their skills are highest. Players who are especially talented should go pro as soon as they attain some minimum skill threshold, despite the possibility that their skills may be higher in later years. Players who are average should go pro if they receive any positive signals that indicate they have a chance of being drafted. That is, they should have risk loving preferences.
Appendix

Here I generalize Section 5 to include the probability of the player going undrafted. This is mainly to show the robustness of the analysis in Sections 5-7 and also to allow for “average players” to be defined more broadly. Let $P_H$, $P_L$ be defined as they were previously and let $P_U$ be the probability that the player is undrafted. If the player is undrafted, his earnings are zero: $E_H > E_L > 0 = E_U$. Assume $dP_U/dS < 0$ and that $P_H + P_L + P_U = 1$. The optimization problem is

$$
\max_y E(e) = \max_y \Big[ E_H P_H(S(y)) + E_L P_L(S(y)) \Big],
\tag{53}
$$

which becomes

$$
\max_y \Big[ E_H P_H(S(y)) + E_L \Big( 1 - P_H(S(y)) - P_U(S(y)) \Big) \Big].
\tag{54}
$$

Differentiating with respect to y gives

$$
\begin{aligned}
\frac{dE(e)}{dy} &= E_H P_H'(S(y))\, S'(y) - E_L P_H'(S(y))\, S'(y) - E_L P_U'(S(y))\, S'(y) = 0 \\
&\Rightarrow\; S'(y) \Big[ E_H P_H'(S(y)) - E_L \Big( P_H'(S(y)) + P_U'(S(y)) \Big) \Big] = 0,
\end{aligned}
\tag{55}
$$

which holds when $S'(y) = 0$. To see why the term in the bracket cannot be zero, rearrange to get

$$
\frac{E_H}{E_L} = \frac{P_H'(S(y)) + P_U'(S(y))}{P_H'(S(y))} = 1 + \frac{P_U'(S(y))}{P_H'(S(y))}.
\tag{56}
$$
Since $E_H > E_L$, $E_H / E_L > 1$. But the term on the right in (56) is less than one since $P_U'(S) < 0$ and $P_H'(S) > 0$. Therefore, the maximum of S(y) is the only candidate to optimize earnings. Letting $y^*$ be such that $S'(y^*) = 0$ and $S''(y^*) < 0$, let’s verify that this is indeed the maximizer. Differentiating (55),

$$
\begin{aligned}
\left.\frac{d^2 E(e)}{dy^2}\right|_{y = y^*}
&= S'(y^*) \Big[ E_H P_H''(S(y^*))\, S'(y^*) - E_L \Big( P_H''(S(y^*))\, S'(y^*) + P_U''(S(y^*))\, S'(y^*) \Big) \Big] \\
&\quad + S''(y^*) \Big[ E_H P_H'(S(y^*)) - E_L \Big( P_H'(S(y^*)) + P_U'(S(y^*)) \Big) \Big] \\
&= S''(y^*) \Big[ (E_H - E_L)\, P_H'(S(y^*)) - E_L P_U'(S(y^*)) \Big] < 0
\end{aligned}
\tag{57}
$$

since $S''(y^*) < 0$ and the term in brackets is positive. Lastly, check the boundary:

$$
\begin{aligned}
& E(e)\big|_{y = y^*} > E(e)\big|_{y = y_B} \\
\iff\; & E_H P_H(S(y^*)) + E_L \Big( 1 - P_H(S(y^*)) - P_U(S(y^*)) \Big) > E_H P_H(S(y_B)) + E_L \Big( 1 - P_H(S(y_B)) - P_U(S(y_B)) \Big) \\
\iff\; & E_L \Big[ P_H(S(y_B)) - P_H(S(y^*)) + P_U(S(y_B)) - P_U(S(y^*)) \Big] > E_H \Big[ P_H(S(y_B)) - P_H(S(y^*)) \Big]
\end{aligned}
\tag{58}
$$
For (58) to hold, each term in brackets must be negative. To make the term in the right bracket negative, it’s necessary that $S(y^*) > S(y_B)$. The term in the left bracket of (58) is the sum of $P_H(S(y_B)) - P_H(S(y^*)) < 0$ and $P_U(S(y_B)) - P_U(S(y^*)) > 0$. To see how the first difference dominates, recall

$$
P_H + P_L + P_U = 1
\;\Longrightarrow\;
\frac{dP_H}{dS} + \frac{dP_L}{dS} + \frac{dP_U}{dS} = 0
\;\Longrightarrow\;
\frac{dP_H}{dS} > -\frac{dP_U}{dS}
\tag{59}
$$

since we assumed the derivative of $P_L$ is nonzero. Therefore $S(y^*)$ optimizes expected earnings. The player should go pro when his skills are highest.
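The appendix result can be illustrated numerically. Everything in the sketch below is hypothetical and chosen only to satisfy the stated assumptions (P_H increasing in skill, P_U decreasing, E_H > E_L > 0 = E_U): the skill path S(y), the linear probability functions, and the earnings levels are all invented for illustration.

```python
E_HIGH, E_LOW = 10.0, 2.0                  # hypothetical earnings, E_H > E_L > 0


def skill(y):
    """Hypothetical skill path over college years y in [1, 4], peaking at y = 3."""
    return 5.0 - (y - 3.0) ** 2


def prob_high(s):
    return 0.10 + 0.08 * s                 # increasing in skill


def prob_undrafted(s):
    return 0.50 - 0.05 * s                 # decreasing in skill


def expected_earnings(y):
    """Objective (54): E_H * P_H + E_L * (1 - P_H - P_U), with zero earnings if undrafted."""
    s = skill(y)
    p_high = prob_high(s)
    p_low = 1.0 - p_high - prob_undrafted(s)
    return E_HIGH * p_high + E_LOW * p_low


grid = [1.0 + i / 100 for i in range(301)]  # y from 1.00 to 4.00
best_for_earnings = max(grid, key=expected_earnings)
best_for_skill = max(grid, key=skill)
print(best_for_earnings, best_for_skill)
```

Under these made-up functions the year that maximizes expected earnings coincides with the year that maximizes skill, which is the claim of the appendix; other functional forms satisfying the same sign assumptions could be substituted to probe the result.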
Contagion in the Eurozone Sovereign Debt Crisis
Stefano Giulietti, Yale University[1]

Abstract. Since the end of 2009, the Eurozone has faced a severe sovereign debt crisis, which had its roots in Greece and gradually spread to other countries. This paper estimates an autoregressive conditional heteroskedasticity (ARCH) model of credit default swap (CDS) spreads in order to analyze whether contagion effects are identifiable during the crisis; equivalently, whether the financing difficulties faced by several European countries were due to investor panic, herding, or speculation, or to actual fundamental problems. The analysis shows the presence of Greek contagion effects on Spain, Italy, Belgium, France, and Portugal, both through CDS markets and credit rating downgrades. Further contagion is documented among Portugal, Spain, Italy, and France in later stages of the crisis.

Keywords: sovereign debt crisis, credit default swaps, Eurozone, ARCH model

[1] Stefano Giulietti graduated from Yale University in 2013. He wrote this senior essay in economics for Professor Costas Arkolakis. The author would like to thank Professor Arkolakis for his continuous guidance and helpful and stimulating discussions. He would also like to thank Professor Yuichi Kitamura for helping him with the more complex econometric issues he encountered.
1
Introduction
Since the end of 2009, the Eurozone has faced a severe sovereign debt crisis. EU and IMF interventions neither reversed the crisis nor contained it to Greece. On the contrary, the problem spread to several other countries, such as Portugal and Ireland. Even the sovereign debt markets of large economies like Spain’s and Italy’s, and to a much lesser extent France’s, came under pressure. Sovereign debt woes may have spread to other countries when financial markets recognized an effective increase in credit risk, but it may also be the case that Greece infected other debt markets by negatively impacting the market’s assessment of countries whose conditions were not as critical. This paper investigates whether the financing problems faced by some Eurozone countries are disproportionate compared to their fundamentals; that is, whether there have been any contagion effects across countries during the debt crisis. Contagion due to market panic, investor herding, and other factors is identified whenever the correlation coefficients across two countries’ credit default swap markets increase temporarily, together with their volatilities. An ARCH regression model is estimated in order to analyze the size and volatility of cross-market correlations. Credit default swap spreads are used as an indicator of countries’ perceived default risk, as explained in Section 2. The sample includes the credit default swap spreads of seven Eurozone countries over the German benchmark: France, Italy, Austria, Portugal, Spain, Belgium, and the Netherlands. The model investigates contagion caused by spillovers in credit default swap markets and by the effects of credit ratings. The analysis shows evidence that contagion stemming from Greece affected Portugal, Spain, Italy, France, and Belgium during the crisis. The only two countries that were immune to contagion were Austria and the Netherlands. 
Additionally, further instances of contagion are identified from Spain to Italy, from Portugal to Spain and vice versa, and from Italy to France and vice versa. The study generally confirms the findings of Missio and Watzka (2011), who showed that the Portuguese, Spanish, Italian, and Belgian bond markets were affected by spillovers of the Greek crisis, while Austria and the Netherlands were immune to contagion. However, this paper not only expands on previous research by including France in the sample of analyzed countries, but also uses credit default swaps instead of bond yields as a more accurate measure of perceived credit risk.
2
Bonds vs. Credit Default Swaps
Government bond yields and prices of credit default swaps on government bonds are both measures of a country’s credit risk as perceived by financial markets. The following section explains why this paper chooses to rely on credit default swap spreads rather than bond yields as an indicator of sovereign credit risk. A bond buyer is exposed to an interest rate risk and a funding risk, given by the initial outlay of the principal. The need to hedge or speculate on credit risk has given rise to a large market for credit default swaps (CDS), a type of derivative that protects the lender in case of default of a sovereign or corporate bond. A CDS gives a bondholder insurance against pure credit risk: the buyer of the CDS agrees to make periodic payments to the seller, and in return receives a payoff if the underlying security experiences a credit event such as a default. In the case of such a credit event, the CDS buyer is entitled to a payment equal to:

$$
\text{Payment} = B(1 - R)
\tag{1}
$$
where B is the net notional value of the bond and R is the recovery rate. CDS are traded over-the-counter, which makes them highly liquid, unlike fixed income instruments. Moreover, the fact that the position taken through the bond has to be funded by collateral, while the position taken through the CDS is unfunded, implies that derivatives have higher built-in leverage, which can mean that a CDS may be a cheaper instrument than a bond for acquiring exposure to the same credit risk. For these reasons, credit default swaps are used not only for hedging risk, but also for speculation. Liquidity and cheapness make CDS ideal for placing bets on debt instruments. CDS buyers are often speculative investors such as hedge funds, while bond buyers tend to have a longer-term investment perspective; this difference implies that CDS prices are more responsive to changing economic conditions than bond spreads. Hence, CDS markets have emerged as a highly visible indicator of a country’s perceived credit risk. Research on European sovereign debt markets has indeed shown that when explosive trends appeared during the sovereign crisis, the CDS market appeared to have been a driver in most cases. In other words, due to their high liquidity, CDS had a price discovery effect: changes in CDS prices anticipated corresponding changes in bond prices. The price discovery effect was further accentuated by the flight to liquidity that characterized the crisis. Therefore, in order to investigate contagion effects across Eurozone sovereign debt markets, this paper uses CDS prices as a proxy for European countries’ perceived default risk.
3 Identifying Contagion
As defined by Corsetti et al. in Financial Contagion: The Viral Threat to the Wealth of Nations, the term “contagion” (generally used in contrast to “interdependence”) conveys the idea that during financial crises there might be breaks or anomalies in the transmission mechanism among markets, reflecting switches across multiple equilibria, market panics unrelated to fundamentals, investor herding, and so on. Pericoli and Sbracia (2003) give an overview of the most commonly used definitions of contagion. According to the existing literature, contagion can be identified if: (i) the probability of a crisis in one country increases conditional on the probability of a crisis occurring in another country; (ii) volatility of asset prices spills over from a crisis country to other countries; (iii) cross-country co-movements of asset prices cannot be explained by fundamentals alone; (iv) co-movements of prices and quantities across markets
increase conditional on the probability of a crisis occurring in a market or group of markets; and (v) the transmission channel intensifies or changes after a shock in one market. All these definitions of contagion take into account the correlations and volatilities of markets and financial assets and clarify the importance of volatility measures in the study of contagion. Earlier models defined contagion as an increase in cross-market correlations of stock market returns during a crisis. Forbes and Rigobon (1999) argue that this approach is biased, because standard estimates of cross-market correlations will be biased upward when stock market volatility increases, such as during a time of financial turmoil. Indeed, an adverse shock in one country could propagate to other countries by directly affecting their fundamentals through a series of real linkages. For example, one country being hit by a crisis could lead to a decline in asset prices in other countries due to trade or policy coordination. In such cases, the propagation of the crisis would be due not to investor panic, but to an actual increase in interdependence. Only when the propagation mechanism cannot be explained by interdependence does contagion come into play; in such a case, we expect to observe increases in both correlation and volatility (as measured by the standard error of the correlations), which then revert to normal after the shock. The importance of accounting for volatility can be explained by a simple model of correlation, where $x$ and $y$ represent returns in two different stock markets and $\varepsilon$ is an idiosyncratic shock, independent of any aggregate shocks:

$$y = \alpha + \beta x + \varepsilon \tag{2}$$

where $E[\varepsilon] = 0$, $E[x\varepsilon] = 0$ and $|\beta| < 1$. We can divide the hypothetical dataset into two periods: one of relative market stability, where the variance of $x$, denoted $\sigma_x$, will be low ($\sigma_x^l$), and one of financial turmoil, where $\sigma_x$ will be high ($\sigma_x^h$). In a standard OLS regression, the estimator $\beta$ is by definition equal to:

$$\beta = \frac{\sum x_i y_i - \frac{1}{n} \sum x_i \sum y_i}{\sum x_i^2 - \frac{1}{n} \left( \sum x_i \right)^2} = \frac{\mathrm{Cov}[x, y]}{\mathrm{Var}[x]} = \frac{\sigma_{xy}}{\sigma_x} \tag{3}$$

Notwithstanding changes in variances, $\beta^l = \beta^h$, because $E[x\varepsilon] = 0$. We can rewrite the equality as:

$$\beta^l = \beta^h = \frac{\sigma_{xy}^l}{\sigma_x^l} = \frac{\sigma_{xy}^h}{\sigma_x^h} \tag{4}$$

which implies that $\sigma_{xy}^h > \sigma_{xy}^l$, because $\sigma_x^h > \sigma_x^l$. The variance of $y$ is given by:

$$\sigma_y = \beta^2 \sigma_x + \sigma_\varepsilon \tag{5}$$

Since the variance of $\varepsilon$ is constant and $|\beta| < 1$, any increase in the variance of $y$ is less than proportional to the increase in the variance of $x$. With $\sigma_x$ and $\sigma_y$ denoting variances, correlation $\rho$ is defined as:

$$\rho = \frac{\sigma_{xy}}{\sqrt{\sigma_x \sigma_y}} \tag{6}$$

which implies that $\rho^h > \rho^l$. This inequality shows that the correlation between stock market returns is conditional on the variance of the stock market returns; in other words, an increase in the variance of $x$ and $y$ would cause an apparent increase in the correlation coefficient even if the volatility-adjusted coefficient had not actually risen. The above example shows the importance of taking volatility into account when performing a contagion analysis. As Forbes and Rigobon (1999) point out, a permanent and stable increase in cross-market correlations indicates stronger economic interdependence rather than contagion. Indeed, economic integration is a long-term process and does not revert back in short periods of time. Therefore, contagion is identified when correlation coefficients and their standard errors increase during a period of financial turmoil and subsequently return to pre-crisis levels. The main model developed in this paper analyzes how the Greek CDS market and credit ratings affected the CDS markets of other Eurozone countries. In this specific case, contagion is identified whenever the correlation coefficient of Greece with another country increases from one year to the next, coupled with an increase in the standard error (the volatility) of that coefficient over the same time period.
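The volatility bias described above can be checked numerically. The sketch below holds the slope and the variance of the idiosyncratic shock fixed, raises only the variance of x, and computes the implied correlation using the standard definition (covariance divided by the product of the standard deviations). The parameter values (a slope of 0.5, a shock variance of 1, and variances of x equal to 1 and 4) are illustrative assumptions, not estimates from the paper.

```python
import math


def implied_correlation(beta: float, var_x: float, var_eps: float) -> float:
    """Correlation between x and y when y = alpha + beta * x + eps.

    Uses Cov[x, y] = beta * Var[x] and Var[y] = beta^2 * Var[x] + Var[eps].
    """
    cov_xy = beta * var_x
    var_y = beta ** 2 * var_x + var_eps
    return cov_xy / math.sqrt(var_x * var_y)


# Same slope and same shock variance in both regimes; only Var[x] changes.
rho_low = implied_correlation(beta=0.5, var_x=1.0, var_eps=1.0)   # calm period
rho_high = implied_correlation(beta=0.5, var_x=4.0, var_eps=1.0)  # turmoil period

print(f"calm: {rho_low:.4f}, turmoil: {rho_high:.4f}")
```

Although the underlying relationship between the two markets is identical in both regimes, the measured correlation is higher in the high-variance regime, which is the bias Forbes and Rigobon (1999) warn about.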
4 Data
For the analysis of contagion in European sovereign debt markets, this paper uses CDS data from seven countries: France, Italy, Portugal, Spain, the Netherlands, Austria, and Belgium. The sample thus includes both countries whose sovereign debt markets underwent severe pressure and countries relatively unaffected by the crisis. The dataset includes daily closing prices of US dollar-denominated credit default swaps on 10-year government bonds, as calculated by Bloomberg. The prices quoted are for five-day weeks, in order to exclude non-trading days from the analysis, over a period ranging from January 1, 2008 to September 16, 2011. The time period was specifically chosen in order to allow for a comparison of the pre-Euro crisis period (2008 to mid-2009) with a period of high financial distress. The analysis does not include CDS prices after Friday, September 16, 2011, because shortly thereafter Greek Finance Minister Evangelos Venizelos announced the possibility of a 50% debt haircut during a speech to ruling Socialist party lawmakers. European policymakers repeatedly stressed the fact that the haircut would be borne by private-sector creditors but would have to be strictly voluntary, meaning that the write-down would not trigger the $3.7 billion worth of CDS contracts held by Greek banks. The International Swaps and Derivatives Association (ISDA), which regulates the CDS market, stated that a voluntary bond exchange into new debt would not trigger CDS payments, even if there may have been some degree of coercion. Eventually, the ISDA declared in March 2012 that the terms of the 2012 Greek debt restructuring did trigger CDS payouts, but until then it had stated that a voluntary agreement would not have been officially catalogued as a credit event.
The ISDA’s original stance could have thus hindered the value of CDS prices as the best indicators of credit risk; if an effective default on the part of Greece does not trigger debt-insurance payments, CDS prices no longer represent the best indicator of Greece’s perceived sovereign risk.
5 Methodology
German CDS prices are used as a risk-free benchmark in order to obtain an indication of the risk premia of the other countries. CDS prices on 10-year US dollar-denominated bunds are thus subtracted from the CDS prices of the other countries in the sample. The use of CDS spreads over the benchmark, rather than pure CDS prices, allows us to remove parallel economic developments of the Eurozone from the contagion study and to analyze the country-specific risk premium. It is also important to include the German benchmark in the study because of the sheer size of the German economy, the largest within the Eurozone. The exclusion of such an important country may lead to a model that is potentially misspecified and could yield misleading outcomes. CDS spreads are a non-stationary time series and exhibit clustering of volatilities, meaning that the size of price volatility tends to cluster in periods of low volatility and periods of high volatility. Volatility throughout the sample is particularly high, but it does tend to group in periods of relative calm and periods of turmoil. Due to the nature of the data, an autoregressive conditional heteroskedasticity model (ARCH) should be specified, since it can describe the time evolution of the average size of the squared errors; i.e. the evolution in the magnitude of uncertainty. ARCH captures the time dependent nature of the variance by using a short rolling window for estimates; in fact, the variance is forecast as a moving average of past error terms. In a simple ARCH specification, the dependent variable return $r_t$ is given by the mean value $\mu_t$ plus an error term:

$$r_t = \mu_t + \varepsilon_t \tag{7}$$

The error term will have the form:

$$\varepsilon_t = z_t \sqrt{h_t} \tag{8}$$

where $z_t$ is an independent, standard normal variable and $h_t$ is the variance as a function of the moving averages of past error terms:

$$h_t = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 \tag{9}$$

with $\omega$ and $\alpha_i$ being positive constants and $\varepsilon_{t-i}$ being the error terms in the previous periods, at lags $t-i$. Running an ARCH regression with CDS spreads, however, generates a potentially misspecified outcome: the Durbin-Watson statistic and Ljung-Box test for randomness reveal that the residuals for such a model are non-random, meaning that the data are not independently distributed. Correlations identified in the sample could thus be caused by autocorrelation in the data rather than by effective contagion effects. An autoregressive integrated moving average (ARIMA) filtration of the data is thus necessary. ARIMA removes autocorrelation by stationarizing the time series; lags of the series are added to the prediction equation in order to remove autocorrelation from the forecast errors. An ARIMA(p) model is applied to the CDS spread of each country, where p is the number of autoregressive lags. The optimal value of p is different for each country, and is obtained with the Schwarz-Bayesian information criterion. The Schwarz-Bayesian criterion indicates that the optimal number of lags is 1 for the French series, 3 for Italy, 3 for Portugal, 3 for Spain, 1 for the Netherlands, 1 for Austria, 1 for Belgium, and 4 for Greece. These lags are then used in the ARIMA filtration, which specifies the following autoregressive equation:

$$X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t \tag{10}$$

where $c$ is a constant, $\varphi_i$ are the parameters or coefficients of the lags, and $\varepsilon_t$ is white noise. For example, the ARIMA(1) process for France is:

$$X_t = c + \varphi X_{t-1} + \varepsilon_t \tag{11}$$

where $X_t$ is the French CDS spread on day $t$ and $X_{t-1}$ is the CDS spread on the previous day. The residuals of each ARIMA regression are then extracted
and used as variables in the ARCH regression. The model can now be correctly specified, since residuals are random, as confirmed by the Ljung-Box test. Ljung-Box tests the null hypothesis that the data are random against the alternative hypothesis that the data are not random. Since the test gives p-values above 0.05 for all series, the null hypothesis of randomness cannot be rejected, so the residuals pass the test and can be used as ARCH inputs. To capture any contagion effects of Greece on other CDS markets, the dataset is divided by year (2008-2011) and then each country’s residuals are regressed year by year on the Greek residuals in an ARCH(1) model. Rating changes should also be accounted for, given the weight that they have in forming market perceptions of credit risk. For this reason, credit ratings from Standard & Poor’s (which are available on the agency’s website) are included in the analysis. Greek ratings may be related to other countries’ ratings, in the sense that a rating cut in Greece may make a rating cut in another country more likely. For example, it may be rational for investors to expect a Portuguese downgrade after a Greek downgrade if there is interdependence between the two countries. The model thus considers changes in relative credit rating, as defined by a dummy variable indicating the rating spread between the country being analyzed and Greece. The S&P rating scale ranges between C and AAA, which were respectively assigned numerical values of 0 and 20. Every country’s rating thus corresponds to a number between 0 and 20, with AA+ being 19, AA being 18, AA- being 17, and so on. The Greek rating is then subtracted from such a number to obtain a rating spread; because Greece’s rating was the lowest for the whole period being analyzed, all the rating spreads are positive.
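The rating spread described above can be sketched as a simple lookup. The paper states only the endpoints of the scale (C = 0, AAA = 20) and the top few notches; the full ordering in the list below is an assumption that follows the usual S&P long-term notches between those endpoints and leaves out default categories. The ratings in the example are hypothetical and do not correspond to any particular date in the sample.

```python
# Assumed ordering of S&P long-term notches from C (0) up to AAA (20).
SP_SCALE = [
    "C", "CC", "CCC-", "CCC", "CCC+",
    "B-", "B", "B+", "BB-", "BB", "BB+",
    "BBB-", "BBB", "BBB+", "A-", "A", "A+",
    "AA-", "AA", "AA+", "AAA",
]
RATING_SCORE = {rating: score for score, rating in enumerate(SP_SCALE)}


def rating_spread(country_rating: str, greek_rating: str) -> int:
    """Numerical score of the country's rating minus that of Greece."""
    return RATING_SCORE[country_rating] - RATING_SCORE[greek_rating]


# Hypothetical example: a AAA-rated country while Greece is rated BB+.
example_spread = rating_spread("AAA", "BB+")
print(example_spread)
```

Under this construction, a Greek downgrade with no change in the other country's rating widens the spread by the number of notches cut.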
In sum, the original data are modified through a series of statistical procedures in order to obtain the residuals used in the ARCH model: (i) German CDS prices are subtracted from each country’s CDS prices in order to obtain a CDS spread; (ii) an ARIMA regression is performed on the CDS spreads, with a number of lags specified by the Schwarz-Bayesian information criterion; (iii) the residuals of each ARIMA regression are extracted and used in the ARCH(1) model.

Figure 1: CDS Spreads and Greece S&P Downgrades. Series: Greece, France, Italy, Portugal, Belgium, Netherlands, Austria, Spain. Horizontal axis: 1 January 2008 to 1 January 2012. Data: Bloomberg, Standard & Poor’s.

Thus our final model specification is:

$$p_t = \alpha_t + \beta_1 G + \beta_2 R + \varepsilon_t \tag{12}$$

where $p_t$ represents the CDS residuals of a country after the ARIMA filtration, $\alpha_t$ is a constant, $G$ are the Greek residuals and $R$ is the rating spread. The error term is described by equation (8) with:

$$h_t = \omega + \alpha \varepsilon_{t-1}^2 \tag{13}$$

In the cases in which the rating spread is dropped because of collinearity, the term is eliminated and the equation becomes:

$$p_t = \alpha_t + \beta_1 G + \varepsilon_t \tag{14}$$

The regressions that produce a Prob > χ² of less than 0.05 indicate a statistically significant effect of the independent variables on a country’s residuals, with a 95% confidence level. As
previously explained, contagion is identified where the correlation coefficients in the ARCH regression, as well as their standard errors, increase from one year to the next. The analysis shows contagion effects in France, Italy, Portugal, Spain, and Belgium; Austria and the Netherlands, on the other hand, are immune from contagion. This finding does not mean that Greece alone caused the financing difficulties faced by the “infected” countries, but rather that the problems in Greece worsened the potentially existing fundamental problems in other countries. Quantifying the proportion of CDS price increases due to contagion as opposed to fundamentals is, however, beyond the scope of this analysis.
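The data preparation steps in this section can be sketched in a few lines of standard-library Python. This is a simplified illustration under stated assumptions, not the estimation used in the paper: the filter is restricted to a single autoregressive lag fitted by ordinary least squares (the paper selects up to four lags by information criterion), the Ljung-Box statistic is computed without its p-value, the ARCH(1) variance path is evaluated for given parameters rather than estimated, and the first variance is set to the unconditional value ω/(1 − α). All function names and the toy price series are hypothetical.

```python
def spread_over_benchmark(country, benchmark):
    """Step (i): country CDS price minus the benchmark CDS price, day by day."""
    return [c - b for c, b in zip(country, benchmark)]


def ar1_residuals(series):
    """Step (ii), one lag only: OLS residuals of X_t = c + phi * X_{t-1} + e_t."""
    lagged, current = series[:-1], series[1:]
    n = len(current)
    mean_x = sum(lagged) / n
    mean_y = sum(current) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, current))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return [y - (c + phi * x) for x, y in zip(lagged, current)]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_k rho_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


def arch1_variance(errors, omega, alpha):
    """Conditional variance path h_t = omega + alpha * e_{t-1}^2 (needs alpha < 1)."""
    h = [omega / (1.0 - alpha)]  # assumed starting value: unconditional variance
    for previous_error in errors[:-1]:
        h.append(omega + alpha * previous_error ** 2)
    return h


# Toy prices in basis points (hypothetical, not Bloomberg data).
country_cds = [120.0, 125.0, 123.0, 130.0, 142.0, 138.0, 150.0, 149.0]
german_cds = [20.0, 21.0, 20.0, 22.0, 25.0, 24.0, 26.0, 25.0]

spread = spread_over_benchmark(country_cds, german_cds)
residuals = ar1_residuals(spread)
q_stat = ljung_box_q(residuals, max_lag=2)
variance_path = arch1_variance([1.0, -2.0, 0.5], omega=0.1, alpha=0.5)
```

In practice the Q statistic would be compared with a chi-squared distribution, and the regression of each country's filtered residuals on the Greek residuals with ARCH(1) errors would be estimated by maximum likelihood in a statistical package.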
6 Results and Discussion
The model identifies contagion effects stemming from Greece in Portugal, Spain, Italy, Belgium, and France, caused both by turmoil in the Greek CDS market and by Greek downgrades. Spillover of the debt crisis from country to country is driven in most cases partly by fundamentals and partly by contagion. Economic interlinkages often accentuate fundamental problems by acting as transmission channels for economic turmoil. The only two countries in the sample that appear to have been immune from contagion throughout the whole period are Austria and the Netherlands, since both have enjoyed top credit ratings and stable political systems. In other words, the countries that were already under investor scrutiny suffered from spillover effects. The ARCH model also finds evidence for contagion effects of Portugal on Spain, Spain on Italy and vice versa, and Italy on France and vice versa.

Table 1: ARCH Model of Contagion Effects From Greece to Other Eurozone Countries

| Country | Statistic | 2008 | 2009 | 2010 | 2011 |
|---|---|---|---|---|---|
| France | Prob. > χ² | 0.0000*** | 0.0000*** | 0.0006*** | 0.8158 |
| France | CDS coefficient | 0.0297239 | 0.1455889 | 0.0186672 | 0.0039692 |
| France | Std. error | 0.0050456 | 0.0199051 | 0.0048311 | 0.0082143 |
| France | Rating coefficient | collinear | 0.4753365 | -0.1154252 | 0.0833719 |
| France | Std. error | – | 0.636 | 0.6210384 | 0.2259387 |
| Italy | Prob. > χ² | 0.0009*** | 0.5935 | 0.0016** | 0.3306 |
| Italy | CDS coefficient | 0.0751675 | -0.0263292 | 0.0008332 | 0.0219262 |
| Italy | Std. error | 0.0226569 | 0.0306993 | 0.0181181 | 0.0202039 |
| Italy | Rating coefficient | collinear | 0.4595032 | 2.8750550 | 0.4660443 |
| Italy | Std. error | – | 0.8139468 | 0.8219466 | 0.5564207 |
| Portugal | Prob. > χ² | 0.0000*** | 0.0000*** | 0.0025** | 0.8626 |
| Portugal | CDS coefficient | 0.1473455 | 0.1366586 | 0.0601887 | 0.0008259 |
| Portugal | Std. error | 0.0129079 | 0.0288803 | 0.0183478 | 0.0064993 |
| Portugal | Rating coefficient | collinear | 2.3796360 | 1.6276110 | 0.4088153 |
| Portugal | Std. error | – | 0.9277528 | 1.9108080 | 0.8305202 |
| Spain | Prob. > χ² | 0.0000*** | 0.0000*** | 0.0000*** | 0.3346 |
| Spain | CDS coefficient | 0.068791 | 0.0368748 | -0.0470753 | -0.0026164 |
| Spain | Std. error | 0.0054091 | 0.0270421 | 0.0045154 | 0.0105459 |
| Spain | Rating coefficient | collinear | 2.4643810 | 10.4030700 | 0.378626 |
| Spain | Std. error | – | 1.0932330 | 0.0721375 | 0.256944 |
| Netherlands | Prob. > χ² | 0.7611 | 0.5279 | 0.8822 | 0.6111 |
| Netherlands | CDS coefficient | -0.0202776 | 0.0075679 | -0.000664 | -0.01885 |
| Netherlands | Std. error | 0.0667035 | 0.0208615 | 0.0057261 | 0.0370737 |
| Netherlands | Rating coefficient | collinear | -0.3608043 | -0.2928234 | collinear |
| Netherlands | Std. error | – | 0.3390125 | 0.597633 | – |
| Austria | Prob. > χ² | 0.8387 | 0.0791* | 0.1355 | – |
| Austria | CDS coefficient | 0.0148756 | -0.0017699 | -0.0079454 | – |
| Austria | Std. error | 0.0730833 | 0.0204286 | 0.0039799 | – |
| Austria | Rating coefficient | collinear | 1.680653 | -0.0291779 | – |
| Austria | Std. error | – | 0.7464425 | 0.5229457 | – |
| Belgium | Prob. > χ² | 0.3302 | 0.0002*** | 0.9171 | 0.7305 |
| Belgium | CDS coefficient | -0.0532767 | 0.0293775 | 0.0030445 | -0.0195063 |
| Belgium | Std. error | 0.0547142 | 0.012252 | 0.0096197 | 0.0404967 |
| Belgium | Rating coefficient | collinear | 1.192503 | 0.1935683 | 0.3814195 |
| Belgium | Std. error | 0.3677701 | 0.7009665 | 0.7009665 | 0.7651257 |
| Portugal on Spain | Prob. > χ² | 0.0000*** | 0.0000*** | 0.0104 | 0.0025** |
| Portugal on Spain | CDS coefficient | 0.4448857 | 0.2026737 | -0.0766774 | 0.0855804 |
| Portugal on Spain | Std. error | 0.0084142 | 0.0388562 | 0.0299338 | 0.0282931 |
| Italy on France | Prob. > χ² | 0.0000*** | 0.0038** | 0.1247 | 0.0000*** |
| Italy on France | CDS coefficient | 0.3811751 | 0.097476 | 0.0412572 | 0.3039836 |
| Italy on France | Std. error | 0.0207702 | 0.0336625 | 0.0268688 | 0.0269942 |
| Spain on Italy | Prob. > χ² | 0.0000*** | 0.2116 | 0.1107 | 0.0000*** |
| Spain on Italy | CDS coefficient | 0.9434361 | 0.0660258 | 0.0648574 | 0.4389144 |
| Spain on Italy | Std. error | 0.0144626 | 0.052855 | 0.0406627 | 0.045918 |
| France on Italy | Prob. > χ² | 0.0000*** | 0.0494** | 0.0000*** | 0.0000*** |
| France on Italy | CDS coefficient | 1.029702 | 0.1829117 | 0.4946055 | 0.9933403 |
| France on Italy | Std. error | 0.0412852 | 0.0930866 | 0.0975066 | 0.0621817 |
| Spain on Portugal | Prob. > χ² | 0.0000*** | 0.0000*** | 0.1872 | 0.0000*** |
| Spain on Portugal | CDS coefficient | 1.030083 | 0.290055 | -0.041686 | 0.9224701 |
| Spain on Portugal | Std. error | 0.033918 | 0.0401719 | 0.0316087 | 0.1414841 |

*, **, *** indicates statistical significance at the 10%, 5%, and 1% level, respectively.

The volatility of Portugal-Greece correlations, together with their standard error, more than doubled from 2008 to 2009, indicating mounting worries about Portugal’s solvency risk that cannot be explained by fundamentals alone. As of 2010, the trade relations between Portugal and Greece were negligible: exports to Greece constituted only 0.1% of Portuguese GDP. However, Portuguese banks and other financial firms had extended loans to Greece worth 4.2% of Portuguese GDP (Global Trade Information Services). Such large exposure to Greece unsettled the Portuguese financial system, worsening the country’s already weak fundamentals. A poor educational system and a rigid labor market had long hindered growth in Portugal, while low-cost labor in Eastern Europe had diverted foreign direct investment away from the country. The combination of low growth prospects and a budget deficit amounting to 10.1% of GDP in 2009 made Portugal particularly vulnerable to bond market turbulence, until yields reached unsustainable levels and the EU/IMF granted the country a €78 billion emergency loan to overhaul its economy. Hence, the financial interlinkages between Greece and Portugal and Portugal’s large budget deficit paved the way for a contagion effect in 2009. The 2009 contagion effect between Greece and France was due to the French banking sector’s exposure to Greece, rather than to weakness in French fundamentals. In 2009, the correlation coefficient between French residuals and Greek residuals increased by a factor of 4.9, while volatility increased 3.98 times, even though it remained at relatively low levels. The French banking sector’s direct exposure to Greece totaled 2.5% of French GDP as of 2010. Conservative UBS estimates show that net sovereign exposure to Greece amounted to 14% of equity for BPCE, 8% for BNP Paribas, and 6% for Société Générale, the country’s largest banking groups (UBS Equity Research 2011). Société Générale’s business was also negatively affected through its Greek subsidiary Geniki Bank. By contrast, Unicredit and Banco Popolare, Italy’s most exposed banks, had net exposure to Greece totaling 1% of equity. Nonetheless, French fundamentals were unanimously considered very solid; even though the country was running a relatively large budget deficit in 2009, at 7.6% of GDP, there was no uncertainty regarding its solvency. Therefore, the contagion identified by our model must stem from the systemic risk of the French banking system, which held ample exposure to Greek sovereign debt and the Greek economy.
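The identification rule used in this paper (a year-on-year increase in both the CDS coefficient and its standard error) can be expressed as a short check. The sketch below applies it to two 2008-2009 comparisons using the values printed in Table 1; the function name is hypothetical, and the check deliberately ignores statistical significance, which the paper considers separately through the Prob > χ² column.

```python
def both_increase(coef_prev, se_prev, coef_next, se_next):
    """True when the coefficient and its standard error both rise year on year."""
    return coef_next > coef_prev and se_next > se_prev


# France, 2008 -> 2009 (CDS coefficient and standard error from Table 1).
france_flag = both_increase(0.0297239, 0.0050456, 0.1455889, 0.0199051)

# Netherlands, 2008 -> 2009: the coefficient rises but its standard error falls.
netherlands_flag = both_increase(-0.0202776, 0.0667035, 0.0075679, 0.0208615)

# Growth factor of the French coefficient between the two years.
france_coef_ratio = 0.1455889 / 0.0297239

print(france_flag, netherlands_flag, round(france_coef_ratio, 1))
```

The French case satisfies both conditions while the Dutch case does not, in line with the country classification discussed in the text.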
Since the beginning of the Eurozone crisis, French and international banks have recognized the importance of peripherals’ credit risk and have sought to reduce their exposure to Greece by selling off assets at a loss, thus further contributing to the worsening of Greece’s solvency problems. The last country to suffer from contagion in 2009 was Belgium.
Figure 2: Greece-France CDS Spreads. Series: Greece, France. Horizontal axis: 1 January 2008 to 1 January 2012. Data: Bloomberg.

Figure 3: Banks’ Exposure to Greece. Series: Greece, France, Italy, Portugal, Belgium, Netherlands, Austria, Spain. Data: Bloomberg.
Indeed, 2009 is the only year of the dataset in which Greek CDS residuals appear to have a significant effect on Belgian CDS residuals; in 2010 and 2011 the model reverts back to statistical insignificance. Belgium’s small open economy was hit in earnest by the global recession and the decline in world trade, leading to deterioration of the interbank market and lending to firms and households. Its government deficit increased from 1.3% of GDP in 2008 to 5.9% in 2009 (Organization for Economic Cooperation and Development). Investors in Belgian public debt also became preoccupied with the country’s long-running political instability; Belgium had been effectively devoid of a government since 2007, and would remain highly unstable until Elio Di Rupo’s election as Prime Minister in December 2011. These worries materialized in 2009, the only year in which the ARCH model produces a statistically significant output; Prob > χ² then reverts to values greater than 0.05 in 2010 and 2011. In 2009, the Greek downgrade also had a statistically significant effect on the residuals of Belgian CDS spreads, with a coefficient of 1.1925, indicating some mild degree of contagion both across CDS markets and credit ratings. In 2010, contagion spread to Italy and Spain. The correlation coefficient for Italy increased in 2010, but, most importantly, both Italy and Spain underwent financing pressures due to the effect of a Greek rating cut. The downgrade in 2010 had a sizable and statistically significant effect, with a coefficient of 2.8751 for Italy and 10.4031 for Spain. Spain was vulnerable due to its high budget deficit (9.3% of GDP) and ailing economy, which had contracted 3.7% in the previous year. Public debt had risen from 47.4% of GDP in 2008 to 66.1% in 2010, while unemployment rose above 20% in 2010 (OECD). The state of the Spanish economy had been severely worsened by the real estate bubble that began to burst in 2007-2008.
The loss of value of residential properties, coupled with soaring household mortgage debt, had negative repercussions on the banking system. Notwithstanding Spain’s problematic economic fundamentals, it appears that the Greek downgrade still did have a contagion effect on the prices of CDS on bonds. In fact, the interdependence of the Spanish and Greek economies is negligible: Spanish exports to Greece amount to a mere 0.2% of Spain’s GDP, and lending to Greece to 0.1%. Contagion can also be identified in 2010 from Greece to Italy, because the rating coefficient increased even though the two economies are not significantly interconnected: loans to Greece account for only 0.3% of Italy’s GDP, and exports to Greece for 0.4%. In fact, Italian fundamentals in 2010 were significantly stronger than those of the peripherals, yet the Eurozone’s third largest economy was not spared from financing difficulties. Italy had a history of successfully servicing a high public debt load, which stood at 119% of GDP in 2010, and had a budget deficit of 4.5% of GDP, roughly in line with Germany’s 4.3% (International Monetary Fund). The economy had not been affected by any property bubble, and the banking system was among the safest in the continent, given its relatively conservative business practices, strong Tier I capital ratios, and low exposure to peripheral debt. The combination of low private debt and high consumer savings made the Italian economy “substantially robust,” according to rating agency Moody’s. The solidity of the Italian economy is highlighted by a cross-country comparison of external debt, meaning the sum of public and private debt payable in a foreign currency. According to this metric, Italy’s external debt-to-GDP ratio in 2010 stood at 108%, compared to 142% for Germany, 182% for France, 154% for Spain, 217% for Portugal, and 1,103% for Ireland (Moody’s 2010). Nonetheless, the relatively stronger fundamentals and the sheer size of the Italian economy appear to have been insufficient to stave off contagion stemming from Greece. Worries about the size of the Italian public debt began mounting, further increased by the lack of policy responses on the part of the government, and CDS on Italian Treasury Bonds suffered from the Greek malaise.
The only two countries in the sample that appear to have been completely immune from contagion are Austria and the Netherlands, which both enjoyed top credit ratings and political stability during the crisis. The model is in fact statistically insignificant for both Austria and the Netherlands in every year of the series (2008-2011 for the Netherlands, and 2008-2010 for Austria because of insufficient CDS price data in 2011), having Prob > χ² greater than 0.05. Austria and the Netherlands had a public debt-to-GDP ratio equal to 65.8% and 51.8%, respectively, in 2010, with deficits of 4.4% and 5%. Interestingly, the CDS coefficients for Austria were negative in 2009 and 2010, and those for the Netherlands were negative in 2008, 2010, and 2011, highlighting the flight to safety that took place in European debt markets. Therefore, while turmoil in Greece did influence investor sentiment about the financial stance of economically problematic or politically unstable countries, contagious tendencies did not seem to hit countries that were perfectly stable both economically and politically. Yet if a country was already under close investor scrutiny for any reason, the sudden downturn of financing conditions in one country generated spillover effects.

Figure 4: Greece-Italy CDS Spreads. Series: Greece, Italy. Horizontal axis: 1 January 2008 to 1 January 2012. Data: Bloomberg.

Figure 5: Greece-Austria CDS Spreads. Series: Greece, Austria. Horizontal axis: 1 January 2008 to 1 January 2012. Data: Bloomberg.

The contagion dynamics do not apply to Greece alone. As the Eurozone crisis progressed and turmoil spread to several other sovereign debt markets, more instances of cross-country contagion can be observed. Economic interdependences accentuated fundamental problems when, in 2011, pressure on Spanish CDS spilled over to Italy, which exports 1.5% of its GDP to its southern European neighbor. In the same year, contagion effects spread from Portugal to Spain and vice versa, and from Italy to France. Exports to Spain are equivalent to 5.3% of Portugal’s GDP, while exports to France constitute 3% of Italy’s GDP. Even more importantly, France has extended loans worth 18.2% of its GDP to Italy through its banking system. In 2010, a small contagion effect of French CDS markets on Italy can also be observed. Hence it appears that, for the larger European economies, economic interdependences act as transmission channels through which sovereign debt turmoil moves from one country to another, worsening the actual fundamental problems.
7 Conclusion
This paper estimates an ARCH model in order to analyze contagion dynamics across Eurozone CDS markets, which are a proxy for investors’ perceptions of sovereign risk. The results do indicate the presence of contagion effects stemming from Greece in Portugal, Spain, Belgium, Italy, and France, and further contagion among countries as the crisis progresses. The policy implications that can be drawn from this analysis are ambiguous; however, contagion could be staved off by signaling a country’s strength to international investors. Possible reactions thus include bailouts, such as those implemented for Greece, Portugal, and Ireland, as well as measures aimed at consolidating a country’s fiscal stance and competitiveness. For example, Spain has significantly consolidated and strengthened its banking sector, while Italy has embarked on a series of austerity measures and tax increases in order to balance its budget. However, since contagion cannot be explained by economic fundamentals, it remains particularly difficult to forecast and quantify, so the timing and correct measure of successful policy interventions continue to be challenging.
The Socioeconomic Impact of Political Fragmentation in India: What the Rise of Regional Politics Implies for Economic Growth and Development

Disha Verma, Harvard University¹

Abstract. This paper uses a regression discontinuity approach based on panel data from the 28 states of India and the union territories of Delhi and Puducherry during the period 1994-2012 to study the association between political fragmentation in the State Legislatures and growth, debt, and social spending. Political fragmentation is judged along three different measures. The first is the re-election of the incumbent, the second is the margin of victory, and the third is Herfindahl’s index.² This study demonstrates that the effects of the incumbent winning and the margin of victory are different between coalition and majority governments. On the whole, coalition governments grow faster, undertake more social spending, and incur less debt. Results are considered across several different margins of victory to better estimate the relationships under consideration.

Keywords: India, political fragmentation, regression discontinuity
¹ Disha Verma is a junior at Harvard University. She wrote this paper for Professor Dale Jorgenson’s seminar “The Rise of Asia and the World Economy” in Fall 2013. The author would like to thank Professor Jorgenson for his invaluable support and guidance in formulating this topic and understanding the Indian growth story. She would also like to thank Wentao Xiong and Dana Beuschel for their help in navigating the technical aspects of this paper. Lastly, she is grateful for the input of Professor Torben Iversen and Professor Jeffrey Frieden on the political economy theories discussed in this paper.

² Herfindahl’s Index is used in this paper as a measure of political competition. See Appendix (online) for an explanation of the calculation.
1 Introduction
India is a remarkable example of a democratic nation among the developing countries in the world. It gained its independence from the British in 1947 as a very poor country that had yet to industrialize.3 Despite global skepticism, India proved that a poor, illiterate country is capable of maintaining democratic principles. India stands out even among the BRICS (Brazil, Russia, India, China, and South Africa) for its history of robust democracy. Democracy involves having multiple voices in government, however, and the question arises of how the resulting political fragmentation affects social and economic growth in India. Democratic politics brings individual rights and freedom, but arguably at the cost of efficiency and political sustainability. Political economy models have repeatedly linked fragmented governments to poorer fiscal performance in the form of higher spending and deficits. Volkerink and Haan (2001) find that more fragmented governments, as measured by the number of political parties in a coalition or the number of spending ministers, have higher deficits. They also find some evidence that decreasing the number of seats in parliament is linked to lower deficits.

The issue of political fragmentation has become increasingly prevalent in India since the 1990s. The Indian National Congress (referred to in this paper as the Congress or the INC) led India’s freedom struggle, and, while other parties did exist and flourish after independence, the Congress dominated for several decades.4 Starting in the 1980s and 1990s, however, regional parties and other national parties such as the Bharatiya Janata Party (BJP) started to increase significantly in popularity. The Congress slowly but surely began to lose its predominance in Indian politics, although it remains the largest national party today. The expansion of political parties has affected political outcomes in a number of ways: it has increased options for voters, amplified variance and diversity in who holds seats in parliament, and diluted the concentration of power in the hands of the major parties. Politics has become more uncertain and absolute majorities harder to achieve. Coalition politics has reared its head and the importance of alliances has risen drastically. The nature of politics in India has fundamentally changed. More uncertainty and tougher competition can change party preferences, making parties more willing to incur liabilities if the future burden is likely to rest on someone else’s shoulders. Social expenditure might become more attractive as a way to garner support among voters. The short run could rise in importance compared to the long run when the future is more uncertain. As a result of these changes, growth, debt, and spending patterns would probably all be affected.

India experienced almost double-digit growth for the last decade before witnessing a growth slowdown starting in 2011. The slowdown has been attributed to factors such as the global macroeconomic environment, falling investor confidence, and deteriorating fiscal indicators. The International Monetary Fund (IMF) estimates that if India does not return to pre-2008 growth levels, an additional 35 million people will live in poverty (IMF India 2013, Article IV Consultation). But the effect of the changing nature of Indian politics on growth has not been adequately considered. This paper aims to assess the effect of political fragmentation in India on socioeconomic outcomes. The rise of regional politics in India is exogenous, in that it took place without any institutional changes, making it an ideal variable to study.

3 As late as 1973, more than 20 years after independence, more than 40% of the urban population and more than 50% of the rural population lived in poverty (Panagariya 2008).
4 India did in fact from the time of independence have a well-developed political spectrum on both the right and the left. The Communist Party of India was well established as a leftist national party. On the opposite end of the spectrum, the Jan Sangh, from which the BJP originated, is just one example of a right-wing national party formed in the early decades following independence.
The regression discontinuity approach used in this paper highlights the differences between coalition and majority governments to identify the effects of political fragmentation. It takes advantage of the detailed data collected on state-wide elections and socioeconomic indicators in India from 1994 to 2012. 1994 is chosen as the initial year due to data availability and because it predates by exactly one political term (five years) the first non-Congress government at the Center to last an entire term. It thus allows the establishment of the background against which the Congress started to lose its dominance.

The term “political fragmentation” is defined rather comprehensively. Rather than narrowing it to one particular measure, it is used to describe a full range of changes. Three different aspects of fragmentation are explored: re-election of incumbents, margins of victory, and the Herfindahl index. These three factors are considered for both coalition and majority governments. The socioeconomic outcomes considered are growth rates, social sector expenditures, and total outstanding liabilities of the government.

The paper is organized as follows. Section 2 establishes the institutional context of Indian politics and considers the political economy of increased fragmentation theoretically. Section 3 analyzes the data and variables used in the study, Section 4 explains the empirical strategy and methodology, and Section 5 presents the results and their implications. Section 6 concludes the paper and considers avenues for future research.
2
Background
2.1
Institutional Context
India consists of 28 states and seven union territories. Every state has a Legislative Assembly that carries out the government’s administration and is directly elected by the people every five years. Since India is a federal republic, the constitution grants considerable autonomy to the states and union territories. National issues such as defense and foreign policy are reserved for the national government; other issues, such as education, are shared between the national and state governments; and a third set of issues, such as agriculture and land rights, fall entirely under the purview of state governments. These Legislative Assemblies can make laws, amend central government laws, allocate expenditures, and oversee local governments (Panchayats). State government expenditures account for more than half of total government expenditures in India. States oversee 60% of medical and health expenditures, 60% of expenditures on economic services, and 85% of educational expenditures (Rao & Singh 1998). On the whole, state governments have more influence over issues such as education, social security, and transportation (Clots-Figueras 2011). For these reasons, the relation between political fragmentation and socioeconomic outcomes is assessed in this study through State Assemblies rather than the Parliament (i.e. the central government).

The size of Legislative Assemblies is based on population. Each elector has one vote based on universal suffrage. Any Indian citizen who is over 18 and is registered as a voter can cast his or her vote. The length of a term served by members of an Assembly is five years. Any Indian citizen over 25 who is registered to vote can run for election. Elections in a constituency are first-past-the-post, so that the candidate with the most votes wins. A party that gains a majority of seats in the Assembly (>50%) comes to power and elects the state’s Chief Minister. If no single party wins a majority, a coalition must be formed. As a consequence of coalitions, although elections are scheduled to take place every five years, in some cases shifting political alignments and hung parliaments can lead to elections before the end of a five-year term.
2.2
The Political Economy of Increased Fragmentation
The change in the nature of politics has increased uncertainty and fostered competition. The rise in the number of political parties means there is greater competition; the potential winners of elections become more uncertain. These changes have led to three main results. First, seats are more divided between parties, leading to the prevalence of coalition governments at both the State and Center. Second, the balance of power has shifted away from national parties towards regional parties, so there are more political actors with power. Third, political frontiers have shortened because the incumbent party can switch repeatedly over subsequent elections and even within one term, as is the case in states such as Tamil Nadu, Rajasthan, and Uttar Pradesh. These changes can be expected to change equilibrium preferences for political actors and affect socioeconomic outcomes.

Fragmentation, especially in multi-party coalition governments, creates what can be termed a “common pool problem,” in which parties spend on their own constituencies and discount the adverse effects on the economy as a whole. Olson introduces the idea of “distributional coalitions” that increase in number over time and are concerned not with expanding the size of the economic pie but with the way it is split (Olson 1965). Since the benefits of these policies are concentrated among a select few groups and the costs are diffused among the population, these coalitions will prosper by the logic of collective action. Economic growth will suffer because, by the exclusive nature of their membership, distributional coalitions tend to favor protectionist and antitechnology policies that benefit only their members (Olson 1965).

Another reason to be concerned about political fragmentation is that rapid government turnover and instability make politicians myopic. An inability to commit to the future, or “time inconsistency,” is perpetuated (Kydland & Prescott 1977). Governments that expect to stay in power for a short period will discount the future and try to grab what they can in the present, harming investment and long-term growth. This effect will be reinforced if parties become less concerned about their reputation because of rapid turnover. Populist policies may become more tempting as political parties try to gain voter support. Social expenditures may rise at the cost of fiscal responsibility. The current Congress government has been widely criticized for the recent Food Security Act of 2013 for exactly this reason. Critics allege the act is nothing but a ploy to get more votes without concern for the impact of the additional expenditures, assessed at 1.5% of GDP at the bare minimum (“The massive hidden costs of India’s food security act” 2013).

Finally, the burgeoning of coalition politics increases the number of “veto” players. There are more parties or groups who must all agree to a policy before it can be implemented. Uncertainty about who will suffer more if no agreement is reached may lead to a “war of attrition” (Alesina & Drazen 1991). This translates to a “battle of the sexes” game and can be understood through the example of fiscal consolidation, where all parties agree that it is necessary to cut the deficit but disagree as to how the burden should be distributed. If the parties do not know who will bear the greater burden when no agreement is reached (i.e. there is high uncertainty), all parties may hold out for a better deal, even though this will leave everyone worse off.
3
Data
3.1
Electoral Data
The electoral data used were collected from elections in the period from 1994 to 2012 for every state in India and the union territories of Delhi and Puducherry. The data were obtained from the electoral reports published by the Election Commission of India. They include the year of election, the number of seats won by each of the three largest parties, the total number of seats in the Assembly, whether or not the incumbent won, how many parties competed, and how many parties ultimately won seats. This information is used to calculate the margin of victory and Herfindahl’s index, which is a measure of political competition. Rather than assessing just the relationship between the outcomes and the incumbent winning through a dummy variable, this paper also estimates the relationship between the outcomes and the incumbent winning conditional on the margin of victory. This process is repeated for successively smaller margins, allowing for the formulation of clearer causal effects. A positive margin of victory means that the winning party had a majority. A negative margin of victory means that the party with the most seats did not have an absolute majority and needed to form a coalition. The margin of victory is measured this way because it allows us to easily investigate the relationship between coalition and single-party (majority) governments and development outcomes. In cases where the largest party (the one with the most seats) needed to form a coalition, the margin of victory does not indicate whether it was successful in doing so. The margin of victory simply indicates that no party managed to get absolute power and hence a coalition government was needed. There are examples of states, such as Nagaland in 2002, where the party with the most seats, in this case Congress, did not have enough seats to form a majority on its own and ended up not coming to power because parties with fewer seats formed a coalition.
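The sign convention for the margin of victory can be illustrated with a minimal sketch. It assumes, for illustration only, that the margin is the largest party's seat share minus the 50% majority threshold, in percentage points; the paper's exact calculation is given in its data appendix and may differ. The function names and seat counts are hypothetical.

```python
def margin_of_victory(seats_by_party, total_seats):
    """Largest party's seat share minus 50, in percentage points.

    Assumed formula for illustration; not taken from the paper's appendix.
    """
    if total_seats <= 0:
        raise ValueError("total_seats must be positive")
    largest = max(seats_by_party)
    return 100.0 * largest / total_seats - 50.0


def is_majority(margin):
    """A single-party majority requires strictly more than 50% of seats."""
    return margin > 0


# Hypothetical 100-seat assemblies.
hung = margin_of_victory([45, 30, 15], 100)     # -5.0: a coalition is needed
single = margin_of_victory([60, 25, 15], 100)   # 10.0: single-party majority
print(hung, is_majority(hung))
print(single, is_majority(single))
```

Under this convention a party holding exactly half the seats has a margin of zero and is treated as needing a coalition, which matches the strict >50% requirement described in Section 2.1.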
In this paper, Herfindahl’s index applies to political competitiveness (or fragmentation). Its values range between zero and one. The closer to zero a state’s score is, the more politically competitive it is, which means it has more parties with a significant number of seats and is more fragmented.
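For intuition about the index, the sketch below uses the standard Herfindahl formula, the sum of squared seat shares across parties. This is an assumption: the paper's own calculation is described in its online appendix and may differ, for example because the dataset records seats only for the three largest parties. The function name and seat counts are hypothetical.

```python
def herfindahl_index(seats_by_party):
    """Sum of squared seat shares (standard Herfindahl formula, assumed here)."""
    total = sum(seats_by_party)
    if total <= 0:
        raise ValueError("at least one seat must be held")
    return sum((seats / total) ** 2 for seats in seats_by_party)


# Hypothetical assemblies: one dominant party versus four equal parties.
print(herfindahl_index([100]))             # a single party holds every seat
print(herfindahl_index([25, 25, 25, 25]))  # seats split evenly four ways
```

With this formula a single-party assembly scores one, and the score falls toward zero as seats are spread more evenly over more parties, which is the direction of interpretation used in the text.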
3.2
Outcome Variables
The rise of regional politics in India is a relatively new phenomenon, and its effect on developmental outcomes has not yet been well documented. As already discussed, high turnover, distributional coalitions, and an increase in veto players can increase government deficits, change expenditure patterns, and focus energy on the distribution rather than growth of the economic pie. With this in mind, this paper considers three outcomes at the state level that would be affected by political fragmentation: growth rates, total outstanding liabilities of state governments, and social sector expenditures. Social services are primarily the responsibility of state governments, which incur more than 80% of combined government expenditures in these areas (Reserve Bank of India, “State Finances: A Study of Budgets”). The ratio of social expenditures to total expenditures can thus be considered a good indicator of the extent to which the expansion of the social safety net is a priority compared to economic services such as energy, transportation, and debt servicing. The total outstanding liabilities of the state government, as well as social sector expenditures, together indicate how fiscally responsible a state government is. Rather than assessing the effect of fragmentation on these outcomes in a given year, this paper calculates the average of the outcome under consideration for the duration of a party’s term in office. This allows for better assessment of a government’s performance, and also avoids the tricky question of which year is most appropriate to use for assessment. Descriptive statistics for the outcome variables as well as the electoral variables are presented in Table 1.
3.3
Controls
A number of controls are included and are indicative of the normal condition of affairs in the state. They help ascertain that the results were due to the factors tested in the experiment and not to a natural or random occurrence. Equivalently, they reduce the correlation of the X variables with the error term. All the controls may be correlated with the measures of fragmentation, but none are perfectly multicollinear. Literacy rate is included to factor out the effect of education on the outcome variables. It is calculated based on the population aged seven or above. Also included is the percentage of people below the poverty line (BPL), determined by the Planning Commission of India. This is a good indicator of how developed a state is; poorer states will have more people below the poverty line. If development is a process, then a state’s location in that process might determine its expenditure priorities or debt liabilities. Hence percent BPL is included to control, at least to some degree, for the effect of poverty and development on the outcomes being investigated.5 The third control is an average of the rural and urban Gini coefficients. States that have lower inequality might be the ones spending more on social services, and might consequently incur more debt to provide these services. The fourth control is sex ratio. There is an extensive literature documenting that women’s political preferences are different from men’s (Clots-Figueras 2011). Political fragmentation might vary across these preferences; thus, the sex ratio is included to control for the effect of differential preferences of men and women. The last two controls are the number of parties competing and the number of parties that ultimately win seats, to account for different political backdrops.

In any given regression, out of the three dependent variables being studied, the other two are included as controls. Debt to GDP, social sector expenditures, and growth rates are likely to be correlated, and adding them to the regression separates their influence on one another from the effect of political fragmentation. When added as controls, their value in the year in which the government got elected is included. The base year is selected in this manner because the election might affect the variables concerned, leading to an endogeneity problem.

5 Percent BPL ultimately had to be dropped as a control because it did not have enough observations and was complicating the test results.

Table 1: Descriptive Statistics on Outcomes and Political Fragmentation

                                          Coalition Governments          Majority Governments
Independent Variables                     Obs.  Mean      Std. Dev.      Obs.  Mean      Std. Dev.
Outcome Variables (Y’s)
  Mean Growth Rate                        48    7.418     2.756          60    7.044     2.666
  Mean Social Sector Expenditure          48    35.830    5.354          61    34.757    6.366
  Mean Total Outstanding Liabilities      48    5.354     11.149         61    38.106    17.096
Political Fragmentation Variables (X’s)
  Herfindahl’s index                      49    .252      .082           61    .488      .139
  Incumbent Win                           49    .408      .496           61    .540      .502
  Margin of Victory                       49    -11.585   8.08           61    14.724    11.19
  Incumbent Win given Margin              49    -4.786    7.634          61    9.153     12.48
4
Causal Effects of Fragmentation
4.1
Graphical Analysis
The data used in this paper are examined in a way that can distinguish between pure selection and the proposed causal effect of type of government and incumbency. Despite the fact that coalition and majority governments are likely systematically different, it is highly plausible that majority governments formed by a very slim margin are ex ante comparable to coalition governments formed by just falling short of the majority. When we focus on the set of states with close elections, it becomes more plausible that idiosyncratic factors, and not systemic state characteristics, affect the outcomes of interest. Thus, under certain conditions, states where the party with the most seats barely formed a majority can serve as a reasonable counterfactual for states where they barely lost out on the majority. Thistlethwaite and Campbell (1960) originally provided the idea of exploiting cases where a treatment variable is a deterministic function of an observed variable. In this case, the nature of an election provides the deterministic function, and the observed variable is the margin of victory.

Figure 1 illustrates the regression discontinuity with respect to the type of government. A government’s mean social expenditures, mean total outstanding liabilities, and mean growth record over its term in office are expressed as functions of its margin of victory. To assess purely causal effects, the observations considered are only those of close elections, in which the margin of victory falls within ±5% of the threshold needed to form a majority, and hence the outcome can reasonably be said to have been determined by chance.6 Points on the line to the right represent outcomes for majority governments (margin of victory > 0). Points on the line to the left represent outcomes for coalition governments (margin of victory < 0).
Comparing the graphed functions of the outcome variables between the right- and left-hand sides of the threshold reveals the difference that arises between majority and coalition governments. Although not too large, it indicates that coalition governments are associated with lower outstanding liabilities, higher social expenditures, and higher growth than majority governments. The results stay the same if the interval considered for the margin of victory is extended to ±10% (see Figure 2).

6 These figures were calculated using triangular kernels. The bandwidth defined was 15.

Figure 1: Effect of Margin of Victory Within a ±5% Interval of the Threshold (a)
[Panels: Mean Growth; Mean Social Expenditures; Mean Outstanding Liabilities]
(a) In reality, a party needs more than 50% of seats to obtain a majority. In this paper, however, the margin of victory is calculated in a way that takes this into account and makes zero the effective threshold for ease of analysis. See data appendix for exact details on calculation.

Figure 2: Effect of Margin of Victory Within a ±10% Interval of the Threshold
[Panels: Mean Growth; Mean Social Expenditures; Mean Outstanding Liabilities]
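The kind of comparison plotted in Figures 1 and 2 can be sketched as follows. The footnote to the graphical analysis mentions triangular kernels and a bandwidth of 15; the rest is an assumption made here for illustration, in particular the use of a kernel-weighted straight-line fit on each side of the threshold and the treatment of a zero margin as a coalition. All function names are hypothetical and no data from the paper are used.

```python
def triangular_weight(distance, bandwidth):
    """Triangular kernel: weight falls linearly from 1 at the threshold to 0 at the bandwidth."""
    return max(0.0, 1.0 - abs(distance) / bandwidth)


def boundary_fit(points, bandwidth):
    """Kernel-weighted least-squares line through (margin, outcome) pairs.

    Returns the fitted outcome at margin = 0, i.e. at the threshold.
    """
    weighted = [(triangular_weight(m, bandwidth), m, y) for m, y in points]
    weighted = [(w, m, y) for w, m, y in weighted if w > 0]
    total = sum(w for w, _, _ in weighted)
    if total == 0:
        raise ValueError("no observations inside the bandwidth")
    m_bar = sum(w * m for w, m, _ in weighted) / total
    y_bar = sum(w * y for w, _, y in weighted) / total
    s_mm = sum(w * (m - m_bar) ** 2 for w, m, _ in weighted)
    if s_mm == 0:
        raise ValueError("need at least two distinct margins on each side")
    slope = sum(w * (m - m_bar) * (y - y_bar) for w, m, y in weighted) / s_mm
    return y_bar - slope * m_bar


def rd_jump(margins, outcomes, bandwidth=15.0):
    """Majority-side fit minus coalition-side fit, both evaluated at the threshold."""
    pairs = list(zip(margins, outcomes))
    coalition = [(m, y) for m, y in pairs if m <= 0]
    majority = [(m, y) for m, y in pairs if m > 0]
    return boundary_fit(majority, bandwidth) - boundary_fit(coalition, bandwidth)
```

A negative value of `rd_jump` for mean growth, for example, would correspond to majority governments sitting below coalition governments at the threshold, which is the pattern the figures are described as showing.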
4.2
Validity
Identification requires that all relevant factors besides treatment vary smoothly at the threshold between a coalition and majority government. Formally, letting $y_1$ and $y_0$ denote potential outcomes under a coalition and a majority, identification requires that $E[y_1 \mid \text{margin}]$ and $E[y_0 \mid \text{margin}]$ are continuous at the majority-coalition threshold of 50%. All observable and unobservable pre-determined characteristics that could influence the outcome variables must not be systematically different between majority and coalition governments. For instance, if majority governments are formed with more adept political candidates than are present in coalitions, outcomes may vary not because of the type of government but because of the people who form them. Causality would be incorrectly attributed, and the internal validity of the discontinuity jump as a causal effect could be questioned. Thus, the baseline characteristics of the treatment group should not in any observable way be ex ante systematically different from the control group. To test this, the mean and standard deviation for control variables on both sides of the margin of victory threshold are computed, including demographic factors, economic indicators, and electoral factors. Also included are the base year values of the outcome variables. As Table 2 indicates, the results appear fairly equal on both sides of the threshold. The only statistically significant difference was in the average Gini coefficient, lending credibility to the identification strategy employed in this paper.
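A balance check of this kind can be computed directly from group means, standard deviations, and observation counts. The paper does not state which test it used; the sketch below assumes Welch's unequal-variance t statistic, and the input numbers are hypothetical rather than taken from Table 2.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom from summary statistics."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    dof = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, dof


# Hypothetical control variable measured on both sides of the threshold.
t_stat, dof = welch_t(mean1=12.0, sd1=2.0, n1=50, mean2=10.0, sd2=2.0, n2=50)
print(t_stat, dof)
```

A large absolute t statistic for a baseline characteristic would suggest that the two groups differ before treatment, which is what the validity check is meant to rule out.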
4.3
Methodology
Table 2: Baseline Characteristics

                                   Majority Governments           Coalition Governments
Controls                           Obs.  Mean      Std. Dev.      Obs.  Mean      Std. Dev.
Literacy Rate                      60    65.062    12.667         49    66.983    13.457
Sex Ratio                          61    925.541   44.602         49    950.122   48.142
Total Parties That Won Seats       61    5.852     2.999          49    8.020     3.430
Total Parties Contesting           61    31.049    32.675         49    32.448    20.957
Social Sector Expenditure          61    34.649    7.029          47    35.114    6.225
Growth Rate                        57    7.729     5.833          48    8.054     5.034
Total Outstanding Liabilities      61    39.478    20.404         47    36.382    11.778
Average Gini                       49    .284      .037           39    .299      .048

The regression discontinuity approach used in this paper to estimate the association between type of government and growth, liabilities, and social sector expenditures at the state level exploits the fact that the type of government (coalition vs. majority) changes discontinuously at the threshold of 50% of seats. The formal empirical specification for the right side of the threshold (i.e. majority governments) is:

$$Y_{it} = a_0 + a_1 I_{it} + a_2 M_{it} + a_3 I_{it} M_{it} + a_4 H_{it} + d X_{it} + a_i + b_t + u_{it}, \tag{1}$$
where $Y_{it}$ is the outcome of interest in state $i$ in year $t$, $a_i$ and $b_t$ are state and year fixed effects, $I$ is a dummy variable for whether the incumbent won with a positive margin, $M$ measures the margin of victory, and $H$ is Herfindahl’s index. $I_{it} M_{it}$ is the interaction term for the margin conditional on the incumbent winning and the margin being positive. $X_{it}$ represents other control variables and $u_{it}$ is the error term. The inclusion of time effects and fixed effects improves the precision of the estimates by allowing us to account for state-specific factors and time trends. This specification has a counterpart for the left side of the threshold, in which case the dummy variable indicates whether the incumbent won with a negative margin, and the interaction term is for the margin of victory conditional on the incumbent winning and the margin being negative.

As outcome variables, the mean social expenditures (as a percentage of Aggregate State Expenditure), mean total outstanding liabilities (as a percentage of Gross State Domestic Product), and mean growth rates (as a percentage of State Gross Domestic Product) over a government’s term in office are used, allowing for comprehensive evaluation of a government’s performance. All the control variables, such as sex ratio and literacy rate, are for the year of the election itself to avoid endogeneity. To check for robustness, results are reported with and without the outcome variables as regressors. When included in this manner as controls, it is not the mean but the base year value of the outcome variables that is considered. Close elections are more likely to have had outcomes determined by chance rather than systemic characteristics and hence be more indicative of causality. Therefore, regressions are carried out first for the whole set of elections and then for elections with a margin of victory above and below 10% and 15% of the threshold.
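To make the structure of specification (1) concrete, the sketch below builds the regressors for majority-side observations and fits them by ordinary least squares. It is a simplified illustration under stated assumptions: the state and year fixed effects and the controls $X_{it}$ are omitted, the solver is a plain normal-equations routine, and all function names are hypothetical. It is not the estimation code used in the paper.

```python
def design_row(incumbent_won, margin, herfindahl):
    """Regressors for one majority-side observation: constant, I, M, I*M, H.

    I equals 1 only when the incumbent won with a positive margin.
    Fixed effects and the controls X are left out of this sketch.
    """
    i = 1.0 if (incumbent_won and margin > 0) else 0.0
    return [1.0, i, margin, i * margin, herfindahl]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, ys):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(xtx, xty)
```

The returned list is ordered as (a0, a1, a2, a3, a4) in the notation of equation (1). The coalition-side counterpart would redefine the dummy for incumbents winning with a negative margin, as described in the text.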
Due to the limited dataset, it was not possible to look at elections within the 5% margin with the expanded empirical specifications. The graphical analysis, by contrast, considered only the margin of victory and thus the 5% margin could be included.

To identify the causal effects of incumbent governments, a strong assumption has to hold: an incumbent win and incumbent loss should be similar in all observable and unobservable characteristics that might determine the outcome variables. Candidate and constituency characteristics should be similar. The test conducted to check this is similar to that in Section 4.2 to compare similarity of controls on both sides of the threshold. In this case, the “threshold” is whether or not the incumbent won. The outcome variables for the base year are also included. The results display no statistically significant difference between cases where the incumbent won and did not win except for the average Gini coefficient. They are displayed in Table 3. However, this test does not consider the traits of the parties and candidates themselves. Close elections would have been an ideal condition under which to study the effect of incumbency, but the dataset is too limited. The results must be interpreted as simply indicating a link between incumbency and the outcome variables rather than displaying a causal effect.

Table 3: Descriptive Statistics Based on Incumbency

                                   Incumbent Win                  Incumbent Loss
Controls                           Obs.  Mean      Std. Dev.      Obs.  Mean      Std. Dev.
Literacy Rate                      53    65.281    12.843         59    67.235    13.342
Sex Ratio                          54    927.055   42.338         59    948.186   51.290
Total Parties That Won Seats       54    6.259     3.245          59    7.203     3.40
Total Parties Contesting           54    28.685    21.119         59    33.644    32.57
Social Sector Expenditure          53    34.845    6.308          58    34.936    7.072
Growth Rate                        50    7.824     5.323          57    7.875     5.597
Total Outstanding Liabilities      53    40.011    18.58          58    37.079    15.985
Average Gini                       43    .281      .044           48    .299      .043
5
Results and Interpretation
5.1
Mean Growth Rate
While the results are in general not statistically significant, it is still possible to draw inferences about their meaning. Table 4 includes results for the association between mean growth rate and political fragmentation first without controls, then with controls, and finally with the other two outcome variables (social sector expenditures and total outstanding liabilities) as controls as well. When an incumbent wins with enough seats to form a majority government, the results clearly show that growth suffers. The results are also statistically significant in the regression excluding controls, especially when one considers that the standard deviation for the average of the mean growth rate in majority governments across states is 2.6 and the mean is 7.0, while our regression coefficients are all below -4, which is more than three standard deviations away. Lack of accountability could be a possible reason for this negative effect, as could complacency on being re-elected. The incumbent winning when the consequent government is a coalition does not seem to have much of an effect on growth, with the coefficient being less than 0.3, albeit positive, in all cases.
Table 4: Dependent Variable: Mean Growth Rate (All Observations)

                                  (1) No controls         (2) Controls            (3) Controls + outcomes
Independent Variables             Coalition  Majority     Coalition  Majority     Coalition  Majority
Incumbent Win                     .144       4.089***     .248       4.694        .284       4.734
                                  (3.123)    (1.803)      (4.072)    (2.632)      (2.344)    (3.201)
Herfindahl’s Index                -39.03     6.944        50.780     3.677        46.560     11.369
                                  (26.821)   (17.543)     (32.357)   (21.025)     (15.142)   (26.905)
Margin of Victory                 .267       .232         .338       .227         .276       .059
                                  (.240)     (.184)       (.305)     (.216)       (.988)     (.287)
Incumbent Win given Margin        .166       .227***      .186       .237         -.086      .258
                                  (.188)     (.103)       (.257)     (.132)       (.151)     (.145)
Literacy Rate                     -          -            .047       .050         .035       .008
                                                          (.477)     (.253)       (.235)     (.312)
Total Parties that Won Seats      -          -            1.175      .195         .924       .224
                                                          (.832)     (.418)       (.508)     (.439)
Sex Ratio                         -          -            .028       .032         .060       .030
                                                          (.123)     (.093)       (.127)     (.123)
Total Parties Contesting          -          -            .001       .030         .255       .000
                                                          (.266)     (.178)       (.161)     (.190)
Growth Rate                       -          -            -          -            .327       .192
                                                                                  (.210)     (.211)
Social Sector Expenditure         -          -            -          -            .196       .012
                                                                                  (.257)     (.086)
N                                 48         60           48         59           46         59

*, **, *** indicates statistical significance at the 10%, 5%, and 1% level, respectively. Standard errors are in parentheses.
Herfindahl’s index represents the extent of political competition or division. Since it only varies between 0 and 1, the magnitude of its regression coefficient does not have much value. However, the signs of the coefficients clearly show that, when there is a coalition government, greater power division among different parties is bad for growth. This is evidence for the argument that more veto players can harm growth by causing delays and inefficiencies.

The coefficient on margin of victory represents the change in growth due to a 1% change in the number of seats held by the largest party. Interestingly, the coefficient is consistently negative for majority governments and positive for coalition governments. It would seem that, when a party has a majority, larger margins of victory are worse for growth. Accountability and complacency could again be the reasons for this. On the other hand, when the winning party must form a coalition because it does not have enough seats to come to power independently, coming closer to the 50% threshold is correlated with slightly higher growth. This could be because while accountability is beneficial to an extent, too much division of power could cause inefficiencies, and hence it is better to have power more concentrated within one party in a coalition.

However, when the incumbent wins, our findings about the effect of the margin of victory change somewhat, such that a higher margin generally appears good for growth in both coalition and majority governments. Perhaps re-election, by allowing continued momentum on the same path, affects growth positively, and the higher the margin, the more freedom the incumbent has to continue its chosen policies. This does not refute our earlier conclusion that incumbents winning with a majority are bad for growth; it simply states that, when an incumbent wins, a higher margin of victory could be better for growth.
None of the coefficients on the controls are statistically significant, and the results were mixed, making it hard to draw a clear picture of their connection to growth.
Table 5: Dependent Variable: Mean Social Sector Expenditures (All Observations)

                                  (1) No controls         (2) Controls            (3) Controls + outcomes
Independent Variables             Coalition  Majority     Coalition  Majority     Coalition  Majority
Incumbent Win                     5.963      1.001        5.976      .911         1.772      5.288
                                  (3.974)    (2.727)      (5.418)    (3.859)      (5.139)    (3.257)
Herfindahl’s Index                2.966      13.978       10.761     10.584       46.560***  2.512
                                  (33.998)   (26.101)     (43.058)   (30.557)     (15.142)   (19.965)
Margin of Victory                 .068       .198         .054       .219         .345       .170
                                  (.305)     (.274)       (.407)     (.315)       (.422)     (.226)
Incumbent Win given Margin        .398       .129         .389       .205         .402       .011
                                  (.252)     (.157)       (.379)     (.194)       (.438)     (.144)
Literacy Rate                     -          -            .076       -.332        .206       .268
                                                          (.645)     (.369)       (.540)     (.250)
Total Parties that Won Seats      -          -            -.768      .101         .496       .404
                                                          (1.207)    (.603)       (.941)     (.402)
Sex Ratio                         -          -            .220       -.013        .259       .006
                                                          (.165)     (.135)       (.259)     (.114)
Total Parties Contesting          -          -            .058       .024         .262       .196
                                                          (.359)     (.26)        (.274)     (.154)
Growth Rate                       -          -            -          -            .807       .438***
                                                                                  (.628)     (.093)
Total Outstanding Liabilities     -          -            -          -            .661*      .048
                                                                                  (.368)     (.071)
N                                 48         61           48         60           46         56

*, **, *** indicates statistical significance at the 10%, 5%, and 1% level, respectively. Standard errors are in parentheses.
5.2
Mean Social Sector Expenditures
Table 5 displays the results pertaining to mean social sector expenditures. Social sector expenditures are calculated as a ratio to aggregate expenditure, and the mean is calculated over a government’s term in office. The findings indicate that social sector expenditures rise by more if the incumbent wins and has to form a coalition than if the incumbent wins a majority. Considering that the standard deviation for mean social expenditures is 5.354 for coalitions, the coefficients on incumbents winning in coalitions seem more significant (5.96, 5.97, and 1.77). Coalition governments by definition represent a broader set of interests, which may lead to higher social spending than a government representing a narrower group. The coefficients on Herfindahl’s index do not hold up consistently under the different regressions and are thus inconclusive. The margin of victory has mixed results and does not seem to have a significant impact either way in either government type. However, if an incumbent wins, an increase in the margin seems to be associated with a slight increase in social spending, especially for coalition governments.
5.3
Mean Total Outstanding Liabilities
Table 6 displays the results for the link between fragmentation and average outstanding liabilities of a state government over its term in office. The standard deviation for the average of the mean liabilities across states is 11.149 for coalitions and 17.06 for majorities. None of the coefficients from the results fall outside these margins, except the coefficients on Herfindahl’s index, which as mentioned before cannot be interpreted in the same way as the other coefficients because the index only ranges between zero and one. The incumbent winning and forming a coalition government is consistently associated with lower outstanding liabilities, or reduced debt. The incumbent winning and forming a majority is associated with increased debt. The issue of accountability would seem to again rise to the fore. In a coalition, the incumbent must justify its debt to its allies. However, an incumbent winning in a coalition is simultaneously related to a bigger increase in
Table 6: Dependent Variable: Mean Total Outstanding Liabilities
All Observations

Independent Variables           Coalition           Majority            Coalition           Majority            Coalition           Majority
Constant                        27.726 (16.913)     41.264*** (19.23)   241.940 (219.502)   417.456* (246.489)  47.050 (321.597)    536.767 (345.897)
Incumbent Win                   .693 (5.159)        3.870 (5.66)        6.283 (7.086)       4.865 (7.336)       1.210 (8.042)       5.760 (12.274)
Herfindahl’s Index              14.441 (44.136)     15.299 (54.163)     2.470 (56.306)      23.187 (58.088)     66.128 (102.444)    7.521 (90.475)
Margin of Victory               .072 (.396)         .019 (.569)         .224 (.532)         .201 (.599)         .824 (1.000)        .084 (.988)
Incumbent Win given Margin      .155 (.328)         .160 (.326)         .5 (.496)           .093 (.369)         .213 (.837)         .196 (.576)
Literacy Rate                   -                   -                   1.107 (.843)        .132 (.701)         .420 (.947)         .440 (.899)
Total Parties that Won Seats    -                   -                   .768 (1.207)        .952 (1.578)        2.399 (2.208)       .902 (1.638)
Sex Ratio                       -                   -                   .22 (.165)          .230 (.215)         .110 (.295)         .608* (.361)
Total Parties Contesting        -                   -                   .058 (.359)         .027 (.469)         .429 (.617)         .610 (.62)
Growth Rate                     -                   -                   -                   -                   1.288 (1.435)       .271 (.459)
Social Sector Expenditure       -                   -                   -                   -                   .390 (.689)         .535 (.925)
N                               48                  61                  48                  60                  46                  56

Standard errors in parentheses. *, **, *** indicates statistical significance at the 10%, 5%, and 1% level, respectively.
social spending than in a majority. One possible explanation is that the leading party has more pressure to satisfy voters when it is in a coalition, hence the increase in social expenditure. At the same time, it is also held more accountable for the spending it undertakes. The increase in social spending and the decrease in debt in coalitions could perhaps reflect a realignment of priorities towards social spending, and also more fiscally prudent and efficient spending. The higher debt incurred by incumbents gaining majorities goes hand in hand with lower growth, pointing to unaccountability and inefficiency, perhaps related to complacency, as culprits. This conclusion is opposed, however, by the consistently negative coefficient on an incumbent winning a majority as its margin of victory rises. While the coefficients are small, there nonetheless seems to be an inverse relationship between the margin of victory and total debt if an incumbent wins a majority. The answer might lie in the democracy-efficiency tradeoff. As power becomes more concentrated, policies become more streamlined, and improved governance could result in lower debt as well as higher growth, which were indeed indicated by the findings. Moreover, as a majority incumbent party strengthens its hold, it could be said that it in fact becomes more accountable to the people, since it is clearer where the power is concentrated. This too could result in lower debt and higher growth. This does not negate that an incumbent winning a majority takes on more debt than an incumbent winning but needing a coalition. It simply means that, if the margin of victory by which an incumbent wins an absolute majority rises, the debt it takes on reduces. The findings are inconclusive for the effect of Herfindahl’s index on debt, as they are for the margin of victory.
However, we can infer by the small coefficients in relation to the standard deviation from the descriptive statistics that the margin of victory does not have a significant impact.
6
Conclusion
A question that remains is the external validity of this study. In the case of India in particular, the applicability of this model to the national government rather than the state governments has
not been investigated. However, it is reasonable to assume that there is some correlation between a voter’s decisions in state and national elections, and that a government’s performance at the Center affects to at least some degree a voter’s preferences at the state level and vice versa. If this holds true, we can draw some inferences about the performance of India as a whole over the last few years. The last general (national) elections in India were held in 2009. The Indian National Congress remained in power through a coalition by winning 61 more seats, and all the opposition coalitions witnessed declines in the number of seats they held. The result, a huge win for the Congress, meant more power concentration in the Congress at the cost of all opposing parties. As per the findings in this paper, this should have been associated with slightly higher growth, which was indeed the case until about two years into the INC’s second term, when a slowdown started. The results provide no plausible explanation for this slowdown; economic factors outside the realm of institutional arrangements are likely to be the culprit. The findings from this paper further show that the incumbent winning and forming a coalition is associated with an increase in social spending, as is a rise in the incumbent’s margin of victory. This is indeed what India has witnessed through the Food Security Act and the continuance of the National Rural Employment Guarantee Act. The results for mean liabilities are conflicted with respect to the outcome of the 2009 elections, and no strong conclusion can be drawn. On the positive side and from a long-term perspective, the results in this paper do seem to indicate that coalition governments can be better for growth than majority governments, and that re-election of the incumbent through majority governments is bad for growth. 
It is only in the last 20 years that the Congress has had to depend on coalitions, and regional parties have risen at the state level. India’s faster growth after the crisis of 1991 might then in part be associated with the rise of coalitions and regional politics. However, there is still reason to be apprehensive that the rise of regional politics may negatively impact growth in India. Coalitions in which there is one major party and several smaller
parties might be good for growth, but more equal power division between several different parties is conclusively bad for growth, whether at the state or national level. If the next general elections usher in a greater split of power at the Center, between not just national but also regional parties, the outlook for India is not promising. The growth slowdown that India is witnessing could take a turn for the worse, with more veto players, greater chance of deadlock, and a rise in distributional coalitions. The findings in this paper are unclear about the relation between fiscal prudence and political fragmentation. They indicate on one hand that the rise of regional parties and opposition parties such as the BJP should be associated with a rise in social spending, and on the other that they should be related to a reduction in debt, both of which have indeed been the case over the past decade (World Bank, Data Indicators). A rise in social spending could also be harmful in ways other than fiscal imprudence; it could indicate increased financial allocation away from economic services, such as transportation and communication, which foster growth. Greater focus on distributing the pie at the cost of increasing its size can be injurious if it affects long-term growth prospects, especially considering the extent of poverty in India. A reduction in outstanding liabilities associated with coalitions could also be detrimental to India’s growth prospects. Although tempting, a decline in debt cannot be taken as a positive development at face value. It is the composition of the debt that matters. If the decrease in liabilities is because of a reduction in ‘good debt,’ meaning debt that fosters long-term growth and investment, then that is yet another factor India has to worry about. There is very likely some interplay or reverse causality between liabilities, social spending, and growth that has not been adequately explored.
More research is needed to establish the relationship between these factors in the case of India. One reason that the predicted associations with liabilities and social spending can already be seen may be that the latter two are more directly and quickly impacted by government decisions. Growth, by contrast, depends on numerous other factors, which could be why a longer timeframe is needed to observe the results suggested in
this paper. A continuation of the ideas presented in this study over a longer timeframe could be beneficial in more clearly establishing relationships. Future research also needs to experiment with different time periods in which a government’s performance is studied. This study takes data on state governments and then draws inferences for central governments; future work could look at the relationship between voters’ preferences in state and general elections or undertake similar studies targeted at the central government. A more nuanced understanding of the socioeconomic changes ushered in by political fragmentation will not just aid voters and their political representatives in making more informed decisions, but might also shed light on the variance in national and state development outcomes.
Does the Implementation of Affirmative Action in a Competitive Setting Incentivize Underrepresented Public School Applicants’ Performance? Evidence from São Paulo
Dounia Saeme, University of California, Berkeley1
Abstract. In 2011, the Federal University of São Carlos (UFSCar) in São Paulo, the most populous state in Brazil, created a 40% quota for black and public school applicants. This study investigates whether the introduction of affirmative action at the university level creates an incentive for the targeted underrepresented applicants to perform better on their qualifying exams in a state where public universities admit one out of 25 students on average. Using data provided by the standard Brazilian entrance exam (ENEM) and its mandatory socioeconomic survey from 2010 and 2011, I employ a difference-in-differences (DID) methodology in order to exploit the characteristics of this quasi-experiment. I use the favored group’s counterparts from comparable states in Brazil that had not introduced any type of affirmative action during those years as a comparison group. I find that, on average, black students from public schools in São Paulo scored 1.54% higher on the ENEM as a result of the introduction of quotas in UFSCar admissions, and the scores of public school students (unconditional on race) in São Paulo were 1.16% higher on average. I find no change among private school test-takers.
Keywords: affirmative action, Brazil, difference-in-differences
1 Dounia Saeme is a senior double-majoring in applied mathematics and economics at the University of California, Berkeley. She wrote this honors senior thesis for Professor David Card. The author is very grateful for Professor Card’s helpful comments and useful discussion throughout her research. She would also like to thank Mikkel Sølvsten for his advice on her analysis.
1
Introduction
In August 2012, the Brazilian government enacted one of the Western hemisphere’s most sweeping affirmative action laws, requiring public universities to reserve half of their admission spots for public school students, who primarily come from lower income groups. This vastly increased the number of students of African descent in universities across the country. This drastic measure, aimed at restoring equal opportunity for all Brazilian children, has provoked heated debates in academic, political, and public spheres. Some claim that these aggressive quotas will generate adverse incentives for the accumulation of human capital by benefiting a lower-performing, poorer segment of the population. Others believe that, with this large reduction in the marginal cost of education, low-income and minority students will finally have the opportunity to succeed in Brazilian society and will perform just as well as their private school counterparts. Since the 1960s, numerous countries have adopted affirmative action policies as a way to improve skill acquisition and human capital accumulation among minority groups (Sowell 2004). The importance of Proposition 209 in the United States, which after 1996 prohibited the University of California from using affirmative action in admissions decisions, demonstrates the pervasive and controversial nature of affirmative action.2 Consequently, there is a vast literature on affirmative action that delivers insightful findings and theories on the important characteristics of the affected minority population (Milgrom & Oster 1987, Card 2001, Lang 1993). Analyzing how targeted and non-targeted groups are both affected by affirmative action is key to understanding the impact of the Brazilian policy. Fryer and Loury (2005) argue that “confident a priori assertions about how affirmative action affects incentives are unfounded.
Indeed, economic theory provides little guidance on what is ultimately a subtle and context-dependent empirical question.” In light of the Brazilian debate, I will examine the introduction of a quota system that benefits black and public school students in the admission procedure of the Federal University of São Carlos (UFSCar) in São Paulo, evaluating its incentive effect on applicants’ performance on the national Brazilian entrance exam. Affirmative action was first introduced in Brazilian federal universities in 2002, but São Carlos was the only university that introduced quotas in 2011. I will investigate whether affirmative action enhances or undercuts incentives to perform well on the entrance exam. I document the impact of this quota system on test performance through Exame Nacional do Ensino Médio (ENEM), or National High School Exam, survey data from 2009 and 2010. I employ a difference-in-differences (DID) methodology, exploiting the characteristics of this quasi-experiment to compare the performance of the students favored by the policy in the state of São Paulo to the performance of similar students in comparable states that did not introduce any type of affirmative action in those years. Because the number of students who leave their home states to attend an undergraduate program in Brazil is very low, this study assumes that students in other Brazilian states are not affected by quotas implemented in São Paulo. I find that, in São Paulo, the test scores of black students and public school students were 1.4% and 1.16% higher, respectively, as a consequence of the introduction of these quotas. With an (unconditional) ENEM test score gap of approximately 15% between public school students and private school students in São Paulo, a 1.16% increase in the performance of public school students indicates approximately an 8% closing of this gap. The group most affected was black applicants, and this pattern is reflected within public school applicants.
2 Proposition 209, approved in November 1996, is an amendment to the California state constitution that prohibits state government institutions from discriminating on the basis of race, sex, or ethnicity when making decisions about public employment, public contracting, or public education.
Conditional on having been schooled in public establishments, the ENEM score of white test-takers increased by 0.83%, the ENEM score of pardo (brown-skinned) test-takers increased by 1.27%, and the ENEM score of black test-takers increased by 1.54%. This pattern reflects the effect desired by the University of São Carlos: incentivizing and providing higher education to social groups that are underrepresented in the state’s federal universities. These results must be interpreted carefully. São Paulo is different from the rest of Brazil on many levels. First, only 2.8%
of its more than 10 million blacks have a university diploma (Pnad/Instituto Brasileiro de Geografia e Estatística (IBGE) 2001). In addition, public universities are extremely competitive; they receive 25 applications per available seat. Second, UFSCar is one of the only two universities in the state that uses the ENEM, as São Paulo universities resist adhering to the system, while other states embraced this unified exam in early 2000. I also consider the limitations of attributing these results to UFSCar’s quota implementation. I discuss how I would verify whether my main results are robust to a series of potential problems, including some of the usual concerns in DID models. First, I consider preexistent differential trends prior to the introduction of the quotas. I discuss the limitations of incorporating data from previous and following years into my analysis. Second, I address the size of the treatment considered and potential omitted variables and confounding effects from possible changes in state-level variables that might be correlated with the implementation of quota systems. I show that there were no effects for private school students who should have been equally affected by the state-level variables but less affected by the quotas. Third, I consider another potential pitfall due to my reliance on self-reported racial information. Although Francis and Tannuri-Pianto (2012) suggest that students might change their self-description under the quota system, it is highly unlikely in the context of this study, as students applying to universities could benefit from the quota only when their identity was verified upon admission. Finally, I consider the possibility that my DID standard errors are underestimated given the potential intra-state and serial correlation of the residuals. If this were the case, my statistical inferences would be invalidated.
As suggested by Bertrand, Duflo, and Mullainathan (2004), I consider the possibility of relying on robust standard errors clustered at the school level as well as an alternative statistical inference procedure for our main results that would be robust to intra-state correlation in residuals. This paper is organized as follows: Section 2 presents background information on the Brazilian educational system and affirmative action, followed by an introduction to admissions and affirmative action at the Federal University of São Carlos. Section 3 provides a literature review that explores the potential outcomes
of such a policy and ends with relevant findings about Brazil. Section 4 describes the data, provides summary statistics, and explains the choice of the comparison group. Section 5 reports my methodology. Section 6 presents the main empirical results. Section 7 considers further specification tests that I would like to carry out in order to verify the robustness of my findings and broaden the scope of this research project.
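The difference-in-differences comparison described in this introduction, and the reading of the 1.16% estimate against the 15% score gap, can be sketched in a few lines of Python. This is only a simplified illustration using group means: the estimates reported in this paper come from the author's own specification, and the scores below are invented for the example.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group's mean minus change in the comparison group's mean."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


def share_of_gap_closed(effect_pct, gap_pct):
    """Estimated effect expressed as a fraction of the pre-existing score gap."""
    return effect_pct / gap_pct


# Invented scores (not ENEM data): São Paulo is treated, other states are the comparison.
sp_2009, sp_2010 = [500.0, 520.0, 540.0], [512.0, 530.0, 548.0]
other_2009, other_2010 = [490.0, 510.0, 530.0], [494.0, 514.0, 534.0]

effect = did_estimate(sp_2009, sp_2010, other_2009, other_2010)

# Figures quoted in the text: a 1.16% effect against a gap of roughly 15%.
gap_share = share_of_gap_closed(1.16, 15.0)

print(effect, round(gap_share, 3))
```

Subtracting the comparison group's change removes any shift common to both groups, such as a change in exam difficulty between the two years.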
2
Background Information
While Brazil is known for its racial diversity, as the country received the greatest number of slaves during the Trans-Atlantic Slave trade (Eltis 2001), it is also notorious for racial inequality. Today, about half of the population is white, 44.2% is pardo, and 6.9% is black (IBGE 2010). In addition, the majority of black Brazilians are impoverished and attend public schools. Although pardos and blacks represent 50% of the Brazilian population, they account for almost 75% of underperforming poor students (Stahlberg 2010). Inequality in education translates into income inequality: Blacks and pardos represent 73% of the poor, and only 12% of the rich.3
2.1
Educational System in Brazil
The Brazilian educational system is split into two levels: basic education and higher education. Basic education has three stages: infantile education, from 0 to 6 years old; fundamental education, which is mandatory, free, and lasts at least 8 years; and middle school, which lasts from 3 to 4 years. The defeat of the Brazilian socialist movement in 1964 marked the beginning of the stagnation of the public higher education system, and, not coincidentally, the growth of private institutions throughout basic and higher levels.4 Brazil was ruled by a military regime for two decades after 1964, and successive administrations continually disregarded the educational system so that by 1990, the federal government provided higher education for a mere 19% of students, whereas in 1984 it had provided for 40% (Brasil 1999).5 Meanwhile the private sector, which already provided services to 59% of students in 1985, continued to expand in order to satisfy the needs of 62% of students in 1998 (Brasil 1999). However, while the expansion of private education sustained the provision of high quality fundamental and middle school education, the same could not be said about private universities. Private universities are often unable to match the quality offered by federal universities because of the high fixed cost of higher education. The growth of the private school sector caused free public schools in Brazil to decrease in quality. The competitiveness of the public university entrance exam and the lack of expansion of the public university system motivated upper and middle class families to demand high-quality schools in order to prepare their students for the exam. Because the college admission process in Brazil considers only test scores and leaves personal information and background unknown, there is little chance that admission officials discriminate based on race. But the process leads to discrimination based on economic status. As poorer students cannot afford the higher quality of education provided at private schools, they tend to not perform as well on the college admission exams as students with access to private education. Even as early as the mid-1970s, some portion of Brazilian society, mainly comprised of middle-class black students, was feeling the effect of these movements. As Santos (1985) writes, in order to obtain a higher education, young black students have to turn to the private institutions that offer diplomas with less value in the job market. The Brazilian education literature blames the high cost of acquiring qualified academic faculty and financing scientific research for the failure of private higher education institutions to produce high-quality education. But this alternative merely accentuates the restrictions placed upon these populations by Brazil’s education system.
3 A 2007 study by the Brazilian Institute of Geography and Statistics (IBGE) found that white workers received an average monthly income almost twice that of blacks and pardos. Blacks and pardos earned on average 1.8 times the minimum wage, while whites had a yield of 3.4 times the minimum wage.
4 The growth pattern of the private education sector and the recession of the public universities are analyzed by Cunha (1986). On the other hand, Barros, Henriques, and Mendonca (2001) analyze international data and come to the conclusion that “between the 60s and 80s, the Brazilian educational system expanded at a much slower rate than the corresponding international mean.”
5 Mainly the administrations led by José Sarney, Fernando Collor de Mello, Itamar Franco, and Fernando Henrique Cardoso.
2.2
Affirmative Action and Racism in Brazil
Throughout the 1990s, the public university acceptance rate of public school applicants remained stable at approximately 33.8% of the entering class (Peixoto 2000). The Ministry of Education reports that, of the 54.9 million students enrolled in the public basic education system at the time, 87.6% attended public universities. While the concept of affirmative action was introduced at the start of the 1980s, it was not until 2000 that Brazilian public universities began to use racial quotas to influence their admission policies. The first law treating affirmative action specifically was approved by the state of Rio de Janeiro, which established that 50% of state university admissions would be reserved for public school applicants starting in 2003. The following year, in the same state, the law changed to guarantee 40% of its seats for pardos and black students. That same year, the state of Bahia matched this guarantee for the two groups in its public universities. Since then, many schools in other states have adopted some form of affirmative action. In 2004, the total number of spots reserved for minorities was only 3.1% in 9 states, while in 2008 this number went up to 11.2% in 21 states.
2.3
Admissions and Affirmative Action at UFSCar
My analysis focuses on the Federal University of São Carlos, a public research university located in São Carlos in the state of São Paulo. UFSCar is located in a rural area, with four campuses spread across the state’s countryside. It has approximately 14,000 students and 1,000 professors and researchers. Its researchers are Brazil’s fourth most productive in terms of the quantity of articles published in indexed international journals of science. In 1994, bucking the national trend, almost half of UFSCar’s admitted students came from public schools. This number, however, decreased over time. In 2005, 80% of admitted students came from private schools. Similarly, while 35% of the population
of Brazil’s southeast region (IBGE 2001) is black or pardo, UFSCar’s 2005 entering class had less than 14% of students who are black or pardo. UFSCar took action to adjust this disproportion by maintaining a 20% quota for students from public schools. UFSCar has accepted 2,577 students every year since 2009. There were 40,547 applicants in 2010 (pre-quota) and 71,439 applicants in 2011 (post-quota). In 2011, the university implemented a more drastic measure by reserving 40% of its seats for students who were educated exclusively in public institutions. Of that percentage, 35% were reserved for black students. This last quota will be the main focus of my analysis.
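For a sense of scale, the figures in this subsection can be combined in a short Python calculation. It is only a back-of-the-envelope sketch: it reads "of that percentage, 35%" as 35% of the reserved seats, and it ignores however the university rounds seat counts in practice.

```python
# Figures quoted in the text above.
SEATS_PER_YEAR = 2577
APPLICANTS_2010 = 40547   # pre-quota
APPLICANTS_2011 = 71439   # post-quota

# Assumed reading: 40% of all seats are reserved, and 35% of those reserved
# seats go to black students.
reserved_seats = 0.40 * SEATS_PER_YEAR
reserved_for_black_students = 0.35 * reserved_seats

applicants_per_seat_2010 = APPLICANTS_2010 / SEATS_PER_YEAR
applicants_per_seat_2011 = APPLICANTS_2011 / SEATS_PER_YEAR
applicant_growth = APPLICANTS_2011 / APPLICANTS_2010

print(round(reserved_seats, 1), round(reserved_for_black_students, 1))
print(round(applicants_per_seat_2010, 1), round(applicants_per_seat_2011, 1))
print(round(applicant_growth, 2))
```

On this reading, roughly a thousand seats are reserved each year, while the number of applicants per seat rises sharply between the two admission cycles.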
3
The Effects of the Introduction of Quotas on Student Performance
3.1
Theoretical Channel
There are many mechanisms through which quotas can affect students’ performance on the public university entrance exam. First, market imperfections can affect access to universities. Specifically, liquidity constraints may prevent access to universities for minorities, who are usually overrepresented in the poorest part of the population. Andrade (2004), for example, builds a theoretical model in order to study how quotas affect the economic efficiency of Brazilian society from the perspective of total expenditure (government and households), considering the coexistence of public and private universities. Starting from the assumption that basic education is available and equally enjoyed by all, he shows that, depending on the difference in quality between public and private institutions and the size of the liquidity constraint faced by beneficiaries of the quotas, there can be an increase in the efficiency of total (public and private) investment. These findings are relevant in the Brazilian context given the lower level of public basic education relative to private education. If public school students were already giving their best effort, meaning that the gap between the scores of public and private schools would be purely due to the difference in quality of
education, no increase in performance should be visible. On the other hand, with the implementation of 40% quotas at UFSCar, qualified public school students who previously would not have applied (because in the past they did not have the means to pay or considered public universities too competitive) might now find it worthwhile to apply, increasing the mean score of the pool of public school applicants. A second possible factor that may discourage these otherwise potentially higher-performing students is their anticipation of future discrimination in the job market, in which case minority students might be less motivated to accumulate human capital during their academic career (Lundberg & Startz 1983, Milgrom & Oster 1987, Lundberg & Startz 1998). In this case, quotas can alter minorities’ beliefs and affect their investment decisions. Models of race-based cultural norms (Ogbu & Fordham 1986, Ogbu 2003) assert that black children have lower norms of achievement than otherwise similar white children. This discrepancy could be due to a lack of opportunities given and then expected by black students over time. In either case, quotas could increase opportunity, and this opportunity could trigger a shift in realistic norms of achievement for minority students. Finally, since the Brazilian selection process is based solely on a seemingly objective exam grade, perhaps quotas can improve the selection efficiency of the exam. An efficient selection process would select qualified students from diverse backgrounds since test scores provide no information about an applicant’s qualitative characteristics. This could have a mixed effect on entrance exam performance. There might also be a mixed effect on effort. According to Coate and Loury (1993), the effort level may decrease in the presence of quotas and thus diminish the incentives for investment in human capital. Specifically addressing the issue of the cost of the effort, a 1987 study by Bull et al.
observes the behavior of individuals in tournaments where the cost of the effort to achieve a certain goal is different. The results show that the behavior of individuals is dependent on the size of the asymmetry of cost and effort. In general, individuals who face higher costs demonstrate less effort than others. Given the high competition at UFSCar, it is interesting to consider whether the quota changed applicants’ beliefs regarding their cost of effort.
3.2
Previous Evidence on the Introduction of Quotas in Brazil
Although the total number of spots reserved for minorities was 11.2% in 21 states by 2008, only the most prominent cases (those of Rio de Janeiro in 2000, Bahia in 2003, and Brasilia in 2004) have received serious empirical analysis. These cases seem to show that students who receive special treatment, such as preferred admission, may perform worse than before the policy was implemented (D’Souza 1991, Murray 1994). Francis and Tannuri-Pianto (2009) show that this difference may be small. Studying the University of Brasilia (UNB) applicants accepted under the quota system, the authors estimate that the differential performance of the favored students compared to the unfavored students is only 20% of the standard deviation of their standardized scores. On the other hand, Ferman and Asuncion (2006) say the data provided by the national evaluation exam show that the adoption of racial and socioeconomic quotas in the state universities of Rio de Janeiro and Bahia actually reduced incentives for high school students. However, Francis and Tannuri-Pianto (2009) argue that the conclusions of this study are unreliable since it is not possible to identify those who actually paid for the public university entrance exam. In this paper I will focus on ENEM exam scores used by UFSCar. In 2009, the ENEM was already used by 42 of 55 federal universities in the country. This unique and rich dataset will shed light on the controversial empirical results presented above.
4 Data
4.1 The ENEM
To be admitted to a public university in Brazil, a student must pass an admission test called vestibular. Each university offers its own vestibular. Until 2009, some universities also considered the ENEM as part of their selection process, but these were isolated cases. My empirical analysis relies on the ENEM micro-dataset. ENEM data provide complete test information for over four million test-takers for the years of 2009-2010, as well as a
mandatory socioeconomic survey providing family background characteristics and high school identifiers for students applying to public universities. The 40% quota at UFSCar was introduced in 2011. Because the exam precedes admission by one year, UFSCar used the 2010 ENEM results to select the entering class of 2011, just as it used the 2009 exam to select the entering class of 2010. All observations made in 2009 therefore refer to the 2010 application process (pre-quota) and all 2010 observations refer to the 2011 application process (post-quota). In 2009 the ENEM was methodologically reformulated in order to standardize the admissions process for federal universities. In 2009, according to the Ministry of Education, 541 of the 2,252 higher education institutions in Brazil used the ENEM score, either as a unique or partial selection criterion. Of these, 42 were public universities. Universities can use the ENEM in several ways: to allocate a percentage of the vacancies to ENEM test-takers, as a unique selection process, as the first phase of admission, to supplement applicant data, or as part of the entrance exam score. It is important to note that the ENEM is open to anyone who wants to take it. For example, some students use it to apply for a ProUni scholarship to attend a private institute of higher education, while others use it as an evaluation of their capabilities when applying for jobs. The dataset does not specify which students applied to which university. To address this limitation, my analysis relies on students who reported that they took the ENEM in order to apply to a university and who obtained a score greater than 0 (schools will not accept a score of 0 in one of the subjects). In addition, the different rates of growth of ProUni scholarships in different regions present a potential omitted variable bias that must be accounted for. I attempt to examine the validity of my findings given this constraint in Section 7.
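The sample restriction described above can be sketched in a few lines. This is only an illustration: the record layout and field names (`reason_university`, `section_scores`) are hypothetical and are not the column names of the ENEM microdata, and the score condition is read here as requiring a positive score in every section, which is one interpretation of the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TestTaker:
    # Hypothetical fields, chosen for illustration only.
    reason_university: bool            # reported taking the ENEM to apply to a university
    section_scores: Tuple[float, ...]  # one score per exam section


def in_analysis_sample(t: TestTaker) -> bool:
    """Keep university-motivated test-takers with a positive score in every section."""
    return t.reason_university and all(s > 0 for s in t.section_scores)


# Made-up records, not real data.
records = [
    TestTaker(True, (520.0, 480.0, 510.0)),   # kept
    TestTaker(True, (520.0, 0.0, 510.0)),     # dropped: a zero in one section
    TestTaker(False, (600.0, 590.0, 610.0)),  # dropped: not applying to a university
]
sample = [t for t in records if in_analysis_sample(t)]
```

Applying both conditions jointly keeps only the first made-up record.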
The ENEM evaluates students in natural science, human sciences, Portuguese, mathematics, critical thinking, and essay writing. The proficiency measure is presumably comparable over time, as it is calibrated using “item response theory” methodology. Unlike simpler alternatives for creating scales, this methodology does not assume that each item is equally difficult and treats the
difficulty of each item as information to be incorporated in scaling items. My analysis is based on the cumulative score of these six sections. The 2009 survey contains a wide variety of information on student and school characteristics that are, unfortunately, only partially replicated in 2010. Taking this into consideration, the control variables used are gender, age, household size, an indicator for rural schools, and parent schooling.
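To make the item response theory idea concrete, the sketch below uses a two-parameter logistic item curve. This is an assumption made for illustration: the text does not say which IRT model the ENEM calibration uses, and the function and parameter names are hypothetical.

```python
import math


def p_correct(theta: float, difficulty: float, discrimination: float = 1.0) -> float:
    """Probability that a test-taker with ability `theta` answers an item correctly
    under a two-parameter logistic item response function (illustrative assumption)."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Same ability, two items of different difficulty: the harder item is less
# likely to be answered correctly, so the two items carry different information.
easy_item = p_correct(theta=0.0, difficulty=-1.0)
hard_item = p_correct(theta=0.0, difficulty=1.0)
```

A raw count of correct answers would treat the two items identically, whereas a scale built from such curves weights each answer by the estimated difficulty of its item.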
4.2 Summary Statistics
In selecting the comparison group for São Paulo, two constraints had to be taken into consideration. First, the states being compared needed to have universities that used the ENEM consistently in both the 2010 and 2011 selection processes. The number of seats offered varies slightly in the compared universities, but as there is no significant change, my results should not be skewed (Brazilian Ministry of Education). Second, the demographics of the comparison group had to be comparable to São Paulo’s. São Paulo is the economic capital of Brazil, and as a consequence it is wealthier and has a larger white population than the rest of the country. It follows that schooling levels are also higher. Two other states in the southeast subdivision of Brazil, Minas Gerais and Rio de Janeiro, are comparable to São Paulo in wealth and education level despite the lower percentage of whites and larger percentage of pardos. My comparison group also includes Rio Grande do Sul, the southernmost state of Brazil, which is wealthy and has a large white population but has lower-quality education. As can be seen in Table 1, treatment and comparison groups are broadly comparable at baseline, with the exception that São Paulo is 80% white and 12% pardo while the control group is 70% white and 22% pardo. I consider this constraint in Section 7 but will assume until then that this characteristic does not play a key role. In addition, according to Telles (2004) and Magnoli (2008), self-reporting of pardos is not entirely reliable as it depends on whether or not people consider themselves as such. At first glance, I note from columns A and C of Table 1 that, at the baseline, the score gap between private schools and public schools in our treatment group (São Paulo) is 404.78 points, while
the control group has a substantially smaller gap of 349.37 points. We would expect the implementation of a quota in São Paulo to lead to an improvement in public school performance relative to the control group.6 This result can be seen in the post-quota score gap, found using columns E and G. The score gap after the implementation of the quota becomes 383.45 points for São Paulo and 370.58 points for the control group. Therefore, the score gap in São Paulo narrowed, while the score gap in the control group widened.

Table 1: Summary Statistics

                                       Pre-Quota                               Post-Quota
                             São Paulo           Comparison          São Paulo           Comparison
                              Mean      S.D.      Mean      S.D.      Mean      S.D.      Mean      S.D.
                               (A)       (B)       (C)       (D)       (E)       (F)       (G)       (H)
Private School Applicants
  N                          36373               46376               47828               61515
  Fraction Female             0.56      0.50      0.58      0.49      0.55      0.50      0.56      0.50
  Fraction White              0.80      0.40      0.70      0.46      0.82      0.38      0.71      0.45
  Fraction Brown              0.12      0.32      0.22      0.41      0.09      0.29      0.18      0.39
  Fraction Black              0.02      0.15      0.05      0.23      0.02      0.15      0.05      0.23
  Mean Age                    18.5      2.79      18.6      2.10      17.7      3.18     17.91      3.05
  Mean Mother Schooling       3.58      0.99      3.52      1.08      3.51      1.02      3.40      1.09
  Mean Father Schooling       3.49      1.10      3.29      1.21      3.40      1.14      3.19      1.21
  Mean Household Size         3.87      0.95      3.73      1.01         –         –         –         –
  ENEM Test Score
    All Students           2980.92    518.96   2950.52    494.25   2981.26    464.22   2976.98    463.01
    White                  2998.32    512.81   2988.28    491.70   2992.81    457.85   3005.97    454.52
    Mulatto                2868.15    526.59   2890.37    487.39   2856.55    487.57   2901.02    464.73
    Black                  2780.68    532.44   2782.40    472.06   2815.03    460.24   2815.98    476.53
Public School Applicants
  N                         308680              416022              459357              697697
  Fraction Female             0.59      0.49      0.62      0.49      0.58      0.49      0.60      0.49
  Fraction White              0.58      0.49      0.50      0.50      0.62      0.49      0.52      0.50
  Fraction Brown              0.29      0.45      0.34      0.47      0.25      0.44      0.30      0.46
  Fraction Black              0.09      0.29      0.12      0.33      0.09      0.28      0.12      0.33
  Mean Age                   23.42      7.31     23.43      7.26      22.3      7.70      23.1      8.02
  Mean Mother Schooling       2.22      1.23      2.22      1.26      2.25      1.20      2.18      1.21
  Mean Father Schooling       2.06      1.28      1.94      1.25      2.08      1.25      1.92      1.22
  Mean Household Size         3.89      1.23      3.81      1.24         –         –         –         –
  ENEM Test Score
    All Students           2576.14    563.60   2601.15    526.70   2597.76    535.47   2606.40    527.86
    White                  2635.37    571.57   2658.82    532.49   2643.41    535.45   2647.20    529.88
    Mulatto                2486.21    533.44   2558.53    516.19   2506.68    514.92   2570.36    515.97
    Black                  2461.40    522.91   2515.30    501.41   2480.54    511.53   2520.39    517.28

Note: Unfortunately the household size measure is different on the 2010 ENEM. Broad adjustments were made so that it could be added as a control in the regression, but it is not as informative.
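The score gaps quoted above can be recomputed directly from the "All Students" mean scores in columns A, C, E, and G of Table 1. The short sketch below does this and also takes the raw difference in the two changes; the dictionary keys and variable names are illustrative choices, not part of the original analysis.

```python
# Mean ENEM scores for "All Students", copied from Table 1 (columns A, C, E, G).
MEANS = {
    ("sao_paulo", "pre"): {"private": 2980.92, "public": 2576.14},
    ("comparison", "pre"): {"private": 2950.52, "public": 2601.15},
    ("sao_paulo", "post"): {"private": 2981.26, "public": 2597.76},
    ("comparison", "post"): {"private": 2976.98, "public": 2606.40},
}


def gap(group: str, period: str) -> float:
    """Private-minus-public gap in mean score, rounded to two decimals."""
    cell = MEANS[(group, period)]
    return round(cell["private"] - cell["public"], 2)


# Change in the gap within each group, then the difference between the changes.
sp_change = round(gap("sao_paulo", "post") - gap("sao_paulo", "pre"), 2)
cmp_change = round(gap("comparison", "post") - gap("comparison", "pre"), 2)
raw_did = round(sp_change - cmp_change, 2)
```

With these inputs the pre-quota gaps match the text (404.78 and 349.37), as does the post-quota comparison gap (370.58). The São Paulo post-quota gap computed from the table means is 383.50 rather than the 383.45 quoted in the text, a small difference the source does not explain. On these means the gap narrows by about 21 points in São Paulo and widens by about 21 points in the comparison group.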
5 Methodology
To identify the impact of quota systems on the performance of favored applicants on the ENEM, I use a difference-in-differences (DID) framework to compare the difference in performance between the treatment and comparison groups after the quotas were implemented (in 2011) with the same difference before the quotas were implemented (in 2010). The basic DID estimate of the quota’s effect on the performance of favored students is obtained from the following least squares regression:

$$\ln(y_i) = c + a \cdot d_i^{2011} + b \cdot d_i^{Treat} + g \cdot d_i^{2011} \cdot d_i^{Treat} + s' X_i + e_i$$

where $i$ indexes the students in the sample, which is pooled for the exam years 2009 and 2010; $y_i$ refers to the proficiency variable; $d_i^{2011}$ indicates whether student $i$ took the exam in 2010 (the 2011 application process) rather than in 2009; $d_i^{Treat}$ indicates whether student $i$ belongs to the treatment group; $X_i$ is a vector of student characteristics that are broadly divided into demographic characteristics, parental education and school municipality; and $e_i$ reflects unobserved variables that affect students’ proficiency. Different pairs of treatment and comparison groups are considered in the next section. The coefficient of interest is $g$, the coefficient on the interaction between $d_i^{2011}$ and $d_i^{Treat}$, which can be interpreted as the average impact of the treatment on the treated: the percentage variation in the performance of favored students due to the introduction of the quota system.

6 According to the Brazilian annual household survey (PNAD), 15% of the undergraduate students in Brasilia are originally from another state. The corresponding figure is only 5% in São Paulo. Therefore, it is reasonable to assume that favored students in Brasilia faced stronger competition for the reserved spots than black students in Rio de Janeiro.
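As a minimal sketch of the estimator, consider the specification without the control vector. In that case the least squares coefficient on the interaction term equals the difference in differences of the four group-by-period means of the log score, which can be computed directly. The function name, the row layout, and the toy scores below are illustrative assumptions, not the data or code used in the paper.

```python
import math
from statistics import mean


def did_log_points(rows):
    """Difference-in-differences of mean log scores.

    rows: iterable of (score, post, treat) with post and treat in {0, 1}.
    Without controls, this equals the OLS coefficient on post * treat.
    """
    cells = {}
    for score, post, treat in rows:
        cells.setdefault((post, treat), []).append(math.log(score))
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(0, 1)]) - (m[(1, 0)] - m[(0, 0)])


# Toy data: treated scores rise by 2% while comparison scores stay flat.
toy_rows = [
    (2500.0, 0, 1), (2600.0, 0, 1),   # treatment group, pre-quota
    (2550.0, 1, 1), (2652.0, 1, 1),   # treatment group, post-quota
    (2600.0, 0, 0), (2700.0, 0, 0),   # comparison group, pre-quota
    (2600.0, 1, 0), (2700.0, 1, 0),   # comparison group, post-quota
]
g_hat = did_log_points(toy_rows)
```

On the toy data the estimate equals ln(1.02), roughly 0.0198 log points, which reads as an increase of about 2%. Adding the control vector requires a full regression and is not covered by this shortcut.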
6 Results
I first present my DID estimates for the most affected groups: black students and students who were exclusively educated in public schools. While UFSCar’s quota was limited to public school applicants who had only attended public schools, there was no restriction on black students; any black student was eligible. Therefore I begin by estimating the quota effect on all black test-takers, followed by the effect on all test-takers who were schooled in public institutions. I then estimate the DID for different races within the pool of public school applicants who attended public school only. Finally, I look at the DID estimate for São Paulo’s private school applicants who might have been negatively impacted by the implementation of this quota system. The models in columns E through H of Table 2 present estimates of the DID equation using black test-takers as the treatment group and their counterparts in comparison states as the control. In column E, no demographic, parental education, or school municipality control variables were included. The estimated effect of being favored by the system of quotas is a 1.11% increase in test score (significant at the 1% level). These results reflect the difference between the mean test score of the treatment and comparison groups after quotas were implemented, compared to the same difference before these quotas were implemented. These results must be analyzed carefully because they may reflect changes in the composition of the groups or changes in factors other than the quota incentive. Column F of Table 2 presents the same regression but includes a vector of student characteristics (age, gender, household size, rural/urban indicator). Controlling for student characteristics does not significantly change the estimated effect of being favored by the quota: the estimated effect is a 1.13% improvement in test score (significant at the 1% level).
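Because the dependent variable is the log of the test score, a coefficient is strictly a change in log points, and reading it as a percentage is an approximation that works well for small values. The sketch below converts a coefficient to an exact percentage change, assuming the reported coefficients are expressed in units of 100 times log points; that scaling is an assumption, since the text reads the coefficients directly as percentages, and the function name is hypothetical.

```python
import math


def pct_change_from_coef(coef_times_100: float) -> float:
    """Exact percentage change implied by a log-outcome coefficient.

    Assumes the coefficient is reported as 100 * (log points).
    """
    return 100.0 * math.expm1(coef_times_100 / 100.0)


# Column E and column H interaction coefficients quoted in the text.
effect_col_e = pct_change_from_coef(1.11)
effect_col_h = pct_change_from_coef(1.4)
```

For coefficients of this size the exact figure differs from the coefficient only in the second decimal place (about 1.12% for a coefficient of 1.11), so the percentage reading used in the text is a close approximation.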
Table 2: Effect of Quota System on Public School and Black Applicants

Dependent Variable is ln(test score)
Columns (A)-(D): Treatment Group: Applicants from São Paulo’s Public Schools. Comparison Group: Applicants from RS/RJ/MG’s Public Schools.
Columns (E)-(H): Treatment Group: Black Applicants. Comparison Group: Black Applicants in RS/RJ/MG.

                                     (A)       (B)       (C)       (D)       (E)       (F)       (G)       (H)
d2011 · dTreat                   1.09***   0.96***   0.93***   1.16***   1.11***   1.13***   1.13***    1.4***
                                  (0.08)    (0.08)    (0.08)    (0.12)    (0.23)    (0.23)    (0.23)    (0.37)
d2011                            0.39***   0.64***   0.78***   1.48***    0.5***   0.73***   0.78***   1.45***
                                  (0.05)    (0.05)    (0.05)    (0.08)    (0.13)    (0.14)    (0.14)    (0.22)
dTreat                          -1.97***  -1.65***   -2.1***  -3.23***  -2.55***  -2.49***  -2.52***   3.56***
                                  (0.06)    (0.06)    (0.08)    (0.09)    (0.18)    (0.18)    (0.18)    (0.28)
Control Variables
  Demog. Charact.                      N         Y         Y         Y         N         Y         Y         Y
  Parental Education                   N         N         Y         Y         N         N         Y         Y
  School Municipality                  N         N         N         Y         N         N         N         Y
N                                1874126   1869524   1863125    707205    247763    246788    245736     81884
Adjusted R2                        0.023     0.031     0.107     0.115     0.015      0.55     0.102     0.116
p-value, d2011 · dTreat = 0         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

*, **, *** indicate statistical significance at the 10%, 5%, and 1% level, respectively.

Table 3: Effect of Quota System on Public School Applicants by Race

Dependent Variable is ln(test score)
Treatment Group: Applicants who were schooled in São Paulo’s Public Schools
Comparison Group: Applicants who were schooled in RS/RJ/MG Public Schools

                                   White    Pardos     Black
                                     (A)       (B)       (C)
d2011 · dTreat                   0.83***   1.27***   1.54***
                                  (0.16)    (0.22)     (0.4)
d2011                            1.54***   1.46***   0.97***
                                  (0.11)    (0.14)    (0.24)
dTreat                          -0.33***  -3.97***  -3.53***
                                  (0.12)    (0.16)     (0.3)
Control Variables
  Demographic Characteristics          Y         Y         Y
  Parental Education                   Y         Y         Y
  School Municipality                  Y         Y         Y
N                                 383101    230345     69761
Adjusted R2                        0.137     0.138     0.137
p-value, d2011 · dTreat = 0         0.00      0.00      0.00

*, **, *** indicate statistical significance at the 10%, 5%, and 1% level, respectively.

Table 4: Effect of Quota System on Private School Applicants by Race

Dependent Variable is ln(test score)
Treatment Group: Applicants who were schooled in São Paulo’s Private Schools
Comparison Group: Applicants who were schooled in Rio Grande do Sul, Rio de Janeiro, Minas Gerais Private Schools

                                     (A)       (B)
d2011 · dTreat                      0.28         0
                                  (0.18)   (0.169)
d2011                            3.31***   2.79***
                                  (0.12)    (0.12)
dTreat                            -0.085  -0.67***
                                  (0.12)    (0.12)
Control Variables
  Demographic Characteristics          N         Y
  Parental Education                   N         Y
  School Municipality                  N         Y
N                                 257148    242048
Adjusted R2                        0.023     0.109
p-value, d2011 · dTreat = 0         0.11      0.00

*, **, *** indicate statistical significance at the 10%, 5%, and 1% level, respectively.

In column G, I add parental education level to the control vector, which does not change the size or the significance of the estimate. In column H, I account for the school
municipality, which results in the higher estimated effect of a 1.4% increase in test scores (significant at the 1% level). However, this last estimate reduced the number of observations used from 245,736 to 81,884. Perhaps the subpopulation differs from the aggregate black test-takers. One could interpret reporting of the non-mandatory high school code on the ENEM as an indicator of “overachievement” or wanting to perform well, which could increase the likelihood of students wanting to perform better on the ENEM in order to benefit from UFSCar’s quota system. The models in columns A through D of Table 2 contain estimates of the DID equation using public school test-takers as the treatment group and their counterparts in comparable states as the control. In column A, no demographic, parental education, or school municipality control variables were included. For students favored by the quota system, the estimated effect is a 1.09% increase in test score (significant at the 1% level). When adding controls in columns B and C, the effect estimated is still significant but reduced to slightly less than a 1% increase in test score. Again, we can use our “overachiever” subgroup (column D) to note a slightly larger estimate of a 1.16% increase in test score, which is 0.24 percentage points lower than the increase estimated for black test-takers in column H of Table 2. It seems that black students were slightly more affected by the quota policy. This hypothesis is further supported by the estimates presented in Table 3. In Table 3, I estimate the difference-in-differences equation for white public school test-takers (column A), pardo public school test-takers (column B), and black public school test-takers (column C). All three estimates use the full specification, although the results are also significant without controls.
We find in column A that white public school students are least incentivized by the UFSCar quota, with an estimated 0.83% increase in test scores, followed by, in column B, an estimated 1.27% increase in test scores for pardo public school students. Finally for black public school students, the most affected group, the estimated impact is a 1.54% increase in test scores. All of these figures are significant at the 1% level. Table 4 presents results without controls (column A) and with the complete specification, including all control variables (column B), for private school test-takers. Neither estimate exhibits an effect
from quotas that is significantly different from zero. We note that, in the complete specification, the number of observations is only 6% lower than in the model without controls, whereas the numbers of observations for the black test-taker and public school test-taker models decrease by 67% and 62%, respectively (Table 2). This could support the hypothesis that reporting school municipality is associated with higher scores and could be a sign of overachievement, given that private school students are more successful. Despite the significance of all the results, it is important to note that the effect of the quota estimated by the DID model is small. Black and public school students’ scores increased by approximately 1%, which could be due to factors completely unrelated to the quota implementation. In the next section, I address some of the issues already mentioned and additional potential concerns such as time trends, selection bias, omitted variables, serial correlation, and within-state correlation in the residuals, as well as the comparability of São Paulo with the rest of the country and the size of the treatment.
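The sample-loss percentages quoted above follow from the N rows of Tables 2 and 4, comparing the specification without controls to the complete specification. A small check, with illustrative variable names:

```python
# (N without controls, N with the complete specification), from Tables 2 and 4.
OBSERVATIONS = {
    "black test-takers (Table 2, E vs. H)": (247763, 81884),
    "public school test-takers (Table 2, A vs. D)": (1874126, 707205),
    "private school test-takers (Table 4, A vs. B)": (257148, 242048),
}


def pct_dropped(n_before: int, n_after: int) -> float:
    """Share of observations lost when moving to the complete specification, in percent."""
    return 100.0 * (n_before - n_after) / n_before


losses = {label: pct_dropped(*counts) for label, counts in OBSERVATIONS.items()}
```

Rounded to whole percentages, the three losses come out at 67%, 62%, and 6%, matching the figures in the text.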
7 Specification Consideration
In this section, I consider the potential concerns in attributing the estimated 1% increase in black and public school test scores to the implementation of 40% quotas at UFSCar. The first issue we should consider is whether the increase in proficiency among the favored students occurred strictly after the implementation of these quota systems, or if there was already a positive trend occurring in student performance before the implementation of the quota system. Despite the comparability established by the summary statistics in Table 1, the baseline score difference is disconcerting. It could be the case that, relative to the 2010 comparison group, São Paulo’s gap between public and private school average test scores was much larger. Therefore we cannot dismiss the possibility that São Paulo, Brazil’s economic center, has a different time trend than the rest of the country. In order to address this problem, we could estimate a DID model with data from previous ENEM years. There are two complications involved with such a procedure. First, the ENEM
became most widely used in 2009. Second, in previous years different states implemented forms of affirmative action. Such estimates could still provide trend evidence if São Paulo’s pattern was strikingly different from the rest of the country. In addition, considering ENEM data from the post-treatment years could further cement a distinctive trend if one were to be found. A second concern is the size of the treatment. UFSCar offers admission to only 2,577 students every year, and there were 40,547 applicants in 2010 (pre-quota) and 71,439 applicants in 2011 (post-quota). While the number of applicants per admissions spot nearly doubled from 15.73 to 27.72, it is unlikely that all 507,185 ENEM test-takers in the state of São Paulo intended to apply to UFSCar when they decided to take the test initially. This leads us to consider the potential for omitted variable bias. Also, we should consider the fact that the ENEM became prominent in 2009. Perhaps the growing popularity of the exam combined with ProUni scholarship opportunities is what drove the results, but this remains difficult to analyze because of the lack of ProUni data. Another possible confounding effect is that the system of quotas in UFSCar may have been implemented in conjunction with other statewide changes in educational policy, which would bias the estimators. Table 4 revealed non-significant effects for private school students who should be equally affected by state-level policies, but there is no evidence suggesting that they are less affected by the quotas. Nonetheless, even if there are no state-level omitted variables that correlate with the implementation of quotas, serial correlation and within-state correlation in the residuals of DID models could lead to underestimated standard errors and, therefore, incorrect statistical inferences, as suggested by Bertrand, Duflo, and Mullainathan (2004).
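The applicants-per-seat figures in the discussion of treatment size can be reproduced from the counts given in the text. The variable names below are illustrative.

```python
SEATS = 2577            # admissions offered by UFSCar each year
APPLICANTS_2010 = 40547 # pre-quota
APPLICANTS_2011 = 71439 # post-quota

per_seat_2010 = APPLICANTS_2010 / SEATS
per_seat_2011 = APPLICANTS_2011 / SEATS

# Growth factor in applicants per seat between the two years.
growth = per_seat_2011 / per_seat_2010
```

The two ratios round to 15.73 and 27.72, as stated. The implied growth factor is about 1.76, so "nearly doubled" is a generous description of a 76% rise.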
But we still must evaluate whether this potential downward bias in standard error is leading us incorrectly to reject the null hypothesis: that the quota system had no effect on student performance of black and public school students in São Paulo. The strategy I would like to adopt uses the same data structure as the main regression to estimate placebo regressions in which states that had not implemented quota systems during this period serve as the treatment group. Otherwise, an alternative would be to rely on robust standard errors clustered by municipality,
since the school-level data remains incomplete because students do not consistently report this information. My last concern is that the composition of the treatment group may have changed due to the implementation of the quota system. First, a system of quotas that benefits black students could change the way in which students describe themselves. However, this is unlikely because upon admission the candidate has to submit a transcript demonstrating that he or she attended public school and documentation proving his or her ethnicity. In an attempt to measure the quantitative relevance of potential selection bias, we could use DID models in which each of the students’ observable characteristics is the dependent variable. If the system of quotas truly changed the composition of the treatment group, this would have likely changed the observable characteristics of this group. A more relevant problem of composition is the difference in the percentage of white test-takers in São Paulo relative to the comparison group, for which I would have to construct an adequate test to verify if this aspect is driving my findings.
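The placebo strategy proposed in this section can be sketched as follows: drop the truly treated state, then let each remaining state play the role of the treatment group in turn and re-estimate the DID. Placebo estimates close to zero would support the main result, while placebo estimates as large as the main estimate would suggest the standard errors are understated. The code below is a hedged illustration without controls, using made-up scores; the function names, the row layout, and the use of "SP" as a label for São Paulo are assumptions for the example and not the author's procedure.

```python
import math
from statistics import mean


def did(rows, treated_state):
    """DID of mean log scores, treating `treated_state` as the treatment group.

    rows: iterable of (state, post, score) with post in {0, 1}.
    """
    cells = {}
    for state, post, score in rows:
        treat = int(state == treated_state)
        cells.setdefault((post, treat), []).append(math.log(score))
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(0, 1)]) - (m[(1, 0)] - m[(0, 0)])


def placebo_estimates(rows, real_treated):
    """Re-run the DID with each never-treated state as a placebo treatment group."""
    rows = list(rows)
    controls_only = [r for r in rows if r[0] != real_treated]
    states = sorted({r[0] for r in controls_only})
    return {s: did(controls_only, s) for s in states}


# Made-up data: only "SP" improves between the two periods.
toy = [(s, 0, 2500.0) for s in ("SP", "MG", "RJ", "RS")]
toy += [("SP", 1, 2550.0), ("MG", 1, 2500.0), ("RJ", 1, 2500.0), ("RS", 1, 2500.0)]

main_estimate = did(toy, "SP")
placebos = placebo_estimates(toy, "SP")
```

In the toy data the main estimate is ln(1.02) and every placebo estimate is zero. With real data one would compare the main estimate against the spread of the placebo estimates.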
8 Conclusion
I provide empirical evidence justifying the claim that the implementation of affirmative action policies in a competitive setting can have positive effects on the performance of students applying to universities. My estimate shows that, on average, the ENEM test score of black students from public schools in São Paulo was 1.54% higher after the introduction of quotas in university admission policies. The estimate, on average, for public school students (unconditional on race) was a 1.16% increase in test scores after the implementation of the UFSCar quota. Private school students were not affected, which implies that the quota system is encouraging public school applicants to perform better as their odds of entering university are increased. The robustness of these results is a project I hope to undertake in the future.
A Question of Intent: Explaining the Performance of Governments in Global Development Projects
Adin Lykken, Yale University1
Abstract. This paper seeks to explain the performance of governments who complete externally financed development projects. Previous research on the effectiveness of development efforts has analyzed macro-level indicators and overall project outcomes. Less research has explored the dynamics of how well recipient governments implement specific projects. I construct a principal-agent framework that explains government performance in terms of the execution skills of governments and their alignment to the objectives of financing organizations. To isolate the primary drivers of performance, I analyze project-level assessment data from the World Bank and country-level indicators of political risk from the Political Risk Services Group. Across both linear and probit specifications, results suggest the overwhelming importance of project supervision in improving government performance. I examine projects indicative of these results and discuss policy options to further improve government performance.
Keywords: development projects, government performance, World Bank, principal-agent
1 Adin Lykken is a senior majoring in economics at Yale University. He wrote this paper for Professor Ioannis Kessides’ "Economics of Infrastructure Policy" seminar in Fall 2013. The author would like to thank his adviser, Professor Kessides, for his invaluable guidance on this project. He would also like to thank Maximiliano Appendino and Alex Cohen, graduate students in the Yale Economics Department, for their helpful feedback. This project was supported by a Mellon Undergraduate Research Award from Yale University.
1 Introduction
Although global poverty rates have fallen in recent decades, 22% of the developing world still lives on less than $1.25 a day (The World Bank 2012). Mass poverty remains a tremendous challenge for the international community, one that has spurred increased research into improving the outcomes of development assistance. Recent scholarship has produced mixed results on the effectiveness of current development efforts in reducing poverty. Some studies find that development aid has led to unconditional economic growth and poverty reduction (Lensink & Morrissey 1999, Clemens et al. 2004), while others have indicated that efforts over the last few decades have been largely ineffective (Easterly et al. 2003, Doucouliagos & Paldam 2007). Another strain of research on development effectiveness yields results conditional on country-specific characteristics, most notably geography and institutional quality (Dollar & Levin 2005). A smaller subset of research has eschewed measuring macro-indicators like growth and poverty rates in favor of project-level outcomes. Several studies, most notably Kaufmann and Wang (1995), have attempted to explain the success and failure of specific projects based on a country’s economic productivity and trade policy. Diallo and Thuillier (2005) have analyzed project outcomes based on levels of open communication between project stakeholders, while Chauvet, Collier, and Fuster (2006) have explored the impact of supervision of borrowers by lending institutions. Less research has attempted to explain the behavior of national governments that actually implement development projects. As the performance of aided governments can significantly impact the effectiveness and sustainability of development projects, there is a need for more research into the incentives facing borrowers.
There is a demonstrated need to examine such inputs into overall project performance, as 39% of all World Bank projects were rated as unsuccessful in 2010 (Chauvet, Collier, & Duponchel 2010). This paper seeks to explain the behavior of governments in developing nations that complete externally financed development projects. Building upon the work of Kilby (2000), I use a principal-agent model to examine the incentives that
underlie the behavior of governments that complete development projects financed by the World Bank. In particular, I analyze the extent to which an adversarial or cooperative theory of behavior better explains the performance of borrowing nations. I specify an empirical approach using project-level data with both a linear and probit model. From the results, I suggest methods for development institutions to improve borrower performance and areas for future research.
2 Literature Review
There exists an emerging consensus on the importance of examining country-specific features in explaining the outcomes of development projects. To explore how differences in developing nations might explain development project outcomes, some studies have turned to measuring indicators of “good governance.” The good governance agenda has been a central concept in the field of international development since the mid-1990s (Earle & Scott 2010). This term is broad and can encompass improvements to “virtually all aspects of the public sector” (Grindle 2004). In the context of international development, measures of good governance can include the rule of law, budgetary and financial management, transparency, accountability, corruption, and public participation (Punyaratabandhu 2004). While development institutions like the World Bank used to define good governance in terms of technocratic competence, there has recently emerged a wider emphasis on the importance of institutional quality for development interventions to be sustainable (Hout 2009). Despite the recent emphasis on governance quality, most previous research has tried to explain its effects in terms of aggregate metrics rather than project-level outcomes. Across developed nations, there is some evidence that weak governance reinforces poverty (Campos & Nugent 1999) and that good governance leads to higher foreign direct investment (Busse & Hefeker 2005) and labor productivity (Hall & Jones 1999). Rodrik, Subramanian, and Trebbi (2004) have linked stronger public institutions to higher per capita income levels and lower poverty rates. Research conducted by the World Bank has also found that
good governance in developing nations is linked to higher per capita incomes and literacy rates as well as lower infant mortality (Kaufmann & Wang 1995). One framework to incorporate good governance as a determinant of project-specific outcomes is the concept in agency theory known as the principal-agent problem (PAP). Popularized by George Akerlof’s 1970 paper “The Market for Lemons: Quality Uncertainty and the Market Mechanism,” the PAP involves a principal and agent(s) who engage in cooperative behavior but have differing goals and limited vision into the actions of the other party. In most principal-agent models, a principal pays an agent with specialized knowledge to complete a task but can never perfectly monitor the agent’s behavior (Arrow 1963). The agent may use his information advantage to take an action unobserved by the principal (moral hazard) or conceal the true cost or valuation of his work (adverse selection) (Laffont & Martimort 2002, Aerni 2006). In many cases, principals must weigh the increased costs of monitoring agents against the agency costs associated with an agent’s deviant behavior (Bebchuk & Fried 2004). Since its conception, researchers have undertaken experiments to test the applicability of agency theory to fields as varied as marketing, compensation, diversification strategies, board relationships, vertical integration, and innovation (Eisenhardt 1989). International development assistance represents a market in which participants respond to specific incentives, one in which the PAP also arises in numerous ways. Vaubel (2005) and Nielson (2003) have posited the existence of the PAP between internal managers who seek to expand the authority of development organizations and the citizens of the member states who may be rationally ignorant of most of the organization’s activities.
Easterly (2006) has pointed out that in the context of relations between donors and development organizations, the latter can become risk-averse and learn to shield donors from being exposed to negative project outcomes. Most salient for this analysis, Chauvet et al. (2006) consider situations in which a donor agency finances development projects that are implemented by recipient governments, finding that donor supervision of projects is more impactful on project performance where interests are
more divergent. A final PAP could also exist between government leadership that receives financing for a project and the staff who actually conduct implementation. According to the precepts of Stiglitz (1974), development projects can be seen as principal-agent contracts if: (1) The principal and agent have different objectives; (2) The principal’s information about the agent’s actions is imperfect; and (3) Contracts are imperfect. With respect to (1), Chauvet et al. (2006) have posited an inherent lack of congruency between the interests of borrowing nations and lending institutions. Kilby (2000) has hypothesized that such a lack of congruency likely stems from differences in time horizons. While development institutions have far-sighted perspectives on reducing poverty in developing nations, even benevolent governments are inclined to direct projects with a shorter-term focus. Chauvet et al. (2006) have also noted the existence of (2) in the context of donor-borrower relationships because of the limited observability of the borrowing government’s effort. For instance, Aerni (2006) has described how implementing governments in developing countries may engage in “adverse selection” of what information they report to lenders about the status of projects. To correct for this information asymmetry, donor agencies may try to monitor projects in borrowing nations, an effort Kilby (2000) has shown to be correlated with improved project outcomes. Condition (3) is a natural consequence of the structure of modern development projects, during which lending institutions do not implement projects themselves, instead providing borrowers with decision-making capacity outside of formal contracts. In the context of agency theory, Kilby (2000) proposes two potential explanatory models: an “adversarial” model that explains project outcomes as a result of the PAP and a “cooperative” model that assumes congruent interests between borrowing nations and lending institutions.
While this dichotomy is a useful framework for applying agency theory to development projects, Kilby does not consider the effects of country-specific factors like governance quality (outside of macroeconomic
controls). Furthermore, neither his study nor others examine the impact of the PAP purely on the performance of borrowing nations rather than on overall project outcomes. In light of this underspecification, my analysis seeks to incorporate country-specific variables into an agency theory model to explain the behavior of governments that borrow from the World Bank. While I adopt Kilby’s dichotomy between adversarial and cooperative models, I reconstruct them with country-specific features, including indicators of good governance. I then define an econometric specification to test which of the two principal-agent models better explains variations in borrower performance. Finally, I discuss policy recommendations to improve borrower performance and the implications of the results for future research.
3
Theoretical Model
This section presents a theoretical model for the empirical investigation of the determinants of borrower performance in projects financed by the World Bank. Building on Kilby’s two agency models, one adversarial and one cooperative, I propose further specification based on country-specific factors, as Mubila’s (2000) results suggest that they are at least as important as project-specific characteristics for overall project success. The following theoretical model is adapted from Chauvet et al. (2006), which is adapted from Baker, Gibbons, and Murphy (1991). The model focuses on the non-monetary utility function of a risk-neutral agent (the government of the borrowing country, C) of a development project financed by a risk-neutral principal (the lending institution, L). Only non-monetary utility is considered because the borrowing country C guarantees loan repayment to L. This reflects how, if a country fails to make a payment on a loan from the World Bank, the Bank will suspend preparations for any new loans and freeze all payments under existing loans (The World Bank 2013a). Consider that L and C agree to a lending contract to finance a project in the borrowing country. Borrower performance p is a measure of the extent to which borrowers succeeded in achieving the objectives of a given project. p is a function of two factors: the
alignment A of the borrower to the stated objectives of the project and the technical execution E of the project. Thus, borrower performance is defined as the function

$$p = p(A, E) \tag{1}$$
Alignment A reflects the alignment of the borrowing nation’s government with the lending institution’s objectives for a given project. The model assumes a large degree of inherent alignment across projects because borrowing nations choose to partner with lending institutions voluntarily. However, there can still be misalignment of interests. As noted by Hall and Jones (1999), while governments are potentially the most efficient providers of social infrastructure that protects against diversion of resources, they are also a primary agent of behaviors against the general public’s interest like expropriation, confiscatory taxation, and corrupt behavior. Kilby (2000) has hypothesized that the deviant behavior of borrowers likely stems from their shorter time horizons relative to those of lending institutions. An alternative explanation is based on the existence of the principal-agent problem within the borrowing government. Even though borrower leadership may be aligned with project objectives, implementing staff may deviate from desired behavior. Assuming that the people of developing countries similarly share a preference for long-term utility maximization, alignment A can also be considered as the alignment of the borrowing country’s government with the objectives of its own people. Any deviations from the public’s objectives by its government can thus be measured in terms of overall governance quality g. While governance quality g, broadly defined as “good governance,” reflects the voluntary alignment of the developing nation to the objectives of a project, lending institutions can supervise the behavior of borrowers to potentially expose or halt abusive practices. Therefore, the quality of supervision s by lending institutions on the governments of developing countries is another critical determinant of project alignment. In sum, measures of project alignment A reflect an adversarial relationship between C and L.
Alignment A is thus defined as the function

$$A = A(g, s) \tag{2}$$
By contrast, execution E reflects the ability of the borrowing nation to execute its objectives in relation to a development project. E measures the technical execution of the project without reference to the alignment of objectives, and therefore assumes a cooperative relationship between C and L. The primary determinant of E is therefore the technical expertise t of the borrowing nation. To account for variations in difficulty of execution across projects, E also contains a measure of the complexity of the project c, which includes both the technical and managerial challenges posed by projects. Execution E is thus defined as the function

$$E = E(t, c) \tag{3}$$
While this model is relatively straightforward, it presents a framework with which to conduct an empirical analysis of the quantitative importance of the drivers of borrower performance. In addition to explaining the behaviors of governments, if borrower performance is a driver of overall project outcomes, then this model would also be highly relevant to policymakers seeking to plan more effective overall projects. Furthermore, both A and E contain components segregated between the control of the borrowing country C and the lending institution L. This dichotomy is critical in determining whether the principal or agents are more influential on agent behavior. Finally, the segregation of project performance p into two mutually exclusive hypotheses between A and E allows for an empirical specification of whether an adversarial or cooperative theory of behavior better characterizes the principal-agent relationship.
4
Data
The World Bank is one of the most prominent development aid institutions, providing loans and grants to developing nations for a variety of development programs. Since 1947, the World Bank has financed almost 12,000 poverty-reduction projects in
172 countries (The World Bank 2013c). As part of the Bank’s attempt to improve its effectiveness, a unit called the Independent Evaluation Group (IEG) evaluates the outcomes of Bank projects to provide an objective assessment of the results and cultivate best practices. The Director-General of the IEG reports directly to the World Bank Group’s Board of Executive Directors and not to Bank Group Management (The World Bank 2014a). As part of the evaluation process, the IEG evaluator rates the Bank and the borrower across a variety of metrics including the quality of the Bank’s supervision efforts (Appendix A) and overall borrower performance (Appendix B) (The World Bank 2014b). I assume the IEG’s reported evaluations to be accurate and unbiased for the purposes of my analysis. In late 2011, the IEG released its entire project evaluation database, comprising some 9,855 projects approved since Bank operations began (Sud & Olmstead-Rumsey 2012), allowing for an unprecedented analysis of the determinants of borrower performance. The following analysis uses cross-sectional data on 3,592 development projects across 101 countries that were started and completed between 1984 and 2013 (The World Bank 2013b). Projects spanned 19 sectors and took anywhere from two days to 16 years to complete. This analysis excludes any projects that did not require a monetary loan or had missing IEG reviews for Bank supervision, borrower performance, or overall project outcome. The dependent variable for this analysis is defined as:
• BORRPERF (overall borrower performance) measures the performance of the borrowing country’s government as rated by the IEG.
To generate project-level data, IEG officers review the assessments of project staff, with a fraction of reviews given a thorough audit. Performance of the Bank, borrowers, and the projects overall is compared against stated objectives and standards.
To carry out these reviews, IEG staff have unrestricted access to Bank staff, project sites, and borrower representatives (The World Bank 2014c). Borrower performance is a measure of the extent to which the borrower ensured quality preparation, implementation, and compliance with agreements in the context of each country. Overall project quality is a measure of the extent
to which the project achieved its major objectives in an efficient manner. The correlation coefficient between overall borrower performance and overall project quality is 0.768, providing evidence that borrower performance is a probable determinant of overall project success. IEG variables relevant to the adversarial model are defined as:
• QVISION (quality of Bank supervision) measures the quality of the Bank’s supervision effort during the course of the project as rated by the IEG as a proxy for s in equation (2) for alignment A.
IEG variables relevant to the cooperative model are defined as:
• LENGTH (length of the project) takes the natural log of the number of days between the listed start and closing date of a project. I consider length because longer projects are likely to be more complex.
• COST (project cost) takes the natural log of the total cost of the loan to the borrowing country in U.S. dollars with the assumption that more complex projects are more expensive.
• EXPER (experience with projects) measures the number of previously completed World Bank projects in a borrowing country. I postulate that past project experience leads to greater technical expertise.
Bank supervision efforts are rated for proactive identification of opportunities to further project goals as well as resolution of potential threats. LENGTH and COST should both be correlated with the increased managerial and technical complexity captured in c, while EXPER is relevant for technical expertise t in equation (3) for execution E. In order to fully identify the adversarial and cooperative principal-agent models, my analysis also includes data obtained from the Political Risk Services (PRS) Group, a commercial provider of political and country risk forecasts based in East Syracuse, New York. Since 1984, the PRS Group’s International Country Risk Guide has provided annual aggregates of several political risk and institutional quality indicators (PRS Group 2013).
PRS Group staff collect political, financial, and economic data and construct each political index on the basis of subjective analysis of the available information. Each indicator is assessed on a scale from 0 to 12, with 12 being the highest score.[2] For example, a score of 12 on the index for corruption would indicate extremely low corruption in a country. Numerous previous studies have used data from the PRS Group to investigate, among other topics, cross-country technology diffusion (Caselli & Coleman 2001), foreign direct investment (Busse 2005), and agricultural productivity (Cervantes-Godoy & Dewbre 2010). These rich indicators allow for a detailed examination of each recipient government. Any explanatory model of overall project outcomes would require further data on each project as well as further insight into the interaction between Bank and borrower actions, which is why I focus purely on borrower performance. PRS variables that estimate governance quality g in equation (2) are defined as:[3]
• GOVST (government stability) measures the government’s ability to carry out its policies and stay in office.
• CORR (corruption) assesses corruption in the political system, including demands for bribes, patronage, nepotism, job reservations, and close ties between politics and business.
• LAW (law and order) measures the strength and impartiality of the legal system and popular observance of the law.
• DEMOC (democratic accountability) measures the responsiveness of government.
• MILIT (military involvement in politics) measures the influence of the military in the political process.
• RELIG (religious tensions) measures the influence of religious sects that seek to replace civil law with religious law and exclude other religions from the political process.

[2] In the original PRS Group data set, some indicators are scaled from 0-6 or 0-4. For easier interpretation of the results, all PRS indicators have been re-scaled to 0-12.
[3] Naming conventions adapted from Busse (2005).
Many of the governance quality indicators are highly correlated. This is expected, as they all assess political risk and institutional quality. To specify the model, I calculate the mean of the annual governance indicators for each project as the sum of the indicators over the time period of each project divided by the length of the project in years.[4] Weighting all years equally is admittedly an imperfect measure, as an anomalous year (perhaps from a temporary regime change) could artificially inflate or depress the mean. To try to account for these differences in the variance of governance quality throughout the course of projects, I also calculate the annual change of each governance indicator as the difference in start and end year values divided by the length of the project in years (e.g. GS_S). In addition to EXPER, I further specify t in equation (3) with the PRS variable:
• BUR (bureaucratic quality) measures the technical expertise and professionalism of the bureaucracy.
To control for unobserved differences across nations at different levels of development, I also include the PRS variable SOCIO (socioeconomic conditions) that measures unemployment, consumer confidence, and poverty levels rated on a scale of 0-12. I also define a set of six regional dummy variables to control for unobservable differences across regions. Variable names and data sources are available in Appendix C (found online), and descriptive statistics for all variables are displayed in Appendix D (found online). A quick review of the descriptive statistics indicates that most governance indicators are centered around mid-range values on their 12-point scales, although their standard deviations differ. As would be expected, the variables measuring change in governance quality have both positive and negative values depending on the project. The mean project cost approximately $71 million and took an average of 4.7 years to complete.
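The construction of the governance variables described above can be illustrated with a minimal sketch. The function names and the four annual scores below are hypothetical, and the sketch assumes one indicator value per project year, so that the number of annual values equals the project length in years; the paper does not report how partial years are handled beyond the single-year rule for short projects.

```python
def rescale_to_12(value, original_max):
    """Rescale an indicator from a 0..original_max scale to the common 0..12 scale."""
    return value * 12.0 / original_max


def project_mean(annual_values):
    """Equal-weight mean of the annual indicator values observed during a project."""
    return sum(annual_values) / len(annual_values)


def annual_change(annual_values, length_years):
    """(End-year value minus start-year value) divided by project length in years."""
    return (annual_values[-1] - annual_values[0]) / length_years


# Hypothetical corruption scores on an original 0-6 scale for a four-year project.
raw_scores = [2.0, 3.0, 3.0, 4.0]
scores = [rescale_to_12(v, 6) for v in raw_scores]

corr_mean = project_mean(scores)                     # level variable (e.g. CORR)
corr_change = annual_change(scores, length_years=4)  # change variable (S suffix)
```

Because every year receives the same weight, a single anomalous year moves the mean by one over the number of years of its deviation, which is the limitation the text acknowledges.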
[4] Projects less than a year long are matched with the governance indicator for that single year.

It should be noted that if lender supervision s also mitigated technical challenges that arose during project implementation, the adversarial and cooperative theories of behavior would not be entirely orthogonal. However, although supervision by the World Bank does include technical assistance and management advising, the act of monitoring, both by staff at Bank headquarters and during trips to borrowing countries, is the central activity captured by IEG reviews of Bank supervision (Kilby 2000). This is reflected in the IEG criteria for rating Bank supervision (summarized in Appendix A), which emphasize the “fiduciary” duty of the borrower (i.e. alignment of the borrower’s implementation to project objectives).
5
Methodology
I first use an ordinary least squares (OLS) model to test if an adversarial model based on alignment A or a cooperative model based on execution E better explains borrower performance. The test of which model is of greater importance will be based purely on a segregation of the explanatory variables into either the adversarial or cooperative model. The OLS model is estimated using the standard multivariable regression relationship

$$Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i \tag{4}$$
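As an illustration of equation (4), the following sketch estimates the coefficients by solving the normal equations in plain Python. It is not the estimation code used for this paper; the data are invented and noise-free so that the known coefficients are recovered, and the helper names `solve` and `ols` are hypothetical.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(y, X):
    """Return [b1, b2, ..., bk], where b1 is the intercept of equation (4)."""
    rows = [[1.0] + list(r) for r in X]  # prepend the constant term
    k = len(rows[0])
    XtX = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    return solve(XtX, Xty)


# Invented data generated from Y = 1.0 + 0.5*X2 - 0.2*X3 with no error term.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]]
y = [1.0 + 0.5 * a - 0.2 * b for a, b in X]
beta = ols(y, X)
```

In practice a statistical package would also report the standard errors and t-statistics shown in Table 1; the sketch covers only the point estimates.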
In order to specify the model, I first conduct two stepwise regressions. The first regresses BORRPERF on the means of the governance indicators and the second regresses BORRPERF on the variables that measure change in governance quality over the course of the project (with S suffixes). The stepwise regression procedure uses forward selection, adding each variable only if it improves the model’s explanatory power (tested at the 15% level). The stepwise regression for means finds that GOVST, CORR, LAW, and RELIG are significant, and the stepwise regression for the change variables finds that M_S, or the change in military involvement in politics over the course of the project, is significant. The full model includes the quality of Bank supervision (QVISION), four variables measuring governance quality (GOVST, CORR, LAW, and RELIG), one variable measuring change in governance quality (M_S), the length of the project (LENGTH), the cost of the project (COST), experience with projects (EXPER), bureaucratic quality (BUR), socioeconomic
conditions (SOCIO), and five regional dummy variables (R1, R2, R3, R4, R5).

$$
\begin{aligned}
\mathit{BORRPERF}_i ={}& \beta_1 + \beta_2 \mathit{QVISION}_i + \beta_3 \mathit{GOVST}_i + \beta_4 \mathit{CORR}_i + \beta_5 \mathit{LAW}_i + \beta_6 \mathit{RELIG}_i \\
&+ \beta_7 M_{S,i} + \beta_8 \mathit{LENGTH}_i + \beta_9 \mathit{COST}_i + \beta_{10} \mathit{EXPER}_i + \beta_{11} \mathit{BUR}_i + \beta_{12} \mathit{SOCIO}_i \\
&+ \beta_{13} R1_i + \beta_{14} R2_i + \beta_{15} R3_i + \beta_{16} R4_i + \beta_{17} R5_i + u_i
\end{aligned}
\tag{5}
$$

In addition to the linear model, I use a probit model with the dependent variable, borrower performance, defined as a binary outcome of either 0 or 1. An evaluation of the bimodal distribution of borrower performance scores (Figure 1) supports such a specification. I define BORRPERF greater than 7 as “satisfactory” (BORRPERF = 1) and any less than 7 as “unsatisfactory” (BORRPERF = 0), descriptions that mirror the primary adjectives in the IEG’s methodology in Appendix B. The probit regression model uses the cumulative distribution function (CDF) of the standard normal distribution to model a sigmoid relationship of a linear variable Z such that

$$Z_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} \tag{6}$$
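The probability of the event occurring is defined as

$$p_i = \mathrm{CDF}(Z_i) \tag{7}$$

with the standard normal CDF defined as the integral of the standard normal density function:

$$\mathrm{CDF}(Z) = \int_{-\infty}^{Z} \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}} \, dz \tag{8}$$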
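Equations (6) through (8) can be sketched in a few lines. The standard normal CDF has no closed form, but it can be written in terms of the error function available in Python's standard library. The coefficient and regressor values below are invented for illustration and are not estimates from this paper.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, expressed through the error function erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(beta, x):
    """p = CDF(Z) with Z = beta[0] + beta[1]*x[0] + ... (equations 6 and 7)."""
    z = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return std_normal_cdf(z)


# Invented values: a linear index of zero gives probability one half,
# and Z = -1.0 + 0.5 * 6.0 = 2.0 gives a probability close to 0.977.
p_mid = probit_probability([0.0, 1.0], [0.0])
p_high = probit_probability([-1.0, 0.5], [6.0])
```

The sigmoid shape means that the same change in Z moves the probability by different amounts depending on the starting value of Z, which matters for interpreting the probit coefficients.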
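Thus, the probit specification with each governance indicator is

$$
\begin{aligned}
\Pr(\mathit{BORRPERF}_i = 1) = \mathrm{CDF}\big(& \beta_1 + \beta_2 \mathit{QVISION}_i + \beta_3 \mathit{GOVST}_i + \beta_4 \mathit{CORR}_i + \beta_5 \mathit{LAW}_i + \beta_6 \mathit{RELIG}_i \\
&+ \beta_7 M_{S,i} + \beta_8 \mathit{LENGTH}_i + \beta_9 \mathit{COST}_i + \beta_{10} \mathit{EXPER}_i + \beta_{11} \mathit{BUR}_i + \beta_{12} \mathit{SOCIO}_i \\
&+ \beta_{13} R1_i + \beta_{14} R2_i + \beta_{15} R3_i + \beta_{16} R4_i + \beta_{17} R5_i \big)
\end{aligned}
\tag{9}
$$

6
Results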
Results of the determinants of borrower performance as specified in the linear equation (5) are reported in Table 1. As many of the variables, including BORRPERF, are measured on indexed scales rather than in absolute units, it is difficult to interpret the quantitative importance of some of the coefficients without any transformation. To ease interpretation, I consider the percentage that BORRPERF is increased from its 10th percentile value to its 90th percentile value if an explanatory variable increases from its 10th to its 90th percentile value. This impact on borrower performance I_Linear is defined as

$$I_{Linear} = \frac{x_{ki}^{90thpctile} - x_{ki}^{10thpctile}}{y_i^{90thpctile} - y_i^{10thpctile}} \, \hat{\beta}_k \tag{10}$$
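The transformation in equation (10) can be checked against the figures reported in Table 1 with a short sketch. The coefficients and the 10th-to-90th percentile ranges of the explanatory variables are taken from Table 1. The 10th-to-90th percentile range of BORRPERF is not stated in the text; the value of 6 used here is an assumption inferred from the fact that it reproduces the table's I_Linear column.

```python
def impact_linear(beta_k, x_range, y_range):
    """Equation (10): share of the y percentile range covered when x_k moves
    across its own 10th-to-90th percentile range."""
    return beta_k * x_range / y_range


# ASSUMPTION: BORRPERF 90th minus 10th percentile = 6 (inferred, not stated).
Y_RANGE = 6.0

# (coefficient, 90th minus 10th percentile of the variable), from Table 1.
table1 = {
    "QVISION": (0.5923448, 6.0),
    "GOVST": (0.0975966, 4.1111),
    "CORR": (0.1206906, 3.7182),
}

# Impacts in percent of the BORRPERF percentile range.
impacts = {
    name: 100.0 * impact_linear(b, x_range, Y_RANGE)
    for name, (b, x_range) in table1.items()
}
```

Under that assumption the computed impacts agree with the reported values of roughly 59.2%, 6.7%, and 7.5% up to rounding of the published inputs.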
The overall model presents a reasonable goodness of fit with overall borrower performance, with an adjusted R-squared of 0.3513. Seven of the variables in the model are statistically significant at the 5% level. The quality of the Bank’s supervision has the most impactful coefficient, with a 1-point increase leading to an increase in overall borrower performance of 0.5923 points. This impact is shown to be quantitatively important, as an improvement in the quality of Bank supervision from its 10th to 90th percentile value would result in a proportional improvement of 59.2% of the range separating the 10th percentile from the 90th percentile in borrower performance. Of the five governance variables included, four are statistically significant. The variable for corruption has the most quantitative significance, as its coefficient indicates that an improvement in corruption from its 10th to 90th percentile value would result in a proportional increase in borrower performance of 7.5%. Indicators for government stability and law and order returned effects of similar magnitudes, with that of a change in military involvement in politics much smaller. However, as there does remain some multicollinearity between the first three governance indicators, their coefficients should not be considered perfectly precise. In terms of variables in the cooperative model, only the length of projects was statistically significant. Because LENGTH is measured in natural logs, a one-unit increase in log project length leads to a decrease in borrower performance of 0.137 points, so a 1% increase in project length corresponds to a decrease of roughly 0.0014 points. Put another way, moving project length from its 10th to 90th percentile value would reduce borrower performance by 3.6% of the range separating its 10th and 90th percentiles. The final statistically significant variable was the measure of socioeconomic conditions, which found that an improvement from the 10th to 90th percentile resulted in about a 6.1% corresponding increase in borrower performance.

Figure 1: Histogram of Borrower Performance (BORRPERF)

Table 1: Coefficient Transformation for Linear Regression

| Variable | Coefficient (t-stat) | 90th percentile - 10th percentile | I_Linear | Impact % |
| QVISION | 0.5923448*** (37.47) | 6 | 59.23449 | 59.2% |
| GOVST | 0.0975966*** (3.17) | 4.1111 | 6.687167 | 6.7% |
| CORR | 0.1206906*** (4.05) | 3.7182 | 7.479305 | 7.5% |
| LAW | 0.0619899*** (2.63) | 5.6388 | 5.825902 | 5.8% |
| RELIG | 0.0090508 (0.46) | - | - | - |
| M_S | 0.4068133*** (2.76) | 0.41666 | 2.825047 | 2.8% |
| LENGTH | -0.1371872*** (-4.00) | 1.5578 | -3.561937 | -3.6% |
| COST | -0.0292852 (-1.02) | - | - | - |
| EXPER | 0.0002513 (0.21) | - | - | - |
| BUR | 0.0207193 (1.02) | - | - | - |
| SOCIO | 0.1052104*** (3.24) | 3.4768 | 6.09668 | 6.1% |
| Constant | 2.110533*** (3.14) | - | - | - |

N = 3296; R2 = 0.3513; F(16,3279) = 112.55
***significant at 1% level; **significant at 5% level; *significant at 10% level. Non-significant regional dummies omitted.

To interpret the results from the probit regression, I will specify a similar transformation as that for the linear model. However, unlike in linear regression, an interpretation of the
coefficients in probit regression must account for the positions of the other variables in the model. This is because the increase in probability of a satisfactory borrower performance attributable to a one-unit increase in a given explanatory variable is also dependent on the values of the other predictors. To transform the coefficients, I will consider the marginal increase in probability that borrower performance will be satisfactory versus unsatisfactory when an explanatory variable increases from its 10th to its 90th percentile value. The use of the difference in fitted probabilities for each given explanatory variable from its 10th to 90th percentile values within a probit model follows Malmendier and Nagel (2011) and Raman, Shivakumar, and Tamayo (2013). I define a new variable I_Probit to measure the impact on the probability of a “satisfactory” borrower performance as

$$
I_{Probit} = \mathrm{CDF}\Big(\hat{\beta}_1 + x_i^{90thpctile} \hat{\beta}_i + \sum_{j=2,\, j \neq i}^{k} \bar{x}_j \hat{\beta}_j\Big) - \mathrm{CDF}\Big(\hat{\beta}_1 + x_i^{10thpctile} \hat{\beta}_i + \sum_{j=2,\, j \neq i}^{k} \bar{x}_j \hat{\beta}_j\Big) \tag{11}
$$
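A minimal sketch of the transformation in equation (11) follows. The function name, argument layout, and all numerical inputs are hypothetical; the sample means and percentile values needed to reproduce Table 2 are reported only in the online appendix, so the example uses invented values.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, expressed through the error function erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def impact_probit(const, betas, means, i, x_p10, x_p90):
    """Equation (11): change in fitted probability when variable i moves from its
    10th to its 90th percentile while every other variable is held at its mean.

    betas[j] is the coefficient on variable j and means[j] is its sample mean.
    """
    others = sum(b * m for j, (b, m) in enumerate(zip(betas, means)) if j != i)
    base = const + others
    high = std_normal_cdf(base + betas[i] * x_p90)
    low = std_normal_cdf(base + betas[i] * x_p10)
    return high - low


# Invented inputs: the other variable contributes 0.25 * 4.0 = 1.0, which offsets
# the constant of -1.0, so the index runs from 0.0 to 2.0 as variable 0 moves.
example = impact_probit(
    const=-1.0, betas=[0.5, 0.25], means=[3.0, 4.0], i=0, x_p10=0.0, x_p90=4.0
)
```

Holding the other variables at different values, for instance those of a specific country, changes `base` and therefore the resulting impact, even though the coefficients are unchanged.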
This formula assumes that all other explanatory variables not under consideration are at their mean values.[5] I_Probit can thus be interpreted as the contribution in probability toward a satisfactory borrower performance from a change in a given explanatory variable from its 10th to 90th percentile value. The results of regression equation (9) and the transformed coefficients are displayed in Table 2. The relative significance of the explanatory variables is similar to that in the linear model. Again, the quality of the Bank’s supervision was the most quantitatively important explanatory variable, with an increase from its 10th to 90th percentile value leading to a 61.2 percentage point increase in the probability of satisfactory borrower performance. The same four governance indicators are again statistically significant with similar magnitudes, although now law and order was the most significant, with a 10th to 90th percentile improvement contributing 7.7% toward the probability of satisfactory borrower performance. As in the linear model, project length and socioeconomic conditions are both significant with similar economic impacts. Interestingly, the number of previously completed projects and project cost are now statistically significant. In addition, both of their contributions toward a satisfactory borrower performance are similar in magnitude to the other variables in the cooperative model. The addition of dummy variables for each project sector to account for potential unobserved differences between projects did not materially change the results.

[5] This is not to say that alternate specifications are without interest. One could alter the mean values of variables to match the conditions in a specific country.

Table 2: Coefficient Transformation for Probit Regression

| Variable | Coefficient (z-value) | CDF, 90th percentile | CDF, 10th percentile | I_Probit | Impact % |
| QVISION | 0.2926934*** (25.35) | 0.854101117 | 0.241690702 | 0.612410414 | 61.2% |
| GOVST | 0.0563743** (2.48) | 0.802008121 | 0.731400805 | 0.070607316 | 7.1% |
| CORR | 0.0522062** (2.30) | 0.796799677 | 0.737653644 | 0.059146033 | 6.0% |
| LAW | 0.0450639** (2.57) | 0.807897204 | 0.731073587 | 0.076823616 | 7.7% |
| RELIG | 0.0101633 (0.70) | - | - | - | - |
| M_S | 0.2327563** (1.98) | 0.78144241 | 0.754678352 | 0.02946589 | 2.9% |
| LENGTH | -0.106488*** (-3.74) | 0.74803895 | 0.797922611 | -0.040904072 | -5.0% |
| COST | -0.0379666* (-1.76) | 0.748778262 | 0.789682334 | -0.040904072 | -4.1% |
| EXPER | 0.002760*** (2.90) | 0.812823786 | 0.741567884 | 0.071255902 | 7.1% |
| BUR | 0.009905 (0.65) | - | - | - | - |
| SOCIO | 0.075530*** (3.07) | 0.80435599 | 0.723970148 | 0.080385842 | 8.0% |
| R2 | 0.2365606* (1.92) | - | - | - | - |
| Constant | -2.04553*** (-3.95) | - | - | - | - |

N = 3296; Pseudo R2 = 0.2578; LR chi2(16) = 966.67
***significant at 1% level; **significant at 5% level; *significant at 10% level. Non-significant regional dummies omitted.

These models do not control for potential endogeneity between Bank supervision and borrower performance. As can be seen from Table 3, the quality of Bank supervision is highly correlated with borrower performance. There is a material concern for reverse causality, as one could conceive that superior borrower performance enables easier and more effective Bank supervision efforts. If endogeneity exists, the most likely explanation is that borrowers with more efficient bureaucracies facilitate Bank supervision while also improving their own performance. However, none of my four specifications found that bureaucratic quality (BUR) influenced borrower performance with any statistical significance. An alternative explanation for endogeneity is that Bank officials may be more easily able to monitor projects in countries with developed transportation and communication infrastructures. To test these hypotheses, I regress BORRPERF on QVISION first with dummy variables for socioeconomic conditions and then with dummy variables for each country. The results, presented in Appendix E (available online), indicate that the relationship between QVISION and BORRPERF remains robust, with coefficients and t-values very similar to those of the linear specifications.

Table 3: Correlation between BORRPERF and QVISION

| | BORRPERF | QVISION |
| BORRPERF | 1.0000 | |
| QVISION | 0.5695 | 1.0000 |

In addition to my tests, I also review the findings of Chauvet et al. (2006), who consider the effects of project supervision on development project outcomes. As part of their analysis, Chauvet et al. test for the potential for endogeneity between project supervision and project outcomes. While I consider overall borrower performance, not overall project outcomes, the two are highly correlated. Thus, the results of Chauvet et al.’s endogeneity tests are directly applicable to the potential reverse causality in my models. In this process, Chauvet et al. use a recursive multivariate probit model that transforms borrower supervision into a binary variable. The direct effect of supervision remains significant at the 1% level, implying a robust relationship.
While not conclusive in the context of my analysis, these results suggest a causal relationship between Bank supervision and borrower performance.
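The robustness check with country dummy variables described above can be illustrated with a short sketch. For a single regressor, including a full set of group dummies gives the same slope as regressing group-demeaned y on group-demeaned x, so the sketch uses that equivalent within-group form. The data and the function name `within_slope` are invented, and this is not the code behind Appendix E.

```python
from collections import defaultdict


def within_slope(x, y, groups):
    """Slope on x from a regression of y on x plus one dummy per group,
    computed by demeaning x and y within each group."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # group -> [sum x, sum y, count]
    for xi, yi, g in zip(x, y, groups):
        t = totals[g]
        t[0] += xi
        t[1] += yi
        t[2] += 1
    num = den = 0.0
    for xi, yi, g in zip(x, y, groups):
        sum_x, sum_y, n = totals[g]
        dx = xi - sum_x / n
        dy = yi - sum_y / n
        num += dx * dy
        den += dx * dx
    return num / den


# Invented data: two countries with different baseline performance levels
# (intercepts 1.0 and 5.0) but the same supervision slope of 0.6.
supervision = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
country = ["A", "A", "A", "B", "B", "B"]
performance = [0.6 * s + (1.0 if c == "A" else 5.0)
               for s, c in zip(supervision, country)]

slope = within_slope(supervision, performance, country)
```

If the estimated slope is similar with and without the dummies, as reported for QVISION, time-invariant country characteristics such as infrastructure are unlikely to be driving the relationship.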
7
Discussion
The analysis found that the adversarial model explained borrower performance better than the cooperative (technical) model, although largely because of the impact of project supervision. In the linear model, the variables measuring the technical competence of a borrowing country’s government show the least quantitative importance. However, the resulting statistical significance of experience with past projects in the probit model does suggest that specific experience is important, and certainly more so than general bureaucratic competence. Variables for the cost and length of projects were slightly more robust, indicating that project complexity also plays a role in borrower performance. However, project length may be subject to issues of reverse causality, as poor borrower performance could conceivably delay the completion of projects. Interestingly, the absolute values for the sum of the impact of the statistically significant technical t and complexity c variables in the probit model were about equal, suggesting that the oppositional factors defining execution E are in relative balance. Variables for governance quality were somewhat more quantitatively significant than technical variables, sometimes by up to a factor of two as measured in the linear model. However, their impacts were also of limited overall significance. The relative significance of governance indicators inverted across models, with law and order becoming the most impactful variable in the probit model. In sum, the results do not provide clear evidence that there is a single aspect of governance quality that is most important, and therefore there is no clear metric that the Bank should seek to improve over others. The lack of quantitative importance for improvement in military involvement in politics also suggests that short-term changes in governance quality will do little to ensure borrower performance, perhaps reflecting how governance quality can become entrenched in institutions.
Rather, the Bank should focus on improving the overall level of governance quality over the long run. Considering again the theoretical model, the results indicate that the impact of technical expertise t,
project complexity c, and governance quality g are all of similar (marginal) magnitudes. Socioeconomic conditions were statistically significant and consistently one of the most impactful explanatory variables. That improved conditions led to better borrower performance is not surprising, although the marginal size of the impact might be an encouraging sign that governments in poor countries can still perform admirably. Finally, dummy variables for region had limited significance, although the positive coefficient on the dummy for East Asia and the Pacific (R2) makes sense given the generally more developed nature of this region in contrast to regions such as Sub-Saharan Africa. The most impactful variable in the analysis was the quality of the Bank’s supervision efforts, which maintained considerable quantitative importance across models. That the other adversarial variables would have a small impact appears reasonable, as the ratings given by the PRS Group apply to a whole country over an entire year. Even if the ratings are accurate, the indicators are an aggregate from a wide variety of potential agents and projects, which decreases their precision. Traits like corruption and government stability are diffuse and may not be obviously applicable to any given project. For example, some projects may naturally have less scope for corrupt activities or provide limited electoral benefits to an embattled regime. Thus, we may not have expected an enormous impact of governance indicators. By contrast, IEG reports on completed projects indicate that increasing the quality of Bank supervision has a material impact on borrower behavior. For example, in a rural finance project for Tunisia, Bank supervision was “somewhat superficial for the first few years of project implementation” (The World Bank 2001).
But when Bank efforts to improve the quality of the Tunisian National Agriculture Bank’s loan portfolio appeared to stagnate, supervision became “more active and interventionist,” forcing the borrower to define indicators to measure portfolio quality and establish “new and realistic” return targets. During the course of a series of agricultural credit loans to Morocco’s Caisse Nationale de Crédit Agricole (CNCA), project supervision was “unsatisfactory for a long period,” a result of too little attention paid to the allocation of Bank credit and CNCA’s
credit process (The World Bank 1998). After an audit by KPMG revealed gross procedural and management errors in CNCA, the Bank appointed new teams and “supervision dramatically improved.” After supervision improved, the Bank refused to “endorse the vague rescue measures that were proposed by CNCA” and specified which “reforms would be necessary before formalizing further Bank support.” My findings on the outsized impact of Bank supervision also coincide with the results of Kilby (2000) and Ika, Diallo, and Thuillier (2012), who found supervision to be highly influential on overall project outcomes. While many of the technical variables had at least some statistical significance, bureaucratic competence had no meaningful impact on the performance of borrowers. This is surprising, as bureaucratic quality varies significantly across countries. This result implies that bureaucratic efficiency is no match for experience with projects whose complexities create meaningful challenges for borrowers. One possible explanation for the small quantitative importance of technical variables as a whole is that project supervision may help mitigate poor borrower performance that would otherwise be significantly affected by a lack of technical competency or project complexity. If true, this would imply a revision to my original model, which considered Bank supervision as a check against only low governance quality g. More broadly, the regression coefficients of this analysis may have been underestimated because Bank projects were excluded if they had any missing IEG review data. If cancelled or failed projects are less likely to be given comprehensive ratings by the IEG, as one might expect, then my sample would be biased toward projects that may have succeeded in spite of meaningful absences in governance quality. Given the importance of Bank supervision, I now consider methods to improve supervision efforts. 
Several IEG reports of projects with the lowest reviews of Bank supervision highlighted high project leadership turnover as crippling to supervision efforts because of difficulties in transferring knowledge and maintaining communication (The World Bank 2005). A project for urban revitalization in Mozambique also cited a breakdown in communications as anathema to supervision, one that occurred because of a buildup of tensions between the borrower and Bank
staff (The World Bank 2002). Thus, the Bank should work to maintain continuity among supervision teams and consistency of communication at all times. Other common challenges to high-quality supervision included deficient documentation and a lack of use of standard monitoring and evaluation (M&E) tools (The World Bank 2004). One approach to mitigate this failure is the eOperations platform built by the Asian Development Bank in 2010 (The Asian Development Bank 2011). eOperations is an integrated information technology solution with the ability to monitor projects, streamline administrative procedures, provide uniform project-related documentation, and prepare standardized customizable reports. IEG reviews also highlight how Bank officials could become complacent about supervision. During a project to redesign Tanzania’s roadways, Bank supervisors accepted the reassurances of resident engineers even when their assessments were wildly more optimistic than those of Bank headquarters. Supervisors failed to investigate emerging problems in sufficient detail, which ultimately led to eroding borrower performance and corrupt activities (The World Bank 2000). Similarly, IEG reviews of projects across Jordan, Egypt, and Yemen for higher education reforms found that supervision missions tended to rate fulfillment of all project objectives as satisfactory, thus failing to alert management that parts of the reform agenda were not progressing well (The World Bank 2011). Ultimately then, no matter the project team and the technologies at their disposal, quality supervision depends on the willingness of Bank officials to question underlying assumptions and engage in potentially uncomfortable dialogue. Despite the imperative to improve Bank supervision, there remain potential structural challenges to doing so. As noted by Chauvet et al.
(2006), the long lag between the decision to propose a project and the eventual performance of the project means that incentives for Bank staff to abort projects are weak. The Bank’s incentive scheme instead encourages a culture of disbursement rather than ensuring project success, which in part depends on high-quality supervision. Furthermore, supervision alone will likely not be enough to maximize borrower performance. While this analysis explained a reasonable amount of variance in
borrower performance, there may remain some other reforms that could substantially affect the behavior of borrowers. Aerni (2006) has proposed changing the underlying principal-agent dynamic by shifting the role of the principal from lending organizations to the middle classes of developing countries. He suggests that developing nations pay the annual interest on their debt to an independent funding pool designed to improve the infrastructure for domestic entrepreneurs. In the long run, this could help develop a politically active middle class that could do a better job of selecting and monitoring development projects than staff from a global lending institution like the World Bank.
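The sample-selection caveat raised earlier in this section (projects with missing IEG review data were excluded, and failed projects may be less likely to receive comprehensive ratings) can be illustrated with a small simulation. The sketch below does not use the paper's data or model. It assumes synthetic data, a hypothetical true slope of 1.0 linking a governance-style variable to performance, and a deliberately stark rule that drops every project whose performance falls below a hypothetical cutoff. Under those assumptions, the slope estimated on the surviving projects is smaller than the slope estimated on all projects, which is the direction of bias the text describes.

```python
import random


def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, true_slope=1.0, cutoff=0.0, seed=0):
    """Compare the slope on all projects with the slope on 'rated' projects.

    All values are synthetic and hypothetical: g stands in for a
    governance-style regressor, perf for borrower performance, and
    projects with perf <= cutoff are treated as unrated and dropped.
    """
    rng = random.Random(seed)
    g = [rng.gauss(0, 1) for _ in range(n)]
    perf = [true_slope * gi + rng.gauss(0, 1) for gi in g]

    # Slope using every project, including the poor performers.
    full = ols_slope(g, perf)

    # Slope using only projects that survive the selection rule.
    kept = [(gi, pi) for gi, pi in zip(g, perf) if pi > cutoff]
    truncated = ols_slope([k[0] for k in kept], [k[1] for k in kept])
    return full, truncated, len(kept)


full_slope, truncated_slope, n_kept = simulate()
print(f"slope, all projects:        {full_slope:.3f}")
print(f"slope, rated projects only: {truncated_slope:.3f}")
print(f"projects kept: {n_kept} of 20000")
```

The selection rule here is far cruder than any plausible pattern of missing IEG ratings, so the size of the gap is not informative. Only its sign speaks to the argument that the reported coefficients may be understated.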
8 Conclusion
Despite billions of dollars spent in development aid, poor living conditions still afflict billions of people globally. Recent research has sought to improve the effectiveness of international development efforts by measuring aggregate indicators and project-level outcomes. This paper analyzed the behavior of governments engaged in development contracts with the World Bank within a principal-agent framework. The results indicate that the most significant predictor of borrower performance is how well the Bank supervises projects. By comparison, the technical competence and governance quality of borrowers, as well as the complexity of projects, have only a marginal impact on borrower performance. In some sense it should be encouraging to policymakers that supervision can dramatically improve project performance, as this variable is most fully within the control of the Bank. Whereas supervision may have once been considered a procedural requirement, it is now more than ever clearly a potent tool to direct borrowers toward high performance. Further research is necessary to examine how to maximize project supervision as well as other determinants of borrower behavior unaccounted for in this model. As the behavior of borrowers can impact overall project outcomes, developing a more complete understanding of the decisions made by development project partners will be crucial in the fight against global poverty.
Appendix A: Quality of Bank Supervision Criteria

Criteria
Bank performance is rated against the following criteria, as applicable to a particular operation. The evaluator should take account of the operational, sector, and country context in weighing the relative importance of each criterion of quality of supervision as it affected outcomes.
• Focus on Development Impact
• Supervision of Fiduciary and Safeguard Aspects (when applicable)
• Adequacy of Supervision Inputs and Processes
• Candor and Quality of Performance Reporting
• Role in Ensuring Adequate Transition Arrangements (for regular operation of supported activities after Loan/Credit closing)

Rating Scale
With respect to relevant criteria that would enhance development outcomes and the Bank’s fiduciary role, rate Quality of Supervision using the following scale:
Highly Satisfactory: There were no shortcomings in the proactive identification of opportunities and resolution of threats.
Satisfactory: There were minor shortcomings in the proactive identification of opportunities and resolution of threats.
Moderately Satisfactory: There were moderate shortcomings in the proactive identification of opportunities and resolution of threats.
Moderately Unsatisfactory: There were significant shortcomings in the proactive identification of opportunities and resolution of threats.
Unsatisfactory: There were major shortcomings in the proactive identification of opportunities and resolution of threats.
Highly Unsatisfactory: There were severe shortcomings in the proactive identification of opportunities and resolution of threats.
Appendix B: Borrower Performance Criteria

Definition: Borrower performance is the extent to which the borrower (including the government and implementing agency or agencies) ensured quality of preparation and implementation, and complied with covenants and agreements, towards the achievement of development outcomes.

Government Performance
Government performance is rated against the following criteria, as applicable to a particular operation. The evaluator should take account of the operational, sector, and country context in weighing the relative importance of each criterion of government performance as it affected outcomes.

Criteria
• Government ownership and commitment to achieving development objectives
• Enabling environment including supportive macro, sectoral, and institutional policies (legislation, regulatory and pricing reforms, etc.)
• Adequacy of beneficiary/stakeholder consultations and involvement
• Readiness for implementation, implementation arrangements and capacity, and appointment of key staff
• Timely resolution of implementation issues
• Fiduciary (financial management, governance, provision of counterpart funding, procurement, reimbursements, compliance with covenants)
• Adequacy of monitoring and evaluation arrangements, including the utilization of M&E data in decision-making and resource allocation
• Relationships and coordination with donors/partners/stakeholders
• Adequacy of transition arrangements
The Yale Journal of Economics is grateful for the financial support of the Yale Department of Economics, the Yale Undergraduate Organizations Committee, and our generous private donors. The typeface used in the Journal is URW Palladio. The Journal was typeset in LaTeX and printed by Yale Printing & Publishing Services in New Haven, CT. Visit our website at http://econjournal.sites.yale.edu/.