The Yale Journal of Economics Fall 2013
Volume 2, Issue 1
Staff Editor-in-Chief Antonia Woodford Managing Editor Moss Weinstock Associate Editors Elijah Goldberg Jimin He Mason Kroll Copy Editors Dhruv Aggarwal James Austin Schaefer Production and Design Editor Madeline McMahon Publisher Brian Lei Board of Advisers Joseph G. Altonji Pinelopi K. Goldberg Samuel S. Kortum Anthony A. Smith, Jr.
New Haven, CT
Website: http://econjournal.sites.yale.edu/
Table of Contents

Editors’ Note

David Hu (Yale)
The Influence of the U.S. Postal Savings System on Bank Runs

Jacob Berman (U. Chicago)
The Maturity Structure of Treasury Debt: How Costly Is Mismanagement?

Erica Segall (Yale)
Three Dimensions of Time: An Age-Period-Cohort Analysis of U.S. Spending Patterns

Jisoo Han (Princeton)
Language Use and Health of Children in Immigrant Households

Aditya Rajagopalan (Princeton)
Speculative Initial Public Offerings: A Disagreement Approach to the IPO Puzzle
This journal is published by Yale College students and Yale University is not responsible for its contents. A full list of references for the papers in this issue is available on our website.
Editors’ Note

One does not need a Ph.D. to conduct original economic research. We founded the Yale Journal of Economics last year with that philosophy in mind, aiming to showcase outstanding research papers in economics written at the undergraduate level. As we worked on our inaugural issue last spring and our second issue this fall, we have constantly been impressed by the number of innovative, thought-provoking papers we have received. Undergraduates from around the world have been excited by the opportunity to have their research published, and we hope to continue to broaden the Journal’s reach in the coming year.

Our second issue contains five essays written by undergraduates at the University of Chicago, Princeton University, and Yale. In the field of economic history, David Hu refutes earlier hypotheses in the literature by examining the effect of the U.S. postal savings system on bank runs during the Great Depression. Jacob Berman simulates counterfactual debt management strategies for the U.S. Treasury to see how changing the maturity structure of debt could save the Treasury billions of dollars. Erica Segall analyzes generational patterns in U.S. spending behavior and proposes a new model of consumer demand to incorporate these effects. Jisoo Han studies immigrant families and examines the connection between language spoken at home and children’s access to health care services in the United States. Closing the issue, Aditya Rajagopalan adds his voice to the IPO Puzzle by modeling the effect of aggregate disagreement on short-term IPO underpricing and long-term IPO underperformance.

We selected these papers from a pool of more than 80 submissions from undergraduates at Yale and other universities. We were thrilled to receive submissions from across the globe, including from university students in Canada, the United Kingdom, Russia, China, Argentina, and Italy, as well as from many schools in the United States.
We hope that the Journal will continue to attract submissions from a wide range of students, and we look forward to featuring undergraduate research from more institutions in the future.

We thank all the students who submitted papers and the professors who nominated the best papers in their classes for our consideration. The Journal’s Board of Advisers also offered us valuable recommendations and guidance. We also thank the generous donors who have made publishing the Journal possible. In addition to the Yale Department of Economics and the Yale Undergraduate Organizations Committee, Stephen Freidheim ’86 and David Swensen GRD ’80 made substantial contributions that allow us to continue to produce the Journal and distribute it free of charge. Thanks to their support, we will be able to publish the Journal twice this academic year, this fall and in the spring of 2014, and make it a more established part of the academic community.
The Influence of the U.S. Postal Savings System on Bank Runs

David Hu, Yale University¹

Abstract. As the only nationwide bank to have its deposits fully backed by the government, the U.S. postal savings system exerted a significant influence on the banking system. In addition to providing a safe haven for deposits, the postal bank was designed to redeposit its holdings with commercial banks so as to prevent government competition with the private sector. Previous research suggests that these features caused the postal savings system to have a negative impact during bank runs: savers would seek refuge with the postal bank, but the postal deposits would not be redeposited with local banks. However, these analyses fail to account for potential endogeneity and regional variation. To resolve these issues and establish causality, I construct instruments from the unique institutional features of the postal savings system. There are three main results that refute previous hypotheses in the literature. First, there is a negative correlation between postal deposits and bank failures. Second, I obtain evidence that redepositing by the postal bank helped to prevent bank runs. Third, I find that the contagion effect of bank failures on the demand for postal deposits is highly localized. Altogether, these results help to refine the existing framework used to understand the relationship between the postal savings system and bank runs.

Keywords: bank runs, U.S. postal savings system

¹ David Hu graduated from Yale University in 2013. This senior essay won the Ellington Prize for the best senior essay in the field of finance. The paper was supervised by Professor Dan Keniston.
1 Introduction
Until the creation of the Federal Deposit Insurance Corporation (FDIC) in 1935, the U.S. postal savings system was the only bank to have its deposits fully insured by the federal government. Given that the U.S. experienced widespread bank runs throughout the 1920s and 1930s, culminating in the Great Depression, it is important to understand how the existence of postal banks may have played a role in altering depositor behavior and preventing bank failures.

The existing literature on the postal savings system is limited but has generally pointed to two conclusions. First, in response to a bank failure, individuals move their savings to the postal bank (Kuwayama 2000; Sissman 1936). Second, the postal savings system failed to limit the bank runs of the Great Depression since it did not redeposit its holdings (Friedman and Schwartz 1963; O’Hara and Easley 1979). However, most of the previous research relies on historical accounts and stylized statistics rather than econometric models.

In this paper, I attempt to address this gap in the literature by gathering the first county-level dataset on postal savings and the first state-level dataset on redeposits from 1911 to 1945. By constructing these datasets, I am able to perform a detailed statistical analysis of the postal savings system. I find that previous research has not adequately addressed endogeneity issues in the relationship between postal deposits and bank failures. Using fixed effects, I reveal a negative correlation between the two variables. Through an instrumental variables approach structured around the rules in the Postal Depository Act of 1910, I also obtain evidence that the postal savings system had a statistically significant effect on bank runs during the 1920s and 1930s. Lastly, I find that the effects of bank failures on the demand for postal deposits are highly localized. All three conclusions contradict previous results in the postal bank literature.

The paper proceeds in seven sections.
Section 2 provides historical background on the postal savings system’s development and highlights its unique institutional features, which are employed in the instrumental variables strategy later on. Section 3 uses previous research to motivate the questions under consideration. Section 4 describes the panel datasets on postal deposits, redeposits, and bank failures. Section 5 outlines an empirical strategy using instrumental variables to analyze the relationship between bank failures and postal deposits as well as the relationship between redeposits and bank failures. Section 6 presents the regression results, and Section 7 offers some conclusions.
2 Historical Background

2.1 The Founding of the Postal Savings System
During the late 19th century, many European countries successfully developed postal savings systems as a way to increase household savings (Schewe 1971). In contrast, the United States was a relatively late adopter. Commercial bankers perceived the postal savings system as a competitive threat, claiming that it could eventually lead to a government takeover of the entire financial sector. It was only after the Panic of 1907 that public support for the postal savings system overcame private sector resistance. The panic left lawmakers to figure out how to restore public confidence in banks and credibility to the financial system. Congress chose to pass the Postal Savings Depository Act of 1910, which authorized the conversion of post offices into government-backed banks where all deposits would be fully insured by the government.
2.2 Institutional Features of the Postal Bank
In response to the bank lobby’s concerns, the postal savings system was designed with several unique institutional features to limit its competition with private banks and ensure that the deposits would remain local. The interest rate on postal deposits was fixed at 2%, significantly less than the 3.5% paid out by most commercial banks in 1910 (O’Hara and Easley 1979). A strict deposit limit of $500 was imposed, later raised to $2,500 in 1918, so the government could argue that the postal bank’s deposits were from poor, rural savers and would not have been placed in private banks anyway.
The Postal Depository Act ordered the conversion of post offices to banks to proceed from first-class post offices, or those with the highest gross annual revenue, to fourth-class offices. Emphasis was placed on keeping higher classification postal banks open over the years; fourth-class postal banks were most likely to be closed in the event of a downturn.
2.3 Redepositing Mechanism
A second major fear was that the system would redeposit its money in large, urban financial markets, depriving the local areas where the savings were generated of investment funds. To avoid this possibility, the Postal Savings Depository Act stated that 95% of postal deposits were to be redeposited in solvent local banks. Only when no local banks were willing to pay the legally required interest rate of 2.25% could the deposits be offered to other banks within the same state or, eventually, placed in federal government securities. In theory, the postal savings system would have the capacity to stop bank runs through redepositing. As depositors transfer savings from commercial banks to postal banks, the postal savings system could redistribute the new deposits back to the local banks. Even if a commercial bank were hit by a wave of withdrawals, the postal savings system could stymie a potential run by redepositing its funds into the threatened bank, serving as a backstop to the banking system by slowing the vicious cycle between deposit withdrawals and bank failures.
3 Related Literature and Theory

3.1 The Causes of Bank Failures during the Great Depression
While little research has been done on the postal savings system, there exists substantial work on the causes of bank failures during the Great Depression. General theories on Depression-era bank failures disagree over whether failures were due to illiquidity or insolvency.

The first line of reasoning, formalized as the Diamond-Dybvig model, argues that banking panics occurred because depositors were concerned about other depositors’ actions (Diamond 1983; Carlson 2002). If depositors believe that withdrawal demand will exceed the bank’s supply of liquid reserves, they will rush to the bank and precipitate a bank run. Friedman and Schwartz (1963) argue that the bank failures of the Great Depression resulted from a sudden, unwarranted crisis of confidence among depositors rather than a fundamental deterioration in bank health. They point to the fall of the large, national Bank of United States as a signal for depositors to initiate a bank run, and because banks are inherently susceptible to runs, a banking crisis became a self-fulfilling prophecy.

The second line argues that bank failures are primarily motivated by economic shocks. In a regression analysis of Great Depression-era bank failures, Calomiris and Mason (2010) find that most national banking panics can be explained by bank fundamentals. Similarly, Wicker (1980; 1996) claims that the perceived banking crisis was the product of regional and bank-specific events rather than the spread of a nationwide panic. In summary, this view argues that real shocks caused banks to become insolvent during the Great Depression.

Each theory contains different implications for the postal bank and its depositors, raising two major questions: What is the effect of bank failures on the level of postal deposits, and what is the effect of redepositing on bank failures? In the illiquidity theory, depositors move their savings to the safety of the postal bank, which can prevent bank runs by providing commercial banks temporary liquidity. In the insolvency theory, depositors may draw on their postal savings to cushion their income if bank failure is part of a larger economic downturn. Furthermore, redepositing will have a limited impact on a fundamentally insolvent bank.
3.2 Postal Bank-Related Literature
Friedman and Schwartz claim that widespread fear in the Great Depression led to a great shift of deposits from the banking system to the postal bank. They argue that the postal bank failed to use the redepositing mechanism to provide temporary liquidity and limit further bank runs. Instead of serving as potential reserves for the banking system, postal deposits were mostly diverted into Treasury bonds. As a result, Friedman and Schwartz believe that the postal bank allowed banks that were illiquid but not insolvent to fail. However, they rely mainly on a qualitative description of the events in question to reach this conclusion.

Sissman (1936) provides a slightly more quantitative analysis of the postal savings system, noting little correlation between nationwide postal deposits and bank failures from 1911 to 1930, though he attributes this to regional variation. O’Hara and Easley (1979) focus specifically on the system’s effect on the savings and loan (S&L) industry. They find that postal deposits increased exponentially from 1929 to 1934, while S&L and commercial bank deposits declined significantly. Correlation metrics also show that states with high bank failures had low redeposits. They conclude that the postal savings system had a very negative impact on the banking system since postal savings were not redeposited at local banks.

Kuwayama (2000) constructs a time-series model to describe the demand for postal savings. Using national-level postal savings data from 1911 to 1967, the regression analysis finds a statistically significant positive correlation between bank failures and postal savings. However, Kuwayama does not address the potential for endogeneity in the bank failures variable. Both the demand for postal savings and the number of suspended banks may be correlated with exogenous shocks, such as crop failures or stock market crashes, and the model ignores these unobserved changes. Kuwayama also does not include the level of redeposits in any of her models even though it is correlated with both postal deposits and bank failures.
4 Dataset
The problem with most of the existing analyses is that they are largely qualitative in nature, built on a broad range of historical judgments rather than formal theory or empirical tests. Even the ones that employ quantitative measures do not fully address potential endogeneity, making it difficult to determine the effect of bank failures on postal deposits and the effect of redeposits on bank failures. Additionally, previous research has only focused on national-level and state-level postal savings data. By constructing the first county-level dataset, I attain a richer picture of the interaction between postal deposits and bank runs.

The data used in this study were compiled from the “Annual Report of the Operations of the Postal Savings System,” which was published by the Postmaster General for every year of the postal bank’s existence. Since bank runs became less frequent in the years following the introduction of the FDIC in 1935, the reports were only digitized for the years from 1911 to 1945. The data contain both the number of postal depositors and the total amount of postal deposits for every operating postal bank branch in the United States, covering an average of 20,000 towns per year. The postal data were then geocoded so that distances could be calculated between towns and counties. The average county contains approximately three postal banks, holding about $200,000 in deposits; by comparison, the average county contains eight commercial banks, holding about $20,000,000 in deposits.

The annual reports also contain state-level data on redeposits (the amount of postal deposits being held by a state’s commercial banks) and bond purchases (the amount of postal deposits used to purchase Treasury bonds). Redeposit data were available only at the state level, not for individual postal banks; statistics on postal deposits are therefore recalculated at the state level to reflect the longer time span of the state-level bank failure dataset. The average state held about $13 million in postal deposits, representing about 1% of total bank deposits in that state.
Only about $5 million of that was redeposited in local banks, with the remainder used to buy government bonds; the averages suggest that the postal bank did not redeposit nearly as much as it was required to by law (95% of postal deposits), though the standard deviation on redeposits is fairly large.

Data on postal bank classification from 1915 to 1940 were available through studies of the postal savings system by the American Bankers Association. In addition to including the postal bank data from the Postmaster General’s annual report, these studies record each postal bank’s classification (first-, second-, third-, or fourth-class). The post office emphasized opening postal banks at locations of a higher classification level, which explains why there are so few fourth-class postal banks on average. It is also clear that a higher class postal bank had significantly more depositors and postal deposits than a lower class postal bank. Statistics on the classifications of all post offices (including those that were not postal banks) were unavailable.

Bank failure data were gathered from two sources. The FDIC collected data on the total number of banks active, total deposits in all banks, number of suspended banks, and the total deposits of suspended banks in every U.S. county for every year from 1921 to 1936. County-level population data were interpolated from U.S. census data (U.S. Census 1910, 1920, 1930, 1940, 1950).² The “United States Historical Data on Bank Market Structure” contains data on the number of banks active, total deposits in all banks, number of suspended banks, and the total deposits of suspended banks in every state for the years from 1921 to 1940, a slightly longer period of time than the FDIC dataset. This second dataset also contains estimates of per capita income by state. However, the series is largely incomplete and interpolated, resulting in a smaller number of observations (480, compared to 929 in the rest of the dataset). Some data on suspended deposits were also missing, making the number of suspended banks the more complete time series at the state level.
5 Empirical Strategy
The arguments in the existing literature suggest two key issues to examine. First, I will analyze whether the redepositing mechanism helped to prevent bank failures. Second, I will examine the effect of bank failures on the demand for postal savings. Along the way, I will present some of the basic endogeneity issues that were unaddressed by previous research. To resolve these issues, I propose an instrumental variables approach based on the unique institutional features of the postal savings system, which include the use of postal classification to locate banks and the requirement that redepositing occur within state lines.

² For inter-census years, the county population is a simple log-linear interpolation based on the two census data points that bracket the year. For example, the population in a given county, $X$, in 1929 is $\mathrm{pop}(X, 1929) = \mathrm{pop}(X, 1920) \cdot e^{9r}$, where $r = [\ln \mathrm{pop}(X, 1930) - \ln \mathrm{pop}(X, 1920)]/10$.
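As an illustrative aside, the log-linear interpolation of county population between census years can be sketched in a few lines of Python. The function name and the population figures are hypothetical and are not drawn from the paper's data; the sketch only assumes that growth is constant in logs between the two bracketing censuses.

```python
import math


def interpolate_population(pop_start: float, pop_end: float,
                           year_start: int, year_end: int, year: int) -> float:
    """Log-linear interpolation between two bracketing census counts."""
    if not (year_start <= year <= year_end):
        raise ValueError("year must lie between the two census years")
    # Annual log growth rate implied by the two census counts.
    r = (math.log(pop_end) - math.log(pop_start)) / (year_end - year_start)
    # Grow the earlier count for the number of years elapsed.
    return pop_start * math.exp(r * (year - year_start))


# Hypothetical county: 10,000 residents in 1920 and 12,000 in 1930.
pop_1929 = interpolate_population(10_000, 12_000, 1920, 1930, 1929)
print(round(pop_1929))
```

By construction the function returns the census counts themselves at the two endpoints, so interpolated values never jump at a census year.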
5.1 The Impact of Redeposits on Bank Failures: Postal Bank Bond Purchases IV and Postal Classification IV
The first question is whether or not the redepositing mechanism served to prevent bank failures. O’Hara and Easley use rank correlation to show that states with high numbers of bank failures had low rates of redeposit. A basic OLS specification to study the same effect follows:

$$F_{it} = \beta_1 R_{it} + \beta_2 O_{it} + \beta_3 I_{it} + \beta_4 T_{it} + \gamma X_{it} + \omega_t + \nu_i + \mu_{it} \tag{1}$$
where $i$ indicates the state, $t$ denotes the year, $F_{it}$ is bank failures (either the number of suspended banks or suspended deposits), $R_{it}$ is redeposits, $O_{it}$ is population, $I_{it}$ is per capita income, $T_{it}$ is total bank deposits, and $X_{it}$ is a vector that includes measures of bank soundness and general economic conditions. $\omega_t$ represents the year fixed effects, and $\nu_i$ represents the state fixed effects.

The main problem with this specification is the potential for endogeneity from the redeposits variable. Redeposits are likely to be correlated with unobserved changes in bank fundamentals. For example, it is possible that the banks were not sound to begin with and therefore would have failed anyway. This was the conclusion reached by Calomiris and Mason (2000), who used regression analysis to show that bank failures were mostly explained by economic conditions. In this case, it would make sense for the government to avoid redepositing in areas with bank failures. Low redepositing is correlated with bank failures because the postal bank would simply want to avoid throwing good money at bad banks. We cannot resolve this issue by adding controls as there are few proxies for economic conditions at the county level.

Instead, there are two ways to instrument for redeposits by the postal bank. First, we can use the number of postal banks per state (fixed at a year before the extensive bank failures of the Great Depression) interacted with the amount of government bonds purchased by the entire postal savings system. The number of postal banks per state has to be fixed to a year with few bank failures to avoid the possibility that new postal banks were opened in response to the bank failures. This interaction term uses the variation in the number of postal banks per state and the national level of bond purchases by the postal bank to fix the amount of postal savings left for redeposit in each state. Using the interaction term to instrument for redeposits, we can then analyze bank failures in each state. The second-stage and first-stage regressions for this IV specification are, respectively:

$$F_{it} = \beta_1 \hat{R}_{it} + \beta_2 O_{it} + \beta_3 I_{it} + \beta_4 T_{it} + \gamma X_{it} + \omega_t + \nu_i + \mu_{it} \tag{2}$$

$$R_{it} = \alpha_1 D_{it} + \alpha_2 O_{it} + \alpha_3 I_{it} + \alpha_4 T_{it} + \delta X_{it} + \omega_t + \nu_i + \mu_{it} \tag{3}$$
where $D_{it}$ denotes our instrument, the number of state postal banks in 1919 interacted with the postal bank bond purchases ($i$ indicates the state, $t$ denotes the year). It is possible that states with lots of postal banks in 1919 are differentially sensitive to nationwide economic shocks and that these shocks are also correlated with both postal bank bond purchases and the number of suspended banks. However, this is unlikely given that postal banks were located by classification level.

A second way to measure the effect of redepositing is to instrument for postal deposits using postal bank classification. The key here is that the Postal Depository Act of 1910 introduced postal banks to different places at different times; first- and second-class post offices were converted to postal banks before third- and fourth-class post offices. The classification level also influences whether a postal bank is likely to stay open, as the post office prioritized keeping higher classification postal banks open. Thus, the classification determines how long an area has had a postal bank, which in turn determines the level of postal deposits and the magnitude of the redepositing effect. The second-stage and first-stage regressions for this IV specification are, respectively:

$$F_{it} = \beta_1 \hat{P}_{it} + \beta_2 O_{it} + \beta_3 I_{it} + \gamma X_{it} + \omega_t + \nu_i + \mu_{it} \tag{4}$$

$$P_{it} = \alpha_1 C_{it} + \alpha_2 O_{it} + \alpha_3 T_{it} + \delta X_{it} + \omega_t + \nu_i + \mu_{it} \tag{5}$$
where $C_{it}$ is a vector that denotes our instrument, the number of first-class, second-class, third-class, and fourth-class postal banks, and $P_{it}$ is the level of postal deposits. Here, $i$ now indicates the county (not state). The rest of the variables remain the same as in the simple OLS specification. Therefore, we can use post office classification as an instrument for postal deposits and then regress bank failures on postal deposits in the second stage.

The advantage of this approach is that it utilizes the discontinuity between classifications created by the Postal Depository Act. Presumably, the classification bands were set arbitrarily such that small differences around the cutoffs in annual postal revenue determined whether two otherwise similar towns received a postal bank or not. However, one of the practical difficulties in implementing this approach is that postal bank classification data are not available for the entire time period. In addition, there is a concern that areas with larger post offices might be differentially susceptible to national shocks. For instance, larger post offices might be more closely connected with national financial institutions than smaller post offices, in which case the instrument may suffer from endogeneity.
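To make the logic of the bond-purchase instrument concrete, the following Python sketch builds an interaction instrument (a fixed 1919 postal bank count times national bond purchases) on a small synthetic panel and compares a fixed-effects OLS estimate with a just-identified IV estimate. Everything in it is an assumption made for illustration: the state labels, counts, bond figures, coefficient values, and variable names are invented, there are no additional controls, and the unobserved shock is constructed to be exactly orthogonal to the instrument. It is not the paper's estimation.

```python
import math

states = ["S1", "S2", "S3", "S4"]                      # hypothetical states
years = [1929, 1930, 1931, 1932, 1933]
banks_1919 = {"S1": 10, "S2": 40, "S3": 25, "S4": 60}  # made-up counts
bond_purchases = {1929: 1.0, 1930: 2.0, 1931: 4.0, 1932: 7.0, 1933: 9.0}  # made-up
keys = [(s, t) for s in states for t in years]


def two_way_demean(x):
    """Remove state and year means from a balanced panel {(state, year): value}."""
    grand = sum(x.values()) / len(x)
    s_mean = {s: sum(x[s, t] for t in years) / len(years) for s in states}
    t_mean = {t: sum(x[s, t] for s in states) / len(states) for t in years}
    return {(s, t): x[s, t] - s_mean[s] - t_mean[t] + grand for (s, t) in keys}


def dot(a, b):
    return sum(a[k] * b[k] for k in keys)


# Instrument: fixed 1919 postal bank count interacted with national bond purchases.
D = {(s, t): banks_1919[s] * bond_purchases[t] for (s, t) in keys}
z = two_way_demean(D)

# Unobserved shock, made orthogonal to the instrument by construction.
raw = {(s, t): math.sin(1 + i + 2 * j + i * j)
       for i, s in enumerate(states) for j, t in enumerate(years)}
w = two_way_demean(raw)
shock = {k: w[k] - dot(w, z) / dot(z, z) * z[k] for k in keys}

beta_true = -0.5  # assumed causal effect of redeposits on failures
state_fe = {s: 5.0 * i for i, s in enumerate(states)}
year_fe = {t: 2.0 * j for j, t in enumerate(years)}
# Redeposits fall when the shock is bad (the postal bank avoids weak banks).
R = {(s, t): 0.02 * D[s, t] - shock[s, t] + state_fe[s] + year_fe[t] for (s, t) in keys}
F = {(s, t): beta_true * R[s, t] + shock[s, t] + 3.0 * state_fe[s] - year_fe[t]
     for (s, t) in keys}

R_dm, F_dm = two_way_demean(R), two_way_demean(F)
ols_est = dot(R_dm, F_dm) / dot(R_dm, R_dm)  # biased by the shock
iv_est = dot(z, F_dm) / dot(z, R_dm)         # single-instrument IV (equals 2SLS here)
print(ols_est, iv_est)
```

In this constructed example the OLS estimate overstates the negative association because redeposits are withheld precisely where shocks are worst, whereas the IV estimate recovers the assumed coefficient. With real data the orthogonality of the instrument to the shock is an assumption, not something that can be built in.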
5.2 The Impact of Bank Failures on Postal Deposits: Nearby State Redepositing IV
The second key question under consideration is whether or not bank failures lead to an increase in postal savings. Kuwayama’s regression analysis suggests that the demand for postal savings rises in response to more bank failures; Sissman reaches a similar conclusion, finding that postal deposits are positively correlated with bank failures. A basic OLS specification to study the link between postal deposits and bank failures follows:

$$P_{it} = \beta_1 F_{it} + \beta_2 O_{it} + \beta_3 T_{it} + \gamma X_{it} + \omega_t + \nu_i + \mu_{it} \tag{6}$$
where $i$ indicates the county, $t$ denotes the year, $P_{it}$ is postal deposits, $F_{it}$ is bank failures (either the number of suspended banks or suspended deposits), $O_{it}$ is population, $T_{it}$ is total bank deposits, and $X_{it}$ is a vector that includes measures of bank soundness and general economic conditions. $\omega_t$ represents the year fixed effects, and $\nu_i$ represents the county fixed effects. The model is similar to the time-series model of aggregate demand for postal savings proposed by Kuwayama but adds in fixed effects and population controls (Kuwayama 2000).

As with the redeposit model, the key concern here is endogeneity. Both postal deposits and bank failures are likely to be correlated with unobserved changes in economic conditions. An unobserved event such as a real economic shock (e.g. a stock market crash) might both create bank failures and cause depositors to withdraw their savings from the postal bank in order to supplement their earnings. The causality could also run in the opposite direction; due to the redepositing mechanism, high levels of postal savings will lead to fewer bank failures. Altogether, it is still difficult to establish causality due to the endogeneity of the bank failures variable.

Moreover, any observed effect between postal deposits and bank failures cannot necessarily be attributed to contagion. When a bank fails, there is a direct displacement effect where depositors of the failing bank must move their savings elsewhere, in addition to a contagion effect where depositors move their money to the postal bank in expectation of other bank failures. So even though Friedman and Schwartz argue that fear drove money into the safety of the postal savings system, it is not entirely accurate to attribute the resulting increase in postal deposits to sudden changes in expectations. To accurately evaluate Friedman and Schwartz’s claim that fear of bank failures caused depositor flight to the postal savings system, we must identify a source of variation in depositors’ perceptions of bank risk that is uncorrelated with both local shocks and the direct displacement effect. One potential source of this variation is redepositing in nearby states.
In the Postal Depository Act, it was decreed that money was “to be invested in government securities and not deposited in other states” in the event that no banks in the same state could satisfy the interest rate requirements of the postal bank. Because redeposits cannot cross state lines, they can only affect bank failures in the state from which they originated. Thus, for any given county, we can identify states that are within a certain geographical distance of the county and then use redeposit levels in the nearby states to instrument for bank failures in those states. The second-stage and first-stage regressions for this IV specification are, respectively:

$$P_{it} = \beta_1 \widehat{FN}_{it} + \beta_2 O_{it} + \beta_3 T_{it} + \gamma X_{it} + \omega_t + \nu_i + \mu_{it} \tag{7}$$

$$FN_{it} = \alpha_1 Z_{it} + \alpha_2 O_{it} + \alpha_3 T_{it} + \delta X_{it} + \omega_t + \nu_i + \mu_{it} \tag{8}$$

where $Z_{it}$ denotes our instrument, redeposits in nearby states, and $FN_{it}$ is bank failures in nearby states. The remaining variables remain the same as in the OLS specification, with $i$ representing county and $t$ representing year. This approach isolates the contagion effect from the displacement effect by assuming that redepositing in nearby states affected the number of bank suspensions in nearby states but is uncorrelated with bank suspensions in the same state as the county being analyzed.
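One way the nearby-state instrument could be assembled from geocoded data is sketched below in Python. The sketch assumes that each state is represented by a single reference point and that "nearby" means within a fixed great-circle radius of the county; the paper does not specify either choice. The coordinates are rough illustrative values and the redeposit figures are invented.

```python
import math


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    radius = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def nearby_state_redeposits(county, state_points, redeposits, radius_miles):
    """Sum redeposits in OTHER states whose reference point lies within the radius."""
    total = 0.0
    for state, (lat, lon) in state_points.items():
        if state == county["state"]:
            continue  # own-state redeposits are excluded from the instrument
        if haversine_miles(county["lat"], county["lon"], lat, lon) <= radius_miles:
            total += redeposits[state]
    return total


# Hypothetical county and state reference points (approximate coordinates).
county = {"state": "CT", "lat": 41.3, "lon": -72.9}
state_points = {"CT": (41.6, -72.7), "MA": (42.3, -71.8),
                "NY": (42.9, -75.5), "CA": (37.2, -119.4)}
redeposits = {"CT": 2.0, "MA": 4.0, "NY": 9.0, "CA": 6.0}  # made-up, $ millions

z_200 = nearby_state_redeposits(county, state_points, redeposits, 200)
z_100 = nearby_state_redeposits(county, state_points, redeposits, 100)
print(z_200, z_100)
```

Varying the radius changes which states count as nearby, which is one natural way to probe how localized any contagion effect is.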
6 Results

6.1 OLS Models
What is the effect of bank failures on the demand for postal savings? Table 1 reports the results of an OLS regression of postal deposits on the number of suspended banks. One of the more interesting results from this model comes from the use of fixed effects. Covering more years and more regions, the dataset also allows for county and year fixed effects to be used for the first time to analyze postal savings. The first column shows a naive regression without fixed effects and indicates that for every suspended bank, postal deposits increase by about $110,000. This positive correlation is what Sissman reported qualitatively and O’Hara and Easley calculated. However, once county and year fixed effects are applied in Columns 2 and 3, the correlation becomes negative. It appears that there are unobserved time-invariant, county-specific factors that are correlated with postal deposits and bank failures (e.g. culture). The use of fixed effects neutralizes these factors and reveals a negative correlation between postal deposits and bank failures, which is a reversal of the effect observed in previous literature. The result remains statistically significant even after controls for population and total bank deposits are added in Column 4.
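The sign reversal under fixed effects can be illustrated with a toy example in Python. The panel below is entirely made up: counties with more bank suspensions also happen to hold more postal deposits on average, while within each county deposits fall as suspensions rise. Only county effects are removed here (no year effects or controls), so this is a sketch of the mechanism rather than a replication of Table 1.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean (the within transformation)."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for v, g in zip(values, groups)]


# Made-up panel: (county, suspended banks, postal deposits in $1,000s).
panel = [
    ("A", 0, 50), ("A", 1, 45), ("A", 2, 40),
    ("B", 3, 120), ("B", 4, 115), ("B", 5, 110),
    ("C", 6, 200), ("C", 7, 195), ("C", 8, 190),
]
counties = [row[0] for row in panel]
failures = [row[1] for row in panel]
deposits = [row[2] for row in panel]

pooled = ols_slope(failures, deposits)  # ignores county differences
within = ols_slope(demean_by_group(failures, counties),
                   demean_by_group(deposits, counties))  # county fixed effects
print(pooled, within)
```

The pooled slope is positive because it mostly compares different counties, while the within-county slope is negative; this is the same qualitative pattern as the move from Column 1 to Column 2.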
Table 1: OLS Estimates of the Effect of Bank Failures on Postal Deposits

Dependent variable: Postal Deposits
                        (1)           (2)           (3)           (4)           (5)
Suspended Banks         109782.80**   -125571.20*   -131640.70*   -210857*      -258744.30*
                        (37466.26)    (56110.60)    (61964.64)    (103247.50)   (115534.40)
Suspended Banks, t-1                                                            -29835.84
                                                                                (33837.14)
Suspended Banks, t-2                                                            170388.50
                                                                                (75569.84)
Population                                                        36.92         53.90
                                                                  (11.17)       (16.10)
Total Bank Deposits                                               -0.0006       -0.004
                                                                  (0.0003)      (0.003)
Constant                173041.40**   254690.70**   54586.82**    -1508042      -2384271**
                        (21257.90)    (19465.96)    (17356.34)    (462076.60)   (733255.50)
County Fixed Effects    No            Yes           Yes           Yes           Yes
Year Fixed Effects      No            No            Yes           Yes           Yes
R²                      0.005         0.28          0.07          0.70          0.32
N                       32578         32578         32578         32511         24665

Dependent variable: Log(Postal Deposits) in Columns 6 and 7; Postal Deposits/Total Deposits in Column 8
                        (6)           (7)           (8)
Suspended Banks         -0.019**      -0.013        -0.006**
                        (0.007)       (0.008)       (0.002)
Suspended Banks, t-1                  -0.009
                                      (0.007)
Suspended Banks, t-2                  0.068
                                      (0.029)
Population                                          -3.44x10^-8
                                                    (2.97x10^-8)
Log(Population)         -0.41         -0.13
                        (0.30)        (0.40)
Log(Total Bank Dep.)    -1.15**       -0.78**
                        (0.07)        (0.07)
Constant                20.69         15.31         0.006**
                        (3.03)        (4.07)        (0.002)
County Fixed Effects    Yes           Yes           Yes
Year Fixed Effects      Yes           Yes           Yes
R²                      0.63          0.64          0.33
N                       32511         23801         32511

Robust standard errors in parentheses, clustered at the county level. *Significant at the 5% level. **Significant at the 1% level. Regression equation: $\text{Postal deposits}_{it} = \beta_1 + \beta_2 \times (\text{suspended banks})_{it} + \beta_3 \times (\text{total bank deposits})_{it} + \beta_4 \times \text{population}_{it} + u_{it}$
Column 5 tests for a lag effect where bank failures in the previous time period may still be affecting postal deposits; however, none of the lagged bank failures are statistically significant. Columns 6 and 8 offer alternate ways of understanding the effect of bank failures on postal deposits. Column 6 indicates that for every suspended bank, postal deposits decline by about 2%. Column 8 indicates that for every suspended bank, postal deposits as a share of total bank deposits decline by 0.006. For comparison, Kuwayama’s regression model found that each additional suspended bank caused postal deposits as a share of total bank deposits to increase by 0.05.

What is the effect of redepositing on bank failures? Table 2 shows the results of an OLS regression of bank failures on the level of redeposits. The naive regression in Columns 1 and 2 shows that the effect of redepositing on the number of suspended banks is statistically insignificant. However, performing a log transformation on the redeposits variable corrects for large outliers in the data. The subsequent regression in Column 3 shows that a 1% increase in redeposits decreases the number of suspended banks by 0.056 (statistically significant at the 5% level). Column 4 tests for the effect of redeposits from previous time periods. In all three time periods (t, t − 1, t − 2), redeposits are inversely correlated with suspended banks, but only redeposits from two time periods ago are statistically significant; this is likely due to the decreased number of observations (336, rather than 480 in the previous regressions). Columns 5-8 perform the same regressions as described in Columns 1-4 but with suspended deposits as the dependent variable. There are slightly fewer observations of suspended deposits than suspended banks at the state level, but the results from Columns 5-8 are similar overall.
The regressions imply that a 1% increase in redeposits decreases suspended deposits by about $5,600,000 (statistically significant at the 5% level). Lagged redeposits prove to be insignificant in Column 8. We cannot take logs of suspended deposits since there are many zeroes in the data. Table 3 estimates a Poisson regression to account for the fact that the number of suspended banks is zero in most cases. The main results are similar to Table 2 but the statistical significance is strengthened to the 1% level. Column 3 regresses suspended banks on a log of redeposits and indicates that a 1% increase in redeposits decreases the number of suspended banks by about 0.005.

Table 2: OLS Estimates of the Effect of Redeposits on Bank Failures

Dependent variable: Number of Suspended Banks in columns (1)-(4); Suspended Deposits (in thousands) in columns (5)-(8).

                        (1)           (2)           (3)           (4)           (5)           (6)           (7)           (8)
Redeposits              4.56x10^-7    -1.38x10^-6                               0.0004        -0.002**
                        (2.19x10^-7)  (8.55x10^-7)                              (0.00001)     (0.0007)
Log(Redeposits)                                     -5.60*        -0.15                                     -5622.85*     -15800
                                                    (2.75)        (0.17)                                    (2169.27)     (15300)
Log(Redeposits), t-1                                              -0.20                                                   -21400
                                                                  (0.25)                                                  (20800)
Log(Redeposits), t-2                                              -0.48**                                                 -3376.74
                                                                  (0.14)                                                  (7001.27)
Population                            0.00005                                                 0.062
                                      (0.00003)                                               (0.038)
Income                                -0.014*                                                 -48.57*
                                      (0.0058)                                                (19.11)
Total Bank Deposits                   -1.09x10^-8                                             0.00003*
                                      (8.94x10^-9)                                            (0.00001)
Log(Population)                                     28.70         0.52                                      61700         66900
                                                    (46.56)       (2.63)                                    (56800)       (115000)
Log(Income)                                         -3.74         -1.10                                     -1.7900       -72600*
                                                    (19.53)       (0.65)                                    (17600)       (32800)
Log(Total Deposits)                                 -24.66        -0.23                                     20600         65400
                                                    (23.04)       (0.71)                                    (22300)       (37100)
Constant                7.05          -68.33        181.10        17.83         594.52        -94100        -1090000      -1640000
                        (2.14)        (58.08)       (713.69)      (35.96)       (200.75)      (70500)       (890000)      (1970000)
State Fixed Effects     Yes           Yes           Yes           Yes           Yes           Yes           Yes           Yes
Year Fixed Effects      Yes           Yes           Yes           Yes           Yes           Yes           Yes           Yes
R2                      0.36          0.31          0.25          0.57          0.20          0.42          0.44          0.23
N                       480           480           480           336           432           432           432           288

1. Robust standard errors are in parentheses, clustered at the state level.
2. *Significant at the 5% level.
3. **Significant at the 1% level.
4. Regression equation: suspendedbanks_it = β1 + β2 × (redeposits)_it + β3 × (totalbankdeposits)_it + β4 × population_it + β5 × income_it + u_it

Table 3: Poisson Estimates of the Effect of Redeposits on Bank Failures

Dependent variable: Number of Suspended Banks.

                        (1)           (2)           (3)           (4)
Redeposits              1.14x10^-8    -1.12x10^-7**
                        (6.85x10^-9)  (2.46x10^-8)
Log(Redep.)                                         -0.46**       -0.15
                                                    (0.09)        (0.15)
Log(Redep.), t-1                                                  -0.42
                                                                  (0.25)
Log(Redep.), t-2                                                  -0.31*
                                                                  (0.14)
Population                            1.91x10^-6
                                      (9.82x10^-7)
Income                                -0.0008*
                                      (0.0002)
Total Bank Dep.                       -2.33x10^-10
                                      (2.98x10^-10)
Log(Population)                                     3.51          1.07
                                                    (2.30)        (3.18)
Log(Income)                                         -0.52         -1.68
                                                    (0.75)        (0.65)
Log(Total Bank Dep.)                                0.08          0.54
                                                    (0.76)        (0.72)
State Fixed Effects     Yes           Yes           Yes           Yes
Year Fixed Effects      Yes           Yes           Yes           Yes
N                       480           480           480           336

1. Robust standard errors are in parentheses, clustered at the state level.
2. *Significant at the 5% level.
3. **Significant at the 1% level.
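The reading of the level-log coefficients in Table 2 can be checked with a short calculation. This is a sketch under the assumption that the Column 3 specification has the form y = a + β·ln(x); the helper name `level_log_effect` is hypothetical.

```python
import math


def level_log_effect(beta, pct_change):
    """Change in y implied by a pct_change percent rise in x when y = a + beta * ln(x)."""
    return beta * math.log(1.0 + pct_change / 100.0)


# Table 2, Column 3: coefficient on Log(Redeposits) is -5.60.
effect = level_log_effect(-5.60, 1.0)
approx = -5.60 / 100.0   # the usual beta/100 shortcut for a 1% change
print(effect, approx)
```

The exact value and the β/100 shortcut both round to the decrease of about 0.056 suspended banks quoted in the text.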
6.2 Models
Although the results above are statistically significant, they do not allow us to conclude that the relationships are causal. In the regression of postal deposits on bank failures, the use of county and year fixed effects mitigates omitted variable bias, but the estimates could still be vulnerable to endogeneity from variables that vary within counties over time. As discussed earlier, crop failures in a specific county may cause banks to fail and lead depositors to withdraw their savings from the postal bank in an attempt to supplement their earnings. This income effect story is one way to explain the results observed in Table 1. At the same time, reverse causality remains a problem, as areas with more postal deposits available for redepositing will likely experience fewer bank failures. The regression of bank failures on redeposits, though promising, does not prove that the redeposits actually prevented bank failures. Thus, an instrumental variables approach is still needed given the potential for endogeneity.

Table 4 presents the second stage results of regressing bank failures on redeposits, instrumenting for redeposits by using the number of state postal banks in 1919 interacted with postal bank bond purchases. Column 1 shows a positive but insignificant correlation between redeposits and the number of suspended banks. As in the OLS regression of suspended banks on redeposits in Table 2, logs were needed to transform outliers in the redeposit data. Columns 2 and 3 show the results of using log of redeposits instead of outright redeposits as an independent variable. The results indicate that a 1% increase in redeposits decreases the number of suspended banks by about 1.0 (5% significance). Columns 4-6 perform the same regressions as described in Columns 1-3 but with suspended deposits as the dependent variable. The results are fundamentally similar and offer an alternate interpretation of the effect of redepositing: a 1% increase in redeposits decreases suspended deposits by about $10,000,000 (5% significance). Table 5 provides the first stage results, which show that states with more postal banks have redeposit levels that are more sensitive to postal bank bond purchases. The F-statistics were greater than 10 in all cases, suggesting that the instruments were strong.
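The rule of thumb that a first-stage F-statistic above 10 signals a strong instrument can be illustrated as follows. This is a sketch on synthetic data that uses the classical homoskedastic F-test for excluding the instrument; the paper reports cluster-robust statistics, so its numbers would be computed differently. The function name `first_stage_f` and the simulated series are assumptions.

```python
import numpy as np


def first_stage_f(endog, instruments, controls):
    """Homoskedastic F-statistic for excluding `instruments` from the first stage."""
    endog = np.asarray(endog, dtype=float)
    n = len(endog)
    instruments = np.asarray(instruments, dtype=float).reshape(n, -1)
    controls = np.asarray(controls, dtype=float).reshape(n, -1)

    def ssr(X):
        # Sum of squared residuals from regressing endog on X.
        coef, *_ = np.linalg.lstsq(X, endog, rcond=None)
        resid = endog - X @ coef
        return float(resid @ resid)

    full = np.column_stack([instruments, controls])
    ssr_unrestricted = ssr(full)
    ssr_restricted = ssr(controls)
    q = instruments.shape[1]            # number of excluded instruments
    dof = n - full.shape[1]             # residual degrees of freedom
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / dof)


# Synthetic example (hypothetical): one relevant and one irrelevant instrument.
rng = np.random.default_rng(2)
n = 480
z = rng.normal(size=n)
const = np.ones((n, 1))
x_strong = 0.5 * z + rng.normal(size=n)   # instrument matters
x_weak = rng.normal(size=n)               # instrument is irrelevant

f_strong = first_stage_f(x_strong, z, const)
f_weak = first_stage_f(x_weak, z, const)
print("strong:", f_strong, "weak:", f_weak)
```

With a single instrument this F-statistic equals the squared t-statistic on the instrument in the first-stage regression.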
Table 6 contains the results of the postal model. The model regresses bank failures on postal deposits, instrumenting for postal deposits using postal bank classification. The results strongly indicate that postal deposits are inversely correlated with bank failures. Columns 1-3 use the number of suspended banks as the dependent variable and are all statistically significant at the 1% level. Column 1 indicates that a $1 increase in postal deposits decreases the number of suspended banks by 0.00000003. Columns 2 and 3 use log of postal deposits as an independent variable, which makes sense given that postal deposits are often in the millions. Columns 2 and 3 show that a 1% increase in postal deposits decreases the number of suspended banks by about 0.004. Columns 4-6 replicate Columns 1-3 but with suspended deposits as a dependent variable. Column 6 is statistically significant at the 5% level and shows that a 1% increase in postal deposits decreases suspended deposits by $418,000. Table 7 provides the first stage results. Overall, the results are consistent with the hypothesis that postal deposits are limiting bank failures through the redepositing effect. This hypothesis may also help explain why postal deposits are negatively correlated with bank failures in the OLS regressions in Table 1.

Finally, I test for the contagion effect caused by bank failures. Table 8 regresses postal deposits on bank failures in nearby states, instrumenting for bank failures in a state by using redeposits in nearby states. First stage results are shown in Table 9. In the first stage, we see that redepositing in nearby states is a significant predictor of bank failures in a state. Column 1 analyzes states within a 25-mile radius of any given county. This was the smallest distance tested for and thus had the most statistical significance (significant at the 1% level). It states that for every bank failure in a nearby state, postal deposits increased by about $14,000. Column 2 adjusts the distance constraint to a 50-mile radius. Bank failures within a 50-mile radius cause postal deposits to increase by $3,500 (significant at the 5% level). Extending the distance constraint even further to 75 miles and 100 miles in Columns 3 and 4 subsequently caused the sign on the postal deposits coefficient to become negative, but the results were not statistically significant. These regressions were also run with the log of postal deposits as the dependent variable, but the results were also insignificant. The results suggest that while there is an increase in the demand for postal savings in response to bank failures, it remains highly isolated within a 25- or 50-mile radius. Within this small radius, we may also be observing the direct displacement effect rather than the contagion effect of bank failures since depositors may cross state lines to move their deposits to the postal bank.

Table 4: Second Stage IV Estimates of the Effect of Redeposits on Bank Failures
Instrument: State Postal Banks in 1919 x Postal Bank Bond Purchases

Dependent variable: Number of Suspended Banks in columns (1)-(3); Suspended Deposits (in thousands) in columns (4)-(6).

                            (1)           (2)           (3)           (4)           (5)           (6)
Redeposits                  0.00005                                   0.069
                            (0.00004)                                 (0.034)
Log(Redeposits)                           -20.87        -104.99*                    -202,000      -106,000**
                                          (15.63)       (42.01)                     (169,000)     (36,300)
Population                  -0.0002                                   0.094*
                            (0.0002)                                  (0.046)
Income                      0.090                                     -30.89
                            (0.089)                                   (21.88)
Total Bank Dep.             7.74x10^-8                                0.00007
                            (7.92x10^-8)                              (0.00005)
Log(Population)                                         125.87                                    119,000
                                                        (102.86)                                  (111,000)
Log(Annual Personal Income)                             -47.40                                    -34,000
                                                        (39.66)                                   (40,100)
Log(Total Bank Dep.)                                    -325.61*                                  -331,000
                                                        (120.25)                                  (132,000)
Constant                    212.11        248.60        5351.59       -57,900       646,000       5,150,000
                            (256.19)      (167.79)      (2597.95)     (89,300)      (697,000)     (2,780,000)
State Fixed Effects         Yes           Yes           Yes           Yes           Yes           Yes
Year Fixed Effects          Yes           Yes           Yes           Yes           Yes           Yes
R2                          0.05          0.24          0.42          0.17          0.33          0.47
N                           480           480           480           432           432           432

1. Robust standard errors are in parentheses, clustered at the state level.
2. *Significant at the 5% level.
3. **Significant at the 1% level.

Table 5: First Stage IV Estimates of the Effect of Redeposits on Bank Failures
Instrument: State Postal Banks in 1919 x Postal Bank Bond Purchases

Dependent variable: Redeposits in columns (1) and (4); Log(Redeposits) in columns (2)-(3) and (5)-(6).

                            (1)           (2)           (3)           (4)           (5)           (6)
State Postal Banks in 1919 x Postal Bank Bond Purchases
                            0.002                                     0.002
                            (0.001)                                   (0.001)
Log(State Postal Banks in 1919) x Log(Postal Bank Bond Purchases)
                                          -0.031*       -0.043**                    -0.021        -0.057**
                                          (0.016)       (0.014)                     (0.018)       (0.018)
Population                  4.46**                                    1.35
                            (0.85)                                    (0.75)
Income                      -2031.02**                                49.79
                            (347.2)                                   (301.10)
Total Bank Deposits         -0.002**                                  -0.001**
                            (0.0003)                                  (0.0003)
Log(Population)                                         1.37                                      1.09
                                                        (0.87)                                    (0.96)
Log(Income)                                             -0.42                                     -0.069
                                                        (0.32)                                    (0.34)
Log(Total Bank Deposits)                                -2.98**                                   -3.50**
                                                        (0.26)                                    (0.31)
Constant                    -4933170      10.65**       46.74**       -1219358      4.06**        56.06**
                            (1780658)     (0.63)        (12.32)       (1508721)     (0.66)        (13.44)
State Fixed Effects         Yes           Yes           Yes           Yes           Yes           Yes
Year Fixed Effects          Yes           Yes           Yes           Yes           Yes           Yes
R2                          0.56          0.51          0.73          0.28          0.46          0.68
N                           480           480           480           432           432           432
F-stat of instrument        12.85         10.10         36.26         13.16         23.53         34.06

1. Robust standard errors are in parentheses, clustered at the state level.
2. *Significant at the 5% level.
3. **Significant at the 1% level.

Table 6: Second Stage IV Estimates of the Effect of Postal Deposits on Bank Failures
Instrument: Postal Classification

Dependent variable: Number of Suspended Banks in columns (1)-(3); Suspended Deposits (in thousands) in columns (4)-(6).

                        (1)           (2)           (3)           (4)           (5)           (6)
Postal Deposits         -3.19x10^-7**                             -0.0005**
                        (2.23x10^-8)                              (0.0001)
Log(Postal Dep.)                      -0.45**       -0.36**                     -323.76       -418.01*
                                      (0.05)        (0.06)                      (205.17)      (168.99)
Population              0.00002                                   0.050**
                        (8.62x10^-7)                              (0.004)
Total Bank Dep.         9.23x10^-11                               0.000008**
                        (1.40x10^-10)                             (0.0000006)
Log(Pop.)                                           0.11                                      1246.37**
                                                    (0.11)                                    (336.60)
Log(Total Bank Dep.)                                0.056                                     1247.36*
                                                    (0.008)                                   (505.64)
Constant                -0.75         3.33**        1.10**        -2250.75**    -228.85       -25800.00**
                        (0.04)        (0.36)        (1.77)        (204.94)      (1469.27)     (7599.12)
County Fixed Effects    Yes           Yes           Yes           Yes           Yes           Yes
Year Fixed Effects      Yes           Yes           Yes           Yes           Yes           Yes
R2                      0.15          0.12          0.08          0.06          0.18          0.15
N                       30236         30236         30236         28562         28562         28562

1. Robust standard errors are in parentheses, clustered at the state level.
2. *Significant at the 5% level.
3. **Significant at the 1% level.
Table 7: First Stage IV Estimates of the Effect of Postal Deposits on Bank Failures
Instrument: Postal Classification

Dependent variable: Redeposits in columns (1) and (4); Log(Redeposits) in columns (2)-(3) and (5)-(6).

                            (1)           (2)           (3)           (4)           (5)           (6)
No. 1st Class Postal Banks  178710.40*    0.072         0.16**        102590.60**   0.89**        0.35**
                            (57375.81)    (0.068)       (0.07)        (43807.98)    (0.017)       (0.02)
No. 2nd Class Postal Banks  1012199**     0.20**        0.21**        43807.98**    0.13**        0.015
                            (25491.35)    (0.03)        (0.03)        (6923.83)     (0.012)       (0.25**)
No. 3rd Class Postal Banks  104747.50**   0.30**        0.21**        10694.57      0.28**        0.25**
                            (17147.21)    (0.02)        (0.02)        (6764.82)     (0.012)       (0.01)
No. 4th Class Postal Banks  -28253.83     0.17**        0.05**        90441.81**    0.54**        0.58**
                            (37856.32)    (0.05)        (0.04)        (22007.44)    (0.038)       (0.04)
Population                  28.22**                                   5.91**
                            (0.47)                                    (0.74)
Total Bank Deposits         -0.0005**                                 0.093**
                            (0.00002)                                 (0.00004)
Log(Population)                                         -0.75**                                   0.61**
                                                        (0.14)                                    (0.03)
Log(Total Bank Deposits)                                -1.12**                                   0.022**
                                                        (0.03)                                    (0.02)
Constant                    -2120880**    6.68**        23.48**       -238688.30**  6.46**        -1.17**
                            (40372.75)    (0.05)        (1.33)        (32593.21)    (0.06)        (0.18)
County Fixed Effects        Yes           Yes           Yes           Yes           Yes           Yes
Year Fixed Effects          Yes           Yes           Yes           Yes           Yes           Yes
R2                          4.95          18.55         17.39         7.43          14.32         12.11
N                           30236         30236         30236         28562         28562         28562
F-stat of instrument        4.95          18.55         17.39         7.43          14.32         12.11

1. Robust standard errors are in parentheses, clustered at the state level.
2. *Significant at the 5% level.
3. **Significant at the 1% level.
4. Regression equation: postaldeposits_it = β1 + β2 × (class1)_it + β3 × (class2)_it + β4 × class3_it + β6 × class4_it + β7 × totalbankdeposits_it + β8 × population_it + u_it

Table 8: Second Stage IV Estimates of the Effect of Bank Failures in Nearby States on Postal Deposits
Instrument: Redeposits in Nearby States

Dependent variable: Postal Deposits.

                                (1)           (2)           (3)           (4)
Bank Failures in Nearby States  13777.82**    3458.29*      -6923.44      -3294.55
                                (2549.84)     (1399.28)     (4167.59)     (2815.38)
Population                      3.73**        37.27**       40.49**       40.37**
                                (1.10)        (0.71)        (0.67)        (0.58)
Total Bank Dep.                 1.30**        0.94**        0.037         -0.05
                                (0.21)        (0.20)        (0.19)        (0.16)
Constant                        -186919.70    -1896968**    -1916709**    -1803514**
                                (103243)      (63967)       (48963.29)    (40630.82)
County Fixed Effects            Yes           Yes           Yes           Yes
Year Fixed Effects              Yes           Yes           Yes           Yes
R2                              0.51          0.43          0.41          0.71
N                               6548          16515         23314         27847

1. Robust standard errors are in parentheses, clustered at the state level.
2. *Significant at the 5% level.
3. **Significant at the 1% level.
Table 9: First Stage IV Estimates of the Effect of Bank Failures in Nearby States on Postal Deposits
Instrument: Redeposits in Nearby States

Dependent variable: Bank Failures in Nearby States.

                                (1)           (2)           (3)           (4)
Redeposits in Nearby States     -7.37x10^-7** -7.15x10^-7** -7.35x10^-7** -8.55x10^-8**
                                (5.04x10^-8)  (3.15x10^-8)  (2.70x10^-8)  (2.56x10^-8)
Population                      0.00003       0.00003       0.00006*      0.00006
                                (0.00003)     (0.00003)     (0.00003)     (0.00003)
Total Bank Dep.                 -9.16x10^-7   -4.02x10^-6   -6.52x10^-6   -6.83x10^-6
                                (5.69x10^-6)  (6.34x10^-6)  (4.70x10^-6)  (8.57x10^-6)
Constant                        7.82          6.35**        7.58**        10.02**
                                (2.73)        (1.00)        (1.91)        (2.08)
County Fixed Effects            Yes           Yes           Yes           Yes
Year Fixed Effects              Yes           Yes           Yes           Yes
R2                              0.53          0.50          0.51          0.53
N                               6548          16515         23314         27847
F-stat                          12.62         12.94         14.79         15.23

1. Robust standard errors are in parentheses, clustered at the state level.
2. *Significant at the 5% level.
3. **Significant at the 1% level.
4. Regression equation: nearbystatessuspendedbanks_it = β1 + β2 × nearbystatesredeposits_it + β3 × totalbankdeposits_it + β4 × population_it + u_it

7 Conclusion

Altogether, there are several key findings that contradict the existing theories on the postal savings system. First, the use of fixed effects reverses the positive correlation between postal deposits and bank failures observed by Kuwayama and Sissman. Fixed effects neutralize the time-invariant factors at the county level, revealing a negative correlation. One way to interpret this result is that bank failures were indicative of real economic shocks that decreased postal depositors’ incomes and caused them to draw on their savings. Alternatively, due to the redepositing mechanism, it could be a case of reverse causality.

Second, bank failures and redeposits are negatively correlated. On the surface, the OLS regression confirms the result from O’Hara and Easley’s analysis, which uses rank correlation to show that states with many bank failures also had low levels of redepositing. However, this result does not prove that low redepositing caused certain states to experience more bank failures. The two instrumental variables approaches help to establish the causal relationship that redepositing did have a significant effect in preventing bank failures. Most significantly, the results from the instrumental variables regressions refute Friedman and Schwartz’s as well as O’Hara and Easley’s criticisms of postal bank redepositing during the Great Depression. The results from the postal bank bond purchases IV indicate that every 1% increase in redeposits prevented 1 bank failure; in the OLS regression, a 1% increase in redeposits prevented only 0.056 bank failures. The large discrepancy between the IV and OLS specifications may be explained by the fact that many bank failures occurred due to fundamental reasons, so the effect of redeposits on bank failures in a naive regression is low. Furthermore, the result suggests that when the postal bank chose to redeposit, it did so strategically and to great effect. Although there may be some endogeneity issues with the postal classification IV, its results are consistent with the first IV. Given the strength of the redepositing mechanism, it also appears that the OLS regression of postal deposits on bank failures is likely to be a case of reverse causality.

Third, the contagion effects of bank failures on the demand for postal savings appear to be limited. The nearby state redepositing IV finds that only bank runs in nearby states within a 25- or 50-mile radius of a given county caused postal deposits to increase. This result contradicts Friedman and Schwartz’s description of a “contagion of fear” spreading across the country during the Great Depression and more closely aligns with Wicker’s description of bank runs spreading within small, regional areas.

In summary, there are three main results from the regression analysis. First, the correlation between postal deposits and bank failures is actually negative and likely signals a case of reverse causality due to the redepositing effect. Second, redeposits had a statistically significant effect in preventing bank failures. Third, the contagion effect of bank failures on postal deposits only exists in a localized area. Overall, the analysis refutes many hypotheses from previous research on the postal bank and complements the literature on the causes of bank failures during the Great Depression.
The Maturity Structure of Treasury Debt: How Costly Is Mismanagement?

Jacob Berman, University of Chicago1

Abstract. This paper explores how the maturity structure of public debt affects the evolution of debt/GDP and net interest/GDP in the United States. I construct a model to simulate counterfactual debt management strategies and then extend the model forward to estimate how changes in the maturity structure can be expected to influence debt management outcomes in the future. In my preferred strategy, I find that the Treasury could save as much as $424 billion in borrowing costs over the next 10 years by rapidly increasing the maturity of new issues.

Keywords: debt, GDP, United States
1 Jacob Berman graduated from the University of Chicago in 2013 and this is his senior thesis. The author would like to thank his adviser, John Cochrane, whose guidance and generous support made this project possible. He is also grateful to Victor Lima, Robert Shimer, Grace Tsiang, and Cynthia Wu, who provided thoughtful comments throughout the writing process.
1 Introduction
The global financial crisis has led to a dramatic increase in public budget deficits across most advanced economies. In the United States, the federal budget deficit as a percentage of GDP increased from 1.2% in 2007 to 10.1% in 2009. Policymakers have offered a variety of budget plans that seek to reduce these deficits by cutting outlays through changes to federal spending programs, or increasing revenue through changes to the federal tax code. However, there is a third option for reducing budget deficits that has been largely ignored. The mechanism behind explosive debt-GDP ratios is a model of how net interest payments evolve over time. By determining the maturity structure of the United States debt portfolio, the Treasury Department has substantial control over the level and timing of these payments. The maturity structure of public debt is a major policy decision that directly affects long-run budget sustainability. How should we evaluate the Treasury’s previous debt management policy, and what should new policy be going forward? To answer this, I construct a model to simulate counterfactual debt management strategies. The two major debt management decisions faced by a government are the type and proportion of securities to offer. By altering issuance relative to a historical baseline, I approximate how different policy would have affected the evolution of debt and interest payments in the United States between 1948 and 2012. I then extend the model forward using forecasts from the Congressional Budget Office to estimate how changes in the maturity structure can be expected to influence debt management outcomes over the next 10 years.
2 Background
Behind headline debt numbers are a diverse set of securities, each with different properties and obligations. Table 1 shows the Monthly Statement of the Public Debt as reported by the Bureau of the Public Debt for December 31, 2012. At that time, gross debt outstanding in the United States was slightly over $16.4 trillion. Gross Treasury obligations can be divided into two categories: debt held by the public and intragovernmental
holdings. Intragovernmental debt is debt one part of the government owes another and generally represents the assets of social insurance trust funds; these debts are essentially accounting mechanisms used by the Treasury to separate trust funds from general spending and have no net effect on borrowing. Debt held by the public is the sum of the face value of all outstanding securities held outside federal government accounts. The majority of debt held by the public is marketable debt, which is frequently traded in the secondary market. This consists of zero-coupon bills, coupon-paying notes and bonds, and, as of 1997, Treasury Inflation-Protected Securities (TIPS), which are indexed to the Consumer Price Index (CPI); TIPS account for 7.7% of marketable public debt. About 4% of the public debt is made up of nonmarketable securities, which are nontransferable debt instruments. These consist primarily of savings bonds, investments from state and local governments, civil service retirement savings plans, and a variety of other small obligations. I was unable to find data on the maturity structure of nonmarketable debt, so I exclude it from this analysis.2 Debt held on the balance sheet of the Federal Reserve is considered to be debt held by the public. Although some authors prefer to include these obligations with intragovernmental debt, this accounting is not appropriate for my exercise. Since the Fed is required to buy securities on the open market, debt held by the Fed is just as much an obligation of the Treasury as if it were held by a private investor. The Treasury and the Fed face different institutional incentives and there is no reason the Fed would permit the Treasury to default on this debt. Policy decisions from the Fed affect Treasury financing in two ways: through open market operations that determine the interest rate the Treasury must offer and through remittances to the Treasury that affect the primary surplus.
The Fed remits net income to the Treasury after expenses and dividends are paid out; the National Income and Product Accounts (NIPA) treat these deposits as revenue, not negative spending on net interest. During normal times these profits account for only 1% of federal revenues and can be safely ignored in this analysis.

2 The government makes a variety of other implicit financial commitments that are not reported in most debt statistics. Some examples include unfunded mandates to provide social insurance, civil and military pensions, guarantees from government sponsored enterprises, or (perceived) guarantees of systemically important financial institutions. Since these commitments are less formal and more difficult to quantify, I exclude them from my model.

Table 1: Monthly Statement of the Public Debt, December 31, 2012 (millions of dollars)

Title                         Debt Held By the Public   Government Holdings   Totals
Marketable:
  Bills                       1,626,480                 2,491                 1,628,971
  Notes                       7,320,862                 6,253                 7,327,115
  Bonds                       1,236,669                 3,492                 1,240,161
  TIPS                        849,473                   371                   849,844
  Federal Financing Bank      0                         7,112                 7,112
  Total Marketable            11,033,484                19,719                11,053,202
Nonmarketable:
  Domestic Series             29,995                    0                     29,995
  Foreign Series              2,986                     0                     2,986
  State and Local Series      162,587                   0                     162,587
  US Savings Securities       162,513                   0                     162,513
  Government Account Series   168,647                   4,831,000             4,999,647
  Hope Bonds                  0                         494                   494
  Other                       1,306                     0                     1,306
  Total Nonmarketable         548,034                   4,831,494             5,379,528
Total Public Debt             11,581,518                4,851,213             16,432,730

The Treasury faces a tradeoff when deciding whether to issue short-term or long-term securities. If the yield curve is upward sloping, favoring shorter maturities may be optimal. Since investors typically demand a term premium, the interest payments on short-term debt will be lower. Additionally, if interest rates fall, the government will be able to easily refinance the debt at a lower cost. On the other hand, favoring longer maturities may be optimal if the government is concerned about insuring against rollover risk. Long-term debt smooths fiscal shocks. By locking in interest rates today, unanticipated changes to interest rates or primary budget balances will be less likely to affect borrowing. If interest rates rise in the future, the government will not be forced to refinance large quantities of debt at high rates. Given its importance to long-run fiscal sustainability, one might expect the maturity structure to adjust in response to changing interest rates and primary deficits. The Treasury has opted not to pursue this strategy and instead issues debt in a “regular and predictable pattern.” Figure 1 shows that the maturity structure over the past three decades has remained stable.

Figure 1: Distribution by Type of Security (fraction of total, from 0 to 1, by year from 1980 to 2010, for bills, notes, and bonds)
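The shares quoted in this section can be recomputed from the debt-held-by-the-public column of Table 1. The sketch below copies those figures (in millions of dollars); the variable names are illustrative.

```python
# Debt held by the public, millions of dollars, copied from Table 1.
marketable = {
    "Bills": 1_626_480,
    "Notes": 7_320_862,
    "Bonds": 1_236_669,
    "TIPS": 849_473,
    "Federal Financing Bank": 0,
}
nonmarketable = 548_034   # "Total Nonmarketable" row

total_marketable = sum(marketable.values())
total_public = total_marketable + nonmarketable

# Share of TIPS in marketable debt and of nonmarketable securities in public debt.
tips_share = marketable["TIPS"] / total_marketable
nonmarketable_share = nonmarketable / total_public

print(f"TIPS share of marketable debt: {tips_share:.1%}")
print(f"Nonmarketable share of debt held by the public: {nonmarketable_share:.1%}")
```

The component rows add up to the printed totals, TIPS come to roughly 7.7% of marketable debt, and nonmarketable securities come to a little under 5% of debt held by the public.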
3 Literature Review
The research agenda on optimal debt management has been heavily influenced by the neoclassical literature on optimal fiscal policy. Barro (1979) and Angeletos (2002) focus on how long-term debt allows for tax-smoothing in a neoclassical, stochastic production economy. In a similar exercise with nominal debt, Calvo and Guidotti (1992) show it is possible for a government to pick a maturity structure such that each successive government facing similar preferences is provided the incentive to follow the strategy of the initial government. Other research such as Krishnamurthy and Vissing-Jorgensen (2012) and Greenwood et al. (2010) has focused on the monetary services. Cochrane (2001) shows that within a fiscal model of the price level, long-term debt helps to smooth policy since it will generally produce the minimum variance of inflation. Hall and Sargent (2011) is the only major empirical study of optimal debt management for the U.S. Treasury. They note that net interest payments as reported by the Treasury only measure accrued interest on zero-coupon bonds and cash disbursed for coupon payments. This fails to account for real capital gains (or losses) and so does not measure the true burden of servicing the debt. Their results indicate that properly measured interest payments are significantly more volatile, but lower on average than the Treasury-reported series.
4 Model
Counterfactual simulations only have meaning relative to an appropriate baseline. One approach would be to design a stylized model of debt management and compare the model’s debt time series to the historically observed time series. However, funding the largest government in the world is a complicated business; any model will be a poor approximation of the Treasury’s complex accounting conventions, so comparing counterfactuals to the actual debt time series would be inconsistent. My approach, instead, is to create a model that replicates as closely as possible the major debt management decisions the Treasury makes, feed these decisions in as an input, and use the output as the historical baseline. If the model time series is tightly correlated with the relevant historically observed time series, then we can be confident that the model correctly captures basic debt management dynamics. Furthermore, since we generate counterfactuals using the same model, we can be sure that any comparisons will be consistent.
The two major debt management decisions faced by a government are the type and proportion of securities to offer. The two security types offered are zero-coupon bills with a maturity of one year or less and coupon-paying notes and bonds with maturities greater than a year.3 In 1997, the Treasury began offering TIPS, which have their principal and coupon payments indexed to the CPI. Although TIPS will likely be an important part of Treasury debt management in future years, they currently remain small relative to nominal obligations. In the interest of tractability, I exclude them from my model.
3 Notes and bonds currently make coupon payments every 6 months; however, some debt issued prior to WWI such as the Panama Canal Loans or the Consols of 1900 made quarterly payments.
Let $\mathrm{prin}_t^{\mathrm{old}}$ and $\mathrm{coup}_t^{\mathrm{old}}$ be, respectively, the existing principal and existing coupon payments due at time $t$. Although the Treasury occasionally initiates open-market buyback programs to pay down old debt, existing obligations are almost always paid as promised. The total financing needs at $t$ will be
$$\mathrm{due}_t = \mathrm{oldcash}_t + \mathrm{coup}_t^{\mathrm{old}} + \mathrm{coup}_t^{\mathrm{new}} + \mathrm{prin}_t^{\mathrm{old}} + \mathrm{prin}_t^{\mathrm{new}} + \mathrm{def}_t \tag{1}$$
with $\mathrm{def}_t$ denoting the government’s primary budget deficit, and $\mathrm{coup}_t^{\mathrm{new}}$ and $\mathrm{prin}_t^{\mathrm{new}}$ denoting coupon and principal payments due at $t$ on securities auctioned within the model. The primary deficit is all expenditures other than net interest minus revenues. The Treasury’s need for short-term cash is difficult to predict in any given period, so it will occasionally retain cash balances from previous periods, which I denote as $\mathrm{oldcash}_t$. That is, if $\mathrm{due}_t < 0$, then $\mathrm{oldcash}_{t+1} = \mathrm{due}_t$ and we set $\mathrm{due}_t = 0$. Without this provision it would be necessary to assume either that the Treasury would initiate very short-term buyback programs or that it could invest cash in the private market, neither of which is practical.
4.1
A Simple Zero Coupon Model
Suppose the Treasury can only auction a single zero-coupon security that pays one dollar at maturity $k \in K$, where $K$ is the set of available maturities. Using the basic discount formula, the principal from each issue will come due at time $t + k$ such that
$$\mathrm{prin}_{t+k}^{\mathrm{new}} = \mathrm{due}_t (1 + y_{kt})^{k} \tag{2}$$
with $y_{kt}$ being the relevant yield to maturity. Total debt outstanding is simply the sum of existing and new principal payments due. Note that after the Treasury decides which maturity to offer in the first period, it is constrained to offer only that same security in all other periods.
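The financing identity in equation (1), the cash carry-forward rule, and the zero-coupon principal in equation (2) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the author's code: the function names are hypothetical, and the yield is treated as an annual decimal rate with the maturity measured in years.

```python
def financing_need(oldcash, coup_old, coup_new, prin_old, prin_new, deficit):
    """Equation (1): total cash the Treasury must raise at time t."""
    return oldcash + coup_old + coup_new + prin_old + prin_new + deficit


def settle_period(due):
    """Cash-balance rule: a negative need is carried into the next period.

    Returns (amount_to_auction, oldcash_next_period).
    """
    if due < 0:
        return 0.0, due  # surplus cash reduces next period's financing need
    return due, 0.0


def zero_coupon_face(due, annual_yield, years):
    """Equation (2): face value owed at t + k in order to raise `due` today.

    Assumes an annual decimal yield and a maturity in years.
    """
    return due * (1.0 + annual_yield) ** years


need = financing_need(-5.0, 1.0, 3.0, 2.0, 4.0, 10.0)
to_auction, carry = settle_period(need)
face = zero_coupon_face(to_auction, 0.05, 1)
```

In this toy period the carried-over cash of 5 offsets part of the obligations, so the full positive need is auctioned and nothing is carried forward.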
4.2
Zero-Coupon and Coupon-Paying Securities with Time-Dependent Issuance
Let $p_t$ be a vector of prices at time $t$ with $p_{kt} \in p_t$ being the price of a security with maturity $k$ at the time of its issue. The Treasury’s current procedure for determining the coupon rates on new offerings is to round the yield bids to the nearest multiple of 1/8 of a percent such that the security is auctioned below par. That is, for a security with maturity $k$ and a par value of one dollar the coupon is
$$c_{kt} = \frac{\lfloor 8(1 + y_{kt}) \rfloor}{8} \tag{3}$$
with price
$$p_{kt} = \frac{c_{kt}}{y_{kt}}\left(1 - (1 + y_{kt})^{-k}\right) + (1 + y_{kt})^{-k} \tag{4}$$
Let $i_{kt} \in i_t$ be the fraction of $k$-maturity securities being auctioned at $t$ out of all auctions that period. Since all securities are issued below par, we must define a scalar
$$f_{kt} = \frac{i_{kt}\, p_{kt}}{i_t^{\top} p_t} \tag{5}$$
Principal due at $t + k$ associated with security $k$ auctioned at $t$ will be
$$\mathrm{prin}_{k,t+k} = \frac{\mathrm{due}_t\, f_{kt}}{p_{kt}} \tag{6}$$
with $\mathrm{prin}_{kt} \in \mathrm{prin}_t$. Total principal due from new auctions is $\mathrm{prin}_t^{\mathrm{new}} = \mathrm{prin}_t^{\top}\mathbf{1}$, where $\mathbf{1}$ is a vector of ones. Total coupon payments due associated with any particular maturity $k$ are defined as $\mathrm{coup}_{kt} \in \mathrm{coup}_t$. Within a given maturity, coupon payments from different auctions can still become due at the same date, so it is necessary to index each coupon by its auction date $\tau$:
$$\mathrm{coup}_{k,t,\tau} = \frac{\mathrm{prin}_{k,\tau+k}\, c_{k\tau}}{2} \tag{7}$$
for all $t \in \{\tau + 6, \tau + 12, \ldots, \tau + k\}$. Total coupons due from new auctions will be
$$\mathrm{coup}_{kt} = \sum_{\tau=1}^{t} \mathrm{coup}_{k,t,\tau} \tag{8}$$
and $\mathrm{coup}_t^{\mathrm{new}} = \mathrm{coup}_t^{\top}\mathbf{1}$. The face value of all debt outstanding is the sum of all future principal obligations or
$$d_t = \sum_{i=t+1}^{t+\max K} \mathrm{prin}_i^{\top}\mathbf{1} \tag{9}$$
Interest payments are currently calculated as period-by-period coupon payments plus the difference between par value and market value at sale for zero-coupon bonds. Thus interest payments at $t + k$ associated with each maturity $k$ are
$$\mathrm{int}_{k,t+k} = (1 - p_{kt})\, \mathrm{prin}_{k,t+k} + \mathrm{coup}_{k,t+k} \tag{10}$$
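The pricing and allocation steps in equations (3) through (6) can be sketched in Python as follows. This is illustrative only and not the author's code: the function names are hypothetical, `price` treats the coupon and yield as per-period decimals, and `coupon_rate` reads the rounding rule in the prose as rounding a yield quoted in percent down to the nearest eighth of a percent, which is an assumption about units that the text does not spell out.

```python
import math


def coupon_rate(yield_pct):
    """Round a yield quoted in percent down to the nearest 1/8 of a percent.

    Assumption: rounding down keeps the coupon at or below the yield,
    so the security sells at or below par.
    """
    return math.floor(8 * yield_pct) / 8


def price(c, y, k):
    """Equation (4): price per dollar of par for coupon c, yield y, k periods."""
    discount = (1 + y) ** (-k)
    return c / y * (1 - discount) + discount


def cash_shares(issuance, prices):
    """Equation (5): share of cash raised by each maturity."""
    total = sum(i * p for i, p in zip(issuance, prices))
    return [i * p / total for i, p in zip(issuance, prices)]


def new_principal(due, shares, prices):
    """Equation (6): par value issued in each maturity to raise `due`."""
    return [due * f / p for f, p in zip(shares, prices)]


# Hypothetical auction: two maturities, each half of issuance by par value.
prices = [0.98, 0.95]
shares = cash_shares([0.5, 0.5], prices)
principal = new_principal(100.0, shares, prices)
cash_raised = sum(par * p for par, p in zip(principal, prices))
```

Two checks follow from the formulas: a coupon equal to the yield prices the security exactly at par, and the cash raised across maturities adds back up to the financing need.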
As noted by Hall and Sargent (2010), the net interest series reported by the Treasury does not completely measure the burden of servicing the debt since it does not measure capital gains. In the context of the government budget constraint, their alternative measure is clearly preferable. However, given that the Treasury rarely repurchases debt before maturity, it is not obvious that mark-to-market accounting for liabilities is appropriate from the perspective of a government. A mark-to-market loss on the value of a government’s asset (e.g. the Grand Canyon, or the interstate highway system) would be similarly meaningless if the asset will never be sold. Since the Treasury is continuously auctioning securities at or near par, cash outlays to investors are an important measure for evaluating the cost of borrowing over time. Servicing a stock of debt requires a flow of payments. Net interest/GDP is the percentage of all national income devoted to this flow of payments. Since interest payments must be financed through taxes, which distort economic incentives, net interest/GDP is my primary measure of the burden of the national debt. Calculating net interest is also useful since it allows the cost of different debt management strategies to be compared with the cost of tax and spending proposals scored using conventional budget accounting.
5
Data
5.1
Historical Data
My primary dataset is the Monthly U.S. Treasury Database from the Center for Research in Security Prices (CRSP). The CRSP data provides monthly market data such as prices, amount outstanding, yields, and durations on nominal, publicly traded Treasury securities since 1925. Existing principal and coupon obligations are taken from the December 31, 1947 CRSP observation and missing data is replaced using the Treasury’s official Monthly Statement of the Public Debt (MSPD) from the same date. Yield data is taken from the CRSP Fixed Term Index file, which is available for maturities of 1, 2, 5, 7, 10, 20, and 30 years. For the remaining securities, I use CRSP yields from the month in which the security was auctioned. I interpolate data for missing months using cubic splines. An important assumption I make throughout is that the yield for each security does not depend on the quantity being issued.
The model also requires the amount of each security issued in each period. This is easy to find using Treasury auction data. Unfortunately, this data is only publicly available going back to 1980, so it is necessary to estimate issuance using alternate methods. CRSP data assigns a unique CUSIP number to each security outstanding, but because the Treasury frequently “reopens” securities, an individual CUSIP frequently represents several different auctions. Thus, a reopening appears in the data as an increase in the amount outstanding for an individual CUSIP. I mark a new issue if the amount outstanding increases by more than 15%. This seems to be the level that excludes arbitrary small changes that appear in CRSP data, and it also matches well with the post-1980 Treasury data. Since CRSP data is only given for the last day of the month, this procedure requires the assumption that the Treasury conducts all auctions on a single day each month. For this reason, interest rate movements within a given month are not captured.
I take data on GDP and primary surpluses from the official NIPA tables. NIPA accounting includes payments made to military and civil service trust funds which are not part of the publicly held debt, so these are netted out using NIPA Table 3.18B, line 24. I assume that primary surpluses are evenly distributed throughout the calendar year.
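The reopening rule described above (mark a new issue whenever the amount outstanding for a CUSIP rises by more than 15%) might be implemented along the following lines. This is a minimal sketch, not the author's procedure: the function name and the sample series are hypothetical, and treating the first positive observation as the original issue is an added assumption.

```python
def flag_new_issues(outstanding, threshold=0.15):
    """Return the month indices at which a new issue is marked.

    `outstanding` is a month-by-month series of the amount outstanding
    for one CUSIP. A month is flagged when the amount grows by more than
    `threshold` relative to the previous month. The first positive
    observation is treated as the original issue (an assumption).
    """
    issues = []
    previous = 0.0
    for month, amount in enumerate(outstanding):
        if previous == 0.0:
            if amount > 0.0:
                issues.append(month)
        elif (amount - previous) / previous > threshold:
            issues.append(month)
        previous = amount
    return issues


# Hypothetical CUSIP: issued in month 1, small 3% change in month 3
# (ignored), reopened in month 4 with a 26% increase.
sample = [0.0, 100.0, 100.0, 103.0, 130.0, 130.0]
flags = flag_new_issues(sample)
```

The threshold is a parameter so that the sensitivity of the issuance estimates to the 15% cutoff can be checked directly.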
5.2
Forecast Data
All forecast data comes from the February 2013 Budget and Economic Outlook prepared by the Congressional Budget Office (CBO). For interest rates, CBO provides quarterly projections of yields on the 3-month bill and the 10-year note for the next decade. Their model forecasts 10-year yields rising to 5.2% by 2017 and remaining steady thereafter. This is roughly consistent with projections prepared by the Fed and private forecasters, though the CBO model has rates rising the highest (Bernanke 2013).
6
Results
6.1
Historical Counterfactual
To summarize, the model baseline starts in January of 1948 with existing principal and coupon obligations as reported by CRSP and the Treasury’s MSPD. The quantity of debt to be auctioned is determined by these initial obligations and the size of primary deficits as reported in NIPA tables. Nominal GDP and primary deficits are held fixed at their historical levels. Auctions are conducted each month across each maturity according to the price and issuance data extracted from CRSP data. Figure 2 plots the model’s estimates of the face value of debt outstanding on the left axis and marketable debt as reported in the Treasury Bulletin on the right. Both are as a percent of GDP and the correlation between the two series is 0.98. The model successfully reproduces the major shifts in the debt/GDP time series: a persistent decline after WWII, an increase following the high interest rates and budget deficits of the 1980s, a decline due to large surpluses starting in the late 1990s, and a sharp spike up after the financial crisis in 2008. But the model does not accurately reproduce debt levels. The model series falls more quickly and reaches its minimum of 10.7% of GDP in 1975, a full 8% of GDP below the historical series. The faster rate of decline in the model is likely caused by the omission of nonmarketable debt. Because the Treasury relied heavily on savings bonds to finance WWII and the Korean War, interest on nonmarketable debt comprised a large fraction of net interest payments during the 1950s and 1960s. As the Treasury shifted away from nonmarketable debt, these obligations were rolled over into marketable issues. The model does not account for shifts in nonmarketable issuance even though nonmarketable interest payments are counted in the NIPA tables. Thus, the primary surplus inputs are artificially large and debt levels decline faster than expected. However, for my exercise, discrepancies in levels are not problematic. Since I run counterfactuals within the same model, matching the shape of the debt/GDP path is much more important.4
Figure 2: Comparing Model Baseline Debt and Historical Debt [line chart, 1950 to 2010; both axes: Percent of GDP; series: Model Baseline, Historical]
6.1.1
Historical Counterfactuals: Only Zero-Coupon Securities
Figures 3 and 4 show the evolution of debt/GDP and net interest/GDP, respectively, under a single, zero-coupon security. Because large interest payments come due up to ten years in the future, I extend the graph forward to include all promised interest payments as of December 2012. GDP projections come from the CBO. I report results for the sub-periods before and after 1980 since interest rates peak around that year. This allows us to draw some conclusions about what kind of maturity structure is favorable in an environment with interest rates generally falling or rising.
4 Since there is little data on nonmarketable interest payments, an exercise that requires matching levels might be best achieved by computing primary surpluses as a residual. Also, interest payments in NIPA data are reported to only one significant digit, so rounding errors may be nontrivial.
Figure 3: Counterfactual Debt with Zero-Coupon Securities [line chart, 1950 to 2010; vertical axis: Par Value of Debt as Percent of GDP; series: Historical Baseline, 1-Year Discount, 5-Year Discount, 10-Year Discount]
Figure 4: Counterfactual Interest Payments with Zero-Coupon Securities [line chart, 1950 to 2020; vertical axis: Net Interest Payments as Percent of GDP; series: Historical Baseline, 1-Year Discount, 5-Year Discount, 10-Year Discount]

Table 2: Counterfactual with Zero-Coupon: Average Net Interest as Percent of GDP
Period       Historical   1-Year   5-Year   10-Year
1948-1980    0.83         0.81     0.70     0.54
1981-2022    1.55         1.05     1.46     1.73
1948-2022    1.23         0.94     1.13     1.20

The one-year zero coupon is an actual security that the Treasury has offered, while the 5- and 10-year securities are hypothetical. Prior to 1980, the one-year-only series and the historical series remain close together, suggesting that shortening issuance maturity would not have affected debt financing by much. Net interest payments are only slightly lower than the historical baseline in this period. The series begin to diverge during the Volcker disinflation of the 1980s, with shorter-term issuance appearing more favorable. The 5- and 10-year securities are sold at deep discounts, so they consistently result in higher debt outstanding. Interest payments are significantly more volatile since periods of primary surplus allow the Treasury to build up cash balances only to have them quickly depleted when large principal payments come due. In the pre-1980 period, average interest payments are lower than the historical baseline. This strategy becomes very costly when the long-term debt auctioned off during the high interest rates of the 1980s comes due through the 1990s. In the post-1980 period, average payments for the ten-year strategy are 0.18% of GDP higher than the baseline, though the five-year strategy beats the baseline by 0.09% of GDP.
6.1.2
Historical Counterfactuals: Mixed Coupon and Zero-Coupon Securities
Figures 5 and 6 show the path of interest payments and debt when coupon securities are added to the model. These are the actual securities that the Treasury issues on a regular basis. I include securities ranging from the 3-month bill to the 30-year bond to capture the full range of auction strategies. Coupon payments are significantly less volatile than payments in the zero-coupon model since they are distributed evenly over time. The absence of large balloon payments means it is sufficient to run the model to 2012.
Figure 5: Counterfactual Debt with Mixed Securities [line chart, 1950 to 2010; vertical axis: Par Value of Debt as Percent of GDP; series: Historical Baseline, 3-Month Bill, 1-Year Bill, 10-Year Note, 30-Year Bond]
Figure 6: Counterfactual Interest Payments with Mixed Securities [line chart, 1950 to 2010; vertical axis: Net Interest Payments as Percent of GDP; series: Historical Baseline, 3-Month Bill, 1-Year Bill, 10-Year Note, 30-Year Bond]
Table 3: Counterfactual with Mixed: Average Net Interest as Percent of GDP
Period       Historical   3-Month   1-Year   10-Year   30-Year
1948-1980    0.83         0.80      0.81     0.82      0.71
1981-2012    1.86         1.12      1.37     2.38      2.69
1948-2012    1.33         0.96      1.09     1.59      1.68

Over the entire period, a strategy of auctioning only a 3-month bill results in both the lowest amount of debt outstanding and the lowest average interest payments. For the year of 1974, the total debt falls to zero and the Treasury holds excess cash until a new round of primary deficits requires new auctions. In the pre-1980 period, securities with maturities between 3 months and 10 years produce virtually the same path for net interest, though shorter maturities are more volatile. This is perhaps because the yield curve was flatter on average. The fact that the maturity structure of existing obligations in 1948 remains unchanged also means that changes in new issuance will take several years to have measurable effects. The 30-year bond does manage to beat the historical series, resulting in payments that are 0.12% of GDP lower on average. This suggests that the Treasury could have reduced borrowing costs moderately by relying on more long-term issuance. As in the zero-coupon model, the series begin to diverge around 1980 with shorter securities beating the historical series by a significant amount. Interest payments under the 3-month and 1-year strategies drop to virtually zero by 2009. For the post-1980 period, 3-month-only issuance would have resulted in interest payments that were, on average, 0.37% of GDP lower than the historical baseline, translating into a total of $2.3 trillion in savings over 31 years. The excess cost of exclusively 30-year bond issuance is an average of 0.35% of GDP, or a total of $3.5 trillion. These results strongly suggest that favoring short-term issuance during a period of falling interest rates results in large savings over time.
6.2
Forecasts
Current projections suggest that high debt levels and rising interest rates will push net interest/GDP back to the peak levels of the 1980s. This model provides a rough measure of how changes in maturity issuance can be expected to affect the path of net interest payments over the next decade. The model baseline uses economic and budget forecasts from the CBO and assumes that the maturity structure of new issues is fixed at its 2012 average. Figure 7 compares the model’s baseline estimate for debt with the CBO model. Levels differ due to the omission of nonmarketable debt and TIPS, but the shape remains the same. Debt/GDP is projected to decline over the next 5 years as growing employment leads to more growth, less spending on automatic stabilizers, and increased tax revenue. Around 2018, debt/GDP begins to rise again as mandatory healthcare spending and higher interest payments push expenditures higher.
Figure 7: Comparing Model Forecast Debt and CBO Forecast Debt [line chart, 2013 to 2023; axes: Marketable Nominal Debt as Percent of GDP (Model) and Debt as Percent of GDP (CBO); series: Model Baseline, CBO Forecast]
6.2.1
Scenario 1: Revert to Current Issuance in 2015
Since the Treasury perceives there are advantages to the current structure of maturity issuance, I first explore scenarios in which there are immediate changes in issuance, and then a reversion to current policy at some later date. Specifically, I model issuing only 5-, 10- and 30-year securities until 2015, and then returning to the current structure, a broad combination of securities with a weighted average maturity of slightly over two years. Figures 8 and 9 show the paths of debt and net interest, respectively. In all cases, debt peaks around mid-2014, declines until 2018, and then increases slowly into the future. For the 5- and 10-year strategies, there is a steep drop in debt outstanding as a large quantity of those securities come due and get refinanced into shorter maturities. The path of net interest is much smoother. Favoring long-term issuance in the present leads to higher interest payments now, but lower payments in the future since current rates are locked in. Conversely, favoring short-term issuance leads to steady or declining payments for several years, and then a sharp rise once interest rates return to historically normal levels.
Figure 8: Forecasts of Debt: Revert to Current Issuance in 2015 [line chart, 2012 to 2024; vertical axis: Par Value of Debt as Percent of GDP; series: Forecast Baseline, 5-Year, 10-Year, 30-Year]
Figure 9: Forecasts of Interest Payments: Revert to Current Issuance in 2015 [line chart, 2012 to 2024; vertical axis: Net Interest Payments as Percent of GDP; series: Forecast Baseline, 5-Year, 10-Year, 30-Year]
6.2.2
Scenario 2: Minimizing Interest Payments and Diversification
An alternate exercise is to depart completely from the current structure and attempt to push total interest payments as low as possible. I find this optimal structure using a program that cycles through a variety of different combinations and picks the one which yields the minimum net interest payments over the 10-year budget window. The program finds that the optimal strategy is to issue 10-years until January 2014, 7-years until July 2014, 5-years until October 2016, 2-years until March 2017, and then 3-month bills thereafter. Over the period from 2013 to 2023, this leads to a total savings of $769 billion relative to the forecast baseline. Clearly, this strategy is not realistic for actual policy. Because it requires that deficits at a given time be financed through a single type of security, it offers no diversification. Nevertheless, this estimate is useful since it provides an upper bound on the savings that might be achieved through changes in the maturity structure.
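The kind of search described here, cycling through combinations and keeping the cheapest, can be illustrated with a toy exhaustive search. Everything in this sketch is hypothetical and far simpler than the model in the paper: one unit of debt is rolled over for three periods, the only maturities are 1 and 3 periods, the yields in `TOY_YIELDS` are invented, and interest is simply the locked-in yield paid each period until the security matures.

```python
from itertools import product

# Hypothetical yields by period and maturity (in periods). Rates rise
# sharply, so locking in a long maturity early is attractive.
TOY_YIELDS = {
    0: {1: 0.001, 3: 0.020},
    1: {1: 0.010, 3: 0.030},
    2: {1: 0.060, 3: 0.065},
}


def toy_interest(schedule):
    """Total interest on one unit of debt under a maturity schedule.

    schedule[t] is the maturity chosen if debt is refinanced at period t.
    Entries for periods in which no refinancing occurs are ignored.
    """
    total, t = 0.0, 0
    horizon = len(schedule)
    while t < horizon:
        maturity = schedule[t]
        held = min(maturity, horizon - t)  # truncate at the budget window
        total += TOY_YIELDS[t][maturity] * held
        t += held
    return total


def best_schedule(maturities, n_periods, cost):
    """Exhaustively search one-maturity-per-period schedules."""
    return min(product(maturities, repeat=n_periods), key=cost)


best = best_schedule((1, 3), 3, toy_interest)
best_cost = toy_interest(best)
```

In this toy setting the search picks the long maturity in the first period, mirroring the paper's result that the cost-minimizing path starts with long-term issuance while rates are low. Exhaustive search grows exponentially in the number of periods, so a realistic implementation would need to restrict the candidate set, for example to a handful of switch dates.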
Figure 10: Forecasts of Debt: Minimizing Interest Payments [line chart, 2012 to 2024; vertical axis: Par Value of Debt as Percent of GDP; series: Forecast Baseline, Optimal Strategy, Diversified Strategy]
A more practical policy would be to allow large shifts in issuance for a particular maturity while still preserving some minimum level of issuance for all other maturities. In 2012, the security accounting for the smallest proportion of all new issues was the 30-year bond at 2.3%. I round up to 3% and use this level as a lower bound for issuance. Thus, under my diversified strategy, each security makes up 3% of total issuance with the exception of the primary security, which makes up the remaining 73%. The maturity of the primary security is determined by the optimal strategy outlined above. That is, the diversified strategy issues 73% 10-years and 3% of each other security until January 2014, then 73% 7-years and 3% of each other security until July 2014, etc. Figures 10 and 11 compare the evolution of debt and interest under the optimal strategy, the diversified strategy, and the forecast baseline. The diversified strategy produces less savings than the optimal strategy, but still beats the baseline interest payments by an average of 0.15% of GDP, or $424 billion. Even when we set a lower bound for the percentage of each maturity at a level above current Treasury policy, it is still possible to realize large savings.

Table 4: Forecasts with Reversion: Average Interest Payments as Percent of GDP
Period       Baseline   5-Year   10-Year   30-Year
2013-2023    2.01       2.00     1.92      2.22

Figure 11: Forecasts of Interest Payments: Minimizing Interest Payments [line chart, 2012 to 2024; vertical axis: Net Interest Payments as Percent of GDP; series: Forecast Baseline, Optimal Strategy, Diversified Strategy]

Table 5: Minimum Interest Forecasts: Average Interest Payments as Percent of GDP
Period       Baseline   Best Case   Diversification
2013-2023    2.01       1.74        1.86

6.2.3
Scenario 3: Higher Interest Rates
The challenge of debt management is to choose a maturity structure that minimizes cost relative to the current forecast while still allowing flexibility to shift policy when the forecast changes. Predicting the path of interest rates is notoriously difficult, so it is important that the chosen maturity structure remains cost-effective under different scenarios for future rates. From this risk management perspective, a longer average maturity would be optimal since it provides valuable insurance against unanticipated spikes in rates. In this higher rate simulation, I assume that all interest rates stabilize at a level 50% higher than the CBO forecasts. This means that the rate on the 10-year note reaches 7.8% in 2017 instead of 5.2%. I run the current issuance reversion strategy of Scenario 1 as well as the minimum interest and diversification strategies of Scenario 2. The percentages of each issue are exactly the same as described above, so the “optimal” label refers to the strategy that was optimal under the previous forecast assumptions. Figures 12 and 13 report the results for debt and interest, respectively.
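The rate scenarios amount to rescaling the forecast yield path by a constant factor. A minimal sketch follows, assuming the whole path is scaled proportionally (the text only states the level at which rates stabilize). The function name is hypothetical and the yield path is made up for illustration, except for the 5.2% terminal level, which comes from the CBO forecast cited above.

```python
def scale_forecast(yields_pct, factor):
    """Scale every point of a forecast yield path (in percent) by `factor`."""
    return [y * factor for y in yields_pct]


# Hypothetical 10-year note path in percent; only the 5.2 terminal level
# is taken from the text, the earlier points are invented.
baseline_10y = [2.1, 3.0, 4.1, 5.2]

higher = scale_forecast(baseline_10y, 1.5)  # rates 50% above the forecast
lower = scale_forecast(baseline_10y, 0.5)   # rates 50% below the forecast
print(round(higher[-1], 2), round(lower[-1], 2))
```

The terminal levels come out at about 7.8% and 2.6%, matching the figures quoted for the higher-rate scenario here and the lower-rate scenario later in the paper.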
Figure 12: Forecasts of Debt: Higher Interest Rates [line chart, 2012 to 2024; vertical axis: Par Value of Debt as Percent of GDP; series: Forecast Baseline, 5-Year, 10-Year, 30-Year, Optimal Strategy, Diversified Strategy]
Figure 13: Forecasts of Interest Payments: Higher Interest Rates [line chart, 2012 to 2024; vertical axis: Net Interest Payments as Percent of GDP; series: Forecast Baseline, 5-Year, 10-Year, 30-Year, Optimal Strategy, Diversified Strategy]

Table 6: Higher Interest Rates: Average Interest Payments as Percent of GDP
Period       Baseline   5-Year   10-Year   30-Year   Min. Interest   Diversification
2013-2023    3.24       3.10     2.84      3.14      2.66            2.84

As expected, the simulations indicate that the savings from favoring long-term issuance are magnified in the higher interest rate environment. All strategies beat the forecast baseline. Of the three strategies that revert back to current issuance in 2015, the 10-year is the least costly, just as in the normal rate environment. The gains from the minimum interest and diversification strategies of Scenario 2 are especially large. The more realistic diversification strategy beats the baseline interest payments by an average of 0.4% of GDP, or just over $1 trillion. One qualification is that the macroeconomic factors that would lead to rates stabilizing at these levels would likely affect a variety of other inputs to the model. I do not attempt to estimate these general equilibrium effects but instead hold nominal GDP and primary surpluses constant.
6.2.4
Scenario 4: Low Interest Rates
Prudent risk management also requires considering a scenario in which rates rise less than forecast. In this lower rate simulation, I assume that all interest rates stabilize at a level 50% lower than the CBO forecasts. Here, the 10-year note reaches a maximum of only 2.6% in 2017. In an economy with lower rates, locking in current rates is less advantageous and so long-term issuance is more costly. Figures 14 and 15 show the outcome of the simulation. Here, the forecast baseline performs moderately well. The 5-year strategy results in savings, while the 10- and 30-year strategies do not. Notably, even in this extreme low-rate scenario the diversification strategy still results in savings, though they are trivial.

Table 7: Lower Interest Rates: Average Interest Payments as Percent of GDP
Period       Baseline   5-Year   10-Year   30-Year   Min. Interest   Diversification
2013-2023    1.41       1.34     1.41      1.68      1.28            1.39

Figure 14: Forecasts of Debt: Lower Interest Rates [line chart, 2012 to 2024; vertical axis: Par Value of Debt as Percent of GDP; series: Forecast Baseline, 5-Year, 10-Year, 30-Year, Optimal Strategy, Diversified Strategy]
Figure 15: Forecasts of Interest Payments: Lower Interest Rates [line chart, 2012 to 2024; vertical axis: Net Interest Payments as Percent of GDP; series: Forecast Baseline, 5-Year, 10-Year, 30-Year, Optimal Strategy, Diversified Strategy]
7
Conclusion
My results show that the maturity structure of debt can be a powerful tool for managing the U.S. government’s long-term fiscal outlook. In my preferred strategy with diversification, lengthening the maturity structure could save the Treasury $424 billion in interest payments over the next ten years. More importantly, this strategy would also provide valuable insurance against rolling over large quantities of debt at high interest rates. If interest rates rise more quickly than expected, savings relative to the current policy baseline would be much larger. The changes needed to realize these savings do not require an act of Congress and could be implemented unilaterally by the Treasury Department. The path of future interest payments is a policy decision, and there is compelling evidence that it should not be treated as fixed. The Treasury’s current policy is heavily influenced by the Treasury Borrowing Advisory Committee, which meets quarterly to issue new guidance. The committee is composed of senior executives from the nation’s largest investment funds and banks. Given that many of these firms are the Treasury’s largest customers, mismatched incentives of committee members may explain some of the reluctance to change policy.
Some caveats are necessary. First, this model does not estimate how the prices of different securities would adjust to major changes in issuance. If price elasticities are high, savings would be lower. Estimating these price elasticities and incorporating them into this model would be a valuable extension. Second, this model does not account for the many services Treasury securities provide beyond financing government deficits. For example, Greenwood et al. (2012) note that favoring short-term debt has benefits since these securities offer many money-like properties. Lengthening the maturity structure could disrupt financial markets; however, the Fed has a variety of tools to minimize such risks. The Fed’s new authority to pay interest on excess reserves is one example. Many of the other financial services provided by short-term debt could be easily replaced by new products from the private sector.
Finally, I address the question of risk management only through simple changes to the levels of interest rate forecasts. The model omits interaction effects between interest rates and other inputs, as well as how the timing of rate changes might affect debt outcomes. Incorporating these two effects would give a much clearer picture of the risk involved in pursuing particular strategies.
Three Dimensions of Time: An Age-Period-Cohort Analysis of U.S. Spending Patterns
Erica Segall, Yale University1
Abstract. To test for the existence of generational or “cohort” effects in U.S. spending behavior, I incorporate Age-Period-Cohort (APC) modeling, which disentangles three distinct time-related effects on changes in behavior, into Deaton and Muellbauer’s (1980) Almost Ideal Demand System (AIDS). The result is an empirical model that accounts for these nuanced time effects but is also consistent with the principles of consumer theory. Estimation of the model using data from the Consumer Expenditure Survey suggests that household budget allocations exhibit significant cohort effects, and that including cohort effects significantly improves demand models that account only for age. The existence of these effects suggests that cohort membership can influence an individual’s preferences and spending patterns across the life course.
Keywords: age-period-cohort, spending behavior, Consumer Expenditure Survey
1 Erica Segall graduated from Yale University in 2013. This senior essay won the Ronald Meltzer/Cornelia Awdziewicz Economic Award for the runner-up senior essay in economics. The author would like to thank the entire Yale Economics Department and in particular her adviser, Professor Doug McKee, without whose guidance, wisdom, and encouragement this paper would not have been possible.
1
Introduction
It is no stretch to suggest that the elderly tend to consume a different mix of products than younger generations. From casual observation, this would appear to hold true both within product categories and for the weight each of those categories is given in the overall consumption basket. Yet there are multiple potential explanations for why purchasing patterns might change over time. A critical question, therefore, is how much of the temporal change is actually due to the aging process, and how much is due to something else—for instance, the particular moment in history being analyzed, or the particular time and circumstances in which an individual aged. Disentangling these separate effects can help us answer important questions related to aging, consumption, and preference formation. By analyzing long-term consumption patterns along these dimensions, we can identify trends that explain broader purchasing patterns and illuminate our understanding of well-known American generations. To test for the existence of generational or “cohort” effects in U.S. spending behavior, I incorporate Age-Period-Cohort (APC) modeling, which disentangles three distinct time-related effects on changes in behavior, into Deaton and Muellbauer’s Almost Ideal Demand System (AIDS). The result is an empirical model that accounts for these nuanced time effects but is also consistent with the principles of consumer theory. Estimation of the model using data from the Consumer Expenditure Survey suggests that household budget allocations exhibit significant cohort effects, and that including cohort effects significantly improves on demand models that account only for age. The existence of these effects suggests that cohort membership can influence spending patterns across the life course. 
The analysis proceeds as follows: In the following section, I draw insights from literature in sociology, history, and economics to demonstrate the importance of accounting for cohort or generational effects while analyzing time- or age-related changes in behavior or consumption patterns, and I survey prior work that has addressed these issues. In Section 3, I propose explanations for why cohort membership could have an effect on macro spending behavior above and beyond changes related to aging or the particular period being examined. In Section 4, I offer an economic model of utility-maximizing consumers, which forms the groundwork for the construction of a demand framework that is consistent with microeconomic theory. In Section 5, I describe the particular sample of Consumer Expenditure Survey data that I use in this paper’s analysis. In Section 6, I combine two existing models from the literature (APC models and AIDS) into a single empirical model that I use to estimate the effect of cohort membership on consumer behavior. Additionally, I confront the fundamental identifiability problem inherent in building APC models, and I propose a solution to that problem in my model. In Section 7, I discuss some limitations of this analysis, and I propose several directions for future research in this field. In Section 8, I conclude by reiterating my main findings and emphasizing the importance of cohort effects in demand analysis.
2
Background
Sociologists and historians have repeatedly tackled the concept of birth cohort and what distinguishes one cohort from another. Cohorts, or generations, are shaped and defined by their particular age-related involvement in prominent historical events during their lifetime (Strauss and Howe 1992). The same historical event can have extremely different effects on two generations that experience that event at different ages and in different life stages (Strauss and Howe 1992). For example, it is easy to imagine that the Great Depression would influence toddlers, the middle-aged, and the elderly quite differently. Similarly, it is easy to fall into what Strauss and Howe (1992) label the “life-course fallacy,” in which we attempt to explain a life-cycle by looking at the different age groups that are alive at a particular moment in time. The fact that today’s elderly demonstrate certain preferences doesn’t mean that tomorrow’s elderly will express those same preferences. Rather, groups follow a diagonal trajectory, experiencing each age at one unique time. As a result, no two groups experience an event or period in the same way, nor do two groups experience being a particular age in the same way. Tracking individuals by age location, rather than age or time period alone, allows us to see how historical events and trends
shape different age groups differently (Strauss and Howe 1992). Such analysis becomes particularly meaningful when we define groups not just as uniformly sized cohorts, but as “historical generations” whose boundaries are demarcated by prominent historical events (Carlson 2008). Moreover, we can hypothesize that not only do historical events differentially affect different age groups, but that those effects may continue to influence those groups throughout the life course. This sociological premise can be tested and examined in the realm of economics. Treating consumption patterns as one dimension along which this generational logic can operate allows us to test whether different generations really do act (and spend) differently, and whether those differences coincide with our prior understandings of American history and culture. Moreover, disentangling how age, period, and cohort differentially affect consumption patterns will help us understand the consequences of an ever-changing population age composition on aggregate demand (Howden and Meyer 2011). Age-period-cohort (APC) analysis formally captures these sociological concepts in estimable models. Its primary aim is to disentangle age, period, and cohort effects to determine which is dominant in driving behavioral variations over time (Chen et al. 2001). Age effects are variations resulting from the biological and social processes of aging specific to individuals, such as physiological changes and the buildup of social experience (Reither, Hauser, and Yang 2009). Period effects are defined as external variations across time periods that simultaneously influence all age groups, and encompass a wide range of historical, social, and environmental factors, such as wars, technological innovation, and economic crises (Reither et al. 2009), as well as changes in income and relative prices.
Cohort effects, the focus of this paper’s analysis, capture a number of concepts and interpretations. At the most fundamental level, these effects convey the idea that different age groups were born at different times, that they experience a unique collection of environmental forces as they age, and that they could therefore develop distinctive patterns of behavior above and beyond changes in their age or in prices and income in a given period (Chen et al. 2001). In other words, a birth
cohort experiences the same historical, social, and environmental events at the same age, potentially giving rise to unique, cohort-specific values, attitudes, and preferences. Of course, there is also potential interaction among age, period, and cohort effects. For instance, one’s cohort membership might affect the process of aging. The focus of this paper, however, will be to isolate distinct effects, and, as such, interactions will not be estimated. Cohort analysis has been more widely applied in the field of sociology than in economics (Chen et al. 2001), though a number of studies have used it to analyze changing consumption patterns for a variety of goods. Chen et al. (2001) employed cohort analysis using a constrained multiple regression model to examine the U.S. life insurance purchase pattern between 1940 and 1996, and found that baby boomers’ tendency to purchase less life insurance than their earlier counterparts (a cohort effect particular to the baby boom generation) was driving the recent decline in insurance purchases. The cohort analysis method has also been used to examine not just single-product consumption, but variations in product mixes. For instance, Kerr et al. (2003) analyzed age, period, and cohort effects not only on total alcohol consumption, but on beverage-specific trends in the composition of individuals’ alcohol intake. Their findings suggest that cohort effects do affect drinking patterns, not just in quantity, but in compositional makeup, and that the aging and mortality of high-consumption cohorts influence economy-wide, beverage-specific demand, as well as trends in medical conditions typically associated with particular types of alcohol consumption. It is possible that this finding, which points to the existence of cohort effects on the compositional makeup of certain forms of consumption, can be generalized to the compositional makeup of all nondurable expenditures. This paper will examine broad U.S.
consumption patterns and see whether underlying cohort effects influence not just specific product preferences, but consumer behavior on a macro scale. Indeed, understanding what drives the changing tastes of individuals as they age not only provides a richer knowledge of different groups of consumers, but it also provides us with a more thorough comprehension of aggregate demand and how it will change with shifting demographics. Age, period, and
cohort effects on consumption patterns could have economy-wide implications for understanding aggregate spending behavior and key drivers of demand. In sum, by looking at how individuals across age groups and across time choose to allocate their money among these different product groups, we can come away with a better understanding of consumer preferences and how product-specific demand will change over time as different cohorts reach different life milestones and differentially experience history.
3
Hypothesis
Given the evidence in prior literature that age, period, and cohort effects can and do separately influence demand for specific products, it seems reasonable to think that age, period, and/or cohort could each affect the broad compositional makeup of household budget allocations over time. Determining the relative size of these different effects will be the aim of this paper’s empirical analysis. It seems both possible and likely that the magnitude of each effect would vary with the time period and cohort being analyzed. Flexible interpretations of each time-related variable convey the different mechanisms by which these effects might occur. Physiological changes coinciding with the aging process would certainly seem to influence product demand and, by extension, budget allocation. Age effects need not be physiological; age-related household roles or stages in life could similarly affect marketplace behavior. Research suggests that bachelors tend to allocate their money to used furniture and automobiles, restaurant meals, entertainment, and recreation; newlyweds prepare more meals at home and purchase new furniture and autos; and the birth of a first child generally leads to more home-oriented leisure activities (Shaninger and Danko 1993). As households age, we would expect their consumption decisions to align with these physiological and socio-cultural changes. One way to interpret period effects at a given moment in history is as the combination of price and income effects. In the simplest microeconomic terms, we might imagine that, if people today are richer than they were 100 years ago, they
might demand different goods and different quantities of those goods; in particular, they might come to cultivate tastes for higher-quality or luxury goods. Price effects similarly align with intuition: if apples today are cheaper than they were 100 years ago, we can imagine that people today would demand more apples. As average incomes have risen over the past century (Chao and Utgoff 2006), we can anticipate that households’ optimal allocation of expenditures will shift toward more income-elastic, “luxury” goods—for instance, clothing and personal care, and food away from home. Similarly, we expect that own-price elasticities will be negative, and that elasticities for necessity items will have lower magnitudes. We can also think of period effects as trends or fads, in that the proliferation and popularity of a certain product or behavior is the result of a particular time, and it affects all individuals living in that time. Of course, the idea of fads can easily become confounded with age and cohort. For instance, it is unlikely that 80-year-olds and 20-year-olds would be equally affected by a bellbottoms trend; rather, fads might differentially affect different age-cohort groups. Like Rentz, Reynolds, and Stout (1983), we can look for any fad-related effects that persist over time and label those as cohort effects. We can also generalize the idea of a “fad” beyond specific products to see whether fads emerge in broader spending patterns. We can interpret cohort effects—and justify their existence—in a number of other ways as well. First and foremost, we can think of them as the direct consequence of an individual being a certain age at a certain point in history. Any number of stories can be constructed along these lines. We might hypothesize, for example, that individuals who passed through young adulthood during Prohibition might have different lifetime consumption patterns of alcohol than other cohorts (Levenson, Aldwin, and Spiro 1998). 
Similarly, children or young adults of the Great Depression might demonstrate distinct lifetime purchasing patterns of luxury goods, even in expansionary periods. It is unclear, however, whether such effects would be positive or negative for any given cohort and product. The “learning and habit formation” interpretation of cohort effects could be relevant in the food away from home (FAFH)
category, in which the U.S. has seen a dramatic proliferation of convenience foods, fast-food establishments, and food marketing. Given the vast evidence suggesting that children are highly susceptible to external influences on eating behavior, the changing face of the FAFH market could have a life-long effect on the cohorts of children it influenced. Similarly, we might look for trends in food consumption both at and away from the home as cultural understandings of gender and family roles evolved over time. In sum, strong cases can be made for why age, period, and/or cohort could each affect the broad compositional makeup of household budget allocations over time.
4
Economic Model
In my model, I assume that individuals solve a standard utility maximization problem over a range of product categories $x_1, \dots, x_k$ and all other forms of consumption $x_z$, given the individual’s age $A$, birth cohort $C$, and the time period of the decision $P$. At any given time, individuals choose a bundle of goods so as to maximize their utility subject to their overall budget constraint. Individual utility functions take into account the changes that occur in the marginal utility of each good when age, period, and cohort vary. Individuals therefore face the following problem:

$$\max_{\{x_1, \dots, x_k\},\, x_z} \; U(x_1, \dots, x_k, x_z \mid A, P, C) \tag{1}$$

$$\text{s.t.} \quad y = p_1 x_1 + \dots + p_k x_k + p_z x_z \tag{2}$$
Solving the utility maximization problem yields the individual’s optimal demand functions $x_i^*$ as a function of all the available goods’ prices $p_i$, as well as the individual’s age, birth cohort, and the time period:

$$x_i^* = x_i^*(p_1, \dots, p_z, A, P, C) \tag{3}$$
All consumption good prices must enter the demand function to account for the possibility of substitution and complementarity between goods. We can imagine past prices indirectly affecting the preferences of particular cohorts: for instance, if the price for good X was low during an individual’s childhood, that person was more likely to cultivate a taste for X, and that effect stays with that individual through the life-course. With these optimal demand functions, total nondurable expenditures $y_n$ can be calculated in absolute terms as:

$$y_n = \sum_{i=1}^{k} p_i x_i^* \tag{4}$$
Since the set $\{x_1, \dots, x_k\}$ is taken to represent different product categories (e.g., food consumed at home, clothing and personal care, etc.), and each $x_i^*$ to represent optimal consumption of that category, optimal budget composition can be calculated by product group. As such,

$$\omega_i^* = \frac{p_i x_i^*}{\sum_{j=1}^{k} p_j x_j^*} \tag{5}$$

represents the utility-maximizing share $\omega_i$ of an individual’s budget allocated to product $i$. While the economic model is fairly straightforward, the more interesting contribution of this paper lies in the empirical analysis, whereby the individual effects are disentangled and estimated. The empirical method is a direct extension of Deaton and Muellbauer’s (1980) Almost Ideal Demand System (AIDS), which is based on a classical consumption model like the one described in this section, though without accounting for the effects of age, period, and cohort. The AIDS framework allows us to take standard, individual utility functions and derive demand functions for utility-maximizing individuals. I outline and construct the empirical method more thoroughly in Section 6.
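As a purely illustrative special case (an assumption made here for exposition, not a functional form used in this paper), suppose utility is Cobb-Douglas with exponents that depend on age, period, and cohort. The sketch below traces how the maximization problem, the demand functions, and the budget shares of this section would then play out. The exponents $\alpha_i(A,P,C)$ and $\alpha_z(A,P,C)$ are hypothetical symbols introduced only for this example.

```latex
\begin{align*}
U &= x_z^{\alpha_z} \prod_{i=1}^{k} x_i^{\alpha_i},
  \qquad \alpha_i = \alpha_i(A,P,C) > 0, \quad \alpha_z + \sum_{i=1}^{k} \alpha_i = 1 \\
p_i x_i^* &= \alpha_i \, y
  \qquad \text{(first-order conditions combined with the budget constraint)} \\
y_n &= \sum_{j=1}^{k} p_j x_j^* = y \sum_{j=1}^{k} \alpha_j
  \qquad \text{(total nondurable expenditure)} \\
\omega_i^* &= \frac{p_i x_i^*}{y_n}
  = \frac{\alpha_i(A,P,C)}{\sum_{j=1}^{k} \alpha_j(A,P,C)}
  \qquad \text{(budget share of category } i\text{)}
\end{align*}
```

In this special case budget shares respond to age, period, and cohort only through the exponents and do not respond to prices or total expenditure at all, which suggests why a more flexible demand system is needed for the empirical work.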
5
Data
Analysis of age, period, and cohort requires the collection of data from multiple time periods due to the fact that each of these factors fundamentally relates to changes over time. Mason and Wolfinger (2001) explain that both longitudinal and repeated cross-sectional data can be used for cohort analysis. Clearly,
single-period (i.e., cross-sectional) datasets must be ruled out, since any difference in the dependent variable with age could be interpreted as either an age or a cohort effect, with no clear basis for choosing (Mason and Wolfinger 2001). The data selected for analysis come from the Consumer Expenditure Survey (CEX), a Bureau of Labor Statistics (BLS) survey that collects information on the expenditure habits of U.S. consumers at the household—or “consumer unit”—level, as well as income data and household characteristics. In particular, this analysis uses data from the Diary survey component of the CEX, in which respondents fill out a detailed diary of expenses for two consecutive one-week periods. The Diary survey is intended to gather data on small, frequently-purchased items like food or clothing that respondents are unlikely to recall over time, as opposed to much larger purchases like property or vehicles. Each week-long diary is divided into seven days, and each day is divided into four parts based on expenditure type: food and drinks away from home; food and drinks for home consumption; clothing, shoes, jewelry, and accessories; and all other products, services, and expenses. Each category is broken down into more detailed product subcategories that are explained to survey respondents. The CEX gathers data at the level of the consumer unit, which it defines as all members of a household who are related by blood, marriage, adoption, or other legal arrangements; a person living alone or sharing a household with others who is financially independent (determined by spending behavior on housing, food, and other living expenses); or two or more persons living together who use their incomes to make joint expenditure decisions. 
For a given consumer unit, the reference person is the first member mentioned by the respondent when asked to “Start with the name of the person or one of the persons who owns or rents the home.” All other consumer unit members’ relationships are determined relative to this person. Throughout this analysis, any mention of individual characteristics (age, birth year, etc.) will refer to this reference person unless otherwise specified. I use a cleaned version of the CEX data compiled by Aguiar and Hurst (2012), who use NBER CEX extracts (compiled and harmonized across years by Harris and Sabelhaus) including
all survey waves from 1980 through 2003. Aguiar and Hurst restrict the sample to include only those households that report expenditures in all four quarters of the survey (which they sum to calculate annual expenditures), record non-zero expenditures on six key sub-components of the consumption basket (food, entertainment, transportation, clothing and personal care, utilities, and housing/rent), and have a head between the ages of 25 and 75 (inclusive). They take additional measures to account and correct for the possibility of zero expenditures on smaller consumption categories, including food away from home, alcohol and tobacco, etc. The authors also limit their analysis of expenditures and budget shares to nondurables, excluding health and education expenditures; they define a measure of nondurables consisting of expenditures on food (at and away from home), alcohol, tobacco, clothing and personal care, utilities, domestic services, nondurable transportation, airfare, nondurable entertainment, net gambling receipts, business services, and charitable giving. Cumulatively, these categories comprise roughly 75% of household annual expenditures. This nondurables measure will be employed in this paper’s analysis. Unless otherwise specified, expenditures are nondurable, and budget shares are expressed as a fraction of nondurable expenditure. The Aguiar and Hurst data consolidate expenditures into broader product categories, four of which will be the focus of this analysis: food consumed at home (FAH), food consumed away from home (FAFH), clothing and personal care (CPC), and alcohol and tobacco (AT). The fifth category of this analysis includes all other nondurable (“Other ND”) expenditures not included in the first four categories. That is:

$$\text{Other ND} = \text{Total ND} - \text{FAH} - \text{FAFH} - \text{CPC} - \text{AT} \tag{6}$$
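The residual definition of Other ND and the resulting budget shares can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the dollar figures are made up and are not CEX data, and the dictionary keys and the function name `budget_shares` are hypothetical.

```python
# Hypothetical annual nondurable spending for one household, in dollars.
# These figures are invented for illustration and are not CEX data.
spending = {
    "total_nd": 15000.0,
    "fah": 3400.0,    # food at home
    "fafh": 1400.0,   # food away from home
    "cpc": 1600.0,    # clothing and personal care
    "at": 750.0,      # alcohol and tobacco
}


def budget_shares(spending):
    """Return the five budget shares, with Other ND computed as a residual."""
    named = ("fah", "fafh", "cpc", "at")
    other_nd = spending["total_nd"] - sum(spending[k] for k in named)
    if other_nd < 0:
        raise ValueError("named categories exceed total nondurable spending")
    shares = {k: spending[k] / spending["total_nd"] for k in named}
    shares["other_nd"] = other_nd / spending["total_nd"]
    return shares


shares = budget_shares(spending)
```

Because Other ND is defined as a residual, the five shares sum to one by construction, which is the property the demand system later relies on.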
Aguiar and Hurst calculate composite price indexes for the former four categories using weighted CPI data, which they use to convert nominal expenditures into real ones; I used these calculations along with overall price level data to calculate a price index for all “other” nondurables. In addition to the sample restrictions imposed by Aguiar and Hurst, I have reduced the sample to households of size 1 or 2 and without children, so as to reduce the need for household equivalence scaling (explained in Section 6). The resulting data sample contains 23,987 observations.

Table 1: Descriptive Statistics

Variable                            Observations   Mean        Standard Deviation
Age                                 23,987         52.719      14.281
Household size                      23,987         1.581       0.493
Nondurable expenditure (nominal)    23,987         13114.43    8686.088
Nondurable expenditure (real)       23,987         15777.41    9535.52
FAH budget share                    23,987         0.227       0.103
FAFH budget share                   23,987         0.094       0.073
CPC budget share                    23,987         0.104       0.071
AT budget share                     23,987         0.0503      0.061
Other ND budget share               23,987         0.523       0.121

6
Empirical Method
The main contribution of this research lies in the empirical method, the aim of which is to disentangle, identify, and estimate sources of change in demand that result from time-related preferences distinct from changes in price and income (Poray et al. 2000). These effects can then be incorporated into designing and estimating a demand system that more thoroughly and satisfactorily explains consumer behavior along the dimensions of age, period, and cohort, while also staying true to fundamental tenets of consumer theory (Deaton and Muellbauer 1980). Understanding and modeling the true origins of demand poses significant difficulties, as changes in observed equilibrium prices and quantities of goods over time can be the result of shifts in both supply and demand (Shafrin 2009). I aim to build a model that adequately reflects changes in observed demand over time and identifies the sources of that change. Such a model can be used to improve standard demand analysis and derive greater insights into individual and aggregate demand. To construct this model, I combine specifications from two existing, but largely separate, models from the literature: the Almost Ideal Demand System (AIDS) proposed by Deaton and Muellbauer (1980), and APC modeling. The following subsections describe each of the models in further detail. I combine
the models to incorporate APC variables into a single demand system consistent with consumer preferences.
6.1
Almost Ideal Demand System (AIDS)
Deaton and Muellbauer’s AIDS model is derived from Muellbauer’s “price independent generalized linearity,” or PIGL/PIGLOG model, which allows aggregate demands to be represented as if they were the outcome of decisions by a single, utility-maximizing consumer (Muellbauer 1975). Deaton and Muellbauer begin with a classical theoretical model of consumer behavior very much like the economic model presented in Section 4 of this paper, but not including age, period, or cohort. From this model they derive AIDS demand functions for utility-maximizing consumers in terms of budget shares. Within this framework, household demand is approximated as (Deaton and Muellbauer 1980):

$$w_{ih} = a_i + \sum_j \gamma_{ij} \log p_j + \beta_i \log\left(\frac{x_h}{k_h P}\right) \tag{7}$$

where $w_{ih}$ is household $h$’s budget share of good $i$, $x_h$ is total expenditure, $P$ is a price index, and $k_h$ is a sophisticated measure of household size that can account for other household characteristics, such as economies of household size, and which is used to deflate the budget. For the sake of simplicity, I assume $k_h = 1$. Because this analysis is limited to small households, this should be a reasonable assumption. Deaton and Muellbauer’s demand system also accounts for price and expenditure changes, with changes in relative prices working through the $\gamma_{ij}$ terms and changes in real expenditure operating through the $\beta_i$ coefficients.

A number of restrictions are imposed on the model to make it consistent with the theory of demand (Deaton and Muellbauer 1980). The first of these restrictions is additivity (adding-up), whereby all of a household’s budget shares sum to 1 (i.e., $\sum_i \omega_i \equiv 1$); this restriction is imposed so that all the categories, as we’ve defined them, sum to total expenditures. Homogeneity of the system requires that $\sum_j \gamma_{ij} = 0$ for each good $i$. That is, if all prices and total expenditure rise or fall by the same proportion, none of the demand functions should change, and therefore budget shares should not change. This homogeneity restriction is imposed on a per-regression basis. A third restriction, the Slutsky symmetry restriction, ensures that cross-price elasticities are symmetric between goods, such that the change in demand for good $i$ in response to a change in price for good $j$ is equal to the change in demand for $j$ in response to a price change for $i$ (i.e., $\gamma_{ij} = \gamma_{ji}$) (Deaton and Muellbauer 1980).

Additionally, AIDS models must account for household size and composition in order to approximate behavior accurately and consistently, as emphasized by Deaton (1997). He argues that household consumption must be adjusted to account for economies of scale and variations in individual needs. To remove the need for equivalence scaling, I modify the data to include only those households with 1 or 2 members and no children, and I control for household size while estimating the AIDS model. While this limitation on the sample might prevent our results from generalizing to larger households with children, they could still be generalized to the elderly.
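The three restrictions can be checked mechanically on any candidate set of coefficients. The sketch below is a minimal illustration, assuming a hypothetical three-good system whose coefficients were chosen by hand to satisfy the restrictions; they are not estimates from this paper. The adding-up check uses the parameter form given by Deaton and Muellbauer (1980): intercepts sum to one, and the $\beta_i$ and each column of $\gamma_{ij}$ sum to zero across goods. The names `check_restrictions` and `aids_shares` are assumptions made for this example.

```python
import math

# Hypothetical coefficients for a three-good system. They are illustrative
# values chosen to satisfy the restrictions, not estimates from this paper.
a = [0.5, 0.3, 0.2]            # intercepts a_i
beta = [-0.10, 0.04, 0.06]     # real-expenditure coefficients beta_i
gamma = [                      # price coefficients gamma_ij
    [0.08, -0.05, -0.03],
    [-0.05, 0.07, -0.02],
    [-0.03, -0.02, 0.05],
]
TOL = 1e-12


def check_restrictions(a, beta, gamma):
    """Report whether the coefficients satisfy each theoretical restriction."""
    n = len(a)
    column_sums = [sum(gamma[i][j] for i in range(n)) for j in range(n)]
    adding_up = (
        abs(sum(a) - 1.0) < TOL
        and abs(sum(beta)) < TOL
        and all(abs(s) < TOL for s in column_sums)
    )
    homogeneity = all(abs(sum(row)) < TOL for row in gamma)
    symmetry = all(
        abs(gamma[i][j] - gamma[j][i]) < TOL for i in range(n) for j in range(n)
    )
    return {"adding_up": adding_up, "homogeneity": homogeneity, "symmetry": symmetry}


def aids_shares(prices, real_expenditure):
    """Budget shares with k_h = 1, taking real expenditure x_h / P as given."""
    log_p = [math.log(p) for p in prices]
    return [
        a[i]
        + sum(g * lp for g, lp in zip(gamma[i], log_p))
        + beta[i] * math.log(real_expenditure)
        for i in range(len(a))
    ]


checks = check_restrictions(a, beta, gamma)
shares = aids_shares([1.2, 0.9, 1.5], real_expenditure=2.0)
# Doubling every price while holding real expenditure fixed.
shares_scaled = aids_shares([2.4, 1.8, 3.0], real_expenditure=2.0)
```

With coefficients that pass all three checks, the predicted shares sum to one at any prices, and scaling every price by a common factor while holding real expenditure fixed leaves the shares unchanged.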
6.2
Age-Period-Cohort Models
While original applications of APC analysis tended to rely on innovative graphical presentations of data, recent work in sociology and statistics has explored more formal quantitative modeling of individual effects (Holford 2005). However, any attempt to quantify the three effects must work around a fundamental “identifiability problem,” in that age, period, and cohort are linearly dependent according to the following relationship:

$$\text{Cohort} = \text{Period} - \text{Age} \tag{8}$$

and therefore cannot be uniquely and simultaneously estimated (Holford 2005). It is consequently logically impossible to hold two effects constant and then vary the third in the way we typically understand and interpret regression coefficients (Mason et al. 1973). Indeed, Mason et al. (1973) conclude that if we assume that all age groups, time periods, and birth cohorts have unique effects on the dependent variable, it is impossible to estimate a difference between the effects of any two categories. Finding techniques to combat this fundamental problem is the primary aim of subsequent research in cohort analysis (Mason
and Wolfinger 2001). The simplest and most straightforward solution to the identifiability problem is to drop one of the effects altogether and fit a two-factor model (Holford 2005). In situations for which there is reason to expect minimal or no contribution from one of the effects, or in which a two-factor model fits the data well, this approach seems reasonable (Holford 2005). However, the appropriateness of two-way cohort analysis depends entirely on whether we consider age, period, and birth cohort to have causally distinct effects on the dependent variable (Mason et al. 1973). Unfortunately, the possibility of each factor having a distinct causal relationship to the dependent variable is very strong in most archival data. Mason et al. (1973) explain how age, period, and cohort could all intuitively and independently affect a number of observable trends, including men’s earnings over time and party identification. Similarly, in the budget composition story, there is no reason to expect a priori that any of the three effects would be negligible, and in fact, logical cases can be made for why each factor might have a significant effect. In cases where all three effects are deemed essential to the analysis, the linear dependency must be eliminated by some other restriction on effect coefficients (Mason and Wolfinger 2001). Rather than equate all effects for one of the model factors to zero, a second approach to nonidentifiability is to equate just two of the effects for one of the three factors. However, the validity of this approach relies on making a reasonable assumption for the equality constraint. Typically, there is no basis for equating two effects more solid than reasonable intuition (Holford 2005).
Moreover, the resulting parameter estimates can vary significantly for two equally logical assumptions and still fit the data equally well, and therefore any analysis of the results depends critically on the assumptions that are made (Mason and Wolfinger 2001). Mason and Wolfinger (2001) propose yet another solution to the identifiability problem. The idea behind this alternate method is that, while APC analysis studies time trends (since age, period, and cohort are all time measures), the effect of time is not itself causal, but is instead related to some other factor or factors that will affect the outcome. Therefore, one way to conduct
the analysis and avoid the identifiability problem altogether is to include a more direct measure of the underlying factor for which time is a surrogate measure (Mason and Wolfinger 2001). For instance, if we had reason to believe that a cohort’s size was the only way it meaningfully contributed to the dependent variable, we could substitute cohort size for cohort membership to eliminate collinearity in the regression (Mason et al. 1973). The aim of the analysis presented here is to estimate each of the three factors’ effects on expenditures, according to

$$y_{apc,i} = \beta_0 + \beta_1 a_a + \beta_2 p_p + \beta_3 c_c + \varepsilon \tag{9}$$

where the dependent variable $y_{apc,i}$ is the expenditure in period $p$ on product category $i$ by a household whose respondent is of age $a$ and was born in cohort $c$, and where $\varepsilon$ controls for demographic factors. Given its relative flexibility and the fact that no equality restriction on two age, period, or cohort effects can reasonably be defended a priori, the Mason and Wolfinger underlying-factor method will be used in this analysis. Period effects will be taken to be the sum of price and income effects: the effect of a specific time period on budget allocation is reflected and generated by prevailing price levels and trends in income for that period. Clearly, changes in price and income over time will change the way people spend their money. The AIDS framework controls both for changes in relative prices (via the $\gamma_{ij}$ terms) and for changes in income (through the $\beta_i$ terms), and their cumulative effects are taken to be the predominant influence of a given time period on an individual’s purchasing decisions. For the sake of simplicity, I will avoid introducing more complexity via additional macroeconomic measures.
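The identifiability problem discussed in this subsection can be made concrete with a small numerical sketch. It assumes a hypothetical grid of ages and survey years and a linear predictor in age, period, and cohort; the cell values, the coefficients, and the function name `fitted_values` are illustrative assumptions, not quantities from this paper.

```python
# Hypothetical (age, period) cells; cohort is implied by Cohort = Period - Age.
cells = [
    (age, period, period - age)
    for age in (25, 30, 35, 40)
    for period in (1980, 1985, 1990)
]


def fitted_values(b0, b_age, b_period, b_cohort):
    """Linear age-period-cohort predictor evaluated on every cell."""
    return [b0 + b_age * a + b_period * p + b_cohort * c for a, p, c in cells]


# Integer coefficients keep the arithmetic exact.
original = fitted_values(1, 2, -3, 4)

# Shift the three slopes by any constant t in the pattern (+t, -t, +t).
# The extra term is t*age - t*period + t*(period - age), which is zero
# in every cell, so the predictions cannot change.
t = 7
shifted = fitted_values(1, 2 + t, -3 - t, 4 + t)

indistinguishable = original == shifted
```

Two different sets of slopes produce identical predictions in every cell, so no amount of data of this kind can choose between them. That is why some additional restriction, or a direct measure standing in for one of the three time variables, is needed.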
6.3
Building a Demand System Including APC Variables
Combining Deaton and Muellbauer’s flexible demand system with APC analysis of expenditure data allows us to determine how temporal preferences come to affect demand within the framework of a theory-consistent model. Though the two models are largely separate in existing literature, Gustaven and Rickertsen (2009) included APC variables in a demand system that they applied to Norwegian purchases of non-alcoholic beverages. Inserting age, period, and cohort dummies into an AIDS framework gives them a model that can be simplified as

$$w_{ih} = a_{i0} + \sum_{k=2}^{K} \pi_{ik} A_k + \sum_{l=2}^{L} \delta_{il} P_l + \sum_{m=2}^{M} \eta_{im} C_m + \sum_j \gamma_{ij} \log p_j + \beta_i \log\left(\frac{x_h}{P}\right) \tag{10}$$
where one age, period, and cohort dummy variable is dropped due to singularity, and $K-1$ age dummies, $L-1$ period dummies, and $M-1$ cohort dummies remain. (To ease interpretation of variables, I will drop the first of each: i.e., youngest age group, earliest birth cohort.) Taking period effects to be the result of price effects ($\sum_j \gamma_{ij}$) and income effects ($\beta_i$) leaves us with the following:

$$w_{ih} = a_{i0} + \sum_{k=2}^{K} \pi_{ik} A_k + \sum_{m=2}^{M} \eta_{im} C_m + \sum_j \gamma_{ij} \log p_j + \beta_i \log\left(\frac{x_h}{P}\right) \tag{11}$$
Dummies for age and cohort are broken into 5-year intervals,[2] resulting in 10 age groups (25-29, 30-34, etc.) and 11 birth cohorts (beginning with 1916-20). As mentioned before, the age 25-29 and cohort 1916-20 dummies are dropped from the regression to avoid singularity, and the remaining dummy coefficients can be interpreted as deviations from those reference groups.
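The grouping into age and cohort dummies can be sketched as follows. This is a minimal illustration under assumptions: it treats 70-75 as the single six-year age group and returns `None` for values outside the stated ranges, neither of which is spelled out in the text, and the function names are hypothetical.

```python
N_AGE_GROUPS = 10      # 25-29, 30-34, ..., 65-69, 70-75
N_COHORTS = 11         # 1916-20, 1921-25, ..., 1966-70


def age_group(age):
    """Index of the age group (0 = 25-29); age 75 is folded into 70-75."""
    if not 25 <= age <= 75:
        return None
    return min((age - 25) // 5, N_AGE_GROUPS - 1)


def cohort_group(birth_year):
    """Index of the birth cohort (0 = 1916-20)."""
    if not 1916 <= birth_year <= 1970:
        return None
    return (birth_year - 1916) // 5


def dummies(index, n_groups):
    """0/1 indicators with the reference group (index 0) dropped."""
    return [1 if index == g else 0 for g in range(1, n_groups)]


# Hypothetical household: reference person aged 52, surveyed in 1990.
survey_year, age = 1990, 52
birth_year = survey_year - age                  # Cohort = Period - Age
age_dummies = dummies(age_group(age), N_AGE_GROUPS)
cohort_dummies = dummies(cohort_group(birth_year), N_COHORTS)
```

A household in a reference group has all zeros in the corresponding dummy vector, so its predicted share is carried by the intercept, and every other coefficient is read relative to it.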
6.4
Estimation
In my main analysis, I estimate a set of linear simultaneous equations for the five different product categories (food at home, food away from home, clothing and personal care, alcohol and tobacco, and other nondurables) using the method of seemingly unrelated regressions. This method allows for the simultaneous estimation of separate linear regressions whose error terms are allowed to be correlated across equations. Estimating the model as a set of simultaneous equations is also required for the imposition of the symmetry constraint, which requires equality of cross-price elasticities across regressions. In addition, because the dependent variable in question is a budget share (rather than traditional quantity demanded) and price therefore appears on both sides of the demand equation, extra work must be done to interpret price coefficients in the model; that is, they cannot be interpreted as elasticities directly as they appear in the model. I follow the steps taken by Fujii, Khaled, and Mak (1985) to calculate elasticities from the model parameters as follows:

• Own-price elasticity (uncompensated): $-1 - \beta_i + \gamma_{ii}/\omega_i$
• Own-price elasticity (compensated): $-1 + \omega_i + \gamma_{ii}/\omega_i$
• Income elasticity: $1 + \beta_i/\omega_i$

[2] Given the range of the data, some groups are one year larger or smaller.
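The conversion from model coefficients to elasticities can be sketched for the uncompensated own-price and income cases. This is a minimal illustration: the input values are hypothetical rather than estimates from this paper, and the function name `aids_elasticities` is an assumption made for the example.

```python
def aids_elasticities(beta_i, gamma_ii, share_i):
    """Return (uncompensated own-price, income) elasticities for good i."""
    if share_i <= 0:
        raise ValueError("budget share must be positive")
    own_price_uncompensated = -1.0 - beta_i + gamma_ii / share_i
    income = 1.0 + beta_i / share_i
    return own_price_uncompensated, income


# Hypothetical inputs, not estimates from this paper.
own_price, income = aids_elasticities(beta_i=-0.10, gamma_ii=0.05, share_i=0.25)
```

Both elasticities depend on the budget share at which they are evaluated, so the same coefficients imply different elasticities for households with different budget compositions.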
7
Results and Discussion
Estimation of my AIDS-APC model (Table 2), a constrained multiple regression model in which all AIDS restrictions are imposed and period effects are approximated by price and income effects, suggests the existence of distinct age, period, and cohort effects among households. As mentioned in the previous section, because the dependent variable in question is a budget share (rather than traditional quantity demanded) and price therefore appears on both sides of the demand equation, price coefficients in the model need to be translated into elasticities before we can make sense of them. Table 3 contains own-price and income elasticities for the five expenditure categories as calculated from the fully AIDS-constrained model. Four of the five own-price elasticities are significantly negative and fit within ranges suggested by prior research on expenditure data, though elasticity estimates vary widely in the literature. Cramer (1973) calculates a price elasticity for beer (the largest single component of alcohol expenditures in recent CEX data) of -2.00, extremely close to the alcohol elasticity of -2.06 calculated in the model. Food at home elasticity falls
Table 2: Budget Allocation in a Restricted AIDS-APC Model

                                Product Category
                                Food at Home          Food Away             Clothing and          Alcohol and           Other
                                                      from Home             Personal Care         Tobacco               Nondurables
Model Variables                 (1)                   (2)                   (3)                   (4)                   (5)

Prices
  Food at Home                  0.0870** (0.0335)     -0.0863** (0.0295)    0.0610*** (0.0137)    -0.0544*** (0.00407)  -0.00732 (0.00942)
  Food Away                     -0.0863** (0.0295)    0.0786* (0.0329)      0.0734*** (0.0124)    -0.000558 (0.00337)   -0.0632*** (0.00750)
  Clothing and Personal Care    0.0610*** (0.0137)    0.0734*** (0.0124)    0.114*** (0.0114)     -0.0359*** (0.00286)  -0.213*** (0.00655)
  Alcohol and Tobacco           -0.0544*** (0.00407)  -0.000558 (0.00337)   -0.0359*** (0.00286)  -0.0540*** (0.00215)  0.145*** (0.00367)
  Other Nondurables             -0.00732 (0.00942)    -0.0632*** (0.00750)  -0.213*** (0.00655)   0.145*** (0.00367)    0.138*** (0.0106)
Nondurable Expenditures (real)  -0.105*** (0.00178)   0.0448*** (0.00140)   0.0421*** (0.00133)   0.00665*** (0.00107)  0.0248*** (0.00243)
Household Size                  0.0392*** (0.0116)    -0.0382*** (0.00915)  -0.0207* (0.00869)    0.00984 (0.00702)     0.00985 (0.0158)
Age
  30-34                         0.0183*** (0.00439)   0.00345 (0.00348)     -0.00227 (0.00330)    0.00540* (0.00262)    -0.0249*** (0.00590)
  35-39                         0.0355*** (0.00491)   0.000881 (0.00394)    -0.0114** (0.00375)   0.000648 (0.00283)    -0.0256*** (0.00637)
  40-44                         0.0523*** (0.00568)   -0.00358 (0.00464)    -0.0138** (0.00442)   0.00478 (0.00313)     -0.0397*** (0.00704)
  45-49                         0.0617*** (0.00625)   -0.000989 (0.00518)   -0.0135** (0.00495)   -0.00247 (0.00327)    -0.0449*** (0.00732)
  50-54                         0.0733*** (0.00705)   0.000305 (0.00591)    -0.0144* (0.00566)    -0.00593 (0.00349)    -0.0533*** (0.00777)
  55-59                         0.0815*** (0.00791)   0.00122 (0.00670)     -0.0104 (0.00641)     -0.00836* (0.00376)   -0.0640*** (0.00831)
  60-64                         0.0902*** (0.00880)   0.00501 (0.00749)     -0.0119 (0.00718)     -0.00985* (0.00405)   -0.0736*** (0.00887)
  65-69                         0.0950*** (0.00957)   0.00644 (0.00820)     -0.00811 (0.00785)    -0.0149*** (0.00430)  -0.0786*** (0.00936)
  70-75                         0.0975*** (0.0105)    0.0115 (0.00900)      -0.00492 (0.00862)    -0.0207*** (0.00460)  -0.0836*** (0.00995)
Cohort
  1921-25                       0.00440 (0.00347)     0.00577* (0.00276)    0.00319 (0.00263)     -0.00146 (0.00205)    -0.0119** (0.00461)
  1926-30                       0.0105** (0.00376)    0.00374 (0.00306)     0.00937** (0.00292)   -0.00317 (0.00207)    -0.0205*** (0.00461)
  1931-35                       0.00903* (0.00448)    0.0164*** (0.00372)   0.0134*** (0.00355)   -0.00706** (0.00230)  -0.0318*** (0.00511)
  1936-40                       0.0173** (0.00543)    0.0146** (0.00456)    0.0154*** (0.00436)   -0.00386 (0.00266)    -0.0435*** (0.00583)
  1941-45                       0.0236*** (0.00638)   0.0183*** (0.00540)   0.0164** (0.00518)    -0.00177 (0.00298)    -0.0566*** (0.00649)
  1946-50                       0.0296*** (0.00746)   0.0196** (0.00635)    0.0204*** (0.00609)   -0.00175 (0.00336)    -0.0680*** (0.00727)
  1951-55                       0.0317*** (0.00861)   0.0270*** (0.00734)   0.0197** (0.00704)    -0.000817 (0.00387)   -0.0777*** (0.00834)
  1956-60                       0.0379*** (0.00956)   0.0278*** (0.00819)   0.0251** (0.00783)    -0.00184 (0.00430)    -0.0890*** (0.00925)
  1961-65                       0.0508*** (0.0105)    0.0325*** (0.00899)   0.0253** (0.00860)    -0.00721 (0.00470)    -0.102*** (0.0101)
  1966-70                       0.0606*** (0.0121)    0.0354*** (0.0103)    0.0367*** (0.00989)   -0.0106 (0.00553)     -0.122*** (0.0120)
cons                            1.072*** (0.0311)     -0.305*** (0.0256)    -0.288*** (0.0242)    0.105*** (0.0180)     0.416*** (0.0403)
Table 3: AIDS Own-Price and Income Elasticities

                            Uncompensated       Compensated         Income
Product                     Price Elasticity    Price Elasticity    Elasticity
Food at Home                -0.5122             -0.3898             0.5378
Food Away from Home         -0.2104             -0.0714             1.4752
Clothes and Personal Care   0.0489              0.1958              1.4016
Alcohol and Tobacco         -2.0669             -2.0232             0.8678
Other Nondurables           -0.7906             -0.1775             1.0422
generally within the range suggested by Nayga and Capps (0.428), Craven and Haidacher (-0.455), and Lamm (-0.630). That FAFH appears less elastic than FAH is surprising and somewhat counterintuitive, though the number calculated here still fits in the lower range calculated by Andreyeva, Long, and Brownell (2010). The positive own-price elasticity for clothing and personal care is also somewhat unexpected, though it is just barely positive, and much smaller in magnitude than all other elasticities.3
3 Of the five expenditure categories, however, CPC seems the most likely to approach a positive value, as it encompasses a number of commodities (clothing, jewelry) that convey status, for which price can serve as an indicator of quality, and for which preferences may thereby increase with price. Prior research suggests that some consumers demonstrate status motives when purchasing clothing (Leibenstein 1950) and women’s cosmetics (Chao and Schor 1998). Cramer (1973) calculated positive price elasticities for both shoes (1.83) and toilet articles (0.16).
Income elasticities largely fit with our expectation. Across categories, income elasticities are positive; i.e., total expenditures rise with income. Income elasticities between 0 and 1 denote necessities, for which total expenditures rise with income but budget share falls. In this analysis, food at home falls within that range, as we might expect. Interestingly, alcohol and tobacco also falls within the necessity range, perhaps contrary to expectations. Those goods with income elasticities greater than 1 (food away from home, clothing and personal care, and other nondurables) are considered luxuries, for which budget share increases with income. This finding is consistent with the findings of Leser (1941).
Both FAH and FAFH are correlated with household size in agreement with economies of scale in both categories. In terms of FAH, for instance, we might intuit that once an individual goes through the trouble of cooking at home, it becomes less costly to cook extra for a second person, and that larger households will therefore choose to eat at home more often. This outcome is consistent with the findings of Nelson (1988), who finds significant economies of scale in household food consumption and points to discounts from bulk purchasing as a possible source of such economies.
Food at home demonstrates highly significant and increasing age effects, meaning that increasingly older age groups allocate increasingly greater portions of their budgets to food at home. Regression results indicate that the 70-75 age group allocates roughly 9.8 additional percentage points of its nondurable expenditures to FAH, compared to 25-29-year-olds. This does not suggest that older individuals have greater absolute FAH expenditures, which would be in disagreement with Aguiar and Hurst (2005), who suggest that absolute food expenditures fall at retirement. Rather, it merely suggests that as people age, a greater share of their nondurable expenditures is devoted to FAH, and they reallocate money from other (perhaps more conspicuous or luxury-related) product categories into food at home. Simply put, it appears that as individuals age, their taste for food at home rises relative to, say, their taste for clothing and personal care items.
Cohort effects for food at home are also highly significant and also increasing in later cohorts. Interestingly, this makes net effects in any given period ambiguous, as increasing age necessarily implies belonging to earlier cohorts, and demonstrates how conventional cross-sectional analysis cannot sufficiently distinguish between effects.
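The ambiguity described above comes from the identity cohort = period − age: within a single period, age and birth cohort move in lockstep. A minimal sketch of the problem, using made-up effect profiles rather than any estimates from this paper, shows that a linear trend can be shifted between age, period, and cohort effects without changing a single predicted value. All names and numbers below are hypothetical.

```python
# Illustrative sketch with made-up effect profiles (not estimates from the paper).

def predicted(age_eff, period_eff, cohort_eff, age, period):
    cohort = period - age  # birth year is pinned down by age and period
    return age_eff(age) + period_eff(period) + cohort_eff(cohort)


def shift(age_eff, period_eff, cohort_eff, delta):
    """Add a linear trend to the age and cohort effects and remove it from period."""
    return (
        lambda a: age_eff(a) + delta * a,
        lambda p: period_eff(p) - delta * p,
        lambda c: cohort_eff(c) + delta * c,
    )


# Hypothetical baseline: budget share rises with age and with later birth years.
base = (lambda a: 0.002 * a, lambda p: 0.0, lambda c: 0.001 * (c - 1916))
alt = shift(*base, delta=0.005)

cells = [(age, period) for age in range(25, 76, 5) for period in range(1980, 2011, 10)]
max_gap = max(abs(predicted(*base, a, p) - predicted(*alt, a, p)) for a, p in cells)
print("largest difference in predictions:", max_gap)
```

The two sets of effects imply very different age and cohort profiles yet fit every age-period cell equally well (up to floating-point error), which is why some restriction on one of the three dimensions is needed before the others can be interpreted.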
The results suggest that the 1966-70 cohort allocates an additional 6.1 percentage points of its budget to FAH compared to the 1916-20 reference cohort. Given that research suggests that childhood is a crucial period in the formation of life-long food preferences (Birch and Fisher 1998), it is not surprising that food expenditures demonstrate significant cohort effects. The increasing nature of these cohort effects could be interpreted in a number of ways: as an increase in absolute quantity of food (i.e., households are purchasing more food), an
increase in food quality (i.e., households are purchasing better food), or a substitution away from FAFH expenditures. Our results eliminate the last explanation, since FAFH budget shares also demonstrate increasing cohort effects, but distinguishing between the first two is impossible given our data, and both seem plausible. Nevertheless, it is not difficult to speculate as to why more recent cohorts might prefer to spend more of their budgets on food than the cohorts before them, regardless of whether those changes are a reflection of shifting quantities or qualities. For instance, the development and proliferation of supermarkets and specialty food stores over time has made food more widely available, including gourmet and specialty items (Chao and Utgoff 2006). As availability of items and overall household incomes increased, households may have formed tastes for higher quality (i.e., more expensive) grocery items, coinciding also with a rising gourmet movement throughout the twentieth century (Strauss 2011). On the other hand, the American diet has also risen in calories in recent decades, driven in part by technological innovations that have facilitated mass production of food items and lowered time costs of production (Cutler, Glaeser, and Shapiro 2003), potentially accounting for some of the additional FAH budgeting. Additionally, we might imagine that as overall prosperity rises, households’ propensity to waste perishable food might also be increasing, and increased expenditures on food might not necessarily imply increased consumption. The increasing cohort effect appears highly linear across successive cohorts. Interestingly, this is also true of many of the cohort effects (as well as age effects) examined in this analysis. While the explanation behind this phenomenon is unclear, it suggests that there might be a cumulative aspect to cohort effects.
In other words, my cohort prefers to allocate more to a given good than our parents did, but our parents preferred to allocate more to that good than their parents did, leading to semi-linear, largely monotonic cohort effects. In contrast to food at home, FAFH consumption has no significant age effects. Cohort effects, however, are highly significant, with younger cohorts allocating an increasing portion of their budgets to food away from the home. This isn’t surprising,
given the proliferation of sit-down and fast-food restaurants, the increasing availability of take-out, ready-made, or convenience foods (Guthrie, Lin, and Frazao 2002), and the rising popularity of gourmet food establishments (Strauss 2011) throughout the twentieth century, all of which could be influencing the lifelong eating habits of young cohorts experiencing those changes. A number of other explanations can account for these highly significant cohort effects. Individuals of later cohorts entered the labor market at a time in which women were increasingly participating in the workforce, attending school, and establishing careers (Guthrie, Lin, and Frazao 2002). We might imagine that as women entered the workplace and shed their roles as full-time housekeepers, American households became increasingly dependent on ready-made or convenience foods. Fast food, and different cohorts’ differential exposure to it, could also be contributing to the rising cohort effects observed in the data. The earliest cohorts in this analysis were already approaching middle age during the advent of the fast food industry and its period of phenomenal growth. More recent cohorts, on the other hand, were highly exposed to fast food and its marketing campaigns. These cohorts demonstrate a markedly increased propensity to spend money on food away from home compared to prior cohorts, suggesting that different cohorts are differentially influenced by dramatic changes in the marketplace depending on their age at that moment in time. Children who grow up with exposure to fast food, for instance, might be disproportionately susceptible to its external influence on dietary habit formation. It is also notable that the 1926-30 cohort is the only cohort without a significant cohort effect in FAFH.
We could perhaps fit this finding into a larger narrative of the Great Depression, since research and intuition both suggest that families will scale back on eating out during recessionary times (Kumcu and Kaufman 2011). As such, we might expect that members of the 1926-30 cohort, who were children and young adolescents during the Great Depression, grew up eating out less frequently than children raised in non-recessionary periods and never developed a lifelong taste for eating out (Birch and Fisher 1998). This finding, however, is far from conclusive, and is not significantly different
than the other observed effects. A great deal of research has pointed to the increasing importance of food away from home in the American diet (U.S. Department of Agriculture 2013). Overall, though, the main lesson from the FAFH analysis is that higher FAFH budget shares, which we might associate with young people who like the social aspect of eating out, are not so much a factor of youth as much as the cohort membership of the particular youths we’re studying. Their preferences for eating out, therefore, should stay with them even as they age. As a generation born and raised on convenience foods, fast food, and dining out, they have cultivated a taste for those forms of consumption that will persist beyond their youth and across the life course. This has tremendously different implications for demand forecasting than a cross-sectional analysis looking only at age profiles at a single moment in time. Clothing and personal care consumption demonstrates yet another distinct pattern of time effects. Age effects are significant from the ages of 35-54, with successive age groups allocating smaller and smaller budget shares relative to the 25-29-year-old reference group. This finding suggests that older individuals might feel less of a need to purchase personal grooming products or other status-conferring goods as they enter married life and/or middle age. After the age of 55, however, age effects do not appear significant, suggesting an increase in CPC allocation back to pre-35 levels. Cohort effects are highly significant, with later cohorts spending increasingly large portions of their budgets on clothing and personal care items. This outcome aligns with the claim that those who experienced the hardship of the Great Depression cultivated a sense of “things that matter,” valuing things like hard work and the centrality of family over material possessions (Elder 1999).
Cohort effects grow increasingly positive the further a cohort lies from the Depression and its lingering memory, possibly reflecting a loss of the values it imparted and a subsequent shift in preferences toward material goods. This suggests that cultural attitudes of status seeking, conspicuous consumption, and consumerism in America can be interpreted from a generational viewpoint. The existence of a cohort effect suggests that, holding income and all other variables constant,
later generations exhibit a relative preference for more outwardly noticeable consumption goods. Notably, by far the greatest jump in cohort effect occurs in the 1966-70 cohort, and its difference from the prior cohort is indeed statistically significant. Interestingly, members of the 1966-70 cohort are considered the early portion of “Generation X.” Gen X has earned a reputation in sociology literature for being the first generation to put quality of personal life ahead of work life (Meredith and Schewe 1994). As such, the large cohort effect can be interpreted as an increased propensity to allocate money to personal care, luxury, or conspicuous-consumption items among members of that cohort. The relative increase for the 1966-70 cohort could also be interpreted as a relatively less robust increase for the cohorts immediately preceding it, those born between 1956 and 1965. The results point to no statistically significant difference in cohort effects between the 1956-60 and 1961-65 cohorts, a deviation from the upward trend. Prior literature suggests that this “Boomers II” cohort, whose members came of age during the Arab oil embargo of 1973 and the subsequent economic slump, showed less optimism about their financial future than the cohorts on either side of them (Schewe and Noble 2010). As such, they might have been more cautious with discretionary spending and been less preoccupied with status-seeking consumerism throughout their lives, even in more stable economic times. Age effects in alcohol and tobacco consumption are almost a mirror image of clothing consumption. Age effects are barely or not significant until the age of 55, at which point older individuals spend increasingly less on this consumption category, as we might expect for both health/physiological reasons as well as sociocultural ones.
Cohort effects, perhaps surprisingly, are largely insignificant, with one prominent exception: the 1931-35 cohort, whose members tend to allocate less of their budgets to alcohol and tobacco than other cohorts. These individuals were born in the years surrounding the 1933 repeal of Prohibition, though it is difficult to justify why a changing alcohol environment would influence such young children, or even if it did, why the effect would necessarily be negative. Instead, it might make sense to look later in history to account for this outcome. Individuals
from this cohort approached legal drinking age in the very early 1950s. Interestingly, the federal government imposed a sharp tax increase on most forms of alcohol to raise revenues during the Korean War, bringing alcohol taxes to some of their highest real levels in American history (Mosher and Beauchamp 1983). While our regression analysis holds prices constant, it is possible that high taxes during the Korean War distorted preferences, and that those individuals who reached drinking age exactly at that time didn’t go on to cultivate a lifelong taste for alcohol. If this is true, it would imply that there is a critical age range at which one’s lifelong consumption of alcohol is tremendously influenced. Similar to the idea that individuals’ lifelong eating habits are cultivated in early childhood, it is possible that their lifelong drinking habits are cultivated in young adulthood. Other nondurables also demonstrate highly significant decreasing age and cohort effects, with older individuals and more recent cohorts allocating smaller and smaller budget shares to this category. Changes in other nondurables expenditures are difficult to interpret because the category comprises several very different types of goods and services, including utilities, domestic services, nondurable transportation, airfare, nondurable entertainment, net gambling receipts, business services, and charitable giving. Our data does not distinguish among these categories, and therefore it is difficult to attribute a shift in other nondurables expenditures to any particular sub-category of goods. Indeed, any of its components could be driving the observed trends. However, the highly significant age and cohort effects, combined with the fact that this category comprises roughly 52% of all nondurable expenditures, mean that these patterns are an important part of the story.
Further research could be done into the individual components of this category to more thoroughly explain the roots of the observed effects. This paper’s results are also undoubtedly affected by the exclusion of households with children. Childless households by definition do not experience many of the life phases experienced in households with children (early and late parenthood, empty nesting, and so on) and therefore exhibit different consumption behavior. Further work in household equivalence scaling could enable the inclusion of households with children into
the sample. Additionally, the choice to use total nondurable expenditures as a basis for budget share analysis necessarily meant forgoing the study of expenditures on housing, education, and other durable goods. While examining nondurables still proved fruitful, searching for potential cohort effects in expenditures on, say, education, could also have valuable results. Moreover, budget share results in this paper’s analysis could be somewhat different when housing, durables, and education are included in expenditure totals. Future research could also address heterogeneity within cohorts, such as how race, gender, geography, or other differentiating features might differentially sensitize members of a common birth cohort to prominent events.
8
Conclusions
This paper makes two important contributions: one methodological and one substantive. On the methodological side, this paper constructs and estimates a model that accounts for distinct age, period, and cohort effects within the framework of a formalized demand system and confronts the fundamental identifiability problem in APC modeling. The inclusion of cohort effects significantly improves demand models that account only for age and allows for a far more nuanced understanding of consumer behavior, whether at the level of individual products or at the level of broader macro patterns in consumer spending. The results suggest that cohort effects in spending patterns are both real and significant: different birth cohorts form generational preferences that affect their purchasing patterns over the life course, though these effects can easily be obscured when confounded with age and period effects. Relating these effects to notable historical occurrences suggests that individuals are not uniformly affected by events, but are instead distinctly shaped by their unique experience of history given their particular age at a particular time. Many of the results suggest that children and young adults can be particularly vulnerable to these effects, and therefore point to the importance of historical and cultural events during childhood and young adulthood in shaping future consumption patterns. This is particularly salient in the cases of food away from home expenditures, clothing and personal care
expenditures, and alcohol and tobacco expenditures, all of which demonstrate highly significant cohort effects that increase over time and show large increases in particular cohorts occupying notable places in the evolution of the marketplace. In sum, this paper confirms the body of work that suggests that individuals are inextricably linked to and shaped by their particular experience of history, and that the occurrence of certain events at certain developmental periods can have effects throughout the life course. The results here suggest that this idea applies not only to a generation’s values and attitudes, but also to its concrete, quantifiable changes in spending patterns. With their far-reaching implications in a vast array of fields, cohort effects ought to be more thoroughly studied and considered as an integral driver of demand.
Language Use and Health of Children in Immigrant Households
Jisoo Han, Princeton University1
Abstract. Immigrants and their children are a growing segment of the American population. Despite the importance of the immigrant population in policy considerations, immigrants have worse access to health care than natural-born citizens and are more likely to be uninsured. I analyze the relationship between language use at home and access to health care for the children of immigrants. I use a dataset containing demographic and health characteristics of children in the state of California. Using survey data from 2009, I estimate how language spoken at home affects measures of access to care and how these effects vary with parental citizenship status. I find that two factors, coming from households speaking an Asian language and having non-citizen parents, are most significantly associated with decreases in access to health care.
Keywords: immigrants, insurance, healthcare
1 Jisoo Han is a senior at Princeton University. He wrote this paper as a junior independent project in the 2012-13 academic year. The author would like to thank his junior paper adviser, Professor Swati Bhatt, for her guidance throughout the process. He would also like to thank Michael Sockin, a graduate student in the Princeton Economics Department, for his helpful feedback.
1
Introduction
The children of immigrants comprise the fastest-growing segment of the U.S. child population. The foreign-born are driving the growth of the American population, and nearly one in four children have at least one immigrant parent. Despite the increasing importance of the foreign-born population in policy decisions, children of immigrants remain vulnerable with regard to health insurance coverage and access to health care. Immigrants tend to be healthier than the native-born at the point of immigration due to healthful behaviors such as better nutrition and a lower tendency to smoke or drink. However, this advantage disappears as they spend time in the U.S. (Derose, Escarce, and Lurie 2007). Immigrants are more likely than their U.S.-born counterparts to be uninsured, have poor access to health care, and have lower medical expenses (Huang, Yu, and Ledsky 2006; Ku 2009). These gaps persist even at higher levels of income (Bass 2006). Moreover, the foreign-born population is becoming more heterogeneous, and immigrant subgroups vary significantly in health characteristics (Singh and Hiatt 2006). Previous research has focused on explaining health disparities between the foreign-born and the native-born by examining barriers that immigrants face and the consequences of changes in government health program policy. Less research has attempted to explain differences between various immigrant groups by examining the link between the children of immigrants and the additional costs that their parents face when consuming medical services. There is a need for more research on the descendants of immigrants, as they will have a greater long-term influence on the nation than the immigrants themselves. Two government-run health programs, Medicaid and the State Children’s Health Insurance Program (SCHIP), serve as a crucial source of health insurance coverage for children in low-income households.
Despite the benefits of having insurance, immigrants have lower take-up rates for these programs than the native-born due to greater costs of obtaining coverage. For the foreign-born population, language barriers and immigration concerns increase the costs of finding information and completing applications. Starting in 2014, the Affordable Care Act (ACA) will
expand access to Medicaid and SCHIP by simplifying eligibility rules, enrollment, and renewal processes. It will also establish a uniform set of benefits that all health plans except employment-based plans must offer. The ACA is an effort to eliminate health inequality and achieve universal coverage by improving accessibility and quality of health care. In light of the enactment of the ACA, which will be fully implemented by 2019, I examine the effect of barriers associated with parental immigration status on insurance coverage and health care utilization of children. In particular, I analyze differences between immigrant subgroups by focusing on language spoken at home. From the results, I hope to suggest ways to improve the implementation of the most significant policy change in the U.S. healthcare system in decades.
2
Literature Review
To explain poor health outcomes of immigrants, past studies have compared the native-born and the foreign-born across various demographic variables. Researchers demonstrate that socioeconomic background, immigration status, and cultural barriers are factors that affect health insurance coverage in adult immigrants (Huang, Yu, and Ledsky 2006; Jasso et al. 2004). For immigrants, language barriers and immigration concerns increase the costs of finding information about government programs, completing applications, preparing necessary documents, and maintaining enrollment (Aizer 2007). These results suggest that for the foreign-born population, being eligible for insurance is not sufficient to ensure coverage (Currie 2009). Consequently, research on immigrant health has focused on identifying the causes of disparities in health by examining the effects of changes in government health programs. However, research in this line of thought has been limited to analyzing differences between immigrants and the native-born. The U.S. immigrant population has become more heterogeneous, and subgroups vary greatly in their socioeconomic, health, and cultural characteristics (Singh and Hiatt 2006). Potocky-Tripodi (2006) shows that for the children of immigrants, disparities in health conditions persist for different ethnic groups even after controlling for family income and health insurance status. Given
such evidence and the continuing growth in both the size and diversity of the foreign-born population, analyzing immigrants as one group is inadequate. In this analysis, I attempt to distinguish the difference in costs that immigrant groups face across two criteria: immigration status and language barriers. Immigration status has a significant influence on decision-making because non-citizen immigrants, both legal and undocumented, may have additional concerns associated with enrolling in government programs. Legal non-citizen immigrants may mistakenly believe that taking advantage of government-funded programs might jeopardize their chances of becoming legal permanent residents or citizens. Undocumented immigrants may fear that they would be deported if they utilize government health programs. In addition to these concerns, non-citizens are disadvantaged in other aspects. Studies that compared immigrants by their legal status show that naturalized citizens and permanent residents are more likely than non-citizens to earn higher wages, have insurance coverage, and receive benefits such as employment-based health insurance (Carrasquillo, Carrasquillo, and Shea 2000; Kandilov and Kandilov 2010). However, relatively less research has been devoted to analyzing the effect of parental legal status on children’s health outcomes. Lurie (2008) addresses this gap in the literature by comparing insurance coverage of the children of permanent residents and non-permanent residents following the enactment of the Personal Responsibility and Work Opportunity Reconciliation Act of 1996 (PRWORA). The PRWORA made legal immigrants ineligible for government health programs for the first five years of residence in the U.S. This reform did not have a direct effect on the children of immigrants because the children were citizens, or because their state of residence provided insurance to all children from low-income households regardless of citizenship status.
Lurie’s results, however, show that the children of non-permanent residents were more likely to be uninsured than the children of permanent residents despite being equally eligible. This finding suggests that parents’ citizenship status influenced the likelihood of their children being insured. Moreover, the analysis of Bass (2006) demonstrates that long-term immigrants who are likely to be citizens are still less likely than their
native counterparts to be insured, which suggests that naturalized citizens and native citizens behave differently. I will further contribute to this body of literature by comparing children across their parents’ citizenship status. I will examine the interaction of additional costs associated with legal status and language concerns. Studies focusing on language barriers show that limited English proficiency contributes to lack of insurance, poor access to health care, increased risk of medical errors, and patient dissatisfaction (Ku 2009; Cordasco et al. 2011). Non-professional interpreters such as family members are commonly used in medical settings, but these ad hoc interpreters are more likely to make errors that could lead to tragic consequences (Hernandez, Denton, and Blanchard 2011; Flores 2006). In addition to affecting the quality of health care, language difficulties distort the decision-making process involving insurance coverage. Immigrants with limited English proficiency are likely to form false beliefs because they do not have adequate access to information. Previous research, including the studies discussed above, has used immigrants’ self-ratings as a measure of their English proficiency, which may not be an accurate reflection of their actual language use or mastery of English. Lee, Nguyen, and Tsui (2011) address this bias by testing interview language as a measure of acculturation among Asian Americans. However, their analysis compared English and non-English survey takers without further analysis of language groups. I suggest a more comprehensive way to compare immigrants by examining language spoken at home. This is a better measure of immigrants’ integration into the U.S. and allows for comparison between multiple immigrant groups.
Analysis involving language spoken at home is needed, especially when looking at the children of immigrants, because the actual language used at home is a better reflection of parents’ English skills and acculturation than perceived proficiency. Regardless of the children’s English proficiency, what largely determines their access to and use of health care is their parents’ ability to secure such resources. In predicting the differences in the effect of speaking a certain language as opposed to another, linguistic distance can be a useful concept. Linguistic distance is defined as the difficulty
that Americans have in learning the language in question, and Chiswick and Miller (2005) provide a quantitative measure of this distance. They show that, controlling for other variables, linguistic distance had a significant effect on the English proficiency of adult immigrants. According to Chiswick and Miller’s values, European languages such as French and Spanish are linguistically closer to English, while Asian languages such as Japanese and Chinese are further away. Immigrants speaking languages that have larger linguistic distances will have a more difficult time learning English as they assimilate to life in the U.S. Since English proficiency has been shown to influence immigrants’ health, those who have to expend relatively more effort to achieve a certain level of proficiency would face greater costs in securing insurance and a usual source of care for their children, and experience lower quality and satisfaction from utilizing medical services. In order to account for the growing heterogeneity of the foreign-born population and to explain health disparities between ethnic groups, research with more comprehensive ways of categorizing immigrants is necessary. To my knowledge, researchers have not yet performed a multilingual analysis on health care and health outcomes of the children of immigrants. The aim of this analysis is twofold: to examine whether the language spoken at home leads to differences in health measures for children in immigrant households, and whether the effect of language spoken is related to parental immigration status. To answer these questions, I examine the way in which parental citizenship status affects language use, which in turn influences children’s health outcomes.
Drawing from existing literature and the theory of linguistic distance, I hypothesize that children from households speaking languages that are linguistically closer to English will have greater access to and use of medical services. I also hypothesize that the effect of language use will be smaller for the children of naturalized immigrants as compared to noncitizens, since the former will have fewer concerns related to legal status. Through this analysis, I plan to provide an update to the literature and reveal areas of vulnerability for immigrant subgroups.
3 Data
This analysis uses data from the 2009 California Health Interview Survey (CHIS). The CHIS is a telephone survey of California’s overall population, which includes more than a quarter of the U.S. immigrant population according to the 2010 Census. The sample includes people from most large and small racial and ethnic groups, and the interview is administered in English, Spanish, Mandarin, Cantonese, Korean, and Vietnamese. From each household, a random adult, adolescent, and child are interviewed separately. Adolescents (ages 12 to 17) are interviewed directly. For children (under age 12), the randomly selected adult in the household acts as the proxy. The CHIS collects health-related information including health status, behavior, insurance, access to care, and use of medical services. The CHIS was chosen because it covers groups underrepresented in most other health surveys, and conducts interviews in six languages (including English). Compared to commonly used surveys such as the Current Population Survey (CPS) and the National Health Interview Survey (NHIS), the CHIS provides more detailed data on language use and immigration status. Moreover, multiple interview languages will ensure that the immigrant population is represented, and improve the accuracy of their responses. The CHIS also provides more detailed information on parental citizenship status and language spoken at home compared to other health surveys, which makes the dataset appropriate for this analysis. Information on whether each parent is a US-born citizen, naturalized citizen, or a non-citizen is provided. To place children into the category that best characterizes the set of advantages and disadvantages that their parents face in securing health care, and to avoid double counting, I sort children according to the “highest” parental citizenship status in the household.
Children in the first category have at least one US-born citizen parent, and those in the second category have at least one naturalized parent. Both parents of children in the last category are non-citizens. As for language spoken at home, I consolidate the various categories that the dataset provides into three (European, Asian, and other), since certain subsets of language spoken at home only represent a small fraction of the sample. For the 2009 survey, data are
available for 12,324 children ages 0 to 17. Among these, 5,135 are children of immigrants, including naturalized citizens, noncitizens with legal visas, and undocumented immigrants.
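The sorting rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's code: the function name and the string labels are hypothetical and do not correspond to actual CHIS variable names or codes.

```python
from typing import Iterable

# Ordered from "highest" to "lowest" status, following the sorting rule in
# the text. The labels are illustrative assumptions, not CHIS codes.
STATUS_ORDER = ["US-Born Citizen", "Naturalized Citizen", "Non-Citizen"]


def highest_parental_status(parent_statuses: Iterable[str]) -> str:
    """Return the "highest" citizenship status among a child's parents."""
    statuses = list(parent_statuses)
    if not statuses:
        raise ValueError("at least one parent status is required")
    for status in statuses:
        if status not in STATUS_ORDER:
            raise ValueError(f"unknown status: {status}")
    # A lower index in STATUS_ORDER means a "higher" status.
    return min(statuses, key=STATUS_ORDER.index)


# A child with one naturalized and one non-citizen parent is counted once,
# in the naturalized group.
print(highest_parental_status(["Non-Citizen", "Naturalized Citizen"]))
```

Because each child receives exactly one label, no child is counted in more than one group, which is the double-counting concern the sorting rule is meant to address.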
4 Methodology
I use OLS linear probability and probit models of CHIS data to test if language spoken at home has an effect on health care access and utilization for children in immigrant households. To distinguish the effect of language use by legal status, regressions are performed for each group of children categorized by parental citizenship status: “US-Born Citizen,” “Naturalized Citizen,” and “Non-Citizen.” Since naturalized citizens tend to be better assimilated to the United States and have fewer concerns with regard to language difficulties and citizenship status compared to non-citizen immigrants, I hypothesize that the potential negative effect of language use will be stronger for non-citizens. With the objectives of examining the relationship between language spoken at home, parental immigration status, and health care access and utilization, the following linear probability model will be used:
$$\text{Healthcare}_i = \beta_0 + \beta_1 \text{European}_i + \beta_2 \text{Asian}_i + \beta_3 \text{Other}_i + \beta_4 \text{FPL}_i + \beta_5 \text{Age}_i + u_i \tag{1}$$
Healthcare is a dummy variable for one of the following measures of health care access and utilization: insurance coverage, having a usual source of care other than an emergency room, and having visited a doctor in the past year. In this model, it indicates the probability that a child with a set of characteristics given by the independent variables has access to the measure of health care in question. European, Asian, and Other are indicator variables denoting the language spoken in the child’s household. Although the dataset provided more detailed categories on language use at home, certain categories included too few children to be useful for data analysis. Therefore, children in households that use English were grouped with foreign language-only households if a foreign language was also among the languages used at home.
In these multilingual families, it could be the case that while the children use English to communicate, the parents mainly use their native foreign language. In such cases, categorizing these children as coming from an English-speaking household would not be an accurate reflection of the barriers that their parents face in consuming health care. Moreover, it is plausible that parents in households that use English along with a foreign language are less fluent in English than parents in English-only households. For children in families that only use English, all of the three language dummy variables take a value of 0. FPL is the household income as a percentage of federal poverty level (FPL), which varies by household size. In the dataset, FPL equals 1 if the household income is 100% of federal poverty level, 2 if it is 200% of FPL, and so forth. Age is the child’s age in 2009 when the interviews were conducted, and ranges from 0 to 17. The continuous variables for age and FPL were included because they are the two main determinants of children’s eligibility for government health programs. California’s Medicaid program is called Medi-Cal, and its SCHIP program is called Healthy Families. Infants (age 0) in households with incomes of up to 200% FPL, children under age 6 with household incomes of up to 133% FPL, and children ages 6 to 18 with household incomes of up to 100% FPL are eligible for Medi-Cal. Children are eligible for Healthy Families if they are ineligible for Medi-Cal and have household incomes of up to 250% FPL.2 Controlling for age and household income as a percentage of FPL will indicate if language spoken at home has an effect on different measures of health care access and utilization, holding constant the determinants of eligibility and ability to afford insurance. 
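The age and income thresholds quoted above can be written out as a small rule, which makes clear why age and FPL together determine eligibility. This is a sketch under stated assumptions: the function name is hypothetical, FPL is coded as a multiple of the poverty level (1.0 = 100%) as in the dataset description, and only the age and income criteria from the text are modeled. Actual program rules may involve further conditions.

```python
from typing import Optional


def child_program_eligibility(age: int, fpl: float) -> Optional[str]:
    """Classify a child using only the thresholds quoted in the text.

    fpl is household income as a multiple of the federal poverty level
    (1.0 = 100% FPL).
    """
    if age < 0 or age > 18:
        raise ValueError("age outside the range covered by the quoted rules")
    if age == 0:
        medi_cal_limit = 2.00   # infants: up to 200% FPL
    elif age < 6:
        medi_cal_limit = 1.33   # ages 1 to 5: up to 133% FPL
    else:
        medi_cal_limit = 1.00   # ages 6 to 18: up to 100% FPL
    if fpl <= medi_cal_limit:
        return "Medi-Cal"
    if fpl <= 2.50:             # Healthy Families: up to 250% FPL
        return "Healthy Families"
    return None                 # above both income limits


# A three-year-old at 150% FPL is above the Medi-Cal limit for that age
# but within the Healthy Families limit.
print(child_program_eligibility(3, 1.5))
```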
Since federal poverty guidelines take into account the size of the household, FPL will be a more accurate measure of the resources available per person in the household than absolute income. In addition to the basic linear probability model, I use a probit model with an additional set of explanatory variables to estimate the effect of language use and parental immigration status. The probit regression resolves the disadvantages of the linear probability regression by modeling the probability of the dependent variable equaling one using the cumulative standard normal distribution function. The probit model is estimated as follows:

$$\phi(\text{Healthcare})_i = \beta_0 + \beta_1 \text{European}_i + \beta_2 \text{Asian}_i + \beta_3 \text{Other}_i + \beta_5 \text{FPL}_i + \beta_6 \text{Age}_i + \beta_7 \text{Female}_i + \beta_8 \text{Health}_i + \beta_9 \text{Married}_i + \beta_{10} \text{College}_i + u_i \tag{2}$$

The variables European, Asian, Other, FPL, and Age have the same definitions as in equation (1). Additional demographic variables for gender, perceived health status, family structure, and parental education are included. Previous studies have shown that these factors are correlated with children’s health outcomes (Guendelman, Schauffler, and Pearl 2001). Female is a dummy variable that equals 1 if the child is female, and it equals 0 if the child is male. Perceived health status, Health, takes one of five values ranging from 1 to 5 depending on the self-rating provided by the respondent (5=excellent, 1=poor).3 Married is a dummy variable that equals 1 if the child’s household has married parents, and it equals 0 if the household has a single parent. College is a dummy variable that equals 1 if the interviewed adult of the family has attended college, and it equals 0 otherwise.

2 States have varying eligibility rules, which are available in the National Governors Association’s issue brief (National Governors Association 2008).
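To make the contrast between the two specifications concrete, the sketch below turns a set of coefficients into a predicted probability under each model. It is an illustration only: the coefficients are copied as printed from the expanded linear and probit columns for the non-citizen group in Table 3, the two children are hypothetical, and the helper names are my own.

```python
import math

# Coefficients as printed in Table 3 (health insurance, non-citizen group).
LPM = {"const": 0.965, "European": -0.0357, "Asian": -0.0567, "Other": -0.0528,
       "FPL": 0.00396, "Age": -0.00609, "Female": 0.00564, "Health": 0.000177,
       "Married": 0.0169, "College": 0.0211}
PROBIT = {"const": 1.784, "European": -0.313, "Asian": -0.480, "Other": -0.451,
          "FPL": 0.0597, "Age": -0.0393, "Female": 0.0314, "Health": -0.00209,
          "Married": 0.113, "College": 0.114}


def linear_index(coefs, child):
    """Constant plus the sum of coefficient times regressor value."""
    return coefs["const"] + sum(coefs[name] * value for name, value in child.items())


def standard_normal_cdf(z):
    """Cumulative standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lpm_probability(child):
    # The linear probability model uses the index itself as the probability.
    return linear_index(LPM, child)


def probit_probability(child):
    # The probit model passes the index through the normal CDF.
    return standard_normal_cdf(linear_index(PROBIT, child))


# Hypothetical children, chosen only for illustration. Omitted regressors are 0.
child_a = {"Asian": 1, "FPL": 1.0, "Age": 5, "Female": 1, "Health": 4, "Married": 1}
child_b = {"FPL": 10.0, "Age": 0, "Health": 5, "Married": 1, "College": 1}

for label, child in [("A", child_a), ("B", child_b)]:
    print(label, round(lpm_probability(child), 3), round(probit_probability(child), 3))
```

For the first child the two models give similar predictions. For the second, a high-income household, the linear index exceeds one, while the probit prediction stays inside the unit interval. This is the disadvantage of the linear probability model that the probit specification avoids.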
5 Results
5.1 Demographic Characteristics
Summary statistics for demographic characteristics of children in immigrant households suggest that “higher” parental legal status is strongly associated with the ability to provide better resources for their children. An overwhelming majority of native households use English at home. For naturalized citizens, the percentage of households using both English and a European language is the greatest, followed by households using both English and an Asian language at home. Half of non-citizen households use English and a European language, while a third only use Spanish. Data on health characteristics of children vary with parental immigration status. Children of US-born citizens have better ratings than the immigrant population in general, and children of naturalized citizens have slightly better health ratings than children of non-citizens. As for insurance coverage, children of non-citizens are much more likely to be uninsured, and much less likely to have employment-based insurance. Immigrant children are also less likely to have a usual source of care or to have visited a doctor or a dentist in the past twelve months. As expected, the data suggest that children’s access to and use of medical care differ significantly depending on parental immigration status.

3 Answers for perceived health status were provided by the respondent for the adolescent population, and by the adult proxy for children under the age of 12.
5.2 Health Insurance Coverage
For the full sample of children, I apply the linear probability model specified in Equation (1), and include the results in Table 1. I also present regression results with additional variables. For both the basic and the expanded models, all of the coefficients on language variables are significant. Speaking a language other than a European or Asian language at home most negatively affects the probability of insurance. To analyze the effect of parents’ citizenship status on children’s probability of being insured, I apply the linear probability and probit models to three groups of children categorized by parental immigration status (US-born citizen, naturalized citizen, and non-citizen). The regression results are included in Table 3. As expected, the language coefficients tend to be more negative and significant for children who have parents with a “lower” citizenship status. In the linear model for children of non-citizens, the three language coefficients are all significant. The coefficient on Asian language is the largest and is significant at the 0.01 level (a 5.67 percentage point decrease in the probability of insurance). Given that only 8.73% of the children of non-citizens are uninsured, this value represents a substantial change in the probability of being uninsured. I use the same variables from the expanded linear model and apply them in the probit model. These results are included in the third column for each citizenship group in Table 3.
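A short calculation shows why a 5.67 percentage point effect is large relative to an 8.73% uninsured rate. This is a back-of-the-envelope sketch using only the two figures quoted above. Note that 8.73% is the rate for all children of non-citizens rather than for the English-only reference group, so the ratio is indicative only.

```python
# Figures quoted in the text for children of non-citizens.
asian_coefficient = -0.0567   # linear probability coefficient on Asian language
uninsured_share = 0.0873      # share of the group that is uninsured

# A drop of 5.67 points in the probability of being insured is the same
# number of points added to the probability of being uninsured.
increase_in_uninsured = -asian_coefficient
relative_to_group_rate = increase_in_uninsured / uninsured_share

print(f"{increase_in_uninsured * 100:.2f} percentage points")
print(f"{relative_to_group_rate:.0%} of the group's uninsured rate")
```

The coefficient is equal to roughly two thirds of the group's overall uninsured rate, which is the sense in which the change is substantial.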
Table 1: Linear Probability Regression of Healthcare, Full Sample

                    Health Insurance          Usual Source of Care      Doctor Visit in Past Year
Dep. Variable       (1)          (2)          (1)          (2)          (1)          (2)
European Lang.      -0.0283***   -0.0204***   -0.0224***   -0.0149***   -0.0103*     -0.00367
                    (0.00471)    (0.00478)    (0.00536)    (0.00570)    (0.00616)    (0.00658)
Asian Lang.         -0.0141**    -0.0149**    -0.0242***   -0.0188**    -0.0156*     -0.0197**
                    (0.00635)    (0.00644)    (0.00781)    (0.00791)    (0.00897)    (0.00913)
Other Lang.         -0.0239**    -0.0268***   -0.0251**    -0.0249**    -0.00638     -0.00996
                    (0.00990)    (0.00990)    (0.0108)     (0.0108)     (0.0120)     (0.0120)
FPL                 0.00523***   0.00383***   0.00360***   0.00232***   0.00508***   0.00409***
                    (0.000373)   (0.000398)   (0.000538)   (0.000572)   (0.000625)   (0.000668)
Age                 -0.00232***  -0.00205***  -0.0106***   -0.00964***  -0.00868***  -0.00904***
                    (0.000383)   (0.000396)   (0.000503)   (0.000508)   (0.000485)   (0.000501)
Female                           0.000844                  0.0107**                  0.00920*
                                 (0.00375)                 (0.00438)                 (0.00515)
Health Status                    0.000670                  0.0151***                 -0.0103***
                                 (0.00226)                 (0.00270)                 (0.00304)
Married Parents                  0.0200***                 0.0189**                  0.00615
                                 (0.00655)                 (0.00745)                 (0.00812)
Parent’s Education               0.0280***                 0.00846                   0.0359***
                                 (0.00514)                 (0.00584)                 (0.00674)
Constant            0.963***     0.925***     1.016***     0.922***     0.963***     0.977***
                    (0.00450)    (0.0124)     (0.00500)    (0.0143)     (0.00565)    (0.0161)
R2                  0.020        0.024        0.049        0.053        0.027        0.030

1. Robust standard errors in parentheses
2. ***p<0.01, **p<0.05, *p<0.1
Table 2: Linear Probability Regression of Healthcare, Foreign Language-Only Dummy Variables

                    Health Insurance          Usual Source of Care      Doctor Visit in Past Year
Dependent Variable  (1)          (2)          (1)          (2)          (1)          (2)
Spanish             0.000233     0.00914      -0.0175      -0.00987     -0.0192      -0.0447***
                    (0.0197)     (0.0128)     (0.0229)     (0.0122)     (0.0261)     (0.0144)
Asian Language      0.0297**     -0.0592*     0.00481      -0.0927***   -0.00691     -0.0361
                    (0.0123)     (0.0325)     (0.0191)     (0.0355)     (0.0231)     (0.0307)
Other Language      -0.00469     -0.0183      -0.0126      -0.0440      0.0119       -0.0387
                    (0.0149)     (0.0264)     (0.0182)     (0.0302)     (0.0197)     (0.0301)
FPL                 0.00309***   0.00485***   0.00221*     0.00320      0.00409***   0.00291
                    (0.00101)    (0.00140)    (0.00132)    (0.00195)    (0.00142)    (0.00221)
Age                 -2.71e-05    -0.00591***  0.00986***   -0.0143***   -0.00801***  -0.0130***
                    (0.000893)   (0.00114)    (0.00127)    (0.00122)    (0.00120)    (0.00122)
Female              -0.00603     0.00637      0.0126       0.00805      0.0196       0.00889
                    (0.00847)    (0.0107)     (0.0104)     (0.0108)     (0.0123)     (0.0118)
Health Status       -0.00206     0.000883     0.0116*      0.0129**     -0.00789     -0.0145**
                    (0.00476)    (0.00534)    (0.00605)    (0.00526)    (0.00672)    (0.00588)
Married Parents     -0.00338     0.0148       -0.0167      0.0108       -0.00763     0.00506
                    (0.0158)     (0.0160)     (0.0200)     (0.0162)     (0.0236)     (0.0170)
Parent’s Education  0.0323***    0.0234*      0.0163       -0.0128      0.0145       0.0271*
                    (0.0111)     (0.0134)     (0.0132)     (0.0141)     (0.0149)     (0.0141)
Constant            0.936***     0.923***     0.958***     0.964***     0.971***     1.037***
                    (0.0257)     (0.0283)     (0.0330)     (0.0257)     (0.0378)     (0.0297)
R2                  0.013        0.019        0.044        0.072        0.021        0.046

1. Robust standard errors in parentheses
2. ***p<0.01, **p<0.05, *p<0.1
3. A language indicator variable in these regressions takes a value of 1 if that language is the only language spoken in the household. Regressions in columns (1) and (2) correspond to children of naturalized citizens and non-citizens, respectively.
Table 3: Linear and Probit Regressions of Insurance Coverage by Parental Immigration Status

Dependent Variable: Health Insurance Coverage

                U.S.-Born Citizen                      Naturalized Citizen                    Non-Citizen
                Linear       Linear       Probit       Linear       Linear       Probit       Linear       Linear       Probit
European Lang.  -0.00163     -0.000441    0.0321       -0.0353***   -0.0276**    -0.337**     -0.0400**    -0.0357**    -0.313
                (0.00608)    (0.00602)    (0.0849)     (0.0110)     (0.0111)     (0.164)      (0.0176)     (0.0177)     (0.203)
Asian Lang.     0.0162*      0.0155*      0.410        -0.0108      -0.00947     -0.195       -0.0517**    -0.0567***   -0.480**
                (0.00933)    (0.00916)    (0.399)      (0.00972)    (0.00994)    (0.169)      (0.0207)     (0.0207)     (0.213)
Other Lang.     -0.0268      -0.0288      -0.367**     -0.0154      -0.0165      -0.248       -0.0482      -0.0528      -0.451*
                (0.0155)     (0.0155)     (0.151)      (0.0150)     (0.0149)     (0.214)      (0.0294)     (0.0293)     (0.267)
FPL             0.00484***   0.00361***   0.118***     0.00327***   0.00241**    0.0487*      0.00525***   0.00396***   0.0597***
                (0.000445)   (0.000443)   (0.0263)     (0.000926)   (0.00100)    (0.0286)     (0.00132)    (0.00149)    (0.0213)
Age             -0.00136***  -0.000964**  -0.0169**    -0.000349    -0.000306    -0.00320     -0.000624*** -0.00609***  -0.0393***
                (0.000414)   (0.000431)   (0.00670)    (0.000873)   (0.000879)   (0.00947)    (0.00110)    (0.00111)    (0.00694)
Female                       0.00103      0.0274                    -0.00524     -0.0624                   0.00564      0.0314
                             (0.00408)    (0.0609)                  (0.00849)    (0.0925)                  (0.0107)     (0.0692)
Health Status                0.00205      0.00871                   -0.00317     -0.0368                   0.000177     -0.00209
                             (0.00264)    (0.0353)                  (0.00481)    (0.0501)                  (0.00531)    (0.0340)
Marr. Parents                0.0256***    0.201***                  -0.00187     -0.0251                   0.0169       0.113
                             (0.00757)    (0.0766)                  (0.0158)     (0.168)                   (0.0161)     (0.0919)
Parent’s Ed.                 0.0239***    0.209***                  0.0256**     0.220*                    0.0211       0.114
                             (0.00673)    (0.0732)                  (0.0109)     (0.114)                   (0.0136)     (0.0945)
Constant        0.956***     0.909***     1.222***     0.965***     0.965***     1.894***     0.990***     0.965***     1.784***
                (0.00473)    (0.0151)     (0.182)      (0.0132)     (0.0273)     (0.320)      (0.0205)     (0.0327)     (0.268)
R2              0.014        0.021                     0.011        0.014                     0.018        0.019

1. Robust standard errors in parentheses
2. ***p<0.01, **p<0.05, *p<0.1
5.3 Usual Source of Care
In the linear regression of having a usual source of care for the full sample of children (Table 1), all of the language coefficients are small but significant and negative. Speaking a language other than a European or Asian language at home most negatively affects the probability of having a usual source of care. In regressions applied to groups of children categorized by parental immigration status, only the non-citizen group has significant language coefficients (Table 4). For the non-citizen group, the coefficient on Asian language is significant and indicates a 5.8 percentage point decrease in probability of having a usual source of care.
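The star notation in the tables can be checked from a coefficient and its robust standard error. The sketch below is an approximation under stated assumptions: it uses a standard normal reference distribution, which is reasonable for samples of this size but may differ slightly from the exact distribution the estimation software used. The function names are my own.

```python
import math


def two_sided_p_value(coefficient: float, standard_error: float) -> float:
    """Two-sided p-value under a standard normal approximation."""
    z = abs(coefficient / standard_error)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi.
    return math.erfc(z / math.sqrt(2.0))


def significance_stars(coefficient: float, standard_error: float) -> str:
    """Map a p-value to the star convention in the table notes."""
    p = two_sided_p_value(coefficient, standard_error)
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


# Non-citizen group, expanded linear model in Table 4:
# Asian language -0.0580 (0.0249) and European language -0.0166 (0.0206).
print(repr(significance_stars(-0.0580, 0.0249)))
print(repr(significance_stars(-0.0166, 0.0206)))
```

Under this approximation the Asian-language coefficient is significant at the 5% level and the European-language coefficient is not, matching the stars printed in Table 4.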
5.4 Doctor Visits in the Past Year
In the linear probability regressions of having visited a doctor in the past twelve months (Table 5), the differences between language groups are insignificant or small for the full sample of children. In the linear and probit regressions of children grouped by parental immigration status, none of the language coefficients are significant. While the values are not significant, the predicted probabilities based on probit regressions are such that speaking English makes children in the naturalized citizen group less likely to have visited a doctor, while it makes children in the non-citizen group more likely to have visited a doctor. This general pattern is true for both reference cases. However, since the differences in probability are quite small and statistically insignificant, drawing a conclusion from this pattern would be inappropriate.
6 Discussion
6.1 Health Insurance Coverage
Among the three measures of health care access and utilization explored in this analysis, the linear probability regressions for health insurance coverage yielded the largest number of significant language coefficients. The values of these significant coefficients confirm the main hypothesis that children from households speaking languages that are linguistically further
Table 4: Linear and Probit Regressions of Usual Source of Care by Parental Immigration Status

Dependent Variable: Usual Source of Care

                U.S.-Born Citizen                      Naturalized Citizen                    Non-Citizen
                Linear       Linear       Probit       Linear       Linear       Probit       Linear       Linear       Probit
European Lang.  -0.00858     -0.00598     -0.0549      -0.0211      -0.0140      -0.0766      -0.0172      -0.0166      -0.119
                (0.00773)    (0.00769)    (0.0753)     (0.0151)     (0.0157)     (0.124)      (0.0201)     (0.0206)     (0.167)
Asian Lang.     -0.00304     -0.000811    -0.0483      -0.00719     -0.000109    -0.0192      -0.0605**    -0.0580**    -0.394**
                (0.0181)     (0.0184)     (0.259)      (0.0140)     (0.0144)     (0.124)      (0.0246)     (0.0249)     (0.181)
Other Lang.     -0.0161      -0.0162      -0.158       -0.0174      -0.0170      -0.154       -0.0486      -0.0473      -0.403
                (0.0152)     (0.0153)     (0.151)      (0.0201)     (0.0203)     (0.173)      (0.0328)     (0.0331)     (0.254)
FPL             0.00316***   0.00174***   0.0212**     0.00303**    0.00200      0.0239       0.00398**    0.00363*     0.0367
                (0.000638)   (0.000660)   (0.00872)    (0.00124)    (0.00132)    (0.0154)     (0.00185)    (0.00204)    (0.0229)
Age             -0.00902***  -0.00765***  -0.0814***   -0.0103***   -0.00983***  -0.0773***   -0.0146***   -0.0141***   -0.0931***
                (0.000608)   (0.000604)   (0.00644)    (0.00123)    (0.00125)    (0.0100)     (0.00120)    (0.00121)    (0.00777)
Female                       0.0107**     0.110**                   0.0132       0.0986                    0.00808      0.0382
                             (0.00522)    (0.0524)                  (0.0104)     (0.0811)                  (0.0108)     (0.0716)
Health Status                0.0187***    0.135***                  0.0120**     0.0901**                  0.0130**     0.0877**
                             (0.00376)    (0.0288)                  (0.00598)    (0.0432)                  (0.00528)    (0.0354)
Marr. Parents                0.0281***    0.208***                  -0.0168      -0.104                    0.0128       0.0893
                             (0.00919)    (0.0639)                  (0.0201)     (0.148)                   (0.0163)     (0.0946)
Parent’s Ed.                 0.00936      0.122*                    0.0140       0.117                     -0.00994     -0.0659
                             (0.00756)    (0.0641)                  (0.0135)     (0.0977)                  (0.0146)     (0.0942)
Constant        1.005***     0.884***     1.462***     1.016***     0.964***     1.825***     1.035***     0.971***     1.902***
                (0.00539)    (0.0194)     (0.160)      (0.0162)     (0.0348)     (0.270)      (0.0214)     (0.0329)     (0.244)
R2              0.041        0.049                     0.040        0.044                     0.067        0.070

1. Robust standard errors in parentheses
2. ***p<0.01, **p<0.05, *p<0.1
Table 5: Linear and Probit Regressions of Doctor Visit in Past Year by Parental Immigration Status

Dependent Variable: Doctor Visit in Past Year

                U.S.-Born Citizen                      Naturalized Citizen                    Non-Citizen
                Linear       Linear       Probit       Linear       Linear       Probit       Linear       Linear       Probit
European Lang.  -0.00277     -0.00217     -0.0145      0.0128       0.0167       0.116        -0.0128      -0.00796     -0.0380
                (0.00925)    (0.00929)    (0.0641)     (0.0180)     (0.0187)     (0.104)      (0.0225)     (0.0227)     (0.142)
Asian Lang.     -0.00780     -0.0114      -0.0608      0.00274      0.000905     0.0170       -0.0135      -0.0211      -0.114
                (0.0242)     (0.0242)     (0.186)      (0.0179)     (0.0185)     (0.104)      (0.0257)     (0.0258)     (0.163)
Other Lang.     -0.00336     -0.00740     -0.0482      0.0194       0.0193       0.145        -0.0270      -0.0324      -0.186
                (0.0177)     (0.0178)     (0.133)      (0.0234)     (0.0235)     (0.153)      (0.0342)     (0.0343)     (0.217)
FPL             0.00478***   0.00382***   0.0308***    0.00501***   0.00463***   0.0352***    0.00498**    0.00398*     0.0318
                (0.000773)   (0.000808)   (0.00768)    (0.00127)    (0.00140)    (0.0122)     (0.00217)    (0.00231)    (0.0221)
Age             -0.00766***  -0.00809***  -0.0599***   -0.00779***  -0.00791***  -0.0490***   -0.0119***   -0.0122***   -0.0685***
                (0.000590)   (0.000621)   (0.00433)    (0.00116)    (0.00117)    (0.00692)    (0.00117)    (0.00119)    (0.00643)
Female                       0.00510      0.0305                    0.0195       0.109                     0.0102       0.0495
                             (0.00642)    (0.0440)                  (0.0122)     (0.0710)                  (0.0118)     (0.0657)
Health Status                -0.00928**   -0.0785***                -0.00804     -0.0496                   -0.0132**    -0.0743**
                             (0.00428)    (0.0282)                  (0.00676)    (0.0384)                  (0.00591)    (0.0334)
Marr. Parents                0.00842      0.0508                    -0.00751     -0.0448                   0.00608      0.0365
                             (0.0102)     (0.0584)                  (0.0237)     (0.128)                   (0.0170)     (0.0874)
Parent’s Ed.                 0.0406***    0.243***                  0.0210       0.112                     0.0341**     0.174*
                             (0.00945)    (0.0530)                  (0.0154)     (0.0843)                  (0.0144)     (0.0887)
Constant        0.957***     0.964***     1.894***     0.937***     0.955***     1.650***     0.984***     1.014***     1.989***
                (0.00632)    (0.0223)     (0.147)      (0.0188)     (0.0407)     (0.229)      (0.0230)     (0.0369)     (0.223)
R2              0.023        0.027                     0.019        0.021                     0.039        0.043

1. Robust standard errors in parentheses
2. ***p<0.01, **p<0.05, *p<0.1
from English will perform worse in measures of health care access and utilization, as those from Asian-language households suffered a greater decrease in the probability of being insured than those from European-language households. The predicted values from probit analysis also confirm the secondary hypothesis that the effect of language use will be smaller for the children of naturalized immigrants compared to children of non-citizen immigrants. In both the linear and probit models, more language coefficients are significant for the non-citizen group than for the naturalized citizen group, and the decrease in predicted values is greater for children in the former. There may be other factors contributing to non-citizens’ sensitivity to changes in circumstances. Non-citizens could be staying in the country only temporarily, and therefore decide that they do not need insurance during their short stay. Language difficulties exacerbate the tendency for immigrants to stay uninsured, complicating the process of obtaining insurance, becoming familiarized with terms, and filing claims. As policy makers aim to achieve universal insurance coverage, they should consider the fact that non-citizens and foreign language households are much more likely to respond to changes in eligibility or the process of obtaining insurance than their native and English-speaking counterparts. As noted previously, regressions in this analysis have rather low R2 values. These values for linear probability regressions are influenced by the fact that the dependent variable takes a value of zero for a rather small group of children. Moreover, it could be the case that immigrants are affected by factors other than those included in this analysis. For instance, non-citizens could have most of their wealth in their native country’s currency.
In this case, consumption choices such as obtaining health care could be influenced by change in exchange rates between the native currency and the U.S. dollar. Such fluctuations would be the result of more general trends in the world economy, which would be difficult to incorporate into an analysis of immigrants from many different countries. To account for factors like these, future studies could focus on a select group of immigrants from a specific country.
6.2 Usual Source of Care
Whereas there are significant language coefficients for regressions of insurance coverage for all three groups of children, only the non-citizen group has significant language coefficients for regressions of having a usual source of care. The dummy variable for speaking an Asian language has the only significant coefficient, and the magnitude is quite substantial. These results confirm both of my hypotheses: children in Asian-language households are less likely to have a usual source of care, and the effect of language use is greater for the children of non-citizens than for those of naturalized citizens. The probit analysis yields similarly significant results. The analysis regarding having a usual source of care is limited, which could partly explain why the regressions have fewer significant coefficients compared to regressions of insurance coverage. Unlike insurance, having a usual source of care depends on one’s definition of the term. It is possible that some hold stricter definitions while others hold looser definitions of having a usual place to go when their children are sick. To produce more precise results, future surveys and studies could clarify this definition or create more accurate ways to measure access to medical care.
6.3 Doctor Visits in the Past Year
The linear probability regressions of having visited a doctor in the past year yield the fewest significant coefficients on language indicator variables. These results may be influenced by the fact that there is relatively little variation in the proportion of children who have visited a doctor in the past year between different language groups.4 This suggests that the language spoken at home does not directly influence the probability of a child having a physician visit in the last twelve months.

4 8.6%, 10.5%, 9.8%, and 7.9% of children from English, European-language, Asian-language, and other-language households have not visited a doctor in the past year, respectively.
6.4 Additional Discussion of Regression Results
This analysis has additional limitations that should be considered when interpreting the results. It is possible that there are cultural factors within each language group that influence decisions about health care utilization in a systematic way. If this were the case, speakers of a certain language would make health care choices as a result of being a native speaker of that language, and not directly because of speaking the language itself. However, even if these external factors have a significant impact on immigrants’ decision-making process, this analysis still confirms the hypothesis that there are significant differences between children in immigrant subgroups, both in terms of parental legal status and language spoken at home. As discussed previously, this analysis also suffers from the limits of binary dependent variables. The models in this analysis have a binary dependent variable that reflects whether the child has utilized a certain measure of health care. As for insurance coverage, a simple answer to a yes-no question cannot indicate whether the coverage is permanent or temporary. Non-citizens, who tend to be poorer than citizens and therefore more sensitive to marginal changes in income, could be insured at one point in time, but uninsured at another.5 Another potential source of bias may involve parental immigration status. Since the non-citizen population includes permanent residents, who are better integrated into the U.S. in general and are less concerned about legal status, the coefficients for non-citizens may be underestimating the effect of barriers that non-permanent residents experience. While the dataset for this analysis places undocumented immigrants, permanent residents, and non-permanent residents such as those with employment visas into the broad category of “non-citizens,” future studies could look at these segments of the immigrant population separately.
5 Almost half of non-citizens have incomes less than 100% of federal poverty level, which is an annual income of $23,550 for a family of four.
7 Conclusion
The immigrant population is an increasingly important segment of the nation to consider in policy decisions. In particular, the children of immigrants are driving the growth of the U.S. child population. In comparison to U.S. natives, additional factors influence immigrants when they make decisions about utilization of health care. This research analyzed the effect of language spoken at home on children’s access to and utilization of health care, and how the strength of this effect varies with parental citizenship status. As expected, the children of non-citizens are affected most strongly by the type of language used in the child’s household. For this group of children, speaking an Asian language seems to have the most significant and strongest effect on having access to medical care. These results confirm both of my hypotheses: that speaking a language that is linguistically further from English has a stronger effect on health care consumption, and that this effect is stronger for children of parents with “lower” legal status. These findings suggest several areas of improvement for medical service providers. To address the heterogeneity among immigrant households, hospitals should focus on recruiting interpreters or nursing staff that are fluent in foreign languages. Policy initiatives could encourage this by making it mandatory for hospitals to have a minimum amount of interpretive services available to patients with limited English proficiency. Under Medicaid and SCHIP, the two main government health programs for children in the U.S., each state decides if, and how, it will reimburse health care providers for language services (Youdelman 2007). National requirements for providing interpretive services could strengthen the quality of medical care that immigrants receive, and encourage them to seek preventive care, which will decrease the cost in the long run. 
Moreover, language services that are available to patients should reflect the specific needs of the region they serve. Since language is a significant barrier, especially for low-income immigrants, programs geared toward this population will have limited effectiveness if policymakers ignore the heterogeneity between immigrant subgroups. Communities with a higher proportion of low-income, Asian immigrants are in greatest need of interpretive services, as my analysis suggests that this population is most vulnerable to financial and language difficulties.
Further research is necessary to obtain a full understanding of the factors that influence immigrants’ decision-making process with regard to health care and to create effective policy. These analyses could use data from national surveys or regional surveys that cover areas with a substantial immigrant population. More detailed information on children’s consumption of health care goods and services would provide a more comprehensive analysis of this population than the binary measures of health care used in this investigation. Another limitation of this study was data insufficiency due to few observations belonging to certain language groups. A larger dataset with more specific information on language use at home would address the shortcomings of this analysis.
Numerous past studies have confirmed that immigrants and natives make different decisions with regard to health care. This investigation further suggests that immigrant subgroups behave differently when making decisions about utilization of medical care, and supports the need for research to determine the sources of differences between immigrants. At a time when the foreign-born and their children are driving the growth of the American population, studying immigrants and the challenges that they face in obtaining health care will remain crucial to the effort to alleviate health inequality.
Speculative Initial Public Offerings: A Disagreement Approach to the IPO Puzzle
Aditya Rajagopalan, Princeton University1
Abstract. I model the effect of disagreement on short-term IPO underpricing and long-term IPO underperformance. Given the existence of short-sales constraints and momentum traders, sufficiently high disagreement amongst investors triggers short-term overperformance and long-term underperformance as momentum traders magnify the effect of short-term disagreement. I verify these predictions empirically using data for aggregate disagreement and various measures of IPO abnormal returns, turnover, and frequency. I find that aggregate disagreement has an increasingly positive effect on initial post-IPO abnormal returns and turnover, and a negative effect on long-term post-IPO abnormal returns. My results also indicate that the effect of aggregate disagreement is amplified by gradual information diffusion in both the short- and long-term. This paper provides a new model combining the disagreement and information cascades hypotheses of the IPO Puzzle, and new empirical evidence explaining the IPO Puzzle through aggregate disagreement.
Keywords: initial public offerings, investing
1 Adi Rajagopalan graduated from Princeton University in 2013. The author would like to thank his thesis adviser, Professor José Scheinkman, for his tremendous guidance and support in the writing of this thesis. He would also like to thank Professor Harrison Hong, for inspiring the author’s passion for and interest in behavioral finance, and the countless people who were instrumental to proof synthesis, dataset construction, editing, proofreading, and general motivation for this thesis.
1 Introduction
The pricing behavior of initial public offerings (IPOs) has been one of the great mysteries of modern corporate finance. IPOs feature two seemingly antithetical pricing phenomena. First, in the short run, IPO stocks appear to be underpriced; namely, first-day post-IPO returns have averaged 18.6% (Ritter 1998), indicating that firms and their underwriters are on average deliberately pricing IPO firms below their fundamental value. Second, empirical evidence suggests that in the medium-to-long run, IPOs underperform; specifically, IPOs achieve negative abnormal returns relative to their peers, indicating that firms and their underwriters may be deliberately overpricing IPOs. Although many attempts have been made to rationalize these two patterns individually, the coexistence of these two contradictory phenomena, which I hereafter refer to as the “IPO Puzzle,” has long confounded researchers.
Two major behavioral theories explaining the IPO Puzzle have recently been gaining acceptance. The first, disagreement theory, uses the existence of disagreement and short-sales constraints in the days and months immediately following IPOs to argue that during IPOs, investors with low valuations are priced out of the market, causing short-term post-IPO prices to reflect only the valuations of optimistic investors (Miller 1977). The second, information cascades theory, suggests that gradual information dispersion and momentum are commonplace among IPOs and are the cause of the short-run bubbles that form immediately after IPOs (Welch 1992).2
This paper therefore has two goals. First, I seek to combine the intuitions of Miller’s disagreement theory and Welch’s information cascades theory in a theoretical framework to illustrate the pricing implications of temporary disagreement and short-sales constraints, given the existence of gradual information diffusion (here, through momentum traders).
Second, I seek to study for the first time the effect of aggregate, nonidiosyncratic disagreement on post-IPO price-volume dynamics and see whether Miller’s intuitions stand empirically.
To achieve these ends, this paper is organized as follows: In Section 2, I review literature pertaining to the IPO Puzzle, herding in markets, and disagreement in order to motivate this research. In Section 3, I develop a simple model in which I incorporate momentum traders into the classic A-B disagreement framework. I use the intuitions from my model to motivate Sections 4 and 5, in which I describe my dataset and empirical methodology, respectively. In Sections 6 and 7, I present and discuss my empirical results. Finally, in Section 8, I summarize the major conclusions of my study and introduce a number of avenues for future work.
2 Note: although Welch (1992) is classically used to explain only short-term underpricing, my usage of this theory in this paper suggests that information cascades theory could potentially explain long-term underperformance of IPOs.
2 Background
2.1 The IPO Puzzle
Historically, two main empirical idiosyncrasies of IPOs have drawn the attention of academics: (1) exceptionally high returns immediately following IPOs, and (2) long-run underperformance of IPOs (Ritter 1998). The key unanswered question is therefore how both short-term underpricing and long-term underperformance can coexist for IPOs. I consider attempts by researchers to explain each phenomenon in turn.
2.1.1 Short-Run Underpricing
A wide body of empirical work has shown that the distribution of first-day returns to IPO stocks is highly right-skewed (Ibbotson 1975; Ritter 1984, 1998; Loughran and Ritter 2004): between 1980 and 2012, mean proceeds-weighted first-day returns were approximately 18.6% (Ritter 2013). This phenomenon has been replicated in every country with a stock market, with historical average first-day returns ranging from 4.2% in Russia to 264.5% in Saudi Arabia (Loughran et al. 1994).
Asymmetric Information Models. Asymmetric information models presume that one party among issuing firms, underwriters, and investors knows more about the firm’s true value than the other two. The most famous of these is Rock’s (1986) winner’s curse model, which argues that the best-informed, privileged investors crowd out unprivileged investors during underpriced issues and abstain from investing during overpriced issues. When they abstain, they trigger negative conditional expected returns on the IPO, causing unprivileged investors to also abstain from investing. As such, firms underprice IPOs to ensure that both privileged and unprivileged investors participate in the market.3
Institutional Incentives. Institutional incentives models suggest that companies underprice themselves to reduce the likelihood of post-IPO lawsuits, prevent price drops following an IPO, generate additional revenue from greater brand awareness, or reduce managers’ personal tax burdens (Tinic 1988; Hughes and Thakor 1992; Ruud 1993; Demers and Lewellen 2003; Rydqvist 1997).4
Control Theories. The two major control theories actually conflict directly with each other: Brennan and Franks (1997) argue that underpricing helps to increase managerial control and agency costs by reducing the risk of a single shareholder taking control of a company. By contrast, Stoughton and Zechner (1998) argue that underpricing helps to increase monitoring and thus reduce agency costs and misaligned incentives.
Behavioral Models. The three rational-choice-based groups of theories suffer from one major flaw: incentive-driven underpricing is an exceptionally costly means for firms to convey information to investors. It seems especially unlikely that a high-quality firm would incur an opportunity cost of over 10% of its overall value to help communicate its true value to investors (Ritter 2011).5 Behavioral models, by contrast, allow for the existence of irrationality in firms and markets. As a note, behavioral theories are the newest, least-developed, and fastest-growing theories of IPO underpricing (Ljungqvist 2007). Prospect theory models build on Kahneman and Tversky (1979) and Thaler (1985), and argue that managers of issuing firms both underweight the opportunity cost of underpricing and overweight the direct cost of underwriter spreads (Loughran and Ritter 2002). Welch’s famous information cascades model (1992) claims that later investors use other investors’ purchasing decisions rather than their own private signals to guide their purchasing decisions. This triggers the formation of “cascades,” many of which are untethered from the firm’s fundamental value. Finally, proponents of disagreement models argue that as issuing firms seek to extract as much surplus from high disagreement investors as possible, they limit asset float immediately following IPOs, causing IPO offer prices to exceed firms’ fundamental value. Miller (1977) and Ljungqvist et al. (2003) therefore predict short-run overperformance and long-run underperformance of IPOs, as in the long run, investors learn the true value of the firm. The appeal of a number of the behavioral IPO theories over rationality-based theories is that they can explain both first-day IPO underpricing and long-term post-IPO underperformance.
3 Associated with this theory is the idea that underwriters have reputational incentives to underprice. cf. Beatty and Ritter (1986) and Hoberg (2003).
4 See Ritter and Welch (2002) and Ljungqvist (2007) for a survey of institutional incentives models.
5 See Ritter and Welch (2002), Ritter (2003), Ljungqvist (2007), and Yong (2007) for an extensive survey of asymmetric information models.
2.1.2 Long-Run Underperformance
An equally interesting but far more empirically contentious phenomenon concerns the long-run returns of IPOs. As a result of the exceptional methodological complexity of the empirical debate pertaining to long-run post-IPO returns, I draw the following two conclusions from existing literature and point the reader to reviews by Ritter (1998), Jenkinson and Ljungqvist (2001), Ritter and Welch (2002), Yong (2007), and Ritter (2011) for further reading. First, evidence from U.S. and Chinese financial markets indicates that IPOs underperform in the long run (Yong 2007). Second, the extent of IPO underperformance will be up for debate as long as there are multiple accepted measures of risk-adjusted performance (Ritter and Welch 2002). I turn to the two major explanations given for long-run IPO underperformance. Originally proposed by Ritter (1991) and Loughran and Ritter (1995), the window of opportunity
hypothesis holds that firms attempt to time IPOs to take advantage of positive swings in investor sentiment when IPOs themselves are overvalued (Ritter 1998).6 Of greater interest to this paper is the concept that short-sales constrained investors with heterogeneous expectations cause the formation of price bubbles. This theory has its roots in Miller (1977) and is predicated on the intuition that IPOs are environments with high disagreement and short-sales constraints as a result of the limited float available post-IPO. The review above has demonstrated that behavioral explanations of the IPO Puzzle may be superior to their rational-markets counterparts. Accordingly, I address the precepts of both herding and disagreement finance and discuss recent work tying each to IPOs.
2.2 Herding and Markets
2.2.1 Information Diffusion
The most famous study of the effect of information diffusion is by Huberman and Regev (2002), who study EntreMed. After the New York Times carried a front-page article about the company in 1998, EntreMed’s share price rose from a $12 Friday close to an $85 Monday open, consistently closing above $30 in the subsequent three weeks. Even though the Sunday New York Times article provided no new information, it induced a permanent price increase for EntreMed.
The EntreMed example illustrates the power of gradual information diffusion, a phenomenon described famously in Bikhchandani et al. (1992). In their model, investors receive noisy signals of a technology’s true value and revise their beliefs using Bayesian updating, given previous investors’ investment decisions. The result of the model is striking: early movers exert exceptional amounts of influence on late movers. Moreover, when there are sufficiently many prior confirmatory actions, Bayesian updating rationally requires individuals to disregard their own signals entirely, initiating an information cascade.
6 Although Kang et al. (1999) provide empirical evidence of the window of opportunity hypothesis, I approach their study with caution, since Kang et al. use market-to-book ratio (a ratio interpreted very differently in behavioral and conventional finance) as a proxy for over-valuation.
2.2.2 Momentum
Closely related to the concept of information cascades is the concept of medium-term momentum in markets, a phenomenon described famously by Jegadeesh and Titman (1993). On the empirical side, Lee and Swaminathan (2000) show that stocks with high past turnover generally had higher and more persistent momentum. Hong et al. (2012) demonstrate that arbitrageurs can amplify the effects of good news if investors have a disproportionate number of short positions on the stock ex ante.7 Theoretical attempts to rationalize momentum include Hong and Stein (1999) and Hong et al. (2011), who use epidemiological models to explain momentum and information diffusion, and the aforementioned studies on information cascades.8
2.2.3 Turnover
Also related to the above literature is the widely documented relationship between turnover and price bubbles. Piqueira (2004) demonstrates that between 1993 and 2002, turnover was negatively related to long-run returns, when controlling for liquidity. Similar results have also been found in Baker and Stein (2004), Datar et al. (1998), Chen et al. (2002), and Brennan et al. (1998).9
7 cf. Shleifer and Vishny (1995). Also related to momentum is post-earnings announcement drift. Most famously, Bernard and Thomas (1989) demonstrated that stock prices tended to drift after earnings announcements.
8 A number of papers also analyzed related herding behaviors and the effects of career concerns; however, I leave these out of this review for the sake of concision. For further reading, see Scharfstein and Stein (1990), Trueman (1994), Zwiebel (1995), Morris and Northwestern University (1998), Avery and Chevalier (1999), Prendergast and Stole (1999) and Hong et al. (2000) for studies of herding in markets and amongst analysts; and Coughlan and Schmidt (1985), Warner et al. (1988), Weisbach (1988), Grinblatt and Titman (1989), Jensen and Murphy (1990), Gibbons and Murphy (1992), Prendergast and Stole (1996), Khorana (1996), Holmström (1999) and Chevalier and Ellison (1999) for studies of the effects of career concerns amongst managers and analysts.
9 On a related note, researchers have also found positive correlations between volume and price variability; cf. Tauchen and Pitts (1983) and Epps and Epps (1976).
Approaches to this apparent violation of the Milgrom and Stokey (1982) No-Trade Theorem include the disagreement approach, which focuses on speculative motives that can create volume regardless of whether price moves exist. Given the growing acceptance of disagreement approaches to the IPO Puzzle, I turn my attention to the effect of disagreement on speculation in markets.
2.3 Disagreement and Speculation
2.3.1 Sources of Disagreement
Gradual Information Diffusion. Although the EntreMed example from Huberman and Regev (2002) is often cited by behavioral economists, the existence of gradual information diffusion can be fully consistent with models of rational decision-making; namely, specialists may face lower information acquisition costs than generalists. In the case of EntreMed, generalists failed to deduce information from the trading patterns of specialists following stories before the front-page New York Times article; instead, they only traded when spoon-fed information (Hong and Stein 2007). This theory is especially relevant in the case of IPOs, which feature extremely limited publicly available information, and therefore highly asymmetrical information. As information about companies typically spreads gradually, it is likely that gradual information diffusion drives disagreement during and immediately following IPOs.
Limited Attention. Limited attention theory suggests that individuals pay attention to only a small fraction of information available to them. Empirical studies of limited attention include Klibanoff et al. (1998), who find that closed-end country fund prices react far more to changes in fundamentals when news pertaining to that country appears in the New York Times, and underreact otherwise. Similarly, DellaVigna and Pollet (2005) find that stocks with Friday earnings announcements have substantially slower stock price responses, and 10% lower
volume, as investors are distracted temporarily over weekends.10
Heterogeneous Priors. The heterogeneous priors school of thought suggests that investors can still disagree about asset valuations if they utilize different economic models for interpreting information (Harris and Raviv 1993; Kandel and Pearson 1995).11
Regardless of the channel by which disagreement exists, it can only create speculative bubbles if it exists in the context of short-sales constraints (Miller 1977).
2.3.2 Short-Sales Constraints and Speculation
Short-sales constraints are a regular occurrence in markets: many investors lack the ability to short. Short-sales constraints can additionally come in the form of unavailable proxies for shorting, government bans on short-sales (Bris et al. 2007), and even limited asset float, as is often the case with IPOs. Miller (1977) argues that if short-sales constrained investors disagree sufficiently about the intrinsic value of an asset, pessimistic investors can effectively be shut out of the market, causing the price of an asset to reflect only the views of optimists. Miller argues that disagreement can explain low returns on value stocks, IPOs, and high-beta stocks. One major implication of Miller’s canonical model is that as the magnitude of disagreement rises, so does overpricing. This conclusion is supported by a simple static A-B model from Chen et al. (2002), as well as empirical evidence in Diether et al. (2002) and Chen et al. (2002).12
Two pieces of Hong and Sraer’s (2011b) empirical work are of great interest to this paper. First is the inversion of the securities market line predicted in high aggregate disagreement periods, a result that inspires me to study for the first time the effect of aggregate disagreement on IPOs. Second, and more importantly, is the monthly time series of aggregate disagreement used by Hong and Sraer. As a proxy for aggregate disagreement, the authors use the beta-weighted average of the dispersion of analyst forecasts of the long-term growth rate (cf. Yu (2011)). I use this same measure as my primary explanatory variable throughout this paper.
10 Friday earnings announcements triggered 60% delayed responses as a percentage of total response, compared with 40% for other weekdays. Interestingly, firms with Friday announcements also are 45% more likely to have negative earnings responses than their counterparts.
11 Both of these papers also use these models to explain the strong positive correlation between overpricing and turnover.
12 A-B models focus on two types of investors, one optimistic and one pessimistic. In the face of short-sales constraints, optimistic “A” investors can sometimes force pessimistic “B” investors out of the market, causing overpricing.
2.3.3 Disagreement and IPOs
Although the disagreement approach to the IPO Puzzle has existed since Miller (1977), empirical research studying the effect of disagreement on IPO returns is surprisingly scant. Houge et al. (2001) study IPOs between 1993 and 1996, and use three proxies for disagreement—the flipping ratio, time of first trade, and opening bid-ask spread. Houge et al. find that all three measures of disagreement have the ability to predict low long-run IPO returns when controlling for issue quality. Similarly, Loughran and Marietta-Westberg (2005) use turnover as a proxy for disagreement, and unsurprisingly find a negative relationship between long-run return and turnover. No paper has attempted to directly study the effect of disagreement on IPOs. Each of the aforementioned papers uses a noisy measure of disagreement that can reflect a number of market factors that may not be directly related to actual market disagreement. Moreover, idiosyncratic measures of disagreement often are influenced by a firm’s risk profile, introducing further omitted variables bias and error-in-variables bias into these studies. Accordingly, I opt for more direct proxies of disagreement used in Hong and Sraer (2011b) and Yu (2011).
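The concern about noisy proxies can be illustrated with a small simulation. The sketch below is not drawn from any of the studies cited above: it generates hypothetical data in which returns depend negatively on a "true" disagreement variable, then regresses returns on a proxy equal to true disagreement plus independent measurement noise. The function names, the coefficient of -0.5, and the noise level are all assumptions chosen for illustration. Under classical measurement error of this kind, the estimated slope shrinks toward zero.

```python
import random


def ols_slope(x, y):
    """Slope from a univariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta=-0.5, noise_sd=1.0, seed=0):
    """Return (slope using true disagreement, slope using noisy proxy).

    All data are simulated; beta and noise_sd are illustrative assumptions.
    """
    rng = random.Random(seed)
    disagreement = [rng.gauss(0.0, 1.0) for _ in range(n)]
    long_run_return = [beta * d + rng.gauss(0.0, 1.0) for d in disagreement]
    # The proxy adds independent noise to the true disagreement measure.
    proxy = [d + rng.gauss(0.0, noise_sd) for d in disagreement]
    return ols_slope(disagreement, long_run_return), ols_slope(proxy, long_run_return)


if __name__ == "__main__":
    slope_true, slope_proxy = simulate()
    print("slope with true disagreement:", round(slope_true, 3))
    print("slope with noisy proxy:      ", round(slope_proxy, 3))
```

With unit variance in both the true variable and the noise, the proxy-based slope should be roughly half the true slope, which is the standard attenuation result for classical errors-in-variables.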
3 Model
3.1 Setup
I seek to show that temporary disagreement in a setting with temporary short-sales constraints and momentum traders, both of which often exist immediately following IPOs, can lead to a short-term bubble and long-term underperformance below fundamental value. To illustrate this, I use a simple adaptation of the static A-B model used in Chen et al. (2002), Hong and Stein (2003), Hong et al. (2006), Hong and Stein (2007), Hong and Sraer (2011a), and Hong and Sraer (2011b).
Consider a single asset in supply $Q > 0$, and assume for simplicity that the risk-free rate is zero. There are four dates: $t = 0, 1, 2,$ and $3$. The asset pays a terminal dividend $D = V + \varepsilon_v$ at $t = 3$, where $\varepsilon_v \sim N(0, 1)$. In this model, there are three types of investors: A, B, and M. Investors of type A and B trade the asset at $t = 0, 1,$ and $2$. Each period, investors of type A and B maximize the following CARA function:

$$E[W] - \frac{1}{2\eta}\,\mathrm{var}(W) \tag{1}$$

where $W$ is wealth and $\eta$ is the risk-bearing capacity of each group. I denote $D_t^{(i)}$ as type $i \in \{A, B\}$ investor’s belief of the terminal dividend at time $t$, and $x_t^{(i)}$ to be the time $t$ demand of type $i$ investors. At $t = 0$, type A and B investors have homogeneous priors about $D$:

$$D_0^{(A)} = D_0^{(B)} = V + \varepsilon_v \tag{2}$$

where $\varepsilon_v \sim N(0, 1)$. At $t = 1$, investors of type A and B receive a disruptive signal $\lambda$ and $-\lambda$ about the fundamental value of the asset, respectively, where $0 < \lambda < \frac{3Q}{\eta}$. I interpret $\lambda$ and $-\lambda$ as the positive bias of investor type A and the negative bias of investor type B, namely, their disagreement. At $t = 0$, no investor knows that this disruptive signal will arrive. At $t = 1$, the disruption $\lambda$ is a private signal to both investors of type A and B; moreover, each investor naively believes that the other investor has the old belief $D_1^{(i)} = V$, and does not dynamically infer the other investor’s belief based on prices until after markets have cleared at $t = 1$. This disruption transforms the beliefs of investors of type A and B into the following:

$$D_1^{(A)} = V + \lambda + \varepsilon_v + \varepsilon_\lambda, \qquad D_1^{(B)} = V - \lambda + \varepsilon_v - \varepsilon_\lambda \tag{3}$$

where $\varepsilon_\lambda \sim N(0, \sigma_\lambda^2)$ represents the additional perceived variance by type A and B investors as a result of $\lambda$. For the purposes of this model, assume $0 < \sigma_\lambda < \sigma_v = 1$. At $t = 2$, disagreement between type A and B investors disappears, restoring their beliefs about the dividend to the following:

$$D_2^{(A)} = D_2^{(B)} = V + \varepsilon_v \tag{4}$$

Note that at $t = 2$, type A and B investors believe that other non-type M investors also hold the same belief about the asset’s terminal dividend.13
I let $\Sigma_t^{(i)}$ be type $i \in \{A, B\}$ investor’s beliefs about the variance of the returns between time $t$ and $t + 1$. For simplicity, I assume $\Sigma_0^{(i)} = 1$, $\Sigma_1^{(i)} = \Sigma \in (1, 2)$ (because of the symmetric but distorted beliefs of the investors), and $\Sigma_2^{(i)} = 1$.
Investors of type M represent momentum traders. Momentum traders only trade the asset at $t = 1$ and $2$. Like type A and B investors, type M investors receive the same terminal dividend $D$ for holding the asset at $t = 3$. Their demands are the following:

$$x_1^{(M)} = \eta(P_1 - P_0), \qquad x_2^{(M)} = \eta(P_2 - P_1) \tag{5}$$

It is important to understand why momentum traders exist in my model. Momentum traders can consist of a number of individuals: First, non-institutional traders often react to IPOs based on initial price movements. Second, momentum traders can reflect growing numbers of investors learning about the IPO, in a manner similar to the information diffusion documented in the EntreMed case (Huberman and Regev 2002). Finally, and perhaps most importantly, it is conceivable that traders dynamically and profitably trade based on momentum.14
13 Note: This represents nature eventually revealing the fundamental value of the asset; I assume here that this revaluation is public knowledge.
Hereafter, I assume that type A and B investors are fully short-sale constrained at $t = 0$ and $1$, as is often initially the case for IPOs, and that no investor is short-sale constrained at $t = 2$. Investors also know they will not be short-sale constrained at $t = 2$. Moreover, I assume momentum traders are never short-sale constrained.15
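The demand rules implied by this setup can be sketched in a few lines of Python. For an investor who holds $x$ units for one period, objective (1) becomes $x(E[\text{payoff}] - P) - \frac{x^2 \Sigma}{2\eta}$, whose first-order condition gives $x = \eta(E[\text{payoff}] - P)/\Sigma$; a short-sale constraint truncates this at zero. The function and argument names below are illustrative assumptions rather than notation from the model, and the numbers in the usage lines are arbitrary.

```python
def mean_variance_demand(expected_payoff, price, eta, variance, short_sale_constrained):
    """Demand of a type A or B investor under the mean-variance objective (1).

    expected_payoff: the investor's expectation of next period's price or dividend
    variance: the investor's perceived one-period return variance (Sigma in the text)
    """
    demand = eta * (expected_payoff - price) / variance
    if short_sale_constrained:
        # A constrained investor cannot hold a negative position.
        return max(demand, 0.0)
    return demand


def momentum_demand(price_now, price_prev, eta):
    """Demand of a type M trader: proportional to the most recent price change."""
    return eta * (price_now - price_prev)


if __name__ == "__main__":
    # Hypothetical numbers: an optimist and a pessimist facing the same price.
    print(mean_variance_demand(10.5, 10.0, eta=1.0, variance=1.5, short_sale_constrained=True))
    print(mean_variance_demand(9.5, 10.0, eta=1.0, variance=1.5, short_sale_constrained=True))
    print(momentum_demand(10.0, 9.0, eta=1.0))
```

In the usage lines, the pessimist would like to short but is held at a zero position, which is the mechanism that lets the optimist's valuation drive the price.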
3.2 Solution
Lemma 1. The stock holdings and price at $t = 0$ are given by the following:

$$x_0^{(M)} = 0, \qquad x_0^{(A)} = x_0^{(B)} = \frac{Q}{2}, \qquad P_0 = V - \frac{9\tilde{Q}}{8} \tag{6}$$

where $\tilde{Q} = \frac{Q}{\eta}$.
Lemma 2. The stock holdings and price at $t = 1$ are given by the following two cases:

Case 1: $\lambda > \frac{3\Sigma\tilde{Q}}{4(4-\Sigma)}$

$$x_1^{(M)} = \frac{\eta\lambda}{2-\Sigma} + \frac{Q(5-4\Sigma)}{4(2-\Sigma)}, \qquad x_1^{(A)} = \frac{-\eta\lambda}{2-\Sigma} + \frac{3Q}{4(2-\Sigma)}, \qquad P_1 = V + \frac{\lambda}{2-\Sigma} - \frac{\tilde{Q}(8-\Sigma)}{8(2-\Sigma)} \tag{7}$$

Case 2: $\lambda \leq \frac{3\Sigma\tilde{Q}}{4(4-\Sigma)}$

$$x_1^{(M)} = \frac{Q(5-2\Sigma)}{2(4-\Sigma)}, \qquad x_1^{(A)} = \frac{\eta\lambda}{\Sigma} + \frac{3Q}{4(4-\Sigma)}, \qquad x_1^{(B)} = \frac{-\eta\lambda}{\Sigma} + \frac{3Q}{4(4-\Sigma)}, \qquad P_1 = V - \frac{\tilde{Q}(16-\Sigma)}{8(4-\Sigma)} \tag{8}$$

14 Although momentum strategies are not profitable in this model, in a more dynamic, stochastic, continuous-time setting, the profitability of a momentum strategy is more likely to be nonnegative.
15 To make my model more realistic, I ideally would have M short-sale constrained at $t = 1$ as well; however, this drastically complicates the analysis in this paper, with almost no effect on the equilibrium solutions. As such, I allow M to short at $t = 1$. Note also that since prices unequivocally rise between $t = 0$ and $1$, short-sales constraints have no effect on momentum traders at $t = 1$.
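Lemma 3. The stock holdings and price at $t = 2$ are given by the following two cases:

Case 1: $\lambda > \frac{3\Sigma\tilde{Q}}{4(4-\Sigma)}$

$$x_2^{(M)} = \frac{-2\eta\lambda}{2-\Sigma} + \frac{6Q\Sigma}{8(2-\Sigma)}, \qquad x_2^{(A)} = x_2^{(B)} = \frac{\eta\lambda}{2-\Sigma} + \frac{Q(8-7\Sigma)}{8(2-\Sigma)}, \qquad P_2 = V - \frac{\lambda}{2-\Sigma} - \frac{\tilde{Q}(8-7\Sigma)}{8(2-\Sigma)} \tag{9}$$

Case 2: $\lambda \leq \frac{3\Sigma\tilde{Q}}{4(4-\Sigma)}$

$$x_2^{(M)} = \frac{6Q\Sigma}{8(4-\Sigma)}, \qquad x_2^{(A)} = x_2^{(B)} = \frac{Q(16-7\Sigma)}{8(4-\Sigma)}, \qquad P_2 = V - \frac{\tilde{Q}(16-7\Sigma)}{8(4-\Sigma)} \tag{10}$$
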
Proof. I begin by solving for demand and price at $t = 2$. Given CARA preferences, investor demands are given by the following:

$$x_2^{(A)} = \max[\eta(D_2^{(A)} - P_2), 0] = \max[\eta(D_2^{(B)} - P_2), 0] = x_2^{(B)} \tag{11}$$

I impose the market-clearing condition, $x_2^{(A)} + x_2^{(B)} + x_2^{(M)} = Q$, and have the following:

$$Q = 2\eta(E[D_2^{(A)}] - P_2) + \eta[P_2 - P_1]$$
$$P_2 = 2E[D_2^{(A)}] - P_1 - \tilde{Q} = 2V - P_1 - \tilde{Q} \tag{12}$$

I pause here to solve for each investor type’s $t = 1$ expectation of the $t = 2$ price. For type A and B investors, I have: $E_1^{(A)}[P_2] = 2V + \lambda - P_1 - \tilde{Q}$ and $E_1^{(B)}[P_2] = 2V - \lambda - P_1 - \tilde{Q}$.
I now turn to the $t = 1$ equilibrium. Type A and B investors seek to maximize their one-period return. As such, given their mean-variance preferences, $t = 1$ demands are given by the following:

$$x_1^{(A)} = \max\left[\frac{\eta(2V + \lambda - 2P_1 - \tilde{Q})}{\Sigma}, 0\right], \qquad x_1^{(B)} = \max\left[\frac{\eta(2V - \lambda - 2P_1 - \tilde{Q})}{\Sigma}, 0\right] \tag{13}$$

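I once again impose the market-clearing condition, $Q = x_1^{(A)} + x_1^{(B)} + x_1^{(M)}$, i.e.

$$Q = \max\left[\frac{\eta(2V + \lambda - 2P_1 - \tilde{Q})}{\Sigma}, 0\right] + \max\left[\frac{\eta(2V - \lambda - 2P_1 - \tilde{Q})}{\Sigma}, 0\right] + \eta[P_1 - P_0],$$

and am left with three cases: (1) $P_1 \geq \frac{2V + \lambda - \tilde{Q}}{2}$, (2) $\frac{2V + \lambda - \tilde{Q}}{2} > P_1 \geq \frac{2V - \lambda - \tilde{Q}}{2}$, and (3) $P_1 < \frac{2V - \lambda - \tilde{Q}}{2}$.
In case (1), as only momentum traders participate, I trivially have:

$$P_1 = \tilde{Q} + P_0, \qquad x_1^{(A)} = x_1^{(B)} = 0, \qquad x_1^{(M)} = Q \tag{14}$$

In case (2), type B investors are short-sale constrained, while type A investors are long:

$$Q = \frac{\eta(2V + \lambda - 2P_1 - \tilde{Q})}{\Sigma} + \eta[P_1 - P_0]$$
$$P_1 = \frac{2V + \lambda - (1 + \Sigma)\tilde{Q} - \Sigma P_0}{2 - \Sigma} \tag{15}$$

And in case (3), I have:

$$Q = \frac{\eta(2V + \lambda - 2P_1 - \tilde{Q})}{\Sigma} + \frac{\eta(2V - \lambda - 2P_1 - \tilde{Q})}{\Sigma} + \eta[P_1 - P_0]$$
$$P_1 = \frac{4V - (2 + \Sigma)\tilde{Q} - \Sigma P_0}{4 - \Sigma} \tag{16}$$
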
I now consider the t = 0 equilibrium. Here, I have a symmetric equilibrium, with demand $x_0^{(A)} = \max[\eta(E_0^{(A)}[P_1] - P_0), 0] = x_0^{(B)} = \frac{Q}{2}$. Therefore, I have the following:

$$P_0 = E_0^{(A)}[P_1] - \frac{\tilde{Q}}{2} \tag{17}$$
I pause again to solve for the investors' t = 0 expectation of the t = 1 price. At t = 0, type A and B investors believe that short-sales constraints will bind if $P_1 > \frac{2V - \tilde{Q}}{2}$. In the case of binding short-sales constraints, the demand curve is the same as that of Equation 14. At t = 0, type A and B investors can solve Equations 14 and 17 to show that short-sales constraints will not bind at t = 1 given their t = 0 priors. Additionally, at t = 0 the expected market-clearing condition at t = 1 is:

$$Q = \eta(2V - 2P_1 - \tilde{Q}) + \eta(2V - 2P_1 - \tilde{Q}) + \eta(P_1 - P_0)$$

$$P_1 = \frac{4V - 3\tilde{Q} - P_0}{3} \tag{18}$$

Solving Equations 17 and 18 gives me Lemma 1. I then plug Equation 6 into Equations 14-16, and observe that type A investors cannot be priced out by momentum traders, to obtain Lemma 2. I then use Lemma 2 and Equation 12 to obtain Lemma 3.

I focus on Case 1 in Lemmas 2 and 3. In each expression, the first term V represents the expected value of the terminal dividend, the second term ($\frac{\lambda}{2-\Sigma}$ in Lemma 2 and $-\frac{\lambda}{2-\Sigma}$ in Lemma 3) represents the effect of t = 1 disagreement, and the third term represents the combined effect of risk aversion and the existence of momentum traders. Of interest is the disagreement term: disagreement at t = 1 contributes not only to short-term overperformance but also to underperformance at t = 2. The implication of Lemmas 2 and 3 is therefore the following:

Proposition 1. In a universe with sufficiently high short-term disagreement λ, short-term short-sales constraints, and momentum traders, the extent of the short-term overperformance and long-term underperformance increases with λ.
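The chain of substitutions in the proof can be checked numerically. The sketch below is illustrative only: it assumes P0 = V - 9Q̃/8, which is what Equations 17 and 18 imply when solved jointly, and it compares the prices obtained from Equations 15 and 12 with the Case 1 closed forms. The closed form for P1 is derived here from Equation 15 rather than quoted from the text, and all function names and parameter values are hypothetical.

```python
from fractions import Fraction as F


def case1_prices(V, lam, Sigma, Q_tilde):
    """Chain the proof's steps for Case 1 (type A long, type B constrained at t = 1)."""
    # Assumption: P0 obtained by solving Equations 17 and 18 jointly.
    P0 = V - F(9, 8) * Q_tilde
    # Equation 15: t = 1 price when only type A investors and momentum traders hold shares.
    P1 = (2 * V + lam - (1 + Sigma) * Q_tilde - Sigma * P0) / (2 - Sigma)
    # Equation 12: t = 2 price implied by market clearing.
    P2 = 2 * V - P1 - Q_tilde
    return P0, P1, P2


def case1_closed_form(V, lam, Sigma, Q_tilde):
    """Closed forms: P1 derived here from Equation 15, P2 as printed in Equation 9."""
    P1 = V + lam / (2 - Sigma) - Q_tilde * (8 - Sigma) / (8 * (2 - Sigma))
    P2 = V - lam / (2 - Sigma) - Q_tilde * (8 - 7 * Sigma) / (8 * (2 - Sigma))
    return P1, P2


# Hypothetical parameter values; lambda exceeds the Case 1 threshold 3*Sigma*Q_tilde/(4*(4-Sigma)).
V, lam, Sigma, Q_tilde = F(10), F(1), F(1, 2), F(2)
assert lam > 3 * Sigma * Q_tilde / (4 * (4 - Sigma))
P0, P1, P2 = case1_prices(V, lam, Sigma, Q_tilde)
print(P0, P1, P2)
```

Exact rational arithmetic is used so that the comparison between the chained result and the closed forms does not depend on floating-point rounding.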
3.3 Empirical Implications
I pause briefly to consider the results I would expect in my empirical work given Proposition 1. First, Proposition 1 suggests that the effect of disagreement should be positive initially following an IPO, especially given the existence of momentum traders. Second, extrapolating this model to a continuous-time environment, I would expect an increasingly positive effect of disagreement on returns in the first months following an IPO, as gradual information diffusion and momentum effects magnify the effect of initial disagreement. Third, Proposition 1 suggests that medium-to-long-term prices should be negatively impacted by disagreement, as momentum traders exacerbate the effect of prices returning to their fundamental value after disagreement disappears, causing prices to swing below the IPO firm's fundamental value. Fourth, combining Proposition 1 with the work of Harrison and Kreps (1978), Scheinkman and Xiong (2003), and Hong et al. (2006), I would expect disagreement to have a positive effect on turnover in the first months following an IPO. I use these predictions to motivate the empirical analysis that follows, and return to all four of these predictions in Section 7.
4 Data
I used two datasets for this study: the aggregate dataset, which consists of IPO time-series data aggregated by month, and the firm-level dataset, which consists of data for individual firms performing IPOs.
4.1 The Aggregate Dataset
The aggregate dataset consists of monthly data from December 1981 through January 2010. Monthly data for the number of IPOs, the average first-day return for IPOs, and percentage of IPOs priced above the midpoint of the original file-price range are from Ibbotson et al. (2013), through Professor Jay Ritter's online IPO database. All three measures exclude closed-end funds, real estate investment trusts, acquisition companies, IPOs with offer prices below $5, American depositary receipts, limited partnerships, savings and loan associations, and any IPOs excluded from the Center for Research in Security Prices database (CRSP). The percentage of IPOs priced above the midpoint of the original file-price range also excludes IPOs with starting file-price range midpoints below $8.
Similarly, monthly data for the number of SEOs are from Ritter (2004), also through Professor Jay Ritter's online IPO database. In this dataset, SEOs are defined as non-IPO share issuances including at least some shares offered by the company performing the SEO. As such, pure secondaries, as well as SEOs for utilities, non-CRSP-listed companies, and NASDAQ-listed ADRs, are excluded from this dataset.

Aggregate disagreement is from Hong and Sraer (2013), thanks to the generosity of Professors Harrison Hong and David Sraer. The measure of aggregate disagreement used by Hong and Sraer is the monthly beta-weighted average of the dispersion in analysts' forecasts of the long-term growth rate. Note that beta-weighting of analysts' forecasts creates a proxy for aggregate disagreement by underweighting high-beta assets. For more information about this measure, see Hong and Sraer (2011b) and Yu (2011).

Smoothed aggregate disagreement is computed using STATA's exponential smoothing algorithm on the aggregate disagreement data series. I compute smoothed aggregate disagreement to reflect the multi-month horizons during which firms decide whether to launch an IPO, as well as the expanded period during which underwriters must prepare firms before they are ready to issue shares publicly.

There are two important notes to consider when analyzing aggregate disagreement. First, aggregate disagreement exhibits random walk-like behavior: Dickey-Fuller tests fail to reject the null of a unit root, and simple autoregressions of aggregate disagreement on its first, third, sixth, ninth, twelfth, fifteenth, eighteenth, twenty-first, and twenty-fourth lags have significant (p < 0.001) coefficients of 0.98, 0.95, 0.89, 0.83, 0.76, 0.70, 0.63, 0.57, and 0.50, respectively, with $R^2$ values of 0.9626, 0.8963, 0.7971, 0.6813, 0.5604, 0.4656, 0.3770, 0.3018, and 0.2317, respectively. As such, present disagreement is a very strong predictor of future aggregate disagreement. Second, aggregate disagreement behaves highly unusually during the 2000-2002 post-dot-com/September 11th era. As such, I create dummies for years 2000-2002, and cross them with aggregate disagreement, to control for potential structural breaks in my model.
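The two computations described above, exponential smoothing and a regression of the series on one of its own lags, can be sketched as follows. This is a minimal illustration and not the STATA routine used in the paper: the smoothing parameter `alpha`, the choice to start the recursion at the first observation, and the function names are all assumptions, and STATA's own defaults for the starting value and parameter selection may differ.

```python
def exp_smooth(series, alpha):
    """Simple exponential smoothing: s_0 = x_0, s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    smoothed = [series[0]]  # assumption: initialise at the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def lag_slope(series, k):
    """OLS slope from regressing x_t on x_{t-k} (with an intercept)."""
    y, x = series[k:], series[:-k]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical toy series, not the disagreement data used in the paper.
toy = [4.0, 4.2, 4.1, 4.5, 4.8, 4.7, 5.0]
print(exp_smooth(toy, alpha=0.3))
print(lag_slope(toy, k=1))
```

A lag slope close to one, as reported for the first lag in the text, is what a highly persistent series would produce.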
[Figure 1: Time series for aggregate dispersion of analyst forecasts and average first-day IPO return. Series plotted: aggregate disagreement and average day-one return; horizontal axis: date, 1980m1 to 2010m1.]
Monthly data for the Fama-French and broad market factors (SMB, HML, market risk premium, and risk-free rate) are from French (2013), through Professor Ken French's website. Monthly data for the broad-market price/earnings and dividend/price ratios are from Shiller (2006), through Professor Robert Shiller's website. Cross-correlations of all of the right-hand-side variables in the aggregate dataset are summarized in Table 1. Note here the relatively high correlation between aggregate disagreement and the price/earnings ratio, as well as the high negative correlation between aggregate disagreement and the dividend/price ratio. This underscores the fact that multicollinearity may be disrupting my regressions. I use these intuitions extensively as I motivate my empirical models.
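For readers who want to reproduce a cross-correlation table of this kind, the following is a minimal sketch of pairwise Pearson correlations over named monthly series. The column names and values are hypothetical placeholders, not the paper's data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map each ordered pair of column names to its Pearson correlation."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


# Hypothetical toy columns standing in for the right-hand-side variables.
data = {
    "agg_dis": [4.0, 4.5, 5.0, 5.5],
    "pe": [20.0, 22.0, 27.0, 30.0],
    "dp": [0.030, 0.027, 0.021, 0.018],
}
for pair, value in correlation_matrix(data).items():
    print(pair, round(value, 3))
```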
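Table 1: Correlation for aggregate dataset

| Variables | Agg. dis. | Sm. dis. | Rm-Rf | Rf | SMB | HML | P/E | D/P |
|---|---|---|---|---|---|---|---|---|
| Agg. dis. | 1.000 | | | | | | | |
| Sm. dis. | 0.981 | 1.000 | | | | | | |
| Rm-Rf | -0.070 | -0.090 | 1.000 | | | | | |
| Rf | -0.323 | -0.333 | -0.019 | 1.000 | | | | |
| SMB | 0.115 | 0.110 | 0.210 | -0.124 | 1.000 | | | |
| HML | 0.111 | 0.125 | -0.333 | 0.039 | -0.342 | 1.000 | | |
| P/E | 0.403 | 0.389 | 0.043 | -0.528 | 0.088 | 0.016 | 1.000 | |
| D/P | -0.472 | -0.473 | 0.010 | 0.658 | -0.038 | 0.015 | -0.315 | 1.000 |

4.2 The Firm-Level Dataset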
The firm-level dataset consists of forward abnormal returns and turnover data for IPOs between 1996 and 2006. The firm-level dataset consists of three main pieces. The foundational piece is a list of IPOs, IPO prices, and IPO dates from January 1996 to December 2006 (Kenney and Patton 2010), obtained through the generosity of Professors Don Patton and Martin Kenney. This list excludes mutual funds, real estate investment trusts, blank-check companies, asset acquisitions, foreign F-1 filers, small business IPOs (with the exception of internet firms), spin-offs, and all non de-novo issues.

The second piece includes price and SIC industry group data from Compustat (2013), as well as turnover data from CRSP (2013).16 I compute forward abnormal returns for the end of the calendar month during which the IPO took place, as well as the end of calendar months 1-24 after the IPO took place (forward returns N months after IPO), by subtracting the CRSP value-weighted index from forward N-month return. Note here that computational and dataset limitations force me to use end-of-month price data instead of daily price data. Although this does create a small amount of left-hand-side measurement error, it does not detract substantially from my analysis.

Forward turnover is computed by dividing volume by the number of publicly traded shares (both from CRSP), and like forward returns is computed for all IPOs in the Patton dataset. As many firms have only partially available shares outstanding and volume data, all firms without CRSP data are excluded from my dataset. I also exclude from my turnover dataset any firm that does not have volume or shares outstanding data by the sixth complete month following the IPO, to help reduce sample bias.

I compute SIC division using SIC industry group data from Compustat, with the classification system from the Occupational Safety and Health Administration (2013). This enables me to perform fixed-effects regressions with reasonable numbers of groups, while reducing the effect of the error inherent in the SIC group classification process.

The third piece to the firm-level dataset is the set of right-hand-side variables from the aggregate dataset; data from the aggregate dataset are paired with firm-level data by matching the month and year of data from the aggregate dataset with the month and year of the IPO.

I conclude with one final note: when possible, studies should attempt to utilize aggregated time-series data, to avoid the frequency bias addressed in Schultz (2003). The aggregate dataset uses a large enough time period to allow for aggregation; however, the firm-level dataset only uses 1996-2006, a time period far too small to allow for aggregation. I address this limitation to my study further in Sections 7 and 8.

16 Compustat price data are fully adjusted for dividends, splits, and other major distribution events.
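The two firm-level measures described in this section can be sketched as below. This is an illustration under stated assumptions rather than the exact procedure used: it assumes the forward return is measured from the IPO price to the month-end price, that the index return is measured over the same window, and that missing share counts are represented by `None`. All names and numbers are hypothetical.

```python
def simple_return(start_value, end_value):
    """Simple holding-period return between two price (or index) levels."""
    return end_value / start_value - 1.0


def forward_abnormal_return(ipo_price, month_end_price, index_at_ipo, index_at_month_end):
    """Stock return from the IPO price to a month-end price, minus the index return
    over the same window (assumed here to be the value-weighted index)."""
    stock = simple_return(ipo_price, month_end_price)
    market = simple_return(index_at_ipo, index_at_month_end)
    return stock - market


def turnover(volume, shares_outstanding):
    """Monthly volume divided by publicly traded shares; None when data are missing."""
    if not volume or not shares_outstanding:
        return None
    return volume / shares_outstanding


# Hypothetical example: a stock rises 25% while the index rises 10%.
print(forward_abnormal_return(20.0, 25.0, 100.0, 110.0))
print(turnover(500_000, 2_000_000))
```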
5 Methodology
For the purposes of this section, I distinguish between public-side models, in which the primary left-hand-side variables are first-day returns, forward abnormal returns, and forward turnover; and private-side models, in which the primary left-hand-side variables are the number of IPOs, number of SEOs, and percent of IPOs priced above the file-price range midpoint in a given month.17 Note that all of the data in the firm-level dataset are entirely public-side data.18 Given my analytical motivations, I focus first and primarily on public-side models, and turn later to private-side models.

17 I choose this terminology as a means for distinguishing who controls the magnitude of each measure: for public-side measures, markets primarily drive the size of the measure in question (i.e. for price and volume), whereas for private-side measures, firms and their underwriters determine the size of the measure in question (i.e. whether to undertake an IPO, how to price an IPO, etc). Although markets can certainly affect private-side decisions, and vice versa, I use this distinction extensively for my understanding of the results in this paper.
5.1 Public-Side Models

5.1.1 Average First-Day Returns
One of the core questions of this thesis is the effect of aggregate disagreement on initial IPO returns. I therefore begin my analysis using the aggregate dataset, and start with Equation 19:

$$\text{AVG\_FIRST\_DAY\_RET}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + u_t \tag{19}$$

where $\text{AVG\_FIRST\_DAY\_RET}_t$ is the average first-day return for IPOs at time t and $\text{AGG\_DISP}_t$ is the aggregate disagreement at time t.19 As Durbin's alternative test allows me to reject soundly the null of no serial correlation in Equation 19, I turn to the Prais-Winsten GLS estimation method in Prais and Winsten (1954) for the following regressions on $\text{AVG\_FIRST\_DAY\_RET}_t$.20 To further refine the above model, I introduce a number of controls, as in Hong and Sraer (2011b) and Loughran and Ritter (1995):

$$\text{AVG\_FIRST\_DAY\_RET}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + u_t \tag{20}$$

where $I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t$ is 0 if $t \notin [2000, 2002]$ and is $\text{AGG\_DISP}_t$ if $t \in [2000, 2002]$, $\text{Rf}_t$ is the risk-free rate at time t, and $\text{Rm\_Rf}_t$ is the market risk premium at time t. I also add further controls, as in Loughran and Ritter (1995):

$$\text{AVG\_FIRST\_DAY\_RET}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + u_t \tag{21}$$

where $\text{SMB}_t$ and $\text{HML}_t$ are the small (market capitalization) minus big and high (book-to-market ratio) minus low Fama-French factors at time t. Furthermore, given the relationship between aggregate disagreement and both the price/earnings and dividend/price ratios, I add controls for each, as in Hong and Sraer (2011b) and Yu (2011):

$$\text{AVG\_FIRST\_DAY\_RET}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + \beta_7 \text{PE}_t + \beta_8 \text{DP}_t + u_t \tag{22}$$

where $\text{PE}_t$ and $\text{DP}_t$ are the market average price/earnings and dividend/price ratios at time t, respectively. Note: given the potential for multicollinearity in Equation 22, I compute the uncentered variance inflation factor (VIF) after each regression to measure the degree of multicollinearity introduced in each regression.

I also repeat the same regressions for the smoothed aggregate disagreement measure, replacing $\text{AGG\_DISP}_t$ and $I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t$ with $\text{SM\_AGG\_DISP}_t$ and $I_{t \in [2000,2002]} \ast \text{SM\_AGG\_DISP}_t$, respectively, where $\text{SM\_AGG\_DISP}_t$ is the exponentially-smoothed aggregate disagreement measure at time t. Additionally, I study the possibility of using the square of aggregate disagreement, both standard and smoothed, in all of my models. As I find no interesting results while performing this analysis, I omit these results for the sake of parsimony.

18 Note further that this terminology is styled after distinctions made in many investment banks between public-side information (i.e. information available to both investment banking divisions and trading/research desks) and private-side information (i.e. information available to only investment bankers).

19 Note here that t increments on a monthly basis.

20 Note that, since some of the private-side measures had a few missing data points, I could not use OLS with Newey-West HAC standard errors; as such, for consistency's sake, I opted to use Prais-Winsten GLS estimation.
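To make the Prais-Winsten step concrete, here is a minimal sketch for the single-regressor case of Equation 19. It is not the STATA routine used for the reported results: the fixed iteration count, the clamp on the estimated autocorrelation, and the function names are assumptions made for illustration, and the multi-regressor models would need a general least-squares solver.

```python
import math


def ols_two_columns(a, b, y):
    """Least squares of y on exactly two columns a and b (no automatic intercept)."""
    saa = sum(v * v for v in a)
    sbb = sum(v * v for v in b)
    sab = sum(p * q for p, q in zip(a, b))
    say = sum(p * q for p, q in zip(a, y))
    sby = sum(p * q for p, q in zip(b, y))
    det = saa * sbb - sab * sab
    return (say * sbb - sby * sab) / det, (sby * saa - say * sab) / det


def pw_transform(v, rho):
    """Prais-Winsten transform: the first observation is kept, scaled by sqrt(1 - rho^2)."""
    head = [math.sqrt(1.0 - rho * rho) * v[0]]
    return head + [v[t] - rho * v[t - 1] for t in range(1, len(v))]


def prais_winsten(x, y, iterations=10):
    """Iterated feasible GLS for y_t = b0 + b1 * x_t + u_t with AR(1) errors."""
    const = [1.0] * len(x)
    b0, b1 = ols_two_columns(const, x, y)  # start from plain OLS
    rho = 0.0
    for _ in range(iterations):
        resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
        denom = sum(e * e for e in resid[:-1])
        if denom == 0:  # perfect fit: nothing left to estimate
            break
        rho = sum(resid[t] * resid[t - 1] for t in range(1, len(resid))) / denom
        rho = max(-0.99, min(0.99, rho))  # assumption: keep the transform well defined
        b0, b1 = ols_two_columns(
            pw_transform(const, rho), pw_transform(x, rho), pw_transform(y, rho)
        )
    return b0, b1, rho
```

Unlike the Cochrane-Orcutt procedure, the transform keeps the first observation, which matters in short monthly samples.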
5.1.2 Forward Abnormal Returns and Turnover
I investigate two main relationships when working with the firm-level dataset: the effect of aggregate disagreement on forward abnormal returns for IPO stocks, and the effect of aggregate disagreement on forward turnover for IPO stocks. I address each in turn, noting that I use very similar techniques for both analyses. My baseline model (results not presented) for the firm-level regression of forward n-month abnormal returns on aggregate disagreement is the following:

$$\text{FWD\_RET}_i^{(t+n)} = \beta_0 + \beta_1 \text{AGG\_DISP}_t + u_{it} \tag{23}$$

where $\text{FWD\_RET}_i^{(t+n)}$ is the n-month forward post-IPO abnormal return for firm i, and t is the time of IPO. Note that n ranges from 0 (the end of the calendar month of the IPO) to 24. Adding in controls as before gives me the following model (results not presented):

$$\text{FWD\_RET}_i^{(t+n)} = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + \beta_7 \text{PE}_t + \beta_8 \text{DP}_t + u_t \tag{24}$$

However, given that industry sector is potentially correlated with the error term in Equation 24, I turn to fixed-effects regression techniques to help bring this error into my models. As such, my primary model is the following:

$$\text{FWD\_RET}_i^{(t+n)} = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + \beta_7 \text{PE}_t + \beta_8 \text{DP}_t + \sum_{k=0}^{N} \lambda_k \text{INDCL}_i^{(k)} + u_t \tag{25}$$

where $\text{INDCL}_i^{(k)}$ is an indicator variable for the kth SIC industry group, equaling 1 if firm i has been classified as part of sector k. I also consider the effect of removing $\text{PE}_t$ and $\text{DP}_t$, given their high multicollinearity, using the following model:

$$\text{FWD\_RET}_i^{(t+n)} = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + \sum_{k=0}^{N} \lambda_k \text{INDCL}_i^{(k)} + u_t \tag{26}$$
In both of the above regressions, I perform Hausman tests to determine whether there is omitted-variables bias as a result of industry classification. In all cases, I reject the null that the random-effects and fixed-effects models are both equally acceptable, and thus focus on my fixed-effects regression results.

My analysis for forward turnover follows the same approach. I begin with the following baseline model:

$$\text{FWD\_TURN}_i^{(t+n)} = \beta_0 + \beta_1 \text{AGG\_DISP}_t + u_{it} \tag{27}$$

where $\text{FWD\_TURN}_i^{(t+n)}$ is the n-month forward post-IPO turnover for firm i, and t is the time of the IPO. Adding in controls and using the same fixed-effects model as before gives me the following model:

$$\text{FWD\_TURN}_i^{(t+n)} = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + \beta_7 \text{PE}_t + \beta_8 \text{DP}_t + \sum_{k=0}^{N} \lambda_k \text{INDCL}_i^{(k)} + u_t \tag{28}$$

Once again, I reject the null of acceptability of random-effects results, and therefore in all cases use fixed-effects regressions. Once again, given the potential for multicollinearity, I remove the price/earnings and dividend/price ratios in the following model:

$$\text{FWD\_TURN}_i^{(t+n)} = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + \sum_{k=0}^{N} \lambda_k \text{INDCL}_i^{(k)} + u_t \tag{29}$$
Of note is one major issue I face when performing regressions on forward turnover: CRSP lacks data on publicly traded shares for a large number of firms in the immediate months following the IPO. As a result, as I note in Section 4, I limit my dataset to firms that have volume and publicly traded shares data by the sixth month, doing so with the caveat that my analysis of turnover may be incomplete.
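The industry fixed-effects specifications in Equations 25-29 are equivalent to demeaning each variable within its industry group before running least squares. The sketch below shows that within-group transformation for a single regressor; the group labels, data, and function names are hypothetical, and the paper's models include several regressors rather than one.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the within transformation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def within_slope(x, y, groups):
    """Fixed-effects slope of y on x after removing group-specific intercepts."""
    xd = demean_by_group(x, groups)
    yd = demean_by_group(y, groups)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)


# Hypothetical example: two industry groups with different intercepts, common slope of 2.
groups = ["tech", "tech", "tech", "retail", "retail", "retail"]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [12.0, 14.0, 16.0, -3.0, -1.0, 1.0]
print(within_slope(x, y, groups))
```

Demeaning removes any group-level intercept, which is why the slope is unaffected by the large difference in levels between the two hypothetical groups.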
5.2 Private-Side Models
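After analyzing the effects of aggregate disagreement on first-day returns, forward abnormal returns, and forward turnover, I turn my focus to whether the behavior observed in the public-side models is driven by private decision-making by firms (specifically, whether firms consider aggregate disagreement when deciding to undertake and price an IPO) as a potential alternative to Proposition 1. I begin by studying the number of IPOs in a given month, beginning with the following baseline model:

$$\text{NUM\_IPOS}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + u_t \tag{30}$$

where $\text{NUM\_IPOS}_t$ is the number of IPOs launched in month t. I then expand the model, as before, with the following three models:

$$\text{NUM\_IPOS}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + u_t \tag{31}$$

$$\text{NUM\_IPOS}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + u_t \tag{32}$$

$$\text{NUM\_IPOS}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + \beta_7 \text{PE}_t + \beta_8 \text{DP}_t + u_t \tag{33}$$

I repeat the same analysis with $\text{NUM\_SEOS}_t$, the number of SEOs in month t:

$$\text{NUM\_SEOS}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + u_t \tag{34}$$

$$\text{NUM\_SEOS}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + u_t \tag{35}$$

$$\text{NUM\_SEOS}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + u_t \tag{36}$$

$$\text{NUM\_SEOS}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + \beta_7 \text{PE}_t + \beta_8 \text{DP}_t + u_t \tag{37}$$

I also use the same modeling methodology with $\text{PCT\_ABV\_MED}_t$, the percent of IPOs priced above the midpoint of the file-price range in month t:

$$\text{PCT\_ABV\_MED}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + u_t \tag{38}$$

$$\text{PCT\_ABV\_MED}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + u_t \tag{39}$$

$$\text{PCT\_ABV\_MED}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + u_t \tag{40}$$

$$\text{PCT\_ABV\_MED}_t = \beta_0 + \beta_1 \text{AGG\_DISP}_t + \beta_2 I_{t \in [2000,2002]} \ast \text{AGG\_DISP}_t + \beta_3 \text{Rf}_t + \beta_4 \text{Rm\_Rf}_t + \beta_5 \text{SMB}_t + \beta_6 \text{HML}_t + \beta_7 \text{PE}_t + \beta_8 \text{DP}_t + u_t \tag{41}$$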
In Equations 30-41, I perform Durbin's alternative test, and soundly reject the null of no serial correlation. As such, in all of the above models, as before, I use the Prais-Winsten GLS estimation method in lieu of OLS.
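As a rough illustration of how serial correlation in regression residuals can be detected, the sketch below computes the Durbin-Watson statistic. This is a simpler diagnostic swapped in for illustration and is not Durbin's alternative test used in the paper, which instead regresses the residuals on their own lag and the original regressors. The function name and the example residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared first differences over sum of squares.

    Values near 2 suggest no first-order autocorrelation, values near 0 suggest
    positive autocorrelation, and values near 4 suggest negative autocorrelation.
    """
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


# Hypothetical residuals: a slowly drifting (persistent) series gives a value well below 2.
persistent = [1.0, 0.9, 0.8, 0.6, 0.3, -0.1, -0.4, -0.6]
print(durbin_watson(persistent))
```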
5.3 Empirical Predictions
Given Proposition 1, I expect the following results for the above empirical models: (1) a significant, positive relationship for the models in Equations 19-22; (2) an initially significant, positive (for low n) and subsequently negative (for high n) relationship for the models in Equations 25 and 26; (3) an initially significant, positive relationship for the models in Equations 28 and 29; and (4) no material relationship for the private-side models in Equations 30-41. I return to these predictions repeatedly in the following sections.
6 Results

6.1 Public-Side Models

6.1.1 Average First-Day Returns
The results of my Prais-Winsten regressions for average first-day returns are summarized in Table 2. Here, I turn my focus to the coefficient on aggregate disagreement. As expected, I obtain positive, significant coefficients on aggregate disagreement for each of the four regressions, indicating that high-disagreement periods likely have bigger first-day IPO returns, a result consistent with both disagreement theory and Proposition 1.21 Moreover, I observe that a one standard deviation increase in aggregate disagreement (i.e. an increase of 0.897) leads to a 0.192, 0.392, 0.383, and 0.279 standard deviation increase in day-one returns in each of the four models, respectively. This, along with r2 values of 0.003, 0.047, 0.081, and 0.145, respectively, indicate that although the model unsurprisingly does not explain most of the variance in first-day IPO returns, aggregate disagreement is moderately economically significant as an explanatory variable for average first-day returns. As a note, similar but marginally less statistically and economically significant results were obtained for regressions 21 Note, however, that I only have significance of 10% for the first and fourth regressions. For the second and third regressions, I have 5% significance.
136
using smoothed aggregate disagreement as an explanatory variable. Of note in Table 2 is the decrease in the scale and significance of the coefficient on aggregate disagreement after controlling for market average price/earnings and dividend/price ratios. This is unsurprising: aggregate disagreement has a correlation of 0.403 with the price/earnings ratio, and -0.472 with the dividend/price ratio. This, however, suggests that multicollinearity (and thus overly large standard errors) might be introduced into the model by price/equity and dividend/priceâ&#x20AC;&#x201D;in fact, the VIF value for aggregate dispersion grows from 4.29 to 13.44 between models 3 and 4 in Table 2. As such, it is somewhat unclear whether the drop in statistical and economic significance between models 3 and 4 in Table 2 is the result of omitted variables bias or multicollinearity. 6.1.2
Forward Abnormal Returns
The results of my regressions on forward abnormal returns are summarized in Tables 3-7. Interestingly, I observe highly statistically significant, positive coefficients on aggregate disagreement (at the 0.1% level except month 13) from months 0-13 post-IPO. Between month 14 and 18, I fail to reject the null that the coefficients on aggregate disagreement equal zero. From month 19 through at least month 24, I observe highly statistically significant (at the 0.1% level except month 19) negative coefficients on aggregate disagreement. The coefficients on aggregate disagreement from Tables 37 are summarized in Figure 2. Interestingly, an increase in aggregate disagreement has an increasingly positive effect on forward abnormal returns in the first three months following an IPO, a decreasingly positive effect on forward abnormal returns after month three, and an increasingly negative effect on forward abnormal returns after month 18. I discuss this further in Section 7. The regressions in Tables 3-7 explain only a moderate fraction of the variance in forward abnormal returns, with r2 values ranging from 0.050 to 0.159. Still, the economic significance of the coefficient on aggregate disagreement is substantial: a one standard deviation change in aggregate disagreement causes a
137
Table 2: Effect of aggregate disagreement on first-day returns
Aggregate disagreement
(1)
(2)
(3)
(4)
0.0394+ (0.0226)
0.0725∗ (0.0312)
0.0709∗ (0.0309)
0.0516+ (0.0279)
-0.0127 (0.0139)
-0.0125 (0.0138)
-0.0239+ (0.0123)
2000-2002xAggregate dis. Price/earnings
0.00226∗ (0.00115)
Dividend/price
-8.983∗∗∗ (2.043)
HML
-0.151 (0.285)
-0.192 (0.284)
SMB
0.674∗∗ (0.230)
0.683∗∗ (0.232)
Risk-free rate
16.49+ (8.531)
18.16∗ (8.477)
45.78∗∗∗ (9.615)
Market risk premium
0.526∗∗ (0.162)
0.443∗ (0.176)
0.424∗ (0.177)
-0.0206 (0.103)
-0.224 (0.147)
-0.224 (0.146)
-0.0499 (0.137)
336 0.003 0.000 1.145
336 0.047 0.036 4.083
336 0.081 0.065 4.853
336 0.145 0.124 6.931
Constant Observations R2 Adjusted R2 F
Standard errors in parentheses + p < 0.10, ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001
138
139
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
Observations R2 Adjusted R2 F
Constant
Mkt. risk prem.
Risk-free rate
SMB
HML
Dividend/price
Price/earnings
2000-2002xAggregate dis.
Aggregate disagreement
(1) Ret. EOM 0.223*** (0.0528) -0.0749*** (0.0163) 0.00716 (0.00444) -35.98*** (10.43) -3.559*** (0.667) 0.303 (0.436) 65.13*** (15.16) -0.274 (0.466) -0.474 (0.449) 1958 0.159 0.152 45.89
(2) Ret. 1 mo. 0.513*** (0.0630) -0.169*** (0.0195) 0.0147** (0.00530) -6.558 (12.43) -4.957*** (0.796) -1.431** (0.521) 72.31*** (18.03) -0.312 (0.557) -2.397*** (0.535) 1973 0.150 0.142 43.01
(3) Ret. 2 mos. 0.813*** (0.0799) -0.294*** (0.0248) 0.0275*** (0.00674) 27.18+ (15.74) -4.737*** (1.011) -2.139** (0.661) 87.47*** (22.87) -0.677 (0.707) -4.570*** (0.678) 1984 0.133 0.125 37.62
(4) Ret. 3 mos. 0.905*** (0.0961) -0.344*** (0.0298) 0.0352*** (0.00810) 40.16* (18.90) -7.010*** (1.213) -3.769*** (0.794) 77.47** (27.47) -2.716** (0.849) -5.262*** (0.815) 1988 0.125 0.118 35.23
(5) Ret. 4 mos. 0.766*** (0.105) -0.326*** (0.0325) 0.0392*** (0.00883) 27.41 (20.62) -8.679*** (1.324) -4.909*** (0.867) 74.11* (29.98) -3.448*** (0.926) -4.464*** (0.889) 1989 0.120 0.112 33.49
Table 3: Effect of aggregate disagreement on post-IPO abnormal returns months 0-4
140
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
Observations R2 Adjusted R2 F
Constant
Market risk premium
Risk-free rate
SMB
HML
Dividend/price
Price/earnings
2000-2002xAggregate dis.
Aggregate disagreement
(1) Ret. 5 mos. 0.616*** (0.0991) -0.303*** (0.0307) 0.0433*** (0.00834) 31.92 (19.46) -6.187*** (1.250) -2.790*** (0.820) 68.77* (28.31) -4.611*** (0.874) -3.964*** (0.840) 1990 0.110 0.102 30.33
(2) Ret. 6 mos. 0.446*** (0.0888) -0.257*** (0.0275) 0.0479*** (0.00747) 32.42+ (17.43) -2.816* (1.120) -0.849 (0.734) 56.25* (25.35) -3.525*** (0.784) -3.379*** (0.752) 1990 0.101 0.094 27.77
(3) Ret. 7 mos. 0.436*** (0.0891) -0.274*** (0.0276) 0.0548*** (0.00751) 41.94* (17.50) -0.420 (1.126) 0.753 (0.738) 57.82* (25.47) -2.630*** (0.788) -3.669*** (0.755) 1989 0.106 0.098 29.27
(4) Ret. 8 mos. 0.359*** (0.0877) -0.244*** (0.0273) 0.0548*** (0.00737) 44.01* (17.21) 0.0245 (1.105) 0.0208 (0.726) 38.28 (24.98) -0.954 (0.774) -3.336*** (0.744) 1984 0.094 0.086 25.43
(5) Ret. 9 mos. 0.391*** (0.0896) -0.247*** (0.0279) 0.0511*** (0.00752) 56.93** (17.55) -0.0498 (1.128) 0.108 (0.741) 17.50 (25.49) -0.691 (0.790) -3.559*** (0.759) 1979 0.085 0.077 22.76
Table 4: Effect of aggregate disagreement on post-IPO abnormal returns months 5-9
141
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
Observations R2 Adjusted R2 F
Constant
Market risk premium
Risk-free rate
SMB
HML
Dividend/price
Price/earnings
2002-2002xAggregate dis.
Aggregate disagreement
(1) Ret. 10 mos. 0.410*** (0.0825) -0.230*** (0.0257) 0.0453*** (0.00692) 73.38*** (16.14) -0.179 (1.038) -0.402 (0.682) 3.361 (23.41) -0.248 (0.726) -3.768*** (0.698) 1972 0.083 0.075 22.05
(2) Ret. 11 mos. 0.343*** (0.0836) -0.192*** (0.0260) 0.0401*** (0.00701) 81.00*** (16.35) -2.807** (1.053) -1.773* (0.692) -21.14 (23.71) -1.640* (0.739) -3.387*** (0.708) 1969 0.079 0.071 20.89
(3) Ret. 12 mos. 0.337*** (0.0911) -0.178*** (0.0284) 0.0394*** (0.00764) 90.40*** (17.82) -3.412** (1.147) -2.664*** (0.754) -38.31 (25.86) -1.912* (0.802) -3.453*** (0.771) 1961 0.070 0.062 18.34
(4) Ret. 13 mos. 0.234** (0.0902) -0.144*** (0.0281) 0.0347*** (0.00757) 82.45*** (17.61) â&#x2C6;&#x2019;2.194+ (1.138) -1.638* (0.747) -57.26* (25.63) -1.739* (0.794) -2.719*** (0.763) 1950 0.062 0.054 16.01
(5) Ret. 14 mos. 0.147 (0.0965) -0.119*** (0.0301) 0.0354*** (0.00808) 78.59*** (18.83) -1.173 (1.217) -1.290 (0.799) -62.24* (27.40) -0.267 (0.849) -2.307** (0.817) 1941 0.053 0.045 13.54
Table 5: Effect of aggregate disagreement on post-IPO abnormal returns months 10-14
142
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
Observation R2 Adjusted R2 F
Constant
Market risk premium
Risk-free rate
SMB
HML
Dividend/price
Price/earnings
2000-2002xAggregate dis.
Aggregate disagreement
(1) Ret. 15 mos. 0.0616 (0.0919) -0.0794** (0.0287) 0.0269*** (0.00767) 69.34*** (17.90) -1.270 (1.156) -1.491* (0.760) -88.30*** (26.00) -0.388 (0.808) -1.528* (0.777) 1934 0.055 0.046 13.82
(2) Ret. 16 mos. -0.0286 (0.0922) -0.0453 (0.0287) 0.0213** (0.00769) 62.00*** (17.97) -1.369 (1.160) â&#x2C6;&#x2019;1.413+ (0.763) -111.0*** (26.08) -0.829 (0.811) -0.808 (0.779) 1918 0.056 0.048 14.09
(3) Ret. 17 mos. -0.122 (0.0993) -0.00668 (0.0309) 0.0182* (0.00826) 59.68** (19.33) â&#x2C6;&#x2019;2.342+ (1.247) -2.306** (0.820) -126.6*** (28.00) -1.223 (0.871) -0.247 (0.838) 1906 0.054 0.046 13.58
(4) Ret. 18 mos. -0.113 (0.104) -0.00633 (0.0325) 0.0175* (0.00870) 61.67** (20.31) -3.391** (1.310) -2.318** (0.861) -122.3*** (29.59) -2.717** (0.916) -0.303 (0.880) 1892 0.050 0.041 12.30
(5) Ret. 19 mos. -0.389* (0.122) 0.0755* (0.0380) 0.00823 (0.0102) 29.85 (23.75) -5.383*** (1.532) -4.265*** (1.007) -152.2*** (34.64) -3.448** (1.071) 1.793+ (1.030) 1881 0.050 0.041 12.30
Table 6: Effect of aggregate disagreement on post-IPO abnormal returns months 15-19
143
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
Observations R2 Adjusted R2 F
Constant
Market risk premium
Risk-free rate
SMB
HML
Dividend/price
Price/earnings
2000-2002xAggregate dis
Aggregate disagreement
-0.448*** (0.103) 0.0992** (0.0320) 0.00230 (0.00853) 19.69 (19.93) -3.200* (1.286) -3.295*** (0.845) -175.4*** (29.05) -2.068* (0.899) 2.388** (0.865) 1875 0.068 0.059 16.90
(1) Ret. 20 mos.
(2) Ret. 21 mos. -0.512*** (0.102) 0.123*** (0.0318) -0.00213 (0.00855) 15.47 (19.95) -1.940 (1.280) -2.352** (0.842) -180.7*** (28.90) -0.836 (0.893) 2.818** (0.864) 1862 0.070 0.061 17.36
(3) Ret. 22 mos. -0.550*** (0.102) 0.135*** (0.0316) -0.00452 (0.00849) 0.776 (19.82) -2.245+ (1.273) -2.199** (0.837) -204.6*** (28.67) -0.664 (0.887) 3.336*** (0.859) 1855 0.069 0.060 17.05
(4) Ret. 23 mos. -0.580*** (0.111) 0.144*** (0.0345) -0.00347 (0.00926) -2.798 (21.64) -1.869 (1.382) -2.382** (0.911) -217.4*** (31.14) -0.323 (0.964) 3.553*** (0.936) 1834 0.063 0.055 15.37
(5) Ret. 24 mos. -0.495*** (0.101) 0.136*** (0.0316) -0.00221 (0.00846) 11.67 (19.77) -2.466+ (1.263) -2.901*** (0.832) -228.6*** (28.44) 0.198 (0.882) 2.900*** (0.857) 1819 0.075 0.066 18.31
Table 7: Effect of aggregate disagreement on post-IPO abnormal returns months 20-24
Figure 2: Coefficients for regressions of monthly forward abnormal returns on aggregate dispersion of analyst forecasts (vertical axis: effect of a 1.1 stdev increase in agg. dis. on returns; horizontal axis: months after IPO)
0.62 standard deviation increase in two-month forward abnormal returns, and causes similarly large changes in forward abnormal returns in other months. One concern in Tables 3-7 is multicollinearity: with price/equity and dividend/price in the regressions, the VIF for aggregate disagreement exceeds 80 in all 25 regressions. I therefore rerun the same regressions excluding both ratios. Although all VIF values then fall below 15, the economic and statistical significance of the coefficients on aggregate disagreement changes only marginally, so I omit the results of these regressions.
The results using smoothed aggregate disagreement as the explanatory variable closely mirror those using simple aggregate disagreement, so I likewise omit their summary tables from this paper.
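As an illustration of the multicollinearity check discussed above, the following is a minimal sketch of how a variance inflation factor can be computed: each regressor is regressed on the others, and VIF = 1 / (1 - R2) of that auxiliary regression. The function name `vif` and the synthetic data are assumptions made for illustration; the paper does not state which software or routine produced its VIF values.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (observations x regressors).

    X should hold the regressors only; a constant is added to each
    auxiliary regression inside the function.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        tss = ((y - y.mean()) ** 2).sum()
        r2 = 1.0 - (resid @ resid) / tss
        out[j] = 1.0 / (1.0 - r2)
    return out

if __name__ == "__main__":
    # Synthetic (hypothetical) regressors: c is almost a copy of a.
    rng = np.random.default_rng(0)
    a = rng.normal(size=500)
    b = rng.normal(size=500)
    c = a + 0.05 * rng.normal(size=500)
    print(vif(np.column_stack([a, b, c])))
```

In this toy setup the two nearly collinear columns receive very large VIF values while the independent column stays close to one, which is the pattern that motivates dropping the two ratio controls.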
6.1.3
Forward Turnover
In my study of the effect of aggregate disagreement on forward turnover, I begin with regressions excluding the price/equity and dividend/price controls. The results of my fixed-effects firm-level regression models of forward turnover are summarized in Tables 8-12. As before, the coefficients on aggregate disagreement from these tables are summarized in Figure 3. Interestingly, the coefficients on aggregate disagreement for months 0-4 are significant, positive, and increasing, indicating that higher disagreement has an increasingly positive effect on forward turnover. This result is consistent with the behavior of speculative bubbles in Hong et al. (2006), and suggests that there may be a momentum effect magnifying the effect of disagreement immediately post-IPO. Moreover, I cannot reject the null that the coefficients on aggregate disagreement from months 6-24 equal zero. This is consistent with the disappearance of the speculative bubble after the first months following the IPO.
Two features of these tables are of interest. First, the regressions capture very little of the variance in forward turnover, with a maximum R2 of 0.048. Even so, at the four-month point, a one standard deviation increase in aggregate disagreement can increase four-month forward post-IPO turnover by 0.37 standard deviations. Second, the standard errors in all of these regressions are fairly high relative to those in my returns regressions, likely because of the smaller sample used here.
I also consider models controlling for the market-wide price/equity and dividend/price ratios. However, as these regressions have exceptionally high VIF values (in all cases greater than 80), and different results from those of Tables 8-12, I omit them in this paper.
Although the predictive quality of this model is effectively unchanged when controlling for the price/equity and dividend/price ratios, the coefficients on aggregate disagreement for the months immediately following the IPO fall slightly, as expected. However, the overall pattern (initially positive, then insignificant coefficients) remains unchanged. I discuss this pattern in greater detail in Section 7.
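The effect sizes quoted in this section are stated in standard deviation units (for example, 0.37 standard deviations of turnover at the four-month point). A minimal sketch of that conversion, assuming it is the usual rescaling of a raw coefficient by the sample standard deviations of the regressor and the outcome, is given below. The function name and the example numbers are hypothetical and are not taken from the paper's data.

```python
import numpy as np

def standardized_effect(coef, x, y):
    """Change in y, in standard deviations of y, for a one-standard-deviation rise in x.

    coef is the raw regression coefficient on x; x and y are the samples
    used to compute the two standard deviations.
    """
    sd_x = np.std(np.asarray(x, dtype=float), ddof=1)
    sd_y = np.std(np.asarray(y, dtype=float), ddof=1)
    return coef * sd_x / sd_y

if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    disagreement = [0.8, 1.1, 0.9, 1.4, 1.0]
    turnover = [2.0, 3.5, 2.2, 4.1, 2.9]
    print(standardized_effect(1.08, disagreement, turnover))
```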
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
Observations R2 Adjusted R2 F
Constant
Market risk premium
Risk-free rate
SMB
HML
2000-2002xAggregate dis.
Aggregate disagreement
(1) Turn. EOM 0.676* (0.276) -0.204* (0.101) -6.374 (4.531) -1.751 (2.988) 86.66 (98.04) 0.146 (3.174) 0.146 (1.328) 545 0.023 -0.004 2.116
(2) Turn. 1 mo. 0.484+ (0.252) -0.110 (0.0932) -11.37** (4.131) -5.486* (2.720) -65.80 (90.59) -1.092 (2.911) -1.092 (1.220) 587 0.030 0.004 2.915
(3) Turn. 2 mos. 0.723*** (0.191) -0.201** (0.0709) -7.454* (3.141) -3.343 (2.083) 11.79 (69.21) -1.971 (2.237) -1.971 (0.926) 642 0.039 0.016 4.274
(4) Turn. 3 mos. 0.978*** (0.224) -0.301*** (0.0839) -6.392+ (3.762) -4.434+ (2.507) 153.5+ (82.39) 1.890 (2.649) 1.890 (1.093) 689 0.048 0.027 5.639
(5) Turn. 4 mos. 1.080*** (0.260) -0.388*** (0.0970) -9.9424* (4.402) -5.299+ (2.917) -8.452 (97.09) -3.657 (3.084) -3.657 (1.270) 736 0.039 0.019 4.850
Table 8: Effect of aggregate disagreement on post-IPO turnover months 0-4
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
Observations R2 Adjusted R2 F
Constant
Market risk premium
Risk-free rate
SMB
HML
2000-2002xAggregate dis.
Aggregate disagreement
(1) Turn. 5 mos. 0.865** (0.270) -0.305** (0.101) -7.639+ (4.572) -7.605* (3.016) -82.79 (100.7) -1.722 (3.198) -1.530 (1.322) 784 0.027 0.008 3.533
(2) Turn. 6 mos. 0.319 (0.331) -0.145 (0.123) -8.135 (5.644) -4.115 (3.692) -80.35 (125.3) -2.010 (3.971) 1.046 (1.628) 839 0.008 -0.011 1.039
(3) Turn. 7 mos. 0.397* (0.200) -0.161* (0.0746) 1.299 (3.406) -0.352 (2.228) -105.9 (75.70) -0.151 (2.397) 0.401 (0.985) 833 0.011 -0.007 1.501
(4) Turn. 8 mos. 0.237 (0.175) -0.0585 (0.0651) -3.712 (2.973) -5.134** (1.944) -112.4+ (66.25) -1.056 (2.095) 1.046 (0.859) 831 0.016 -0.002 2.186
(5) Turn. 9 mos. 0.113 (0.200) -0.0185 (0.0746) -1.451 (3.392) -3.853+ (2.220) -114.1 (75.53) 2.477 (2.389) 1.636+ (0.982) 829 0.011 -0.008 1.450
Table 9: Effect of aggregate disagreement on post-IPO turnover months 5-9
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
Observations R2 Adjusted R2 F
Constant
Market risk premium
Risk-free rate
SMB
HML
2000-2002xAggregate dis.
Aggregate disagreement
(1) Turn. 10 mos. 0.140 (0.178) -0.0332 (0.0666) -2.370 (3.033) -4.827* (1.986) -112.5+ (67.28) 0.771 (2.134) 1.446+ (0.876) 822 0.014 -0.004 1.964
(2) Turn. 11 mos. 0.201 (0.186) -0.0885 (0.0697) -7.144* (3.187) -5.421** (2.086) -125.2+ (70.22) -2.563 (2.255) 1.287 (0.916) 817 0.018 -0.000 2.481
(3) Turn. 12 mos. -0.0377 (0.197) 0.0210 (0.0736) -8.233* (3.370) -6.445** (2.204) -271.0*** (74.70) -4.238 (2.388) 2.930** (0.969) 814 0.026 -0.007 3.483
(4) Turn. 13 mos. 0.00197 (0.221) -0.0151 (0.0824) -8.390* (3.784) -8.330*** (2.473) -256.9** (83.61) -2.960 (2.715) 2.723* (1.087) 808 0.026 -0.008 3.562
(5) Turn. 14 mos. -0.264 (0.248) 0.0857 (0.0929) -8.841* (4.253) -2.833 (2.784) -110.9 (93.87) -1.966 (3.061) 3.374** (1.221) 800 0.009 -0.009 1.251
Table 10: Effect of aggregate disagreement on post-IPO turnover months 10-14
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
Observations R2 Adjusted R2 F
Constant
Market risk premium
Risk-free rate
SMB
HML
2000-2002xAggregate dis.
Aggregate disagreement
(1) Turn. 15 mos. -0.0350 (0.218) -0.0246 (0.0818) -9.109* (3.737) -7.513** (2.447) -185.99* (82.32) -7.763** (2.686) 2.780** (1.073) 797 0.026 0.007 3.485
(2) Turn. 16 mos. -0.195 (0.210) 0.00975 (0.0788) -3.569 (3.594) -3.835 (2.362) -219.4** (79.19) -4.198 (2.603) 3.622** (1.034) 788 0.021 0.002 2.790
(3) Turn. 17 mos. -0.360 (0.326) 0.0810 (0.122) -8.880 (5.560) -7.326* (3.650) -171.8 (122.8) -5.740 (4.020) 4.392** (1.607) 779 0.010 -0.009 1.317
(4) Turn. 18 mos. -0.0833 (0.269) -0.05377 (0.100) -5.962 (4.572) -4.811 (2.992) -179.7+ (100.3) 8.523** (3.299) 3.157* (1.323) 769 0.021 0.002 2.693
(5) Turn. 19 mos. -0.266 (0.268) 0.0421 (0.100) -9.957* (4.532) -8.355** (2.978) -187.4+ (99.01) -8.059* (3.269) 3.956** (1.317) 757 0.022 0.002 2.719
Table 11: Effect of aggregate disagreement on post-IPO turnover months 15-19
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
Observations R2 Adjusted R2 F
Constant
Market risk premium
Risk-free rate
SMB
HML
2000-2002xAggregate dis.
Aggregate disagreement
(1) Turn. 20 mos. -0.372 (0.230) 0.0950 (0.0859) -6.350 (3.879) -8.338** (2.549) -239.3** (84.40) -4.132 (2.803) 4.480*** (1.129) 744 0.031 0.011 3.872
(2) Turn. 21 mos. -0.260 (0.267) 0.0710 (0.0998) -7.104 (4.501) -8.090** (2.956) -382.9*** (97.71) -1.314 (3.247) 4.568*** (1.312) 740 0.031 0.011 3.823
(3) Turn. 22 mos. -0.448+ (0.268) 0.102 (0.100) -0.996 (4.530) -3.280 (2.972) -424.0*** (98.63) -1.439 (3.266) 5.509*** (1.318) 732 0.033 0.013 4.109
(4) Turn. 23 mos. -0.131 (0.354) -0.0829 (0.132) 4.050 (5.984) 0.818 (3.923) -353.1** (129.8) -1.289 (4.306) 3.931* (1.737) 726 0.022 0.001 2.640
(5) Turn. 24 mos. -0.244 (0.323) -0.0276 (0.121) -3.042 (5.458) -4.804 (3.586) -274.7* (117.6) -1.641 (3.907) 4.147** (1.583) 717 0.020 -0.001 2.407
Table 12: Effect of aggregate disagreement on post-IPO turnover months 20-24
Figure 3: Coefficients for regressions of monthly forward turnover on aggregate dispersion of analyst forecasts excluding price/equity and dividend/price controls (vertical axis: effect of a 1.1 stdev increase in agg. dis. on turnover; horizontal axis: months after IPO)
Because the regressions using smoothed aggregate disagreement as an explanatory variable (both with and without price/equity and dividend/price controls) produce results similar to those of the regressions using simple disagreement, I omit tabular summaries of these regressions.
6.2
Private-Side Models
Although the results above suggest that short-run IPO underpricing and long-run IPO underperformance may be partially explained by a speculative motive, I also explore the possibility that firms and their underwriters consider aggregate disagreement when deciding (1) whether to initiate an IPO, and (2) the appropriate IPO price. Accordingly, I consider three monthly measures: the number of IPOs, the number of SEOs, and the percent of IPOs priced above the median of the file price range.
Table 13: Effect of aggregate disagreement on number of IPOs Aggregate dis. 2000-2002xAgg. dis.
(1) -3.250 (2.316)
Price/earnings
(2) -1.959 (2.783) -1.168 (1.230)
(3) -2.058 (2.798) -1.089 (1.241)
-102.5 (841.7) 4.660 (10.47) 30.50* (13.31) 349 0.009 -0.003 0.763
-11.97 (18.61) -4.382 (15.37) -157.5 (850.4) 1.962 (11.34) 31.18* (13.40) 349 0.010 -0.007 0.574
Dividend/price HML SMB Risk-free rate Mkt. risk prem. Constant Observations R2 Adjusted R2 F
35.11** (10.60) 349 0.006 0.003 1.967
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
(4) -3.653 (2.786) -1.414 (1.213) -0.0205 (0.131) -648.4** (242.8) -10.40 (18.74) -4.651 (15.47) 943.6 (923.3) 2.203 (11.43) 51.79*** (14.98) 349 0.031 0.008 1.350
Figure 4: Coefficients for regressions of monthly forward turnover on aggregate dispersion of analyst forecasts including price/equity and dividend/price controls (vertical axis: effect of a 1.1 stdev increase in agg. dis. on turnover; horizontal axis: months after IPO)
6.2.1
Number of IPOs and SEOs
I begin my analysis of private-side decision-making with Table 13, which presents the results of my Prais-Winsten GLS regressions of the number of IPOs on aggregate disagreement. In all cases, I fail to reject the null that the coefficient on aggregate disagreement is equal to zero (and in fact fail to do so for most of the terms in the regression). Given the low predictive value of the regressions (R2 ≤ 0.031) and the across-the-board economically and statistically insignificant coefficients on aggregate disagreement, it is likely that no meaningful relationship exists between disagreement and the number of IPOs in a given period, which also indicates that aggregate disagreement likely does not affect a firm’s decision to undertake an IPO. I obtain fairly similar results when performing similar regressions on the number of SEOs in a given month, the results of
Table 14: Effect of aggregate disagreement on number of SEOs Aggregate dis. 2000-2002xAgg. dis.
(1) -1.511 (2.731)
Price/earnings
(2) -2.262 (3.514) -0.127 (1.516)
(3) -2.230 (3.535) -0.104 (1.526)
-2642.2** (993.7) 0.278 (15.67) 52.86*** (15.56) 277 0.026 0.011 1.796
-13.01 (31.60) -12.38 (24.02) -2723.3** (1012.0) -4.278 (19.32) 53.16*** (15.67) 277 0.027 0.005 1.229
Dividend/price HML SMB Risk-free rate Mkt. risk prem. Constant Observations R2 Adjusted R2 F
37.70** (12.19) 277 0.000 -0.003 0.122
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
(4) -5.039 (3.587) -1.139 (1.541) 0.697 (0.565) -487.4 (354.9) -3.270 (31.51) -8.107 (23.91) -412.5 (1251.3) -1.145 (19.20) 55.34* (23.46) 277 0.060 0.032 2.138
which are summarized in Table 14.[22] Here again I universally fail to reject the null that the coefficient on aggregate disagreement is equal to zero, and once again my regression models have low predictive quality (R2 ≤ 0.060). Given the low economic and statistical significance of the coefficients, the likely implication of these regressions is that a firm and its underwriters’ decision to issue shares, whether initially or in follow-on offerings, is not influenced by aggregate disagreement.
[22] Note: although SEOs are not directly related to IPOs, I expect in most cases that firm IPO and SEO behavior would be fairly similar to each other for various levels of disagreement.
6.2.2
Percent of IPOs Priced Above File-Price-Range Median
In the previous section, I consider the possibility that a firm’s choice to go forward with an IPO is influenced by aggregate disagreement. In this section, I consider whether a firm’s IPO price is influenced by aggregate disagreement given the decision to undertake an IPO. As a proxy for firms’ pricing decision-making, I use the percent of IPOs priced above the median of the file-price range. Although this measure is undoubtedly an imperfect proxy for pricing decision-making, it does reflect the firm’s decision after surveying potential buyers, as well as the broad market. Moreover, this pricing decision occurs after the firm and its underwriters have objectively set the valuation range for the company issuing shares.
The results of my analysis of the effect of aggregate disagreement on the percent of IPOs priced above the median of the file-price range are summarized in Table 15. Once again, I fail to reject the null that the coefficients on aggregate disagreement are equal to zero. Although these models capture slightly more variance than the models for the number of IPOs and SEOs (R2 ∈ [0.042, 0.140]), the coefficients on aggregate disagreement are still highly economically insignificant, indicating that the relationship between aggregate disagreement and pricing is negligible. As such, I cannot conclude that firms and their underwriters are, given knowledge of aggregate disagreement, changing their share issuance decisions. Furthermore, I cannot conclude that
Table 15: Effect of aggregate disagreement on percent of starting values above median Aggregate disagreement 2000-2002xAggregate dis. Price/earnings
(1) 1.821 (2.806)
(2) 4.505 (4.107) -2.081 (1.823)
(3) 4.639 (4.107) -2.299 (1.826)
-727.9 (1105.0) 55.80* (25.19) 23.63 (19.38) 335 0.063 0.051 5.528
68.56 (44.12) 63.12+ (36.16) -566.7 (1107.7) 70.01* (27.70) 22.16 (19.39) 335 0.074 0.057 4.354
Dividend/price HML SMB Risk-free rate Mkt. risk prem. Constant Observations R2 Adjusted R2 F
31.59* (12.75) 335 0.042 0.039 14.50
(4) -0.854 (3.785) -2.768+ (1.673) 0.709*** (0.159) -769.6** (273.4) 69.37 (43.52) 70.66+ (36.02) 3357.7* (1333.0) 72.91** (27.25) 35.62+ (18.47) 335 0.140 0.119 6.628
1. Standard errors in parentheses 2. +p<0.10, *p<0.05, **p<0.01, ***p<0.001
firms are altering share offer prices as a result of varying levels of disagreement. As I will further explore in the following section, these results all suggest that a speculative motive more likely drives the strong effect disagreement has on IPO pricing and turnover.
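The private-side models in Section 6.2 are estimated by Prais-Winsten GLS. The following is a minimal sketch of that general technique, assuming AR(1) errors and an iterated estimate of the autocorrelation parameter; the paper does not say whether a two-step or iterated variant was used, nor which software was used, so the function `prais_winsten` and the simulated series below are illustrative assumptions rather than the author's procedure.

```python
import numpy as np

def prais_winsten(y, X, tol=1e-8, max_iter=100):
    """Iterated Prais-Winsten FGLS for y = X b + u with u_t = rho * u_{t-1} + e_t.

    X must already include a constant column. Returns (beta, rho).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # start from OLS
    rho = 0.0
    for _ in range(max_iter):
        # Estimate rho from the residuals of the untransformed model.
        resid = y - X @ beta
        rho_new = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
        rho_new = float(np.clip(rho_new, -0.999, 0.999))
        # Prais-Winsten transform: quasi-difference rows 2..n and keep the
        # first observation, scaled by sqrt(1 - rho^2).
        w = np.sqrt(1.0 - rho_new ** 2)
        y_star = np.concatenate(([w * y[0]], y[1:] - rho_new * y[:-1]))
        X_star = np.vstack((w * X[:1], X[1:] - rho_new * X[:-1]))
        beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return beta, rho

if __name__ == "__main__":
    # Simulated (hypothetical) monthly series with serially correlated errors.
    rng = np.random.default_rng(1)
    n = 349
    x = rng.normal(size=n)
    e = rng.normal(size=n)
    u = np.empty(n)
    u[0] = e[0]
    for t in range(1, n):
        u[t] = 0.6 * u[t - 1] + e[t]
    y = 30.0 - 3.0 * x + u
    print(prais_winsten(y, np.column_stack([np.ones(n), x])))
```

Unlike the Cochrane-Orcutt procedure, the transform keeps the first observation, which matters for short monthly series such as the ones used here.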
7
Discussion
I begin by recalling the conclusions of Section 3, and specifically Proposition 1, which suggest that short-term overperformance and long-term underperformance both increase with increasing disagreement, given temporary short-sales constraints, sufficiently high temporary disagreement, and momentum traders. My model suggests that (1) increasing aggregate disagreement during the day of the IPO should have a positive effect on first-day returns; (2) forward abnormal returns in the months immediately following the IPO should rise with day-of-IPO aggregate disagreement; and (3) day-of-IPO aggregate disagreement should affect long-run post-IPO returns negatively. Moreover, the combination of the results of my model and the work of Scheinkman and Xiong (2003), Hong et al. (2006), and Hong and Sraer (2011b) suggests that (4) turnover immediately following IPOs should rise with increasing disagreement.
My empirical results appear to support the above hypotheses. First, my work with first-day IPO returns suggests that a one standard-deviation increase in time-of-IPO aggregate disagreement increases average monthly first-day returns by 7.8%, indicating that increasing time-of-IPO aggregate disagreement leads to increased overpricing immediately following the IPO. Second, my results also offer preliminary evidence that time-of-IPO aggregate disagreement affects long-term post-IPO returns. Specifically, I observe that aggregate disagreement has an increasingly positive effect on forward abnormal returns in the first three months following an IPO. Although the high serial correlation in the aggregate disagreement time series makes it challenging to parse whether the positive effect of disagreement on short-term post-IPO returns is the result of the combination of time-of-IPO disagreement and momentum or of the persistence of aggregate disagreement, the increasing effect of aggregate disagreement suggests that there is a momentum effect interacting with the disagreement effect. That said, the fact that aggregate disagreement has a positive effect on forward abnormal returns for the first twelve months post-IPO is harder to interpret, as this effect can credibly be attributed to both momentum and the persistence of the aggregate disagreement time series.
Third, my results indicate that disagreement has a decidedly negative effect on medium-to-long-term post-IPO returns. Specifically, disagreement appears to have a significant, negative relationship with medium-to-long-term (1-2 year) post-IPO returns. It follows, then, that momentum traders exacerbate the effect of mean-reversion as simple investors receive signals from nature about the true, fundamental value of IPO stocks. Especially important to the analysis here is the fact that I use aggregate disagreement, rather than idiosyncratic disagreement, as my proxy for disagreement. Had I used idiosyncratic disagreement as my proxy, it would have been difficult to distill whether the negative long-term effect of disagreement on returns is the result of disagreement itself or of the perceived riskiness of the IPO stock. Since I use aggregate disagreement, this objection does not hold.
Fourth, my empirical evidence on forward post-IPO turnover seems to be consistent with the theory that momentum traders exacerbate the effect of disagreement, and thereby cause increased turnover in the months immediately following the IPO. Specifically, in the first four months following an IPO, disagreement appears to have a significant, increasingly positive effect on turnover, a result consistent with the price-bubble behavior observed in this paper, as well as the results of Hong et al. (2006). Furthermore, disagreement appears to have no significant positive effect on long-term post-IPO turnover, a result consistent with the disappearance of post-IPO price bubble behavior after the first months following an IPO.
My empirical work provides evidence that the interaction between aggregate disagreement and momentum traders, through market-driven channels, has a strong effect on post-IPO price-volume dynamics. However, one equally interesting finding is the implication that aggregate disagreement dynamics do not have a place at the table in private decision-makers’ IPO decision-making processes. Specifically, I fail to find any significant relationship between aggregate disagreement and the number of IPOs and SEOs launched in a given month. The absence of this relationship is surprising, especially given the dramatic effect aggregate disagreement has on post-IPO pricing.
This offers preliminary evidence that there may be flaws in overall firm and underwriter-level IPO decision-making. I also consider the possibility that firms’ pricing decisions have a significant relationship with aggregate disagreement, by studying the percent of IPOs priced above the file-price range median. Here again, I fail to find a significant relationship, an especially interesting result given the enormous opportunity cost firms face when IPOs are underpriced. I believe that this paradoxical behavior should be explored further by researchers.
I conclude with the following observation: this paper has been the first to show that aggregate, non-idiosyncratic disagreement affects post-IPO price-volume dynamics. Given that measures of idiosyncratic disagreement are naturally confounded by a firm’s risk profile, and that all prior studies of the effect of disagreement on IPOs have focused on noisy measures of idiosyncratic disagreement, this paper offers the first conclusive evidence that disagreement can explain both short-term IPO underpricing and long-term IPO underperformance.
8
Conclusions and Future Work
In this paper, I demonstrate that the existence of short-term disagreement and short-sales constraints in an environment with momentum traders can have significant short- and long-term implications for post-IPO price-volume dynamics. Consistent with the literature and my theoretical model, I provide new conclusive econometric evidence that aggregate disagreement has an increasingly positive effect on early post-IPO returns, and an increasingly negative effect on medium-to-long-term post-IPO returns. Moreover, I find that post-IPO volume dynamics are consistent with my empirical post-IPO pricing results, and surprisingly fail to find any strong relationship between disagreement and firm- and underwriter-level decision-making.
This study suggests a number of interesting avenues for future work. I divide these into two major categories: refinements to the methods used in this study, and possibilities for new work expanding on this paper.
Although I view many of the results of this study to be empirically robust, a number of improvements could be made to the methods of this paper. First, as technological limitations prevented me from utilizing daily price data when calculating forward abnormal returns, future work should aim to replicate the work of this paper using daily stock-price data.[23] Second, as CRSP lacks volume or shares outstanding data for a large number of IPO stocks (at least for the first few months post-IPO), future work should aim to complete this dataset, to ensure that there is no sample selection bias occurring in this study as a result of missing data in CRSP. Third, future studies should aim to explore the effects of dropping tech-bubble results from the dataset, as surveys of IPOs tend to be highly sensitive to the time period studied (Ljungqvist 2007). Fourth, future studies should attempt to study the effect of momentum directly, as well as momentum’s interaction with disagreement. Finally, future studies should also attempt to aggregate forward abnormal returns and forward volume data by month, as was done in my aggregate dataset, to ensure that there is no frequency or sample selection bias occurring in this paper.[24]
[23] However, as my methods introduced only left-hand-side error, I do not expect this to yield substantially different results from my current results.
[24] cf. Ritter (2011).
More interestingly, this paper also opens five possible paths for future study. First, given the recent work by Hong and Sraer (2011b), and the fact that aggregate disagreement appears to have an effect on post-IPO price-volume dynamics, future work should consider the relationship between disagreement, beta, post-IPO returns, and post-IPO turnover. Second, future work should explore further the relationship between disagreement and turnover, both theoretically and empirically, to expand on the preliminary analysis performed in this paper. Third, future work should explore the effect of time-of-IPO disagreement on forward abnormal returns over the 3-5 year horizon, given the empirical evidence of long-run reversals (De Bondt and Thaler 1985) and IPO underperformance (Ritter 1991, Loughran and Ritter 1995) for this time period. Fourth, given the interrelationship amongst price, turnover, and volatility found in Hong et al. (2006), future work should also explore the effect of disagreement on post-IPO volatility. Finally, and perhaps most interestingly, given my failure to find relationships between aggregate disagreement and firm-underwriter IPO decision-making, future work should explore more proxies for firm-level decision-making to see if firms are truly ignoring the effect of disagreement on IPOs.
Although this paper provides a simple explanation for the IPO Puzzle, it may represent only the tip of the iceberg with respect to behavioral factors influencing IPOs. Refining the work of this paper and answering the questions above could therefore help to further explain the anomaly that is post-IPO pricing, and resolve one of corporate finance’s most vexing puzzles.
The Yale Journal of Economics is grateful for the financial support of the Yale Department of Economics, the Yale Undergraduate Organizations Committee, and our generous private donors. The typeface used in the Journal is URW Palladio. The Journal was typeset in LATEX and printed by Yale Printing & Publishing Services in New Haven, CT. Visit our website at http://econjournal.sites.yale.edu/.