CIGI PAPERS
NO. 21 — DECEMBER 2013
A MARKOV SWITCHING APPROACH TO HERDING MARTIN T. BOHL, ARNE C. KLEIN AND PIERRE L. SIKLOS
A MARKOV SWITCHING APPROACH TO HERDING Martin T. Bohl, Arne C. Klein and Pierre L. Siklos
Copyright © 2013 by The Centre for International Governance Innovation
The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of The Centre for International Governance Innovation or its Operating Board of Directors or International Board of Governors.
This work is licensed under a Creative Commons Attribution — Non-commercial — No Derivatives License. To view this license, visit (www.creativecommons.org/ licenses/by-nc-nd/3.0/). For re-use or distribution, please include this copyright notice. Earlier versions of the paper were presented the 92nd Southwestern Economics Association Annual Meeting and at the Northern Finance Association Conference, FMA Asian Conference, and King’s College, London. Cathy Ning provided helpful comments. Pierre L. Siklos thanks The Centre for International Governance Innovation (CIGI) for financial support.
57 Erb Street West Waterloo, Ontario N2L 6C2 Canada tel +1 519 885 2444 fax +1 519 885 5450 www.cigionline.org
TABLE OF CONTENTS 4
About the Authors
4
About the Project
4 Acronyms 4
Executive Summary
4 Introduction 5
Literature Survey
6
Existing Herding Models
7 Methodology 7
Markov Switching Herding Measures
9
Markov Switching ADF Test
9 Data
9
Empirical Results
13 Conclusions 13 Works Cited 16 About CIGI 16 CIGI Masthead
CIGI Papers no. 21 — December 2013
ACRONYMS ABOUT THE AUTHORS
ADF
augmented Dickey-Fuller
Martin T. Bohl is professor of economics, Centre for Quantitative Economics, Westphalian Wilhelminian University of Münster. From 1999 to 2006, he was a professor of finance and capital markets at the European University Viadrina Frankfurt (Oder). His research focusses on monetary theory and policy as well as financial market research.
AR autoregressive
Arne C. Klein is an assistant lecturer in the Department of Economics at the Westphalian Wilhelminian University of Münster. From July to October 2011, he was a visiting scholar at Wilfrid Laurier University, Waterloo, Canada.
GARCH generalized autoregressive conditional heteroskedasticity
Pierre L. Siklos, a CIGI senior fellow, is the director of the Viessmann European Research Centre at Wilfrid Laurier University, and a research associate at Australian National University’s Centre for Macroeconomic Analysis. His research interests are in applied time series analysis and monetary policy, with a focus on inflation and financial markets.
ABOUT THE PROJECT This publication emerges from a project called Essays in Financial Governance: Promoting Cooperation in Financial Regulation and Policies. The project is supported by a 2011-2012 CIGI Collaborative Research Award held by Martin T. Bohl, Badye Essid, Arne Christian Klein, Pierre L. Siklos and Patrick Stephan. In this project, researchers investigate empirically policy makers’ reactions to an unfolding financial crisis and the negative externalities that emerge in the form of poorly functioning financial markets. At the macro level, the project investigates whether the bond and equity markets in the throes of a financial crisis can be linked to overall economic performance. Ultimately, the aim is to propose policy responses leading to improved financial governance.
BFGS Broyden-Fletcher-Goldfarb-Shanno CAPM
capital asset pricing model
EM
expectation maximization
ETF
exchange trade fund
GED
generalized error distribution
MSADF
Markov switching ADF
MSGARCH
Markov switching GARCH
OLS
ordinary least square
VIX
volatility index
EXECUTIVE SUMMARY Existing models of market herding suffer from several drawbacks. Measures that assume herd behaviour is constant over time or independent of the economy are not only economically unreasonable, but describe the data poorly. First, if returns are stationary, then a two-regime model is required to describe the data. Second, existing models of time-varying herding cannot be estimated from daily or weekly data, and are unable to accommodate factors that explain changes in this behaviour. To overcome these deficiencies, this paper proposes a Markov switching herding model. By means of time-varying transition probabilities, the model is able to link variations in herding behaviour to proxies for sentiment or the macroeconomic environment. The evidence for the US stock market reveals that during periods of high volatility, investors disproportionately rely on fundamentals rather than on market consensus.
INTRODUCTION Herding behaviour has been the subject of considerable interest over the years. The issue of whether investors imitate each other when making investment decisions has been extensively investigated. The extent to which investors discriminate between stocks should be reflected in how returns deviate from overall market performance. If investors follow the market, then dispersion in returns should disappear entirely. It is widely observed, however, that following the market may be conditional on, for example, whether the overall market is rising or falling. More importantly, one would expect market sentiment
4 • THE CENTRE FOR INTERNATIONAL GOVERNANCE INNOVATION
A Markov switching Approach to Herding and macroeconomic and financial conditions to have a significant influence on the extent to which investors follow the market. Existing strategies to empirically investigate herding behaviour suffer from several deficiencies, including the inability to recognize that herding can change over time as market conditions change. For example, some herding measures are assumed to be constant. This is not only economically unreasonable, but such a view implies a mis-specification. For example, this paper shows that a unit root in the time series for dispersion can be rejected in favour of an alternative, wherein there are two regimes with stationary returns, namely, one when market volatility is high and another when volatility is low. Moreover, existing models of herding behaviour cannot be estimated from daily or weekly data, or are incapable of accommodating factors that determine investors’ propensity to display herd-like behaviour. This paper makes a start towards overcoming these drawbacks by proposing a Markov switching model of the model proposed by Chang, Cheng and Khorana (2000), and applying this model to data from the US stock market. Reliance on a Markov switching model is supported by the finding that two distinct states exist; they are found to be closely related to observed market phases that alternate between high- and lowvolatility market conditions. The aftermath of the dot.com bubble represents a highly volatile regime, as does the period of the 2008–2010 financial crisis. Time-varying transition probabilities as derived by Diebold, Lee and Weinbach (1994) enable us to consider economic and financial variables that drive changes in herding behavior over time. In particular, proxies for market sentiment were applied, such as implied volatility and trading volume, as well as term structure variables, which the literature considers to be closely linked to macroeconomic fundamentals. In addition, non-normal distributions and generalized autoregressive conditional heteroskedasticity (GARCH) effects were controlled for. The remainder of the paper is organized as follows: in the next section, the literature on herding is reviewed; the classical approaches to measure herd formation are discussed in the third section; in the fourth section, the Markov switching models of herd behaviour are outlined and the two-regime augmented Dickey-Fuller (ADF) test is briefly sketched out; the fifth section discusses the empirical results; and the paper ends with a brief conclusion.
LITERATURE SURVEY The literature, in general, defines intentional herding as a situation where investors imitate each other’s buy and sell decisions, even though this kind of trading
strategy might be at odds with their own information and beliefs. By contrast, spurious herding refers to a “clustering” of investment decisions due to similar underlying information sets. Herding behaviour can be either rational or irrational (Devenow and Welch 1996; Bikhchandani and Sharma 2001). Pure irrational herd behaviour is closely related to the theory of noise trading (De Long et al. 1990; De Long et al. 1991; Jeanne and Rose 2002), which assumes that a group of investors act irrationally or at least base investment decisions on some exogenous liquidity concerns combined with some limits to arbitrage. In contrast, information-based herding rests on the presumption that investors face uncertainty about the quality of the information they are able to access. Although information cascades attempt to address this kind of behaviour (Bikhchandani, Hirshleifer and Welch 1992; Welch 1992; Banerjee 1992; Avery and Zemsky 1998) are the first to propose a model that is applicable to the case of financial markets. However, even for an investor who has access to superior private information, it might be rational to ignore this information and to rely on herding, for example, in the case of portfolio managers facing incentives to stick with a benchmark (Scharfstein and Stein 1990; Froot, Scharfstein and Stein 1992; Graham 1999). The empirical literature on herding can be subdivided into two branches. The first one deals with herding among institutional investors like fund managers. Research of this kind resorts to data on their trading behaviour. Work on this topic is mainly based upon the measure proposed by Lakonishok, Schleifer and Vishny (1992), who compare the actual share of managers’ buy and sell decisions against the expected values under the assumption of independent trading. A second strand of research deals with herding towards the market, which is given by investors who base their investment decisions entirely on the market consensus, thereby ignoring their own beliefs about the risk-return profile of particular stocks. Christie and Huang (1995) are the first to address this issue empirically. They test the conjecture that such a trading pattern is more likely to arise during times of market stress as evidenced by unusually high volatility. However, their evidence for the US market cannot corroborate a significant clustering of returns during strong market movements. Unlike Christie and Huang (1995), the approach put forward in Chang, Cheng and Khorana (2000) does not neglect investors’ behaviour during periods of low or average volatility. Their test specification aims to compare the actual dispersion of single stock returns around the market with the value implied by rational asset pricing. In particular, they exploit the fact that those pricing models imply a linear relationship between the absolute value of
Martin T. Bohl, Arne C. Klein and Pierre L. Siklos • 5
3
Existing Herding Models
CIGI Papers no. 21 — December 2013 Research on herding rests on the seminal work of Christie and Huang (1995). Their approach considers the dispersion of single stock returns around the market. They
the market return and its dispersion. Their findings support of single stock returns around the market. They propose an increased tendency to herdpropose in emerging markets, the following measure: the following measure: but reveal only little evidence for such a behaviour in developed countries. N (t)
St =
1 |ri,t − rm,t | , N (t) i=1
(1) (1) Tan et al. (2008) investigate herding in Chinese A and B stocks using the approach of Chang, Cheng and Khorana (2000). They find evidence for herding in both the A stocks where N (t) and T are the numbersat of time stockst available at and T that are the numbers of stocks available and observations in available for domestic investors where and in N the(t) B shares time t and observations in the sample, ri,t, stands for the are dominated by foreign investors. Polish forreturn the return of istock i and for thereturn market return in period the Analyzing sample, rthe of stock and rm,t for rthe in period i,t stands m,t market 11 stock market, Bohl, Gebka and Goodfellow (2009) highlight 1 t, respectively. market,asinaturn, is defined as aaverage value- of single respectively. in turn, isThe defined value-weighted differences in trading patterns t, between individualThe andmarket, weighted average of single stock returns. Equation (1) institutional investors. While stock the former engage in returns. Equation (1) is designed to measure the average absolute deviation of is designed to measure the average absolute deviation herding, particularly during market downturns, the latter of single stock returns marketinsights return into and, the extent single stock returns from the market return and, from thus,the provides are unlikely to be driven by herd behaviour. An application thus, provides insights into the extent to which market to the exchange trade market (ETF) marketmarket can be found to which participants discriminate between individual stocks. all investors participants discriminate between individual stocks. If all in Gleason, Mathur and Peterson (2004). They estimate the investorsSactmust alike be andequal followto the0.market, St must be equal act alike and follow the market, t models of Christie and Huang (1995) and of Chang, Cheng to 0. and Khorana (2000) from New York Stock herding Exchangeconditional on strong market movements, Christie and Huang To detect To detect herding conditional on strong market intraday data and find strong evidence for adverse herding constant and two dummy variables that control for both (1995) regress St upon amovements, Christie and Huang (1995) regress St upon a in this market. Adverse herding refers to a situation where, andreturns two dummy variables that control forquantiles both unlike the case of herding, investors disproportionately extreme positive as well asconstant negative measured by certain outer of the extreme positive as well as negative returns, measured by discriminate strongly between individual stocks. return distribution. Although very clear-cut, this approach obviously depends heavily certain outer quantiles of the return distribution. Although The papers cited above deal with herd behaviour within very clear-cut, approach obviously depends heavily on the definition of the thresholds for this extreme returns. In addition, differing investor a given market, but do not take potential international on the definition of the thresholds for extreme returns. In during times of addition, low and differing average investor volatility is completely neglected. linkages in account. Chiang andbehavior Zheng (2010), however, behaviour during times of low investigate the impact of the US market on herding and average volatility is completely neglected. The extension formation in several stock markets around the world.put Theyforward by Chang et al. (2000) aims to overcome these drawbacks. extension putthe forward by Chang, and asset Khorana provide favourable evidence that They both the volatility as highlight thewell notionThe that, under assumption ofCheng rational pricing (i.e., (2000) aims to overcome these drawbacks. They highlight as the cross-sectional dispersion of single stock returns in 5 the notion under assumption of rationalincreasing asset the United States influence herding activities in pricing), the rest ofequation CAPM-type (1), isthat, linear andthe strictly monotonically in the pricing (i.e., capital asset pricing model [CAPM]-type the world. In contrast, Tan et al. (2008) are unable to find 1 pricing), equation (1),only is linear strictly check monotonically interactions between the herding behaviour in the Chinese Actually, Christie and Huang (1995) use (1) as a and robustness and base their main of the absolute market E (|rThe |). By contrast, herding behavior(1) m,t increasing inreturn, the expected value of the absolute marketdeviation stock markets in Shanghai and expected Shenzhen. inference value upon the cross-sectional standard deviation. advantage of the absolute E (|r |). By contrast, herding behaviour is better over the standard deviation isreturn, that the former is less sensitive to outliers. m,t is better captured by a function that either non-linear reaches ora reaches maximum at a Hwang and Salmon (2004) are the first to derive a measure captured by aisfunction that is eitheror non-linear of herding that allows for time variation in herding a maximum at a certain threshold value of E (|rm,t|), certain threshold value (|rm,t |), declining thereafter . The following regression is behaviour. Their approach is based on the assumption of of E declining thereafter. The following regression is designed time-varying monthly betas. Results for the United States to capture these effects: designed to capture these effects: and South Korea show a tendency of herding to mitigate, or even become adverse, in the run-up to and during 2 St = γ + δ |rm,t | + ζrm,t + t , (2) (2) periods of turmoil, for example, in the Asian and Russian financial crises as well as the tech bubble of the early 2000s. In order to establish a theoretical rationale for these market facts, where the realized return is used to proxy expected value. asset where the realized markettheir return is used to proxyRational their Hwang and Salmon (2009) put forward a testable model expected positive value. Rational pricing, implies aa value of pricing, then, implies a significantly δ and aasset ζ equal to 0. then, By contrast, that incorporates the effect of investor sentiment. In this significantly positive δ and a ζ equal to 0. By contrast, a framework, herding occurs in ζa that situation when investors significantly differs from 0 indicates violation ofdiffers the linearity value of ζ that asignificantly from 0 implied indicates by a rational broadly agree about the future direction of the market, 2 violation of themeans linearity implied by m,t rational asset pricing. asset pricing. Using daily returns, this that V ar(r ) = E(r E(rm,t )2 ≈ m,t ) − 2 whereas adverse herding is likely to arise when there is a 2 2 Using daily returns, this means that Var(rm,t) = E(rm,t) − high probability of divergencesE(r of opinion among 2 so market that rm,t can )2be≈ regarded as so thethat market variance. m,t ) holds, E(r E(rm,t ) holds, r2m,t canreturn be regarded as theIf, during m,t participants. market return variance. If, during periods of high volatility, periods of high volatility, investors herd towards the market, this implies that St , the investors herd towards the market, this implies that St, dispersion of returns around the market, becomes disproportionately low compared to EXISTING HERDING MODELS the rational pricing model. 1 This should show as (1995) a negative coefficient for ζ. Actually, Christie and up Huang use (1) only as a robustness Research on herding rests on the seminal work of Christie check and base their main inference upon the cross-sectional standard the models outlined above, herding behavior is constant and Huang (1995). Their approachWithin considersthe the framework dispersion ofdeviation. The advantage of the absolute deviation (1) over the standard deviation is that the former is less sensitive to outliers. over time in spite of different market phases or business cycles. Since the literature relates herding to investors’ sentiment (Shiller et al. (1984), Lee et al. (1991), Devenow 6 • THE CENTRE FOR INTERNATIONAL GOVERNANCE INNOVATION and Welch (1996), Hwang and Salmon (2009)), which by definition is time-varying, this
in herding to proxies for investor sentiment or macroeconomic fundamentals. T A Markov switching todefinition, Herding im assuming a 0 mean for the latent herding variable, the Approach model, by
swings between herding and so-called adverse herding. Thus, this measure is un
the dispersion of returns around the market, becomes the sought after phenomenon. Second, the model is unable to todescribe a market investors areherding switching between herding, no herdin disproportionately low compared the rational pricing where to link changes in to proxies for investor sentiment model. This should show up as aadverse negative herding coefficientforms for ζ. of or macroeconomic fundamentals. Third, assuming a behavior. In contrast, the Markov switching versions o 0 mean for the latent herding variable, the model, by measureabove, (2) proposed in this paper aimsbetween to remedy these Within the framework of the herding models outlined definition, implies swings herding andproblems. so-called We herding behaviour is constant over time in spite of turn to its description. adverse herding. Thus, this measure is unable to describe a different market phases or business cycles. Since the market where investors are switching between herding, no literature relates herding to investors’ sentiment (Shiller, herding or adverse herding forms of behaviour. In contrast, Fisher and Friedman 1984; Lee, Shleifer and Thaler 1991; the Markov switching version of the herding measure (2) Devenow and Welch 1996; Hwang and Salmon 2009), proposed in this paper aims to remedy these problems. which by definition is time-varying, this assumption does Methodology not seem reasonable. Furthermore, it is conceivable that, METHODOLOGY due to the crisis-laden environment prevailing during the last decade, including the tech bubble, and theSwitching most 4.1 9/11 Markov HerdingSWITCHING Measures HERDING MARKOV recent financial crisis, the time series of dispersion, St, may MEASURES not be stationary in a single regime setting, but might be Our principal aim is to model time-varying herd behavior based on daily data better characterized by a two-state model allowing for The authors’ principal aim is to model time-varying herd different dynamics in tranquil and volatile periods. additionally, to allow variations herding to be by exogenous variable behaviourinbased on daily datadriven and, additionally, to allow
4
variations in herding to be driven by exogenous variables. To account for time-varying effects, Hwang and Salmon straightforward way of introducing time-varying is to assume that it is su A straightforward way behavior of introducing time-varying (2004) propose the following state space model, which, behaviour is to assume that it is subject to regime switches. while similar in spirit, does not to directly make use of the regime switches. Hence, we modify (2) to toallow Hence, equationequation (2) is modified allowfor forswitching switching betwe dispersion measure (1). First of all, the model assumes between two regimes j ϵ {1, 2}: regimes ∈ {1, 2}about : that market betas are changing over time. jInference herding can then be obtained from the cross-sectional 2 St = γj + δj |rm,t | + ζj rm,t + j,t , (3) standard deviation of the betas. For instance, a situation where the betas of all stocks in the market are approaching σj2 ) and the other variables were previously defined. It is well kn j,t ∼N (0, the value 1 implies that this where cross-sectional standard 2 where ϵj,t~N (0, σj ) and the other variables were previously deviation gets close to 0. In contrast, when all investors that financial time seriesdefined. often display Therefore, we often reestimate It is wellleptokurtosis. known that financial time series disproportionately strongly differentiate between stocks, display leptokurtosis. Therefore, modelto given in model given equation (3) allowing one or even both the regimes be governed such that the betas more strongly diverge fromin1 than is equation (3) is re-estimated allowing one or even both implied by the CAPM equilibrium condition, referred to To this end, we rely on the t as well as on the Generalized E fat-tailed distribution. regimes to be governed by a fat-tailed distribution. To as adverse herding, this would result in a higher standard this end, the t is relied on as well as on the generalized deviation. error distribution (GED).2 We assume the latent state To account for the foregoing considerations, Hwang and Salmon (2004), as a first step, estimate standard ordinary least square (OLS) betas on a monthly basis. In a second step, the cross-sectional standard deviation of these betas is calculated for all periods. The deviation is then modelled within a state space framework where the changes in the dispersion of the betas are governed by a latent herding variable. Assuming an autoregressive [AR](1) process describes its movements, the latter can be extracted by using the Kalman filter. Although the above approach produces a continuously evolving herding variable, it suffers from several drawbacks. First, the model cannot be estimated from daily or weekly data, but relies on monthly beta estimates. Monthly betas, however, are strongly driven by “noise,” for example during periods of substantial financial turmoil, such as in the case of the recent financial crisis. Reducing noise requires expanding the estimation period for the market betas, which in turn, reduces the number of observations for the state space model. Furthermore, if herding dynamics actually take place in the very short run, say on a daily or weekly basis, the model cannot capture
variable to be driven by a first-order Markov process, with transition probabilities, pij,t = Pr (St = j|St−1 = i), i, j ϵ {1, 2}, which can either be constant or time-varying. For the sake of inferring the regime the process is in at time t, based on all information available up to the end of the sample period, ΓT, smoothed probabilities pi,t|T = Pr(St = i|ΓT) we recalculated as given in Kim (1994). As stated previously, time-varying transition probabilities can provide insights into the factors driving changes in herding behaviour over time. This means making p11,t and p22,t dependent on a set of exogenous variables Xt−1 including a constant.3 Variables suitable in explaining the switches in investors’ herding behaviour include investor sentiment and macroeconomic conditions relying on data available at the daily frequency. Implied volatility, here 2 The GED may provide further insights into the distributional properties of the dispersion of single stock returns since, unlike the t distribution, it also allows for thinner tails than in the case of the normal distribution. 3 These variables are lagged because the transition probabilities governing switches from t − 1 to t must be determined in t − 1.
Martin T. Bohl, Arne C. Klein and Pierre L. Siklos • 7
]2, ∞[. Nevertheless, it is well known that the t distribution is empiric
the kurtosis. In principle, this parameter can take on any value in the region t and macroeconomicgoverning conditions relying on data available at guishable from theisnormal one for degrees of freedom greater than 30 (H ]2, ∞[. Nevertheless, it is well known that the t distribution empirically indistinCIGI Papers no. 21here — December 2013 se implied volatility, measured using the Chicago Board Jondeau and Rockinger (2003)). Thus, if for state j, the estimate for guishable from the normal one for degrees of freedom greater than 30 (Hansen (1994),
we againforfitυthetakes model, t Volatility Index (VIX). Motivated by the branchThus, of literaon this a time with one regime being g Jondeau and Rockinger (2003)). if forvalue stateabove j, the30,estimate j measured using the Chicago Board Options Exchange
followed since this distribution reduces to the normal for a
t andwith the one other one by a normal distribution. When applying the GED d (2002), BakerVolatility and Stein (2004), Baker and value above 30, again fitWurgler the model, this time regime being tail thickness parameter, κgoverned Market Index (VIX), iswe used. Motivated by(2006)), the , equal toby 1. a 2j the errors, j,t ∼GED(0, σj distribution , κj ), we cantoproceed with a one step estimat branch oftoliterature on sentiment (Jones and t and the other one by a 2002; normalBaker distribution. When applying the GED nover relative market capitalization.
To account for autocorrelation, the authors the since thisa distribution reduces to the normalmake for ause tailofthickness parame Stein 2004; Bakerthe anderrors, Wurgler 2006), theσ 2share turnover j,t ∼GED(0, with one step estimation procedure j , κj ), we can proceed covariance matrix proposed by Newey and West (1987) omic relative conditions can capitalization be derived is from term structure data to market also used. to 1. equal since this distribution reduces to the normal where for a tail thickness a lag lengthparameter, equal to 8 κisj , set as suggested by the To account for autocorrelation, makethe useconstruction of the covariance matrix Proxies for macroeconomic conditions can beand derived and West (1994) criterion.weSince (1991), Estrella andto Mishkin (1997) Estrella MishkinNewey 1. Newey and West (1987) we setby the equal to 8 as sug from term structureTo data (Estrella and Hardouveliswe1991; of of this error matrix andwhere theproposed selection of lag the length appropriate account autocorrelation, make use the covariance matrix cheinkman and Knez et Estrella al.for(1994) that1998). the variEstrella(1988) and Mishkin 1997; andshow Mishkin lag length rests on several assumptions that might be Newey and Westto(1994) criterion. by Since of this error m Newey and West where Litterman we set the lag length equal 8 as asuggested thetheisconstruction Litterman and Scheinkman (1988)(1987) and Knez, crucial for the results, robustness check conducted by selection of the appropriate lag length several assumptions wh capital canNewey be very by inmodels that andwell West (1994) criterion. Since the construction error matrix therests onnumbers andmarkets Scheinkman (1994) show thatdescribed the variation money performing of thethis analysis based and on different of crucial for the results, we perform a robustness check performing our a as well as capitalselection markets of can be very well described by lags. Since the autocorrelations in S are in general found appropriate rests on several assumptions which mightt be ommon factors. Based on zero the bond returns,lag welength use princimodels that contain from one to four common factors. to different be relatively largeof(Chang, Cheng Khorana 2000), on numbers lags. Since the and autocorrelations in St are in gen crucial for the results, we perform a robustness check performing our analysis based Based on zero bond returns, principal components all models are re-estimated for 6, 10, 12 and 14 lags. those to extract common factors. Wenumbers only include in X the autocorrelations be relativelyinlarge (Chang et al. (2000)), of lags. Only Since t−1 St are in general found towe reestimate all models for 6, 1 analysis is used on to different extract common factors. those lags. For someall markets, bethan relatively large ensures (Changgreater etthat al. (2000)), models forstudies 6, 10, 12,report and 14 different herding principal greater components with eigenvalues than factor 1we reestimate h eigenvalues 1. This each dynamics during falling and markets are included in lags. Xt−1. This ensures that each factor has For some markets, studies reportrising different herding(Chang, dynamics during fall Cheng and Khorana 2000; Bohl, Gebka and Goodfellow wer than any return series. we assemble coefficients more explanatory powerIfthan any returnthe series. If the in markets (Chang et al. (2000), Bohlfund et al. (2009)). trading In addition, eviden 2009). In addition, evidence managers’ For someinmarkets, studies report different herding dynamics during fallingfrom and rising coefficients are assembled a vector θ j , the transition probability associated with state j, p can be modeled as: reveals differences in their herding behaviour between probability associated with(Chang state j, et pjj,tjj,t can(2000), be modelled managersIn trading reveals differences in their herding behavior between markets al. Bohl etas:al. (2009)). addition, evidence from fund buying and selling decisions (Keim and Madhavan 1995; decisions (Keim and Madhavan (1995), et al. (1995)). W θj Grinblatt, Titman and Wermers 1995). TheseGrinblatt phenomena managers trading reveals differences in theirselling herding behavior between buying and eXt−1 (4) are also accounted for by estimating an asymmetric version pjj,t = . (4) for these phenomena by estimating an asymmetric version of our baselin Grinblatt et al. model: (1995)). We also account t−1 θj decisions (Keim and Madhavan (1995), of 1 + eXselling the baseline for these phenomena by estimating asymmetric version of our baseline model: Turning tothe themodels estimation procedures, modelsan that 2 n procedures, which assume the a normal distribuSt = γj +γjasy ∗I Rm,t<0 +δj |rm,t |+δjasy ∗I Rm,t<0 |rm,t |+ζj rm,t +ζjasy ∗I Rm,t<0 assume a normal distribution can be estimated using the (5) asy (EM) asy asy Rm,t<0 Rm,t<0 2 Rm,t<0 2 ng the expectation Expectation Maximization algorithm maximization algorithm +δj |rm,t(Dempster, |+δ(Dempster |rm,t |+ζ rm,t + j,t , (5) St = γj +γ(EM) j rm,t +ζj ∗I Rm,t<0 j ∗I j ∗I where I is a dummy variable which is equal to 1 if the market retu Laird and Rubin 1977). A closed form solution for all form solutions for all parameters put forward by Hamiltonwhere IRm,t<0 is a dummy variable that is equal to 1 if the parameters was put forward (1990), which while is equal to 1 if the market return is negative is aHamilton dummy variable where I Rm,t<0 by is negative and equal to 0 otherwise and , the parameters for (4) are derived in Diebold s for θthe j solutions for θj , the parameters for (4), are derived inet al.market return 2 ϵ ~N (0, σ ). j,t j Diebold, Lee and Weinbach (1994). The specifications using with further insights into the distributional properties of the dispersion 9 t and GED-distributed errors are also estimated using the Finally, the authors want to control for ARCH effects, unlike the t distribution, it also allows for thiner tails than in the case EM algorithm. Unlike the case of the normal distribution, volatility clustering and skewness, which are often present to 0 otherwise and j,t ∼N (0, σj2 ). no analytic solutions for the regression parametersand areequal in financial time series. In order to do so, (3) is estimated because the transition probabilities governing switches from t − 1 to t available. Nevertheless, since the conditions for the closedFinally, in we awant to control for ARCH effects, volatility clustering, and 1]) skewness Markov switching GARCH(1, 1) (MSGARCH[1, form solution for the transition probabilities, pij,t = Pr (St = GARCH as(3) in which are framework, often present inthereby, financialmodelling time series. the In order to do component so, we estimate j|St−1 = i), given in Hamilton (1990) still hold, these can be proposed by Gray (1996b). The first lag of St is included, a Markov Switching GARCH(1, 1) (MSGARCH(1, 1)) framework, thereby, modeling calculated as a by-product of the smoothed probabilities, to take into account autocorrelation since the Newey and pi,t|T. Thus, obtaining estimates for the remaining the GARCH component proposed by Gray WeGARCH include themodels. first lag of St , West (1987) aserrors cannot be (1996b). used for regression and distributional parameters requires a whole Toaccount modelautocorrelation skewness, asince skewed t distribution, is applied to take into the Newey and West (1987) errors cannot numeric optimization in each iteration of the EM algorithm as proposed by Fernandez and Steel (1998). The density be used for GARCH models. To model skewness, we apply a skewed t distribution as relying on the Broyden-Fletcher-Goldfarb-Shanno (BFGS) function is given as follows: algorithm. proposed by Fernandez and Steel (1998). The density function is given as follows:
2βj j,t In the case of the t distribution, first (3) is estimated, t(0, βj j,t , υj )I j,t <0 + t(0, , υj )(1 − I j,t <0 ) , (6) (6) ft ( j,t ) = 2 1 + βj2 βj assuming ϵj,t~t(0, σj, ʋj), where ʋj is the (regime-dependent) degrees of freedom parameter governing the kurtosis. where I j,t <0 is an indicator function that is equal to 1 if j,t is negative and equal to where Iϵj,t<0 is an indicator function that is equal to 1 if In principle, this parameter can take on any value in 0 otherwise. βj > 1 indicates a distribution which is skewed to the right while βj is ϵj,t is negative and equal to 0 otherwise. βj > 1 indicates a the region ]2, ∞[. Nevertheless, it is well known smaller that than 1 in case of a left skewed density. For βj = 1, (6) reduces to a standard t distribution that is skewed to the right while βj is smaller the t distribution is empirically indistinguishable from distribution. The MSGARCH(1, 1)skewed model is density. estimatedFor using numerical optimization than 1 in case of a left βj = 1, (6) reduces the normal one for degrees of freedom greater than 30 according to the BFGS algorithm. As we make use of the forward looking algorithm to a standard t distribution. The MSGARCH(1, 1) model (Hansen 1994; Jondeau and Rockinger 2003). Thus, if = P r (S provided inisGray (1996a) to calculate smoothed probabilities, p T ), this estimated using numerical optimization i,T accordingt |Γto for state j, the estimate for ʋj takes on a value above 30, approach can also be considered as a robustness check for Kim’s (1994) smoother. the BFGS algorithm. As the forward-looking algorithm 4 it again fits the model, this time with one regime being provided in Gray (1996a) is used to calculate smoothed governed by a t and the other one by a normal distribution. probabilities, = Pr 4.2 Markov SwitchingpADF Test(St|ΓT ), this approach can also i,T When applying the GED distribution to the errors, 2 ϵj,t~GED(0, σj, κj ), a one-step estimation procedure can be Under rational asset pricing (see equation (2)) St should be stationary. To investigate
the stationarity properties of a time series, it is common practice to rely on a unit root
8 • THE CENTRE FOR INTERNATIONAL GOVERNANCE INNOVATION
test such as Augmented-Dickey-Fuller (ADF) tests. Typically, these tests ignore possible regime switching effects often present in financial time series. To take these effects
ng to the BFGS algorithm. As we make use of the forward looking algorithm d in Gray (1996a) to calculate smoothed probabilities, pi,T = P r (St |ΓT ), this A Markov switching Approach to Herding ch can also be considered as a robustness check for Kim’s (1994) smoother.4 be considered as a robustness check for Kim’s (1994) smoother.4
comprehensive sample may be informative about the contribution of classical blue chips to herding compared Markov Switching ADF Test with more opaque, illiquid and smaller stocks. The MARKOV SWITCHING ADF TEST principal components of the term structure are extracted ational asset pricing (see equation (2)) St should be stationary. To using investigate the Datastream Zero Curve with maturities of 0, 3, Under rational asset pricing (see equation [2]) St should 6 and 9 months as well as 1–10, 12, 15, 20, 25 and 30 years. ionarity properties of a time series, it is the common practice to rely on a unit root be stationary. To investigate stationarity properties Aggregated trading volume and market capitalization of a time series, it is common practice to rely on a unit for the United States-Datastream Market are employed, h as Augmented-Dickey-Fuller (ADF) tests. Typically, these tests ignore possiroot test such as ADF tests. Typically, these tests ignore and the VIX is also obtained from Thomson Reuters possible oftentime present in financial me switching effectsregime often switching present ineffects financial series. To take these effects Datastream. time series. To take these effects into account, Hall and ount, Hall and andPsadarakis Hall et al.and (1999) a MarkovEMPIRICAL switching Sola Sola (1994)(1994) and Hall, Sola propose (1999) propose RESULTS a Markov switching ADF test. Allowing for deterministic st. Allowing for deterministic trending and a variance, regime-depending the trending and a regime-depending the test variance, First, the stationarity properties of the time series of equation is given as follows: the cross-sectional absolute deviations given in (1) are uation is given as follows: considered. To this end, the ADF and the MSADF tests D H described above are applied to the series of dispersion, St. αd,j td + ρd,j ∆St−h + ηj,t , (7) (7) ∆St = ϕj St−1 + The Dickey-Fuller test statistics are given in Table 1. d=0
h=1
2 j
When the standard (single regime) ADF test is considered,
ηj,t~N(0, π ). jdenotes ϵ {1, 2} the again denotes the state theat time {1, 2} again state the process is in t, root while a unit can only be rejected by the version of the test that process is in at time t, while D = 0, 1, 2, 3 indicates the does not account for deterministic trending. By contrast, 1, 2, 3 indicates theof degree of the polynomial the trend. deterministic trend. degree the polynomial defining the defining deterministic
2 where j,t ∼N (0, πj ). j ∈
the two-state MSADF test rejects the null in both states and all specifications for the deterministic trend.6 These ontrol for potential overparameterization, we also estimate a simple MSGARCH(1, 1) with Obviously, if D = 0, (7) this reduces to a Markov switching findings corroborate the use of a time-varying herding distributed errors and without lagged dependent variables. ADF (MSADF) test with a constant. When D = 1, a regimemeasure, since the assumption of constant herding is not depending linear trend is added while, in case of D = 2, the only economically unreasonable but in the present context trend can be changing and for D = 3, this trend may have also ignores structural shifts in the time series dynamics a turning point. H indicates the number of lags included. of St such as when markets move from a low- to a highDue to strong autocorrelations in St, the maximal lag volatility state. length is set at a relatively high value of 25, and then the number of lags is successively reduced until the coefficient Table 1: Dickey-Fuller Test Statistics of the last lag H is found to be statistically significant at the 10 percent level in at least one state.5 The MSADF test is Single State 2 States estimated using the EM algorithm. S =1 t
DATA The analysis covers the entire US stock market for the period 2001–2010. Total returns were obtained for all listed stocks and a capitalization weighted market index from the Center for Research in Security Prices at the University of Chicago. To ensure that the results are not sensitive to the selection of the sample period, the baseline model was also run for the periods 1999–2010 and 2003–2010. The second sample omits the period of the 2001 tech bubble period. The analysis is also carried out based on a sample that is free of cross-listings, listings in foreign currencies, shares from minor exchanges, ETFs and preferred stocks as well as stocks that are not marked as major securities. Data are taken from Thomson Reuters Datastream. The extent to which the results for these data differ relative to the more
D=0
-2.990**
-10.145***
D=1
-3.051
-15.116***
D=2
-3.083
-18.047***
D=3
-3.790
-17.751***
St = 2 D=0
-3.559***
D=1
-3.851**
D=2
-3.988**
D=3
-4.330**
Notes: Single State refers to a standard ADF test where 2 States stands for the two regimes version of the test outlined in the section on Markov switching herding measures. The test statistic provided is the pseudo t-statistic.***, ** and * denote statistical significance at the one percent, five percent and 10 percent level, respectively. Significance is based on asymptotic critical values obtained by Monte Carlo simulation.
4 To control for potential overparameterization, a simple MSGARCH(1, 1) is also estimated with normally distributed errors and without lagged dependent variables. 5 As a robustness check, the procedure is also performed for a maximal lag length of 15.
6 A maximal number of lags, H, equal to 25 are used, but these results also hold for H = 15.
Martin T. Bohl, Arne C. Klein and Pierre L. Siklos • 9
24
7
Figure 1: Smoothed Probabilities and Implied Volatility Figure 1: Smoothed probabilities and implied volatility 1
90
0.9
80
0.8
70
0.7
60
0.6
50
0.5
40
0.4
30
0.3 0.2
20
0.1
10 0 Jul‐10
Jul‐09
Jan‐10
Jul‐08
Jan‐09
Jul‐07
Jan‐08
Jul‐06
Jan‐07
Jul‐05
Jan‐06
Jul‐04
Jan‐05
Jul‐03
Jan‐04
Jul‐02
Jan‐03
Jul‐01
0 Jan‐01
Turning to the baseline model, equation (3) with normal errors, the smoothed probabilities, pi,t|T, which are plotted in Figure 1 (left-hand side scale), reveal two clearly distinct herding states. The smoothed probabilities are plotted against the VIX (right-hand side scale). It is immediately clear that the high volatility state typically coincides with deteriorating investors’ sentiment, that is, a relatively high implied volatility. The high-volatility regime can be related to periods of large market movements and is characterized ˆ by a herding parameter, ζ1, that is significantly positive, indicating adverse herding. Unlike the case of herding, this indicates that investors differentiate more strongly between particular stocks than implied by rational asset pricing behavior. By contrast, the second regime seems to ˆ prevail during more tranquil times. Here, ζ2 is found to be positive, but statistically insignificant, which is in line with CAPM-type models. Parameter estimates are reported in Table 2.
Appendix
Jan‐02
CIGI Papers no. 21 — December 2013
Notes: VIX line) (dotted line)probabilities and smoothed probabilities calculated Notes: VIX (dotted and smoothed calculated as proposed by Kim (1994). The
as on theThe left-hand axisNBER while thedates. right-hand side displays the VIX. The VIX. shaded areasside represent recession shaded areas represent NBER recession dates. proposed by Kim (1994). The smoothed are plotted smoothed probabilities are plotted on the left-hand side axis while probabilities the right-hand side displays the
Table 2: Estimation Results for Four Herding Models OLS Coeff.
Markov_norm Std. Error
Coeff.
Std. Error
Markov_t_norm Coeff.
Std. Error
Markov_GED Coeff.
Std. Error
St = 1 γˆ 1
0.017***
(0.000)
0.025***
(0.000)
0.024***
(0.000)
0.023***
(0.000)
ˆ
δ1
0.549***
(0.043)
0.356***
(0.042)
0.375***
(0.015)
0.378***
(0.014)
ζˆ1
0.622
(0.746)
1.676**
(0.735)
1.494***
(0.205)
1.510***
ˆ2
4.064 10
3.548 10 ***
(0.000)
3.347 10 ***
(0.000)
4.750***
(0.972)
1.615***
(0.107)
σ1 υˆ
1
−5
3.536 10
−5
−5
/κˆ
1
pˆ
0.991
11
0.991
(0.197) −5
0.991
St = 2 γˆ 2
0.015***
(0.000)
0.015***
(0.000)
0.015***
(0.000)
ˆ
δ2
0.253***
(0.021)
0.253***
(0.012)
0.247***
(0.014)
ζ2
ˆ
0.873
(0.769)
0.987**
(0.440)
1.051**
ˆ2
4.042 10
3.884 10
(0.038)
3.869 10
0.995
0.994
σ2 υˆ
2
−6
−6
/κˆ
pˆ 22
0.842***
2
(0.453) −6
(0.000) (0.090)
0.993
Notes: Autocorrelation and heteroskedasticity consistent standard errors as proposed by Newey and West (1987) are provided in brackets. OLS refers to the basic ordinary least squares estimates while Markov norm, Markov t norm and Markov GED stand for the Markov switching models with normal, t-/normal and generalized error distribution, respectively. ***, **, and * denote statistical significance at the one percent, five percent and 10 percent level, respectively.
10 • THE CENTRE FOR INTERNATIONAL GOVERNANCE INNOVATION
A Markov switching Approach to Herding The high-volatility state initially prevails from the beginning of our sample in 2001, that is the bursting of the tech bubble, the time around 9/11 as well as the start of wars in Afghanistan and Iraq. Shortly thereafter, in mid-August 2003, a switch into the calmer regime takes place. This switch coincides with a strong US economy. The second regime then ends in mid-2007 when, around August 6, a switch into the high volatility state takes place, a couple of days before central banks around the world started to intervene in order to stabilize the money market at the onset of the financial crisis. Subsequently, the process switches several times between both regimes, consistent with uncertainty about the existence of a grave crisis prevailing among market participants during this period. Again, we also see this reflected in the behaviour of the VIX. On September 2, 2008, a switch into the highvolatility state is indicated, a couple of days before Lehman Brothers released news about severe losses for the first time. In mid-August 2009, the process then moves back into the calmer regime and remains there until the end of the sample at the end of 2010.7 Applying the procedures described in the section “Markov Switching Herding Measures” to allow for fat-tailed distributions produces virtually unchanged inferences about regimes, the smoothed probabilities and very similar parameter estimates compared to the assumption of normality. The only substantial difference found is ˆ that ζj is significantly different from zero in both states. Thus, adverse herding is significantly stronger during volatile periods, but remains significant in more tranquil market phases. The estimates for the model allowing for both t and normally distributed regimes indicate that, during the high volatility state, the errors follow a t distribution while they are well characterized by a normal distribution during calm periods.8 Taken together, these findings highlight that in the first state the dispersion of returns around the market and, thus,
7 Robustness checks using the sample periods 1999–2010 and 2003– 2010 broadly confirm the findings. The 12-year period actually reveals that already in 1999 and 2000, the process is in the high volatility state. This supports the interpretation of the first regime as being closely related to periods of strong market movements rather than only strong downward movements or crises periods. Moreover, the results using different lag lengths in calculating the Newey and West (1987) covariance matrix broadly confirm the results. 8 The use of the GED reveals that the calm period regime displays tails being even thinner than those of a normal distribution. Filtered and smoothed probabilities for the non-normal models are available upon request.
investor sentiment, is not only relatively volatile but also subject to large shocks. The results for the asymmetric specification given in equation (5) suggest that herding in US stocks does not differ as much between market upturns and downswings ˆ asy since the t values for ζj are far from being statistically significant. This is shown in Table 3. By contrast, the use of the MSGARCH model is corroborated by the data since strong volatility clustering and ARCH effects are found. In ˆ addition, the values of υˆ j and βj are found to be significantly different from the values implied by a normal distribution. The smoothed probabilities for the MSGARCH(1, 1) specification with skewed t distribution, given in Figure 2, are even more clear-cut than those for the above models with constant variances. Again, we find adverse herding to be much stronger when the high volatility state prevails.9 This paper now turns to the approach with time-varying transition probabilities, pjj,t, which are made conditional on the VIX, the relative turnover and the principal components of the yield curve. The latter is extracted from the sample of US zero rates. In what follows, only the first two components are of interest for the model since they are associated with eigenvalues greater than zero. Further interpretations, of course, depend on the loadings of these components.10 They were found to be perfectly in line with the patterns known in the literature as shift and twist (Litterman and Scheinkman 1988; Knez, Litterman and Scheinkman 1994). This means that the first component has very evenly distributed loadings and stands for a shift in the overall interest rate level. By contrast, the second one is characterized by loadings, which monotonically decrease with a change in the slope of the term structure, in particular a rise in short-term interest rates, which is not accompanied by a proportional rise in long-term rates or vice versa. So, starting from a normal yield curve, a rise in this factor comes along with a flattening of the term structure. 9 The smoothed probabilities for the MSGARCH(1, 1) with normal errors closely resemble those for the homoskedastic models and are available upon request, as are the parameter estimates for both MSGARCH models and the asymmetric specification. The results for the sample that is adjusted for minor stocks, ETFs, etc. are similar in spirit. The main difference is that the coefficients ζj are larger in absolute values. This suggests that adverse herding is stronger in the stocks of large and transparent firms. 10 The components are linear combinations of the underlying zero bond rates and the respective parameters are referred to as loadings. There are always as many components as there are different zero rates.
Martin T. Bohl, Arne C. Klein and Pierre L. Siklos • 11
CIGI Papers no. 21 — December 2013 Table 3: Estimation Results for the Asymmetric Herding Model
Table 4: Parameters of the Time-varying Transition Probabilities
Markov_norm_asy
St = 1
Coeff.
Std. Error
St = 1 γˆ1
0.024***
(0.001)
γˆ 1
8.643 10−4
(5.953 10−4)
ˆ
0.454***
(0.056)
δ1
-0.176***
(0.060)
ζ1
ˆ
1.248
(0.918)
ˆ asy
ζ1
0.553
ˆ2
σ1
3.411 10
pˆ
0.991
asy
δ1 ˆ asy
11
(0.798) −5
(1.626 10−4)
γˆ 2
3.523 10−4**
(1.750 10−4)
ˆ
0.275***
(0.023)
δ2
-0.057*
(0.034)
ζ2
1.316*
(0.772)
ζˆ2asy
-0.308
(1.175)
σˆ 22
3.991 10−6
pˆ 22
0.995
δ2 asy
ˆ
Figure 2: Smoothed Probabilities for the Model Figure 2: Smoothed probabilities1) forwith the model MSGARCH(1, 1) model withand skewed MSGARCH(1, Skewed t Distribution
1
90
0.9
80
0.8
70
0.7
60
0.6
50
0.5
40
0.4
30
0.3 0.2
20
0.1
10 0 Jul‐10
Jul‐09
Jan‐10
Jul‐08
Jan‐09
Jul‐07
Jan‐08
Jul‐06
Jan‐07
Jul‐05
Jan‐06
Jul‐04
Jan‐05
Jul‐03
Jan‐04
Jul‐02
Jan‐03
0 Jul‐01
-0.384
PC2
-2.603
0.361
VIX
0.225
-0.179
-124.680
-176.966
Using the principal components and the other two covariates for (4), the logit specification for pjj,t, the estimates for the parameters in (3) and the variances are found to be very close to those for the model with constant probabilities and a normal distribution. For this reason, ˆ Table 4 only reports θj, the parameters estimates for (4), the specification governing the transition probabilities.12 Those parameters estimates, which differ with respect to the signs between the states, are of particular interest since a change in a given exogenous variable always increases the probability for one regime while decreasing the one for the other regime, irrespective of the level of the covariates (see specification [4]). Put differently, high values of such a variable could be linked to one state while
t distribution and implied volatility Implied Volatility
Jan‐02
10.414
-1.651
This paper now turns to the approach with time-varying transition probabilities, pjj,t, which are made conditional on the VIX, the relative turnover and the principal components of the yield curve. The latter is extracted from the sample of US zero rates. In what follows, only the first two components are of interest for the model since they are associated with eigenvalues greater than zero. Further interpretations, of course, depend on the loadings of these components.11 They were found to be perfectly in line with the patterns known in the literature as shift and twist (Litterman and Scheinkman 1988; Knez, Litterman and Scheinkman 1994). This means that the first component has very evenly distributed loadings and stands for a shift in the overall interest rate level. By contrast, the second one is characterized by loadings, which monotonically decrease with a change in the slope of the term structure, in particular, a rise in short-term interest rates, which is not accompanied by a proportional rise in long-term rates or vice versa. So, starting from a normal yield curve, a rise in this factor comes along with a flattening of the term structure.
25 Notes: Autocorrelation and heteroskedasticity consistent standard errors as proposed by Newey and West (1987) are provided in brackets. Markov_t_norm_asy refers to the Markov switching models with normally distributed errors given in equation (5). ***, **, and * denote statistical significance at the one percent, five percent and 10 percent level, respectively.
Jan‐01
-0.003
PC1
Notes: PC1 and PC2 refer to the first and second principle component of the yield curve while VIX and TURN stand for the Chicago Board Options Exchange Market Volatility Index and the relative volume measure defined by turnover in US$ divided by market capitalization times 1000.
0.015***
asy
Constant
TURN
St = 2 γˆ 2
St = 2
Notes: VIX line) (dotted line)probabilities and smoothed probabilities calculated Notes: VIX (dotted and smoothed calculated as proposed by Gray (1996a). The
as on theThe left-hand axisNBER while thedates. right-hand side displays the VIX. The VIX. shaded areasside represent recession shaded areas represent NBER recession dates. proposed by Gray (1996a). The smoothed are plotted smoothed probabilities are plotted on the left-hand side axis while probabilities the right-hand side displays the
12 • THE CENTRE FOR INTERNATIONAL GOVERNANCE INNOVATION
11 The components are linear combinations of the underlying zero bond rates and the respective parameters are referred to as loadings. There are always as many components as there are different zero rates. 12 It must be borne in mind that there are no standard errors for the parameters of the transition probabilities within the EM framework.
A Markov switching Approach to Herding below-average values would always be consistent with the other regime, independent of the behaviour of other exogenous variables. While the parameters for the first principal component representing the level as well as the turnover measure have a negative sign for both states, this is not the case for the second component and the VIX. The interpretation of the latter is straightforward, namely that state 1 is associated with a positive coefficient. Hence, an increase in the implied volatility makes it more likely to switch into or remain in regime 1 while the reverse holds for regime 2. More interestingly, the coefficient for the second principal component, which stands for a flattening of the term structure, is negative for the first state and positive for the second one. At first glance, this is counterintuitive since a flat (or even inverse) term structure is, in general, associated with a contracting economy. This should presage a switch into the high volatility. However, when the model is re-estimated employing two quarter lagged principal components, the sign turns negative for both states and, in particular, more negative for the second, the tranquil state.
CONCLUSIONS This paper models the time-varying herd behaviour reflected in movements in stock market returns. The authors argue that extant empirical treatments of herding behaviour suffer from several drawbacks. Models that assume constant herding dynamics are economically implausible since the literature links herding to investor sentiment, which by definition, is time varying. Unit root tests corroborate this view since one is only able to reject the unit root null unambiguously when the process is allowed to switch between two distinct regimes. Moreover, existing models of time-varying herding require data at monthly or lower sampling frequencies and, thus, cannot be used to generate evidence about investors’ short-term behaviour. The procedure proposed by Christie and Huang (1995) and Chang, Cheng and Khorana (2000) is relied on to examine investors’ herd behaviour. However, their approach is adapted by fitting a Markov switching model to allow for different dynamics between high and low volatility regimes. In addition, the time-varying transition probabilities proposed by Diebold, Lee and Weinbach (1994) are used to reveal factors explaining changes in herd behaviour over time, driven by proxies for macroeconomic conditions and investor sentiment. The authors’ models are estimated for US stock market data, thereby controlling for non-normalities, autocorrelation and GARCH effects. The findings suggest that during times of high volatility in the market, investors discriminate more strongly between single stocks than during tranquil times and more strongly than implied by rational asset pricing models.
WORKS CITED Avery, C. and P. Zemsky. 1998. “Multidimensional Uncertainty and Herd Behavior in Financial Markets.” American Economic Review 88 (4): 724–48. Baker, M. and J. Stein. 2004. “Market Liquidity as a Sentiment Indicator.” Journal of Financial Markets 7 (3): 271–99. Baker, M. and J. Wurgler. 2006. “Investor Sentiment and the Cross-section of Stock Returns.” The Journal of Finance 61 (4): 1645–80. Banerjee, A. 1992. “A Simple Model of Herd Behavior.” The Quarterly Journal of Economics 107 (3): 797–817. Bikhchandani, S., D. Hirshleifer and I. Welch. 1992. “A Theory of Fads, Fashion, Custom and Cultural Change as Informational Cascades.” Journal of Political Economy 100 (51): 992–1026. Bikhchandani, S. and S. Sharma. 2001. “Herd Behavior in Financial Markets.” IMF Staff Papers 47 (3): 279– 310. Bohl, M. T., B. Gebka, and C. Goodfellow. 2009. “Together We Invest? Individual and Institutional Investors’ Trading Behaviour in Poland.” International Review of Financial Analysis 18 (4): 212–21. Chang, E. C., J. W. Cheng and A. Khorana. 2000. “An Examination of Herd Behavior in Equity Markets: An International Perspective.” Journal of Banking & Finance 24 (10): 1651–79. Chiang, T. and D. Zheng 2010. “ An Empirical Analysis of Herd Behavior in Global Stock Markets.” Journal of Banking & Finance 34 (8): 1911–21. Christie, W. G. and R. D. Huang. 1995. “ Following the Pied Piper: Do Individual Returns Herd around the Market?” Financial Analysts Journal 51 (4): 31–37. De Long, B., A. Shleifer, L. Summers and R. Waldmann. 1990. “ Positive Feedback Investment Strategies and Destabilizing Rational Speculation.” The Journal of Finance 45 (2): 379–95. ———. 1991. “The Survival of Noise Traders in Financial Markets.” The Journal of Business 64 (1): 1–19. Dempster, A. P., N. M. Laird and D. B. Rubin. 1977. “ Maximum Likelihood from Incomplete Data via the EM Algorithm.” Journal of the Royal Statistical Society. Series B (Methodological) 39 (1): 1–38. Devenow, A. and I. Welch. 1996. “Rational Herding in Financial Economics.” European Economic Review 40: 603–15.
Martin T. Bohl, Arne C. Klein and Pierre L. Siklos • 13
CIGI Papers no. 21 — December 2013 Diebold, F. X., J. H. Lee, and G. C. Weinbach. 1994. “ Regime Switching with Time-varying Transition Probabilities.” In Nonstationary Time Series Analysis and Cointegration, edited by Colin P. Hargreaves, 283–302. Oxford: Oxford University Press. Estrella, A. and G. A. Hardouvelis. 1991. “ The Term Structure as a Predictor of Real Economic Activity.” The Journal of Finance 46 (2): 555–76. Estrella, A. and F. S. Mishkin. 1997. “ The Predictive Power of the Term Structure of Interest Rates in Europe and the United States: Implications for the European Central Bank.” European Economic Review 41 (7): 1375–1401. ——— 1998. “Predicting U.S. Recessions: Financial Variables as Leading Indicators.” The Review of Economics and Statistics 80 (1): 45–61. Fernandez, C. and M. F. J. Steel 1998. “ On Bayesian Modeling of Fat Tails and Skewness.” Journal of the American Statistical Association 93 (441): 359–71. Froot, K. A., D. S. Scharfstein and J. C. Stein. 1992. “ Herd on the Street: Informational Inefficiencies in a Market with Short-term Speculation.” T h e Journal of Finance 47 (4): 1461–84. Gleason, K. C., I. Mathur and M. A. Peterson. 2004. “Analysis of Intraday Herding Behavior among the Sector ETFs.” Journal of Empirical Finance 11 (5): 681–94. Graham, J. R. 1999. “Herding among Investment Newsletters: Theory and Evidence.” The Journal of Finance 54 (1): 237–68. Gray, S. F. 1996a. “An Analysis of Conditional Regimeswitching Models.” Working paper. ——— 1996b. “Modeling the Conditional Distribution of Interest Rates as a Regime- switching Process. Journal of Financial Economics 42: 27–62. Grinblatt, M., S. Titman and R. Wermers. 1995. “Momentum Investment Strategies, Portfolio Performance, and Herding: A Study of Mutual Fund Behavior.” The American Economic Review 85 (5): 1088–1105. Hall, S. G., Z. Psadarakis and M. Sola. 1999. “Detecting Periodically Collapsing Bubbles: A Markov-wwitching Unit Root Test.” Journal of Applied Econometrics 14 (2): 143–54. Hall, S. G. and M. Sola. 1994. “Rational Bubbles during Poland’s Hyperinflation: Implications and Empirical Evidence.” European Economic Review 38 (6): 1257–1276.
14 • THE CENTRE FOR INTERNATIONAL GOVERNANCE INNOVATION
Hamilton, J. D. 1990. “Analysis of Time Series Subject to Changes in Regime.” Journal of Econometrics, 45 (1-2): 39–70. Hansen, B. E. 1994. “Autoregressive Conditional Density Estimation.” International Economic Review 35 (3): 705– 30. Hwang, S. and M. Salmon. 2004. “Market Stress and Herding.” Journal of Empirical Finance 11(4): 585–616. ——— (2009). “Sentiment and Beta Herding.” Working paper. Jeanne, O. and A. K. Rose. 2002. “Noise Trading and Exchange Rate Regimes.” Quarterly Journal of Economics 117 (2): 537–69. Jondeau, E. and M. Rockinger. 2003. “Conditional Volatility, Skewness, and Kurtosis: Existence, Persistence, and Comovements.” Journal of Economic Dynamics & Control, 27 (10) :1699–1737. Jones, C. 2002. “A Century of Stock Market Liquidity and Trading Costs.” Working paper. Keim, D. B. and A. Madhavan. 1995. “Anatomy of the Trading Process: Empirical Evidence on the Behavior of Institutional Traders.” Journal of Financial Economics 37 (3): 371–398. Kim, C. J. 1994. “Dynamic Linear Models with MarkovSwitching.” Journal of Econometrics 60 (1-2): 1–22. Knez, P. J., R. Litterman and J. Scheinkman. 1994. “Explorations Into Factors Explaining Money Market Returns.” The Journal of Finance 49 (5): 1861–1882. Lakonishok, J., A. Shleifer and R. W. Vishny. 1992. “The Impact of Institutional Trading on Stock Prices.” Journal of Financial Economics 32 (1): 23–43. Lee, C. M., A. Shleifer and R. H. Thaler. 1991. “Investor Sentiment and the Closed- end Fund Puzzle.” The Journal of Finance 46 (1): 75–109. Litterman, R. and J. Scheinkman. 1991. “Common Factors Affecting Bond Returns.”Journal of Fixed Income 1: 54– 61. Newey, W. K. and K. D. West. 1987. “A Simple, Positive Semi-definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix.” Econometrica 55 (3): 703–08. ——— 1994. “Automatic Lag Selection in Covariance Matrix Estimation.” The Review of Economic Studies 61 (4): 631–53.
A Markov switching Approach to Herding Scharfstein, D. S. and J. C. Stein. 1990. “Herd Behavior and Investment.” The American Economic Review 80 (3): 465–79. Shiller, R. J., S. Fisher and B. M. Friedman. 1984. “Stock Prices and Social Dynamics.” Brookings Papers on Economic Activity 2: 457–510. Tan, L., T. Chiang, J. R. Mason and E. Nelling. 2008. “Herding Behavior in Chinese Stock Markets: An Examination of A and B Shares.” Pacific-Basin Finance Journal 16 (1-2): 61–77. Welch, I. 1992. “Sequential Sales, Learning, and Cascades.” The Journal of Finance 47 (2): 695–732.
Martin T. Bohl, Arne C. Klein and Pierre L. Siklos • 15
ABOUT CIGI The Centre for International Governance Innovation is an independent, non-partisan think tank on international governance. Led by experienced practitioners and distinguished academics, CIGI supports research, forms networks, advances policy debate and generates ideas for multilateral governance improvements. Conducting an active agenda of research, events and publications, CIGI’s interdisciplinary work includes collaboration with policy, business and academic communities around the world. CIGI’s current research programs focus on four themes: the global economy; global security; the environment and energy; and global development. CIGI was founded in 2001 by Jim Balsillie, then co-CEO of Research In Motion (BlackBerry), and collaborates with and gratefully acknowledges support from a number of strategic partners, in particular the Government of Canada and the Government of Ontario. Le CIGI a été fondé en 2001 par Jim Balsillie, qui était alors co-chef de la direction de Research In Motion (BlackBerry). Il collabore avec de nombreux partenaires stratégiques et exprime sa reconnaissance du soutien reçu de ceux-ci, notamment de l’appui reçu du gouvernement du Canada et de celui du gouvernement de l’Ontario. For more information, please visit www.cigionline.org.
CIGI MASTHEAD Managing Editor, Publications
Carol Bonnett
Publications Editor
Jennifer Goyder
Publications Editor
Sonya Zikic
Assistant Publications Editor
Vivian Moser
Media Designer
Steve Cross
EXECUTIVE President
Rohinton Medhora
Vice President of Programs
David Dewitt
Vice President of Public Affairs
Fred Kuntz
Vice President of Finance
Mark Menard
COMMUNICATIONS Communications Specialist
Declan Kelly
Public Affairs Coordinator
Erin Baxter
dkelly@cigionline.org (1 519 885 2444 x 7356) ebaxter@cigionline.org (1 519 885 2444 x 7265)
CIGI PUBLICATIONS
ADVANCING POLICY IDEAS AND DEBATE
Essays on International Finance Volume 1: October 2013
International Cooperation and Central Banks Harold James
Off Balance: The Travails of Institutions That Govern the Global Financial System Paul Blustein
CIGI Essays on International Finance — Volume 1: International Cooperation and Central Banks Harold James
Paperback: $28.00; eBook: $14.00
The CIGI Essays on International Finance aim to promote and disseminate new scholarly and policy views about international monetary and financial issues from internationally recognized academics and experts. The essays are intended to foster multidisciplinary approaches by focussing on the interactions between international finance, global economic governance and public policy. The inaugural volume in the series, written by Harold James, discusses the purposes and functions of central banks, how they have changed dramatically over the years and the importance of central bank cooperation in dealing with international crises.
The latest book from award-winning journalist and author, Paul Blustein, is a detailed account of the failings of international institutions in the global financial crisis. Based on interviews with scores of policy makers and on thousands of pages of confidential documents that have never been previously disclosed, the book focusses mainly on the International Monetary Fund and the Financial Stability Forum in the run-up to and early months of the crisis. Blustein exposes serious weaknesses in these and other institutions, which lead to sobering conclusions about the governability of the global economy.
ˆ
CIGI PaPers
no. 18 — May 2013
Short-Selling banS and inStitutional inveStorS’ herding behaviour: evidence from the global financial criSiS
Short-selling Bans and Institutional Investors’ Herding Behaviour: Evidence from the Global Financial Crisis CIGI Paper No. 18
CIGI PaPers
Martin T. Bohl, Arne C. Klein and Pierre L. Siklos
MartIn t. Bohl, arne C. KleIn and PIerre l. sIKlos
no. 15 — aPrIl 2013
Are Short SellerS PoSitive FeedbAck trAderS? evidence From the GlobAl FinAnciAl criSiS
Are Short Sellers Positive Feedback Traders? Evidence from the Global Financial Crisis CIGI Paper No. 15 Martin T. Bohl, Arne C. Klein and Pierre L. Siklos
MartIn t. Bohl, arne C. KleIn and PIerre l. sIKlos
During the recent global financial crisis, regulatory authorities in a number of countries imposed short-sale constraints aimed at preventing excessive stock market declines. This paper focusses, in particular, on short-sale constraints’ effect on institutional investors’ trading behaviour and the possibility of generating herding behaviour. The authors conclude that the empirical evidence shows that short-selling restrictions exhibit either no influence on herding formation or induce adverse herding.
This paper examines bans on selected financial stocks in six countries during the 2008-2009 global financial crisis. These provided a setting to analyze the impact of short-sale restrictions on feedback trading. The findings suggest that, in the majority of markets examined, restrictions of this kind amplify positive feedback trading during periods of high volatility and, hence, contribute to stock market downturns. On balance, therefore, shortselling bans do not contribute to enhancing financial stability.