Decision Ryanair/Aer Lingus
Quantitative Analysis
Lorenzo Ciari, consultant
Plan of the talk
Introduction: the use of panel data
Cross-section regression vs panel data
Fixed effect model vs random effect model
Cross-section regressions in the Ryanair/Aer Lingus case
Objections against the cross-section approach
Panel data analysis in the Ryanair/Aer Lingus case
Introduction: the use of panel data
Suppose we want to estimate the following linear approximation to a production function:
m_i is managerial ability, which crucially is unobservable.
Suppose we have information only on a cross section of firms for a given t, so that we can only estimate a regression of y on l in which managerial ability is absorbed into the error term.
Given the equations above we have
Suppose that we have:
Then,
If we estimate using OLS y on l, we obtain
But b2 is a biased estimate of the causal effect of l on y, given that managerial ability, which is omitted from the regression, is correlated with l.
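The omitted-variable argument can be written out as follows. This is a sketch in assumed notation: the coefficients β0, β1, β2, the auxiliary relation between m_i and l_i (γ0, γ1, v_i) and the zero-conditional-mean conditions are illustrative choices. Only y, l, m_i and the OLS slope b2 are named in the text above.

```latex
\begin{align*}
y_{it} &= \beta_0 + \beta_1 l_{it} + \beta_2 m_i + u_{it}
  && \text{(production function, assumed form)} \\
y_i &= \beta_0 + \beta_1 l_i + \varepsilon_i, \qquad \varepsilon_i = \beta_2 m_i + u_i
  && \text{(cross section at a given } t\text{)} \\
m_i &= \gamma_0 + \gamma_1 l_i + v_i, \qquad E[v_i \mid l_i] = E[u_i \mid l_i] = 0
  && \text{(ability related to } l\text{)} \\
y_i &= (\beta_0 + \beta_2 \gamma_0) + (\beta_1 + \beta_2 \gamma_1)\, l_i + (u_i + \beta_2 v_i) \\
E[b_2] &= \beta_1 + \beta_2 \gamma_1 \;\neq\; \beta_1 \quad \text{unless } \beta_2 \gamma_1 = 0
\end{align*}
```

Under these assumptions the bias term β2γ1 vanishes only if managerial ability does not affect output or is unrelated to l.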
Panel data can solve the problem, as long as the unobservable, managerial ability, can be assumed to be constant over time.
General framework
Consider the following model:
α_i is a time-invariant individual effect. It measures the effect of all the factors that are specific to individual i and constant over time. Basically, the idea is that with panel data you can control for all unobserved heterogeneity that is constant through time.
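A model of this kind is usually written as below. The symbols y_it, x_it and u_it are assumed notation, not taken from the original; only α_i and β appear in the text.

```latex
y_{it} = \alpha_i + x_{it}'\beta + u_{it},
\qquad i = 1, \dots, N, \quad t = 1, \dots, T
```

Here α_i shifts the intercept for individual i in every period, while β is common to all individuals.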
Fixed effect estimators (LSDV)
A first way to proceed is to estimate by OLS a model that includes a dummy variable for each individual in the sample.
The OLS estimator applied to this model gives unbiased estimates of the parameters of interest, β.
Disadvantage of the procedure: it can be computationally infeasible. If N is too large, the LSDV estimator is not feasible and we need a trick.
The intuition is the following: given the model written in matrix form, with regressors X and the matrix of individual dummies D, partitioned regression results tell us that an unbiased estimate of β can be obtained by:
Regressing Y on D to get the residuals Y*
Regressing X on D to get the residuals X*
Regressing Y* on X* to obtain an estimate of β
It can be shown that in the panel setup, given the way in which the matrix D is constructed (D is a matrix of individual-specific dummies):
The elements of Y* are the deviations of each element of Y from the corresponding individual-specific mean
The elements of X* are the deviations of each element of X from the corresponding individual-specific mean
The estimators presented (the LSDV and the within estimator) are numerically equivalent.
Notice that this model exploits only the within variation to get the estimates.
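The equivalence of the two estimators, and the bias of ignoring the individual effect, can be checked on simulated data. The sketch below is illustrative only: the data-generating process (a single regressor, true β = 0.5, an individual effect correlated with the regressor) and all function names are assumptions made for the example.

```python
import numpy as np

def simulate_panel(n_units=50, n_periods=6, beta=0.5, seed=0):
    """Simulate y_it = alpha_i + beta * x_it + u_it with x correlated with alpha_i."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n_units)                   # unobserved individual effect
    ids = np.repeat(np.arange(n_units), n_periods)     # unit index of each observation
    x = 0.8 * alpha[ids] + rng.normal(size=ids.size)   # regressor correlated with alpha_i
    y = alpha[ids] + beta * x + rng.normal(scale=0.5, size=ids.size)
    return ids, x, y

def pooled_ols(x, y):
    """OLS of y on a constant and x, ignoring the individual effects."""
    X = np.column_stack([np.ones_like(x), x])
    return float(np.linalg.lstsq(X, y, rcond=None)[0][1])

def lsdv(ids, x, y):
    """OLS of y on one dummy per individual plus x (least squares dummy variables)."""
    n_units = ids.max() + 1
    D = (ids[:, None] == np.arange(n_units)[None, :]).astype(float)
    Z = np.column_stack([D, x])
    return float(np.linalg.lstsq(Z, y, rcond=None)[0][-1])

def group_demean(ids, v):
    """Deviations of v from the individual-specific means (residuals of v on D)."""
    means = np.bincount(ids, weights=v) / np.bincount(ids)
    return v - means[ids]

def within(ids, x, y):
    """Regress demeaned y on demeaned x."""
    xs, ys = group_demean(ids, x), group_demean(ids, y)
    return float(xs @ ys / (xs @ xs))

ids, x, y = simulate_panel()
b_pooled, b_lsdv, b_within = pooled_ols(x, y), lsdv(ids, x, y), within(ids, x, y)
print(f"pooled OLS: {b_pooled:.3f}  LSDV: {b_lsdv:.3f}  within: {b_within:.3f}")
```

The within version never builds the N dummy columns, which is the computational point of the trick.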
At the opposite extreme from the fixed effect (within) analysis, the basic model can be transformed to fully exploit the variability between individuals, ignoring the variability within. If the basic model is correct, then the same relationship must also hold for the individual-specific means over time.
The individual effects are now included in the error term, and therefore we have to assume that the individual-specific effects are uncorrelated with the explanatory factors.
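In assumed notation (y_it, x_it, u_it and the common intercept α are not taken from the original), the averaged equation behind the between estimator can be sketched as:

```latex
\bar{y}_i = \alpha + \bar{x}_i'\beta + (\alpha_i - \alpha + \bar{u}_i),
\qquad
\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}, \quad
\bar{x}_i = \frac{1}{T}\sum_{t=1}^{T} x_{it}, \quad
\bar{u}_i = \frac{1}{T}\sum_{t=1}^{T} u_{it}
```

The term in parentheses is the error of the averaged regression, which is why α_i must be uncorrelated with the regressors for OLS on the means to be unbiased.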
Under this assumption, OLS applied to the equation above gives an unbiased and consistent estimate of β. The “between” estimator will not be efficient, as it ignores the information given by the within variability.
In order to have an estimator that efficiently exploits both the within and the between variation (and that allows us to estimate the effect of time-invariant factors), we need to make strong assumptions. Start from the basic model.
We have to do three different things:
Abandon the assumption that the individual effects are fixed and estimable, and assume instead that they measure our individual-specific ignorance, which should be treated similarly to our general ignorance
Assume that the composite error term is uncorrelated with the regressors
Specify carefully the covariance structure of the two types of ignorance
Assumptions we make for the estimation of the random effect (RE) model:
In terms of the composite error term, these assumptions imply:
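A common textbook statement of these assumptions is sketched below. The variance symbols σ_α² and σ_u², the common intercept μ and the composite error ε_it are assumed notation, and the original may have stated the assumptions differently.

```latex
\begin{align*}
y_{it} &= \mu + x_{it}'\beta + \varepsilon_{it}, \qquad \varepsilon_{it} = \alpha_i + u_{it} \\
E[\alpha_i \mid X] &= 0, \quad E[u_{it} \mid X] = 0, \quad E[\alpha_i u_{jt}] = 0 \ \text{for all } i, j, t \\
E[\alpha_i^2] &= \sigma_\alpha^2, \quad E[\alpha_i \alpha_j] = 0 \ (i \neq j), \quad
E[u_{it}^2] = \sigma_u^2, \quad E[u_{it} u_{js}] = 0 \ (i \neq j \text{ or } t \neq s) \\[4pt]
E[\varepsilon_{it}^2] &= \sigma_\alpha^2 + \sigma_u^2, \qquad
E[\varepsilon_{it}\varepsilon_{is}] = \sigma_\alpha^2 \ (t \neq s), \qquad
E[\varepsilon_{it}\varepsilon_{js}] = 0 \ (i \neq j)
\end{align*}
```

The last line shows that errors of the same individual are correlated across periods, which is why GLS rather than OLS is the efficient estimator.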
It can be shown that the random effect estimator, which is the GLS estimator of the previously presented equation, is a weighted average of the within and between estimators.
So, the random effect estimator seems preferable because it uses the between and the within information efficiently, allowing the estimation of the effects of time-invariant factors.
However, the random effect estimator is consistent only when the individual-specific effects are not correlated with the explanatory factors. This assumption is hard to find convincing, and in any case it should be tested.
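The weighted-average property can be illustrated numerically. The sketch below assumes a balanced panel, a single regressor and known variance components, so that GLS reduces to OLS on quasi-demeaned data with θ = 1 − sqrt(σ_u² / (σ_u² + T σ_α²)). A feasible estimator would have to estimate the variances first. The simulated data and all names are assumptions made for the example.

```python
import numpy as np

def unit_means(ids, v):
    """Individual-specific means, repeated for every observation of the unit."""
    return (np.bincount(ids, weights=v) / np.bincount(ids))[ids]

def slope(x, y):
    """OLS slope of y on x with an intercept."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))

def re_slope(ids, x, y, theta):
    """OLS on quasi-demeaned data: theta = 1 is within, theta = 0 is pooled OLS."""
    return slope(x - theta * unit_means(ids, x), y - theta * unit_means(ids, y))

rng = np.random.default_rng(0)
n_units, n_periods, beta = 200, 5, 0.5
sigma_alpha, sigma_u = 1.0, 1.0
ids = np.repeat(np.arange(n_units), n_periods)
alpha = rng.normal(scale=sigma_alpha, size=n_units)            # uncorrelated with x
x = rng.normal(size=n_units)[ids] + rng.normal(size=ids.size)  # between + within variation
y = alpha[ids] + beta * x + rng.normal(scale=sigma_u, size=ids.size)

theta = 1.0 - np.sqrt(sigma_u**2 / (sigma_u**2 + n_periods * sigma_alpha**2))
b_within = re_slope(ids, x, y, 1.0)
b_between = slope(unit_means(ids, x), unit_means(ids, y))
b_re = re_slope(ids, x, y, theta)

# Weight on the within estimator implied by the within and between sums of squares.
x_bar = unit_means(ids, x)
w_xx = np.sum((x - x_bar) ** 2)
b_xx = np.sum((x_bar - x.mean()) ** 2)
w = w_xx / (w_xx + (1.0 - theta) ** 2 * b_xx)
print(b_within, b_between, b_re, w * b_within + (1 - w) * b_between)
```

In this simulation the individual effect is generated independently of x, so all three estimators are consistent. If α_i were correlated with x, only the within estimate would remain reliable.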
The Ryanair/Aer Lingus merger
Hypothesis I
I-A Ryanair’s presence is associated with a statistically and economically significant reduction in Aer Lingus fares in the various short-haul routes where they overlap.
I-B Conversely, Aer Lingus presence is associated with a statistically and economically significant reduction in Ryanair's fares.
Hypothesis II
II-A Ryanair exerts a stronger competitive constraint on Aer Lingus’ fares than any other actual or potential competitor does.
II-B Aer Lingus exerts a stronger competitive constraint on Ryanair’s fares than any other actual or potential competitor does.
Hypothesis III
The existence of an actual or potential competitor operating from a base at the destination airport on a route originating in Dublin has a limited impact on the merging parties' prices.
Hypothesis IV
IV-A The stronger the presence of Ryanair in the route, the more pronounced the effect on Aer Lingus fares.
IV-B The stronger the presence of Aer Lingus in the route, the more pronounced the effect on Ryanair’s fares.
Ryanair/Aer Lingus
The parties provided data on:
Their own fares and costs for each of the routes they operate out of Dublin (fares data).
The competitive framework in the relevant routes (carriers data).
Route-specific characteristics (route data).
Further, the Commission requested the DAA to provide information regarding the merging parties’ competitors on all relevant routes out of Ireland.
Each dataset contains information for each of these airports covering the period as of January 2000 until December 2006.
All the data available at the airport level has been aggregated for the relevant catchment area at the destination city.
The final dataset contains information for 81 different airports for a total number of 5427 observations between January 1996 and December 2006. The data, however, is complete for most relevant variables only as of January 2002 for passengers flying to each airport for each carrier. A carrier was assumed to have a strong presence at the destination airport where it operated more than 200 flights a week during the month in question. The Commission has chosen January 2002 as the starting point of analysis.
Following the methodology pursued by CRA and RBB, a similar dataset has been constructed on an airport-pair basis. Where relevant, the econometric analysis performed on the market definition dataset has been replicated in the airport-pairs dataset to assess the sensitivity of the results to the market definition.
In order to test the four hypotheses set out above, the merging parties and the Commission have all considered a reduced-form specification. The basic idea is to regress some measure of airline fares on a vector of firm and route characteristics.
The Commission’s (merged) panel data tracks the prices set by both Ryanair and Aer Lingus on individual routes over time:
Average monthly fares net of airport charges.
Total net revenue on a passenger basis.
Two different empirical strategies are used to assess the extent to which the merging firms exert a competitive constraint on each other (holding constant other factors such as competition from other airlines):
Cross-section regression analysis, which examines differences in prices across a number of affected routes at a point in time.
Fixed-effects regression analysis with panel data, which exploits the variation in market structure at individual routes over time.
The disadvantage of using a cross-section approach is that it may not be possible to control for important but unobserved or unmeasured influences on price that vary from route to route. For example, prices may be higher on monopoly routes not because there is no competition but because on these particular routes demand is relatively low or costs are relatively high (e.g. when high entry barriers are correlated with high operating costs).
Fixed effect regressions
The results from cross-section analysis are not robust, for two reasons:
First, the number of independent observations is rather small.
Second, there is no reason to think that possible omitted variable bias can be ignored.
A possible solution to the omitted variable problem in cross-section regressions is to control for sources of route heterogeneity that are likely to affect prices (the type of destination, the popularity of the route according to purpose of travel, destination airport characteristics, etc.).
The problem is that these variables may be unobserved or difficult to measure accurately.
An alternative way to control for differences across routes is to view the unobserved factors affecting fares as consisting of two types: those that are constant and those that vary over time.
The fixed effect approach is suitable when:
the data contains many examples of entry and exit over time;
unobservable influences on prices are time invariant in a given route;
there is little reason to expect inaccuracies in measuring key explanatory variables (measurement error bias can be amplified in a panel setup).
These conditions are all met in the Commission’s view.
The Commission’s empirical strategy focuses on the impact of Ryanair’s “presence” and “strength of presence” on Aer Lingus’ average net monthly fares and vice versa.
The baseline fixed-effects regression is as follows:
Two datasets: city-pairs, airport-pairs. Two specifications: presence, frequency.
The analysis looks only at the impact of Ryanair on Aer Lingus, not vice versa, since there were no significant episodes of entry by Aer Lingus on Ryanair routes.
Presence specification
The presence specification includes dummy variables for the presence of (i) Ryanair, (ii) one or more flag carriers and (iii) one or more non-flag carriers. The baseline regression also includes the log of Aer Lingus’ capacity (seats) as a scale factor and a dummy to capture the impact of the final stage of Aer Lingus’ restructuring, namely the move towards an internet-based sales strategy that was implemented in September 2004.
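Based only on the regressors named above, the presence specification might be sketched as follows. This is a hypothetical rendering and not the Commission's own equation: the variable names, the route index r, the month index t and the use of the log fare as dependent variable are all assumptions.

```latex
\ln\!\big(\mathrm{fare}^{EI}_{rt}\big) = \alpha_r
  + \beta_1\,\mathrm{Ryanair}_{rt}
  + \beta_2\,\mathrm{Flag}_{rt}
  + \beta_3\,\mathrm{NonFlag}_{rt}
  + \gamma\,\ln\!\big(\mathrm{seats}^{EI}_{rt}\big)
  + \delta\,\mathrm{Sept04}_{t}
  + \varepsilon_{rt}
```

Here α_r is the route fixed effect, the three presence variables are 0/1 dummies, and Sept04 equals 1 from September 2004 onwards. The hypotheses about Ryanair's constraint on Aer Lingus concern the sign and size of β1 relative to β2 and β3.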
The coefficient on Ryanair’s presence is negative (-0.077) and significant at the 1% level. No other rival has a negative and statistically significant effect on Aer Lingus’ fares. The Sept’04 dummy is highly significant.
Column 4 in table 9 reports the results of adding two dummy variables to indicate the presence of at least one flag and one non-flag carrier with a base at the destination airport. The presence of a non-flag carrier with a base at destination has a negative and statistically significant effect on Aer Lingus’ prices. However, the impact is economically small (-2.75%), particularly in relation to the impact of Ryanair (-7.6%).
Finally, Aer Lingus' capacity has a significant and positive relationship with average fares (i.e. on average, more capacity is planned on routes with higher expected demand, and therefore higher average fares).
Table 10 reports the results from adding a number of additional controls: first, a direct measure of total costs in the route (ln_EI_costs); second, the log of total frequencies offered by all carriers at the destination airport (ln_dest_freq_total). The latter variable controls for the traffic at the destination airport and is therefore a good proxy for variations in demand. Both variables turn out to be highly significant in all specifications.
As shown in the first column there is little variation in the presence variables. Column (2) introduces the “base at destination” variables. The Commission also explored whether the presence of Ryanair primarily affects Aer Lingus prices in the routes where they fly to the same airport. For this purpose the Commission introduced two dummies for Ryanair. The first takes the value of 1 when it serves the same airport as Aer Lingus in the given route, and zero otherwise. The other takes the value of 1 when it serves a different airport, and zero otherwise. The fourth column includes the “base at destination” dummies whereas the regression in column 3 omits these variables. In both cases the presence of Ryanair remains significant and around 7% on average.
In fact, Aer Lingus appears to be slightly more sensitive to Ryanair’s presence on the route when the latter serves a different airport. The difference does not appear to be important, but in any event these results rebut the argument that when Ryanair serves a different airport it does not compete with Aer Lingus in the relevant city-pair.
Airport-pairs database
As already mentioned, the Commission also tested the presence specification in the airport-pairs database. In this case, the maintained assumption is that if Ryanair serves a different airport in a given city it does not compete at all with Aer Lingus. This assertion has been made by Ryanair. The results of these regressions are presented in table 12 below. Ryanair’s presence leads to approximately a 5% reduction in Aer Lingus’ fares. It is also noteworthy that a non-flag carrier with a base at the destination also appears to have a relevant effect.
Frequency specification
The frequency specification is intended to test the hypothesis that the stronger the presence of one of the merging parties, the more pronounced the effect on the fares of the other. The log of Ryanair’s frequencies has a significant and negative effect on Aer Lingus’ prices. This effect is robust to the inclusion in the regression of base-at-destination dummies, demand controls and even the frequencies HHI. In this set of regressions the frequencies of non-flag carriers also have a negative effect on Aer Lingus’ fares, but one much smaller in magnitude (less than a third) than that of Ryanair. The demand control (ln_dest_freq_total) is highly significant, as in the presence specification. Again, the presence of a non-flag carrier with a base at the destination has a very similar effect to that in the presence specification.
Limitations of the fixed-effects regressions in this case (acknowledged by the Commission)
The fixed-effects procedure is subject to two caveats.
Firstly, it is based on the assumption that entry and exit decisions are exogenous. A second problem is that the frequency variables may also be possibly endogenous, although it seems sensible to assume that airlines set these frequencies at least a few weeks in advance and then optimize their pricing and load factors conditional on the preset frequencies.
In theory, these problems can be addressed by instrumenting the explanatory variables. The Commission has tested a number of candidate instruments. However, they all turned out to have very poor properties.
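How instrumenting works in principle can be shown with a minimal two-stage least squares sketch on synthetic data. Nothing here reproduces the Commission's instruments or data: the variables, the true effect of -0.1 and the instrument z (assumed valid and strong) are all made up for the illustration.

```python
import numpy as np

def ols_slope(x, y):
    """OLS slope of y on x with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    return float(np.linalg.lstsq(X, y, rcond=None)[0][1])

def two_stage_least_squares(y, x, z):
    """2SLS slope for one endogenous regressor x and one instrument z."""
    Z = np.column_stack([np.ones_like(z), z])
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # first stage: fitted values of x
    return ols_slope(x_hat, y)                         # second stage: y on fitted x

rng = np.random.default_rng(1)
n = 5000
z = rng.normal(size=n)                     # instrument: shifts x, unrelated to the shock
shock = rng.normal(size=n)                 # unobserved demand shock moving both x and y
x = z + shock + rng.normal(size=n)         # endogenous regressor (e.g. a rival's frequency)
y = -0.1 * x + shock + rng.normal(size=n)  # outcome (e.g. log fare), true effect -0.1

b_ols = ols_slope(x, y)
b_iv = two_stage_least_squares(y, x, z)
print(f"OLS: {b_ols:.3f}  2SLS: {b_iv:.3f}")
```

With a weak instrument, that is, one only loosely related to x, the 2SLS estimate becomes very imprecise and can be badly biased in finite samples. That is one way in which candidate instruments can have poor properties.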
RBB raises several criticisms of the Commission's analysis. We focus on one in particular:
The Commission’s specifications suffer from omitted variable bias in that they fail to control appropriately for route-specific demand conditions and/or fail to account for endogenously determined fares and frequencies.
RBB makes two distinct, albeit related, arguments. First, RBB argues that the inclusion of Aer Lingus’ capacity (i.e. frequency) as an explanatory variable of Aer Lingus fares in some regressions will lead to endogeneity bias. This is because, according to RBB, capacity is simultaneously (i.e. endogenously) determined with prices. Second, given that "capacity" is not a good proxy for demand (because it is allegedly endogenous), the Commission's regression actually fails to include an adequate demand control. Hence the results are likely to suffer from omitted variable bias.
The Commission agrees that the parties have flexibility to change frequencies. However, flexibility in shifting frequencies does not mean that frequencies are set simultaneously with prices. In general, the levels of frequencies depend on expected levels of demand. Hence fluctuations in frequencies across routes and seasons simply reflect fluctuations in expected demand, for example due to seasonality or anticipated one-time events.
As for the second argument, the Commission included an alternative demand control, namely the log of total frequencies offered by all carriers at the destination airport (ln_dest_freq_total) (see for example table 10). As explained above, this variable controls for the traffic at the destination airport and is therefore a good proxy for variations in demand (anticipated at least at the time that carriers set their frequencies).
This variable turns out to be highly significant in all specifications. More importantly, the coefficients of Ryanair's presence are robust both statistically and economically to the use of this alternative demand control.
What conclusions can be drawn:
Panel data are superior to cross-sectional regressions (indeed, the results presented by RBB based on cross-sectional data were quite different, which shows the significance of the bias itself).
The use of panel data might not be a panacea, as time-variant heterogeneity in demand and cost conditions might still bias the results.
Thank you!
www.planejamento.gov.br/gestao/dialogos dialogos.setoriais@planejamento.gov.br Departamento de Cooperação Internacional Secretaria de Gestão – SEGES
Ministério do Planejamento, Orçamento e Gestão Esplanada, Bloco K, 4° andar (61) 2020- 4906