
at rate $p_t$. Net of the fiscal deficit, the debt stock then evolves as follows:

$$\mathrm{SFA}_t = \left(p_t B_t - p_{t-1} B_{t-1}\right) = \left[p_t \left(B_t - B_{t-1}\right) + \left(p_t - p_{t-1}\right) B_{t-1}\right]. \tag{4A.3}$$
Equation (4A.3) highlights that the SFA can arise for two reasons: first, because of the below-the-line acquisition of liabilities (and assets), holding their valuation constant; and second, because of changes to the valuation of the existing debt stock. Changing valuations can arise, for instance, because of movements in the exchange rate if debt is denominated in a foreign currency or because of changes to interest rates. While not modeled here explicitly, statistical discrepancies can also be responsible for changes to the SFA.33
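As a numerical illustration of equation (4A.3), the sketch below decomposes the change in a valued debt stock into its acquisition and revaluation components; the series p and B are invented for this purpose, and the fiscal deficit is abstracted from.

```python
import numpy as np

# Hypothetical valuation factors (for example, an exchange-rate index) and
# debt stocks at the end of each year; the numbers are purely illustrative.
p = np.array([1.00, 1.05, 1.12, 1.10])
B = np.array([100.0, 110.0, 118.0, 130.0])

# Change in the valued debt stock, as on the left-hand side of (4A.3).
sfa = p[1:] * B[1:] - p[:-1] * B[:-1]

# Decomposition on the right-hand side of (4A.3):
acquisition = p[1:] * (B[1:] - B[:-1])     # below-the-line acquisition of liabilities
revaluation = (p[1:] - p[:-1]) * B[:-1]    # revaluation of the existing debt stock

assert np.allclose(sfa, acquisition + revaluation)
```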
Consistent with the literature (see, for example, Bova et al. 2016), this chapter identifies contingent liability shocks using the SFA because a significant share of contingent liabilities materializes “below the line” (such as the UDAY debt relief scheme in India). “Above-the-line” contingent liability shocks to the fiscal deficit, such as relief expenditures related to natural disasters, are rare in Indian states. Given the prevalence of cash accounting, this definition captures contingent liabilities that do not go through the budget (for instance, bailouts of state-owned enterprises or pension funds when governments take over their debt), but not those that materialize through the payment of subsidies. Similarly, this definition captures the realization of debt guarantees, but not of price guarantees.
To identify unexpected shocks in the SFA, we apply a Kalman filter to the series. Conceptually, the Kalman filter predicts an expected value of the series for the next period given its historical trajectory. Annex 4B provides a detailed description of the statistical methodology. A contingent liability shock in this application is then defined as a data observation that sufficiently exceeds this prediction. More specifically, outliers are defined by standardizing the Kalman filter residuals and classifying any observation that lies more than 1 standard deviation above the mean as a contingent liability shock.
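For concreteness, a minimal sketch of this outlier rule, assuming a vector of Kalman filter residuals is already available (the residual values below are invented):

```python
import numpy as np

def flag_contingent_liability_shocks(residuals, threshold=1.0):
    """Standardize Kalman filter residuals and flag any observation lying
    more than `threshold` standard deviations above the mean."""
    residuals = np.asarray(residuals, dtype=float)
    z = (residuals - residuals.mean()) / residuals.std(ddof=1)
    return z > threshold

# Invented residual series: only the large positive residual is flagged.
resid = [0.2, -0.1, 0.3, 4.5, -0.4, 0.1]
print(flag_contingent_liability_shocks(resid))
```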
Annex 4B. The Kalman Filter

The purpose of filtering is to extract useful information from a signal, removing the noise. The Kalman filter is the best known of these filtering methods. It is a recursive algorithm that estimates unknown variables using imperfect measurements of these variables. In our application, the unknown (state) variable that we are trying to estimate is the underlying level of the stock-flow adjustment (or other public finance series) after we have filtered out the noise from expected expenditures or debt waivers.
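To make the recursion concrete, the sketch below implements the textbook Kalman filter for a simple local-level model, in which a random-walk state is observed with noise. It is an illustration of the predict and update steps only, not the exact specification used in this annex, and the variance parameters sigma_eps2 and sigma_eta2 are placeholders.

```python
import numpy as np

def local_level_kalman(y, sigma_eps2, sigma_eta2, a0=0.0, p0=1e7):
    """Kalman filter for a local-level model:
        y_t = alpha_t + eps_t,         eps_t ~ N(0, sigma_eps2)
        alpha_{t+1} = alpha_t + eta_t, eta_t ~ N(0, sigma_eta2)
    Returns one-step-ahead state predictions, their variances,
    prediction errors, and prediction-error variances."""
    n = len(y)
    a = np.zeros(n + 1)  # predicted state a_{t|t-1}
    P = np.zeros(n + 1)  # predicted state variance P_{t|t-1}
    v = np.zeros(n)      # one-step-ahead prediction errors
    F = np.zeros(n)      # prediction-error variances
    a[0], P[0] = a0, p0  # a large p0 approximates a diffuse initial condition
    for t in range(n):
        v[t] = y[t] - a[t]             # prediction error for period t
        F[t] = P[t] + sigma_eps2       # its variance
        K = P[t] / F[t]                # Kalman gain
        a[t + 1] = a[t] + K * v[t]     # filtered state, carried forward (random walk)
        P[t + 1] = (1.0 - K) * P[t] + sigma_eta2
    return a[:-1], P[:-1], v, F

# Invented example: a noisy random walk; the variance values are placeholders.
rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=50)) + rng.normal(scale=0.5, size=50)
states, variances, v, F = local_level_kalman(y, sigma_eps2=0.25, sigma_eta2=1.0)
```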
In order to estimate this latent variable, we must model how we believe it behaves. Since we are using time-series data, we focus on modeling our series as autoregressive integrated moving average (ARIMA) processes, as they are highly flexible. To select the ARIMA model that best fits our data series for each subnational region, we implement the Hyndman-Khandakar algorithm. This algorithm selects the model that minimizes the Akaike information criterion.
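In practice, this search is available off the shelf; for example, the auto_arima function in the pmdarima package (assumed to be installed) performs a stepwise, Hyndman-Khandakar style search and ranks candidate orders by the AIC. The object sfa_series below stands in for one state's SFA series and is not defined here.

```python
import pmdarima as pm

# `sfa_series`: placeholder for a one-dimensional array of annual SFA
# observations for a single subnational entity.
model = pm.auto_arima(
    sfa_series,
    seasonal=False,               # annual fiscal data, no seasonal terms
    information_criterion="aic",  # rank candidate ARIMA orders by AIC
    stepwise=True,                # stepwise (Hyndman-Khandakar style) search
    suppress_warnings=True,
)
print(model.order)                # selected (p, d, q)
```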
Kalman defined his filter using state-space methods, which simplify implementation in discrete time. Therefore, we rewrite the best ARIMA model for each subnational entity in its corresponding state-space form and estimate this model using the square-root filter to numerically implement the Kalman filter recursions (De Jong 1991; Durbin and Koopman 2001, sec. 6.3).
When the model is not stationary, the filter is augmented as described by De Jong (1991), De Jong and Chu-Chun-Lin (1994), and Durbin and Koopman (2001, sec. 5.7).
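A comparable workflow can be sketched with statsmodels, whose SARIMAX class places an ARIMA model in state-space form and runs the Kalman filter, initializing nonstationary state elements diffusely. The order (1, 1, 1) and the object sfa_series are placeholders, and the library's built-in filter stands in for the exact square-root and diffuse variants cited above.

```python
import statsmodels.api as sm

# Placeholder series and ARIMA order; substitute the order selected above.
model = sm.tsa.statespace.SARIMAX(sfa_series, order=(1, 1, 1))

# Maximum likelihood estimation; the log likelihood is evaluated with the
# Kalman filter at each trial parameter vector.
results = model.fit(disp=False)
print(results.summary())
```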
We then estimate the parameters of this linear state-space model by maximum likelihood. The Kalman filter is used to construct the log likelihood, assuming normality and stationarity.
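Concretely, under the normality assumption the log likelihood is built from the Kalman filter's one-step-ahead prediction errors $v_t$ and their variances $F_t$ via the standard prediction-error decomposition (Durbin and Koopman 2001):

$$\log L(\theta) = -\frac{n}{2}\log 2\pi - \frac{1}{2}\sum_{t=1}^{n}\left(\log F_t + \frac{v_t^{2}}{F_t}\right),$$

and the ARIMA parameters $\theta$ are chosen to maximize this expression numerically.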
Once we have these parameter estimates, we estimate the underlying states at each time period using only the information available from the series up to that period. The data series is then predicted by plugging in the estimated states, and the residuals are calculated as the differences between the predicted and the realized values of the series.
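Continuing the statsmodels sketch above (with the same placeholder objects), the one-step-ahead predictions and residuals can be obtained as follows; the residuals are then standardized and screened with the 1-standard-deviation rule described in the chapter.

```python
# In-sample one-step-ahead predictions: each prediction of y_t uses only
# information available through period t-1.
pred = results.get_prediction(dynamic=False)
predicted = pred.predicted_mean

# Residuals: realized minus predicted values of the series; these are the
# quantities that are standardized to flag contingent liability shocks.
residuals = sfa_series - predicted
```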