Methodology Note 1: Handling the Time Series Data




At the time of writing, eight models have been run so far on electricity market price and rolling system demand, but I have not yet explained why I decided to use a bivariate VAR model with only 2 lags of each variable.

The answer is, simply, safety. We will go over the data between the 1st of February and the 7th of February to highlight why 2 lags are safe enough. Or so I would argue, at least.

You may have noticed that the graph axes show a “change” in price or demand, rather than simply price or demand. This is done to help eliminate trends over time, as we want the relationship between price and demand rather than between either variable and time itself. If there are time trends, the system is dynamic; it's always on the move, so relationships between variables can quickly get blurred. If you have ever skimmed over an economics textbook, you may have noticed the phrase “comparative statics” alongside price and quantity graphs – this is no coincidence. Things are easier to analyse if they are “static” rather than dynamic. If time series data exhibits this static nature, meaning its mean and variance stay roughly constant over time, it is stationary. If it does not, it is non-stationary. Below are the log-transformations of changes in price and demand over time. You can see that there is little in the way of a time trend.
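
For anyone who wants to see how taking changes strips out a trend, here is a minimal sketch in Python (not the R and Excel workflow used for the actual analysis). It assumes the changes are taken as differences of logs and that the series is strictly positive; the function name and the demand figures are made up for illustration.

```python
import numpy as np

def log_changes(series):
    """Return first differences of the natural log of a positive series."""
    values = np.asarray(series, dtype=float)
    if np.any(values <= 0):
        # Logs are undefined for zero or negative values (an assumption of this sketch).
        raise ValueError("log differences need strictly positive values")
    return np.diff(np.log(values))

# Hypothetical demand figures drifting upwards by 2% every period.
demand = 3000.0 * 1.02 ** np.arange(8)

# The level trends upwards, but every change is the same constant, log(1.02).
changes = log_changes(demand)
print(changes.round(4))
```

The level series has an obvious time trend, while the changes are flat, which is the property we are after. Electricity prices can hit zero or go negative, so a real price series may need different handling before logs are taken.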





Serial correlation is another problem to address with time series data. To test whether a model's errors are serially correlated, which would make for an inferior model and suggest that more lags are needed, we use a Breusch-Godfrey test. This is simply an F-test (or a Chi-squared test, if you like your stats to move around as your degrees of freedom change) on an auxiliary model where the dependent variable is the residual of the actual model in question, and the regressors are the original regressors plus lags of that residual. Should the coefficients on the lagged residuals be jointly statistically significant, we have a problem.

Luckily, in our little sample, we do not. With the aid of R and Excel, below are the p-values of the test explained above as the lag order increases. They quickly begin to converge to 1. For clarification, lag order 2 still yields a p-value above 5%, so we cannot reject the null of no serial correlation: close, but no dice.
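
To make the mechanics of the test concrete, here is a rough single-equation sketch of its Chi-squared (LM) form in Python. It is not the R code behind the p-values above. It assumes the regressor matrix already contains a constant column, fills the missing pre-sample residuals with zeros, and uses a closed-form Chi-squared tail that only works for even lag orders. All names and the simulated data are hypothetical.

```python
import math
import numpy as np

def ols_residuals(y, X):
    """Residuals from an ordinary least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def breusch_godfrey_lm(y, X, order):
    """LM statistic: n * R^2 from regressing residuals on X and their own lags."""
    n = len(y)
    resid = ols_residuals(y, X)
    # Lagged residuals, with the missing pre-sample values set to zero.
    lagged = np.column_stack(
        [np.concatenate([np.zeros(k), resid[:-k]]) for k in range(1, order + 1)]
    )
    aux_resid = ols_residuals(resid, np.column_stack([X, lagged]))
    # X holds a constant, so resid has mean zero and this is the usual R^2.
    r_squared = 1.0 - (aux_resid @ aux_resid) / (resid @ resid)
    return n * r_squared

def chi2_sf_even(stat, df):
    """Upper-tail Chi-squared probability; closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form only covers even degrees of freedom")
    half = stat / 2.0
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(df // 2))

# Simulated data (hypothetical): one regressor plus a constant.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

# Case 1: independent errors, so the test should usually not reject.
y_clean = 1.0 + 0.5 * x + rng.normal(size=n)

# Case 2: AR(1) errors with coefficient 0.8, so the test should reject.
shocks = rng.normal(size=n)
errors = np.zeros(n)
for t in range(1, n):
    errors[t] = 0.8 * errors[t - 1] + shocks[t]
y_serial = 1.0 + 0.5 * x + errors

p_clean = chi2_sf_even(breusch_godfrey_lm(y_clean, X, order=2), df=2)
p_serial = chi2_sf_even(breusch_godfrey_lm(y_serial, X, order=2), df=2)
print(f"independent errors: p = {p_clean:.3f}")
print(f"AR(1) errors:       p = {p_serial:.3g}")
```

A p-value above 5% means the lagged residuals explain too little to reject the null of no serial correlation. For a VAR, the same idea would be applied equation by equation or through a multivariate version of the test, which this sketch does not cover.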




So in sum, the lags that do end up being present, even at a low maximum of two, are there just for safety. In fact, in the price-demand analysis for the week of writing, no lags are included, given how little evidence there is of non-stationarity or serial correlation. Hopefully this clarifies things.


