Modelling & forecasting time-series data has been one of the cornerstones of predictive analytics in the era of Big Data. There is a plethora of forecasting techniques available today, and the context of each can be a pain to understand; as we know, in the war on noise, context serves as crucial ammunition. To that end, we, Hemanth Sindhanuru & Srinidhi K from LatentView Analytics, are presenting this series of articles discussing a structured methodology to understand, analyse & forecast time-series data.
In the previous article, we explored one of the most significant characteristics of a time-series: Periodicity. Periodicity is a prerequisite input for most of the modelling techniques we will be discussing, i.e. if we vary the value of periodicity for a particular time-series, we get different model parameters as output. The attributes we discuss today, unlike Periodicity, may not be used directly in our modelling approaches. However, they serve as criteria for selecting the appropriate modelling technique, instead of relying on trial & error to find the best model.
This field of study has some interesting publications from Xiaozhe Wang, Kate Smith-Miles & Rob Hyndman, which focus on applying statistical operations to identify a set of features that best summarize and capture the global picture of the data. In this article, we will be exploring these measures, which capture the underlying structural characteristics of a time-series.
INTRINSIC AMBIGUITY IN THE ESTIMATION OF TREND & SEASONALITY
Decomposition of a time-series into its components is a crucial step before we move on to statistically measuring its attributes. To obtain a precise calibration for the time-series under consideration, we calculate some measures on the original data as well as on the de-trended & de-seasonalized data.
The ambiguity refers to the fact that the estimation of trend & seasonal patterns is subjective: it depends on the nature of the problem statement and on the judgement of the data scientist. There are diagnostic methods that help in choosing the best estimate, but the modelling techniques we explored previously for modelling seasonality & trend (which include Exponential Smoothing, Fourier Decomposition, STL, X-11 etc.) do not always lead to a unique choice. That being said, the ideal estimate of trend & seasonality would be the one where the remainder component has no residual (uncaptured) seasonality or trend.
APPLYING TRANSFORMATIONS
There are three main reasons for applying a transformation to the original dataset: to stabilize the variance, to make the data normally distributed, and to make the seasonal effect additive.
Most often, we apply transformations for the third reason, since we would be using the STL algorithm for time-series decomposition (as stated in the previous articles) and STL only handles additive seasonality. The two most commonly used transformations, the Log transform & the Square-Root transform, are special cases of the class of Box-Cox Transformations. The Box-Cox Transformation is defined as follows:
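For an original series y(t), transformation parameter λ and transformed series w(t),

w(t) = log( y(t) ), if λ = 0
w(t) = ( y(t)^λ − 1 ) / λ, if λ ≠ 0

Here, λ = 0 corresponds to the Log transform and λ = 0.5 to (a rescaled version of) the Square-Root transform.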
IMPLEMENTATION IN R
require(fpp)
dataset <- as.numeric(departures[,4])
plot( dataset, type = 'l' )
# departures is a dataset loaded with the fpp package containing the tourism data for ...
# ... the number of tourists departing from Australia
periodogram <- spec.pgram( x = dataset, spans = c(3) )
# From the Periodogram (NOT PLOTTED HERE), we can identify a significant peak at
# frequency 0.083333 and its corresponding harmonics (i.e. multiples) at frequencies
# 0.166667, 0.25, 0.333333 etc.
# Hence, we can infer the Periodicity of the seasonal pattern as 1/0.083333 ~ 12
timeSeries <- ts( data = dataset, frequency = 12 )
# Declaring an object of class "ts" requires two arguments from the user
# ----> data: the time-series values in the form of a numeric vector or list
# ----> frequency:...
# 1) PLEASE DO NOT CONFUSE THIS ARGUMENT WITH THE DEFINITION OF FREQUENCY IN
# PHYSICS, WHICH WE REFERRED TO EARLIER WHILE DISCUSSING THE PERIODOGRAM
# 2) "frequency" ARGUMENT HERE IS EQUIVALENT TO THE PERIODICITY
# 3) FOR NON-SEASONAL DATA, "frequency" ARGUMENT EQUALS 1
Lambda <- BoxCox.lambda( timeSeries )
# BoxCox.lambda() estimates the transformation parameter lambda for the series
adj_timeSeries <- BoxCox( x = timeSeries, lambda = Lambda )
plot(adj_timeSeries)
OUTPUT
[Plot of the Box-Cox transformed series adj_timeSeries]
SCALING THE ATTRIBUTES ONTO [0,1]
Gaining insights from the attributes is easier if all of them are measured on the same scale. Hence, we map the attribute values from their natural ranges to [0,1]. For scaling the values, we use the statistical transformations described in the work by Xiaozhe Wang, Kate Smith-Miles & Rob Hyndman.
IMPLEMENTATION IN R
# f1 maps [0, Inf) to [0, 1]
f1 <- function(x, a, b) {
  if (exp(a*x) == Inf)   # guard against numeric overflow for large x
    map <- 1
  else
    map <- (exp(a*x) - 1) / (exp(a*x) + b)
  return(map)
}
# f2 maps [0, 1] to [0, 1]
f2 <- function(x, a, b) {
  return( (exp(a*x) - 1) / (exp(a*x) + b) * (exp(a) + b) / (exp(a) - 1) )
}
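As a quick illustration of how these maps are used (the constants a & b below are arbitrary placeholders, not the calibrated values from the paper, which are tuned per measure):

# Illustrative usage only; a & b here are placeholder constants
f1( 250, a = 0.01, b = 1 )   # squeezes an unbounded statistic into [0,1]
f2( 0.85, a = 2, b = 1 )     # re-spaces a value that already lies in [0,1]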
TIME-SERIES ATTRIBUTES
Degree of Seasonality & Trend
It’s natural to characterize a time-series by its degree of trend and seasonality, since these are its two most significant structural characteristics. These two measures can be defined as follows:
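Following the measures of Wang, Smith-Miles & Hyndman (both values are clipped to the range [0,1]):

Degree of Trend = 1 − Var( Y − Trend − Seasn ) / Var( Y − Seasn )
Degree of Seasonality = 1 − Var( Y − Trend − Seasn ) / Var( Y − Trend )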
Here, Y refers to the original time-series, while ( Y − Trend ), ( Y − Seasn ) and ( Y − Trend − Seasn ) refer to the de-trended series, the de-seasonalized series, and the de-trended & de-seasonalized (remainder) series respectively.
IMPLEMENTATION IN R
ts_measures <- function(x) {
  if( frequency(x) > 1 ) {
    STL_model <- stl( x, s.window = "periodic" )
    seasn <- as.numeric( STL_model$time.series[,1] )
    trend <- as.numeric( STL_model$time.series[,2] )
    deseasn <- x - seasn
    detrend <- x - trend
    remainder <- x - trend - seasn + mean(trend, na.rm = TRUE)
    # Degree of Seasonality compares the remainder with the de-trended series
    deg_seasn <- ifelse( var(detrend) < 1e-10,
                 0, max(0, min(1, 1-var(remainder)/var(detrend))) )
    # Degree of Trend compares the remainder with the de-seasonalized series
    deg_trend <- ifelse( var(deseasn) < 1e-10,
                 0, max(0, min(1, 1-var(remainder)/var(deseasn))) )
  } else {
    trend <- lowess(x)$y
    # Here we are using the lowess() function, a local polynomial regression,
    # to estimate the trend. There are other alternatives like splines, moving averages etc.
    # As discussed, the ideal estimate would be the one with no residual trend in the remainder
    remainder <- x - trend + mean(trend, na.rm = TRUE)
    deg_seasn <- 0
    deg_trend <- ifelse( var(x) < 1e-10, 0, max(0, min(1, 1-var(remainder)/var(x))) )
  }
  measures <- c( "Degree of Seasonality" = deg_seasn, "Degree of Trend" = deg_trend )
  return(measures)
}
To explore how effective our model (here, an STL decomposition) is in capturing the seasonal & trend components, let’s check whether there is any residual seasonality or trend by calculating the same attributes, the Degrees of Seasonality & Trend, on the remainder component.
IMPLEMENTATION IN R
# Extracting the remainder component from the STL model used previously
print( ts_measures(adj_timeSeries) )
STL_model <- stl(adj_timeSeries, s.window = "periodic")
remainder <- as.numeric( STL_model[[1]][,3] )
# Checking for any inherent periodic patterns left uncaptured in the remainder
res_periodogram <- spec.pgram(x = remainder, spans = c(3,5))
plot(res_periodogram)
# Checking the scale of these uncaptured attributes...
# Calculating the Degrees of Seasonality & Trend on the remainder
residual_ts <- ts( remainder, frequency = 12 )
print( ts_measures(residual_ts) )
OUTPUT
[Periodogram of the remainder and the printed Degrees of Seasonality & Trend]
We can see that the scales of uncaptured seasonality and trend are about 0.04 & 0.001 on a scale of 0 to 1. Hence, we can infer that the STL algorithm is a reliable method for decomposing the current time-series. It must be reiterated that this choice might not hold for other time-series.
Skewness & Kurtosis
Skewness and Kurtosis measure how similar a given distribution is to a normal distribution. Skewness is a measure of the asymmetry of the data, while Kurtosis is a measure of whether the dataset is peaked or flat relative to a normal distribution.
These attributes can be better understood visually:
[Illustrations of skewed distributions and of peaked vs. flat distributions]
These attributes are quantified in the following way for a univariate time-series with n values, where Y refers to a generic element of the time-series, E(Y) refers to the mean of all the observations, and σ refers to their standard deviation:
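Skewness = Σ ( Y − E(Y) )³ / ( n σ³ )
Kurtosis = Σ ( Y − E(Y) )⁴ / ( n σ⁴ )

(Subtracting 3 from the Kurtosis gives the "excess" kurtosis, which is 0 for a normal distribution.) A minimal sketch computing both measures directly from these definitions (packages such as moments provide equivalent functions):

# Population moments, matching the formulas above
skewness <- function(y) {
  m <- mean(y); s <- sqrt(mean((y - m)^2))
  mean((y - m)^3) / s^3
}
kurtosis <- function(y) {
  m <- mean(y); s <- sqrt(mean((y - m)^2))
  mean((y - m)^4) / s^4
}
skewness( as.numeric(adj_timeSeries) )
kurtosis( as.numeric(adj_timeSeries) )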
Non-Linearity
Some time-series can’t be adequately represented by linear models. An example of this is the behaviour of an economic time-series during a recession, which is not handled well by traditional linear models. Hence, the non-linearity of a time-series is an important attribute in the selection of the best forecasting method.
There are many approaches to testing nonlinearity in time-series models of various functional forms. One of the most reliable among them is a test based on a neural network model, Terasvirta’s test. We will use its test statistic as our measure of nonlinearity.
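A minimal sketch, assuming the tseries package is available (it is not loaded in the code above); its terasvirta.test() function implements Terasvirta’s neural network test for neglected nonlinearity:

require(tseries)
# A small p-value suggests the series is not adequately described by a linear model
terasvirta.test( adj_timeSeries, lag = 1, type = "Chisq" )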
Auto-Correlation
The modelling techniques we will be studying in the upcoming articles are built on one fundamental assumption that any specific observation of a time-series can be interpreted as a function (either linear or non-linear) of the observations preceding it.
Although we can draw inferences from the Auto-Correlation Function (ACF) plots we discussed in the previous articles, a measure which quantifies the degree of auto-correlation is useful for checking whether the series fits a white-noise model. One possible choice is the Box-Pierce statistic, whose mathematical formulation is:
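Q = n · Σ r(k)², summed over lags k = 1, …, h

where n is the length of the series, h is the maximum lag considered, and r(k) is the sample autocorrelation at lag k. In R, the statistic is available through the base Box.test() function; the choice of h (here, two seasonal periods) is up to the analyst:

# Box-Pierce test on the remainder: a large p-value is consistent with
# white noise, i.e. no remaining autocorrelation
Box.test( remainder, lag = 24, type = "Box-Pierce" )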
Entropy
Entropy is a measure for quantifying the noise in a time-series. Recent studies have re-categorized series previously considered random noise as non-linear dynamic systems, aiming to understand the nature of their random behaviour & improve short-term forecasts. The Lyapunov Exponent is one such measure, characterizing the entropy of the system.
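A minimal sketch (our own illustration, not code from a package) of one common estimate: for each point, find its nearest neighbour in value and measure the average exponential rate at which the two trajectories diverge over one seasonal period; a positive value indicates sensitivity to initial conditions, i.e. chaotic behaviour:

lyapunov_exponent <- function(x) {
  freq <- frequency(x)   # use one seasonal period as the divergence horizon
  x <- as.numeric(x)
  n <- length(x)
  Ly <- numeric(n - freq)
  for (i in 1:(n - freq)) {
    # nearest neighbour of x[i], excluding i itself and the last 'freq' points
    idx <- order( abs(x[i] - x) )
    idx <- idx[ idx <= (n - freq) & idx != i ]
    j <- idx[1]
    # log rate at which the two trajectories diverge after 'freq' steps
    Ly[i] <- log( abs((x[i + freq] - x[j + freq]) / (x[i] - x[j])) ) / freq
  }
  mean( Ly[is.finite(Ly)] )
}
lyapunov_exponent( adj_timeSeries )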
Keep watching this space for more
BACK TO THE FUTURE – A Beginner’s Guide to Forecasting
1. A Primer on Time-Series Forecasting
2. Structural Time-Series Models
3. Periodicity of a Seasonal Time-Series
4. Defining Time-Series Attributes
5. Exponential Smoothing Framework
6. ARIMA modelling in a nutshell