A Primer on Time-Series Forecasting


Hemanth Sindhanuru

Apr 21, 2016

Modelling & forecasting time-series data has been one of the cornerstones of Predictive Analytics in the era of Big Data. A plethora of forecasting techniques is available today, but the context in which each one applies can be hard to understand, and, as we know, in the war on noise, context is crucial ammunition. To that end, we, Hemanth Sindhanuru & Srinidhi K from LatentView Analytics, are presenting this series of articles, in which we discuss a structured methodology to understand, analyse & forecast time-series data.

1: Introduction to Time-Series Analysis
A data scientist comes across time-series in almost every aspect of day-to-day research, be it the straightforward sales data of a firm or the complex cohort data of a firm's different customer segments over the years.

Although data scientists have a slew of machine-learning techniques at their disposal for generalized data analysis, the analysis of time-series is fundamentally different because time-series have the attribute of sequentiality: two time-series with the same n observations in a different sequence can have completely different characteristics. This attribute gives rise to certain features that are specific to time-series.
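To make the point about sequentiality concrete, here is a minimal sketch in Python (our choice of language for illustration; the data and the helper name lag1_autocorr are made up for this example). It compares two series that contain exactly the same values in a different order: order-free summaries such as the mean and variance cannot tell them apart, while the lag-1 autocorrelation, which depends on the sequence, can.

```python
from statistics import mean, pvariance

def lag1_autocorr(series):
    """Lag-1 autocorrelation: how strongly each value resembles its predecessor."""
    m = mean(series)
    num = sum((a - m) * (b - m) for a, b in zip(series[:-1], series[1:]))
    den = sum((x - m) ** 2 for x in series)
    return num / den

# Hypothetical data: the same eight observations in two different orders.
ordered = [1, 2, 3, 4, 5, 6, 7, 8]      # steadily rising
reordered = [1, 8, 2, 7, 3, 6, 4, 5]    # zig-zag

# Order-free summaries are identical for both series...
same_mean = mean(ordered) == mean(reordered)
same_variance = pvariance(ordered) == pvariance(reordered)

# ...but a sequence-aware statistic separates them clearly.
r_ordered = lag1_autocorr(ordered)
r_reordered = lag1_autocorr(reordered)

print(same_mean, same_variance)    # True True
print(round(r_ordered, 3))         # 0.625
print(round(r_reordered, 3))       # -0.815
```

The rising series has a positive lag-1 autocorrelation (neighbouring values are alike), whereas the zig-zag series has a strongly negative one, even though a technique that ignores ordering would treat the two as the same data set.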

This article is the first in a series summarizing the various techniques for modelling & forecasting time-series that are available in the current scientific literature.

We begin with the simplest form of temporal data, the Univariate Time-Series, and then move on to time-series structured in other ways, including Multivariate Time-Series & Hierarchical Time-Series.

Univariate Time-Series
So, how do we define a Univariate Time-Series as an entity? Any sequence of real numbers collected at regular intervals in time, where each number represents an observed value and its index in the series represents the time at which it was recorded, can be considered a Univariate Time-Series.

The flowchart below describes a methodical approach to the analysis & forecasting of univariate time-series.

The diagram below lists the forecasting models which will be covered as we move forward in this series.

Inferring from the Time-Series Plots
The first step in time-series analysis (as is the case with any other analysis) is visualizing the data. In most cases, the nature of the time-series & its attributes can be deduced from the time-series plot itself.

Answering the following questions from the time-series plot will give us a very strong context of the time-series attributes which we will be discussing going forward.

• Are there any periodic patterns that are clearly dominant in the data?
➢ Are there any significant peaks or troughs at regular intervals?
➢ What is the scale of these periodic variations: is it constant at every interval, or does the amplitude increase?

• Is there any long-term pattern in the data?
➢ Is the mean of the data increasing, decreasing or constant with time?
➢ What is the nature of this pattern: linear, non-linear or ambiguous?
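The questions above can also be checked numerically once a plot has suggested a period. The sketch below is only an illustration in Python: the quarterly sales figures are invented, the period of 4 is assumed to be known from the plot, and cycle_summary is a hypothetical helper, not a library function. It splits the series into full cycles and reports the mean (the long-term level) and the peak-to-trough range (the scale of the periodic variation) of each cycle.

```python
def cycle_summary(series, period):
    """Split the series into full cycles; return (mean, peak-to-trough range) per cycle."""
    cycles = [series[i:i + period]
              for i in range(0, len(series) - period + 1, period)]
    return [(sum(c) / period, max(c) - min(c)) for c in cycles]

# Hypothetical quarterly data covering three years.
sales = [10, 14, 8, 12,
         20, 28, 16, 24,
         30, 42, 24, 36]

summary = cycle_summary(sales, period=4)
means = [m for m, _ in summary]
ranges = [r for _, r in summary]

# Is the level rising from one cycle to the next?
level_rising = all(b > a for a, b in zip(means, means[1:]))
# Is the seasonal swing growing, or roughly constant?
amplitude_growing = all(b > a for a, b in zip(ranges, ranges[1:]))

print(means)              # [11.0, 22.0, 33.0]
print(ranges)             # [6, 12, 18]
print(level_rising)       # True
print(amplitude_growing)  # True
```

Here the cycle means rise by a constant step, which points to a linear long-term pattern, and the swing within each cycle grows along with the level rather than staying constant.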

Stationarity of the Time-Series
Stationarity is one of the fundamental attributes of a time-series. A time-series is said to be stationary when its statistical properties (such as the mean, the variance and the autocorrelation structure) do not depend on the time at which the series is observed, i.e. there is no trend or seasonal pattern in the time-series. Random noise is the simplest example. So a stationary time-series would be roughly horizontal (with none of the increasing, decreasing or periodic patterns we looked out for above) and would have constant variance.
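As an informal illustration of this definition (not a statistical test), one can compare summary statistics computed over different stretches of the series. The Python sketch below uses two small invented series and a hypothetical helper, half_stats, which returns the mean and variance of the first and second halves.

```python
from statistics import mean, pvariance

def half_stats(series):
    """Return (mean, variance) for the first half and for the second half."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))

# Hypothetical data: one roughly horizontal series, one with an upward drift.
flat = [5, 7, 4, 6, 6, 4, 7, 5]
trending = [1, 3, 2, 4, 5, 7, 6, 8]

print(half_stats(flat))      # ((5.5, 1.25), (5.5, 1.25))
print(half_stats(trending))  # ((2.5, 1.25), (6.5, 1.25))
```

For the flat series the mean and variance are the same in both halves, which is consistent with stationarity. For the trending series the variance is unchanged but the mean shifts from 2.5 to 6.5, so its properties depend on when the series is observed. With only a handful of points this is merely suggestive; it is a way of seeing what the definition means, not a substitute for a formal test.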

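To test whether a series is stationary, we have Unit Root and stationarity tests, which are hypothesis tests that take the stationarity of the data as either the null or the alternative hypothesis. Some of the reliable ones are the Dickey-Fuller (DF) test and the Phillips-Perron (PP) test, whose null hypothesis is that the series has a unit root (i.e. is non-stationary), and the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test, whose null hypothesis is that the series is stationary.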

R has implementations of the Augmented Dickey-Fuller (ADF) test (for example, adf.test in the tseries package), which is an augmented version of the Dickey-Fuller test for a larger and more complicated set of time-series models. Please note that since the ADF test is a hypothesis test, it can produce false positives and false negatives. It is recommended to examine the data further to confirm its stationarity.
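To show what such a test computes, here is a minimal sketch of the plain (non-augmented) Dickey-Fuller statistic in Python using only the standard library. It is not the R implementation and leaves out the lagged-difference terms that make the test "augmented". The function name dickey_fuller_stat, the simulated data and the seed are our own assumptions. The sketch regresses the change in the series on its previous value (with a constant) and returns the t-statistic of the slope; the asymptotic 5% critical value for this constant-only case is approximately -2.86, and a statistic below it rejects the unit-root null.

```python
import math
import random

def dickey_fuller_stat(series):
    """t-statistic of gamma in: diff(y)[t] = alpha + gamma * y[t-1] + error."""
    y_lag = series[:-1]
    dy = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(dy)
    x_bar = sum(y_lag) / n
    d_bar = sum(dy) / n
    sxx = sum((x - x_bar) ** 2 for x in y_lag)
    sxy = sum((x - x_bar) * (d - d_bar) for x, d in zip(y_lag, dy))
    gamma = sxy / sxx                      # OLS slope
    alpha = d_bar - gamma * x_bar          # OLS intercept
    rss = sum((d - alpha - gamma * x) ** 2 for x, d in zip(y_lag, dy))
    se_gamma = math.sqrt(rss / (n - 2) / sxx)
    return gamma / se_gamma

# Simulated data (an assumption for illustration only).
random.seed(42)
noise = [random.gauss(0, 1) for _ in range(200)]   # stationary by construction

walk, level = [], 0.0
for shock in noise:                                # random walk: has a unit root
    level += shock
    walk.append(level)

CRITICAL_5PCT = -2.86   # approximate asymptotic value, constant-only case

stat_noise = dickey_fuller_stat(noise)
stat_walk = dickey_fuller_stat(walk)
print("noise:", round(stat_noise, 2), "reject unit root:", stat_noise < CRITICAL_5PCT)
print("walk: ", round(stat_walk, 2), "reject unit root:", stat_walk < CRITICAL_5PCT)
```

For the noise series the statistic should fall far below the critical value, so the unit-root null is rejected. For a random walk it usually does not, although any single sample can mislead, which is exactly the false positive/negative caveat mentioned above.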

Keep watching this space for more

BACK TO THE FUTURE – A Beginner’s Guide to Forecasting
1. A Primer on Time-Series Forecasting
2. Structured Time-Series Models
3. Periodicity of a Seasonal Time-Series
