Time series data have more structure than most common data sets. Each row in a time series represents a date or time, and the columns represent properties of that time. In this example, the time series plots airline passengers on US carriers year by year.

Although these data are accumulated yearly, different time series use different measures of time, and any time series can be averaged over a larger time increment. Because these data are yearly, we could average airline passengers over 10-year periods to get decade-by-decade rates. We would need more granular data to look at passengers on a monthly or weekly basis.
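Averaging a yearly series into decades can be sketched as follows; the passenger counts below are hypothetical, and the approach assumes pandas is available:

```python
import pandas as pd

# Hypothetical yearly passenger counts (millions) on US carriers.
years = pd.Index(range(1990, 2010), name="year")
passengers = pd.Series([450 + 5 * i for i in range(20)], index=years)

# Group each year into its decade (1990, 2000, ...) and average.
by_decade = passengers.groupby(passengers.index // 10 * 10).mean()
print(by_decade)
```

Each decade's value is the mean of its ten yearly observations, which is exactly the coarser, decade-by-decade view described above.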

Some time series are granular at the hourly level, such as the solar farm hourly energy output plot. These data might be more useful when averaged over a larger period: once the hourly data are collected, we can average over days, weeks, months, or even years to get a broader view of the data.
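A minimal sketch of averaging hourly readings into larger increments, using simulated output values (the solar farm's real data are not shown here) and pandas resampling:

```python
import numpy as np
import pandas as pd

# Simulated hourly energy output (MWh) over one week, as a stand-in
# for the solar farm data described in the text.
hours = pd.date_range("2023-06-01", periods=7 * 24, freq="h")
rng = np.random.default_rng(0)
output = pd.Series(rng.uniform(0, 5, size=len(hours)), index=hours)

# Average the hourly readings over larger time increments.
daily = output.resample("D").mean()   # one average per day
weekly = output.resample("W").mean()  # weekly bins, depending on alignment
print(daily)
```

The same series thus yields several coarser views; with more data, the same `resample` call would produce monthly or yearly averages.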

The US energy consumption plot shows energy consumption year by year, but the data might have been collected by utility companies as hourly usage rates. These data can then be averaged over different time periods to create a variety of time series, including the yearly one displayed here.

We want to predict the future values of a time series given its past values, but we must recognize that the past values are inherently noisy. A time series consists of two conceptual components: the signal and the noise. The signal is the underlying process that governs the dynamics of the time series. The noise is random variation that sits on top of the signal, making the signal harder to detect.

In this time series, we're simply flipping a coin repeatedly and recording the outcomes in order. The signal here is just the mean: 50% heads and 50% tails. The underlying process governing the coin flips doesn't tell us anything more. The noise is all the variation around the mean, in other words, any deviation from the expected 50/50 outcome.
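The coin-flip decomposition can be made concrete with a short simulation; the sample size and seed below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate 1,000 fair coin flips: 1 = heads, 0 = tails.
flips = rng.integers(0, 2, size=1000)

# The signal is the underlying mean of the process (0.5);
# the noise is each flip's deviation from that mean.
signal = 0.5
noise = flips - signal

# The best forecast for any future flip is simply the signal.
forecast = signal
print(flips.mean())  # empirical mean, close to 0.5
```

Nothing in the noise term is predictable, so no forecast can do better in expectation than predicting the mean.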

Good forecasts capture as much of the signal as possible while attempting to ignore the noise. In this coin flip example, the best forecast we could generate would simply predict the mean, since the variation around it is pure randomness. More complicated forecasts might seem appealing, but given how simple the signal is, they will just end up overfitting to the noise. In general, forecasting does poorly when noise dominates the signal in the data.