So far, we have discussed different decomposition methods and baseline models. Now we move on to time series forecasting models such as ARIMA, SARIMA, etc.
But these forecasting models require the data to be stationary. So first, we will discuss what stationarity in a time series actually is, why it is required, and how it is achieved.
Many of you have probably already read a lot about stationarity in time series through blogs, books, and other resources.
At first, I planned to explain stationarity when discussing forecasting models like ARIMA.
But when I first learnt about this topic, my understanding didn’t go much beyond constant mean or variance, strong or weak stationarity, and tests to check for stationarity.
Something always felt missing; I was unable to understand a few things about stationarity.
So, I decided to write a separate article to share what I learnt while working through my own questions and doubts about stationarity.
I just tried to write about stationarity in time series in a more intuitive way, and I hope you will get a fresh perspective on this topic beyond the methods and statistical tests.
We call a time series stationary when it has a constant mean, a constant variance, and a constant autocovariance (or autocorrelation) structure.
Let’s discuss each property.
What do we mean by Constant Mean?
For example, consider a time series of sales data for 5 years. If we calculate the average sales for each year, the values should be roughly the same. If the averages differ significantly from year to year, the mean is not constant and the time series is not stationary.
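As a small illustration, here is how that yearly-average check could look in pandas, using a made-up monthly sales series (the data and variable names are purely for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical monthly sales for 5 years (values are made up for illustration)
dates = pd.date_range("2019-01-01", periods=60, freq="MS")
sales_example = pd.Series(np.random.default_rng(1).normal(100, 10, 60), index=dates)

# Average sales per year -- roughly equal averages suggest a constant mean
print(sales_example.groupby(sales_example.index.year).mean())
```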
The next property of a stationary time series is Constant Variance.
If the spread of the data is the same throughout the series, then it is said to have Constant Variance.
In other words, if the time series goes up and down by similar amounts throughout the series, then it is said to have Constant Variance.
But if the ups and downs start small and then become larger later, then there is no constant variance.

The third property of a stationary time series is Constant Autocovariance (or Autocorrelation).
If the relationship between values depends only on the gap between them, regardless of when they occur, then there is Constant Autocovariance.
For example, suppose you have written a blog and tracked its views for 50 days, and each day’s views are closely related to the previous day’s views (day 6 is similar to day 5, and day 37 is similar to day 36, because they are one day apart).
If this relationship between values one day apart stays the same throughout the entire series, then the autocovariance is constant.
In a stationary time series the autocorrelation usually decreases as the lag (or distance) increases because only nearby values are strongly related.
If autocorrelation remains high at larger lags, it may indicate the presence of trend or seasonality, suggesting non-stationarity.
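As a small illustration of this day-to-day dependence, the lag-1 autocorrelation can be computed directly with pandas; the `views` series below is made up for the example:

```python
import numpy as np
import pandas as pd

# Hypothetical daily views for 50 days with day-to-day dependence (values are made up)
rng = np.random.default_rng(0)
noise = rng.normal(0, 5, 50)
views = [100.0]
for e in noise[1:]:
    # each day stays close to the previous day, plus some noise
    views.append(100 + 0.7 * (views[-1] - 100) + e)
views = pd.Series(views)

# Correlation between each day's views and the previous day's views (lag 1)
print("Lag-1 autocorrelation:", views.autocorr(lag=1))
```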
When a time series has all three of these properties, we call it stationary; more precisely, this is second-order stationarity, or weak stationarity.
There are mainly two types of stationarity:
1) Strong Stationarity
2) Weak Stationarity
Strong Stationarity means the entire distribution of the series stays the same no matter when we observe it, not just the mean and variance but also the skewness, kurtosis, and overall shape of the distribution.
In the real world this is rare for a time series, so classic forecasting models assume weak stationarity, a more realistic and practical condition.
Identifying Stationarity in a Time Series
There are different methods for identifying stationarity in a Time Series.
To understand those methods, let’s consider the retail sales dataset we used earlier in this series for STL Decomposition.
The first method is visual inspection.
Let’s plot the series.
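Here is a minimal sketch of how such a plot could be produced, assuming the data has already been loaded into a pandas Series named `sales` indexed by date (the variable name is just for illustration):

```python
import matplotlib.pyplot as plt

# Plot the raw series to visually inspect trend and seasonality
sales.plot(figsize=(10, 4), title="Advance Retail Sales: Department Stores")
plt.xlabel("Date")
plt.ylabel("Sales")
plt.show()
```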

From the above plot, we can observe both trend and seasonality in the time series, which indicates that the mean is not constant. Therefore, we can conclude that this series is non-stationary.
Another method to test stationarity is to divide the time series into two halves and calculate the mean and variance.
If the values for the two halves are roughly the same, the series is likely stationary.
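A minimal sketch of this check, again assuming the series is available as `sales`:

```python
# Split the series into two halves and compare mean and variance
n = len(sales)
first_half, second_half = sales[: n // 2], sales[n // 2 :]

print("Means    :", first_half.mean(), second_half.mean())
print("Variances:", first_half.var(), second_half.var())
```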
For this time series,

The mean is significantly higher, and the variance is much larger, in the first half. Since the mean and variance are not constant, this confirms that the time series is non-stationary.
We can also check stationarity in a time series using the Autocorrelation Function (ACF) plot.
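Such a plot can be generated with statsmodels; here is a minimal sketch, again assuming the series is loaded as `sales`:

```python
from statsmodels.graphics.tsaplots import plot_acf
import matplotlib.pyplot as plt

# ACF plot: correlation of the series with itself at increasing lags
plot_acf(sales, lags=48)
plt.show()
```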
Here is the ACF plot for this time series:

In the above plot, we can observe that each observation in this time series is correlated with its previous values at different lags.
As discussed earlier, autocorrelation gradually decays to zero in a stationary time series.
But that is not the case here: the autocorrelation remains high at several lags (i.e., observations are highly correlated even when they are far apart), which suggests the presence of trend and seasonality and confirms that the series is non-stationary.
We also have statistical tests to identify stationarity in a time series.
One is the Augmented Dickey-Fuller (ADF) test, and the other is the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test.
Let’s see what we get when we apply these tests to the time series.
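A minimal sketch of how both tests could be run with statsmodels (again assuming the series is loaded as `sales`); note that the two tests have opposite null hypotheses:

```python
from statsmodels.tsa.stattools import adfuller, kpss

# ADF: null hypothesis = the series has a unit root (non-stationary).
# A large p-value means we fail to reject the null -> non-stationary.
adf_pvalue = adfuller(sales)[1]
print(f"ADF  p-value: {adf_pvalue:.4f}")

# KPSS: null hypothesis = the series is stationary around a constant level.
# A small p-value means we reject the null -> non-stationary.
kpss_pvalue = kpss(sales, regression="c", nlags="auto")[1]
print(f"KPSS p-value: {kpss_pvalue:.4f}")
```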

Both tests confirm that the time series is non-stationary.
These are the methods we use to identify stationarity in a time series.
Transforming a Non-Stationary Time Series into a Stationary Time Series
We have a technique called ‘Differencing’ to transform a non-stationary series into a stationary one.
In this method, we subtract the previous value from each value, which shows how much the series changes from one time step to the next.
Let’s consider a sample from the retail sales dataset and then apply differencing.

Now we perform differencing; this is called first-order differencing.
This is how differencing is applied across the whole time series to see how the values change over time.
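In pandas, first-order differencing is a one-liner; a small sketch, again assuming the series is `sales`:

```python
# First-order differencing: each value minus the value just before it.
# The very first value becomes NaN because it has no predecessor.
sales_diff = sales.diff()   # equivalent to sales - sales.shift(1)
```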
Before first-order differencing:

After first-order differencing:

Before applying first-order differencing, we can observe a growing trend in the original time series along with occasional spikes at regular intervals, indicating seasonality.
After differencing, the series fluctuates around zero, which means the trend has been removed.
However, since the seasonal spikes are still present, the next step is to apply seasonal differencing.
In seasonal differencing, we subtract the value from the same season in the previous cycle.
In this time series we have yearly seasonality (12 months), which means:
For January 1993, we calculate Jan 1993 – Jan 1992.
This way we apply seasonal differencing to the whole series.
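A sketch of seasonal differencing at lag 12, applied on top of the first-order differenced series from the previous snippet (variable names are illustrative):

```python
# Seasonal differencing at lag 12 (monthly data, yearly seasonality),
# applied on top of the first-order differenced series.
# The first 12 values become NaN because there is no previous cycle to subtract.
sales_diff_seasonal = sales_diff.diff(12)
```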
After applying seasonal differencing to the first-order differenced series, we get:

We can observe that the seasonal spikes are gone. We also get null values for 1992 because there are no earlier values to subtract.
After first-order differencing and seasonal differencing, the trend and seasonality in the time series are removed.
Now we test for stationarity again using the ADF and KPSS tests.
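A small sketch of re-running the tests on the seasonally differenced series from the earlier snippet, dropping the NaNs introduced by differencing first:

```python
from statsmodels.tsa.stattools import adfuller, kpss

# Re-run both tests on the differenced series (NaNs removed)
stationary_series = sales_diff_seasonal.dropna()
print("ADF  p-value:", adfuller(stationary_series)[1])
print("KPSS p-value:", kpss(stationary_series, regression="c", nlags="auto")[1])
```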

We can see that the time series is stationary.
Note: In the final seasonally differenced series, we still observe some spikes around 2020-2022 because of the pandemic (a one-time event).
These are called interventions. They may not violate stationarity, but they can affect model accuracy. Techniques like intervention analysis can be used here.
We will discuss this when we explore ARIMA modeling.
We removed the trend and seasonality in the time series to make it stationary using differencing.
Now instead of differencing, we can also use STL Decomposition.
Earlier in this series, we discussed that when trend and seasonal patterns in a time series get messy, we use STL to extract them.
So, we can apply STL decomposition to the time series and extract the residual component, which is what remains after removing trend and seasonality.
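A minimal sketch of extracting the residual with statsmodels’ STL (period=12 for this monthly series, again assuming the data is loaded as `sales`):

```python
from statsmodels.tsa.seasonal import STL

# STL decomposition with period=12 for monthly data;
# the residual is what remains after removing trend and seasonality.
stl_result = STL(sales, period=12).fit()
residual = stl_result.resid
```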
We will also discuss ‘STL + ARIMA’ when we explore the ARIMA forecasting model.
So far, we have discussed methods for identifying stationarity and for transforming a non-stationary time series into a stationary one.
But why do time series forecasting models assume stationarity?
We use time series forecasting models to predict the future based on past values.
These models require a stationary time series because, in a stationary series, the patterns remain consistent over time, so what the model learns from the past still holds in the future.
In a non-stationary time series, the mean and variance keep changing, making the patterns unstable and the predictions unreliable.
Aren’t trend and seasonality also patterns in a time series?
Yes, trend and seasonality are also patterns in a time series, but they violate the assumptions of models like ARIMA, which require stationary input.
Trend and Seasonality are handled separately before modelling, and we will discuss this in upcoming blogs.
These time series forecasting models are designed to capture short-term dependencies after removing global patterns.
What exactly are these short-term dependencies?
When we have a time series, we try to decompose it using decomposition methods to understand the trend, seasonality and residuals in it.
We already know that the trend gives us the overall direction of the data over time (up or down) and seasonality shows the patterns that repeat at regular intervals.
We also get the residual, which is what remains after we remove trend and seasonality from the time series. This component is unexplained by trend and seasonality.
While the trend gives the overall direction and seasonality shows patterns that repeat at regular intervals, the residual may still contain temporary patterns, like a sudden spike in sales due to a promotional event or a sudden drop due to a strike or bad weather.
What can models like ARIMA do with this data?
Do models predict future promotional events or strikes based on this data? No.
Most time series forecasting models are used in live production systems across many industries, often in real time.
In real-time forecasting systems, as new data comes in, the forecasts are continuously updated to reflect the latest trends and patterns.
Let’s take a simple example of cool drinks inventory management.
The store owner knows that cool drink sales are high in summer and low in winter, but that doesn’t help with daily inventory planning. Here, the short-term dependencies are critical.
For example,
- There may be a spike in sales during festivals and the wedding season.
- A sudden temperature spike (heat wave) may push sales up.
- A weekend 1+1 offer may increase sales.
- Weekend sales may be higher than weekday sales.
- When the store was out of stock for 2-3 days, there may be a sudden burst of sales the moment stock is back.
These patterns don’t repeat consistently like seasonality, and they aren’t part of a long-term trend. But they occur often enough that forecasting models can learn from them.
Time series forecasting models don’t predict these future events, but they learn the patterns or behavior of the data when such a spike appears.
The model then predicts accordingly: after a promotional spike, for example, sales may gradually return to normal rather than dropping suddenly. The models capture these patterns and provide reliable forecasts.
After prediction, the trend and seasonality components are added back to obtain the final forecast.
This is why the short-term dependencies are critical in time series forecasting.
Dataset: This blog uses publicly available data from FRED (Federal Reserve Economic Data). The series Advance Retail Sales: Department Stores (RSDSELD) is published by the U.S. Census Bureau and can be used for analysis and publication with appropriate citation.
Official citation:
U.S. Census Bureau, Advance Retail Sales: Department Stores [RSDSELD], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/RSDSELD, July 7, 2025.
Note:
All the visualizations and test results shown in this blog were generated using Python code.
You can explore the complete code here: GitHub.
In this blog, I used Python to perform statistical tests and, based on the results, determined whether the time series is stationary or non-stationary.
Next up in this series is a detailed discussion of the statistical tests (ADF and KPSS tests) used to identify stationarity.
I hope you found this blog intuitive and helpful.
I’d love to hear your thoughts or answer any questions.
Thanks for reading!