Autoregression: Time Series, Models, Trading, Python and More – Part I

Articles From: QuantInsti
Website: QuantInsti

Author: Chainika Thakar (Originally written by Satyapriya Chaudhari)

Autoregression emerges as a powerful tool for anticipating future values in time-based data. This data, known as a time series, consists of observations collected at various timestamps, spaced either regularly or irregularly. Leveraging historical trends, patterns, and other hidden influences, autoregression models unlock the capability to forecast the value for the next time step.

By analysing and learning from past data, these models (including various options beyond autoregression) paint a picture of future outcomes. This article delves deeper into one particular type: the autoregression model, often abbreviated as the AR model.

This article covers:

  • What is autoregression?
  • Formula of autoregression
  • Autoregression calculation
  • Autoregression model
  • Autoregression vs autocorrelation
  • Autoregression vs linear regression
  • Autoregression vs spatial autoregression
  • Autocorrelation Function and Partial Autocorrelation Function
  • Steps to build an autoregressive model
  • Example of autoregressive model in Python for trading
  • Applications of autoregression model in trading
  • Common challenges of autoregression models
  • Tips for optimizing autoregressive model performance

What is autoregression?

Autoregression models time-series data as a linear function of its past values. It assumes that the value of a variable today is a weighted sum of its previous values.

For example, an autoregression model might use the past month of AAPL (Apple) performance to predict its future performance.

Formula of autoregression

In simpler terms, autoregression says: “Today’s value depends on yesterday’s value, the day before that, and so on.”

We express this relationship mathematically using a formula:

$X_t = c + \phi_1 X_{t-1} + \phi_2 X_{t-2} + \ldots + \phi_p X_{t-p} + \varepsilon_t$

Where:

  • $X_t$ is the current value in the time series.
  • $c$ is a constant or intercept term.
  • $\phi_1, \phi_2, \ldots, \phi_p$ are the autoregressive coefficients.
  • $X_{t-1}, X_{t-2}, \ldots, X_{t-p}$ are the past values of the time series.
  • $\varepsilon_t$ is the error term representing the random fluctuations or unobserved factors.
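
To make the formula concrete, here is a minimal sketch of a one-step calculation. The function name `ar_predict` and all numbers are hypothetical, chosen only for illustration, and the error term is left out because it cannot be known in advance.

```python
def ar_predict(history, c, phis):
    """One-step AR(p) value: c + phi_1*X_{t-1} + ... + phi_p*X_{t-p} (error term omitted)."""
    p = len(phis)
    if len(history) < p:
        raise ValueError("need at least p past values")
    # history[-1] is X_{t-1}, history[-2] is X_{t-2}, and so on
    lags = history[::-1][:p]
    return c + sum(phi * x for phi, x in zip(phis, lags))


# Hypothetical past values and coefficients (not estimated from real data)
past_values = [100.0, 102.0, 101.0]
forecast = ar_predict(past_values, c=10.0, phis=[0.6, 0.3])
print(round(forecast, 2))  # 10 + 0.6*101 + 0.3*102 = 101.2
```

Only the two most recent values are used here because two coefficients are supplied, so this is an AR(2) calculation.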

Autoregression calculation

The autoregressive coefficients ($\phi_1, \phi_2, \ldots, \phi_p$) are typically estimated using statistical methods like least squares regression.

In the context of autoregressive (AR) models, the coefficients represent the weights assigned to the lagged values of the time series to predict the current value. These coefficients capture the relationship between the current observation and its past values.

The goal is to find the coefficients that best fit the historical data, allowing the model to accurately capture the underlying patterns and trends. Once the coefficients are determined, they can be used to forecast future values in the time series based on the observed values from previous time points. Hence, the autoregression calculation helps to create an autoregressive model for time series forecasting.
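
As a rough sketch of how least squares estimation could look, the snippet below builds a table of lagged values and solves for the coefficients with NumPy. The helper name `fit_ar_least_squares` and the synthetic series (generated with assumed "true" parameters) are illustrative assumptions, not the article's own implementation.

```python
import numpy as np


def fit_ar_least_squares(series, p):
    """Estimate c and phi_1..phi_p of an AR(p) model by ordinary least squares."""
    x = np.asarray(series, dtype=float)
    # Each row holds [1, X_{t-1}, ..., X_{t-p}] for one target value X_t
    rows = [np.concatenate(([1.0], x[t - p:t][::-1])) for t in range(p, len(x))]
    design = np.array(rows)
    target = x[p:]
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[0], coef[1:]


# Synthetic AR(2) data with known (assumed) parameters, so the fit can be checked
rng = np.random.default_rng(0)
true_c, true_phis = 0.5, [0.6, -0.2]
series = [0.0, 0.0]
for _ in range(5000):
    noise = rng.normal()
    series.append(true_c + true_phis[0] * series[-1] + true_phis[1] * series[-2] + noise)

c_hat, phi_hat = fit_ar_least_squares(series, p=2)
print(c_hat, phi_hat)  # estimates should land close to 0.5 and [0.6, -0.2]
```

With enough data, the estimates should sit near the parameters used to generate the series; on real price data there is no known "true" value to compare against.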

Autoregression model

Before delving into autoregression, it’s beneficial to revisit the concept of a regression model.⁽¹⁾

A regression model serves as a statistical method to determine the association between a dependent variable (often denoted as y) and an independent variable (typically represented as X). Thus, in regression analysis, the focus is on understanding the relationship between these two variables.

For instance, consider having the stock prices of Bank of America (ticker: BAC) and J.P. Morgan (ticker: JPM).

If the objective is to forecast the stock price of JPM based on BAC’s stock price, then JPM’s stock price would be the dependent variable, y, while BAC’s stock price would act as the independent variable, X. Assuming a linear association between X and y, the regression equation would be:

y = mX + c

m represents the slope, and c denotes the intercept of the equation.
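
A small sketch of fitting this line with NumPy follows. The prices are invented for illustration and are not real BAC or JPM quotes; the variable names are likewise assumptions.

```python
import numpy as np

# Hypothetical prices, invented for illustration (not real market data)
bac = np.array([30.0, 31.0, 32.0, 33.0, 34.0])       # independent variable X
jpm = np.array([140.0, 144.0, 148.0, 152.0, 156.0])  # dependent variable y

# A degree-1 polynomial fit returns the slope m and the intercept c
m, c = np.polyfit(bac, jpm, 1)

# Use the fitted line to predict y for a new X value
predicted = m * 35.0 + c
print(m, c, predicted)
```

Because the made-up points lie exactly on a line, the fit recovers a slope of about 4 and an intercept of about 20; real prices would scatter around the fitted line.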

However, if you possess only one set of data, such as the stock prices of JPM, and wish to forecast its future values based on its past values, you can employ autoregression. Let’s denote the stock price at time t as $y_t$.

The relationship between $y_t$ and its preceding value $y_{t-1}$ can be modelled using:

AR(1): $y_t = c + \phi_1 y_{t-1} + \varepsilon_t$

Here, $\phi_1$ is the model parameter, c remains the constant, and $\varepsilon_t$ is the error term. This equation represents an autoregressive model of order 1, signifying regression against a variable’s own earlier values.

Similar to linear regression, the autoregressive model presupposes a linear relationship between $y_t$ and $y_{t-1}$. The correlation between a series and its own lagged values is called autocorrelation. A deeper exploration of this concept will follow subsequently.
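
To illustrate the link between an order-1 model and autocorrelation, the sketch below simulates a series from an AR(1) equation with an added random noise term, then measures how strongly each value is related to the one before it. The parameter values and the seed are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
phi_1, c = 0.8, 1.0  # assumed parameters for the simulation

y = np.empty(3000)
y[0] = c / (1 - phi_1)  # start at the long-run mean of the process
for t in range(1, len(y)):
    y[t] = c + phi_1 * y[t - 1] + rng.normal()

# Correlation between the series and itself shifted by one step
lag1_autocorr = np.corrcoef(y[:-1], y[1:])[0, 1]

# Regressing y_t on y_{t-1} gives an estimate of phi_1 (slope) and c (intercept)
slope, intercept = np.polyfit(y[:-1], y[1:], 1)
print(lag1_autocorr, slope, intercept)
```

For a stable AR(1) process like this one, both the lag-1 autocorrelation and the fitted slope should come out close to the value of $\phi_1$ used in the simulation.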

Autoregression models of order 2 and generalisation to order p

Let’s delve into autoregression models, starting with order 2 and then generalising to order p.

Autoregression Model of Order 2 (AR(2))

In an autoregression model of order 2 (AR(2)), the current value $y_t$ is predicted based on its two most recent lagged values, $y_{t-1}$ and $y_{t-2}$.

$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t$

  • $c$ is a constant
  • $\phi_1$ and $\phi_2$ are the autoregressive coefficients for the first and second lags, respectively
  • $\varepsilon_t$ represents the error term
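
As an illustration of how an AR(2) equation can be rolled forward, the sketch below feeds each forecast back in as the newest lag. The function name `ar2_forecast` and the numbers are hypothetical, and the error term is set to zero since future shocks are unknown.

```python
def ar2_forecast(y_prev1, y_prev2, c, phi_1, phi_2, steps):
    """Iterate y_t = c + phi_1*y_{t-1} + phi_2*y_{t-2}, with the error term set to zero."""
    forecasts = []
    for _ in range(steps):
        y_next = c + phi_1 * y_prev1 + phi_2 * y_prev2
        forecasts.append(y_next)
        # Shift the window: the new forecast becomes the most recent lag
        y_prev2, y_prev1 = y_prev1, y_next
    return forecasts


# Hypothetical starting values and coefficients
path = ar2_forecast(y_prev1=10.0, y_prev2=8.0, c=2.0, phi_1=0.5, phi_2=0.25, steps=3)
print(path)  # [9.0, 9.0, 8.75]
```

Forecasts beyond the first step rely on earlier forecasts rather than observed values, so their uncertainty grows with the horizon.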

Generalising to order p (AR(p))

For an autoregression model of order p (AR(p)), the current value $y_t$ is predicted based on its p most recent lagged values.

$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \ldots + \phi_p y_{t-p} + \varepsilon_t$

  • $c$ is a constant
  • $\phi_1, \phi_2, \ldots, \phi_p$ are the autoregressive coefficients for the respective lagged terms $y_{t-1}, y_{t-2}, \ldots, y_{t-p}$
  • $\varepsilon_t$ represents the error term

In essence, an AR(p) model considers the influence of the p previous observations on the current value. The choice of p depends on the specific time series data and is often determined using methods like information criteria or examination of autocorrelation and partial autocorrelation plots.

The higher the order p, the more complex the model becomes, capturing more historical information but also potentially becoming more prone to overfitting. Therefore, it’s essential to strike a balance and select an appropriate p based on the data characteristics and model diagnostics.
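
One way to explore this trade-off is to compare an information criterion across candidate orders. The sketch below uses one common form of the Akaike information criterion (AIC), n·ln(RSS/n) + 2k, on a simulated AR(2) series. The helper `ar_aic`, the candidate range, and the simulation parameters are all assumptions for illustration.

```python
import numpy as np


def ar_aic(series, p, max_p):
    """AIC of an OLS-fitted AR(p), scoring the same target values for every p up to max_p."""
    x = np.asarray(series, dtype=float)
    target = x[max_p:]
    n = len(target)
    # Column k holds the series lagged by k steps, aligned with the targets
    lags = [x[max_p - k:len(x) - k] for k in range(1, p + 1)]
    design = np.column_stack([np.ones(n)] + lags)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    rss = np.sum((target - design @ coef) ** 2)
    # Lower is better: a fit term plus a penalty on the p + 1 estimated coefficients
    return n * np.log(rss / n) + 2 * (p + 1)


# Simulated AR(2) series with assumed coefficients 0.5 and -0.5
rng = np.random.default_rng(7)
y = [0.0, 0.0]
for _ in range(2000):
    y.append(0.5 * y[-1] - 0.5 * y[-2] + rng.normal())

max_p = 6
aic = {p: ar_aic(y, p, max_p) for p in range(1, max_p + 1)}
best_p = min(aic, key=aic.get)
print(aic, best_p)
```

On data like this, the criterion should drop sharply from p = 1 to p = 2 and change little afterwards. AIC can still favour a slightly larger order than the true one, which is one reason to check it against diagnostics such as the partial autocorrelation plot.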

Stay tuned to learn about autoregression vs autocorrelation.

Originally posted on QuantInsti Blog.

