Autoregression: Time Series, Models, Trading, Python and More – Part I

Author: Chainika Thakar (Originally written by Satyapriya Chaudhari)

Autoregression emerges as a powerful tool for anticipating future values in time-based data. This data, known as a time series, consists of observations collected at various timestamps, spaced either regularly or irregularly. Leveraging historical trends, patterns, and other hidden influences, autoregression models unlock the capability to forecast the value for the next time step.

By analysing and learning from past data, time series models (autoregression is one of several options) estimate likely future outcomes. This article focuses on one particular type: the autoregression model, often abbreviated as the AR model.

This article covers:

What is autoregression?
Formula of autoregression
Autoregression calculation
Autoregression model
Autoregression vs autocorrelation
Autoregression vs linear regression
Autoregression vs spatial autoregression
Autocorrelation Function and Partial Autocorrelation Function
Steps to build an autoregressive model
Example of autoregressive model in Python for trading
Applications of autoregression model in trading
Common challenges of autoregression models
Tips for optimizing autoregressive model performance

What is autoregression?

Autoregression models time-series data as a linear function of its own past values. It assumes that the value of a variable today is a weighted sum of its previous values, plus a constant and a random error.

For example, you might analyse the past month of Apple (ticker: AAPL) prices to predict its future performance.

Formula of autoregression

In simpler terms, autoregression says: “Today’s value depends on yesterday’s value, the day before that, and so on.”

We express this relationship mathematically using a formula:

X_t = c + ϕ₁X_{t−1} + ϕ₂X_{t−2} + … + ϕ_pX_{t−p} + ε_t

Where,

X_t is the current value in the time series.
c is a constant or intercept term.
ϕ₁, ϕ₂, …, ϕ_p are the autoregressive coefficients.
X_{t−1}, X_{t−2}, …, X_{t−p} are the past values of the time series.
ε_t is the error term representing the random fluctuations or unobserved factors.
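To make the formula concrete, here is a minimal Python sketch of the one-step prediction it describes, with the error term left out because it cannot be known in advance. The function name `ar_predict`, the coefficient values and the three-point history are all made up for illustration; they are not estimates from real data.

```python
import numpy as np

def ar_predict(history, c, phi):
    """One-step AR(p) prediction: c + phi_1*X_{t-1} + ... + phi_p*X_{t-p}."""
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    if len(history) < p:
        raise ValueError("need at least p past values")
    # Reverse the last p values so the most recent one lines up with phi_1.
    lags = np.asarray(history[-p:], dtype=float)[::-1]
    return c + float(phi @ lags)

# Hypothetical inputs, ordered oldest to newest.
history = [1.0, 2.0, 3.0]
prediction = ar_predict(history, c=0.5, phi=[0.6, 0.3])
print(prediction)  # 0.5 + 0.6*3.0 + 0.3*2.0, i.e. about 2.9
```

With p = 2, only the two most recent values (3.0 and 2.0) enter the prediction; the oldest value is ignored.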

Autoregression calculation

The autoregressive coefficients (ϕ₁, ϕ₂, …, ϕ_p) are typically estimated using statistical methods like least squares regression.

In the context of autoregressive (AR) models, the coefficients represent the weights assigned to the lagged values of the time series to predict the current value. These coefficients capture the relationship between the current observation and its past values.

The goal is to find the coefficients that best fit the historical data, allowing the model to accurately capture the underlying patterns and trends. Once the coefficients are determined, they can be used to forecast future values in the time series based on the observed values from previous time points. Hence, the autoregression calculation helps to create an autoregressive model for time series forecasting.
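The least squares estimation described above can be sketched as follows. This is an illustrative example on simulated data, not a production routine: the helper names `simulate_ar` and `fit_ar_ols`, the "true" coefficients and the random seed are all assumptions chosen for the demonstration.

```python
import numpy as np

def simulate_ar(c, phi, n, seed=0, burn_in=200):
    """Generate n points from an AR(p) process with standard normal errors."""
    rng = np.random.default_rng(seed)
    p = len(phi)
    x = np.zeros(n + burn_in)
    eps = rng.normal(0.0, 1.0, size=n + burn_in)
    for t in range(p, n + burn_in):
        x[t] = c + sum(phi[i] * x[t - 1 - i] for i in range(p)) + eps[t]
    return x[burn_in:]  # drop the warm-up period

def fit_ar_ols(x, p):
    """Estimate (c, phi_1..phi_p) by ordinary least squares."""
    x = np.asarray(x, dtype=float)
    # Column k holds the series lagged by k steps, aligned with targets x[p:].
    lagged = [x[p - k : len(x) - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(len(x) - p)] + lagged)
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1:]

# Hypothetical "true" parameters used only to generate test data.
series = simulate_ar(c=1.0, phi=[0.5, -0.3], n=5000)
c_hat, phi_hat = fit_ar_ols(series, p=2)
print(c_hat, phi_hat)
```

On a long enough simulated series, the estimates should land close to the values used to generate it, which is a useful sanity check before applying the same routine to market data.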

Autoregression model

Before delving into autoregression, it’s beneficial to revisit the concept of a regression model.⁽¹⁾

A regression model is a statistical method for determining the association between a dependent variable (often denoted as y) and an independent variable (typically represented as X).

For instance, consider having the stock prices of Bank of America (ticker: BAC) and J.P. Morgan (ticker: JPM).

If the objective is to forecast the stock price of JPM based on BAC’s stock price, then JPM’s stock price would be the dependent variable, y, while BAC’s stock price would act as the independent variable, X. Assuming a linear association between X and y, the regression equation would be:

y = mX + c

Here, m represents the slope and c denotes the intercept of the equation.
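As a quick illustration of fitting y = mX + c, the sketch below uses NumPy's `polyfit` on synthetic numbers. The arrays `x_price` and `y_price` are hypothetical stand-ins for two price series; they are not actual BAC or JPM prices, and the slope of 3 and intercept of 10 are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: a hypothetical independent series X and a dependent
# series y built as 3*X + 10 plus random noise.
x_price = np.linspace(20.0, 40.0, 200)
y_price = 3.0 * x_price + 10.0 + rng.normal(0.0, 1.0, size=200)

# A degree-1 polynomial fit returns the slope first, then the intercept.
m, c = np.polyfit(x_price, y_price, deg=1)
print(f"slope m = {m:.3f}, intercept c = {c:.3f}")
```

The fitted slope and intercept should come out near the values used to build the synthetic data, with small differences due to the added noise.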

However, if you have only one series, such as the stock prices of JPM, and wish to forecast its future values from its past values, you can employ autoregression. Let’s denote the stock price at time t as y_t.

The relationship between y_t and its preceding value y_{t−1} can be modelled using:

AR(1): y_t = c + ϕ₁y_{t−1} + ε_t

Here, ϕ₁ is the model parameter, c is the constant, and ε_t is the error term. This equation represents an autoregressive model of order 1, meaning the variable is regressed against its own previous value.

Similar to linear regression, the autoregressive model presupposes a linear relationship between y_t and y_{t−1}. The correlation between a series and its own lagged values is called autocorrelation. A deeper exploration of this concept will follow.
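One way to see the link between an order-1 model and autocorrelation is to simulate a series and measure how strongly each value correlates with the one before it. The sketch below assumes a stationary AR(1) process (|ϕ₁| < 1) with standard normal errors, for which the lag-1 autocorrelation equals ϕ₁ in theory. The coefficient 0.7, the seed and the sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical AR(1) parameters chosen for illustration.
phi_1, c, n = 0.7, 0.0, 5000

y = np.zeros(n)
eps = rng.normal(size=n)
for t in range(1, n):
    # Each value is the constant, plus phi_1 times the previous value, plus noise.
    y[t] = c + phi_1 * y[t - 1] + eps[t]

# Lag-1 autocorrelation: correlation between the series and itself shifted by one step.
r1 = np.corrcoef(y[:-1], y[1:])[0, 1]
print(f"lag-1 autocorrelation = {r1:.3f}")
```

The sample value will not match ϕ₁ exactly, but for a series of this length it should be close.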

Autoregression models of order 2 and generalising to order p

Let’s delve into autoregression models, starting with order 2 and then generalising to order p.

Autoregression Model of Order 2 (AR(2))

In an autoregression model of order 2 (AR(2)), the current value y_t is predicted based on its two most recent lagged values, y_{t−1} and y_{t−2}.

y_t = c + ϕ₁y_{t−1} + ϕ₂y_{t−2} + ε_t

Where,

c is a constant
ϕ₁ and ϕ₂ are the autoregressive coefficients for the first and second lags, respectively
ε_t represents the error term
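An AR(2) model can also forecast several steps ahead by feeding each prediction back in as the newest lagged value. The sketch below does this with the error term replaced by its expected value of zero. The function name `ar2_forecast`, the coefficients and the starting values are hypothetical, picked so the arithmetic is easy to follow by hand.

```python
def ar2_forecast(last_two, c, phi_1, phi_2, steps):
    """Iterate y_t = c + phi_1*y_{t-1} + phi_2*y_{t-2}, with eps_t set to 0."""
    y_prev2, y_prev1 = last_two  # (y_{t-2}, y_{t-1})
    forecasts = []
    for _ in range(steps):
        y_next = c + phi_1 * y_prev1 + phi_2 * y_prev2
        forecasts.append(y_next)
        # Shift the window: the new forecast becomes the most recent lag.
        y_prev2, y_prev1 = y_prev1, y_next
    return forecasts

# Hypothetical inputs: the last two observed values were 10.0 and then 12.0.
path = ar2_forecast(last_two=(10.0, 12.0), c=1.0, phi_1=0.5, phi_2=0.25, steps=3)
print(path)  # [9.5, 8.75, 7.75]
```

The first forecast is 1 + 0.5 × 12 + 0.25 × 10 = 9.5; later forecasts reuse earlier ones, so any error in an early step carries forward into the rest of the path.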

Generalising to order p (AR(p))

For an autoregression model of order p (AR(p)), the current value y_t is predicted based on its p most recent lagged values.

y_t = c + ϕ₁y_{t−1} + ϕ₂y_{t−2} + … + ϕ_py_{t−p} + ε_t

Where,

c is a constant
ϕ₁, ϕ₂, …, ϕ_p are the autoregressive coefficients for the respective lagged terms y_{t−1}, y_{t−2}, …, y_{t−p}
ε_t represents the error term

In essence, an AR(p) model considers the influence of the p previous observations on the current value. The choice of p depends on the specific time series data and is often determined using methods like information criteria or examination of autocorrelation and partial autocorrelation plots.

The higher the order p, the more complex the model becomes, capturing more historical information but also potentially becoming more prone to overfitting. Therefore, it’s essential to strike a balance and select an appropriate p based on the data characteristics and model diagnostics.
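The trade-off between fit and complexity can be illustrated with an information criterion. The sketch below fits AR(p) by least squares for several candidate orders and compares them using one common least-squares form of the Akaike information criterion, n·ln(RSS/n) + 2k, where k is the number of fitted parameters. The helper name `ar_aic`, the simulated AR(2) data and the maximum order of 6 are assumptions made for this example.

```python
import numpy as np

def ar_aic(x, p, max_p):
    """AIC of a least-squares AR(p) fit, using the same targets for every p."""
    x = np.asarray(x, dtype=float)
    y = x[max_p:]  # drop max_p points so all orders are compared on equal data
    n = len(y)
    cols = [np.ones(n)] + [x[max_p - k : len(x) - k] for k in range(1, p + 1)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    return n * np.log(rss / n) + 2 * (p + 1)  # p lag coefficients plus the constant

# Simulate a hypothetical AR(2) series to test the selection procedure.
rng = np.random.default_rng(7)
n_obs, phi = 2000, (0.5, -0.3)
x = np.zeros(n_obs)
eps = rng.normal(size=n_obs)
for t in range(2, n_obs):
    x[t] = phi[0] * x[t - 1] + phi[1] * x[t - 2] + eps[t]

max_p = 6
aics = {p: ar_aic(x, p, max_p) for p in range(1, max_p + 1)}
best_p = min(aics, key=aics.get)  # lower AIC is better
print(aics)
print("selected order:", best_p)
```

Because the 2k term penalises every extra coefficient, a higher order is preferred only when it reduces the residual sum of squares enough to justify the added complexity. AIC can still pick an order somewhat above the true one, so it is best read alongside the autocorrelation and partial autocorrelation plots.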

Stay tuned to learn about autoregression vs autocorrelation.

Originally posted on QuantInsti Blog.


Posted February 7, 2024
