Popular Python Libraries for Algorithmic Trading – Part II

Articles From: QuantInsti
Website: QuantInsti

See Part I for an overview of Python libraries.

Python libraries for data manipulation

NumPy

NumPy or Numerical Python provides powerful implementations of large multi-dimensional arrays and matrices. The library consists of functions for complex array processing and high-level computations on these arrays. Some of the mathematical functions of this library include:

  • trigonometric functions (sin, cos, tan, radians),
  • hyperbolic functions (sinh, cosh, tanh),
  • logarithmic functions (log, logaddexp, log10, log2) etc.
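As a minimal sketch of how these functions apply to whole arrays at once, the example below computes log returns from a short price series and evaluates a few of the functions listed above. The price values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical closing prices, used only for illustration.
prices = np.array([100.0, 101.5, 99.8, 102.3, 103.1])

# Log returns ln(P_t / P_{t-1}), computed for the whole array in one step.
log_returns = np.log(prices[1:] / prices[:-1])

# Element-wise trigonometric and hyperbolic functions.
angles = np.radians(np.array([0.0, 90.0, 180.0]))
sines = np.sin(angles)
tanh_values = np.tanh(log_returns)

print(log_returns.round(4))
print(sines.round(4))
```

Because the operations are vectorised, no explicit Python loop over the prices is needed.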

Pandas

Pandas is a Python library for data analysis and manipulation. It works with numerical tables (data frames) and time series, which is why it is heavily used for algorithmic trading in Python. Pandas can be used for various tasks, including:

  • importing CSV files,
  • performing arithmetic operations in series,
  • boolean indexing,
  • collecting information about a data frame, etc.
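The four tasks listed above can be sketched in a few lines. The CSV content, column names, and the volume threshold below are assumptions made for the example; in practice `read_csv` would usually be given a file path.

```python
import io
import pandas as pd

# Hypothetical CSV content; a real workflow would pass a file path instead.
csv_text = """date,close,volume
2024-01-02,100.0,1200
2024-01-03,101.5,1500
2024-01-04,99.8,900
2024-01-05,102.3,1800
"""

# Import the CSV, parsing the date column and using it as the index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

# Arithmetic on a Series: simple daily percentage return.
df["return"] = df["close"].pct_change()

# Boolean indexing: keep only the days with volume above 1000.
high_volume = df[df["volume"] > 1000]

# Collect information about the data frame.
df.info()
print(df.describe())
```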

SciPy

SciPy, as the name suggests, is an open-source Python library for scientific computing. It is used together with NumPy for tasks such as numerical integration, optimization, and image processing. A few of the SciPy modules used for these tasks are:

  • scipy.integrate (For numerical integration),
  • scipy.signal (For signal processing),
  • scipy.fftpack (For Fast Fourier Transform; newer SciPy releases recommend scipy.fft instead) etc.
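A short sketch of two of these modules follows. It integrates the standard normal density numerically and smooths a noisy synthetic series. The integration limits, the noise level, and the filter settings are arbitrary choices made for the example.

```python
import numpy as np
from scipy import integrate, signal


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)


# Numerical integration: area under the density between -1.96 and 1.96.
area, abs_error = integrate.quad(normal_pdf, -1.96, 1.96)

# Signal processing: smooth a noisy synthetic series with a Savitzky-Golay filter.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
noisy = np.sin(2.0 * np.pi * t) + 0.1 * rng.standard_normal(t.size)
smoothed = signal.savgol_filter(noisy, window_length=11, polyorder=2)

print(round(area, 4))
```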

Summary: The data manipulation libraries all centre on numerical work. NumPy is used mainly for array processing and high-level computations on arrays, whereas Pandas is mainly used for importing data, performing arithmetic operations on series, and working with data frames.

SciPy is used for more specialised scientific computations such as signal processing, which can feed into the decision of whether to buy or sell.

Python library for technical analysis

TA-Lib

TA-Lib, or Technical Analysis Library, is an open-source library that is widely used to perform technical analysis on financial data using technical indicators such as RSI (Relative Strength Index), Bollinger Bands, and MACD. It works not only with Python but also with other programming languages such as C/C++, Java, and Perl. Here are some of the functions ⁽¹⁾ available in TA-Lib:

  • BBANDS – For Bollinger Bands
  • AROONOSC – For Aroon Oscillator
  • MACD – For Moving Average Convergence/Divergence
  • RSI – For Relative Strength Index.

Summary: The technical analysis library computes technical indicators for use in trading. These indicators help the algorithmic trader build a strategy on the basis of observed market conditions.

For example, RSI indicates overbought and oversold conditions in the market. Stocks flagged as overbought are good candidates for selling, whereas stocks flagged as oversold are candidates for buying.
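To make the RSI idea concrete, here is a minimal hand-rolled sketch in Pandas. It is not TA-Lib's implementation (TA-Lib's Python wrapper offers `talib.RSI` for this), so the values may differ slightly near the start of a series. The `rsi` helper, the 14-period window, and the 70/30 thresholds are assumptions made for the example. The thresholds follow a common convention and do not come from the text above.

```python
import pandas as pd


def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """Relative Strength Index with Wilder-style exponential smoothing."""
    delta = close.diff()
    gains = delta.clip(lower=0.0)        # upward moves, zero otherwise
    losses = (-delta).clip(lower=0.0)    # downward moves as positive numbers
    avg_gain = gains.ewm(alpha=1.0 / period, adjust=False, min_periods=period).mean()
    avg_loss = losses.ewm(alpha=1.0 / period, adjust=False, min_periods=period).mean()
    relative_strength = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + relative_strength)


# Hypothetical, steadily rising prices: there are no losses, so RSI saturates.
close = pd.Series([float(price) for price in range(100, 130)])
rsi_values = rsi(close)

overbought = rsi_values > 70   # candidate sell signals
oversold = rsi_values < 30     # candidate buy signals
print(rsi_values.tail(3))
```

The first `period` values are NaN because the smoothed averages need that many price changes before they are reported.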


Python libraries for machine learning

Scikit-learn

Scikit-learn is a machine learning library built on top of SciPy. It provides various algorithms for classification, clustering, and regression, and can be used alongside other Python libraries such as NumPy and SciPy for scientific and numerical computations. Some of its modules are:

  • sklearn.cluster,
  • sklearn.datasets,
  • sklearn.ensemble,
  • sklearn.mixture etc.
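The sketch below touches three of these modules on synthetic data: `sklearn.datasets` generates a labelled data set, `sklearn.ensemble` fits a random forest classifier, and `sklearn.cluster` groups the same points without labels. The sample sizes and hyperparameters are arbitrary assumptions. In a trading setting the features and labels would come from market data.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data standing in for features and up/down labels.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Classification with an ensemble of decision trees.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# Clustering of the same points, ignoring the labels.
cluster_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

print(f"Held-out accuracy: {accuracy:.2f}")
```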

TensorFlow

TensorFlow ⁽²⁾ is an open-source software library for high-performance numerical computations and machine learning applications such as neural networks.

A GPU-enabled build of TensorFlow can be installed for use in Python.

TensorFlow allows easy deployment of computation across various platforms like CPUs, GPUs, TPUs etc. due to its flexible architecture.

Keras

Keras ⁽³⁾ is a deep learning library used to develop neural networks and other deep learning models. It is installed as a Python package and can run on top of TensorFlow, Microsoft Cognitive Toolkit, or Theano, with a focus on being modular and extensible. It provides the elements used to build neural networks, such as layers, objectives, and optimizers. In trading, this library can be used for stock price prediction with artificial neural networks.

Theano

Theano is a Python machine learning library and computational framework for evaluating mathematical expressions on multidimensional arrays. Theano works similarly to TensorFlow, but it is not as efficient as TensorFlow.

However, Theano can be used in distributed or parallel environments and is mostly used in deep learning projects.

LightGBM (Gradient Boost)

Gradient boosting is one of the best and most popular machine learning techniques. It helps developers build new algorithms by combining simple base models, namely decision trees. Special libraries are therefore available for fast and efficient implementation of this method.

These libraries are LightGBM, XGBoost, and CatBoost. All of them solve a common problem and can be used in much the same way.

Here are some notable points of LightGBM:

  • Very fast computation ensures high production efficiency.
  • Intuitive, hence making it user-friendly.
  • Faster training than many other machine learning libraries.
  • Does not raise errors on NaN values and other special values in the data.

This library provides highly scalable, optimised, and fast implementations of gradient boosting, which makes it popular among machine learning developers.

Summary: Each of the machine learning libraries serves a different purpose. Scikit-learn provides classification, clustering, and regression algorithms and can be used for scientific and numerical computations.

Although TensorFlow and Theano work in a similar way, Theano is not as efficient as TensorFlow. Even so, Theano is often preferred for deep learning projects since it allows us to evaluate mathematical operations involving multi-dimensional arrays.

Keras provides the elements used to build neural networks, such as layers, objectives, and optimizers. ELI5 works with other libraries such as XGBoost, lightning, and scikit-learn to help improve the accuracy of machine learning model predictions.

Last but not least, LightGBM provides fast, scalable implementations of gradient boosting.

Stay tuned for the next installment in this series to learn about Python libraries used for backtesting.

For additional insight on this topic and to read the article originally posted on QuantInsti, visit https://blog.quantinsti.com/python-trading-library/.
