Detecting Bots On Twitter Using Botometer – Part I

Articles From: QuantInsti
Website: QuantInsti

In this blog, we will learn what bots are and how they can skew the sentiment analysis used in your trading strategy. We will cover why bots need to be identified and which features Botometer uses to detect them.

When we trade on the basis of market sentiment, we need to fetch data from news sources such as Twitter, Reuters, Bloomberg and Webhose. Reading complete articles and gauging their sentiment can be difficult, but estimating the sentiment of a tweet is much simpler.

But before you estimate the sentiment of a tweet, you need to know whether it was an automated response from a bot or was written by a human.

You may ask why this is relevant.

Why should we identify a bot?

It is relevant because knowing what the bots are doing tells you how the sentiment of a particular stock on Twitter is being manipulated. When we calculate the Twitter sentiment of a particular stock, we identify and remove the tweets made by bot users. This gives the true sentiment, free of manipulation. Used alongside other technical indicators, this true sentiment can be a powerful metric for calling the tops and bottoms of a trend.


In Python, we use a library called botometer to estimate whether the account behind a particular tweet is a bot.
The botometer library uses a machine learning algorithm trained on tens of thousands of labelled examples. The algorithm outputs a probability between 0 and 1, where a value close to 1 indicates that the Twitter account is likely managed by a bot.
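
To make the filtering idea concrete, here is a minimal sketch of how bot probabilities could be used to clean a sentiment estimate. The tweet records, the sentiment values, and the 0.5 cutoff are all hypothetical and chosen only for illustration; they do not come from Botometer or from any real data.

```python
from statistics import mean

# Hypothetical records: (user_id, tweet sentiment in [-1, 1], bot probability in [0, 1]).
tweets = [
    ("u1", 0.8, 0.05),
    ("u2", 0.9, 0.95),
    ("u3", -0.2, 0.10),
    ("u4", 0.9, 0.88),
]

BOT_THRESHOLD = 0.5  # assumed cutoff; tune it for your own data


def human_only_sentiment(records, threshold=BOT_THRESHOLD):
    """Average sentiment over tweets whose author scores below the threshold."""
    kept = [sentiment for _, sentiment, bot_prob in records if bot_prob < threshold]
    return mean(kept) if kept else None


# Compare the naive average with the average after dropping likely bots.
raw_sentiment = mean(sentiment for _, sentiment, _ in tweets)
filtered_sentiment = human_only_sentiment(tweets)
print(round(raw_sentiment, 2), round(filtered_sentiment, 2))
```

In this made-up sample, the two accounts with high bot probabilities both post strongly positive tweets, so removing them pulls the average sentiment down noticeably.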

The Botometer API takes the user id as input and extracts 1200 features related to that user to compute a score. Botometer gives separate scores for the following categories:

  1. Network features
  2. User features
  3. Friends features
  4. Temporal features
  5. Content features
  6. Sentiment features
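
One way to read the separate category scores is to rank them and see which signal looks most bot-like. The sketch below assumes the scores are already available as a plain dictionary; the keys and values are hypothetical and do not reproduce the actual Botometer response format.

```python
# Hypothetical per-category scores for one account (not real Botometer output).
category_scores = {
    "network": 0.82,
    "user": 0.35,
    "friends": 0.40,
    "temporal": 0.77,
    "content": 0.51,
    "sentiment": 0.28,
}


def rank_categories(scores):
    """Return (category, score) pairs sorted from most to least bot-like."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_categories(category_scores)
top_category, top_score = ranked[0]
print(f"Most bot-like signal: {top_category} ({top_score:.2f})")
```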

Let us discuss some of these features.

Network features

Network features of a user capture information about the retweets, mentions, and hashtags in the user's past tweets.

For example, if the user retweets only tweets made by one particular handle, then the user is most likely a bot.
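
The retweet example can be sketched as a simple concentration measure: the share of a user's retweets that come from their single most-retweeted handle. This is an illustrative heuristic only, not how Botometer computes its network features, and the handles below are made up.

```python
from collections import Counter


def retweet_concentration(retweeted_handles):
    """Share of a user's retweets that come from their most-retweeted handle."""
    if not retweeted_handles:
        return 0.0
    counts = Counter(retweeted_handles)
    top_count = counts.most_common(1)[0][1]
    return top_count / len(retweeted_handles)


# Hypothetical retweet histories: each entry is the handle whose tweet was retweeted.
suspicious = ["@acme_ir"] * 9 + ["@newswire"]
organic = ["@a", "@b", "@c", "@a", "@d"]

print(retweet_concentration(suspicious))
print(retweet_concentration(organic))
```

A value close to 1 means almost every retweet amplifies the same handle, which is the pattern described above.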

User features

This category contains user-specific information such as the user name, language, location, and account creation date. Bot accounts often lack such information, and when it is present, it tends to be gibberish.
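
As a rough illustration of the "missing information" signal, the sketch below measures how many of the profile fields named above are actually filled in. The field names and sample profiles are hypothetical, and the check only looks for empty values; it does not try to detect gibberish.

```python
# Profile fields mentioned in the text; the key names themselves are assumed.
PROFILE_FIELDS = ("name", "language", "location", "created_at")


def profile_completeness(profile):
    """Fraction of the expected profile fields that hold a non-empty value."""
    filled = sum(
        1 for field in PROFILE_FIELDS if str(profile.get(field) or "").strip()
    )
    return filled / len(PROFILE_FIELDS)


# Hypothetical profiles for illustration only.
sparse = {"name": "xk29qz", "language": "", "location": None, "created_at": "2019-03-01"}
full = {"name": "Jane Doe", "language": "en", "location": "Mumbai", "created_at": "2012-06-15"}

print(profile_completeness(sparse))
print(profile_completeness(full))
```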

To learn more about Python and R, visit the QuantInsti website and see the educational offerings in their Executive Programme in Algorithmic Trading (EPAT™).

Disclosure: Interactive Brokers

Information posted on IBKR Campus that is provided by third-parties does NOT constitute a recommendation that you should contract for the services of that third party. Third-party participants who contribute to IBKR Campus are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.

This material is from QuantInsti and is being posted with its permission. The views expressed in this material are solely those of the author and/or QuantInsti and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.