In the previous blogs, we examined supervised learning algorithms like linear regression in detail. In this blog, we look at what unsupervised learning is and how it differs from supervised learning.
Then, we move on to discuss some use cases of unsupervised learning in investment and trading. We explore two unsupervised techniques in particular: k-means clustering and principal component analysis (PCA), with examples in Python.
Contents
- What is unsupervised learning?
- Supervised vs unsupervised learning
- When do we use unsupervised algorithms?
- Clustering algorithms
  - Concept of clustering
  - K-means clustering algorithm
- Dimensionality Reduction
  - Concept of dimensionality reduction
  - Principal Component Analysis
- Other types of unsupervised algorithms
- Challenges in unsupervised learning
What is unsupervised learning?
As the name suggests, ‘unsupervised’ learning takes place when there is no supervisor or teacher and the learner learns on her own.
For instance, consider a child who sees and tastes an apple for the very first time. She registers the colour, the texture, the taste and the smell of the fruit. The next time she sees an apple, she knows that both this and the previous apple are similar objects as they have very similar characteristics.
She also knows that it is very different from an orange. Still, she does not know that it is called an ‘apple’ in human-speak, because no one has given her the label.
Such learning where the labels do not exist (in the absence of a teacher) but the learner can still learn about patterns on her own is referred to as unsupervised learning.
In the context of machine learning algorithms, unsupervised learning occurs when an algorithm learns from plain examples without any associated response and determines the data patterns on its own.
In the next section, we will discuss how this type of learning differs from the other popular type of machine learning, i.e. supervised learning.
Supervised vs unsupervised learning
As the name suggests, supervised learning occurs under supervision: when the algorithm predicts a value for a sample from the training data, it is told whether the prediction was correct or not.
This is possible because the correct values are stored as ‘labels’ (also called the ‘target variable’), which are passed to the algorithm along with the input data. Common supervised learning tasks are classification and regression.
In classification tasks, the label is the correct class to which the sample belongs, whereas in regression, the actual value of the dependent variable (Y) serves as the benchmark against which the prediction is compared. The algorithm can then tweak its parameters to achieve higher accuracy in prediction.
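To make the idea of "tweaking parameters" concrete, here is a minimal sketch of a supervised regression loop in plain Python. The toy data, the function name `fit_line`, the learning rate and the number of epochs are all assumptions made for illustration; gradient descent on the mean squared error is just one common way an algorithm can adjust its parameters using the labels.

```python
# Hypothetical labelled data generated from y = 2x + 1.
# xs are the inputs, ys are the labels (the "correct answers").
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Supervision step: compare each prediction with its label.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Tweak the parameters in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit_line(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

On this toy data the fitted slope and intercept should end up very close to the values used to generate the labels. Without the labels `ys`, there would be no error to compute and therefore nothing to guide the parameter updates.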
Thus, the main goal of supervised learning is to build a robust predictive model.
On the other hand, in unsupervised learning, we only pass the input data, and there are no labels. Unsupervised models seek to find the underlying or hidden structure or distribution in the data in order to learn more about the data.
In other words, unsupervised learning is where we only have input data and no corresponding output variables, and the main goal is to learn more about, or discover new insights from, the input data itself.
Common examples of unsupervised algorithms are clustering algorithms, which group the data based on the patterns that the machine detects.
For example, let us consider a situation in which we have a few data points based on two input features X1 and X2.
- If we want our algorithm to classify/categorize the data into two known classes, we will use a supervised classification algorithm.
- On the other hand, if we want the algorithm to tell us how the data is structured, we would use an unsupervised clustering algorithm.
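The two cases above can be sketched side by side in plain Python. Everything here is an assumption made for illustration: the toy data points, the helper names, the choice of a nearest-centroid classifier for the supervised case, and a bare-bones k-means with a naive initialisation for the unsupervised case. It is a sketch of the contrast, not a production implementation.

```python
import math

# Hypothetical data: each point is (X1, X2); two well-separated groups.
points = [(1.0, 1.2), (0.8, 0.9), (1.2, 1.0),
          (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
labels = [0, 0, 0, 1, 1, 1]  # only the supervised approach sees these


def mean_point(pts):
    """Return the centroid of a list of 2D points."""
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)


def nearest(point, centres):
    """Return the index of the centre closest to the point."""
    return min(range(len(centres)), key=lambda i: math.dist(point, centres[i]))


def fit_classifier(points, labels):
    """Supervised: compute one centroid per known class, using the labels."""
    classes = sorted(set(labels))
    centres = [mean_point([p for p, l in zip(points, labels) if l == c])
               for c in classes]
    return classes, centres


def predict(point, classes, centres):
    """Assign a new point to the class with the nearest centroid."""
    return classes[nearest(point, centres)]


def k_means(points, k=2, iterations=10):
    """Unsupervised: group the points into k clusters without any labels."""
    centres = list(points[:k])  # naive initialisation: the first k points
    for _ in range(iterations):
        assignment = [nearest(p, centres) for p in points]
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # keep the old centre if a cluster is empty
                centres[i] = mean_point(members)
    return [nearest(p, centres) for p in points]


classes, centres = fit_classifier(points, labels)
predicted_class = predict((4.9, 5.2), classes, centres)
clusters = k_means(points, k=2)
print("Predicted class for new point:", predicted_class)
print("Cluster assignment per point:", clusters)
```

Note that the cluster numbers returned by `k_means` are arbitrary identifiers: the algorithm can recover the two groups, but it has no way of knowing which group we would call class 0 and which class 1. That naming is exactly the information the labels provide in the supervised case.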
Stay tuned for the next installment to learn about unsupervised algorithms.
Originally posted on QuantInsti Blog.