The agent is the RL model that takes the input features (the state) and decides which action to take. For example, the RL agent takes the RSI and the past 10 minutes' returns as input and tells us whether to go long on Apple stock or, if we are already long, to square off the position.
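To make the "state in, action out" idea concrete, here is a minimal sketch in Python. It is not the trained model this series builds: the `State` fields, the action labels, and the RSI thresholds (30 and 70) are hypothetical placeholders chosen only to illustrate the interface. A real RL agent would learn this mapping from rewards rather than follow fixed rules.

```python
from dataclasses import dataclass

# Hypothetical action labels, chosen for illustration only.
BUY, SELL, HOLD = "buy", "sell", "hold"


@dataclass(frozen=True)
class State:
    rsi: float          # Relative Strength Index on a 0-100 scale
    past_return: float  # return over the lookback window, e.g. 0.01 for 1%
    in_position: bool   # True if we are already long


def toy_agent(state: State) -> str:
    """Stand-in for a learned policy: maps a state to an action.

    The thresholds below are arbitrary assumptions. The past_return
    feature is part of the state but is not used by this toy rule.
    """
    if not state.in_position and state.rsi < 30:
        return BUY   # flat and RSI is low: open a long position
    if state.in_position and state.rsi > 70:
        return SELL  # long and RSI is high: square off the position
    return HOLD      # otherwise do nothing


print(toy_agent(State(rsi=25.0, past_return=-0.01, in_position=False)))
```

The point of the sketch is the signature, not the rule: whatever sits inside the agent, it only ever sees the state and only ever returns an action.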
Let’s put everything together and see how it works.
Visit QuantInsti to see the animated graphic: https://blog.quantinsti.com/reinforcement-learning-trading/
State & Action: Suppose the closing price of Apple was $92 on July 24, 2020. Based on the state (RSI and 10-day returns), the agent gave a buy signal.
Environment: For simplicity, we say that the order was placed at the open of the next trading day, July 27, and was filled at $92. Thus, the environment tells us that we are long one share of Apple at $92.
Reward: No reward is given, as we are still in the trade.
State & Action: We get the next state of the system, created using the latest available price data. At the close on July 27, the price had reached $94. The agent analyses the state and gives the next action, say Sell, to the environment.
Environment: A sell order is placed, which squares off the long position.
Reward: A reward of about 2.17%, the return on the trade ((94 - 92) / 92), is given to the agent.
| Date | Closing price | Action | Reward (% returns) |
|---|---|---|---|
| July 24, 2020 | $92 | Buy | - |
| July 27, 2020 | $94 | Sell | 2.17 |
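The walkthrough above can be sketched as a tiny environment loop. This is an illustrative sketch under simplifying assumptions, not the code from the original post: the class name `ToyTradingEnv` and its `step` method are hypothetical, only a single long position is tracked, and orders are assumed to fill at exactly the price passed in. The reward is the percentage return of the trade, paid only when the position is closed, so the example trade gives (94 - 92) / 92, or roughly 2.17%.

```python
from typing import Optional


class ToyTradingEnv:
    """Tracks one long position and pays a reward only when it is closed."""

    def __init__(self) -> None:
        self.entry_price: Optional[float] = None  # None means we are flat

    def step(self, action: str, price: float) -> float:
        """Apply the action at the given fill price; return the reward in percent."""
        if action == "buy" and self.entry_price is None:
            self.entry_price = price
            return 0.0  # still in the trade, so no reward yet
        if action == "sell" and self.entry_price is not None:
            reward = (price - self.entry_price) / self.entry_price * 100
            self.entry_price = None  # position squared off
            return reward
        return 0.0  # nothing to do for any other action/position combination


env = ToyTradingEnv()
r1 = env.step("buy", 92.0)   # buy signal, filled at $92
r2 = env.step("sell", 94.0)  # sell signal with the price at $94
print(round(r1, 2), round(r2, 2))  # 0.0 2.17
```

Paying the reward only at exit mirrors the example, where nothing is awarded while the trade is still open. Other reward designs, such as a mark-to-market reward at every step, are possible but are outside what this example shows.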
Great! We have understood how the different components of the RL model come together. Let us now try to understand the intuition of how the RL agent takes the action.
Stay tuned for the next installment in which Ishan will demonstrate the Q Table and Q Learning.
Visit QuantInsti to download practical code: https://blog.quantinsti.com/reinforcement-learning-trading/.
Disclosure: Interactive Brokers
Information posted on IBKR Campus that is provided by third-parties does NOT constitute a recommendation that you should contract for the services of that third party. Third-party participants who contribute to IBKR Campus are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.
This material is from QuantInsti and is being posted with its permission. The views expressed in this material are solely those of the author and/or QuantInsti and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.