Train a deep reinforcement learning agent to trade, starting from the Markov decision process and building up to a complete DQN system that learns to buy, hold, and sell from raw price data.
Foundations
Lesson 01
The agent, environment, and reward loop. States, actions, transitions, and the Bellman equation.
RL · Math
Lesson 02
What you reward is what you get. Designing P&L, Sharpe, and drawdown-penalised reward functions.
Finance · RL
Lesson 03
The Bellman equation as an algorithm. Q-tables, TD error, and ε-greedy exploration.
RL · Algorithms
The Model
Lesson 04
Replacing the Q-table with a neural network. Experience replay, target networks, and DQN loss.
Deep Learning · RL
Lesson 05
Balancing exploration and exploitation. ε-greedy decay, UCB, and why exploration matters.
RL · Strategy
The System
Lesson 06
Building a Gym-compatible trading environment. Observation space, action space, and episode structure.
Engineering · Python
Lesson 07
Complete DQN trading agent in Python. Training loop, evaluation, and the live results.
Python · Full Build
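
As a taste of where the lessons lead, here is a minimal sketch of the tabular Q-learning step that Lesson 03 describes: an ε-greedy action choice followed by a TD-error update. It is not the course's own code. The function names, the two-state toy table, and the values of alpha and gamma are all assumptions made for illustration; only the buy/hold/sell action set comes from the outline above.

```python
import random

ACTIONS = ("buy", "hold", "sell")  # action set named in the course outline


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def td_update(q, state, action, reward, next_state,
              alpha=0.1, gamma=0.99, done=False):
    """Apply one Q-learning update in place and return the TD error.

    alpha (learning rate) and gamma (discount) are illustrative defaults.
    """
    # Bellman target: immediate reward plus discounted best next-state value.
    target = reward if done else reward + gamma * max(q[next_state])
    td_error = target - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


# Hypothetical toy problem: two states, all Q-values initialised to zero.
q_table = [[0.0] * len(ACTIONS) for _ in range(2)]
rng = random.Random(0)

action = epsilon_greedy(q_table[0], epsilon=0.0, rng=rng)  # purely greedy
error = td_update(q_table, state=0, action=action, reward=1.0, next_state=1)
```

A DQN keeps the same target and TD error but replaces the table lookup with a neural network, which is the step Lesson 04 covers.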