
RL Markets

Train a deep reinforcement learning agent to trade. The course starts from the Markov decision process and builds up to a complete DQN system that learns to buy, hold, and sell from raw price data.

Market Data: OHLCV prices
State Encoding: returns, position, vol
DQN Agent: neural Q-network
Action: Buy / Hold / Sell
Environment: portfolio update
Reward: P&L + penalties
Policy Update: Bellman + backprop
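
As a rough illustration of how the stages above fit together, here is a minimal sketch of one pass through the loop. It is not the course's implementation: it swaps the neural Q-network for a linear Q-function so that it runs with NumPy alone, uses synthetic prices, and assumes a simple long/flat position. All function names, the transaction cost, and the hyperparameters are hypothetical.

```python
import numpy as np

ACTIONS = ("buy", "hold", "sell")  # action indices 0, 1, 2


def encode_state(closes, position, window=5):
    """State encoding: recent log returns, their volatility, current position."""
    returns = np.diff(np.log(closes[-(window + 1):]))
    return np.concatenate([returns, [returns.std(), float(position)]])


def choose_action(weights, state, epsilon, rng):
    """Epsilon-greedy over a linear Q-function (stand-in for the Q-network)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(weights @ state))


def step_position(position, action):
    """Environment update (assumed): buy -> long (1), sell -> flat (0), hold -> unchanged."""
    return {0: 1, 1: position, 2: 0}[action]


def reward_fn(position, next_return, traded, cost=0.001):
    """Reward: P&L of the held position minus an assumed penalty per trade."""
    return position * next_return - (cost if traded else 0.0)


def td_update(weights, state, action, reward, next_state, gamma=0.99, lr=0.01):
    """Policy update: one gradient step on the squared TD error (Bellman target)."""
    target = reward + gamma * np.max(weights @ next_state)
    td_error = target - weights[action] @ state
    weights[action] += lr * td_error * state
    return td_error


# Synthetic closing prices stand in for real OHLCV data.
rng = np.random.default_rng(0)
closes = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 300)))

window = 5
weights = np.zeros((len(ACTIONS), window + 2))  # one weight row per action
position = 0

for t in range(window + 1, len(closes) - 1):
    state = encode_state(closes[: t + 1], position, window)
    action = choose_action(weights, state, epsilon=0.1, rng=rng)
    new_position = step_position(position, action)
    next_return = np.log(closes[t + 1] / closes[t])
    reward = reward_fn(new_position, next_return, traded=new_position != position)
    next_state = encode_state(closes[: t + 2], new_position, window)
    td_update(weights, state, action, reward, next_state)
    position = new_position
```

A full DQN replaces the linear weights with a neural network and adds experience replay and a target network, which Lesson 04 covers.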

Foundations

Lesson 01

Markov Decision Processes

The agent, environment, and reward loop. States, actions, transitions, and the Bellman equation.

RL · Math

Lesson 02

Reward Design

What you reward is what you get. Designing P&L, Sharpe, and drawdown-penalised reward functions.

Finance · RL

Lesson 03

Q-Learning

The Bellman equation as an algorithm. Q-tables, TD error, and ε-greedy exploration.

RL · Algorithms

The Model

Lesson 04

Deep Q-Networks

Replacing the Q-table with a neural network. Experience replay, target networks, and DQN loss.

Deep Learning · RL

Lesson 05

Exploration Strategies

Balancing exploration and exploitation. ε-greedy decay, UCB, and why exploration matters.

RL · Strategy

The System

Lesson 06

Trading Environment

Building a Gym-compatible trading environment. Observation space, action space, and episode structure.

Engineering · Python

Lesson 07

Full System

A complete DQN trading agent in Python. Training loop, evaluation, and live results.

Python · Full Build
View the complete DQN trading system → trading_env.py · dqn_agent.py · train.py · evaluate.py