Lesson 03

Gaussian Distributions

Each HMM state emits observations drawn from a Gaussian distribution with its own mean and variance. Understanding these parameters is understanding what the model has learned about each regime.

The probability density function

The Gaussian (normal) distribution gives the probability density of a value x given a mean μ and standard deviation σ. Drag the sliders to feel how each parameter shapes the curve.

f(x; μ, σ) = 1 / (σ√(2π)) · exp( −(x − μ)² / (2σ²) )
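To make the formula concrete, here is a minimal Python sketch that uses only the standard library. The function name gaussian_pdf is illustrative and not part of any library. It also shows that the peak height, reached at x = μ, equals 1 / (σ√(2π)), so doubling σ halves the peak.

```python
import math

def gaussian_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in units of sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# The peak sits at x = mu; a wider bell is a lower bell.
peak_narrow = gaussian_pdf(0.0, mu=0.0, sigma=1.0)
peak_wide = gaussian_pdf(0.0, mu=0.0, sigma=2.0)
print(f"peak at sigma=1: {peak_narrow:.4f}")
print(f"peak at sigma=2: {peak_wide:.4f}")
```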

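[Interactive figure: Gaussian PDF, plotted for x from −8 to +8, with sliders for the mean μ (centre of the bell, shown at 0.00%) and the standard deviation σ (width of the bell, shown at 1.00). The readout reports the peak height and the probability mass near the mean.]

P(x within 1σ): 68.3%

P(x within 2σ): 95.4%

P(x within 3σ): 99.7%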
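The three coverage figures hold for every Gaussian, whatever its μ and σ. A short sketch, assuming only Python's standard math.erf, reproduces them from the identity P(|x − μ| < kσ) = erf(k / √2). The helper name prob_within is illustrative.

```python
import math

def prob_within(k: float) -> float:
    """P(|x - mu| < k * sigma) for any Gaussian; independent of mu and sigma."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * prob_within(k):.1f}%")
```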

μ (mean) is the expected value — the centre of the distribution. In regime terms: Bull Run has μ > 0 (positive average return), Bear/Crash has μ < 0, Chop has μ ≈ 0.

σ (standard deviation) measures spread — how far from the mean values typically fall. High σ means high volatility: large positive AND large negative returns are both likely. Bear/Crash regimes have the highest σ.

Regime distributions side by side

The three market regimes each have their own characteristic Gaussian. The HMM learns these parameters automatically during training — it finds the (μ, σ) pair that best explains the data for each state.
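One way to picture what "best explains the data" means: in expectation-maximisation training (Baum-Welch), each state's mean and standard deviation are re-estimated as weighted averages, where the weight of a bar is the current posterior probability that the bar belongs to that state. The sketch below shows that single update step for one state. It is not hmmlearn's implementation, and the returns and weights are made-up values for illustration.

```python
import math

def weighted_gaussian_fit(xs, weights):
    """Weighted maximum-likelihood estimate of (mu, sigma) for one state."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mu = sum(w * x for x, w in zip(xs, weights)) / total
    var = sum(w * (x - mu) ** 2 for x, w in zip(xs, weights)) / total
    return mu, math.sqrt(var)

# Hypothetical hourly returns (in %) and hypothetical state-membership weights.
returns = [0.4, -0.2, 1.1, -2.5, 0.3, 0.0]
weights = [0.9, 0.8, 0.7, 0.1, 0.9, 0.8]
mu, sigma = weighted_gaussian_fit(returns, weights)
print(f"mu = {mu:.3f}%, sigma = {sigma:.3f}%")
```

Bars with a low weight (here the −2.5% return) barely move this state's estimate; they mostly shape the parameters of whichever state claims them.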

[Figure: Bull Run vs Bear/Crash vs Chop, overlaid distributions]

Bull Run: μ = +0.05%, σ = 1.0%
Bear/Crash: μ = −0.12%, σ = 2.2%
Chop: μ = +0.00%, σ = 1.5%

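Separation enables classification. When you observe a return of +0.5%, the Bull Run distribution assigns it a higher probability density than Bear/Crash does (roughly 0.36 versus 0.17 with the parameters above). A +3% return is the reverse case: it lies about three standard deviations out for Bull Run but well within the wide Bear/Crash bell, so Bear/Crash assigns it the higher density. The HMM combines these relative likelihoods with its transition probabilities to assign the bar to a state. The greater the separation between distributions, the more confident the model's assignments.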
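The comparison can be checked numerically. The sketch below evaluates each regime's density at an observed return, using the (μ, σ) values from the figure legend, and normalises the results. It assumes equal prior weight for every regime and ignores transition probabilities, so it illustrates only the emission side of the model, not a full HMM decoding. The names REGIMES and regime_weights are illustrative.

```python
import math

# Regime parameters (mu, sigma) in percent, taken from the figure legend.
REGIMES = {
    "Bull Run": (0.05, 1.0),
    "Bear/Crash": (-0.12, 2.2),
    "Chop": (0.00, 1.5),
}

def gaussian_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def regime_weights(x):
    """Normalised emission likelihoods, assuming equal prior weight per regime."""
    dens = {name: gaussian_pdf(x, mu, s) for name, (mu, s) in REGIMES.items()}
    total = sum(dens.values())
    return {name: d / total for name, d in dens.items()}

for r in (0.5, 3.0):
    weights = regime_weights(r)
    best = max(weights, key=weights.get)
    print(f"return {r:+.1f}% -> most likely regime: {best}")
```

Small positive returns favour Bull Run because its narrow bell is tall near the centre; large moves in either direction favour Bear/Crash because its wide bell carries more mass in the tails.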

From univariate to multivariate

Our HMM uses three features simultaneously — returns, range, and volume change. This requires a multivariate Gaussian with a mean vector and a covariance matrix.

f(x; μ, Σ) = (2π)^(−d/2) |Σ|^(−½) · exp( −½ (x−μ)ᵀ Σ⁻¹ (x−μ) )

Covariance matrix Σ. d = 3 features → 3×3 symmetric matrix. The diagonal entries are the variances of each feature. The off-diagonal entries are covariances, which capture how pairs of features move together: e.g., Bull Run periods might show that high positive returns tend to co-occur with rising volume (positive correlation). The HMM learns this structure during training.
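For readers who want to see the multivariate formula evaluated, here is a standard-library sketch of its logarithm. It factors Σ as L·Lᵀ (Cholesky), which gives both the determinant and the quadratic form without explicitly inverting the matrix. The function names, the mean vector, and the covariance values are hypothetical and chosen only for illustration; in practice a numerical library would normally do this work.

```python
import math

def cholesky(a):
    """Lower-triangular L with L * L^T == a, for a symmetric positive-definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def mvn_logpdf(x, mu, cov):
    """Log-density of a multivariate Gaussian with mean mu and covariance cov."""
    d = len(x)
    L = cholesky(cov)
    diff = [xi - mi for xi, mi in zip(x, mu)]
    # Solve L y = (x - mu) by forward substitution; y.y is then the
    # quadratic form (x - mu)^T cov^-1 (x - mu).
    y = []
    for i in range(d):
        y.append((diff[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    maha = sum(v * v for v in y)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(d))  # log |cov|
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + maha)

# Hypothetical state parameters for [return %, range %, volume change %].
mu = [0.05, 1.2, 2.0]
cov = [
    [1.0, 0.3, 0.8],   # return variance, then covariances with range and volume
    [0.3, 0.5, 0.4],
    [0.8, 0.4, 9.0],
]
print(mvn_logpdf([0.5, 1.0, 3.0], mu, cov))
```

Working in log space avoids underflow when densities become very small, which happens quickly as the number of features grows.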

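Why "full" covariance matters. The hmmlearn library offers several covariance types, including diagonal ("diag", which treats features as uncorrelated within a state) and full ("full"). Full is more expressive but requires more data to estimate reliably: with 3 features, each state needs 6 distinct covariance entries instead of 3 variances. With 730 days of hourly data (~17,520 bars), full covariance across 3 features is tractable and captures important cross-feature dependencies.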
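A rough parameter count supports the tractability claim. The sketch below counts free emission parameters (means plus covariance entries) for the two covariance structures, assuming 3 states to match the three regimes. It leaves out transition and start probabilities, and the helper name emission_param_count is illustrative.

```python
def emission_param_count(n_states: int, n_features: int, covariance: str) -> int:
    """Free emission parameters (means plus covariance entries) across all states."""
    means = n_features
    if covariance == "diag":
        cov = n_features  # one variance per feature
    elif covariance == "full":
        cov = n_features * (n_features + 1) // 2  # symmetric matrix: distinct entries
    else:
        raise ValueError("covariance must be 'diag' or 'full'")
    return n_states * (means + cov)

n_bars = 730 * 24  # hourly bars in 730 days
for kind in ("diag", "full"):
    k = emission_param_count(n_states=3, n_features=3, covariance=kind)
    print(f"{kind}: {k} emission parameters, about {n_bars // k} bars per parameter")
```

Under these assumptions, moving from diagonal to full covariance adds only nine parameters in total, which is small relative to the number of bars available.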
