The Homography Matrix

A homography is a linear transformation that relates two projections of the same planar surface. Mathematically, it is represented by a $3 \times 3$ matrix $H$.

$\mathbf{x}' \sim H\mathbf{x} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$

8 Degrees of Freedom

Although the matrix has 9 elements, it is defined up to a scale factor. We typically set $h_{33} = 1$, leaving 8 independent variables that control translation, rotation, scale, shear, and perspective.

Decomposition

A homography can be decomposed to reveal the physical camera motion and plane geometry:

$H = K(R - \frac{\mathbf{t}\mathbf{n}^T}{d})K^{-1}$

Where $K$ is intrinsics, $R$ is rotation, $\mathbf{t}$ is translation, $\mathbf{n}$ is the plane normal, and $d$ is distance.

Transformation DOF Invariants
Translation 2 Area, Orientation, Parallelism
Affine 6 Parallelism, Ratio of Areas
Homography 8 Collinearity, Cross-ratio
IMG
h11 (sx)
1.0
h12 (shx)
0.0
h13 (tx)
0.0
h21 (shy)
0.0
h22 (sy)
1.0
h23 (ty)
0.0
h31 (px)
0.000
h32 (py)
0.000
h33 (scale)
1.000 (Fixed)