Once our TensorFlow.js model is trained, we have a system that can take a raw video frame and output 8 precise floating-point numbers: the (x, y) coordinates of the four corners. To make this data useful, we must bridge into OpenCV.js for the final geometric reconstruction.
We take the four corner (x, y) pairs predicted by our network and pass them to cv.getPerspectiveTransform alongside our "ideal" square coordinates. This yields the 3x3 homography matrix $H$.
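Under the hood, cv.getPerspectiveTransform solves a linear system for the eight unknown entries of $H$ (the ninth, $H_{33}$, is fixed to 1). A minimal plain-JavaScript sketch of that computation, with all function names hypothetical rather than part of OpenCV.js:

```javascript
// Sketch: compute the 3x3 homography H mapping 4 source points to 4
// destination points, as cv.getPerspectiveTransform does internally.
// Each correspondence (x, y) -> (u, v) contributes two linear equations.
function getPerspectiveTransform(src, dst) {
  const A = [], b = [];
  for (let i = 0; i < 4; i++) {
    const [x, y] = src[i], [u, v] = dst[i];
    A.push([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.push(u);
    A.push([0, 0, 0, x, y, 1, -v * x, -v * y]); b.push(v);
  }
  const h = solve(A, b);
  return [[h[0], h[1], h[2]], [h[3], h[4], h[5]], [h[6], h[7], 1]];
}

// Gauss-Jordan elimination with partial pivoting on the augmented matrix.
function solve(A, b) {
  const n = b.length, M = A.map((row, i) => [...row, b[i]]);
  for (let col = 0; col < n; col++) {
    let piv = col;
    for (let r = col + 1; r < n; r++)
      if (Math.abs(M[r][col]) > Math.abs(M[piv][col])) piv = r;
    [M[col], M[piv]] = [M[piv], M[col]];
    for (let r = 0; r < n; r++) {
      if (r === col) continue;
      const f = M[r][col] / M[col][col];
      for (let c = col; c <= n; c++) M[r][c] -= f * M[col][c];
    }
  }
  return M.map((row, i) => row[n] / M[i][i]);
}
```

In the real pipeline, src would be the network's predicted corners and dst the ideal square; OpenCV.js does the same solve inside cv.getPerspectiveTransform.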
Traditional image warping (cv.warpPerspective) resamples source pixels into a destination grid. Its default bilinear interpolation averages colors across adjacent pixels, which destroys the colorimetric precision we need.
Instead, we use inverse mapping. We invert $H$ to get $H^{-1}$. For every cell in our 21x21 ideal grid, we mathematically map its center back to its exact sub-pixel location in the *original* raw frame.
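The inverse-mapping step above can be sketched in plain JavaScript. This is an illustrative implementation, not the article's actual code; the helper names and the cellSize parameter are assumptions, and a 3x3 inverse is computed directly via the adjugate:

```javascript
// Invert a 3x3 matrix via its adjugate (transposed cofactors) / determinant.
function invert3x3(m) {
  const [[a, b, c], [d, e, f], [g, h, i]] = m;
  const det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
  const adj = [
    [e * i - f * h, c * h - b * i, b * f - c * e],
    [f * g - d * i, a * i - c * g, c * d - a * f],
    [d * h - e * g, b * g - a * h, a * e - b * d],
  ];
  return adj.map(row => row.map(v => v / det));
}

// Project a point through a homography (homogeneous divide by w).
function project(M, x, y) {
  const w = M[2][0] * x + M[2][1] * y + M[2][2];
  return [
    (M[0][0] * x + M[0][1] * y + M[0][2]) / w,
    (M[1][0] * x + M[1][1] * y + M[1][2]) / w,
  ];
}

// Map the center of every cell in the 21x21 ideal grid back through
// H^-1 to its sub-pixel location in the raw frame.
function gridCenters(H, cellSize = 10) {
  const Hinv = invert3x3(H);
  const centers = [];
  for (let row = 0; row < 21; row++) {
    for (let col = 0; col < 21; col++) {
      const cx = (col + 0.5) * cellSize; // cell center in ideal space
      const cy = (row + 0.5) * cellSize;
      centers.push(project(Hinv, cx, cy));
    }
  }
  return centers;
}
```

Because each cell center lands on one exact sub-pixel coordinate in the raw frame, the color can be sampled there directly, with no destination-grid interpolation involved.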
Direct Sub-pixel Coordinate Mapping