Tutorial: Multi-Voxel Point Observation Learning
Categories: Voxel Grids & Spatial Fields, Learning & Hybrid Modules, Dynamic Scene Graphs
Overview
In voxel-based mapping systems, observed 3D points often serve as soft constraints that pull voxel centers toward actual sensor measurements. When these observations are uncertain or biased, it becomes valuable to learn the observation parameters themselves by differentiating through the optimization process.
This tutorial demonstrates a compact example of trainer-style learning for voxel point observations. We combine:
- A 3‑voxel chain in 1D-like geometry
- Smoothness constraints (voxel‑to‑voxel)
- A prior anchor for the first voxel
- Three learnable voxel-point observations supplied through a parameter matrix theta
- A Gauss‑Newton inner solver operating on voxel variables
- An outer gradient descent loop updating theta via supervision
The experiment is based on exp13_trainer_voxel_point_multi.py.
Building the Voxel Graph
We construct a tiny world model containing three voxel variables:
v0 = Variable(NodeId(0), "voxel_cell", jnp.array([0.0, 0.0, 0.0]))
v1 = Variable(NodeId(1), "voxel_cell", jnp.array([1.2, 0.2, 0.0]))
v2 = Variable(NodeId(2), "voxel_cell", jnp.array([2.3, -0.1, 0.1]))
Next we register residuals used by the factors:
wm.register_residual("prior", prior_residual)
wm.register_residual("voxel_smoothness", voxel_smoothness_residual)
wm.register_residual("voxel_point_obs", voxel_point_observation_residual)
We add:
- A strong prior on v0
- Smoothness factors between (v0, v1) and (v1, v2)
- Three voxel-point observation factors, each of which receives its real point_world from theta
This produces a compact, differentiable voxel estimation problem.
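To make the three factor types concrete, here is a minimal sketch of what such residuals could look like in plain JAX. The function names, the weight arguments, and the expected-offset argument are assumptions made for illustration; the residuals registered with DSG‑JIT may use different signatures and weighting.

```python
import jax.numpy as jnp

# Hypothetical stand-ins for the registered residuals (not the DSG-JIT API).
def toy_prior_residual(v, target, weight=1.0):
    """Penalize deviation of a voxel center from a fixed target."""
    return weight * (v - target)

def toy_smoothness_residual(v_a, v_b, offset, weight=1.0):
    """Penalize deviation of the displacement v_b - v_a from an expected offset."""
    return weight * ((v_b - v_a) - offset)

def toy_point_obs_residual(v, point_world, weight=1.0):
    """Pull a voxel center toward an observed 3D point."""
    return weight * (v - point_world)

v0 = jnp.array([0.0, 0.0, 0.0])
v1 = jnp.array([1.2, 0.2, 0.0])

r_prior = toy_prior_residual(v0, jnp.zeros(3), weight=10.0)  # strong anchor
r_smooth = toy_smoothness_residual(v0, v1, jnp.array([1.0, 0.0, 0.0]))
r_obs = toy_point_obs_residual(v1, jnp.array([1.0, 0.0, 0.0]))
```

Each residual is linear in the voxel centers, which is why a Gauss‑Newton solver handles a problem of this shape so easily.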
Defining the Parameterized Observation Model
Rather than storing fixed point_world values in factors, we use:
theta ∈ ℝ^(K × 3)
where each row theta[k] is injected as the point_world of the corresponding voxel_point_obs factor. This is implemented through:
residual_fn_param, _ = wm.build_residual_function_voxel_point_param_multi()
This allows the residual function to depend on both the state x and the learnable parameters theta.
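The idea of a residual that depends on both x and theta can be sketched without the world model. The helper make_param_residual below is hypothetical, as are the weights and the smoothness offsets; it only illustrates how row k of theta can feed the observation term of voxel k, and is not what build_residual_function_voxel_point_param_multi does internally.

```python
import jax
import jax.numpy as jnp

def make_param_residual(prior_target, offsets, w_prior=10.0, w_smooth=1.0, w_obs=1.0):
    """Return residual(x, theta) for a chain of K voxels (hypothetical helper)."""
    def residual(x, theta):
        v = x.reshape(-1, 3)                                # (K, 3) voxel centers
        r_prior = w_prior * (v[0] - prior_target)           # anchor the first voxel
        r_smooth = w_smooth * ((v[1:] - v[:-1]) - offsets)  # chain smoothness
        r_obs = w_obs * (v - theta)                         # row k of theta -> voxel k
        return jnp.concatenate([r_prior, r_smooth.ravel(), r_obs.ravel()])
    return residual

offsets = jnp.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])  # assumed expected spacing
residual_fn = make_param_residual(jnp.zeros(3), offsets)

x = jnp.array([0.0, 0.0, 0.0, 1.2, 0.2, 0.0, 2.3, -0.1, 0.1])  # packed state
theta = jnp.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])

r = residual_fn(x, theta)
# Sensitivity of every residual entry to every entry of theta.
J_theta = jax.jacobian(residual_fn, argnums=1)(x, theta)
```

Because theta is an ordinary array argument rather than a value baked into the factors, JAX can differentiate the residual with respect to it directly.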
Inner Optimization: Solving for Voxels
For each proposed value of theta, we solve for voxel positions using Gauss‑Newton:
def solve_inner_voxel(wm, theta):
    residual_fn_param, _ = wm.build_residual_function_voxel_point_param_multi()
    x0, _ = wm.pack_state()

    def residual_x(x):
        return residual_fn_param(x, theta)

    cfg = GNConfig(max_iters=20, damping=1e-3, max_step_norm=1.0)
    return gauss_newton(residual_x, x0, cfg)
This inner loop is fully differentiable.
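One way to see why such a solver can be differentiated is to unroll a fixed number of damped Gauss‑Newton steps in JAX. The sketch below is a toy solver written for this illustration, not the gauss_newton shipped with DSG‑JIT; it omits the max_step_norm safeguard, and the small residual it solves is made up.

```python
import jax
import jax.numpy as jnp

def toy_gauss_newton(residual_x, x0, max_iters=20, damping=1e-3):
    """Damped Gauss-Newton with a fixed iteration count (illustrative only)."""
    jac_fn = jax.jacfwd(residual_x)
    x = x0
    for _ in range(max_iters):
        r = residual_x(x)
        J = jac_fn(x)
        H = J.T @ J + damping * jnp.eye(x.shape[0])  # damped normal equations
        x = x - jnp.linalg.solve(H, J.T @ r)
    return x

def residual(x, theta):
    # Pull x toward theta while weakly pulling it toward zero.
    return jnp.concatenate([x - theta, 0.1 * x])

def inner_solution(theta):
    return toy_gauss_newton(lambda x: residual(x, theta), jnp.zeros_like(theta))

theta = jnp.array([1.0, -2.0, 0.5])
x_opt = inner_solution(theta)
# Every step is ordinary array arithmetic, so JAX can differentiate the
# whole solve with respect to theta.
dx_dtheta = jax.jacobian(inner_solution)(theta)
```

For this toy residual the minimizer is theta / 1.01, so dx_dtheta should come out close to a scaled identity matrix.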
The Supervised Learning Objective
We specify target voxel centers:
gt_voxels = jnp.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0]
])
The supervised loss is:
L(θ) = ½ Σ_i || v_i(θ) – gt_i ||²
Implemented as:
def supervised_loss(theta):
    x_opt = solve_inner_voxel(wm, theta)
    ...
    v_stack = jnp.stack([v0, v1, v2])
    return 0.5 * jnp.sum((v_stack - gt_voxels)**2)
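It can help to write down what the gradient of this loss looks like. The derivation below is a sketch under stated assumptions: x* denotes the stacked inner solution (v0, v1, v2), x_gt the stacked targets, theta is treated as a flattened vector, and J_x and J_θ are the Jacobians of the stacked residual r(x, θ) with respect to x and θ. The closed form on the last line holds when the residuals are linear in x and θ, the inner solve has converged, and damping is ignored. Autodiff through the unrolled solver computes this quantity numerically rather than through the formula.

```latex
\begin{aligned}
L(\theta) &= \tfrac{1}{2} \sum_i \lVert v_i(\theta) - \mathrm{gt}_i \rVert^2
           = \tfrac{1}{2} \lVert x^\star(\theta) - x_{\mathrm{gt}} \rVert^2, \\
\nabla_\theta L &= \left( \frac{\partial x^\star}{\partial \theta} \right)^{\!\top}
                   \bigl( x^\star(\theta) - x_{\mathrm{gt}} \bigr), \\
J_x^\top\, r(x^\star, \theta) = 0
\;&\Longrightarrow\;
\frac{\partial x^\star}{\partial \theta} = -\bigl( J_x^\top J_x \bigr)^{-1} J_x^\top J_\theta .
\end{aligned}
```

The second line is the chain rule; the third follows from differentiating the stationarity condition of the inner least-squares problem with respect to θ.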
Outer Learning Loop
We apply gradient descent to adjust theta:
# grad_fn is assumed here to be jax.grad(supervised_loss)
theta = theta0
lr = 0.1
for it in range(20):
    g = grad_fn(theta)
    theta = theta - lr * g
Optionally, clipping the gradient before each update guards against numerical instability.
Over iterations, the observation points theta[k] become more consistent with the ground-truth voxel positions. This, in turn, drives the optimized voxel states closer to the true layout.
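The full loop, including optional clipping, can be sketched end to end in plain JAX. Everything here is an assumption made for illustration: the residual weights, the smoothness offsets, the biased starting theta, the unrolled toy solver, and the clip_by_norm helper. It does not reproduce exp13_trainer_voxel_point_multi.py.

```python
import jax
import jax.numpy as jnp

gt_voxels = jnp.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
offsets = jnp.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])  # assumed smoothness targets

def residual(x, theta):
    v = x.reshape(3, 3)
    return jnp.concatenate([
        10.0 * v[0],                           # strong prior on v0 at the origin
        ((v[1:] - v[:-1]) - offsets).ravel(),  # smoothness along the chain
        (v - theta).ravel(),                   # observations supplied by theta
    ])

def solve_inner(theta, iters=5, damping=1e-3):
    """Unrolled damped Gauss-Newton (toy stand-in for the inner solver)."""
    x = jnp.zeros(9)
    jac_fn = jax.jacfwd(residual)  # Jacobian with respect to x
    for _ in range(iters):
        r, J = residual(x, theta), jac_fn(x, theta)
        x = x - jnp.linalg.solve(J.T @ J + damping * jnp.eye(9), J.T @ r)
    return x.reshape(3, 3)

def supervised_loss(theta):
    return 0.5 * jnp.sum((solve_inner(theta) - gt_voxels) ** 2)

def clip_by_norm(g, max_norm=1.0):
    """Rescale g so that its Frobenius norm does not exceed max_norm."""
    norm = jnp.linalg.norm(g)
    return g * jnp.minimum(1.0, max_norm / (norm + 1e-12))

grad_fn = jax.jit(jax.grad(supervised_loss))

# Deliberately biased starting observations.
theta = jnp.array([[0.1, 0.2, 0.0], [1.3, -0.2, 0.1], [2.4, 0.3, -0.1]])
loss_start = supervised_loss(theta)

lr = 0.1
for it in range(20):
    theta = theta - lr * clip_by_norm(grad_fn(theta))

loss_end = supervised_loss(theta)
```

Comparing loss_start with loss_end shows whether the updates are moving theta in a useful direction. In this toy setup the loss is a convex quadratic in theta, so a small step size is enough for it to decrease.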
Results
After optimization:
- The learned observation points theta converge toward ground truth.
- The voxel estimates (v0, v1, v2) align closely with their true positions.
- The supervised loss decreases significantly.
This demonstrates one of DSG‑JIT’s core advantages:
You can differentiate through a full Gauss‑Newton optimization and learn parameters that influence the system.
Summary
In this tutorial, you learned how to:
- Build a voxel-based factor graph with smoothness and observation factors.
- Use parameterized voxel-point observations with differentiable residuals.
- Run a Gauss‑Newton inner solver to estimate voxel state.
- Define a supervised objective on voxel positions.
- Use outer-loop gradient descent to learn per-observation parameters.
This trainer-style workflow generalizes to larger voxel grids, learned observation models, or hybrid neural feature extractors. It is a powerful pattern enabled by DSG‑JIT’s differentiable factor graph engine.