
Tutorial: Hybrid Differentiable Scene Graphs

Categories: SE(3) & SLAM, Voxel Grids & Spatial Fields, Dynamic Scene Graphs, Learning & Hybrid Modules


Overview

This tutorial walks through the HERO hybrid experiment—a large‑scale differentiable factor‑graph example that combines:

  • 6 SE(3) poses
  • 6 voxel centers
  • Odometry factors (odom_se3)
  • Voxel point observations (voxel_point_obs)
  • Learnable odometry measurements
  • Learnable voxel observation points
  • A fully differentiable inner optimization loop
  • A fully differentiable outer learning loop

This experiment demonstrates how to jointly learn:

  1. The correct odometry increments between poses.
  2. The correct observed 3D points associated with voxels.
  3. The optimal state configuration of both poses and voxels.

It represents one of the most complete examples of hybrid SE(3) + voxel learning in this repository. Under the hood, this experiment uses a WorldModel‑backed factor graph, where residuals are registered with the WorldModel and all state packing/unpacking happens at the WorldModel layer.


Tutorial: Hybrid Joint Learning With SE(3) Poses + Voxels

1. What We Build

We construct a hybrid WorldModel‑backed factor graph with:

6 SE(3) Poses

Ground‑truth conceptual targets:

pose0: [0, 0, 0, 0, 0, 0]
pose1: [1, 0, 0, 0, 0, 0]
...
pose5: [5, 0, 0, 0, 0, 0]

Each pose has a small initial perturbation.

6 Voxel Centers

Ground‑truth conceptual targets:

voxel0: [0, 0, 0]
voxel1: [1, 0, 0]
...
voxel5: [5, 0, 0]

Each voxel is initialized with positional errors in x and y.
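The two blocks above can be sketched directly in JAX. This is a minimal illustrative sketch, not the repository's code: the array names (gt_poses, init_voxels, etc.), the random seed, and the perturbation magnitudes are assumptions.

```python
import jax
import jax.numpy as jnp

# Ground-truth conceptual targets, stacked as (n, d) arrays.
gt_poses = jnp.stack([jnp.array([float(i), 0, 0, 0, 0, 0]) for i in range(6)])  # (6, 6)
gt_voxels = jnp.stack([jnp.array([float(i), 0, 0]) for i in range(6)])          # (6, 3)

# Small initial perturbations, as described above.
key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
init_poses = gt_poses + 0.05 * jax.random.normal(k1, gt_poses.shape)
# Voxels get positional errors in x and y only; z stays exact.
init_voxels = gt_voxels.at[:, :2].add(0.1 * jax.random.normal(k2, (6, 2)))
```

Keeping targets and initial values as stacked (n, d) arrays lines up naturally with the theta shapes used later.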


2. Factor Types in the Graph

We incorporate the following factors:

Prior Factors

  • A strong prior on pose0 → anchors the absolute frame.
  • A weak prior on voxel0.

Odom Factors (pose_i → pose_{i+1})

These are learnable:

  • Each odometry measurement is a 6‑vector SE(3) increment.
  • Initial measurements are intentionally biased.

Voxel Observation Factors (pose_j, voxel_i)

These are also learnable:

  • Each factor uses one 3D point (world coordinate).
  • These 3D points become the learnable parameters theta["obs"].
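To make the three factor types concrete, here is a minimal sketch of their residuals. These are illustrative stand-ins, not the repo's registered residuals: in particular, odom_residual subtracts 6-vectors directly, whereas the real odom_se3 factor composes SE(3) elements.

```python
import jax.numpy as jnp

def prior_residual(x, target, weight=1.0):
    # Pulls a state toward a fixed target; 'weight' sets factor strength
    # (strong for pose0, weak for voxel0).
    return jnp.sqrt(weight) * (x - target)

def odom_residual(pose_i, pose_j, meas):
    # Simplified odometry residual: the learnable 6-vector 'meas' should
    # explain the increment between consecutive poses. The real factor
    # composes SE(3) elements; here we subtract in the tangent space.
    return (pose_j - pose_i) - meas

def voxel_point_obs_residual(voxel_center, obs_point):
    # The learnable 3D observation point should coincide with the voxel center.
    return voxel_center - obs_point
```

At the ground-truth configuration, all three residuals vanish, which is what the inner optimizer drives toward.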


3. Parameterization

We learn two parameter sets:

theta = {
    "odom": (n_odom, 6),  # learned SE(3) increments
    "obs":  (n_obs, 3),   # learned 3D observation points
}

Both are jointly optimized during the outer loop.
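Initializing theta might look like the following sketch. The counts and bias values are assumptions (5 increments between 6 poses, one observation per voxel); the repo's experiment may differ.

```python
import jax.numpy as jnp

n_odom, n_obs = 5, 6  # assumed: 5 increments between 6 poses, one obs per voxel

theta = {
    # Intentionally biased starting increments (the true increment is +1 in x).
    "odom": jnp.tile(jnp.array([0.8, 0.1, 0.0, 0.0, 0.0, 0.0]), (n_odom, 1)),
    # Observation points start at the origin and must be learned.
    "obs":  jnp.zeros((n_obs, 3)),
}
```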


4. Inner Optimization Loop (Gradient Descent)

We optimize the state vector (all poses + all voxels) using:

  • A differentiable objective based on 0.5 * || residuals ||²
  • Gradient descent (small learning rate, 80 iterations)

Because the inner loop is differentiable, we can backpropagate through it to update theta.
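The key property is that a plain Python loop of gradient-descent steps is itself differentiable under JAX. The sketch below uses a toy quadratic objective in place of the real stacked residuals; this gradient_descent is a hypothetical stand-in, not the repo's solver.

```python
import jax
import jax.numpy as jnp

def objective(x, theta):
    # Stand-in for 0.5 * ||residuals||^2; the toy residual couples x and theta.
    return 0.5 * jnp.sum((x - theta) ** 2)

def gradient_descent(x0, theta, lr=0.1, iters=80):
    # Every step is a pure JAX op, so the whole solve is differentiable in theta.
    grad_x = jax.grad(objective)
    x = x0
    for _ in range(iters):
        x = x - lr * grad_x(x, theta)
    return x

# Backpropagating through all 80 steps yields d(solution)/d(theta):
jac = jax.jacobian(gradient_descent, argnums=1)(jnp.zeros(3), jnp.ones(3))
```

For this toy objective the solution converges to theta, so the Jacobian is close to the identity, confirming that gradients flow through the entire inner loop.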


5. Outer Optimization Loop (Learn Parameters)

We optimize theta to minimize:

Pose supervision

Encourage:

pose5.tx → 5.0

Voxel supervision

Encourage:

voxel_i.x → i
voxel_i.y → 0

We compute gradients using JAX:

grad_theta = jax.grad(supervised_loss)(theta)
theta ← theta - lr * grad_theta

This is effectively a hybrid differentiable SLAM + mapping system.
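Because theta is a pytree (a dict of arrays), jax.grad returns gradients with the same structure, and the update is a single tree_map. A minimal sketch with a toy loss standing in for the real supervised_loss:

```python
import jax
import jax.numpy as jnp

def toy_supervised_loss(theta):
    # Toy stand-in: any scalar function of the theta pytree differentiates.
    return jnp.sum(theta["odom"] ** 2) + jnp.sum(theta["obs"] ** 2)

theta = {"odom": jnp.ones((5, 6)), "obs": jnp.ones((6, 3))}
grad_theta = jax.grad(toy_supervised_loss)(theta)  # same pytree structure as theta
lr = 0.1
theta = jax.tree_util.tree_map(lambda p, g: p - lr * g, theta, grad_theta)
```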


Full Code (With Explanations)

Build the Hybrid Factor Graph

def build_hybrid_graph():
    ...

This constructs a WorldModel, adds all SE(3) poses, voxels, and factors described above, and returns the WorldModel (and associated pose/voxel ids) used by the rest of the experiment.


Build the Parametric Residual Function

from dsg_jit.world.model import WorldModel

def build_param_residual(wm: WorldModel):
    ...

This builds:

  • A residual(x, theta) function built on top of the WorldModel residual registry
  • That injects the learned odom and voxel obs parameters into each corresponding factor
  • While using wm.pack_state() / wm.unpack_state() to manage the stacked state

This keeps the graph structure, residual definitions, and packed state layout centralized in the WorldModel.
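As a simplified illustration of the injection pattern (not the WorldModel API itself), the following hypothetical sketch packs poses and voxels into one flat state vector and splices the learned parameters into the residual:

```python
import jax.numpy as jnp

def build_param_residual_sketch(n_poses=6, n_voxels=6):
    # Hypothetical sketch: the real version uses the WorldModel's residual
    # registry and wm.pack_state() / wm.unpack_state(). Here the state x is
    # a flat vector of [poses (6 each), then voxels (3 each)].
    def residual(x, theta):
        poses = x[: n_poses * 6].reshape(n_poses, 6)
        voxels = x[n_poses * 6 :].reshape(n_voxels, 3)
        # Inject the learned odometry measurements into the odom factors.
        r_odom = (poses[1:] - poses[:-1]) - theta["odom"]
        # Inject the learned 3D observation points into the voxel obs factors.
        r_obs = voxels - theta["obs"]
        return jnp.concatenate([r_odom.ravel(), r_obs.ravel()])
    return residual
```

The important property is that theta enters the residual as ordinary array inputs, so gradients with respect to theta flow through the whole inner solve.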


Inner Solve (Differentiate Through GD)

def inner_solve(theta):
    x_opt = gradient_descent(objective, x0, cfg)
    return x_opt

The inner optimization updates all pose/voxel states.


Outer Supervised Loss

def supervised_loss(theta):
    ...

This controls learning behavior:

  • Move the last pose toward 5.0 along x
  • Move voxels to their correct x positions
  • Penalize y drift
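A hypothetical sketch of such a loss, assuming inner_solve(theta) returns the flat optimized state with poses packed before voxels (the real supervised_loss lives in the experiment script):

```python
import jax.numpy as jnp

def supervised_loss_sketch(theta, inner_solve, n_poses=6, n_voxels=6):
    # inner_solve(theta) returns the optimized flat state.
    x_opt = inner_solve(theta)
    poses = x_opt[: n_poses * 6].reshape(n_poses, 6)
    voxels = x_opt[n_poses * 6 :].reshape(n_voxels, 3)
    targets_x = jnp.arange(n_voxels, dtype=jnp.float32)
    loss_pose = (poses[-1, 0] - 5.0) ** 2                   # pose5.tx -> 5.0
    loss_vox_x = jnp.sum((voxels[:, 0] - targets_x) ** 2)   # voxel_i.x -> i
    loss_vox_y = jnp.sum(voxels[:, 1] ** 2)                 # penalize y drift
    return loss_pose + loss_vox_x + loss_vox_y
```

At the ground-truth configuration this loss is exactly zero, so its gradient pulls theta toward parameters whose inner solve lands on the targets.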


Training Loop

for it in range(steps):
    g = grad_fn(theta)  # gradients flow back through the inner solve
    theta = {
        "odom": theta["odom"] - lr * g["odom"],
        "obs":  theta["obs"] - lr * g["obs"],
    }

Summary

In this tutorial you learned how to:

  • Construct a hybrid SE(3) + voxel WorldModel‑backed factor graph
  • Parameterize both odometry and 3D point observations
  • Build a differentiable residual function with parameter injection
  • Implement a differentiable inner solver
  • Implement an outer learning loop to optimize parameters
  • Achieve a complete end‑to‑end differentiable SLAM + mapping system

This HERO experiment is one of the most advanced examples in the project and, being implemented on top of the WorldModel residual architecture, serves as a blueprint for:

  • Joint pose + map learning
  • Robust SLAM systems
  • Hybrid neural‑symbolic optimization
  • Differentiable scene‑graph reasoning