Scene Graph Modules
This section documents entity definitions and relational structures used to build differentiable dynamic scene graphs.
scene_graph.entities
Entity definitions for the Dynamic Scene Graph.
This module defines the core node types stored in the DSG-JIT scene graph, such as:
• Poses / agents (SE3 trajectories)
• Places (topological nodes)
• Rooms / regions
• Objects
• Voxel cells or volumetric primitives (if represented as entities)
Each entity is typically a lightweight data structure that carries:
• A unique identifier
• A semantic type (e.g., "agent", "object", "place", "room")
• Optional metric information (pose, position, bounding volume)
• Optional attributes (class labels, instance ids, timestamps, etc.)
These entities form the nodes of the scene graph; edges and semantic
relations between them are defined in scene_graph.relations.
Typical Contents
Common patterns in this module include:
• Data classes for:
- BaseEntity: minimal base type for all nodes
- AgentEntity: for robots / people with SE3 pose trajectories
- ObjectEntity: for movable / manipulable objects
- PlaceEntity: for nodes in the topological graph
- RoomEntity: for higher-level spatial regions
• Utility functions for:
- Creating entities from raw SLAM / perception outputs
- Updating metric fields (e.g., pose) after optimization
- Serializing / deserializing entity metadata
Integration with DSG-JIT
The entity layer is tightly coupled to:
• `world.scene_graph.SceneGraphWorld`:
which stores entities, relations, and their mapping to variables
in the factor graph.
• `core.factor_graph.FactorGraph`:
entities that carry metric state (e.g., poses, voxel centers)
are typically tied to variables in the factor graph so they can
be optimized along with the rest of the world model.
• `slam.measurements`:
certain measurements (e.g., pose attachments, voxel observations)
operate directly on entity-linked variables.
Design Goals
• Thin, explicit data model: Entities should be simple, well-typed containers that are easy to reason about and manipulate in experiments.
• Bridging symbolic and geometric worlds: Entities provide the anchor points where semantic structure (object, room, agent, place) meets metric state (pose, voxel position, etc.).
• Extendable for research: New entity types (e.g., semantic regions, affordance nodes, learned latent entities) can be added without changing the core optimization backend, as long as they are mapped to appropriate variable types.
ObjectNode(id, position)
dataclass
Scene graph object node with a 3D position.
:param id: Integer identifier for this object node. :param position: 3D position of the object as a JAX array of shape (3,).
RoomNode(id, position)
dataclass
Scene graph room centroid represented as a 3D point.
:param id: Integer identifier for this room node. :param position: 3D position of the room centroid as a JAX array of shape (3,).
scene_graph.relations
Semantic and topological relations for the Dynamic Scene Graph.
This module defines the core relation types and utilities used to connect entities in the DSG-JIT scene graph, such as:
• Place ↔ Place (topological connectivity)
• Room ↔ Room (adjacency, containment)
• Object ↔ Place / Room (support, inside, on, near)
• Agent ↔ Object / Place (interaction, visibility, reachability)
The goal is to provide a lightweight, JAX-friendly representation of edges and relation labels that can be used both for:
• Pure graph reasoning (e.g., "what objects are on this table?")
• Differentiable optimization (e.g., factors that enforce relational
consistency between metric poses and symbolic structure)
Typical Contents
Although the exact API may evolve, this module usually contains:
• Enumerations or string constants for relation types
(e.g., "on", "inside", "adjacent", "connected_to", "observes")
• Simple data classes / containers for relations:
- relation id
- source entity id
- target entity id
- relation type
- optional attributes (weights, confidences, timestamps)
• Helper functions for:
- Adding / removing relations in a `SceneGraphWorld`
- Querying neighbors by relation type
- Converting relations into factor-graph constraints when needed
Design Goals
• Separation of concerns: Geometry (poses, voxels) is stored elsewhere; this module only cares about relationships between entities.
• Compatibility with optimization: When relations induce constraints (e.g., "object is on a surface"), these can be translated into factors in the world model or SLAM layer.
• Extensibility: New relation types or attributes should be easy to add without breaking the core graph structure.
Notes
The scene graph can be used in both non-differentiable and differentiable
modes. In the differentiable setting, certain relations may correspond
to factors whose residuals live in slam.measurements. This module
provides the symbolic layer that those factors are grounded in.
pose_place_attachment_residual(x, params)
Residual tying a place's position to a pose's translation component.
The input state vector x is assumed to be the concatenation of a pose
vector and a place vector::
x = [pose_vec, place_vec]
By default, the pose is represented in se(3) as a 6D vector
[tx, ty, tz, rx, ry, rz] and the place is a 1D scalar.
The constraint enforces that the place coordinate tracks one component of the pose translation plus an optional offset::
pose_val = pose_vec[pose_coord_index]
target_place = pose_val + offset
r = place_vec - target_place
:param x: Flat state vector containing the pose and place variables,
[pose_vec, place_vec]. The first pose_dim entries are the
pose, followed by place_dim entries for the place.
:param params: Dictionary of parameters controlling the attachment:
"pose_dim" (int, default 6), the dimension of the
pose vector; "place_dim" (int, default 1), the
dimension of the place vector; "pose_coord_index" (int,
default 0), index of the pose component used to attach
the place; "offset" (array-like of shape (place_dim,),
optional), additive offset applied to the selected pose
component before comparison.
:return: Residual vector r with shape (place_dim,) enforcing the
attachment between pose and place.
:rtype: jax.numpy.ndarray
Source code in dsg-jit/dsg_jit/scene_graph/relations.py
115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 | |
room_centroid_residual(x, params)
Compute a residual enforcing that a room's position matches the centroid of its member places.
The input state vector x is assumed to contain a concatenation of the room
position followed by the positions of its member places::
x = [room_pos, place0_pos, place1_pos, ..., placeN_pos]
All positions live in :math:\mathbb{R}^d, and d is provided via
params["dim"].
The residual is defined as::
r = room_pos - mean(place_positions)
If no member places are provided, the residual is a zero vector of the same
shape as room_pos.
:param x: Flat state vector containing the room position followed by the
positions of its member places. Shape (dim * (N + 1),) where
dim is the position dimension and N is the number of
member places.
:param params: Dictionary of parameters. Must contain:
"dim" (int or scalar array) indicating the dimension
of each position vector.
:return: Residual vector r with shape (dim,) enforcing that the room
lies at the centroid of its member places.
:rtype: jax.numpy.ndarray
Source code in dsg-jit/dsg_jit/scene_graph/relations.py
68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 | |