Datasets
DSG-JIT includes light-weight loaders for common SLAM / VO datasets so you can quickly hook real sequences into the sensor stack, factor graph, and dynamic scene graph.
The goals are:
- No heavy dependencies (no OpenCV required just to list frames).
- Simple dataclasses with timestamps + file paths.
- Easy integration with sensors.* streams and world.* components.
Currently supported:
datasets.tum_rgbd
TumRgbdFrame(t, rgb_path, depth_path=None, pose_quat=None)
dataclass
Single RGB-D frame from a TUM RGB-D sequence.
This dataclass stores only light-weight metadata: timestamps and relative file paths. Consumers are responsible for actually loading images/depth (e.g., via OpenCV or Pillow) if desired.
:param t:
Timestamp in seconds (as parsed from the TUM text files).
:type t: float
:param rgb_path:
Relative or absolute path to the RGB image file corresponding to this
frame. Typically something like "rgb/1341847980.722988.png".
:type rgb_path: str
:param depth_path:
Optional path to the depth image associated with this frame. May be
None if depth is not available or use_depth=False was passed
to the loader.
:type depth_path: Optional[str]
:param pose_quat:
Optional ground-truth pose as a 7-tuple
(tx, ty, tz, qx, qy, qz, qw) in TUM convention. May be None
if ground truth is unavailable or alignment was disabled.
:type pose_quat: Optional[Tuple[float, float, float, float, float, float, float]]
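The 7-tuple stored in pose_quat usually has to be turned into a homogeneous transform before it can be compared with an estimated trajectory. The following is a minimal sketch of that conversion in pure Python; the helper name pose_quat_to_matrix is hypothetical and is not part of DSG-JIT. It assumes the tuple order documented above, with the scalar part qw last.

```python
from typing import List, Tuple

Pose7 = Tuple[float, float, float, float, float, float, float]


def pose_quat_to_matrix(pose_quat: Pose7) -> List[List[float]]:
    """Convert (tx, ty, tz, qx, qy, qz, qw) into a 4x4 row-major matrix.

    Hypothetical helper for illustration; not a DSG-JIT function.
    """
    tx, ty, tz, qx, qy, qz, qw = pose_quat

    # Normalise first: values parsed from text are rarely exactly unit length.
    norm = (qx * qx + qy * qy + qz * qz + qw * qw) ** 0.5
    if norm == 0.0:
        raise ValueError("quaternion has zero norm")
    qx, qy, qz, qw = qx / norm, qy / norm, qz / norm, qw / norm

    # Standard unit-quaternion to rotation-matrix formula.
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw), tx],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw), ty],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy), tz],
        [0.0, 0.0, 0.0, 1.0],
    ]
```

Frames whose pose_quat is None should be skipped before calling a helper like this one.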
load_tum_rgbd_sequence(root, use_depth=True, use_groundtruth=False, max_frames=None, max_time_diff=0.02)
Load a TUM RGB-D sequence directory into a list of frames.
The directory is expected to contain standard TUM files such as
rgb.txt, depth.txt, and optionally groundtruth.txt. This
loader parses metadata and returns a list of :class:TumRgbdFrame
instances, but does not actually load images or depth maps.
:param root:
Path to the TUM sequence root directory.
:type root: Union[str, os.PathLike]
:param use_depth:
Whether to attempt to associate depth frames from depth.txt with
each RGB frame.
:type use_depth: bool
:param use_groundtruth:
Whether to attempt to associate ground-truth poses from
groundtruth.txt with each RGB frame.
:type use_groundtruth: bool
:param max_frames:
Optional maximum number of frames to return. If None, the full
sequence is loaded.
:type max_frames: Optional[int]
:param max_time_diff:
Maximum allowed absolute difference in timestamps (seconds) when
associating RGB with depth and ground truth.
:type max_time_diff: float
:return:
A list of TUM RGB-D frames with timestamps, file paths, and optionally
ground-truth poses.
:rtype: List[TumRgbdFrame]
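To make the role of max_time_diff concrete, here is a minimal sketch of timestamp association between two TUM index files. It is not the loader's actual implementation, which is not shown on this page; the function names parse_tum_list, nearest_within, and associate are hypothetical. It assumes the usual TUM text layout (comment lines starting with "#", then "timestamp filename" per line) and uses a simple nearest-neighbour match, so one depth entry may be matched by more than one RGB frame.

```python
from bisect import bisect_left
from typing import Iterable, List, Optional, Sequence, Tuple


def parse_tum_list(lines: Iterable[str]) -> List[Tuple[float, str]]:
    """Parse rgb.txt / depth.txt style lines into sorted (timestamp, path) pairs."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        stamp, path = line.split()[:2]
        entries.append((float(stamp), path))
    entries.sort(key=lambda e: e[0])
    return entries


def nearest_within(t: float, sorted_ts: Sequence[float],
                   max_time_diff: float = 0.02) -> Optional[int]:
    """Index of the timestamp closest to t, or None if it is too far away."""
    if not sorted_ts:
        return None
    i = bisect_left(sorted_ts, t)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(sorted_ts)]
    best = min(candidates, key=lambda j: abs(sorted_ts[j] - t))
    return best if abs(sorted_ts[best] - t) <= max_time_diff else None


def associate(rgb: List[Tuple[float, str]], depth: List[Tuple[float, str]],
              max_time_diff: float = 0.02) -> List[Tuple[float, str, Optional[str]]]:
    """Attach the nearest depth path (or None) to every RGB entry."""
    depth_ts = [t for t, _ in depth]
    out = []
    for t, rgb_path in rgb:
        j = nearest_within(t, depth_ts, max_time_diff)
        out.append((t, rgb_path, depth[j][1] if j is not None else None))
    return out
```

A larger max_time_diff yields more matched frames at the cost of looser synchronisation between color and depth.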
Source code in dsg-jit/dsg_jit/datasets/tum_rgbd.py
datasets.kitti_odometry
KittiOdomFrame(seq, idx, t, left_path, right_path=None, velo_path=None, T_w_cam0=None)
dataclass
Single frame from the KITTI Odometry dataset.
This dataclass provides file paths and optional ground-truth pose for a particular frame index in a given sequence.
:param seq:
KITTI odometry sequence ID (e.g. "00", "05").
:type seq: str
:param idx:
Integer frame index within the sequence.
:type idx: int
:param t:
Approximate timestamp in seconds. Many downstream pipelines assume
KITTI odometry runs at 10 Hz, so a common convention is
t = idx / 10.0.
:type t: float
:param left_path:
Path to the left camera image (image_0) for this frame.
:type left_path: str
:param right_path:
Optional path to the right camera image (image_1). May be None
if not available or desired.
:type right_path: Optional[str]
:param velo_path:
Optional path to the LiDAR point cloud (velodyne). May be None
if not available or desired.
:type velo_path: Optional[str]
:param T_w_cam0:
Optional 4x4 homogeneous transform from camera-0 to world frame as a
flattened 16-element tuple in row-major order. This is derived from
the official poses/<seq>.txt file if available.
:type T_w_cam0: Optional[Tuple[float, ...]]
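As an illustration of the flattened layout, the sketch below turns one line of a KITTI poses file into a 16-element row-major tuple. Each line of poses/<seq>.txt holds 12 numbers, the top three rows of the 4x4 transform, so the bottom row (0, 0, 0, 1) has to be appended. The helper names are hypothetical, and whether DSG-JIT builds T_w_cam0 exactly this way is an assumption.

```python
from typing import Tuple


def kitti_pose_line_to_flat16(line: str) -> Tuple[float, ...]:
    """Parse one poses/<seq>.txt line (12 floats, row-major 3x4) into 16 floats."""
    values = tuple(float(v) for v in line.split())
    if len(values) != 12:
        raise ValueError(f"expected 12 values, got {len(values)}")
    # Append the homogeneous bottom row to complete the 4x4 matrix.
    return values + (0.0, 0.0, 0.0, 1.0)


def translation_of(T_flat: Tuple[float, ...]) -> Tuple[float, float, float]:
    """Extract (x, y, z) from a row-major flattened 4x4 transform."""
    # In row-major order, the last column sits at indices 3, 7, and 11.
    return (T_flat[3], T_flat[7], T_flat[11])
```

For example, translation_of(frame.T_w_cam0) would give the camera-0 position in the world frame for any frame whose pose is not None.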
load_kitti_odometry_sequence(root, seq, load_right=True, load_velodyne=False, with_poses=True, max_frames=None)
Load a KITTI Odometry sequence into a list of frames.
This helper assumes the standard KITTI directory structure:
.. code-block::

    root/
      sequences/
        00/
          image_0/
          image_1/
          velodyne/
          calib.txt
      poses/
        00.txt
Only metadata (paths and ground-truth transforms) is loaded; images and point clouds are not read into memory.
:param root:
Path to the KITTI odometry dataset root directory.
:type root: Union[str, os.PathLike]
:param seq:
Sequence ID string (e.g. "00", "01").
:type seq: str
:param load_right:
Whether to populate right_path pointing to image_1. If False,
the field will always be None.
:type load_right: bool
:param load_velodyne:
Whether to populate velo_path pointing to velodyne scans. If
False, the field will always be None.
:type load_velodyne: bool
:param with_poses:
Whether to attempt to load ground-truth poses from poses/<seq>.txt.
:type with_poses: bool
:param max_frames:
Optional maximum number of frames to return. If None, all available
frames in the left camera directory are used.
:type max_frames: Optional[int]
:return:
List of :class:KittiOdomFrame entries with timestamps, paths, and
optionally ground-truth transforms.
:rtype: List[KittiOdomFrame]
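The metadata-only behaviour described above can be sketched as follows. This is not the DSG-JIT implementation: FrameStub is a hypothetical stand-in that mirrors a subset of the KittiOdomFrame fields, list_frames is a hypothetical helper, and the t = idx / 10.0 convention is the 10 Hz assumption mentioned in the field documentation. The demo builds an empty directory tree in a temporary folder so it runs without the dataset.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class FrameStub:
    """Hypothetical stand-in mirroring some KittiOdomFrame fields."""
    seq: str
    idx: int
    t: float
    left_path: str
    right_path: Optional[str] = None


def list_frames(root, seq: str, load_right: bool = True,
                max_frames: Optional[int] = None) -> List[FrameStub]:
    """Collect file paths only; no image is opened or decoded."""
    seq_dir = Path(root) / "sequences" / seq
    left_files = sorted((seq_dir / "image_0").glob("*.png"))
    if max_frames is not None:
        left_files = left_files[:max_frames]

    frames = []
    for idx, left in enumerate(left_files):
        right = seq_dir / "image_1" / left.name
        frames.append(FrameStub(
            seq=seq,
            idx=idx,
            t=idx / 10.0,  # assumed 10 Hz frame rate
            left_path=str(left),
            right_path=str(right) if load_right and right.exists() else None,
        ))
    return frames


# Demo on a fake tree with KITTI-style zero-padded file names.
with tempfile.TemporaryDirectory() as root:
    for cam in ("image_0", "image_1"):
        cam_dir = Path(root) / "sequences" / "00" / cam
        cam_dir.mkdir(parents=True)
        for i in range(3):
            (cam_dir / f"{i:06d}.png").touch()
    frames = list_frames(root, "00", max_frames=2)
    mono_frames = list_frames(root, "00", load_right=False)
```

Because only paths are collected, listing a full sequence stays cheap; decoding images is left to whichever consumer actually needs the pixels.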
Source code in dsg-jit/dsg_jit/datasets/kitti_odometry.py