
LiDAR SLAM: LOAM and its descendants

LOAM, F-LOAM, LIO-SAM — the lineage that powers most modern robot mapping stacks. Why LiDAR SLAM is different from visual, and the algorithms that became the production standard.

by RobotForge
#slam #lidar #loam

If autonomy is outdoors, the sensor is probably LiDAR. Cameras struggle with featureless surfaces, harsh lighting, and night; LiDAR is consistent. The dominant SLAM family on LiDAR data starts with LOAM (Lidar Odometry and Mapping, Zhang & Singh 2014) and runs through F-LOAM, LeGO-LOAM, LIO-SAM, and Faster-LIO. They all share an architecture; the differences are speed, accuracy, and how they fuse other sensors.

Why LiDAR SLAM is different from visual

  • Direct 3D measurements: each return is a 3D point. No need to triangulate from 2D features.
  • Consistent in lighting: works in dark, fog, glare. Doesn't rely on visual texture.
  • Sparse but reliable: tens of thousands to a few hundred thousand points per scan. Far fewer than camera pixels, but each carries metric depth.
  • Geometry-driven: features are edges and planes in the cloud, not corners + descriptors.
  • Computationally heavy: nearest-neighbor search over millions of points dominates.

LOAM: the architecture

LOAM splits the problem into two threads:

  • Odometry (high rate): runs once per scan, ~10 Hz. Estimates motion between consecutive scans.
  • Mapping (lower rate): runs every few scans, ~1 Hz. Refines the pose against an accumulated map.

The two-rate decoupling lets odometry run in real time on commodity CPUs while mapping does the heavy lifting in a slower thread.
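To make the structure concrete, here's a minimal Python sketch of the two threads. The estimate_motion and refine_against_map functions are hypothetical stand-ins for the registration steps described in the next sections, not LOAM's actual code.

    # Two-rate LOAM-style pipeline: a fast odometry thread feeds a slow
    # mapping thread through a queue. estimate_motion / refine_against_map
    # are hypothetical stand-ins for the registration steps below.
    import queue
    import threading
    import numpy as np

    scan_queue = queue.Queue()   # raw scans from the driver (~10 Hz)
    odom_queue = queue.Queue()   # scans plus coarse poses, for mapping

    def estimate_motion(scan):
        return np.eye(4)         # stand-in: scan-to-scan registration

    def refine_against_map(scan, coarse_pose, world_map):
        return coarse_pose       # stand-in: scan-to-map registration

    def odometry_thread():       # high rate: low-latency pose output
        pose = np.eye(4)
        while True:
            scan = scan_queue.get()
            pose = pose @ estimate_motion(scan)   # compose relative motion
            odom_queue.put((scan, pose))          # hand off to mapping

    def mapping_thread():        # low rate: heavy work off the hot path
        world_map = []
        while True:
            scan, coarse = odom_queue.get()
            refined = refine_against_map(scan, coarse, world_map)
            world_map.append((scan, refined))

    threading.Thread(target=odometry_thread, daemon=True).start()
    threading.Thread(target=mapping_thread, daemon=True).start()
    for _ in range(10):                           # feed fake scans
        scan_queue.put(np.random.rand(1000, 3))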

The features: edges and planes

For each LiDAR ring (one circular sweep at constant elevation):

  • Compute curvature c for each point: how much does the local neighborhood deviate from a smooth curve?
  • High curvature → edge feature (corners of objects).
  • Low curvature → planar feature (walls, ground).

Pick the top-k edge points and top-k planar points per ring, which yields roughly 100–300 features per scan: much smaller than the raw cloud, yet still enough to characterize the geometry.
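Here's a minimal numpy sketch of the curvature test on a single ring. The ring_features helper and its defaults are illustrative; real LOAM additionally splits each ring into sub-regions so features spread evenly around the scan.

    # LOAM-style curvature on one ring: compare each point to the sum of
    # its neighbors along the scan line. High residual = edge, low = plane.
    import numpy as np

    def ring_features(points, k=5, n_edge=2, n_plane=4):
        # points: (N, 3), ordered by azimuth around the ring.
        # n_edge / n_plane mirror LOAM's per-sub-region picks.
        n = len(points)
        c = np.empty(n)
        for i in range(n):
            idx = np.arange(i - k, i + k + 1) % n        # wrap the ring
            diff = points[idx].sum(axis=0) - (2 * k + 1) * points[i]
            c[i] = np.linalg.norm(diff) / (2 * k * np.linalg.norm(points[i]) + 1e-9)
        order = np.argsort(c)
        return order[-n_edge:], order[:n_plane]          # edges, planes

    ring = np.random.randn(1800, 3) + np.array([10.0, 0.0, 0.0])  # fake ring
    edges, planes = ring_features(ring)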

Scan-to-scan matching

Given two consecutive scans:

  • For each edge feature in the current scan, find the two nearest edge points in the previous scan; they define a line. Compute the point-to-line residual.
  • For each planar feature, find the three nearest planar points in the previous scan; they define a plane. Compute the point-to-plane residual.
  • Solve non-linear least-squares: find the relative pose minimizing total residual.

Output: 6-DOF transform between scans. Compose with previous pose to get current pose. This is "scan-to-scan odometry" — fast but drifts.
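A sketch of both residuals and the solve, using scipy's generic least-squares solver in place of LOAM's hand-rolled Levenberg-Marquardt, with toy correspondences assumed already found:

    # Point-to-line and point-to-plane residuals, minimized over a 6-DOF
    # pose (rotation vector + translation). scipy.optimize.least_squares
    # stands in for LOAM's own optimizer.
    import numpy as np
    from scipy.optimize import least_squares
    from scipy.spatial.transform import Rotation

    def point_to_line(p, a, b):
        d = b - a                                    # line through a and b
        return np.linalg.norm(np.cross(p - a, d)) / np.linalg.norm(d)

    def point_to_plane(p, q, n):
        return np.dot(p - q, n)                      # n is the unit normal

    def residuals(x, edge_corr, plane_corr):
        rot = Rotation.from_rotvec(x[:3]).as_matrix()
        t = x[3:]
        res = [point_to_line(rot @ p + t, a, b) for p, a, b in edge_corr]
        res += [point_to_plane(rot @ p + t, q, n) for p, q, n in plane_corr]
        return res

    # Toy correspondences: (feature, two line points) and
    # (feature, plane point, plane normal).
    edge_corr = [
        (np.array([1.0, 0.0, 0.0]), np.array([1.1, 0.0, -1.0]), np.array([1.1, 0.0, 1.0])),
        (np.array([0.0, 0.0, 1.0]), np.array([-1.0, 0.1, 1.0]), np.array([1.0, 0.1, 1.0])),
    ]
    plane_corr = [
        (np.array([0.0, 2.0, 0.0]), np.array([0.0, 2.1, 0.0]), np.array([0.0, 1.0, 0.0])),
        (np.array([3.0, 0.0, 0.0]), np.array([3.1, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])),
        (np.array([0.0, 0.0, 4.0]), np.array([0.0, 0.0, 4.1]), np.array([0.0, 0.0, 1.0])),
        (np.array([1.0, 1.0, 0.0]), np.array([1.0, 1.1, 0.0]), np.array([0.0, 1.0, 0.0])),
    ]
    sol = least_squares(residuals, np.zeros(6), args=(edge_corr, plane_corr))
    print(sol.x)   # 6-DOF relative pose: rotation vector, then translation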

Scan-to-map matching

The slower thread maintains an accumulated point cloud (the "map"). For each new scan:

  • Match the scan's features against the map's features (kd-tree spatial query).
  • Solve the same point-to-line + point-to-plane least-squares.
  • Update the map with the registered scan.

Scan-to-map drifts much less than scan-to-scan because each match references the long-term accumulated geometry. The cost: more compute per match.
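A sketch of the correspondence step with scipy's cKDTree, using random points as stand-ins for the map and scan. Production systems avoid the full tree rebuild with incremental structures like ikd-Tree:

    # Scan-to-map correspondence: query the map's kd-tree for each scan
    # feature, then fit a local plane to the neighbors for the residual.
    import numpy as np
    from scipy.spatial import cKDTree

    map_points = np.random.rand(100_000, 3) * 50.0  # stand-in accumulated map
    tree = cKDTree(map_points)                      # rebuilt here; incremental in practice

    scan_features = np.random.rand(300, 3) * 50.0   # stand-in scan features
    dists, idx = tree.query(scan_features, k=5)     # 5 nearest map points each

    def local_plane(neighbors):
        q = neighbors.mean(axis=0)                  # plane point: centroid
        _, _, vt = np.linalg.svd(neighbors - q)
        return q, vt[-1]                            # smallest singular vector = normal

    q, n = local_plane(map_points[idx[0]])          # feeds the point-to-plane residual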

The descendants

Variant                 What it adds
F-LOAM                  Faster: single-rate pipeline, optimized feature extraction.
LeGO-LOAM               Ground-segmentation prior; better for ground robots.
LIO-SAM                 Tightly coupled IMU; factor-graph back-end (GTSAM).
FAST-LIO / Faster-LIO   Direct EKF-style update without explicit features. Very fast.
SuMa++                  Surfel-based, semantic-aware (excludes moving objects).

For 2026 production, LIO-SAM is the most-cited choice. It combines LiDAR + IMU + GPS in a factor graph and is battle-tested on autonomous vehicles, drones, and ground robots.

The IMU integration

Adding an IMU buys you:

  • De-skewing: a single LiDAR scan is collected over ~100 ms while the robot moves. The IMU predicts that motion; the points are rectified to a single instant.
  • Initial guesses: IMU provides scan-to-scan initial pose estimates, dramatically improving convergence.
  • High-rate poses between scans: 100 Hz IMU vs 10 Hz LiDAR.
  • Robustness in degraded scenes: open fields, tunnels (no features). IMU keeps the pose alive.

LIO-SAM and Faster-LIO both fuse IMU tightly. The cleanest formulations use IMU pre-integration (Forster et al.) — accumulate IMU readings between scans into a single relative-motion factor for the optimization.
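Of the four benefits, de-skewing is the easiest to show in code. A minimal sketch, assuming the IMU has already given us poses at the start and end of the sweep and that motion in between is roughly linear; real systems integrate the IMU rather than interpolating:

    # Motion compensation: move each point from its capture time to the
    # end of the sweep by interpolating the IMU-predicted poses.
    import numpy as np
    from scipy.spatial.transform import Rotation, Slerp

    def deskew(points, times, rot0, rot1, t0, t1):
        # points: (N, 3); times: (N,) in [0, 1], fraction of the sweep;
        # (rot0, t0) and (rot1, t1): poses at sweep start and end.
        slerp = Slerp([0.0, 1.0], Rotation.concatenate([rot0, rot1]))
        rot_i = slerp(times)                                   # per-point rotation
        t_i = np.outer(1.0 - times, t0) + np.outer(times, t1)  # linear translation
        world = rot_i.apply(points) + t_i                      # into the world frame
        return rot1.inv().apply(world - t1)                    # into end-of-sweep frame

    pts = np.random.rand(1000, 3) * 20.0
    times = np.linspace(0.0, 1.0, len(pts))                    # capture order
    fixed = deskew(pts, times, Rotation.identity(),
                   Rotation.from_rotvec([0.0, 0.0, 0.05]),     # ~3 deg of yaw
                   np.zeros(3), np.array([0.5, 0.0, 0.0]))     # 0.5 m forward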

Loop closure on LiDAR

Same idea as visual: detect revisits, add a constraint, optimize. Differences:

  • Place descriptors: classical (M2DP, ScanContext); learned (PointNetVLAD, OverlapNet). ScanContext, a simple polar-grid descriptor that is fast and accurate, is the most popular in 2026; a minimal sketch follows this list.
  • Geometric verification: RANSAC + ICP for relative-pose refinement.
  • Pose graph optimization: GTSAM or g2o. Same as visual SLAM.
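To make the descriptor idea concrete, here's a simplified ScanContext-style sketch. The real method (Kim & Kim, 2018) adds a ring-key search and column-wise similarity; this keeps only the polar max-height grid and a brute-force, yaw-invariant comparison:

    # Simplified ScanContext: bin the scan into a polar (ring x sector)
    # grid, store the max height per cell, compare grids over yaw shifts.
    import numpy as np

    def scan_context(points, n_rings=20, n_sectors=60, r_max=80.0):
        x, y, z = points[:, 0], points[:, 1], points[:, 2]
        r = np.hypot(x, y)
        theta = np.arctan2(y, x) + np.pi                  # range (0, 2*pi]
        keep = r < r_max
        ring = (r[keep] / r_max * n_rings).astype(int)
        sector = (theta[keep] / (2 * np.pi) * n_sectors).astype(int) % n_sectors
        desc = np.full((n_rings, n_sectors), -np.inf)
        np.maximum.at(desc, (ring, sector), z[keep])      # max height per cell
        desc[np.isinf(desc)] = 0.0                        # empty cells
        return desc

    def sc_distance(d1, d2):
        # Yaw invariance: best cosine similarity over all column shifts.
        best = -1.0
        for s in range(d1.shape[1]):
            d2s = np.roll(d2, s, axis=1)
            num = float((d1 * d2s).sum())
            den = np.linalg.norm(d1) * np.linalg.norm(d2s) + 1e-9
            best = max(best, num / den)
        return 1.0 - best

    a = scan_context(np.random.rand(10_000, 3) * [60, 60, 4])
    b = scan_context(np.random.rand(10_000, 3) * [60, 60, 4])
    print(sc_distance(a, b))                              # small = likely revisit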

Loop closures on LiDAR are typically more reliable than on cameras because LiDAR is insensitive to the lighting changes that break visual matching.

The compute reality

LiDAR SLAM is compute-hungry. Per-scan operations on hundreds of thousands of points include:

  • Feature extraction: ~5 ms with optimized code.
  • kd-tree queries: depends on map size. ~10–50 ms for a 1M-point map.
  • Non-linear least-squares: ~5–20 ms.
  • Map update: ~10 ms (incremental kd-trees like ikd-Tree help).

For a 10 Hz LiDAR (Velodyne VLP-16, Ouster OS1), real-time is achievable on a Jetson Orin or modest desktop CPU. For 20 Hz operation in dense scenes, you're optimizing.

Failure modes

  • Featureless tunnels / open fields: edges and planes degenerate. Without IMU, LiDAR SLAM diverges.
  • Heavy motion: if LiDAR de-skewing is wrong, scans look smeared. Tightly-coupled IMU fusion is the fix.
  • Moving objects in the scene: cars, pedestrians become noise points. Modern systems (SuMa++) explicitly remove dynamic returns.
  • Reflective surfaces: glass and polished metal either let the beam pass through or produce ghost reflections. Filter aggressively or accept some pollution.

How to start

  1. Get a Velodyne VLP-16 or Ouster OS0 (~$5k–$10k), or use a public dataset like KITTI / nuScenes.
  2. Clone LIO-SAM; build with ROS 2.
  3. Run on the bagged data; watch the map build and the trajectory drift, then snap on loop closure.
  4. Add your own IMU calibration; tune the per-IMU noise parameters.

The first time the system maps a parking lot in real time and snaps closed when you drive past the start, the field becomes tangible.

Exercise

On KITTI sequence 00, run F-LOAM and LIO-SAM. Compare absolute pose error against ground truth. LIO-SAM (with IMU) should be roughly 2–5× more accurate. That improvement is what tightly coupled IMU fusion buys you in real driving conditions.
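A common tool for this comparison is evo, but the metric itself is simple. A minimal numpy sketch, assuming both trajectories are (N, 3) position arrays at matching timestamps (in KITTI pose files, the positions are columns 3, 7, and 11 of each row):

    # Absolute pose error (translation): rigidly align the estimate to the
    # ground truth (Umeyama/Kabsch), then take the RMSE of the residuals.
    import numpy as np

    def align(est, gt):
        mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
        h = (est - mu_e).T @ (gt - mu_g)            # 3x3 cross-covariance
        u, _, vt = np.linalg.svd(h)
        d = np.sign(np.linalg.det(vt.T @ u.T))      # guard against reflection
        rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
        return rot, mu_g - rot @ mu_e

    def ape_rmse(est, gt):
        rot, t = align(est, gt)
        err = gt - (est @ rot.T + t)
        return np.sqrt((err ** 2).sum(axis=1).mean())

    gt = np.cumsum(np.random.randn(1000, 3), axis=0)  # fake trajectories
    est = gt + np.random.randn(1000, 3) * 0.1
    print(ape_rmse(est, gt))                          # meters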

Next

Modern SLAM with learned features and Gaussian splatting — the deep-learning wave reshaping what's possible.
