Factor graphs and graph-based SLAM
The modern formulation: nodes are unknowns, edges are constraints, the graph is the joint posterior. Solve with sparse least-squares. GTSAM, g2o, Ceres — every state-of-the-art SLAM uses this.
EKF-SLAM was the standard until ~2010. Then graph-based SLAM ate it. The reason: factor graphs let you express the joint posterior as a sparse network of constraints; sparse non-linear least-squares solves it efficiently. Every modern SLAM system — ORB-SLAM3, LIO-SAM, COLMAP — boils down to a factor graph optimized with GTSAM, g2o, or Ceres.
The setup
The robot moves through time, observing landmarks. Variables to estimate:
- Robot poses — one per timestep.
- Landmark positions .
Constraints (factors):
- Odometry factor: between consecutive poses and , derived from wheel/IMU measurement of motion.
- Observation factor: between pose and landmark , derived from a sensor measurement (camera, lidar).
- Loop closure factor: between two poses far apart in time, derived from "I'm seeing a place I've been before."
- Prior factor: the first pose anchors the coordinate system; sometimes GPS adds priors elsewhere.
Each factor is a noise-weighted residual: how much do the variables disagree with this measurement?
The objective
Find variables that minimize total squared residual:
Where is the k-th factor's residual and is its covariance. The Mahalanobis-norm weighting accounts for measurement noise: noisy measurements contribute less, certain ones contribute more.
Mathematically equivalent to maximum-likelihood estimation under Gaussian assumptions. Solve with non-linear least-squares.
Why "graph" matters
Each factor connects only a few variables. A camera observation factor: 2 variables (one pose + one landmark). An odometry factor: 2 variables (consecutive poses). A loop closure: 2 poses.
The Jacobian of the residual function is therefore extremely sparse. Sparse linear-algebra solvers (Cholesky, QR) handle SLAM-sized problems with thousands of variables in milliseconds.
Compare to EKF-SLAM: dense covariance matrix, memory, update. Graph-SLAM: factors, memory after factorization.
Levenberg-Marquardt: the workhorse solver
Iterative non-linear least-squares:
- Start at an initial estimate .
- Linearize each factor around the current estimate.
- Solve the resulting sparse linear system .
- Update: . Adjust damping based on cost change.
- Iterate until convergence.
Levenberg-Marquardt blends Gauss-Newton (fast) with gradient descent (safe) via the damping parameter. Standard implementation in GTSAM, g2o, Ceres.
The libraries
| Library | Used by |
|---|---|
| GTSAM (Georgia Tech) | LIO-SAM, OpenVSLAM, many research |
| g2o (TU Munich / Heidelberg) | ORB-SLAM3, RGB-D SLAM |
| Ceres (Google) | COLMAP (SfM), Cartographer |
| SymForce (Skydio) | Drone autonomy, code-generation oriented |
For ROS-based work, GTSAM is the most common entry point. Python bindings exist; build from source recommended.
The 30-line factor-graph SLAM
import gtsam
from gtsam import symbol_shorthand
X, L = symbol_shorthand.X, symbol_shorthand.L
graph = gtsam.NonlinearFactorGraph()
initial = gtsam.Values()
# Prior on first pose
prior_noise = gtsam.noiseModel.Diagonal.Sigmas([0.01, 0.01, 0.01])
graph.add(gtsam.PriorFactorPose2(X(0), gtsam.Pose2(0, 0, 0), prior_noise))
initial.insert(X(0), gtsam.Pose2(0, 0, 0))
# Odometry factors
odom_noise = gtsam.noiseModel.Diagonal.Sigmas([0.05, 0.05, 0.02])
for t in range(1, T):
graph.add(gtsam.BetweenFactorPose2(
X(t-1), X(t), gtsam.Pose2(*odometry[t-1]), odom_noise))
initial.insert(X(t), gtsam.Pose2(*estimated_pose(t)))
# Landmark observations
obs_noise = gtsam.noiseModel.Diagonal.Sigmas([0.1, 0.1])
for t, landmark_id, bearing, range_ in observations:
graph.add(gtsam.BearingRangeFactor2D(
X(t), L(landmark_id), bearing, range_, obs_noise))
# Solve
result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
Thirty lines for a working 2D SLAM. Add more factor types (loop closures, IMU preintegration, GPS) by inserting them into the graph.
Loop closure: the killer feature
Without loop closure, SLAM accumulates drift forever. With it, returning to a known place adds a constraint that rewinds drift across the entire trajectory.
Loop closure detection is the hard part:
- Visual: bag-of-words (DBoW3), or learned features (NetVLAD, AnyLoc).
- LiDAR: scan context descriptors (SC, OverlapNet).
- Geometric verification: after candidate matching, verify with RANSAC.
Once you have a verified loop, add a between-pose factor with the inferred relative transform. The optimizer rewinds the whole graph. Watch the trajectory snap into place.
Incremental optimization (iSAM2)
Real SLAM doesn't re-solve from scratch every step. iSAM2 (Kaess et al.) maintains a Bayes-tree representation that updates incrementally: only re-linearize the variables affected by new measurements.
Result: real-time SLAM that scales to thousands of poses with millisecond-per-step latency. GTSAM ships iSAM2 out of the box.
Robustness
Wrong loop closures (false positives) destroy a graph-SLAM solution. Mitigations:
- Robust kernels: Huber, Cauchy. Reduce the influence of huge residuals.
- Switch variables / max-mixtures: explicit "is this loop closure right?" hidden variables.
- RRR / DCS: drop loop closures whose residuals exceed a threshold.
Modern systems combine these. ORB-SLAM3 has multi-stage geometric verification before adding any loop closure.
Why this beat EKF-SLAM
- Scaling: graph-SLAM handles thousands of poses; EKF chokes past hundreds.
- Re-linearization: full re-linearization at each iteration corrects bias from poor initial estimates.
- Information separation: each measurement is its own factor, easy to add or remove.
- Multi-modal posteriors: factor graphs represent any posterior, not just unimodal Gaussians.
Exercise
In GTSAM, set up a 2D landmark-SLAM problem with 5 poses and 3 landmarks. Add some loop closure factors with deliberately bad measurements. Optimize without robust kernels — watch the solution warp. Re-run with Huber kernel; the solution stays clean. The robust-kernel demo is the moment factor graphs feel modern.
Next
Occupancy grids — the simplest form of mapping, and the layer Nav2's costmap is built on.
Comments
Sign in to post a comment.