Published2026-04-26·~14 min

Factor graphs and graph-based SLAM

The modern formulation: nodes are unknowns, edges are constraints, the graph is the joint posterior. Solve with sparse least-squares. GTSAM, g2o, Ceres — every state-of-the-art SLAM uses this.

by RobotForge

#slam#factor-graphs#optimization

EKF-SLAM was the standard until ~2010. Then graph-based SLAM ate it. The reason: factor graphs let you express the joint posterior as a sparse network of constraints; sparse non-linear least-squares solves it efficiently. Every modern SLAM system — ORB-SLAM3, LIO-SAM, COLMAP — boils down to a factor graph optimized with GTSAM, g2o, or Ceres.

The setup

The robot moves through time, observing landmarks. Variables to estimate:

Robot poses $x_{1}, x_{2}, \dots, x_{T}$ — one per timestep.
Landmark positions $l_{1}, l_{2}, \dots, l_{M}$ .

Constraints (factors):

Odometry factor: between consecutive poses $x_{t}$ and $x_{t + 1}$ , derived from wheel/IMU measurement of motion.
Observation factor: between pose $x_{t}$ and landmark $l_{i}$ , derived from a sensor measurement (camera, lidar).
Loop closure factor: between two poses far apart in time, derived from "I'm seeing a place I've been before."
Prior factor: the first pose anchors the coordinate system; sometimes GPS adds priors elsewhere.

Each factor is a noise-weighted residual: how much do the variables disagree with this measurement?

The objective

Find variables that minimize total squared residual:

x, l min k \sum ∥ r_{k} (x, l) ∥_{Σ_{k}}^{2}

Where $r_{k}$ is the k-th factor's residual and $Σ_{k}$ is its covariance. The Mahalanobis-norm weighting accounts for measurement noise: noisy measurements contribute less, certain ones contribute more.

Mathematically equivalent to maximum-likelihood estimation under Gaussian assumptions. Solve with non-linear least-squares.

Why "graph" matters

Each factor connects only a few variables. A camera observation factor: 2 variables (one pose + one landmark). An odometry factor: 2 variables (consecutive poses). A loop closure: 2 poses.

The Jacobian of the residual function is therefore extremely sparse. Sparse linear-algebra solvers (Cholesky, QR) handle SLAM-sized problems with thousands of variables in milliseconds.

Compare to EKF-SLAM: dense covariance matrix, $O (M^{2})$ memory, $O (M^{3})$ update. Graph-SLAM: $O (M)$ factors, $O (M)$ memory after factorization.

Levenberg-Marquardt: the workhorse solver

Iterative non-linear least-squares:

Start at an initial estimate $x^{(0)}$ .
Linearize each factor around the current estimate.
Solve the resulting sparse linear system $(J^{T} Σ^{- 1} J + λ I) Δ x = - J^{T} Σ^{- 1} r$ .
Update: $x^{(k + 1)} = x^{(k)} + Δ x$ . Adjust damping $λ$ based on cost change.
Iterate until convergence.

Levenberg-Marquardt blends Gauss-Newton (fast) with gradient descent (safe) via the damping parameter. Standard implementation in GTSAM, g2o, Ceres.

The libraries

Library	Used by
GTSAM (Georgia Tech)	LIO-SAM, OpenVSLAM, many research
g2o (TU Munich / Heidelberg)	ORB-SLAM3, RGB-D SLAM
Ceres (Google)	COLMAP (SfM), Cartographer
SymForce (Skydio)	Drone autonomy, code-generation oriented

For ROS-based work, GTSAM is the most common entry point. Python bindings exist; build from source recommended.

The 30-line factor-graph SLAM

import gtsam
from gtsam import symbol_shorthand
X, L = symbol_shorthand.X, symbol_shorthand.L

graph = gtsam.NonlinearFactorGraph()
initial = gtsam.Values()

# Prior on first pose
prior_noise = gtsam.noiseModel.Diagonal.Sigmas([0.01, 0.01, 0.01])
graph.add(gtsam.PriorFactorPose2(X(0), gtsam.Pose2(0, 0, 0), prior_noise))
initial.insert(X(0), gtsam.Pose2(0, 0, 0))

# Odometry factors
odom_noise = gtsam.noiseModel.Diagonal.Sigmas([0.05, 0.05, 0.02])
for t in range(1, T):
    graph.add(gtsam.BetweenFactorPose2(
        X(t-1), X(t), gtsam.Pose2(*odometry[t-1]), odom_noise))
    initial.insert(X(t), gtsam.Pose2(*estimated_pose(t)))

# Landmark observations
obs_noise = gtsam.noiseModel.Diagonal.Sigmas([0.1, 0.1])
for t, landmark_id, bearing, range_ in observations:
    graph.add(gtsam.BearingRangeFactor2D(
        X(t), L(landmark_id), bearing, range_, obs_noise))

# Solve
result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()

Thirty lines for a working 2D SLAM. Add more factor types (loop closures, IMU preintegration, GPS) by inserting them into the graph.

Loop closure: the killer feature

Without loop closure, SLAM accumulates drift forever. With it, returning to a known place adds a constraint that rewinds drift across the entire trajectory.

Loop closure detection is the hard part:

Visual: bag-of-words (DBoW3), or learned features (NetVLAD, AnyLoc).
LiDAR: scan context descriptors (SC, OverlapNet).
Geometric verification: after candidate matching, verify with RANSAC.

Once you have a verified loop, add a between-pose factor with the inferred relative transform. The optimizer rewinds the whole graph. Watch the trajectory snap into place.

Incremental optimization (iSAM2)

Real SLAM doesn't re-solve from scratch every step. iSAM2 (Kaess et al.) maintains a Bayes-tree representation that updates incrementally: only re-linearize the variables affected by new measurements.

Result: real-time SLAM that scales to thousands of poses with millisecond-per-step latency. GTSAM ships iSAM2 out of the box.

Robustness

Wrong loop closures (false positives) destroy a graph-SLAM solution. Mitigations:

Robust kernels: Huber, Cauchy. Reduce the influence of huge residuals.
Switch variables / max-mixtures: explicit "is this loop closure right?" hidden variables.
RRR / DCS: drop loop closures whose residuals exceed a threshold.

Modern systems combine these. ORB-SLAM3 has multi-stage geometric verification before adding any loop closure.

Why this beat EKF-SLAM

Scaling: graph-SLAM handles thousands of poses; EKF chokes past hundreds.
Re-linearization: full re-linearization at each iteration corrects bias from poor initial estimates.
Information separation: each measurement is its own factor, easy to add or remove.
Multi-modal posteriors: factor graphs represent any posterior, not just unimodal Gaussians.

Exercise

In GTSAM, set up a 2D landmark-SLAM problem with 5 poses and 3 landmarks. Add some loop closure factors with deliberately bad measurements. Optimize without robust kernels — watch the solution warp. Re-run with Huber kernel; the solution stays clean. The robust-kernel demo is the moment factor graphs feel modern.

Occupancy grids — the simplest form of mapping, and the layer Nav2's costmap is built on.

← Previous

Particle filters and Monte Carlo localization

Occupancy grid mapping