
Extended and unscented Kalman filters

Real robots are nonlinear. The Kalman filter assumes linearity. EKF linearizes around the current mean; UKF propagates sigma points through the real nonlinear models. Same Bayes filter, two ways to handle nonlinearity.

by RobotForge
#slam #estimation #ekf #ukf

Linear Kalman filters are the simplest case of state estimation. Real robots aren't linear — wheel-odometry has trig, IMU has nonlinear bias dynamics, GPS lat/lon has Earth curvature. Two Bayes-filter generalizations handle nonlinearity within the Gaussian framework: EKF (extended Kalman filter) and UKF (unscented Kalman filter). Both keep the belief Gaussian; they differ in how they push it through the nonlinearity.

The recap: where Kalman fails

The plain Kalman filter assumes linear models:

    x' = F·x + B·u + w    (motion)
    z  = H·x + v          (measurement)

The "F·x" and "H·x" forms keep Gaussian-in, Gaussian-out. Real models are nonlinear:

    x' = f(x, u) + w
    z  = h(x) + v

Push a Gaussian through a nonlinear function and the result is generally not Gaussian. An approximation is needed.
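
As a concrete example, here is a sketch of a unicycle motion model and GPS-style measurement in Python. The model and signatures are illustrative choices, not from any particular library; the same f and h conventions are reused in the code later in this post.

import numpy as np

def f(x, u, dt):
    # Unicycle motion model: state (x, y, theta), input (v, omega).
    px, py, theta = x
    v, omega = u
    return np.array([
        px + v * np.cos(theta) * dt,
        py + v * np.sin(theta) * dt,
        theta + omega * dt,
    ])

def h(x):
    # GPS-style measurement: observe position only.
    return x[:2]

The trig in f is exactly the kind of nonlinearity that breaks the plain Kalman filter.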

EKF: linearize, then Kalman

The trick: at each step, replace f and h with their linearizations around the current mean estimate.

These are the Jacobian matrices F = ∂f/∂x and H = ∂h/∂x, evaluated at the current mean. Compute them analytically (preferred) or via finite differences. Then run the standard Kalman update with these locally-linear matrices.
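
If an analytical Jacobian is impractical, central differences work as a fallback. A minimal sketch, assuming NumPy and a function of the state alone (wrap f in a lambda to fix u and dt):

import numpy as np

def jacobian_fd(func, x, eps=1e-6):
    # Central-difference Jacobian: J[i, j] = d func_i / d x_j at x.
    fx = np.asarray(func(x))
    J = np.zeros((len(fx), len(x)))
    for j in range(len(x)):
        dx = np.zeros(len(x))
        dx[j] = eps
        J[:, j] = (np.asarray(func(x + dx)) - np.asarray(func(x - dx))) / (2 * eps)
    return J

# e.g. F = jacobian_fd(lambda s: f(s, u, dt), mu)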

Predict:

    mu_pred    = f(mu, u)
    Sigma_pred = F·Sigma·Fᵀ + Q

Update:

    K         = Sigma_pred·Hᵀ·(H·Sigma_pred·Hᵀ + R)⁻¹
    mu_new    = mu_pred + K·(z − h(mu_pred))
    Sigma_new = (I − K·H)·Sigma_pred

Same five equations as Kalman, but with the Jacobians F and H evaluated at the current mean.

When EKF works: nonlinearity is mild near the operating point. Wheel-odometry of a slowly-moving robot. Most consumer-IMU integration.

When EKF fails: strong nonlinearity, or the mean estimate drifts far from truth (so the linearization point is wrong). Classic example: bearing-only tracking when the target is close. The bearing Jacobian blows up as range goes to zero.

UKF: skip linearization, propagate samples

The unscented transform: instead of linearizing f, sample a small set of "sigma points" around the current mean, push each one through the real f, then re-fit a Gaussian to the result.

Sigma points (for n-dimensional state):

    X_0     = mu
    X_i     = mu + column_i(sqrt((n + λ)·Σ))    for i = 1…n
    X_{n+i} = mu − column_i(sqrt((n + λ)·Σ))    for i = 1…n

That's 2n+1 samples covering the central mean and the principal axes of the covariance ellipsoid. λ is a scaling parameter, and the matrix square root is the Cholesky factor of (n + λ)·Σ.

Push each sample through f, get 2n+1 transformed points. Compute their weighted mean and covariance. That's the propagated belief.
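
A minimal unscented-transform sketch in NumPy, using the standard sigma points and a simplified scheme where mean and covariance share one set of weights (an assumption of this sketch; full UKF implementations often use a separate zeroth covariance weight):

import numpy as np

def unscented_transform(mu, Sigma, func, lam=2.0):
    # Generate 2n+1 sigma points, push each through func, re-fit a Gaussian.
    n = len(mu)
    L = np.linalg.cholesky((n + lam) * Sigma)     # matrix square root
    points = [mu] + [mu + L[:, i] for i in range(n)] + [mu - L[:, i] for i in range(n)]
    w = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    w[0] = lam / (n + lam)                        # weights sum to 1
    ys = np.array([func(p) for p in points])
    mean = w @ ys
    diff = ys - mean
    cov = (w[:, None] * diff).T @ diff            # weighted outer products
    return mean, cov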

The trade:

  • EKF: 1 evaluation of f + Jacobian. Fast.
  • UKF: 2n+1 evaluations of f, no Jacobian needed. Slower per step but no Jacobian derivation.

Why UKF often beats EKF on accuracy

The unscented transform matches the mean and covariance of the transformed distribution to at least second order in the Taylor expansion; the EKF's linearization is only first-order accurate. For systems with significant curvature, UKF stays accurate where EKF starts to lie about its uncertainty (a small numeric demo follows the list below).

This matters most for:

  • Long-duration integration (e.g. predicting many steps ahead).
  • Highly nonlinear measurement models (range-bearing, GPS in the wrong frame).
  • State estimates that travel far from any single linearization point.
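
To see the first-order bias concretely, here is a toy demo (the polar-to-Cartesian map and the numbers are illustrative, not from any particular system): propagate a Gaussian with a noisy bearing and compare the EKF-style mean, f evaluated at the mean, against a Monte-Carlo estimate of the true mean.

import numpy as np

rng = np.random.default_rng(0)
mean = np.array([10.0, np.pi / 4])    # range, bearing
cov = np.diag([0.01, 0.25])           # bearing is very uncertain

def polar_to_cart(p):
    r, th = p
    return np.array([r * np.cos(th), r * np.sin(th)])

ekf_mean = polar_to_cart(mean)        # first-order propagation: f(mean)

samples = rng.multivariate_normal(mean, cov, size=100_000)
mc_mean = np.apply_along_axis(polar_to_cart, 1, samples).mean(axis=0)

print(ekf_mean, mc_mean)              # the first-order mean is visibly biased

Running the unscented_transform sketch above on the same Gaussian should land much closer to the Monte-Carlo mean.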

The third option no one talks about: iterated EKF

If you have the budget, run the EKF update several times, relinearizing around the refined estimate each time. Convergence is usually fast (3–5 iterations). The iterated EKF often closes most of the gap to UKF at a fraction of the cost.
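
A minimal sketch of the iterated measurement update, following the standard Gauss-Newton form and assuming the same h, jacobian_h, and R conventions as the EKF code in the next section:

import numpy as np

def iekf_update(mu_pred, Sigma_pred, z, n_iter=5):
    # Iterated EKF update: relinearize h around the refined estimate each pass.
    mu_i = mu_pred
    for _ in range(n_iter):
        H = jacobian_h(mu_i)
        K = Sigma_pred @ H.T @ np.linalg.inv(H @ Sigma_pred @ H.T + R)
        # Gauss-Newton step; the H @ (mu_pred - mu_i) term accounts for
        # linearizing away from the predicted mean.
        mu_i = mu_pred + K @ (z - h(mu_i) - H @ (mu_pred - mu_i))
    Sigma_new = (np.eye(len(mu_pred)) - K @ H) @ Sigma_pred
    return mu_i, Sigma_new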

Implementation in 50 lines (EKF)

import numpy as np

def ekf_step(mu, Sigma, u, z, dt):
    # Assumes f, h, jacobian_f, jacobian_h and noise covariances Q, R
    # are defined for your system.
    # Predict
    mu_pred = f(mu, u, dt)
    F = jacobian_f(mu, u, dt)              # df/dx at the prior mean
    Sigma_pred = F @ Sigma @ F.T + Q

    # Update
    z_pred = h(mu_pred)
    H = jacobian_h(mu_pred)                # dh/dx at the predicted mean
    S = H @ Sigma_pred @ H.T + R           # innovation covariance
    K = Sigma_pred @ H.T @ np.linalg.inv(S)
    mu_new = mu_pred + K @ (z - z_pred)
    Sigma_new = (np.eye(len(mu)) - K @ H) @ Sigma_pred
    return mu_new, Sigma_new

Plug in your specific f and h functions and their analytical Jacobians. That's a working EKF.
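
An end-to-end usage sketch with the unicycle f and h from the recap section (Q, R, the input, and the GPS reading are illustrative placeholders):

import numpy as np

Q = np.diag([0.01, 0.01, 0.005])    # process noise (placeholder values)
R = np.diag([1.0, 1.0])             # GPS position noise (placeholder values)

def jacobian_f(x, u, dt):
    # Analytical Jacobian of the unicycle f.
    theta, v = x[2], u[0]
    return np.array([
        [1.0, 0.0, -v * np.sin(theta) * dt],
        [0.0, 1.0,  v * np.cos(theta) * dt],
        [0.0, 0.0,  1.0],
    ])

def jacobian_h(x):
    # h observes (x, y) directly.
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])

mu, Sigma = np.zeros(3), np.eye(3)
mu, Sigma = ekf_step(mu, Sigma, u=np.array([1.0, 0.1]),
                     z=np.array([0.11, 0.01]), dt=0.1)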

Real-world considerations

  • Numerical stability: covariance matrices can lose positive-definiteness due to rounding. Re-symmetrize after each update: Σ ← (Σ + Σᵀ)/2. Add tiny diagonal regularization if Cholesky decomposition fails.
  • Joseph form: Σ = (I − K·H)·Σ_pred·(I − K·H)ᵀ + K·R·Kᵀ is numerically more stable than the (I − K·H)·Σ_pred form used above. Prefer it for production code (see the sketch after this list).
  • Outlier rejection: validate measurements before updating. An innovation z − z_pred whose Mahalanobis distance under S is above a threshold means "this measurement disagrees with the model — reject it." Cuts the impact of GPS spikes, IMU glitches.
  • Initial state: get the initial mean and covariance right. Wrong initial covariance can take the filter many seconds to recover.
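
A sketch combining the Joseph-form update with Mahalanobis gating (the gate value 9.21 is the chi-square 99% threshold for a 2-dimensional measurement; adjust it for your measurement size):

import numpy as np

def gated_joseph_update(mu_pred, Sigma_pred, z, z_pred, H, R, gate=9.21):
    S = H @ Sigma_pred @ H.T + R
    innov = z - z_pred
    # Mahalanobis gate: innovᵀ·S⁻¹·innov above threshold means outlier.
    if innov @ np.linalg.solve(S, innov) > gate:
        return mu_pred, Sigma_pred        # reject: skip the update
    K = Sigma_pred @ H.T @ np.linalg.inv(S)
    I_KH = np.eye(len(mu_pred)) - K @ H
    # Joseph form stays positive-semidefinite under rounding.
    Sigma_new = I_KH @ Sigma_pred @ I_KH.T + K @ R @ K.T
    Sigma_new = 0.5 * (Sigma_new + Sigma_new.T)   # re-symmetrize
    return mu_pred + K @ innov, Sigma_new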

Where EKF/UKF live in modern robotics

  • Visual-inertial odometry: MSCKF, ROVIO, VINS-Mono — all EKF-based. State includes pose + velocity + IMU bias.
  • GPS/INS fusion: every commercial GNSS-INS unit uses an EKF. Some research-grade systems use UKF.
  • SLAM front-ends: many earlier SLAM systems were EKF-SLAM. Modern systems use factor graphs (sparse least-squares) for scale, but the underlying math is similar.
  • Robot pose estimation: the ROS robot_localization package (commonly paired with Nav2) is a configurable EKF/UKF.

When even UKF isn't enough

  • Multimodal posteriors ("I'm in aisle 2 OR aisle 5" — two modes). A single Gaussian can't represent them. Use a particle filter or a Gaussian mixture.
  • Heavy-tailed, non-Gaussian sensor noise. Use Student-t or other robust filters.
  • Sparse structure (the problem factorizes nicely across many landmarks). Factor graphs beat a dense-covariance EKF.

Exercise

Implement an EKF for tracking a unicycle robot with state (x, y, θ) from wheel-odometry inputs and noisy GPS measurements. Add a UKF version of the same problem. Drive a simulated trajectory; compare estimated paths and reported uncertainty ellipsoids. The two converge to similar answers in mild scenarios but diverge when nonlinearity is strong (sharp turns, fast rotations).

Next

Particle filters — what to do when your belief stops being Gaussian.
