RobotForge

MPC as a motion planner

When the line between control and planning blurs. MPC re-plans every tick over a finite horizon, replacing a slow planner + fast controller with a single layer that does both. The pattern autonomous cars adopted.

#planning #mpc #control

Classical robotics splits "what trajectory should I follow?" (planning) from "how do I follow it?" (control). MPC blurs that line — at every control tick, it re-plans the next H seconds of trajectory. For systems with fast dynamics in a changing environment (cars, drones, quadrupeds), MPC is increasingly the architecture instead of a separate planner + controller.

The shift in perspective

Traditional pipeline:

Planner (1 Hz) → produces trajectory → Controller (100 Hz) → tracks it

MPC pipeline:

MPC (10–100 Hz) → re-plans + tracks in one optimization

The MPC's horizon (1–5 seconds typical) is short enough to update fast, long enough to anticipate. The planner-controller boundary disappears.

Why this changes things

  • Reactivity: a person walks in front of the car. Classical: planner finds a new trajectory in 500 ms; controller still tracks the old one until then. MPC: at the next 50 ms tick, the new obstacle is in the cost function; trajectory adjusts immediately.
  • Constraint awareness: hard limits (joint, torque, obstacle clearance) are enforced at every control tick, not just by the planner.
  • Trajectory smoothness: each new MPC solve warm-starts from the previous; consecutive trajectories differ by tiny perturbations.
  • Less hand-tuning: cost function captures intent; one layer to tune instead of two.
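The reactivity point is worth making concrete. A sketch of a stage cost that is rebuilt from the current obstacle list at every tick — function names, weights, and the clearance margin here are illustrative choices, not a prescribed formulation:

```python
import numpy as np

# Hypothetical stage cost: tracking plus a soft obstacle penalty rebuilt
# from the *current* obstacle list at every tick. A newly detected
# pedestrian changes the very next solve -- no separate re-plan request.
def stage_cost(x, u, x_ref, obstacles, w_track=10.0, w_obs=100.0, margin=1.0):
    cost = w_track * np.sum((x[:2] - x_ref[:2]) ** 2) + 0.1 * np.sum(u ** 2)
    for ox, oy in obstacles:                       # obstacles known this tick
        d = np.hypot(x[0] - ox, x[1] - oy)
        cost += w_obs * max(0.0, margin - d) ** 2  # penalize clearance < margin
    return cost

# Tick t: road clear.  Tick t+1: pedestrian detected at (5, 1).
x = np.array([5.0, 0.8, 0.0, 3.0])
x_ref = np.array([5.0, 0.0, 0.0, 3.0])
c_clear = stage_cost(x, np.zeros(2), x_ref, obstacles=[])
c_ped = stage_cost(x, np.zeros(2), x_ref, obstacles=[(5.0, 1.0)])
```

The pedestrian term inflates the cost of the current path immediately, so the next solve steers away without any planner hand-off.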

The cost

MPC's beautifully unified architecture has two real downsides:

  • Compute: solving a non-trivial optimization at 50+ Hz is hard. Custom solvers (acados), GPU, or simplification (linear MPC) are usually required.
  • Local horizon: a 5-second horizon doesn't see beyond the corner. For long-horizon goals, you still need a slow global planner setting waypoints.

Production stacks combine: slow global planner sets goals; MPC does both local planning and control to those goals.
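The two-rate structure can be sketched in a few lines. The function names and rates below are illustrative stand-ins, not a real stack's API:

```python
import numpy as np

# Two-rate stack sketch: a slow global planner refreshes the goal waypoint;
# MPC re-plans toward the latest waypoint at every control tick.
def global_plan(state):
    """Stand-in for A*/lattice search: here, just a fixed goal."""
    return np.array([10.0, 10.0])

def mpc_step(state, waypoint, dt=0.02):
    """Stand-in for the MPC solve: crude proportional step toward waypoint."""
    return state + dt * 0.5 * (waypoint - state)

state = np.zeros(2)
waypoint = global_plan(state)
for tick in range(500):                  # 50 Hz control loop, 10 s of sim
    if tick % 50 == 0:                   # 1 Hz: refresh the global goal
        waypoint = global_plan(state)
    state = mpc_step(state, waypoint)    # 50 Hz: local plan + control
```

The key design choice is that the slow loop only ever hands the fast loop a goal, never a full trajectory — the MPC owns the trajectory.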

Where this is the production answer

  • Autonomous cars: lane-keeping, adaptive cruise, lane changes. All MPC. Tesla, Cruise, Waymo all use variations.
  • Drone agility: aggressive trajectory following requires re-planning on the millisecond timescale. MPC handles it; classical planner + controller doesn't.
  • Quadruped locomotion: ANYmal's whole-body MPC, Boston Dynamics' Spot. Re-plans gait + body trajectory every 1–10 ms.
  • Humanoid balance: capture point + zero moment point methods are linear MPC variants.

The discretization question

MPC inherits trajectory-optimization's discretization choices:

  • Linear MPC: linearize dynamics around a reference; QP at each tick. Very fast (1 ms typical); industrial-grade. Loses accuracy on aggressive maneuvers.
  • Nonlinear MPC (NMPC): full nonlinear dynamics; NLP at each tick. Slower (5–50 ms) but accurate; covers everything.
  • SQP-based MPC (real-time iteration): each tick runs a single SQP iteration — one QP — warm-started from the previous tick's solution. Near-NMPC accuracy at near-linear-MPC speed.

For cars at highway speeds: linear MPC. For drones doing flips: nonlinear MPC.
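To make the linear-MPC option concrete, here is one way to Euler-discretize and linearize a simple [x, y, theta, v] model with controls [a, omega] around a reference state — after this step, each tick is a QP instead of an NLP (the dt and reference values are illustrative):

```python
import numpy as np

# Linearize xdot = [v*cos(th), v*sin(th), omega, a] around a reference
# state, then forward-Euler discretize: x_{k+1} ≈ A x_k + B u_k.
def linearize(x_ref, dt=0.05):
    _, _, theta, v = x_ref
    A_c = np.array([[0, 0, -v * np.sin(theta), np.cos(theta)],
                    [0, 0,  v * np.cos(theta), np.sin(theta)],
                    [0, 0, 0, 0],
                    [0, 0, 0, 0]])
    B_c = np.array([[0, 0],
                    [0, 0],
                    [0, 1],    # theta_dot = omega
                    [1, 0]])   # v_dot = a
    A = np.eye(4) + dt * A_c   # forward-Euler discretization
    B = dt * B_c
    return A, B

A, B = linearize(np.array([0.0, 0.0, 0.0, 10.0]))  # highway-speed reference
```

The accuracy loss on aggressive maneuvers comes exactly from here: A and B are only valid near the reference, and a drifting drone leaves that neighborhood fast.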

The receding horizon

At each tick:

  1. Solve for the next H steps of trajectory.
  2. Apply only the first control input.
  3. At the next tick, re-solve from the new state.

The "discard the rest" step feels wasteful but isn't — disturbances and modeling errors make the predicted future uncertain, so re-solving from the fresh measured state beats committing to a stale plan.

Implementing in 50 lines (CasADi)

import casadi as ca
import numpy as np

# State: [x, y, theta, v]; control: [a, omega] (unicycle model)
def make_mpc(N=20, dt=0.1):
    opti = ca.Opti()
    X = opti.variable(4, N+1)
    U = opti.variable(2, N)
    X_init = opti.parameter(4)
    X_ref = opti.parameter(4)       # reference state to track

    # Unicycle dynamics, forward-Euler discretized
    for k in range(N):
        x_dot = ca.vertcat(
            X[3, k] * ca.cos(X[2, k]),
            X[3, k] * ca.sin(X[2, k]),
            U[1, k],
            U[0, k],
        )
        opti.subject_to(X[:, k+1] == X[:, k] + dt * x_dot)

    opti.subject_to(X[:, 0] == X_init)
    opti.subject_to(opti.bounded(-2, U[0, :], 2))   # acceleration limits
    opti.subject_to(opti.bounded(-1, U[1, :], 1))   # turn-rate limits

    # Cost: track the reference state with low control effort
    Q = np.diag([10, 10, 1, 1])
    R = np.diag([0.1, 0.1])
    cost = sum(ca.mtimes([(X[:, k] - X_ref).T, Q, X[:, k] - X_ref]) +
               ca.mtimes([U[:, k].T, R, U[:, k]])
               for k in range(N))
    opti.minimize(cost)
    opti.solver('ipopt', {'print_time': False, 'ipopt': {'print_level': 0}})
    return opti, X, U, X_init, X_ref

def simulate(state, u, dt):
    """Plant stand-in: integrate the same unicycle dynamics one step."""
    x, y, theta, v = state
    return state + dt * np.array([v * np.cos(theta), v * np.sin(theta),
                                  u[1], u[0]])

mpc, X, U, X_init, X_ref = make_mpc()

# Control loop
state = np.array([0.0, 0.0, 0.0, 0.0])
goal = np.array([5.0, 5.0, 0.0, 0.0])
for t in range(200):
    mpc.set_value(X_init, state)
    mpc.set_value(X_ref, goal)
    sol = mpc.solve()
    u_now = sol.value(U[:, 0])          # apply only the first input
    state = simulate(state, u_now, dt=0.1)
    mpc.set_initial(X, sol.value(X))    # warm start the next solve
    mpc.set_initial(U, sol.value(U))

That's a working MPC for a unicycle model. Run at 10 Hz; track a reference; respect bounds. The skeleton scales to drones, quadrupeds, manipulators.

Solver choice for production

  • OSQP: linear MPC at 1 kHz on embedded hardware. The default for autonomous-car steering.
  • acados: nonlinear MPC, generates C code for embedded targets. Used in PX4-based drones.
  • CasADi + IPOPT: prototyping; runs in Python, slower at runtime.
  • cuRobo (NVIDIA): GPU-based MPC for arm and mobile manipulation. Production-grade.

Common gotchas

  • Solver timeout in worst case: NMPC sometimes takes 100 ms instead of 10. Set hard timeouts; have a fallback (last solution, safe-stop).
  • Warm-starting matters: MPC quality is dominated by warm-start. Pass the previous tick's solution as the next tick's initial guess.
  • Horizon length tradeoff: too short → no anticipation; too long → slow + far-future predictions are unreliable. 1–3 seconds typical for cars; 0.5–1 sec for drones.
  • Tuning Q vs R: same as LQR. Drives behavior; spend tuning time here.
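The first two gotchas combine into one pattern: the shifted previous solution is both the warm start and the fallback. A sketch, with a hypothetical solver interface standing in for the real call:

```python
import numpy as np

# Timeout-fallback pattern (hypothetical solver interface): if the solve
# exceeds its budget or fails, fall back to the previous plan shifted one
# step -- which is exactly the warm start anyway.
def mpc_tick(solve, state, prev_U):
    """solve(state, U_guess) -> (2, N) plan; may raise on timeout/failure."""
    U_guess = np.hstack([prev_U[:, 1:], prev_U[:, -1:]])  # shift, repeat last
    try:
        U = solve(state, U_guess)
    except Exception:          # timeout / infeasible: reuse shifted plan
        U = U_guess
    return U[:, 0], U          # apply first input, keep plan for next tick

def failing_solver(state, U_guess):
    raise RuntimeError("solver exceeded its time budget")

prev = np.tile(np.array([[1.0], [0.5]]), (1, 20))
u0, plan = mpc_tick(failing_solver, np.zeros(4), prev)
```

One shifted-plan fallback is fine for a missed tick; repeated failures should escalate to a safe-stop behavior instead.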

The MPC + RL trend

Modern hybrids combine MPC's structure with RL's learned components:

  • MPC structure for stability + safety.
  • Neural network learns the cost function or value function.
  • Best of both: provable bounds, learned data efficiency.
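One common shape this takes is a learned terminal cost: keep the MPC machinery, but let a trained value function price the end of the horizon. A minimal sketch — here the "learned" V(x) is a quadratic stand-in, not an actual trained network:

```python
import numpy as np

# Hybrid sketch: MPC structure with a learned terminal cost. The stand-in
# quadratic below represents a value-function network trained by RL.
def learned_value(x):
    """Hypothetical stand-in for a trained value-function network."""
    return 5.0 * np.sum(x ** 2)

def horizon_cost(xs, us, Q, R):
    stage = sum(x @ Q @ x + u @ R @ u for x, u in zip(xs[:-1], us))
    return stage + learned_value(xs[-1])   # learned cost closes the horizon

Q, R = np.diag([10.0, 10.0]), np.diag([0.1, 0.1])
xs = [np.array([1.0, 0.0]), np.array([0.5, 0.0]), np.array([0.2, 0.0])]
us = [np.array([-0.5, 0.0]), np.array([-0.3, 0.0])]
c = horizon_cost(xs, us, Q, R)
```

The appeal: the short horizon stays cheap to solve, while the learned terminal term supplies the long-horizon judgment the horizon can't see.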

"Differentiable MPC," "learned MPC," and "actor-critic with MPC" are all variants in 2024–26 papers. Production-ready for narrow tasks; emerging for general use.

Exercise

Implement the MPC above. Track a sinusoidal reference (an evasive lane change). Compare to a pure pursuit controller on the same reference. MPC handles tighter bends within the same actuator limits; pure pursuit overshoots or undershoots.
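One way to build the lane-change reference, if you want a starting point — a raised-cosine blend in y at constant forward speed. The speed, lane width, and duration are illustrative choices, not prescribed by the exercise:

```python
import numpy as np

# Lane-change reference: y blends from 0 to lane_width over T seconds
# via a raised cosine; x advances at constant speed v.
def lane_change_ref(t, v=5.0, lane_width=3.5, T=3.0):
    s = np.clip(t, 0.0, T) / T
    y = lane_width * 0.5 * (1 - np.cos(np.pi * s))
    y_dot = lane_width * 0.5 * np.pi / T * np.sin(np.pi * s)
    theta = np.arctan2(y_dot, v)            # heading along the reference
    return np.array([v * t, y, theta, v])   # [x, y, theta, v] reference

refs = [lane_change_ref(k * 0.1) for k in range(50)]
```

Feed `lane_change_ref(t + k * dt)` in as the stage-k reference so the MPC sees the maneuver coming inside its horizon.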

Next

Task and motion planning (TAMP) — combining discrete logic ("what to do") with continuous motion ("how to move"). The frontier of long-horizon autonomy.
