MPC as a motion planner
When the line between control and planning blurs. MPC re-plans every tick over a finite horizon, replacing a slow planner + fast controller with a single layer that does both. The pattern autonomous cars adopted.
Classical robotics splits "what trajectory should I follow?" (planning) from "how do I follow it?" (control). MPC blurs that line — at every control tick, it re-plans the next H seconds of trajectory. For systems with fast dynamics in a changing environment (cars, drones, quadrupeds), MPC is increasingly the architecture instead of a separate planner + controller.
The shift in perspective
Traditional pipeline:
Planner (1 Hz) → produces trajectory → Controller (100 Hz) → tracks it
MPC pipeline:
MPC (10–100 Hz) → re-plans + tracks in one optimization
The MPC's horizon (1–5 seconds typical) is short enough to update fast, long enough to anticipate. The planner-controller boundary disappears.
Why this changes things
- Reactivity: a person walks in front of the car. Classical: planner finds a new trajectory in 500 ms; controller still tracks the old one until then. MPC: at the next 50 ms tick, the new obstacle is in the cost function; trajectory adjusts immediately.
- Constraint awareness: hard limits (joint, torque, obstacle clearance) are enforced at every control tick, not just by the planner.
- Trajectory smoothness: each new MPC solve warm-starts from the previous; consecutive trajectories differ by tiny perturbations.
- Less hand-tuning: cost function captures intent; one layer to tune instead of two.
The cost
MPC's beautifully unified architecture has two real downsides:
- Compute: solving a non-trivial optimization at 50+ Hz is hard. Custom solvers (acados), GPU, or simplification (linear MPC) are usually required.
- Local horizon: a 5-second horizon doesn't see beyond the corner. For long-horizon goals, you still need a slow global planner setting waypoints.
Production stacks combine: slow global planner sets goals; MPC does both local planning and control to those goals.
Where this is the production answer
- Autonomous cars: lane-keeping, adaptive cruise, lane changes. All MPC. Tesla, Cruise, Waymo all use variations.
- Drone agility: aggressive trajectory following requires re-planning on the millisecond timescale. MPC handles it; classical planner + controller doesn't.
- Quadruped locomotion: ANYmal's whole-body MPC, Boston Dynamics' Spot. Re-plans gait + body trajectory every 1–10 ms.
- Humanoid balance: capture point + zero moment point methods are linear MPC variants.
The discretization question
MPC inherits trajectory-optimization's discretization choices:
- Linear MPC: linearize dynamics around a reference; QP at each tick. Very fast (1 ms typical); industrial-grade. Loses accuracy on aggressive maneuvers.
- Nonlinear MPC (NMPC): full nonlinear dynamics; NLP at each tick. Slower (5–50 ms) but accurate; covers everything.
- SQP-based MPC (real-time iteration): each tick runs a single QP iteration of the NLP, warm-started from the previous tick; near-NMPC accuracy at near-linear-MPC speed.
For cars at highway speeds: linear MPC. For drones doing flips: nonlinear MPC.
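Why linear MPC is so fast is easiest to see by condensing the whole horizon into one least-squares problem. A minimal sketch for a 1-D double integrator (the `linear_mpc_step` name, the weights, and the unconstrained formulation are illustrative assumptions, not from any particular library):

```python
import numpy as np

def linear_mpc_step(x0, x_ref, N=20, dt=0.1, q=10.0, r=0.1):
    """One tick of condensed linear MPC for a 1-D double integrator.

    Toy example: x = [position, velocity], u = acceleration.
    The horizon is condensed into a single unconstrained least-squares
    problem -- this is what makes linear MPC fast enough for kHz rates.
    """
    A = np.array([[1.0, dt], [0.0, 1.0]])
    B = np.array([[0.5 * dt**2], [dt]])
    n, m = 2, 1

    # Prediction matrices: X_pred = Sx @ x0 + Su @ U,
    # where block k is x_{k+1} = A^(k+1) x0 + sum_j A^(k-j) B u_j
    Sx = np.zeros((n * N, n))
    Su = np.zeros((n * N, m * N))
    Ak = np.eye(n)
    for k in range(N):
        Ak = A @ Ak                                  # A^(k+1)
        Sx[n*k:n*k+n, :] = Ak
        for j in range(k + 1):
            Su[n*k:n*k+n, m*j:m*j+m] = np.linalg.matrix_power(A, k - j) @ B

    # Weighted least squares:
    # min ||sqrt(q) (X_pred - X_ref)||^2 + ||sqrt(r) U||^2
    Xref = np.tile(x_ref, N)
    H = np.vstack([np.sqrt(q) * Su, np.sqrt(r) * np.eye(m * N)])
    b = np.concatenate([np.sqrt(q) * (Xref - Sx @ x0), np.zeros(m * N)])
    U, *_ = np.linalg.lstsq(H, b, rcond=None)
    return U[0]  # apply only the first input (receding horizon)
```

Adding the input bounds back turns this into the QP that solvers like OSQP handle at kilohertz rates.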
The receding horizon
At each tick:
- Solve for the next H steps of trajectory.
- Apply only the first control input.
- At the next tick, re-solve from the new state.
The "discard the rest" feels wasteful but isn't — disturbances and modeling errors mean the predicted future is uncertain. Re-solving with fresh state always beats committing.
Implementing in 50 lines (CasADi)
import casadi as ca
import numpy as np

# State: [x, y, theta, v]; control: [a, omega]
def make_mpc(N=20, dt=0.1):
    opti = ca.Opti()
    X = opti.variable(4, N + 1)
    U = opti.variable(2, N)
    X_init = opti.parameter(4)
    X_ref = opti.parameter(4)  # reference state to track

    # Kinematic unicycle dynamics, discretized with forward Euler
    for k in range(N):
        x_dot = ca.vertcat(
            X[3, k] * ca.cos(X[2, k]),
            X[3, k] * ca.sin(X[2, k]),
            U[1, k],
            U[0, k],
        )
        opti.subject_to(X[:, k + 1] == X[:, k] + dt * x_dot)

    opti.subject_to(X[:, 0] == X_init)
    opti.subject_to(opti.bounded(-2, U[0, :], 2))  # acceleration limits
    opti.subject_to(opti.bounded(-1, U[1, :], 1))  # turn-rate limits

    # Cost: track the reference state with low control effort
    Q = np.diag([10, 10, 1, 1])
    R = np.diag([0.1, 0.1])
    cost = sum(ca.mtimes([(X[:, k] - X_ref).T, Q, X[:, k] - X_ref]) +
               ca.mtimes([U[:, k].T, R, U[:, k]])
               for k in range(N))
    opti.minimize(cost)
    opti.solver('ipopt', {'print_time': False}, {'print_level': 0})
    return opti, X, U, X_init, X_ref

def simulate(state, u, dt):
    # Stand-in plant: the same Euler model the MPC predicts with
    v, theta = state[3], state[2]
    return state + dt * np.array([v * np.cos(theta), v * np.sin(theta), u[1], u[0]])

mpc, X, U, X_init, X_ref = make_mpc()
mpc.set_value(X_ref, [5, 5, 0, 0])  # drive to (5, 5) and stop

# Control loop
state = np.array([0.0, 0.0, 0.0, 0.0])
for t in range(200):
    mpc.set_value(X_init, state)
    sol = mpc.solve()
    u_now = sol.value(U[:, 0])
    state = simulate(state, u_now, dt=0.1)
    mpc.set_initial(X, sol.value(X))  # warm start the next solve
    mpc.set_initial(U, sol.value(U))
That's a working MPC for a kinematic unicycle model. Run at 10 Hz; track a reference; respect bounds. The skeleton scales to drones, quadrupeds, manipulators.
Solver choice for production
- OSQP: linear MPC at 1 kHz on embedded hardware. The default for autonomous-car steering.
- acados: nonlinear MPC, generates C code for embedded targets. Used in PX4-based drones.
- CasADi + IPOPT: prototyping; runs in Python, slower at runtime.
- cuRobo (NVIDIA): GPU-based MPC for arm and mobile manipulation. Production-grade.
Common gotchas
- Solver timeout in worst case: NMPC sometimes takes 100 ms instead of 10. Set hard timeouts; have a fallback (last solution, safe-stop).
- Warm-starting matters: MPC quality is dominated by warm-start. Pass the previous tick's solution as the next tick's initial guess.
- Horizon length tradeoff: too short → no anticipation; too long → slow + far-future predictions are unreliable. 1–3 seconds typical for cars; 0.5–1 sec for drones.
- Tuning Q vs R: same as LQR. Drives behavior; spend tuning time here.
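The timeout gotcha above has a standard fallback pattern: if this tick's solve fails or blows its budget, re-use last tick's plan shifted by one step. A solver-agnostic sketch (the `control_with_fallback` helper and its `solve_fn` callable are hypothetical names, not from any library):

```python
import numpy as np

def control_with_fallback(solve_fn, prev_plan):
    """Fallback pattern for MPC solver failures (sketch).

    `solve_fn` is any callable returning an (m, N) control plan or
    raising on failure/timeout. On failure, re-use the previous plan
    shifted forward one step, holding the last input to pad the horizon.
    """
    try:
        plan = solve_fn()  # fresh solve this tick
    except Exception:
        # Shift last tick's plan one step; repeat the final input
        plan = np.column_stack([prev_plan[:, 1:], prev_plan[:, -1]])
    return plan[:, 0], plan  # apply the first input, keep the plan
```

The shifted plan is also exactly what you feed back as the next tick's warm start, so the two gotchas share one mechanism.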
The MPC + RL trend
Modern hybrids combine MPC's structure with RL's learned components:
- MPC structure for stability + safety.
- Neural network learns the cost function or value function.
- Best of both: provable bounds, learned data efficiency.
"Differentiable MPC," "learned MPC," and "actor-critic with MPC" are all variants in 2024–26 papers. Production-ready for narrow tasks; emerging for general use.
Exercise
Implement the MPC above. Track a sinusoidal reference (an evasive lane change). Compare to a pure pursuit controller on the same reference. MPC handles tighter bends within the same actuator limits; pure pursuit overshoots or undershoots.
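For the comparison, a minimal pure pursuit baseline can look like this (a sketch; the `pure_pursuit_omega` helper and its lookahead-point selection are illustrative assumptions):

```python
import numpy as np

def pure_pursuit_omega(state, path_xy, lookahead=1.0):
    """Minimal pure pursuit steering law for the baseline comparison.

    state = [x, y, theta, v] as in the MPC example; path_xy is an (M, 2)
    array of reference points. Returns the turn rate omega.
    """
    x, y, theta, v = state
    # Pick the first path point at least `lookahead` away (else the last)
    d = np.linalg.norm(path_xy - np.array([x, y]), axis=1)
    idx = np.argmax(d >= lookahead) if np.any(d >= lookahead) else len(path_xy) - 1
    gx, gy = path_xy[idx]
    # Lateral offset of the goal point in the vehicle frame
    dx, dy = gx - x, gy - y
    y_veh = -np.sin(theta) * dx + np.cos(theta) * dy
    # Pure pursuit curvature: kappa = 2 * y_veh / Ld^2; omega = v * kappa
    Ld = max(np.hypot(dx, dy), 1e-6)
    return v * 2.0 * y_veh / Ld**2
```

Note that pure pursuit only steers; it knows nothing about the turn-rate bound, which is exactly where the MPC wins on tight bends.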
Next
Task and motion planning (TAMP) — combining discrete logic ("what to do") with continuous motion ("how to move"). The frontier of long-horizon autonomy.