State-space control: from PID to LQR
PID works for one variable. State-space and LQR are the principled way to control many variables at once, with one cost function instead of six tuning knobs.
A quadrotor has three rotation axes and three translation axes — six things to control at once. You can stack six independent PIDs, and many quadrotors fly this way, but performance is limited and tuning is a nightmare. State-space control gives you one matrix equation that controls everything together. LQR tells you the optimal gains. Here's the leap.
The setup: state-space form
Any linear system can be written as:
ẋ = A·x + B·u
y = C·x + D·u
Where:
- x is the state vector — everything you need to describe the system right now (position, velocity, angle, angular velocity, etc.)
- u is the control input vector
- A describes how the state evolves on its own
- B describes how control inputs affect the state
- C and D describe what you measure (often C = I, D = 0)
Example — a simple cart with mass m, position p, velocity v, force input F:
x = [p; v]
ẋ = A·x + B·u where A = [0 1; 0 0] and B = [0; 1/m]
First row: "position's derivative is velocity." Second row: "velocity's derivative is force/mass." The math is banal — the value is that it scales.
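To make the cart model concrete, here's a minimal sketch in plain NumPy (mass of 1 kg and a constant 2 N push are assumed for illustration) that builds A and B and Euler-integrates the open-loop dynamics:

```python
import numpy as np

m = 1.0                               # cart mass in kg (assumed)
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])            # position's derivative is velocity
B = np.array([[0.0],
              [1.0 / m]])             # velocity's derivative is force/mass

x = np.array([[0.0], [0.0]])          # start at rest at the origin
u = np.array([[2.0]])                 # constant 2 N push
dt = 0.01
for _ in range(100):                  # simulate 1 second
    x = x + dt * (A @ x + B @ u)      # forward-Euler step of x' = A·x + B·u

print(x.ravel())  # ≈ [0.99, 2.0]: close to the exact p = ½at² = 1 m, v = at = 2 m/s
```

The slight gap between 0.99 m and the exact 1 m is forward-Euler discretization error, not a modeling error.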
State-feedback control
Pick a gain matrix K. Control law: u = −K·x. Substitute:
ẋ = A·x − B·K·x = (A − B·K)·x
The closed-loop dynamics are governed by (A − B·K). If its eigenvalues all have negative real parts (stable), the system converges to zero from any initial state. The game: pick K to place those eigenvalues where you want.
PID corresponds to a specific form of K. State-space lets you pick any K — including one that couples multiple inputs and outputs.
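To see the eigenvalue game in action, here's a sketch using the cart model from above, with hand-picked gains (chosen for illustration to place the eigenvalues at −2 and −3):

```python
import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])          # cart with mass 1 kg
K = np.array([[6.0, 5.0]])            # hand-picked gains, for illustration

eigs = np.linalg.eigvals(A - B @ K)   # closed-loop eigenvalues
print(sorted(eigs.real))              # [-3, -2]: both negative, so stable
```

Swap in K = [[0, 5]] and one eigenvalue lands at zero — the position never converges. That sensitivity is exactly why the next section matters.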
LQR: principled gain selection
How do you pick K? Hand-tune? Place eigenvalues? Both work; neither is principled. LQR (Linear-Quadratic Regulator) does it by optimization.
You write down a cost function:
J = ∫ (xᵀ·Q·x + uᵀ·R·u) dt
- Q penalizes being far from zero in the state. Larger Q → tighter tracking.
- R penalizes large control effort. Larger R → softer control, saves actuators.
LQR computes the optimal K that minimizes J. The solution involves the algebraic Riccati equation — don't do it by hand, let SciPy do it:
from scipy.linalg import solve_continuous_are
import numpy as np
A = np.array([[0, 1], [0, 0]])
B = np.array([[0], [1/1.0]]) # mass = 1 kg
Q = np.diag([10.0, 1.0]) # care about position 10x more than velocity
R = np.array([[0.1]]) # modest control penalty
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P
print('LQR gain K =', K)
Two lines of Python replace hours of hand-tuning. The output K is optimal for the cost you specified.
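One sanity check worth running on the result: an LQR gain always stabilizes the closed loop (given a controllable system), and you can verify that directly. Here's the snippet above extended with an eigenvalue check:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Same cart model and weights as the snippet above.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.diag([10.0, 1.0])
R = np.array([[0.1]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P        # K ≈ [[10, 5.48]] for this Q, R

eigs = np.linalg.eigvals(A - B @ K)
print('K =', K)
print('closed-loop eigenvalues:', eigs)   # all real parts negative
```

For this particular double-integrator the gain even has a closed form, K = [√(q₁/r), √(q₂/r + 2√(q₁/r))], which matches the numerical answer.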
What the cost matrices actually mean
Q and R are the tuning knobs. A few rules of thumb:
- Start with Q = I, R = I. See what the controller does.
- Bigger Q elements for states you care about (position usually matters more than velocity).
- Bigger R elements for control inputs you want to use sparingly (a big motor you don't want to saturate).
- Scale matters, not absolute value. Multiplying both by 10 changes nothing. The ratio Q/R sets the aggressiveness.
- Units. If position is meters (small numbers) and velocity is m/s (also small), a Q that's diagonal [1, 1] is fine. If position is millimeters and velocity is m/s, you need to rebalance.
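The scale rule is easy to verify numerically. A sketch, reusing the cart model: scaling Q and R by the same factor leaves K unchanged, while growing R alone softens the gains.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])

def lqr_gain(Q, R):
    P = solve_continuous_are(A, B, Q, R)
    return np.linalg.inv(R) @ B.T @ P

Q = np.diag([10.0, 1.0])
R = np.array([[0.1]])

K1 = lqr_gain(Q, R)
K2 = lqr_gain(10 * Q, 10 * R)   # same Q/R ratio → same controller
K3 = lqr_gain(Q, 10 * R)        # bigger R alone → gentler gains

print(np.allclose(K1, K2))      # True
print(np.abs(K3) < np.abs(K1))  # elementwise True
```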
Applying the controller
Once you have K, the control loop is tiny:
# At every control tick:
x = measure_state() # get position, velocity, angle, ...
u = -K @ x # compute control
send_to_actuator(u)
That's it. No integral windup. No derivative filter. One matrix multiply. The hard work is in the off-line K computation; run-time is a few multiplications.
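Here's that loop in simulated form — a sketch assuming the cart model and cost weights from earlier, with measure_state and send_to_actuator replaced by an Euler-integrated plant:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Cart plant and LQR gain as in the earlier snippets.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
R = np.array([[0.1]])
P = solve_continuous_are(A, B, np.diag([10.0, 1.0]), R)
K = np.linalg.inv(R) @ B.T @ P

x = np.array([[1.0], [0.0]])       # start 1 m from the origin
dt = 0.001
for _ in range(5000):              # 5 seconds of control ticks
    u = -K @ x                     # the entire control law
    x = x + dt * (A @ x + B @ u)   # plant integrates forward

print(x.ravel())                   # driven very close to zero
```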
Tracking a non-zero setpoint
Pure LQR drives state to zero. To track a setpoint, subtract the desired state:
u = -K @ (x - x_desired)
For more sophisticated tracking (time-varying references, feedforward), you extend to LQI (with integrator) or LQG (LQR + Kalman filter for estimation). These are next chapters.
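A minimal tracking sketch, again on the cart model, assuming a constant setpoint of 1 m with zero velocity:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
R = np.array([[0.1]])
P = solve_continuous_are(A, B, np.diag([10.0, 1.0]), R)
K = np.linalg.inv(R) @ B.T @ P

x_desired = np.array([[1.0], [0.0]])   # hold 1 m, zero velocity
x = np.zeros((2, 1))
dt = 0.001
for _ in range(5000):
    u = -K @ (x - x_desired)           # regulate the error, not the state
    x = x + dt * (A @ x + B @ u)

print(x.ravel())                       # settles near [1, 0]
```

This works cleanly here because [1 m, 0 m/s] is an equilibrium of the cart with u = 0; setpoints that need steady control effort to hold are where the LQI and feedforward extensions earn their keep.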
Where LQR shines
- Multi-state, multi-input systems. Quadrotors, inverted pendulums, Segways — places where hand-tuning six loops is a nightmare.
- Known dynamics. If you have a decent model of A and B, LQR will do better than PID.
- Systems where you want predictable behavior. LQR is deterministic — the same Q, R always gives the same K.
Where LQR falls short
- Nonlinear systems far from the linearization point. LQR gives you local stability, not global. Robot arms near singularities or walking robots at high speeds need more than LQR.
- Hard constraints. LQR can't enforce "motor torque must stay under 10 N·m." If you need constraints, reach for MPC.
- Unknown dynamics. Garbage A, B → garbage K. You need a model.
Exercise
Simulate an inverted pendulum on a cart. Four states: cart position, cart velocity, pole angle, pole angular velocity. One input: force on the cart. Linearize around the upright equilibrium. Compute LQR with Q = diag([10, 1, 100, 1]) (we care a lot about the pole angle). Simulate the closed-loop system starting with the pole tilted 10° off vertical. Watch it stabilize.
That project takes 30 minutes in Python. It's the classical benchmark for every state-space control course, and seeing it work is the moment LQR stops being abstract.
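If you want a head start, here's a sketch of the linearization-and-gain step under assumed parameters (cart mass 1 kg, point-mass pole of 0.1 kg at 0.5 m, frictionless, angle measured from upright) — the simulation loop is left to you:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

M, m, l, g = 1.0, 0.1, 0.5, 9.81   # assumed cart/pole parameters

# Cart-pole linearized about the upright equilibrium,
# state x = [cart pos, cart vel, pole angle, pole angular vel].
A = np.array([[0, 1, 0,                     0],
              [0, 0, -m * g / M,            0],
              [0, 0, 0,                     1],
              [0, 0, (M + m) * g / (M * l), 0]])
B = np.array([[0], [1 / M], [0], [-1 / (M * l)]])

Q = np.diag([10.0, 1.0, 100.0, 1.0])   # pole angle weighted heaviest
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P

print('K =', K)
print('closed-loop eigenvalues:', np.linalg.eigvals(A - B @ K))
```

Note that the open loop has an eigenvalue at +√((M+m)g/(Ml)) ≈ +4.6 — the pole actively falls — yet one gain matrix stabilizes all four states through a single input.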
Next
Model predictive control — LQR's big sibling. Same philosophy, but computes the optimal action over a finite horizon each step, so it can handle nonlinearity and hard constraints. That's the controller that runs in autonomous cars, drone racing, and modern humanoid gait generators.