Real-time control: RTOS, latency, and jitter
1 kHz control loops, hard deadlines, deterministic timing. The kernel tricks, scheduler tweaks, and architecture choices that separate hobby control from production.
Hobby robotics runs control loops at 50–100 Hz on whatever schedule the OS gives you. Production robotics needs 1 kHz with deterministic timing — the same period every cycle, never a missed deadline. The gap is full of kernel tricks, RTOS choices, lock-free queues, and a different mindset. Here's the working knowledge.
Why real-time matters
Control bandwidth is bounded by control rate. A 100 Hz loop can stably control dynamics up to ~10 Hz; a 1 kHz loop can handle 100 Hz dynamics. Quadrotors, direct-drive arms, legged robots — anything whose mechanical resonance is above 10 Hz needs >100 Hz control to stay stable.
And the worst case dominates. A controller that runs in 0.5 ms average but occasionally spikes to 5 ms can cause oscillation if those spikes coincide with dynamics events. Real-time isn't about being fast on average; it's about being fast every cycle.
Hard vs soft real-time
- Hard real-time: missing a deadline is a system failure. A flight controller missing one cycle = the drone crashes. Surgical robots, automotive ABS, missile guidance.
- Soft real-time: missing occasional deadlines degrades performance but doesn't break things. Video playback, voice calls, most consumer robots.
- Firm real-time: missing deadlines makes the result useless but doesn't crash anything. Robotic perception, sensor fusion.
The boundary determines what tools are appropriate. Hard real-time needs a real-time OS. Soft real-time can run on Linux with PREEMPT_RT or even mainstream Linux with care.
What gets in the way (sources of jitter)
- Scheduler: the OS decides when your thread runs. On standard Linux, this can introduce milliseconds of jitter. Mitigation: real-time priority, kernel thread, dedicated CPU core.
- Garbage collection: Python's GC and Java's stop-the-world pauses can hit 10+ ms. Don't run safety-critical control in GC'd languages.
- Memory allocation: malloc can take unpredictable time. Pre-allocate everything; use object pools.
- Page faults: virtual memory swapping introduces millisecond-scale stalls. Lock pages with mlockall().
- Interrupts: hardware interrupts hijack the CPU briefly. Pin them to a separate core if possible.
- Caches: cache misses are 10–100ns each; cache-cold code paths are slower than cache-hot ones. Warm up the path before it matters.
- Network: anything over TCP has unpredictable latency. Prefer UDP, shared memory, or a low-latency middleware (Cyclone DDS, Zenoh).
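The page-fault item above deserves a concrete sketch: lock everything into RAM and touch every page up front, so the kernel backs the working set with physical frames before the loop starts. The 16 MB pool size is an arbitrary assumption for illustration, and mlockall needs CAP_IPC_LOCK or a raised RLIMIT_MEMLOCK to succeed.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Assumed working-set size for this sketch. */
#define POOL_BYTES (16 * 1024 * 1024)

static char *pool;

/* Allocate and pre-fault the working set, then lock it into RAM.
   Returns 0 on success, -1 on allocation failure, -2 if the pages
   are pre-faulted but could not be locked (insufficient privilege). */
int prefault_memory(void)
{
    pool = malloc(POOL_BYTES);
    if (!pool)
        return -1;
    /* Writing every byte forces a write fault on every page now,
       instead of at a random point inside the control loop. */
    memset(pool, 0, POOL_BYTES);
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -2;
    return 0;
}
```

After this returns 0, neither the pool nor any future allocation (MCL_FUTURE) can be paged out from under the hot loop.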
The RTOS option
For hard real-time on small systems:
- FreeRTOS: most common in embedded; runs on ESP32, STM32, RP2040. Tasks, queues, semaphores. Sub-microsecond context switches.
- Zephyr: Linux Foundation project; richer than FreeRTOS, supports more peripherals, used in Nordic-platform IoT.
- NuttX: powers the PX4 flight stack; full POSIX-like API on tiny MCUs.
- QNX: commercial; used in automotive (BlackBerry IVI, Tesla autopilot supplier code).
For larger systems (industrial robots, autonomous vehicles):
- Linux + PREEMPT_RT: a kernel patch that makes Linux interruptible at almost every point. Real-time guarantees in the 50–200 µs range. Free and broadly compatible.
- Xenomai / RTAI: dual-kernel approach where a real-time kernel runs alongside Linux. Submicrosecond performance but more setup pain.
- VxWorks: commercial RTOS used in aerospace and many industrial robots.
The Linux + PREEMPT_RT recipe
Most modern robotics teams choose this: free, mature, runs the same userspace as regular Linux, plays nicely with ROS 2.
Setup checklist:
- Boot a PREEMPT_RT-patched kernel (Ubuntu 24.04 has it; otherwise build from kernel.org).
- Lock memory: mlockall(MCL_CURRENT | MCL_FUTURE) at process start.
- Pre-allocate everything: no malloc in the hot loop.
- Set the scheduler to FIFO or RR: sched_setscheduler(SCHED_FIFO, prio=80).
- Pin to a dedicated CPU core: sched_setaffinity; isolate the core via the isolcpus= kernel param so the scheduler keeps everything else off it.
- Disable CPU frequency scaling: cpufreq-set --governor performance.
- Disable hyperthreading on the dedicated core.
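The affinity and priority steps from the checklist can be sketched together. RT_CORE and the error codes are assumptions for illustration; pair the core number with isolcpus= on the kernel command line, and note that SCHED_FIFO needs root or CAP_SYS_NICE.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Assumed: core 3 is the isolated core (isolcpus=3 on the cmdline). */
#define RT_CORE 3

/* Pin the calling thread to the isolated core, then raise it to
   SCHED_FIFO priority 80. Returns 0 on success, -1 if the core
   doesn't exist, -2 if priority elevation was refused. */
int enter_realtime(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(RT_CORE, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -1;  /* machine may have fewer cores */

    struct sched_param sp = { .sched_priority = 80 };
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -2;  /* needs root or CAP_SYS_NICE */
    return 0;
}
```

Order matters a little: pin first, then elevate, so the thread never runs at FIFO priority on a core shared with ordinary work.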
With these, you can hit ~50 µs worst-case jitter on commodity x86. Good enough for 1 kHz arm control.
The control-loop pattern
// Pre-allocate state, config, scratch buffers BEFORE entering the loop.
#include <sched.h>
#include <sys/mman.h>
#include <time.h>

struct State { /* ... */ } state;

// Lock current and future pages into RAM so the loop never page-faults.
mlockall(MCL_CURRENT | MCL_FUTURE);

// Set real-time FIFO priority.
struct sched_param param = { .sched_priority = 80 };
sched_setscheduler(0, SCHED_FIFO, &param);

// 1 kHz control loop. Sleeping until an absolute deadline (TIMER_ABSTIME)
// prevents drift: a late wakeup in one cycle doesn't push back the next.
struct timespec deadline;
clock_gettime(CLOCK_MONOTONIC, &deadline);
while (running) {
    read_sensors(&state);     // bounded, no allocation
    compute_control(&state);  // bounded, no allocation
    write_actuators(&state);  // bounded, no allocation

    deadline.tv_nsec += 1000000;  // advance deadline by 1 ms
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_nsec -= 1000000000L;
        deadline.tv_sec += 1;
    }
    clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);
}
Three rules in the loop body: bounded execution time, no allocation, no blocking I/O. That's it.
The two-process pattern (what real robots actually do)
Most production robots split:
- Real-time control process (C++, single core, FIFO priority, 1 kHz). Reads encoders, runs PID/MPC/inverse dynamics, writes torques. No file I/O, no logging, no networking. Communicates with the rest of the system via lock-free shared memory.
- Non-real-time application process (Python or C++, multi-core, normal priority). Vision, planning, state machines, ROS topics, logging, web UI. Reads/writes the shared-memory ring buffer at whatever rate it can manage.
The lock-free ring buffer between them is the critical piece. Use concurrentqueue (moodycamel), an atomics-based SPSC queue, or LCM. Never a mutex: a mutex can block, and priority inversion can stall the real-time side.
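A minimal single-producer/single-consumer queue in C11 atomics gives a sense of what sits in that shared-memory region. The slot type and capacity here are illustrative assumptions; the key properties are that push and pop never block and each index is written by exactly one side.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Power of two so masking replaces modulo. */
#define SLOTS 256

/* Assumed message type: torques for a 7-DoF arm. */
typedef struct { double torque[7]; } Msg;

typedef struct {
    Msg buf[SLOTS];
    _Atomic unsigned head;  /* written only by the producer */
    _Atomic unsigned tail;  /* written only by the consumer */
} SpscQueue;

/* Producer side: returns false when full — drop, never block. */
bool spsc_push(SpscQueue *q, const Msg *m)
{
    unsigned h = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned t = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (h - t == SLOTS)
        return false;
    q->buf[h & (SLOTS - 1)] = *m;
    /* Release: the slot write is visible before the index advances. */
    atomic_store_explicit(&q->head, h + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when empty. */
bool spsc_pop(SpscQueue *q, Msg *out)
{
    unsigned t = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned h = atomic_load_explicit(&q->head, memory_order_acquire);
    if (t == h)
        return false;
    *out = q->buf[t & (SLOTS - 1)];
    atomic_store_explicit(&q->tail, t + 1, memory_order_release);
    return true;
}
```

The indices free-run and wrap modulo 2^32, so `h - t` is always the correct fill count as long as SLOTS is a power of two. This only works for exactly one producer and one consumer; multi-producer needs a different design.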
Networking and middleware
- Cyclone DDS / iceoryx: zero-copy IPC for ROS 2; bypasses the network stack on local hosts.
- Shared memory + atomics: lower-level, more control, used by EtherCAT-driven industrial arms.
- EtherCAT: industrial real-time fieldbus. Sub-microsecond synchronization across many slaves. Works with Linux + PREEMPT_RT (igh-ethercat) or proprietary masters.
- CAN bus: lower-rate but deterministic. Used in automotive and many older industrial robots.
Measuring jitter
You can't trust real-time without measuring. Tools:
- cyclictest (rt-tests package): runs a high-priority loop, reports max latency. The standard PREEMPT_RT benchmark.
- perf record / ftrace: kernel tracing for finding what stalled your loop.
- Custom: log the time between consecutive loop iterations into a histogram. Anything in the tail is your jitter problem.
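The custom histogram from the last bullet can be sketched in a few lines. One bucket per microsecond up to 1 ms is an arbitrary granularity chosen here; anything that lands in the overflow bucket is a deadline miss worth investigating.

```c
#include <time.h>

/* One bucket per microsecond, up to 1 ms; the extra slot is the tail. */
#define BUCKETS 1000

static long histogram[BUCKETS + 1];

/* Microseconds elapsed from a to b. */
static long elapsed_us(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) * 1000000L
         + (b.tv_nsec - a.tv_nsec) / 1000L;
}

/* Call once per iteration with the measured inter-iteration interval. */
void record_interval(long us)
{
    if (us < 0)
        us = 0;
    histogram[us < BUCKETS ? us : BUCKETS]++;
}
```

In the loop, call clock_gettime(CLOCK_MONOTONIC, ...) each iteration, feed elapsed_us of consecutive stamps into record_interval, and dump the histogram from the non-real-time side. The tail of that histogram is your jitter.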
When to skip all of this
Most hobby robots don't need any of this. An ESP32 running an Arduino loop at 200 Hz driving hobby servos works fine. It matters for modern direct-drive arms and anything else with sub-millisecond control needs.
Build the simple version first. When you measure that timing matters, add real-time only as far as required. Premature real-time is a serious project sink.
Exercise
Boot a PREEMPT_RT Linux kernel. Run cyclictest -t 1 -p 80 -i 1000 for a few minutes. Note the max latency. Now run a heavy CPU load on another core (stress -c N); re-run cyclictest. The max latency should barely change. Then run the load on the same core; max latency spikes 100×. That's why core isolation matters.
That's the Control track done
You've covered the full progression: PID → state-space + LQR → MPC → feedback linearization → impedance/admittance → force/hybrid → visual servoing → Lyapunov → underactuated → real-time. Together, they cover 95% of the control work in real robots. Pick a track and dig deeper, or jump to Manipulation, SLAM, or Learning for adjacent topics.