Co-simulation and hardware-in-the-loop
Mixing real and simulated components — when it's brilliant and when it's a debugging trap. The patterns for testing real robots against simulated worlds, and vice versa.
Hardware-in-the-loop (HIL) testing runs your real controller (firmware, ROS node, autopilot) against a simulated world. Or vice versa: real sensors feeding into a simulated controller. Done well, it catches bugs that pure-sim and pure-hardware testing both miss. Done poorly, it adds confusing latency and timing artifacts that are worse than either alone.
The four configurations
| Configuration | Real | Simulated | Catches |
|---|---|---|---|
| Pure sim | — | Everything | Algorithm bugs |
| Pure hardware | Everything | — | All bugs (slowly) |
| HIL (controller in loop) | Embedded controller / autopilot | Sensors, world | Firmware bugs, real-time issues |
| Software-in-the-loop (SIL) | Robot, sensors | Higher-level controller / planner | Algorithm bugs against real world |
HIL is the most common variant in production: simulate the world, exercise the real firmware. SIL is rare; mostly used by perception teams testing learned policies on real cameras.
The classic HIL setup: drone autopilot
You're developing a drone flight controller (Pixhawk firmware). Pure-hardware testing requires a real drone, a flying space, and risk of crash. Pure-sim testing exercises algorithms but not firmware.
HIL bridges:
- Real Pixhawk hardware on a desk.
- Simulator (Gazebo, jMAVSim) running the drone's dynamics + sensors.
- Communication: simulated sensor data flows into the Pixhawk over its serial / SPI; control commands flow back.
The Pixhawk thinks it's flying a real drone. Firmware bugs (timing, buffer overflows, scheduler issues) surface; so do algorithmic bugs in the controller (PID tuning, MPC solver). The simulated dynamics exercises both.
This setup is the standard PX4 / ArduPilot dev workflow.
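The bridge at the heart of such a loop is simple in structure: pack simulated sensor samples into the wire format the firmware expects, and parse the actuator commands that come back. Here is a minimal sketch of one tick of that loop; the packet layout, the `pack_sensor`/`unpack_actuators` helpers, and the `fake_firmware` stand-in are all illustrative inventions, not the actual MAVLink `HIL_SENSOR` protocol.

```python
import struct

# Hypothetical HIL bridge sketch: the simulator packs IMU samples into a
# binary packet (as firmware would receive over serial) and parses the
# actuator command that comes back. Layout is illustrative, not MAVLink.

SENSOR_FMT = "<Q6f"    # timestamp_us, accel xyz, gyro xyz
ACTUATOR_FMT = "<Q4f"  # timestamp_us, 4 motor commands in [0, 1]

def pack_sensor(t_us, accel, gyro):
    """Serialize one simulated IMU sample for the firmware side."""
    return struct.pack(SENSOR_FMT, t_us, *accel, *gyro)

def unpack_actuators(packet):
    """Parse the firmware's actuator reply back into sim inputs."""
    fields = struct.unpack(ACTUATOR_FMT, packet)
    return fields[0], list(fields[1:])

def fake_firmware(packet):
    """Stand-in for the real autopilot: hover logic that nudges all motors
    toward a setpoint proportional to vertical-acceleration error."""
    t_us, ax, ay, az, gx, gy, gz = struct.unpack(SENSOR_FMT, packet)
    thrust = min(max(0.5 + 0.05 * (9.81 + az), 0.0), 1.0)
    return struct.pack(ACTUATOR_FMT, t_us, *([thrust] * 4))

# One tick of the bridge loop: sim -> firmware -> sim.
pkt = pack_sensor(1_000, (0.0, 0.0, -9.81), (0.0, 0.0, 0.0))
t_us, motors = unpack_actuators(fake_firmware(pkt))
print(t_us, [round(m, 2) for m in motors])  # → 1000 [0.5, 0.5, 0.5, 0.5]
```

In a real setup the `fake_firmware` call is replaced by a write/read on the serial link to the Pixhawk; everything else about the loop stays the same.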
The autonomous-vehicle HIL setup
An L4 stack runs on production compute (Drive AGX, Jetson Orin). Logged sensor data and synthetic simulated scenarios are replayed into it, so the inputs are bit-identical to what the production hardware sees in the field.
This is how AV companies regression-test new code: every commit triggers a fleet-scale simulation run, with HIL on a representative compute target. Catches everything from "this CUDA kernel is now slower" to "the planner's behavior changed in this edge case."
The latency trap
HIL is brilliant when latency matches reality. It's catastrophic when latency artifacts dominate.
Common failure modes:
- Sim too slow: simulator runs at 0.5× real-time. Controller misinterprets timestamps; control loop unstable.
- Sim too fast: simulator runs at 5× real-time. Sensor delivery jitters; controller's filtering tuned for slower delivery breaks.
- Variable latency: simulator real-time-factor varies between 0.5× and 1.0× depending on scene complexity. Controller sees inconsistent timing; debugging is impossible.
Mitigations:
- Lock the simulator to real-time mode (RTF=1.0 with hard pacing).
- Use simulator timestamps consistently throughout the controller (`use_sim_time` in ROS).
- Measure end-to-end latency before trusting test results.
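The first mitigation, hard real-time pacing, amounts to sleeping after each physics step until wall-clock time catches up with sim time, then reporting the achieved real-time factor. A minimal sketch, with an illustrative `step_fn` callback rather than any particular simulator's API:

```python
import time

def run_paced(step_fn, dt, n_steps):
    """Advance the sim with hard real-time pacing and report the achieved
    real-time factor (RTF). step_fn(dt) runs one physics step; the name
    is illustrative, not a specific simulator's API."""
    start = time.perf_counter()
    sim_time = 0.0
    for _ in range(n_steps):
        step_fn(dt)
        sim_time += dt
        # Sleep until wall clock catches up with sim time (targets RTF = 1.0).
        lag = sim_time - (time.perf_counter() - start)
        if lag > 0:
            time.sleep(lag)
    wall = time.perf_counter() - start
    return sim_time / wall  # RTF < 1.0 means the sim fell behind

rtf = run_paced(lambda dt: None, dt=0.004, n_steps=250)  # 1 s of sim time
print(f"RTF = {rtf:.2f}")
```

If `step_fn` is too slow, `lag` stays negative, no sleeping happens, and the returned RTF drops below 1.0: that is the number to monitor continuously, not just at startup.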
Co-simulation: when the world is multiple sims
Co-simulation runs multiple simulators together. Common cases:
- Mechanical sim + electrical sim: MuJoCo for the mechanical robot + Modelica/SPICE for the motor electronics.
- Robot sim + traffic sim: AV stack tested in CARLA against simulated traffic from SUMO.
- Sim + real hardware: hybrid HIL where a simulated robot collaborates with a real one for multi-robot testing.
The hard part is interface synchronization. Standards like FMI (Functional Mock-up Interface) define how multiple simulators exchange state and time-step together.
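The master-stepping pattern behind FMI can be shown with two toy "FMUs" exchanging outputs at each shared macro step. The class interface below is a hypothetical simplification (the real FMI API uses `fmi2DoStep` and capability flags); the electrical and mechanical parameter values are illustrative:

```python
# Toy fixed-step co-simulation master in the spirit of FMI: two "FMUs"
# exchange outputs once per macro step. Hypothetical interface, not the
# real FMI API.

class MotorFMU:
    """Electrical side: first-order current dynamics driven by voltage."""
    def __init__(self):
        self.current = 0.0
    def do_step(self, voltage, h):
        # di/dt = (V - R*i) / L with R = 1.0 ohm, L = 0.1 H (illustrative)
        self.current += h * (voltage - 1.0 * self.current) / 0.1
        return 0.5 * self.current  # torque constant kt = 0.5

class MechanicsFMU:
    """Mechanical side: inertia integrating applied torque with damping."""
    def __init__(self):
        self.omega = 0.0
    def do_step(self, torque, h):
        # J * domega/dt = torque - b * omega, with J = 0.01, b = 0.02
        self.omega += h * (torque - 0.02 * self.omega) / 0.01
        return self.omega

motor, mech = MotorFMU(), MechanicsFMU()
h = 0.001                      # macro step shared by both simulators
for _ in range(5000):          # 5 s of co-simulated time
    torque = motor.do_step(voltage=1.0, h=h)
    omega = mech.do_step(torque, h=h)
print(f"steady-state speed ≈ {omega:.1f} rad/s")  # → steady-state speed ≈ 25.0 rad/s
```

The design choice that matters is the macro step `h`: both sides only see each other's outputs at those boundaries, so a step longer than either subsystem's fastest dynamics introduces coupling error that no amount of internal accuracy fixes.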
For most robotics, single-sim is enough; co-sim is for specialty applications.
The "real sensor in sim" case (SIL)
Run real cameras / IMU / depth sensors and feed their data into simulated robot dynamics. Used for:
- Sim-to-real perception transfer: train perception in sim with synthetic data; test against real cameras.
- Behavior testing: replay real sensor recordings through the planner to check decisions.
- Lab-environment validation: place real sensors in a controlled scene; see how perception handles it.
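The behavior-testing case is mostly a replay harness: stream a recorded log through the decision-making code and record what it would have done. A minimal sketch, where the log format and the toy `planner` are both illustrative stand-ins:

```python
# Minimal replay harness for the "behavior testing" case: stream a recorded
# sensor log through the planner and record its decisions. The log format
# and planner logic are illustrative stand-ins, not a real stack.

log = [  # (timestamp_s, obstacle_distance_m) — stand-in for a real recording
    (0.0, 5.0), (0.1, 3.0), (0.2, 1.5), (0.3, 0.8), (0.4, 2.0),
]

def planner(distance):
    """Toy planner: brake when an obstacle is closer than 1 m."""
    return "BRAKE" if distance < 1.0 else "CRUISE"

decisions = [(t, planner(d)) for t, d in log]
print(decisions)
# → [(0.0, 'CRUISE'), (0.1, 'CRUISE'), (0.2, 'CRUISE'), (0.3, 'BRAKE'), (0.4, 'CRUISE')]
```

The value is that the decisions are deterministic and diffable: re-run the same log against a new planner build and any changed decision shows up immediately.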
Less common than HIL but occasionally invaluable.
The 2026 production reality
HIL has become standard in:
- Drone development (PX4, ArduPilot, all major drone OEMs).
- Autonomous-vehicle stacks (every L3+ team uses HIL extensively).
- Industrial robot firmware (UR, KUKA, ABB use HIL for new controller releases).
- Aerospace (canonical HIL use case for decades).
HIL is less common in:
- Hobby / educational projects (simulator-only is faster to iterate).
- Pure-research robotics (real-robot demos at conferences are the validation).
- Manipulation research (manipulation HIL is rare; pure-sim or pure-real is more common).
Building a HIL setup
Steps:
- Identify the boundary: which components are real, which are simulated.
- Define the interface: serial, Ethernet, USB, ROS topics. Match the real interface as closely as possible.
- Pace the simulator: lock to real-time. Monitor RTF.
- Start small: one closed loop (sensor in, command out). Verify it matches pure-sim results.
- Expand gradually: add more sensors, more complex scenes.
- Automate: HIL tests run on every commit. Caught bugs feed back into the test suite.
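Step 4's sanity check ("verify it matches pure-sim results") can be automated as a trajectory comparison: run the same closed loop once with the controller called directly and once through the bridge path, then assert the two stay within tolerance. A sketch with a toy controller and plant, where the `hil_transport` stand-in only models the serialize/deserialize round trip of a real link:

```python
import json

# Sketch of the HIL-vs-pure-sim sanity check: run the same closed loop
# twice and compare trajectories. Controller, plant, and transport are
# illustrative stand-ins.

def controller(x):
    return -0.5 * x  # toy proportional controller

def plant_step(x, u, dt=0.01):
    return x + dt * u  # toy first-order plant

def run(loop_transport, steps=100):
    x, traj = 1.0, []
    for _ in range(steps):
        u = loop_transport(x)
        x = plant_step(x, u)
        traj.append(x)
    return traj

def hil_transport(x):
    # Stand-in for the serial link: the state survives a serialize/
    # deserialize round trip, as it would over the wire.
    wire = json.dumps({"x": x})
    return controller(json.loads(wire)["x"])

pure = run(controller)
hil = run(hil_transport)
max_err = max(abs(a - b) for a, b in zip(pure, hil))
print(f"max divergence: {max_err:.2e}")
```

With a real link the divergence won't be zero — quantization, latency, and dropped packets all contribute — so the useful output is the tolerance at which the two runs agree, tracked over time in CI.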
For PX4: `make px4_sitl_default gazebo` + `make px4_fmu-v5_default upload` to a real flight controller, then `./Tools/sitl_run.sh` to bridge them. ~30 minutes to first flight in HIL.
Common gotchas
- USB-serial bandwidth: simulated sensor data over USB-serial can saturate. Use Ethernet or DDS.
- Clock drift: real hardware clock vs sim clock differ. Synchronize explicitly.
- Resource contention: sim + real hardware on the same desktop might starve each other. Dedicated hardware helps.
- Test environment != production: HIL's network setup may not match the field. Test with field-realistic latencies.
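The clock-drift gotcha has a standard fix: estimate the offset with an NTP-style two-way timestamp exchange and correct for it explicitly. The sketch below simulates both clocks with a known offset so the arithmetic is checkable; the 250 ms offset, 10 ms link delay, and symmetric-latency assumption are all illustrative.

```python
# NTP-style offset estimate for the clock-drift gotcha: one request/response
# exchange between sim host and controller yields offset and round-trip
# estimates. Clocks are simulated with a known offset so the math checks out.

TRUE_OFFSET = 0.250   # controller clock runs 250 ms ahead (illustrative)
LINK_DELAY = 0.010    # one-way latency, assumed symmetric

def exchange(t_host_send):
    """Timestamps of one round trip: host send (t1), controller receive (t2),
    controller reply (t3), host receive (t4). t2/t3 are in the controller's
    own clock."""
    t1 = t_host_send
    t2 = t1 + LINK_DELAY + TRUE_OFFSET   # arrival, controller clock
    t3 = t2 + 0.001                      # 1 ms processing on the controller
    t4 = t3 - TRUE_OFFSET + LINK_DELAY   # reply arrival, host clock
    return t1, t2, t3, t4

t1, t2, t3, t4 = exchange(100.0)
offset = ((t2 - t1) + (t3 - t4)) / 2     # standard NTP offset formula
rtt = (t4 - t1) - (t3 - t2)
print(f"offset ≈ {offset * 1e3:.0f} ms, rtt ≈ {rtt * 1e3:.0f} ms")
# → offset ≈ 250 ms, rtt ≈ 20 ms
```

The symmetric-latency assumption is the weak point: if the two directions of the link have different delays (common over USB-serial), half that asymmetry shows up as offset error.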
Exercise
Set up PX4 SITL + a real Pixhawk in HIL mode. Fly the simulated drone using the real flight controller. Modify the firmware (e.g., change a PID gain), reflash, observe the behavior change in sim. The development cycle from "edit firmware code" to "see real consequence safely" is the HIL value proposition in 30 minutes.
That's the Simulators track done
You've covered the full progression: choosing a simulator → MuJoCo → Gazebo → PyBullet → Isaac Sim/Lab → Drake → Webots → URDF/MJCF/USD → Gymnasium API → HIL. With this, you can pick the right simulator for any task, build / convert robot descriptions, wrap any sim for RL training, and integrate with real hardware. With this and the ten other completed tracks (Foundations, Kinematics, Control, ROS 2, SLAM, Perception, Planning, Manipulation, Learning, Mobile/Legged), you have eleven complete tracks. Two left: Embedded & Hardware, Frontiers.