Track 10
Learning for Robotics
Imitation, reinforcement, and the new wave of foundation models. The field that's redefining what robots can do.
12 published · 0 planned · 12 lessons total
- 01→
What are VLAs (Vision-Language-Action models) and why they matter
Published · π0, RT-2, OpenVLA, Gemini Robotics. A new kind of model that takes a camera image and a natural-language instruction and outputs robot motor commands. Here's what that unlocks — and what it doesn't. (Sketch of the I/O contract below.)
~16 min
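A minimal sketch of the contract this lesson describes: image plus instruction in, a short chunk of actions out. The class, shapes, and `predict` method here are illustrative stand-ins, not any specific model's API.

```python
import numpy as np

class DummyVLA:
    """Stand-in for a real VLA checkpoint; the I/O shape is the point."""

    def __init__(self, action_dim: int = 7, chunk: int = 16):
        self.action_dim = action_dim  # e.g. 6-DoF arm + gripper
        self.chunk = chunk            # actions predicted per call

    def predict(self, image: np.ndarray, instruction: str) -> np.ndarray:
        # A real model tokenizes both inputs and decodes actions;
        # zeros here just make the sketch runnable.
        assert image.ndim == 3        # (H, W, 3) RGB camera frame
        return np.zeros((self.chunk, self.action_dim))

policy = DummyVLA()
actions = policy.predict(np.zeros((224, 224, 3), dtype=np.uint8),
                         "pick up the red block")
print(actions.shape)                  # (16, 7): a short trajectory
```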
- 02→
Imitation learning 101: behavior cloning and DAgger
Published · The simplest way to train a robot policy: show it what to do, then make it copy. Here's how behavior cloning works, why it fails on its own, and the fix (DAgger) that made modern VLAs possible. (Sketch of the DAgger loop below.)
~15 min
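The core loop, as a hedged sketch: `fit` is plain supervised behavior cloning (regress states to actions), and the `env`/`expert` interfaces are simplified stand-ins, with `env.step` returning `(state, done)`.

```python
def dagger(expert, fit, env, demo_states, demo_actions,
           rounds=5, horizon=200):
    """DAgger: the learner drives, the expert labels what it sees."""
    states, actions = list(demo_states), list(demo_actions)
    policy = fit(states, actions)          # round 0 is pure BC
    for _ in range(rounds):
        s = env.reset()
        for _ in range(horizon):
            states.append(s)               # states the *learner* reaches,
            actions.append(expert(s))      # labeled by the expert, i.e.
            s, done = env.step(policy(s))  # exactly the states BC alone
            if done:                       # never gets to see
                break
        policy = fit(states, actions)      # retrain on the aggregate
    return policy
```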
- 03→
Diffusion policy explained
Published · The 2023 breakthrough that brought generative modeling to robot control. Why predicting actions via denoising beats predicting them directly — and why multimodal action distributions matter. (Sketch of the sampling loop below.)
~14 min
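A generic DDPM-style sampler over an action chunk, assuming a trained noise predictor `eps_model`. The actual Diffusion Policy work uses a learned conditional U-Net or transformer and a tuned noise schedule, so treat this as the shape of the idea, not the paper's implementation.

```python
import numpy as np

def sample_actions(eps_model, obs, steps=50, chunk=16, dim=7):
    """Reverse diffusion: start from noise, denoise into actions."""
    betas = np.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)
    x = np.random.randn(chunk, dim)        # begin with pure noise
    for t in reversed(range(steps)):
        eps = eps_model(x, t, obs)         # predicted noise, conditioned on obs
        x = (x - betas[t] / np.sqrt(1 - abar[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                          # re-inject noise on every step
            x += np.sqrt(betas[t]) * np.random.randn(chunk, dim)
    return x                               # except the last

# Stub denoiser so the sketch runs end to end:
acts = sample_actions(lambda x, t, obs: np.zeros_like(x), obs=None)
print(acts.shape)  # (16, 7) action chunk
```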
- 04→
ACT and action chunking
Published · The architecture that made bimanual teleop policies actually work. Predict 16 actions at once, execute open-loop, repeat (sketch below). Behind ALOHA, Mobile ALOHA, and many of the 2024+ VLAs.
~12 min
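The execution pattern in a few lines, with hypothetical `policy`/`env` interfaces. ACT itself adds temporal ensembling over overlapping chunks; this sketch shows only the plain chunk-and-replan loop.

```python
def run_episode(policy, env, max_steps=400):
    """One policy call per chunk of low-level steps, not per step."""
    obs = env.reset()
    steps = 0
    while steps < max_steps:
        chunk = policy(obs)              # shape (16, action_dim)
        for a in chunk:                  # execute the chunk open-loop,
            obs, done = env.step(a)      # no replanning mid-chunk
            steps += 1
            if done or steps >= max_steps:
                return obs
    return obs                           # loop: query the policy again
```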
- 05→
Reinforcement learning primer for roboticists
Published · MDPs, value functions, policy gradients — the RL minimum for a robotics audience. Skip the chess-playing examples; learn the math you'll use to train a quadruped. (The policy gradient is written out below.)
~16 min
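For reference, the identity the lesson builds toward: the REINFORCE-style policy gradient with a baseline $b$ (often the value function $V$), written here in standard form.

```latex
\nabla_\theta J(\theta)
  = \mathbb{E}_{\tau \sim \pi_\theta}\!\left[
      \sum_{t=0}^{T} \nabla_\theta \log \pi_\theta(a_t \mid s_t)
      \left( \sum_{t'=t}^{T} \gamma^{\,t'-t}\, r_{t'} - b(s_t) \right)
    \right]
```

Subtracting the baseline leaves the expectation unchanged but reduces the variance of the estimate, which is the seed of the actor-critic methods in the next two lessons.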
- 06→
PPO in practice: the hyperparameters that actually matter
Published · The algorithm that dominates robotics RL. Honest about what's art and what's science. The 8 hyperparameters that determine whether your training succeeds, fails, or wastes a week. (A typical starting config below.)
~14 min
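The lesson's own list of eight may differ; as a hedged reference point, these are the knobs most implementations expose, with common starting values in the style of popular libraries such as Stable-Baselines3.

```python
# Commonly tuned PPO knobs and typical defaults (assumptions, not
# the lesson's list; locomotion setups often deviate substantially).
ppo_config = {
    "learning_rate": 3e-4,   # often annealed toward 0 over training
    "rollout_steps": 2048,   # env steps collected per update
    "minibatch_size": 64,
    "update_epochs": 10,     # passes over each rollout
    "gamma": 0.99,           # discount factor
    "gae_lambda": 0.95,      # bias/variance tradeoff in the advantage
    "clip_range": 0.2,       # the PPO ratio clip itself
    "entropy_coef": 0.0,     # exploration bonus; often >0 for locomotion
}
```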
- 07→
SAC and off-policy methods
Published · When to reach beyond PPO. SAC's maximum-entropy framework, replay buffers, and the algorithms that win on real-world hardware where every minute of robot time matters. (Sketch of a replay buffer below.)
~13 min
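The data structure that makes off-policy reuse possible, as a minimal numpy sketch: a ring buffer that SAC samples from many times per collected transition.

```python
import numpy as np

class ReplayBuffer:
    """Ring-buffer transition storage for off-policy learners."""

    def __init__(self, capacity, obs_dim, act_dim):
        self.obs = np.zeros((capacity, obs_dim), dtype=np.float32)
        self.act = np.zeros((capacity, act_dim), dtype=np.float32)
        self.rew = np.zeros(capacity, dtype=np.float32)
        self.next_obs = np.zeros((capacity, obs_dim), dtype=np.float32)
        self.done = np.zeros(capacity, dtype=np.float32)
        self.capacity, self.ptr, self.size = capacity, 0, 0

    def add(self, o, a, r, o2, d):
        i = self.ptr                       # overwrite oldest when full
        self.obs[i], self.act[i], self.rew[i] = o, a, r
        self.next_obs[i], self.done[i] = o2, d
        self.ptr = (self.ptr + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size):
        idx = np.random.randint(0, self.size, size=batch_size)
        return (self.obs[idx], self.act[idx], self.rew[idx],
                self.next_obs[idx], self.done[idx])
```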
- 08→
Sim-to-real: domain randomization playbook
Published · A policy trained in sim that works on real hardware was science fiction in 2018. In 2026 it's a weekend project — if you get the randomization right. Here's the playbook. (Sketch below.)
~16 min
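The shape of a randomization hook, with made-up simulator attribute names and purely illustrative ranges; real ranges come from system identification on your hardware.

```python
import numpy as np

def randomize(sim, rng: np.random.Generator):
    """Resample physics and sensing each episode so the policy
    can't overfit a single simulator instance."""
    sim.friction   = rng.uniform(0.5, 1.25)
    sim.mass_scale = rng.uniform(0.8, 1.2)   # per-link mass multiplier
    sim.motor_gain = rng.uniform(0.9, 1.1)   # actuator strength
    sim.latency_s  = rng.uniform(0.0, 0.04)  # control delay, seconds
    sim.obs_noise  = rng.uniform(0.0, 0.02)  # sensor noise std
```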
- 09→
Real-world RL: HIL-SERL and friends
Published · Skip the simulator entirely. Train on hardware, with humans in the loop, demonstrations seeding the buffer, safety wrappers everywhere. The methods letting hobbyists train policies on actual robots in 2026. (Sketch of the loop below.)
~13 min
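A hedged sketch of the pattern HIL-SERL popularized: the policy proposes, the human can take over, and every transition lands in one buffer (the same shape as the SAC sketch above) that was pre-seeded with demonstrations. All interfaces here are hypothetical.

```python
def collect_step(env, policy, buffer, obs, human):
    """One hardware step with a human in the loop."""
    a = policy(obs)
    if human.intervening():            # deadman switch / teleop takeover
        a = human.action()             # overrides are stored like demos
    next_obs, r, done = env.step(a)    # env assumed safety-wrapped:
    buffer.add(obs, a, r, next_obs, done)  # force/velocity/workspace limits
    return next_obs, done

# Before training, seed the same buffer with teleop demonstrations so
# the first gradient steps aren't driven by random flailing on hardware:
# for (o, a, r, o2, d) in demo_transitions:
#     buffer.add(o, a, r, o2, d)
```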
- 10→
Collecting demonstrations: teleop rigs that work
Published · ALOHA, GELLO, phone teleop, VR, leader-follower puppets. What each gives you, what each costs, and how to build one this weekend. The hardware behind every working VLA in 2026. (Leader-follower sketch below.)
~12 min
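The leader-follower idea fits in a few lines; `leader`, `follower`, and `logger` stand in for whatever drivers your rig exposes (GELLO- or ALOHA-style arms), so every method name here is an assumption.

```python
import time

def teleop_loop(leader, follower, logger, hz=50):
    """Mirror a passive leader arm onto the follower at a fixed rate."""
    dt = 1.0 / hz
    while True:
        q = leader.read_joint_positions()  # the human moves this arm
        follower.set_joint_targets(q)      # the robot copies it
        logger.record(q)                   # these logs become your demos
        time.sleep(dt)
```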
- 11→
Fine-tuning a VLA on your own data
Published · The conceptual side of VLA adaptation. LoRA vs. full fine-tune, action representations, loss functions, and the tradeoffs that determine whether a 200-demo run produces a working policy. (LoRA sketch below.)
~14 min
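The LoRA side of the tradeoff, as a standard PyTorch sketch: freeze the base weight, learn a low-rank residual on top of it. Rank and alpha values are illustrative.

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """y = Wx + (alpha/r) * B(Ax), with W frozen and only A, B trained."""

    def __init__(self, base: nn.Linear, r: int = 16, alpha: int = 32):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False         # freeze the pretrained weights
        self.A = nn.Linear(base.in_features, r, bias=False)
        self.B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.B.weight)       # adapter starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * self.B(self.A(x))
```

Swapping this in for the attention and MLP projections of a VLA backbone cuts trainable parameters by orders of magnitude, which is what makes a 200-demo run feasible on one GPU.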
- 12→
Open X-Embodiment and the dataset landscape
Published · What datasets exist, what they contain, and how to actually use them. The base data behind every modern VLA, the benchmarks worth trusting, and how to contribute your own demos. (Loading sketch below.)
~12 min
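A loading sketch assuming the RLDS-on-GCS pattern Open X-Embodiment publishes; the dataset path and version string are placeholders to check against the project's docs before relying on them.

```python
import tensorflow_datasets as tfds

# Assumption: an OXE sub-dataset hosted at a GCS path like this one.
builder = tfds.builder_from_directory(
    builder_dir="gs://gresearch/robotics/fractal20220817_data/0.1.0")
ds = builder.as_dataset(split="train")

for episode in ds.take(1):
    for step in episode["steps"]:      # RLDS layout: episodes of steps
        obs = step["observation"]      # images, proprio, language, ...
        act = step["action"]
        break
```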