Tactile sensing in manipulation
GelSight, DIGIT, BioTac, e-skins — the sensors that finally give robots hands with feeling. What they measure, what to do with the data, and the workflow that turns 90% grasps into 99%.
Vision tells you where an object is. Tactile sensing tells you what's happening at the contact. For decades robots had vision and force/torque at the wrist — coarse compared to what a human fingertip provides. The 2018+ wave of vision-based tactile sensors (GelSight, DIGIT) changed that. In 2026, tactile sensing is starting to be standard equipment on serious manipulation platforms.
What tactile sensors actually measure
Different sensor types capture different aspects of contact:
- Force/torque (wrist-mounted): six axes (three force, three torque) at the wrist. Coarse contact info.
- Vision-based fingertip (GelSight, DIGIT): a small camera looks at the back of a soft elastomer pad. Objects pressing into the gel deform it; the camera images the deformation. Sub-millimeter spatial resolution; frame rate 30–120 Hz.
- Multimodal fingertips (BioTac): a fluid-filled fingertip sensed by impedance electrodes, a pressure transducer, and a thermistor. Detects texture, slip, vibration, and temperature.
- Resistive e-skins: pressure-sensitive resistive elements on a flexible substrate. Cover larger areas (palm, arm); lower resolution.
- Piezoelectric sensors: detect vibration and slip at high frequencies (1–10 kHz). Fast, but little spatial information.
Each modality answers different questions. Vision-based tactile (GelSight/DIGIT) currently dominates research because the data is image-shaped — easy to feed into convnets and VLAs.
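To see how image-shaped the data is: a vision-based tactile sensor typically enumerates as an ordinary USB camera, so a few lines of OpenCV suffice to grab frames. A minimal sketch; the device index 0 is an assumption for your setup:

import cv2

cap = cv2.VideoCapture(0)            # tactile sensor enumerates as a webcam
cap.set(cv2.CAP_PROP_FPS, 60)        # request 60 Hz if the device supports it
ok, frame = cap.read()               # frame: HxWx3 uint8 array, like any image
if ok:
    # From here it's ordinary image processing: difference consecutive
    # frames for slip cues, or feed the frame straight into a CNN.
    print(frame.shape)
cap.release()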
What you can do with tactile data
Slip detection
The classical first use case. As an object starts to slip, characteristic vibrations appear in the high-frequency band. Detect them, increase grip force, prevent the drop.
Hand-engineered detectors work; learned classifiers (CNN on a window of tactile frames) work better. Either way, sub-100 ms response prevents most slips.
Grasp verification
After closing the gripper, the tactile pattern tells you what you're holding (or whether you're holding anything at all). A textbook "no contact" reading after a grasp attempt means retry. This check alone is the biggest lift from 90% to 95% pick-and-place success.
Object pose estimation
The tactile imprint of a known object encodes its orientation in the gripper. With a learned model, you can recover sub-millimeter relative pose from a single tactile reading — useful for precise placement after pickup.
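A minimal sketch of such a model: a small CNN regressing a planar relative pose (x, y, theta) from one tactile image. The architecture and names are illustrative, not from any particular paper:

import torch
import torch.nn as nn

class TactilePoseNet(nn.Module):
    """Regress (x, y, theta) of a known object from one tactile image."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 3)  # x (mm), y (mm), theta (rad)

    def forward(self, tactile_image):
        return self.head(self.backbone(tactile_image))

# Train with MSE against ground-truth poses from a calibration rig
model = TactilePoseNet()
pose = model(torch.randn(1, 3, 240, 320))  # batch of one tactile frame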
Texture and material recognition
Different materials produce different tactile signatures. A cloth feels different from a plastic cup. Learned classifiers achieve high accuracy on small material sets.
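A minimal sketch, assuming frames and labels are your collected tactile images and material names; simple per-frame statistics plus a random forest go surprisingly far on small material sets:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def texture_features(frame):
    # Per-frame statistics; gradient magnitudes capture surface roughness
    gy, gx = np.gradient(frame.astype(float).mean(axis=-1))
    return [frame.mean(), frame.std(), np.abs(gx).mean(), np.abs(gy).mean()]

# frames: list of HxWx3 tactile images; labels: material name per frame
X = np.array([texture_features(f) for f in frames])
clf = RandomForestClassifier(n_estimators=100).fit(X, labels)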
In-hand reorientation
Tactile data tells you where the object is and how it's moving inside your hand. Combined with policy learning, this closes the perception-action loop for dexterous manipulation. State-of-the-art papers (2024+) use tactile-conditioned policies that don't need vision after the initial grasp.
Shape exploration
Touch an unknown object many times; piece together its shape. Slow but works without vision; useful for occluded scenarios.
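A minimal sketch of the reconstruction step, assuming contact_points is the set of touch locations you've accumulated, already expressed in the world frame (frame composition is sketched later, after the gotchas list):

import numpy as np
from scipy.spatial import ConvexHull

# contact_points: (N, 3) touch locations in the world frame, N >= 4
cloud = np.asarray(contact_points)
hull = ConvexHull(cloud)          # coarse convex shape estimate
print(hull.volume, hull.area)     # quick sanity check on the estimate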
The 2026 sensor catalog
| Sensor | Type | Cost | Status |
|---|---|---|---|
| GelSight Mini | Vision-based | ~$1k | Research workhorse |
| DIGIT (Meta) | Vision-based | ~$300 | Open hardware |
| DIGIT 360 (Meta, 2024) | Vision + multimodal | ~$5k | Higher-end research |
| BioTac (SynTouch) | Multimodal (fluid-based) | ~$10k | Niche; bio-inspired |
| Robotous F/T | Wrist 6-axis F/T | ~$2k | Industrial standard |
| Sensapex / iniLabs e-skin | Resistive arrays | ~$5k | Emerging for arms |
The minimum hardware setup
To start: get a Robotiq 2F-85 gripper + 2× GelSight Mini fingertips. ~$5k, plug-and-play with most cobots. You'll have:
- Real-time tactile imaging at 60 Hz.
- Slip detection out of the box (existing libraries).
- Grasp success/failure binary signal.
- Path to learned manipulation policies.
That's the production-realistic entry point.
Software pipelines
Hand-engineered slip detection
import numpy as np

SLIP_THRESHOLD = 0.5  # tune per sensor and gripper (see below)

def detect_slip(tactile_history):
    # Frame-to-frame difference; high-frequency content indicates slip
    diffs = [np.linalg.norm(tactile_history[i + 1] - tactile_history[i])
             for i in range(len(tactile_history) - 1)]
    # Std of the recent diffs approximates high-frequency energy
    high_freq_energy = np.std(diffs[-10:])
    return high_freq_energy > SLIP_THRESHOLD
Tune the threshold for your sensor and gripper. Works well enough for most production tasks.
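One way to set it, assuming you log the frame-to-frame diff norms (stable_diffs below) during a few known-stable grasps: put the threshold a few standard deviations above that baseline.

import numpy as np

# stable_diffs: diff norms logged during known-stable (non-slipping) grasps
baseline, spread = np.mean(stable_diffs), np.std(stable_diffs)
SLIP_THRESHOLD = baseline + 4 * spread   # 4-sigma margin; tune the multiplier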
Learned grasp success classifier
# sensor, grasp_cnn, classifier, retry_grasp are placeholders for your
# capture, feature-extraction, and recovery code
tactile_image = sensor.capture()        # one tactile frame after closing
features = grasp_cnn(tactile_image)     # learned feature extractor
success_prob = classifier(features)     # probability the grasp succeeded
if success_prob < 0.7:                  # threshold tuned on a validation set
    retry_grasp()
Train on a dataset of successful + failed grasps with tactile readings labeled. Hundreds of examples are sufficient; thousands give better generalization.
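A minimal training sketch for such a classifier, assuming images is an (N, 3, H, W) tensor of tactile frames and labels an (N,) tensor of 0/1 outcomes; small enough at this scale to fit in memory:

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 1),                      # one logit: grasp success
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for epoch in range(20):
    opt.zero_grad()
    loss = loss_fn(model(images).squeeze(1), labels.float())
    loss.backward()
    opt.step()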
Tactile-conditioned policies
The 2024+ pattern: feed tactile + vision + proprioception into a single policy network. Output joint actions. Treats tactile as just another sensor modality.
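A minimal sketch of that pattern: per-modality encoders, concatenated features, one action head. All names and sizes are illustrative:

import torch
import torch.nn as nn

class TactilePolicy(nn.Module):
    """Vision + tactile + proprioception in, joint actions out."""
    def __init__(self, proprio_dim=7, action_dim=7):
        super().__init__()
        def img_encoder():
            return nn.Sequential(
                nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.vision = img_encoder()
        self.tactile = img_encoder()        # tactile is just another image
        self.head = nn.Sequential(
            nn.Linear(16 + 16 + proprio_dim, 128), nn.ReLU(),
            nn.Linear(128, action_dim),
        )

    def forward(self, rgb, tactile, proprio):
        z = torch.cat([self.vision(rgb), self.tactile(tactile), proprio], dim=-1)
        return self.head(z)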
Frameworks: LeRobot supports tactile observations in its dataset format. The Tactile-VLM literature is the cutting edge.
Common gotchas
- Calibration drift: GelSight gels deform permanently after thousands of contacts. Replace gels every few thousand grasps.
- Lighting changes: vision-based tactile data can be sensitive to internal LED brightness drift. Re-calibrate periodically.
- Coordinate frames: tactile imprints come in the fingertip frame; you usually want them in the world frame. Compose with TF (see the sketch after this list).
- Frame rate matters: slip detection needs > 60 Hz. Cheap webcam-based DIGITs hit this; lower-frame-rate sensors don't.
- Compute load: processing two 60 Hz tactile streams + vision + control runs out of CPU on an embedded controller. Offload tactile inference to a separate machine.
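The frame composition from the gotchas list, as a minimal sketch with scipy; in practice the fingertip pose comes from your TF tree or forward kinematics (the values below are placeholders):

import numpy as np
from scipy.spatial.transform import Rotation

# Pose of the fingertip frame in the world frame (from TF / kinematics)
R_world_tip = Rotation.from_quat([0, 0, 0, 1])    # placeholder orientation
t_world_tip = np.array([0.4, 0.0, 0.3])           # placeholder position (m)

p_tip = np.array([0.002, -0.001, 0.0])            # contact point, tip frame
p_world = R_world_tip.apply(p_tip) + t_world_tip  # same point, world frame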
What tactile is starting to enable
- Robust pick-and-place: tactile verification + slip recovery. Production use.
- Cable / cord manipulation: combining vision + tactile to follow flexible objects.
- Precision insertion: tactile-guided assembly with sub-millimeter accuracy.
- Texture-based sorting: separating items by feel as well as look.
- Dexterous manipulation: in-hand rotation and reorientation, increasingly viable.
Where it's still research
- Generalist tactile foundation models (analogous to vision-language models).
- Tactile + language: "this feels rough" understood by VLAs.
- High-bandwidth tactile (1+ kHz) for very fast manipulation.
- Whole-body e-skins for full-arm contact awareness.
Exercise
If you have access to a DIGIT (the cheapest entry point), mount it on a parallel-jaw gripper. Collect 50 grasps with various objects: 25 successful, 25 failed. Train a binary classifier on the tactile images. Deploy: every grasp goes through the classifier; failed grasps trigger a retry. The 5–10% bump in end-to-end success rate is what tactile delivers.
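A minimal end-to-end sketch of the exercise, assuming images is a (50, H, W, 3) array of tactile frames and labels the 0/1 outcomes; at 50 examples, logistic regression on downsampled pixels is plenty:

import numpy as np
from sklearn.linear_model import LogisticRegression

X = images[:, ::8, ::8].reshape(len(images), -1) / 255.0  # downsample, flatten
clf = LogisticRegression(max_iter=1000).fit(X, labels)

def grasp_succeeded(tactile_frame):
    x = tactile_frame[::8, ::8].reshape(1, -1) / 255.0
    return clf.predict(x)[0] == 1
# Deploy: run every grasp through grasp_succeeded(); on False, retry.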
That's the Manipulation track done
You've covered the full progression: pick-and-place pipeline → grasp analysis → deep grasping → MoveIt for manipulation → trajectory generation → impedance for assembly → non-prehensile → dexterous → mobile manipulation → tactile. Together they trace the field from 1980s theory to 2026 production. Read papers in any of these subfields and the math will be familiar.