Onsite Experiments

Task definitions, data specification, and sensor setup for offline evaluation

1. Task Definition

The onsite experiments consist of three tasks derived from a planetary-gear assembly workflow. Tasks are designed to evaluate manipulation accuracy, partial-progress continuation, and error recovery under realistic deployment conditions.

Task | Goal | Notes
Task 1 | Install three planetary gears onto the carrier pins | Each planetary gear must be picked and mounted onto the correct carrier pin with correct placement and stability.
Task 2 | Continue from partial progress of Task 1 (two gears already installed) | Task 2 is not provided as an independent dataset. It reuses Task 1 data format, but starts from an intermediate state where two gears are already assembled and the policy must install the remaining gear.
Task 3 | Error detection and recovery: remove an incorrectly installed larger gear and replace it with the correct smaller gear | Task 3 targets recovery behavior: the robot must identify the wrong assembly, remove the incorrect part, and complete the correct replacement.
Data availability: Task 2 does not have a standalone data split; it is evaluated as a continuation scenario based on Task 1.

2. Data Content

Each trajectory provides synchronized multi-modal observations and actions for a dual-arm mobile-manipulation platform. The dataset includes robot state/action streams, end-effector feedback, and multi-camera visual observations.

Embodiment: Galaxea R1 Lite
Arms: Dual-arm with grippers
Action: Dual-arm + gripper action commands
Obs: Dual-arm + gripper states
EE feedback: End-effector pose feedback (per arm)
Implementation note: “Obs” and “Action” refer to the logged control/state interfaces for both arms and grippers. EE feedback is provided as a pose stream to facilitate control/learning with kinematic supervision.
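
As an illustration of how the per-arm streams listed above could be grouped for one timestep, here is a minimal Python sketch. Every class and field name is hypothetical, and the 7-value pose layout (position plus quaternion) is an assumption; the dataset's actual keys, vector sizes, and pose convention are not specified in this section.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ArmSample:
    """One arm's logged values at a single timestep (hypothetical layout)."""
    joint_state: Tuple[float, ...]   # observed arm state ("Obs")
    gripper_state: float             # observed gripper state ("Obs")
    joint_action: Tuple[float, ...]  # commanded arm action ("Action")
    gripper_action: float            # commanded gripper action ("Action")
    ee_pose: Tuple[float, ...]       # EE pose feedback; assumed xyz + quaternion


@dataclass(frozen=True)
class TrajectoryStep:
    """Synchronized dual-arm sample (hypothetical layout)."""
    timestamp_s: float
    left: ArmSample
    right: ArmSample


def validate_step(step: TrajectoryStep, pose_len: int = 7) -> None:
    """Raise ValueError if either arm's vectors have inconsistent sizes."""
    for side, arm in (("left", step.left), ("right", step.right)):
        if len(arm.joint_state) != len(arm.joint_action):
            raise ValueError(f"{side}: state/action length mismatch")
        if len(arm.ee_pose) != pose_len:
            raise ValueError(f"{side}: expected {pose_len} pose values")
```

Running a check like `validate_step` once per loaded step is a cheap way to catch a misaligned or truncated stream before training; `pose_len` should be changed to match whatever pose representation the data actually uses.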

3. Visual Streams

Visual observations are recorded from head and wrist cameras. The head provides stereo RGB streams (left/right), and each wrist provides RGB plus a depth stream.

Camera | Modality | Details
Head (Stereo Left) | RGB | Left head camera RGB stream.
Head (Stereo Right) | RGB | Right head camera RGB stream.
Wrist (Left) | RGB + Depth | Left wrist camera provides RGB and depth observations.
Wrist (Right) | RGB + Depth | Right wrist camera provides RGB and depth observations.
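
The camera table above can be mirrored in a small lookup that a data loader might use to decide which streams carry native depth. This is only a sketch: the stream keys (`head_left`, `wrist_right`, and so on) are hypothetical names chosen for illustration, not the dataset's actual identifiers.

```python
from typing import Dict, FrozenSet, List

# Hypothetical stream keys; the modalities follow the camera table.
CAMERA_MODALITIES: Dict[str, FrozenSet[str]] = {
    "head_left": frozenset({"rgb"}),
    "head_right": frozenset({"rgb"}),
    "wrist_left": frozenset({"rgb", "depth"}),
    "wrist_right": frozenset({"rgb", "depth"}),
}


def cameras_with(modality: str) -> List[str]:
    """Return the sorted camera keys that record the given modality."""
    return sorted(
        name for name, mods in CAMERA_MODALITIES.items() if modality in mods
    )
```

For example, `cameras_with("depth")` returns only the two wrist cameras, which makes explicit that the head cameras contribute RGB only.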

4. Head Depth Reconstruction (Stereo)

The head cameras do not provide native depth. If depth is required, it must be reconstructed from the stereo RGB pair using camera intrinsics/extrinsics.

Calibration file: stereo.yaml
Use this file to obtain intrinsics/extrinsics for stereo rectification and disparity-based depth reconstruction.
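
For a rectified stereo pair, depth follows from disparity as depth = focal length × baseline / disparity. The sketch below shows only that final conversion step in plain Python. It assumes the focal length (in pixels) and the baseline (in meters) have already been read from stereo.yaml, whose key layout is not described here, and that a disparity map has been produced by a stereo matcher of your choice; the function and parameter names are illustrative.

```python
from typing import List, Optional, Sequence


def disparity_to_depth(
    disparity_px: float,
    focal_px: float,
    baseline_m: float,
    min_disparity_px: float = 1e-6,
) -> Optional[float]:
    """Convert one disparity value (pixels) to depth (meters).

    Returns None where the disparity is missing, non-positive, or NaN,
    since depth is undefined there.
    """
    if focal_px <= 0 or baseline_m <= 0:
        raise ValueError("focal_px and baseline_m must be positive")
    # "not (x >= t)" is also True for NaN, so invalid pixels are skipped.
    if not (disparity_px >= min_disparity_px):
        return None
    return focal_px * baseline_m / disparity_px


def depth_map(
    disparity_rows: Sequence[Sequence[float]],
    focal_px: float,
    baseline_m: float,
) -> List[List[Optional[float]]]:
    """Apply disparity_to_depth to every pixel of a row-major disparity map."""
    return [
        [disparity_to_depth(d, focal_px, baseline_m) for d in row]
        for row in disparity_rows
    ]
```

In practice the same formula would usually be applied with a vectorized array library rather than nested lists. The units of the result follow the units of the baseline, so it is worth confirming which unit the calibration file uses before comparing head depth with the wrist depth streams.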

5. Reference Video (Onsite Scene & Data Collection Process)