RoCo Challenge@AAAI 2026 — Onsite Experiments

Tasks, scoring, and updated onsite evaluation rules

IMPORTANT

Onsite Logistics & Deployment

Read Before Onsite Evaluation

Critical Notice: Due to potential network instability or limited internet access at the onsite venue, all teams must be fully prepared for offline execution. Failure to comply may result in inability to run evaluations.
Item | Requirement | Scope | Details
Network constraints | Offline-ready deployment | All teams | Teams must bring their own storage device(s) (portable SSD / USB drive) and copy all source code, model checkpoints, and environment dependencies to the onsite workstation before evaluation starts.
Recommendation: Bring redundant copies of critical files and verify that the offline environment can run end-to-end on the onsite workstation prior to the recorded trials.
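
One way to confirm that the files copied from the storage device arrived intact on the workstation is to compare checksums of the two directory trees. The following is a minimal sketch, not an official tool; the function names (`sha256_of`, `build_manifest`, `diff_manifests`) and the example paths in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_manifests(source: Dict[str, str], copy: Dict[str, str]) -> List[str]:
    """List files from the source that are missing or differ in the copy."""
    return sorted(rel for rel, digest in source.items() if copy.get(rel) != digest)
```

For example, `diff_manifests(build_manifest("/media/usb/myteam"), build_manifest("/home/user/roco/myteam"))` would return an empty list when every file on the drive has an identical copy on the workstation (both paths are placeholders).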
Section 1

Task Description

Onsite Evaluation

Task | Name | Initial State | Goal
Task 1 | Assembly from Scratch | Gears placed beside the planet carrier, pending assembly onto the pins. | Assemble three gears correctly onto the corresponding pins on the carrier.
Task 2 | Resume from Partial State | Two gears already assembled correctly, with one gear placed beside the carrier for assembly. | Assemble the remaining gear onto the correct pin to complete the assembly.
Task 3 | Error Detection and Recovery | Three gears assembled, but one of them is larger than the correct size. One gear of the correct size is placed beside the carrier. | Correct the mistake by picking the wrongly assembled (larger) gear and placing it on the table, then assemble the correct gear onto the intended pin.
Note: “Correctly assembled” means the gear is placed on its intended pin/position according to the official evaluation script.
Section 2

Scoring Rules (Updated)

Atomic Event Scoring

Event | Definition | Points | Notes
Pick success (Tasks 1, 2, 3) | Grasp & lift a target gear | 1 | The gripper grasps the gear and lifts it off the table (detached from support).
Place success (Tasks 1, 2, 3) | Place gear onto intended pin | 2 | Correct placement onto the target pin meeting the required pose/fit.
Pick wrong gear (Task 3) | Pick the wrongly assembled (larger) gear | 2 | Grasp the wrong gear and lift it until detached from the pin. Picking the wrong gear is rewarded higher in Task 3 to emphasize error removal.
Place wrong gear to table (Task 3) | Place the removed wrong gear on the table | 1 | After removal, the wrong gear must be placed on the table to complete the recovery step.
No double counting | First success only per event | n/a | Repeated attempts on the same gear/event are not double-counted; only the first success is scored.
Final score | Weighted aggregation | n/a | Final Score = (4*S1 + 2*S2 + 4*S3) / 10 (consistent with the simulation setting).
Note 1: S1 / S2 / S3 denote task scores computed by the official evaluation scripts under the updated event scoring.
Note 2: In Task 3, picking the correct gear scores points only if the preceding removal of the wrong gear was successful.
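
The event rules above can be sketched in code for self-checking during practice runs. This is an illustrative sketch only; the official evaluation scripts are authoritative. The event names, the `(event, gear_id)` log format, and the function names are hypothetical. The sketch also assumes that "removal of the wrong gear" in the Task 3 note refers to the "Pick wrong gear" event, and that a pick of the correct gear made before that removal is simply not scored.

```python
from typing import Iterable, Tuple

# Point values taken from the Atomic Event Scoring table.
POINTS = {
    "pick": 1,                  # Pick success
    "place": 2,                 # Place success
    "pick_wrong": 2,            # Pick wrong gear (Task 3 only)
    "place_wrong_to_table": 1,  # Place wrong gear to table (Task 3 only)
}

Event = Tuple[str, str]  # (event name, gear id); hypothetical log format


def task_score(events: Iterable[Event], task: int) -> int:
    """Sum the first success of each (event, gear) pair for one trial."""
    seen = set()
    wrong_gear_removed = False
    total = 0
    for name, gear in events:
        if name not in POINTS:
            raise ValueError(f"unknown event: {name}")
        if name in ("pick_wrong", "place_wrong_to_table") and task != 3:
            continue  # recovery events exist only in Task 3
        if (name, gear) in seen:
            continue  # no double counting
        if task == 3 and name == "pick" and not wrong_gear_removed:
            continue  # assumption: pick before removal earns nothing
        seen.add((name, gear))
        total += POINTS[name]
        if name == "pick_wrong":
            wrong_gear_removed = True
    return total


def final_score(s1: float, s2: float, s3: float) -> float:
    """Weighted aggregation: (4*S1 + 2*S2 + 4*S3) / 10."""
    return (4 * s1 + 2 * s2 + 4 * s3) / 10
```

Under these assumptions, a Task 1 trial that picks and places one gear and then re-picks the same gear yields 3 points, because the repeated pick is ignored.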
Section 3

Evaluation Procedure (Updated)

Time Budget & Trials

Rule | Setting | Value | Explanation
Total team time | Per team (onsite) | 35 min | Each team has 35 minutes total onsite time: 5 minutes for initialization and 30 minutes for performance testing.
Recorded results | Per task | 4 results | For each task, the committee records 4 test results. Teams may attempt more runs, but only the highest 4 results will be recorded.
Per-task time cap | Task 1 / 2 / 3 | 12 / 6 / 12 min | Time caps per task: Task 1 = 12 min, Task 2 = 6 min, Task 3 = 12 min. If fewer than 4 tests are completed within the time cap, the missing results are recorded as 0.
Important: The initialization window (5 min) is intended for environment setup and safety checks only. Teams should plan to start evaluation runs promptly during the 30-minute testing window.
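
The recording rule (keep the highest 4 results per task, fill missing results with 0) can be expressed as a short helper for planning purposes. This is a sketch under the assumption that every score passed in comes from a run completed within the task's time cap. The names are hypothetical, and how the recorded results are combined into S1 / S2 / S3 is left to the official scripts.

```python
from typing import List, Sequence

RECORDED_PER_TASK = 4
TASK_TIME_CAP_MIN = {1: 12, 2: 6, 3: 12}  # minutes per task
TESTING_WINDOW_MIN = 30                   # minutes of performance testing


def recorded_results(run_scores: Sequence[float],
                     n: int = RECORDED_PER_TASK) -> List[float]:
    """Keep the n highest run scores; pad with 0 if fewer runs were completed."""
    top = sorted(run_scores, reverse=True)[:n]
    return top + [0.0] * (n - len(top))


# The three per-task caps add up to exactly the testing window.
assert sum(TASK_TIME_CAP_MIN.values()) == TESTING_WINDOW_MIN
```

For instance, five runs scoring 3, 9, 6, 9, and 2 would be recorded as 9, 9, 6, 3, while a single run scoring 5 would be recorded as 5, 0, 0, 0.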
On-site

On-site Notifications

File & Environment Management

Scope: The following policies apply to all onsite machines and are enforced to ensure fairness and system stability.
Item | Requirement | Constraint | Details
Code & checkpoints | Use team directory only | All teams | All teams should store code and checkpoints only under ~/roco/[teamname]. Any changes outside this directory must be performed with organizer assistance.
Conda environments | Team-scoped env naming | If using conda | Teams should only create and modify their own conda environments named roco_[teamname]_[serial]. Modifying base or other teams’ environments is not allowed.
System-level installs | Organizer approval required | All teams | Any installation to the system environment or ROS2 Python packages must be requested through the organizers.
System variables & paths | No direct edits | All teams | Editing existing system variables and paths is not allowed. Adding new environment variables requires organizer assistance.
Reminder: Please plan your deployment to be self-contained within your team directory and conda environment, and contact the organizers early if special system permissions are needed.
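
A deployment script can guard against accidental writes outside the team directory and against misnamed conda environments. The following is a minimal sketch with hypothetical helper names. The format of [serial] is not specified in the rules, so the pattern below assumes it consists of letters, digits, or underscores.

```python
import re
from pathlib import Path
from typing import Optional


def team_root(teamname: str, home: Optional[str] = None) -> Path:
    """Return ~/roco/[teamname], optionally relative to an explicit home."""
    base = Path(home) if home else Path.home()
    return base / "roco" / teamname


def is_inside_team_dir(path: str, teamname: str, home: Optional[str] = None) -> bool:
    """True if path resolves to the team directory or somewhere below it."""
    root = team_root(teamname, home).resolve()
    target = Path(path).expanduser().resolve()
    return target == root or root in target.parents


def is_valid_env_name(env_name: str, teamname: str) -> bool:
    """True if env_name looks like roco_[teamname]_[serial] (serial format assumed)."""
    pattern = rf"roco_{re.escape(teamname)}_\w+"
    return re.fullmatch(pattern, env_name) is not None
```

Resolving the path before the comparison means that a path such as ~/roco/alpha/../beta is rejected for team alpha, since it points into another team's directory.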