Black Gibbon — Robotics Training Data

Training data
for robotics AI.
End to end.

Production-grade training data and embedded human specialists — purpose-built for teams building physical AI, manipulation systems, and autonomous robots.

Get A Quote → See Both Offerings →
📦
Offering 01
Synthetic Datasets
Ready-to-train robotics environments built on World Labs Marble + NVIDIA Isaac Sim. Delivered in days, not months.
👥
Offering 02
Supplemental Team
Embed 300+ specialist annotators and senior 3D sim engineers into your pipeline. Operational in under 30 days.
World Labs Marble · NVIDIA Isaac Sim · Isaac Lab · Isaac Replicator SDK · PhysX 5.1 · Omniverse RTX NuRec · MuJoCo · OmniGibson · BEHAVIOR-1K · Gaussian Splatting · USD / USDZ · COCO JSON · YOLO v8 · HDF5 · sim-to-real transfer · domain randomization
400+: Annotators across 5 centers, 2 branches
10yr: Japan AV market · Toyota + Tier-1 OEM programs
54-cls: Max segmentation class depth delivered at scale
400K: Files annotated per month at peak throughput
<30d: Supplemental team onboarding SLA
Frames / Month: 400K (Peak throughput · sensor fusion + seg)
Annotators Active: 247 (Across 3 shifts · Hanoi ops · 62% shift capacity)
QA Pass Rate: 98.9% (Rolling 30-day average)
Avg Turnaround: 11.2h (Submit → QA signed · SLA: 12h)
Synthetic Variants: 10K+ (Domain-rand. frames per scene)
Track record includes
Toyota Autonomous Vehicles· Japanese Tier-1 OEMs· Sensor Fusion Annotation· 54-Class Segmentation· Radar Event Labeling· 3D Cuboid Bounding Boxes
The Data Problem

Robotics teams are data-bottlenecked.
Not model-bottlenecked.

Building physical AI is hard enough. Building the training data to power it — at quality and scale — is what breaks most teams.

✕  The Status Quo
Manual 3D environment modeling costs $5K–$10K per scene and takes weeks
Off-the-shelf datasets don't match your robot, gripper, or facility
Annotation vendors lack robotics domain expertise — wrong taxonomies, wrong QA
Sim-to-real gap kills model performance when domain randomization is wrong
Hiring an in-house annotation team takes 6–9 months — you needed it yesterday
Four vendors for world gen, physics, annotation, QA — handoffs break quality
✓  The Black Gibbon Approach
Marble generates 500 environments in the time it takes to model one manually
Send facility photos or floor plans — we generate a sim-to-real digital twin
300+ annotators trained on Toyota AV programs know robotics taxonomies cold
Isaac Replicator domain randomization built into every dataset — 10K+ variants
Supplemental team operational in <30 days — no recruiting pipeline needed
One team owns world gen through QA sign-off — zero handoff friction
Same Project. Two Approaches.

What changes when you have the right data.

Traditional (Fragmented Vendors) · Slower
// Warehouse AMR training dataset
// 4 vendors: 3D studio + sim + annotation + QA
Timeline: 6 months
Cost: $180K
// 3D studio: $6-8K per environment
// 20 environments = $140K + 10 weeks
// Sim vendor: 3 weeks of integration
// Annotation BPO: no robotics expertise
// 12% error rate on 3D bounding boxes
// QA catches errors, rebuild required
// Sim-to-real accuracy: 61%
6 months
💸 $180K+
📉 61% sim-to-real
Black Gibbon (Integrated Pipeline) · 3–5x Faster
# Same warehouse AMR dataset
scene = marble.generate(
    prompt="warehouse AMR environment",
    inputs=["facility_walkthrough.mp4"],
    variations=500  # 500, not 20
)
# PhysX 5.1 physics auto-assigned
# 10,000 Replicator domain rand. frames
# 300+ annotators: Toyota-grade QA
# All formats: COCO, YOLO, HDF5, Isaac Lab
# Sim-to-real: 91%. Time: 3 weeks. $8K.
3 weeks
💰 $8K
🎯 91% sim-to-real
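The figures quoted in the two panels above can be put side by side with a few lines of arithmetic. This is a minimal sketch that uses only the numbers stated on this page; the dictionary layout and function name are illustrative, not part of any Black Gibbon tooling.

```python
# Figures as quoted in the two comparison panels above.
traditional = {"cost_usd": 180_000, "environments": 20}
integrated = {"cost_usd": 8_000, "environments": 500}


def cost_per_environment(project):
    """Total quoted project cost divided by the number of environments."""
    return project["cost_usd"] / project["environments"]


trad_unit = cost_per_environment(traditional)
bg_unit = cost_per_environment(integrated)
cost_ratio = traditional["cost_usd"] / integrated["cost_usd"]

print(f"Cost per environment: ${trad_unit:,.0f} vs ${bg_unit:,.0f}")
print(f"Total project cost ratio: {cost_ratio:.1f}x")
```

On the quoted numbers, the per-environment cost works out to $9,000 versus $16.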
How It Works

One pipeline. Zero handoffs.

Every stage — world generation through QA sign-off — is owned by one integrated team. No vendor seams. No quality loss between steps.

// Black Gibbon integrated pipeline: zero vendor handoffs
🌍 Marble (World Gen) → ⚙️ Isaac Sim (Physics) → 🎲 Replicator (Domain Rand.) → 👁️ 300+ HITL Annotators → 🛠️ Proprietary QA Tools → 📦 Production Dataset
01 — GENERATE

World Labs Marble creates the environments

Text prompts, facility photos, or video walkthroughs → Marble API → photorealistic Gaussian Splat worlds with valid USD geometry. Our 3D experts with decades of production experience curate every output.

02 — SIMULATE

Isaac Sim adds physics. Replicator adds variety.

USD import into NVIDIA Isaac Sim — PhysX 5.1 collision primitives, rigid body dynamics. Isaac Replicator then produces 10,000+ domain-randomized variants: lighting, albedo, object poses, clutter density, camera frustum.
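As a rough illustration of what domain randomization means in practice, the sketch below samples one set of scene parameters per frame using only the Python standard library. It does not call the Isaac Replicator API. The field names and value ranges are assumptions chosen to mirror the categories listed above (lighting, albedo, object poses, clutter density, camera), not the settings used in production.

```python
import random
from dataclasses import dataclass


@dataclass
class FrameParams:
    """One randomized parameter set per rendered frame (hypothetical fields)."""
    light_intensity: float      # arbitrary units, assumed range
    hdri_rotation_deg: float    # rotation of the environment map
    albedo_rgb: tuple           # base color multiplier per channel
    object_yaw_deg: float       # pose perturbation around the vertical axis
    clutter_count: int          # number of distractor objects in the scene
    camera_fov_deg: float       # camera field of view


def sample_frame_params(rng: random.Random) -> FrameParams:
    """Draw each parameter independently from a uniform range."""
    return FrameParams(
        light_intensity=rng.uniform(200.0, 2000.0),
        hdri_rotation_deg=rng.uniform(0.0, 360.0),
        albedo_rgb=tuple(rng.uniform(0.05, 0.95) for _ in range(3)),
        object_yaw_deg=rng.uniform(-180.0, 180.0),
        clutter_count=rng.randint(0, 25),
        camera_fov_deg=rng.uniform(50.0, 90.0),
    )


def sample_variants(n_frames: int, seed: int = 0) -> list:
    """Generate a reproducible list of parameter sets."""
    rng = random.Random(seed)
    return [sample_frame_params(rng) for _ in range(n_frames)]


variants = sample_variants(10_000)
print(len(variants), variants[0])
```

Seeding the generator makes the parameter set reproducible, so the exact values behind any frame can be logged and regenerated later.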

03 — VALIDATE

400+ humans review every frame. Our VSAT tool catches what others miss.

Semantic seg (up to 54 classes), instance seg, 2D+3D bounding boxes, 3D cuboid boxes, radar event labels, keypoints, depth QA — reviewed by annotators with 10 years of Toyota AV and Japanese Tier-1 OEM programs under their belt. Our proprietary VSAT tool tracks every action, every rejection, every correction with full Kibana-integrated auditability.

// Live pipeline run: warehouse_v3 · sensor_fusion_batch_041 · ● PROCESSING
01 🌍 Marble World Gen · ✓ done · 500 variants · USD export · 2.3GB
02 ⚙️ Isaac Sim Load · ✓ done · PhysX 5.1 · URDF · 847 meshes validated
03 🎲 Replicator Run · ✓ done · 10,240 frames · Albedo + HDRI + pose
04 👁️ HITL Review · ⟳ running · 7,842 / 10,240 · 54-cls seg + fusion
05 🛠️ VSAT QA Sign-off · ◯ queue · Multi-layer review · Kibana dashboard
06 📦 Dataset Export · ◯ queue · COCO JSON · YOLO v8 · HDF5 · MuJoCo XML
Overall batch progress: 78%
What We Offer

Two ways to work with us.
One team behind both.

Use one or both. The same 300+ specialists, 3D experts, and QA tooling power both offerings.

📦
Offering 01 — Dataset

Synthetic Robotics Datasets

Ready-made or custom-built robotics training data. Choose from pre-built environment types, or send your real facility and we generate a sim-to-real digital twin with thousands of domain-randomized variants.

  • Warehouse/AMR, factory, home, hospital, construction, agricultural environments
  • Sensor fusion: simultaneous 2D BBoxes, 3D cuboid boxes, and segmentation per frame — same pipeline used for Toyota AV programs
  • Segmentation up to 54 classes (5, 10, 28, 54-class levels); 9,000–15,000 files/month sustained throughput
  • Radar event annotation — video input → structured event output; 300,000 images/month at 0.2s/image
  • COCO JSON, YOLO v8, HDF5, Isaac Lab RL wrappers, MuJoCo XML output formats
  • VSAT in-house tool — 2D/3D shapes, video interpolation, semi-auto annotation, Kibana QA dashboard
Request a sample dataset →
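For readers unfamiliar with two of the output formats listed above, here is a minimal sketch of one bounding box written as a COCO-style annotation and then converted to a YOLO-style label line. The file name, image size, category, and box values are made-up examples. COCO stores a box as `[x_min, y_min, width, height]` in pixels, while a YOLO label line holds a zero-based class index followed by the box center and size normalized by the image dimensions.

```python
import json

# A minimal COCO-style document with one image, one category, one box.
coco = {
    "images": [{"id": 1, "file_name": "frame_000001.png", "width": 1280, "height": 720}],
    "categories": [{"id": 1, "name": "pallet"}],
    "annotations": [{
        "id": 1,
        "image_id": 1,
        "category_id": 1,
        "bbox": [320.0, 180.0, 640.0, 360.0],  # x_min, y_min, width, height
        "area": 640.0 * 360.0,
        "iscrowd": 0,
    }],
}


def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a pixel-space COCO box to normalized (cx, cy, w, h)."""
    x, y, w, h = bbox
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)


def yolo_line(ann, image, class_index):
    """Format one annotation as a YOLO label line."""
    cx, cy, w, h = coco_bbox_to_yolo(ann["bbox"], image["width"], image["height"])
    return f"{class_index} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


image = coco["images"][0]
ann = coco["annotations"][0]
# Assumption for this example: category ids start at 1, YOLO indices at 0.
line = yolo_line(ann, image, class_index=ann["category_id"] - 1)

print(json.dumps(ann))
print(line)
```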
👥
Offering 02 — Embedded Team

Supplemental Annotation & Sim Team

Embed our 300+ specialists directly into your pipeline. Annotators who use your tooling and taxonomy, or senior 3D sim engineers working inside your Isaac Sim workflow. Operational in under 30 days. Scale monthly.

  • 400+ annotators across 5 centers, proven on Toyota AV + Japanese Tier-1 OEM programs since 2015
  • Senior 3D experts: Marble, Isaac Sim — decades of production experience
  • Works in CVAT, Scale, Label Studio — or our in-house VSAT tool with Kibana quality dashboard
  • Multi-layer QC: unlimited correction cycles, full per-frame annotator + reviewer attribution
  • 3-shift operation → 24/7 throughput, US teams wake to completed work every morning
  • Custom taxonomy onboarding in <30 days — annotation team, 3D sim team, or both
Schedule an onboarding call →
The People

Two specialist teams.
One seamless product.

Vietnam-based, 24/7, operating across three shifts so your robot training never waits on a timezone.

[Illustration: Black Gibbon QA Suite v4.2 live QA panel. Frame 4821, BBox QA on robot_arm and conveyor, IOU 0.97, Rev #142, passing. Live QA metrics: BBox IOU 99.1%, Seg Precision 97.8%, Collision Validation 100%. 183 of 300 annotators online, shift 2 of 3, 6,241 frames reviewed.]
Team A — Annotation & QA

400+ Expert Annotators Ready to Embed

Vietnam-based specialists proven on Toyota AV programs and Japanese Tier-1 OEM projects since 2015 — 10 years of delivering the annotation types your HITL pipeline actually needs: sensor fusion, 54-class segmentation, radar events, 3D cuboid bounding boxes. Operational in under 30 days on your taxonomy and tooling.

400+: Annotators, 5 centers
10yr: Japan AV market
30d: Onboard SLA
Sensor fusion: simultaneous 2D BBoxes, 3D cuboid boxes, and segmentation masks per frame — 300,000+ images/month
Semantic segmentation up to 54 classes; radar event labeling (video in → structured output, 0.2s/image)
VSAT in-house tool — 2D/3D shapes, cuboids, polygons, video interpolation, semi-auto annotation, Kibana QA dashboard
Multi-layer QC, unlimited correction cycles — 3 shifts, US teams wake up to QA-signed batches every morning
[Illustration: Isaac Sim 4.x scene properties panel for warehouse_v3.usd. Physics: PhysX 5.1, rigid body. Collision meshes: 847 validated. Renderer: RTX NuRec (neural). Domain randomization: Replicator, 10K frames. Export: USD, GLB, PLY, MuJoCo. Marble: 500 variants. Run status: 10K frames, PhysX PASS, NuRec OK, routed to the HITL queue.]
Team B — 3D Simulation

Top 3D Experts, Decades of Production Experience

Senior specialists who shipped environments for VFX, game engines, and digital twin clients at the highest level — now applying that expertise to robotics simulation pipelines. They own Marble → Isaac Sim end to end.

20yr+
Avg. experience
30+
Senior experts
Scalable output
Marble curation — expert prompting, multi-image input, 3DGS → USD pipeline
Isaac Sim pipeline: PhysX 5.1, RTX global illumination + HDRI lighting, RTX NuRec, articulation joints
Replicator SDK — albedo sweeps, HDRI rotation, pose randomization, clutter density
USD, GLB, PLY, MuJoCo XML export
3D Visualization Expertise

Every major engine.
Production-grade.

Our 3D simulation team averages 20+ years of experience across VFX, game engine pipelines, digital twins, and robotics simulation. They don't learn your toolchain; they've already shipped in it.

🌍
World Generation

World Labs Marble

From World Labs, the spatial intelligence company co-founded by Fei-Fei Li. Text prompts, facility photos, or video → photorealistic 3D environments with valid USD geometry. 500 variants where manual modeling produces one. Our team handles expert prompting, multi-image input, 3DGS-to-USD pipelines, and environment curation at scale.

Multi-image input · 3DGS → USD pipeline · Scene curation · Batch generation · 500+ variants/project
⚙️
Physics Simulation

NVIDIA Isaac Sim

The robotics simulation standard. PhysX 5.1 rigid body dynamics, articulation joints, accurate collision primitives, RTX global illumination + HDRI lighting, RTX NuRec neural rendering. Our team configures full robot URDFs, sensor rigs (depth, lidar, radar), and domain randomization pipelines from scratch.

PhysX 5.1 · RTX NuRec · URDF configuration · Sensor rigs · Replicator SDK
🎲
Domain Randomization

Isaac Replicator

Scriptable domain randomization for sim-to-real transfer. Albedo sweeps, HDRI rotation, object pose distributions, clutter density, camera frustum variation — 10,000+ unique frames per scene. Closes the 30-point accuracy gap between naive sim-trained and production-ready models.

10K+ variants/scene · Albedo sweeps · Pose randomization · HDRI rotation
🧊
3D Authoring

Blender + Houdini

Environment authoring, procedural generation, and asset preparation. Blender for USD/GLB/PLY export pipelines and custom object creation. Houdini for procedural environment generation — warehouse layouts, road networks, industrial facilities — that feed directly into Isaac Sim.

USD/GLB/PLY export · Procedural gen · Asset rigging · Houdini PDG
Export formats we deliver in:
USD · GLB / glTF · PLY · COCO JSON · YOLO v8 · HDF5 · Isaac Lab · MuJoCo XML · SOLO · ROS bags
🕐

Your robots train while you sleep.

3 shifts in Hanoi means continuous throughput. US teams submit by EOD and wake to QA-signed datasets or annotated batches every morning, backed by SLA. It is the same 12-hour timezone advantage Black Gibbon engineering clients already rely on.

HANOI GMT+7
3 shifts · Primary ops
IRVINE CA
Account mgmt
COVERAGE
24/7
Always delivering
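The overnight handoff can be checked with a little clock arithmetic. The sketch below is illustrative only: it takes the GMT+7 offset and the 12-hour submit-to-QA-signed SLA quoted on this page, and assumes a submitter on US Pacific Standard Time (UTC-8) with no daylight-saving handling. The function and constant names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

HANOI = timezone(timedelta(hours=7), "GMT+7")
# Assumption: winter offset for Irvine, CA. Daylight saving is not modeled.
PACIFIC_STANDARD = timezone(timedelta(hours=-8), "PST")
SLA = timedelta(hours=12)  # "Submit → QA signed" SLA quoted on this page


def handoff(submitted_local: datetime):
    """Return (Hanoi wall-clock time at submission, local time the SLA window closes)."""
    received_hanoi = submitted_local.astimezone(HANOI)
    due_local = (submitted_local + SLA).astimezone(submitted_local.tzinfo)
    return received_hanoi, due_local


submitted = datetime(2026, 1, 15, 18, 0, tzinfo=PACIFIC_STANDARD)
received, due = handoff(submitted)

print("Received in Hanoi:", received.isoformat())
print("Due back locally: ", due.isoformat())
```

Under these assumptions, a batch submitted at 18:00 Pacific lands at 09:00 the next day in Hanoi, and the 12-hour window closes at 06:00 Pacific the following morning.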
Proprietary QA Tooling

Ten years of
zero-defect tooling.

Built in-house over a decade of production annotation for Toyota AV programs and Japanese Tier-1 OEM suppliers — where errors have million-dollar consequences. Our VSAT (VBPO Smart Annotation Tool) supports 2D/3D bounding boxes, cuboid boxes, polygons, polylines, and points; video interpolation; semi-automatic annotation; and a Kibana-integrated quality dashboard that tracks every action in real time.

When you embed our team, VSAT comes included — along with unlimited correction cycles. If a batch doesn't pass, we rework it. QA risk stays with us, not you.

🛠️

VSAT — In-House Annotation Tool

Basic shapes (2D boxes, 3D cuboids, polygons, polylines, points), video interpolation, semi-automatic annotation, and a Kibana dashboard for real-time QA stats and data visualization. Deployed and battle-tested across Toyota AV programs.

🏭

Multi-Layer Review — Unlimited Corrections

Every frame passes through annotator → reviewer → final sign-off. Rejected batches are reannotated at no charge. Reviewers comment directly on specific frames. Nothing ships until it passes your acceptance criteria.
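The annotator → reviewer → final sign-off flow described above can be pictured as a small state machine. This is a minimal sketch under stated assumptions, not the VSAT implementation: the stage names, the `Frame` class, and the actor ids are all hypothetical.

```python
from enum import Enum


class Stage(Enum):
    ANNOTATION = "annotation"
    REVIEW = "review"
    SIGN_OFF = "sign_off"
    ACCEPTED = "accepted"


# Approval moves a frame one stage forward.
NEXT_STAGE = {
    Stage.ANNOTATION: Stage.REVIEW,
    Stage.REVIEW: Stage.SIGN_OFF,
    Stage.SIGN_OFF: Stage.ACCEPTED,
}


class Frame:
    def __init__(self, frame_id):
        self.frame_id = frame_id
        self.stage = Stage.ANNOTATION
        self.history = []  # (stage, actor, action, comment) tuples

    def approve(self, actor):
        """Record an approval and advance to the next stage."""
        if self.stage is Stage.ACCEPTED:
            raise ValueError("frame is already accepted")
        self.history.append((self.stage.value, actor, "approve", ""))
        self.stage = NEXT_STAGE[self.stage]

    def reject(self, actor, comment):
        """Record a rejection with a comment and send the frame back for rework."""
        if self.stage not in (Stage.REVIEW, Stage.SIGN_OFF):
            raise ValueError("only reviewers or sign-off can reject")
        self.history.append((self.stage.value, actor, "reject", comment))
        self.stage = Stage.ANNOTATION


frame = Frame(4821)
frame.approve("annotator_17")                       # submitted for review
frame.reject("reviewer_03", "cuboid misaligned")    # sent back with a comment
frame.approve("annotator_17")                       # corrected and resubmitted
frame.approve("reviewer_03")                        # review passed
frame.approve("qa_lead_01")                         # final sign-off
print(frame.stage, len(frame.history))
```

Because a rejection always returns the frame to annotation, the number of correction cycles is unbounded, and every cycle leaves an entry in `history`.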

📋

Full Audit Trail with Every Delivery

Per-frame QA scores, annotator and reviewer IDs, randomization parameters, physics logs, compliance certification. When a model underperforms on synthetic-to-real transfer, you can trace it back to the specific frames.
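To show how such a trace-back might look, the sketch below stores a per-frame audit record and filters frames by a randomization parameter. The record layout, field names, and sample values are assumptions for illustration; the actual delivery schema is not specified on this page.

```python
from dataclasses import dataclass, field


@dataclass
class AuditRecord:
    """Hypothetical per-frame audit entry."""
    frame_id: int
    annotator_id: str
    reviewer_id: str
    qa_score: float
    randomization: dict = field(default_factory=dict)


def trace_frames(records, param, low, high):
    """Return ids of frames whose randomization parameter lies in [low, high]."""
    return [
        r.frame_id
        for r in records
        if param in r.randomization and low <= r.randomization[param] <= high
    ]


records = [
    AuditRecord(1, "ann_01", "rev_01", 0.99, {"light_intensity": 1500.0, "clutter_count": 4}),
    AuditRecord(2, "ann_02", "rev_01", 0.97, {"light_intensity": 250.0, "clutter_count": 18}),
    AuditRecord(3, "ann_01", "rev_02", 0.98, {"light_intensity": 300.0, "clutter_count": 2}),
]

# Example: a model struggles in dim scenes, so pull the low-light frames.
dim_frames = trace_frames(records, "light_intensity", 0.0, 400.0)
print(dim_frames)
```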

VSAT QA Dashboard — 10 Years in Production · Toyota AV Standard
📐 2D Bounding Box IOU Accuracy: 99.1%
📦 3D Cuboid Box Precision: 98.6%
🎭 Segmentation Mask Precision: 97.8%
📡 Radar Event Detection Rate: 99.6%
🏷️ Label Taxonomy Consistency: 98.4%
⚡ Physics Plausibility Score: 99.5%
🏆 Japanese Client Standard: Benchmarked against Toyota AV and Tier-1 OEM zero-defect thresholds. Kibana dashboard available for real-time client visibility. Full audit trail shipped with every delivery.
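For context on the bounding-box metric, IOU (intersection over union) compares a submitted box against a reference box: the overlapping area divided by the area covered by either box. The sketch below is the standard definition for axis-aligned 2D boxes. How the per-frame scores are aggregated into the percentage shown on the dashboard is not stated here, so that step is left out; the box coordinates are made-up examples.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero when the boxes do not overlap.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


reference = (100, 100, 200, 200)
submitted = (110, 100, 210, 200)   # same size, shifted 10 px to the right
print(round(box_iou(reference, submitted), 4))
```

A 10-pixel shift on a 100-pixel box already drops the score to roughly 0.82, which is why IOU is a sensitive check on box placement.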
// VSAT dashboard — annotation throughput · real-time
LIVE SESSION
Active Job Queue
warehouse_v3_fusion · 7,842 frames · running
auto_line_seg_54cls · 12,000 frames · ✓ done
radar_ev_batch_017 · 300K imgs · 0.2s · queue
cuboid_boxes_r3 · 4,500 frames · queue
keypoint_dms_041 · 8,000 frames · queue
Daily Throughput — Frames Annotated
This week avg: 53,400 frames/day
↑ 12% vs last week
Supplemental Team Onboarding

Full team operational
in under 30 days.

01
DAY 1–3

Requirements & Taxonomy Review

Your labeling taxonomy, QA standards, and workflow tools reviewed. We adapt to you — not the other way around.

02
DAY 4–10

Tool Integration & Pilot Batch

Annotators set up in your tooling. First 500-frame pilot batch delivered for calibration and feedback.

03
DAY 11–20

QA Calibration & Alignment

QA thresholds tuned to match your exact standards. Taxonomy edge cases resolved and documented.

04
DAY 21–30

Full Throughput — Production Ready

Full team operational on daily delivery cadence. 24/7 throughput begins.

Why embed vs. hire?

Recruiting a 50-person annotation team takes 6–9 months. We're operational in 30 days.

No recruiting pipeline. No training ramp. 400+ specialists already running Toyota-grade AV annotation — sensor fusion, 54-class segmentation, radar event labeling, 3D cuboid boxes — plug directly into your Marble → Isaac Sim → HITL workflow from day one.

What you get from day one

A team that's been delivering for Japanese Tier-1 OEMs since 2015, with VSAT tooling and multi-layer QC calibrated to your spec in 30 days.

400+ annotators across 5 centers already on payroll — no headcount commitment
VSAT included — 2D/3D boxes, cuboids, polygons, video interpolation, Kibana QA dashboard
Unlimited correction cycles — rework at no charge until every batch passes your spec
Scale month-to-month to match your Marble/Isaac Sim training throughput needs
24/7 output — 3 shifts in Hanoi, annotated batches ready every morning
Schedule an Onboarding Call →
Engagement Models

Dataset order, embedded team,
or both.

Start with a free sample to validate quality. Scale to a full embedded team when you're ready. Cancel monthly.

📦
Dataset · One-Time

Synthetic Dataset

Order a ready-made or custom-built dataset. Delivered in 3–5 days. No commitment required.

  • Starter: 5K frames from $1,500
  • Growth: 50K frames from $8,000
  • Custom facility digitization available
  • QA scorecard + audit trail included
Get a quote →
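As a quick worked example, the starting prices above translate into the following per-frame figures. The tier names and numbers come from this page; because the prices are quoted as "from" amounts, the results are lower bounds rather than firm quotes.

```python
# (frames, starting price in USD) per tier, as listed above.
tiers = {
    "Starter": (5_000, 1_500),
    "Growth": (50_000, 8_000),
}

# Starting price divided by frame count for each tier.
per_frame = {name: price / frames for name, (frames, price) in tiers.items()}

for name, cost in per_frame.items():
    print(f"{name}: from ${cost:.2f} per frame")
```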
MOST POPULAR
🔄
Dataset · Monthly Retainer

Ongoing Data Production

Weekly dataset drops from a dedicated pipeline team. Unlimited environments, all formats, SLA-backed delivery.

  • Unlimited environments per month
  • All annotation types + custom taxonomies
  • Dedicated PM + QA dashboard
  • Weekly delivery cadence with SLA
Talk to sales →
👥
Supplemental Team · Monthly

Embedded Specialists

Annotation team, 3D sim engineers, or both — embedded into your pipeline. Operational in <30 days.

  • Annotation, sim, or full pipeline team
  • Scale 10 to 300+ month-to-month
  • Your tooling + your taxonomy
  • Cancel monthly, no lock-in
Schedule a call →
FAQ

Common questions

Everything you need to know about Black Gibbon synthetic datasets and supplemental annotation teams.

What annotation types does the HITL team support at scale?
The full production stack — tested across Toyota AV programs: 2D bounding boxes with attributes (traffic lights, road markings, vehicle lights with on/off state); 3D cuboid boxes for vehicles; semantic segmentation up to 54 classes (5, 10, 28, 54-class levels at 9,000–15,000 files/month); polygon and polyline annotation for lanes and driveable regions; radar event labeling (video input → structured output, 300,000 images/month at 0.2s/image); keypoint/pose annotation for in-cabin driver monitoring and worker action detection; and image classification OK/NG or multi-class. All delivered through our VSAT tool with Kibana QA dashboard and multi-layer review.
Can you digitize our real facility for sim-to-real training data?
Yes. Marble accepts multi-image inputs, video walkthroughs, and 360° panoramas. Send photos or a walkthrough of your actual facility and we generate a sim-to-real digital twin that matches your deployment environment — then produce thousands of domain-randomized Isaac Replicator variants from it. Included in Growth and Retainer dataset tiers.
How quickly can the supplemental annotation team be operational?
For standard annotation domains (semantic segmentation, 2D/3D bounding boxes, cuboid boxes, radar labels), we can be fully operational in under 2 weeks. For custom robotics taxonomies in new domains, we target 30 days. Week 1–2: pilot batch delivery, taxonomy review, QA calibration. Week 3–4: full-rate daily delivery begins. Team size scales from dozens to hundreds month-to-month — no headcount commitment from you.
What is VSAT and how does it fit into the Marble → Isaac Sim pipeline?
VSAT (VBPO Smart Annotation Tool) is our proprietary in-house annotation platform. It supports 2D boxes, 3D cuboids, polygons, polylines, points, video object interpolation, and semi-automatic annotation — with a Kibana-integrated dashboard for real-time QA tracking. After Isaac Replicator generates frames, those frames are routed to our HITL queue in VSAT for annotation and multi-layer review before packaging into your final dataset. We can also work in CVAT, Scale AI, or Label Studio if you have an existing toolchain.
How does the 24/7 delivery cycle work?
Our Hanoi team operates in 3 shifts covering the full 24-hour cycle. US teams submit Marble-generated frames or Isaac Sim renders by end of business and receive completed, VSAT QA-validated annotation batches the next morning — every time, backed by SLA for enterprise clients. This is the same 12-hour timezone advantage Black Gibbon engineering clients already rely on for overnight model monitoring and retraining.
Can we combine synthetic dataset production with a supplemental annotation team?
That's the most common enterprise setup. Marble → Isaac Sim generates the synthetic frames; our HITL team annotates them at scale using VSAT. The 400+ annotators who power the embedded team engagement also run the HITL QA stage of every dataset delivery — same team, same QA standards, no vendor change when you scale from dataset orders to ongoing pipeline ownership.
Get Started

One partner for your
entire data operation.

Start with a free 500-frame sample dataset, or book a call to explore the supplemental team model. No commitment required.

📦

Start with a Dataset

Get a free 500-frame sample with full annotation suite and QA scorecard. Validate pipeline quality before committing.

Get Free Sample →
👥

Explore the Team Model

Book a 30-minute call to discuss your annotation or sim pipeline needs. We'll map out a team structure and onboarding timeline.

Schedule a Call →
© 2026 Black Gibbon. All rights reserved.  ·  Irvine, California · Hanoi, Vietnam