Published on August 1, 2024

Platform-generated corner case image for autonomous driving training data

Closing the Training Data Gap for Corner Cases with Synthetic Data Generation

Yodo Labs' Synthetic Data Platform generates annotated training data for visual recognition models in safety-critical industries. Purpose-built for functional correctness, not visual appeal, the platform enables organizations to close the training data gap for corner cases, rare object classes, and scenarios that cannot be collected in the real world.

  • Adapts to new visual domains with extremely few reference images
  • Deployed in autonomous driving and manufacturing quality inspection

The Training Data Gap

Deep learning models in autonomous driving, manufacturing inspection, and other safety-critical applications share a common failure pattern: they perform worst on the scenarios that matter most. We built the Synthetic Data Platform to close this gap, generating annotated training data for the corner cases, rare classes, and scenarios that cannot be collected in the real world. The following is a pilot we conducted with a major automotive OEM.

Case Study: A Major Automotive OEM

We worked with the R&D division of a major automotive OEM, whose autonomous driving team operates a multi-camera sensor suite with wide-angle and fisheye lenses. The client provided a subset of 24,000 annotated images from their production dataset as the training set, along with a separate held-out test set.

The training data gap in their dataset was typical of the industry. Passenger cars appeared thousands of times; school buses, fire trucks, and construction vehicles appeared a handful of times each. Close-range vehicles filling the entire camera frame, among the most dangerous scenarios for an autonomous vehicle, were severely underrepresented. For example, trucks so close that they extend beyond the frame and appear only partially occurred only 233 times in 24,000 images.
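The imbalance described above can be made concrete with a per-class instance count. The sketch below is illustrative only: the annotation format (one list of class names per image) and the toy labels are assumptions, and the only number taken from the case study is 233 occurrences in 24,000 images.

```python
from collections import Counter


def class_frequencies(labels_per_image):
    """Count instances of each class across a dataset.

    labels_per_image: iterable of lists of class names, one list per image
    (a hypothetical format chosen for this illustration).
    """
    counts = Counter()
    for labels in labels_per_image:
        counts.update(labels)
    return counts


# Toy labels for illustration only; this is not the client's data.
toy_labels = [["car", "car", "truck"], ["car"], ["car", "school_bus"]]
freq = class_frequencies(toy_labels)
print(freq.most_common())

# The one figure taken from the case study: 233 occurrences in 24,000 images.
truncated_truck_rate = 233 / 24_000
print(f"{truncated_truck_rate:.2%} relative to the image count")
```

At that rate the scenario turns up fewer than once per 100 images, which is why a model can score well on aggregate metrics while rarely seeing the case during training.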

This is primarily a data bottleneck, not an algorithmic one. The real world is long-tailed, corner cases cannot be staged or collected on demand, and conventional augmentation techniques cannot introduce visual diversity that was never captured.

We adapted the platform to each of the client's camera configurations and generated synthetic training data targeting four categories of gaps.

Rebalancing the Long Tail

Replace over-represented object classes with rare ones in existing scenes. The scene composition is preserved and only the target objects change, which directly rebalances the class distribution without new data collection.

Figure: a common vehicle in the original scene (top) replaced with a rare class, a bus, in the same scene (bottom). The replacement also works correctly with truncated objects at image boundaries.
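On the annotation side, class replacement amounts to relabeling selected instances while leaving their boxes untouched. The sketch below shows only that bookkeeping under assumed names: the record fields (`image_id`, `bbox`, `category`) and the function `relabel_for_rebalancing` are hypothetical, and the image synthesis itself, which the platform performs with its proprietary model, is not represented.

```python
import random


def relabel_for_rebalancing(annotations, source_class, target_class, n_swaps, seed=0):
    """Pick up to n_swaps instances of source_class and relabel them as target_class.

    Boxes are left unchanged, so the scene layout is preserved.
    Returns a new list; the input records are not modified.
    """
    rng = random.Random(seed)
    candidates = [i for i, a in enumerate(annotations) if a["category"] == source_class]
    chosen = set(rng.sample(candidates, min(n_swaps, len(candidates))))
    return [
        {**a, "category": target_class} if i in chosen else dict(a)
        for i, a in enumerate(annotations)
    ]


# Toy annotations for illustration only.
anns = [{"image_id": k, "bbox": [0, 0, 10, 10], "category": "car"} for k in range(5)]
anns.append({"image_id": 5, "bbox": [5, 5, 40, 20], "category": "bus"})

rebalanced = relabel_for_rebalancing(anns, "car", "bus", n_swaps=2)
print(sum(a["category"] == "bus" for a in rebalanced))
```

In a real pipeline each relabeled record would be paired with a regenerated image in which the object at that box actually belongs to the new class.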

Increasing Instance Density

Add new object instances at plausible positions within existing scenes, increasing the number of labeled objects per image.

Figure: the original scene (top) and the same scene with new vehicle instances added at plausible positions (bottom), increasing labeled object density.
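One small ingredient of choosing a position is avoiding heavy overlap with objects already in the scene. The sketch below uses rejection sampling with an IoU check as a crude stand-in; it is not the platform's placement method, which the source does not describe. It ignores road geometry and perspective, and the function names and the `max_iou` threshold are assumptions.

```python
import random


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def propose_box(existing, img_w, img_h, box_w, box_h, max_iou=0.1, tries=100, seed=0):
    """Sample a box inside the image that overlaps no existing box by more than max_iou.

    Returns the box as (x1, y1, x2, y2), or None if no candidate is accepted.
    """
    rng = random.Random(seed)
    for _ in range(tries):
        x1 = rng.uniform(0, img_w - box_w)
        y1 = rng.uniform(0, img_h - box_h)
        candidate = (x1, y1, x1 + box_w, y1 + box_h)
        if all(iou(candidate, e) <= max_iou for e in existing):
            return candidate
    return None


# Toy scene: one existing object in a 200x100 image.
existing_boxes = [(0, 0, 50, 50)]
new_box = propose_box(existing_boxes, img_w=200, img_h=100, box_w=20, box_h=20)
print(new_box)
```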

Targeting Specific Corner Cases

Define the exact scenario needed (an extremely close truck, an unusual spatial arrangement, a rare configuration) and generate as many diverse examples as required.

Figure: a real corner case (top) vs. a platform-generated one (bottom). The client requested a specific scenario: a truck so close it extends beyond the frame, on a Japanese highway, captured by a fisheye camera. The platform produces diverse examples matching the real deployment environment on demand.
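Before generating targeted examples, it helps to know how often the scenario already occurs in the real data. The sketch below is one possible way to flag "truck so close it extends beyond the frame" from bounding boxes alone: a box that touches the image border and covers a large share of the frame. The field names, the `margin` in pixels, and the `min_area` threshold are all assumptions, not the client's definition.

```python
def is_truncated(bbox, img_w, img_h, margin=1.0):
    """True if the (x1, y1, x2, y2) box touches any image border within margin pixels."""
    x1, y1, x2, y2 = bbox
    return x1 <= margin or y1 <= margin or x2 >= img_w - margin or y2 >= img_h - margin


def area_fraction(bbox, img_w, img_h):
    """Share of the image area covered by the box."""
    x1, y1, x2, y2 = bbox
    return max(0.0, x2 - x1) * max(0.0, y2 - y1) / (img_w * img_h)


def find_close_truncated(annotations, img_w, img_h, category="truck", min_area=0.25):
    """Select instances of category that are both truncated and large in the frame."""
    return [
        a for a in annotations
        if a["category"] == category
        and is_truncated(a["bbox"], img_w, img_h)
        and area_fraction(a["bbox"], img_w, img_h) >= min_area
    ]


# Toy annotations for a 1280x720 image; illustration only.
scene = [
    {"category": "truck", "bbox": (0, 100, 900, 720)},    # large, cut off at left and bottom
    {"category": "truck", "bbox": (400, 300, 500, 360)},  # small, fully inside the frame
    {"category": "car", "bbox": (0, 0, 1280, 720)},       # truncated, but not a truck
]
hits = find_close_truncated(scene, img_w=1280, img_h=720)
print(len(hits))
```

Counting such instances per split is how a figure like "233 in training vs. 503 in test" could be obtained, although the source does not say how the client produced its counts.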

Generating Unknown Objects

Generate objects that never existed in the training data (road debris, unusual obstacles, animals) as new object categories at specified positions with correct annotations. This extends the detection vocabulary to classes that have zero real-world training samples.

Figure: the original scene without an obstacle (top) and the same scene with a stone added on the roadway (bottom). The stone is a new object category absent from the original training data, generated at a specified location with a bounding box annotation.

Results

We added approximately 36,000 synthetic images to the 24,000-image training set and measured the impact on the client's object detection pipeline.

Figure: detection accuracy (mAP@0.50:0.95) as synthetic data volume increases, from 0.28 with 24K real images to 0.38 with 24K real plus 36K synthetic images. The curve had not plateaued.

mAP rose from 0.279 to 0.378. Adding approximately 36,000 synthetic images improved mAP by 0.099 absolute (35.5% relative). The performance curve had not plateaued at the maximum tested volume, though marginal gains may diminish as the synthetic-to-real ratio increases further.
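The absolute and relative gains follow directly from the two reported mAP values. The short check below uses only numbers stated in this case study; the variable names are illustrative.

```python
# Reported mAP@0.50:0.95 values from the case study.
baseline_map = 0.279   # 24,000 real images
augmented_map = 0.378  # 24,000 real + approximately 36,000 synthetic images

absolute_gain = augmented_map - baseline_map
relative_gain = absolute_gain / baseline_map

print(f"absolute: +{absolute_gain:.3f}")  # absolute: +0.099
print(f"relative: +{relative_gain:.1%}")  # relative: +35.5%

# Data mix at the largest tested volume.
real_images, synthetic_images = 24_000, 36_000
synthetic_to_real = synthetic_images / real_images
print(f"synthetic-to-real ratio: {synthetic_to_real:.1f}")  # synthetic-to-real ratio: 1.5
```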

Synthetic outperformed real at this data ratio. In this specific comparison, 24,000 real + 6,000 synthetic images achieved higher mAP than 30,000 real images. Synthetic images generated via the addition strategy contain more annotated instances per image, providing a denser training signal.

233 images improved a critical corner case. Trucks so close that they extend beyond the frame and appear only partially had only 233 examples in the training set, against 503 in the test set. Generating 233 targeted synthetic examples improved detection in this scenario.

Figure: detection of a truck extending beyond the frame fails before (top) and succeeds after (bottom) augmenting training with 233 synthetic images.

The Synthetic Data Platform

The Synthetic Data Platform is built on a proprietary image generation model developed in-house by Yodo Labs, based on the computer vision research of founder Xiuxi Pan, PhD. It is neither a general-purpose generator repurposed for training data, nor a wrapper around existing generative AI services. It is a dedicated model architected from the ground up for one objective: generating training data that makes recognition models better.

The mainstream generative AI industry competes on producing images that are beautiful and interesting, optimizing for human visual preference. We deliberately go in the opposite direction. Our model optimizes for functional correctness, defined by the two properties that determine whether synthetic data actually improves model performance and that existing generators often fail to deliver.

Positional fidelity: generated objects appear exactly where their annotations specify. The input layout becomes the output annotation by construction, with no manual labeling required. This is the property that makes synthetic data usable as training data at all, and one that general-purpose generators often treat as an afterthought.
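"The input layout becomes the output annotation by construction" can be illustrated with a small conversion step. The sketch below assumes a hypothetical layout format (class name plus pixel corners) and writes COCO-style records, whose `bbox` field is `[x, y, width, height]`. The platform's actual input and output formats are not described in the source, and the function name `layout_to_coco` is an assumption.

```python
def layout_to_coco(layout, image_id, category_ids, start_id=1):
    """Convert a layout specification into COCO-style annotation records.

    layout: list of (category_name, x1, y1, x2, y2) tuples in pixels
            (a hypothetical format used only for this illustration).
    category_ids: mapping from category name to integer id.
    """
    records = []
    for offset, (name, x1, y1, x2, y2) in enumerate(layout):
        width, height = x2 - x1, y2 - y1
        records.append({
            "id": start_id + offset,
            "image_id": image_id,
            "category_id": category_ids[name],
            "bbox": [x1, y1, width, height],  # COCO convention: [x, y, width, height]
            "area": width * height,
            "iscrowd": 0,
        })
    return records


# Toy layout: a truck cut off at the left edge and a stone on the roadway.
categories = {"truck": 1, "stone": 2}
requested_layout = [("truck", 0, 120, 800, 720), ("stone", 600, 500, 660, 540)]
annotations = layout_to_coco(requested_layout, image_id=42, category_ids=categories)
print(annotations[1]["bbox"])  # [600, 500, 60, 40]
```

Because the boxes are written down before the image exists, the labels are only as good as the generator's adherence to the layout, which is why positional fidelity is treated here as a core requirement.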

Stylistic coherence: generated images match the visual characteristics of the target deployment environment. The platform adapts to a new camera system or visual domain with extremely few reference images and no annotations. This is what closes the domain gap between synthetic and real data, helping synthetic images contribute to training rather than degrade it.

Applications

The Synthetic Data Platform is deployed across:

  • Autonomous driving: corner case coverage, long-tail rebalancing, unknown obstacle detection, multi-camera adaptation
  • Manufacturing quality inspection: rare defect augmentation, cold-start training for new product lines where no defect data yet exists

The underlying problem (too few examples of the scenarios that matter most) is common to any domain where visual recognition operates in safety-critical conditions.

Get Started

If your organization faces training data gaps for corner cases, rare classes, or new deployment scenarios, we would welcome a conversation.

Contact Yodo Labs →