yolodex
case study · april 2026

yolodex

openai codex skill repo turning videos into fully trained object detection models. yolo + codex.

yolodex hero
about.

openai codex skill repo turning videos into fully trained object detection models. yolo + codex.

challenge.

a youtube video to a trained YOLO model, autonomously

yolodex is an openai codex skill that takes a video url and a class list and outputs production-ready YOLO weights. point it at a clip of subway surfers, say "detect player, train, coins," and 30 minutes later you've got a model you can drop into prod.

the pipeline

.agents/skills/ exposes 5 discrete steps the codex runtime orchestrates:

  • collect: yt-dlp pulls the video, ffmpeg extracts frames at the rate yolo needs
  • label: dispatches subagents in parallel git worktrees to label classes per-frame
  • augment: pillow runs a deterministic augmentation set (rotation, hue jitter, etc.)
  • train: ultralytics YOLOv8 with config-driven epoch counts
  • eval: mAP@50 scoring against a held-out split, with a fail-fast threshold

yolodex-run.sh runs the autonomous loop: train → eval → if accuracy < target, regenerate labels and retrain.

why parallel git worktrees

vision dataset labeling at scale is the bottleneck. companies pay $100k+/yr for manual annotation. naive llm labeling is sequential and slow. yolodex spawns subagents in isolated git worktrees: each agent labels its own subset of frames, commits to its own branch, and the orchestrator merges results.

worktrees give you fork-level isolation without the spin-up cost of containers. dispatch 4 agents in parallel and labeling finishes in roughly 1/4 the time. the orchestration runs on top of the codex sdk so it inherits the model orchestration primitives directly.

the demo

at the openai codex hackathon (feb 2026): "we're 19-year-old founders building opal, an ai gaming companion you can play alongside. but before an agent can play, it has to see. and training vision usually means manually labeling thousands of frames, costing companies $100k+ a year. so we supercharged codex to automate that entire workflow, for us and any team building vision-powered ai."

what shipped

top 5 finalist at the openai codex hackathon, $10,000 prize, presented to sam altman and greg brockman. team: joshua lin, philip chen, ryan ni (the ucsd goats). open-source skill repo, runs locally with uv package manager.

yolodex screenshot 2
yolodex screenshot 3
yolodex screenshot 4
stack.
YOLOOpenAI Codex
more work.