258k real engineering tasks captured end-to-end: reasoning traces, tool calls, code edits, and explicit human acceptance signals. Built for labs training the next generation of coding agents.

| Capability | SWE-bench | HumanEval | CodeSearchNet | Zstate SWE |
|---|---|---|---|---|
| Full reasoning traces | ✕ | ✕ | ✕ | ✓ |
| Tool use captured | ✕ | ✕ | ✕ | ✓ |
| Human acceptance signals | ✕ | ✕ | ✕ | ✓ |
| Real production tasks | ✓ | ✕ | Partial | ✓ |
| Action-level reward signal | ✕ | ✕ | ✕ | ✓ |
| Scale (tasks) | 2,294 (300 in Lite) | 164 | ~100k | 258k |

We're now packaging a sample and schema for AI labs. Get in touch to be first in line, or to discuss a curated subset built for your training pipeline.