The full agent trajectory.
Not just the code.

258k real engineering tasks captured end-to-end: reasoning traces, tool calls, code edits, and explicit human acceptance signals. Built for labs training the next generation of coding agents.

Why this goes beyond the benchmarks
| Capability | SWE-bench | HumanEval | CodeSearchNet | Zstate SWE |
| --- | --- | --- | --- | --- |
| Full reasoning traces | ✗ | ✗ | ✗ | ✓ |
| Tool use captured | ✗ | ✗ | ✗ | ✓ |
| Human acceptance signals | ✗ | ✗ | ✗ | ✓ |
| Real production tasks | ~ | ✗ | ✗ | ✓ |
| Action-level reward signal | ✗ | ✗ | ✗ | ✓ |
| Scale (tasks) | 300 | 164 | ~100k | 258k |

One corpus.
Three lenses.

258k
Tasks
Task dataset
Real engineering problems with cleaned prompts and execution summaries. The foundation for SFT on problem comprehension and solution planning.
Engineering problems · Prompts · Summaries
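As a rough illustration of how a task record might feed SFT, here is a minimal sketch. The field names (`task_id`, `prompt`, `execution_summary`) are assumptions for illustration, not the published schema:

```python
# Hypothetical sketch: field names are illustrative assumptions,
# not the published schema.
def to_sft_pair(task: dict) -> dict:
    """Turn one task record into an input/target pair for SFT."""
    return {
        "input": task["prompt"],              # cleaned problem statement
        "target": task["execution_summary"],  # how the agent solved it
    }

task = {
    "task_id": "t-001",
    "prompt": "Fix the failing date parser in utils/dates.py",
    "execution_summary": "Reproduced the failure, patched the format string, reran tests.",
}
pair = to_sft_pair(task)
```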
3.7M
Steps
Trajectory dataset
Full step-by-step agent traces: reasoning, tool usage, and code generation at every decision point. The complete picture of how an expert agent solves problems.
Traces · Tool use · Reasoning
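A trajectory step like the ones described above might look roughly like this. The class and field names are assumptions sketched for illustration, not the published format:

```python
from dataclasses import dataclass, field

# Hypothetical step schema: names and fields are illustrative
# assumptions, not the published format.
@dataclass
class TrajectoryStep:
    task_id: str
    step_index: int      # position within the trajectory (~14.5 steps per task on average)
    reasoning: str       # the agent's thinking before acting
    action_type: str     # e.g. "tool_call" or "code_edit"
    payload: dict = field(default_factory=dict)

step = TrajectoryStep(
    task_id="t-001",
    step_index=3,
    reasoning="The stack trace points at the date parser; inspect it first.",
    action_type="tool_call",
    payload={"tool": "semantic_search", "query": "date parsing"},
)
```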
130k
Signals
Reward dataset
Explicit user acceptance signals at the action level, with a ~50% acceptance rate. Supports iterative multi-accept workflows and action-level reward modelling.
Reward signals · RLHF · RLAIF
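To make the action-level reward idea concrete, here is a minimal sketch of mapping accept/reject signals to binary rewards. The record shape and the `accepted` field are assumptions, not the published schema:

```python
# Hypothetical sketch of action-level reward extraction. The record
# shape and "accepted" field are assumptions, not the published schema.
def to_reward_examples(actions: list[dict]) -> list[dict]:
    """Map explicit accept/reject signals to binary rewards per action."""
    return [
        {"action": a["diff"], "reward": 1.0 if a["accepted"] else 0.0}
        for a in actions
        if "accepted" in a  # only a subset of tasks carries the signal
    ]

actions = [
    {"diff": "+ fixed format string", "accepted": True},
    {"diff": "+ speculative refactor", "accepted": False},
    {"diff": "+ unlabeled edit"},  # no signal: excluded here, still usable for SFT
]
examples = to_reward_examples(actions)
```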
258k
Real engineering tasks

3.7M
Trajectory steps

1.7M
Tool interactions across 22 tools

130k
Accepted code actions

14.5
Steps per task on average

63k
Tasks with acceptance signal
Each capturing explicit human approval at the action level, not just pass/fail at test time.

~50%
Acceptance rate on code actions
A meaningful signal-to-noise ratio for reward model training.

6–7
Tool calls per task on average
Semantic search, call graph analysis, file edits, CLI execution, and more, logged with full context.

25%
Tasks carrying an acceptance signal
The most valuable subset for reward-sensitive training. The remaining tasks retain full trajectory data for SFT.

Every tool interaction
logged with context.

Semantic search
Call graph analysis
File edits
CLI execution
Directory traversal
Test runner
Symbol lookup
Dependency graph
Code commenting
Diff generation
Static analysis
and 11 more tools across the corpus

Average 6–7 tool calls per task. Each call recorded with inputs, outputs, and the reasoning step that preceded it, giving reward models the full decision context, not just the outcome.
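A logged tool call could be reconstructed into a (reasoning, call, outcome) triple along these lines. The field names below are illustrative assumptions, not the published schema:

```python
# Hypothetical sketch: rendering one logged tool call together with the
# reasoning step that preceded it. Field names are illustrative assumptions.
def decision_context(call: dict) -> str:
    """Render one tool call with its preceding reasoning and its outcome."""
    return (
        f"thought: {call['preceding_reasoning']}\n"
        f"call:    {call['tool']}({call['inputs']})\n"
        f"result:  {call['outputs']}"
    )

call = {
    "tool": "test_runner",
    "inputs": {"path": "tests/test_dates.py"},
    "outputs": "1 passed",
    "preceding_reasoning": "Verify the patch by rerunning the failing test.",
}
print(decision_context(call))
```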

Ready to see
the schema?

We're packaging a sample and schema for AI lab outreach now. Get in touch to be first in line, or to discuss a curated subset built for your training pipeline.

Request schema + sample · Discuss a custom subset