Benchmark task

Submit a benchmark task.

Send a real task where coding agents are expensive, unreliable, or hard to evaluate. The task may shape future benchmark specs and tools.

Good benchmark tasks include

What happens next

1. We triage for a concrete cost, reliability, eval, or workflow-architecture problem.
2. If there is a fit, we ask for a small sample: traces, prompts, tool lists, repo instructions, workflow notes, or anonymized task examples.
3. The first output is a scoped path: what to inspect, what to measure, and where savings or leverage are most likely.

Direct email

For lightweight notes, use research@vorplabs.com.