Traverse (@traverseso) is a data research lab solving the hardest problem in AI training: subjective, taste-dependent work.
They generate long-horizon training data for frontier models by observing and capturing what specialists actually do on the job.
Congrats on the launch, @lanceyyan and @thezacharyyu!
https://t.co/709lxPfMuN
or just build AGI fast enough that this worry ceases to exist. ideally, AI does all the blunt work and humans are 'free'
i do think there's a lot more room for optimism about what should be the most exciting moment in human history
Today, we are announcing Traverse.
We are building reinforcement learning environments for long-horizon agent journeys. Our goal is to automate all economically valuable work.
AI systems have made rapid progress on tasks that are easy to describe and easy to evaluate, yet they continue to struggle with the work that defines most professions. We believe this gap exists because agents are trained in environments that optimize for convenience rather than realism, and because the industry has largely converged on data collection methods that fail to capture how work is actually done.
The bottleneck to agent capability is no longer architecture or compute but data quality and environment fidelity. Much of today's training data is contrived and produced by people who do not deeply understand the domains being modeled. That process optimizes for volume and surface correctness but strips out the reasoning, judgment, and context that define real expertise.
The gold standard is to capture that expertise directly: observed data with reasoning traces intact. When observation is not possible, the next best thing is a pipeline where domain experts own every layer. Most of the industry does neither.
We are starting with software engineering because it is the most verifiable and deterministic domain that exists today. Automating the creation of software also starts a natural flywheel: agents that write code can eventually improve the infrastructure used to train them, a feedback loop we want to establish as early as possible.
Our approach is to assemble small teams of world-class domain experts and world-class ML engineers, and to hold that bar fixed as we expand into law, medicine, finance, and beyond.
The path to general intelligence is to traverse through every domain where work happens.
https://t.co/gQh8WRdbE0
Today, we're announcing Traverse. @traverseso
We're building reinforcement learning environments for long-horizon agent journeys, and our goal is to automate all economically valuable work.
The data labeling industry is broken in ways that aren't obvious from the outside. Semi-experts annotate, non-experts QA, non-technicals manage the entire pipeline, and post-training on your own data still isn't standard. The feedback loop is fundamentally broken, and bad data is being delivered to AI labs.
We envision the opposite, pairing world-class domain experts with world-class ML engineers at every layer of our stack. Our bar is extremely high because mediocre environments pollute the frontier models, and we'd rather not add to that.
Traverse will start by automating software engineering, as agents that write code can improve the infrastructure we use to train them, and we want that flywheel running as early as possible. Everything else follows from there.
If you're a researcher or frontier lab thinking about long-horizon agents, reach out.
https://t.co/HsX70hn4MR