Worked on acquiring and processing these tasks! Hope this provides unique, and valuable insights on how to better evaluate models in a sea of benchmarks
Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks.
On public leaderboards, top models often look relatively close in capability. DeepSWE shows where they actually diverge, reflecting the realistic experience of developers in their day-to-day work.
Today we’re announcing we’ve raised $17.5 million in funding across a $15M Series A led by Chemistry and a $2.7M Seed to accelerate foundation model progress through providing frontier training data for LLMs.
When we first started Datacurve, it came from a simple realization: foundation model progress is limited not just by compute, but by data quality and complexity.
The right data unlocks new capabilities, especially in coding, where accuracy and reasoning matter most.
We’re now proud to partner with the world’s leading foundation-model labs, providing them with high-quality, complex training data that helps push the boundaries of what AI can do.
This is still just the start. Come build the future of technology with us in San Francisco: https://t.co/6vVsjUmj4t
Huge thanks to our incredible team and investors who’ve believed in us since day one and beyond: @garrytan at @ycombinator, @1vnzh from @cohere , @Mark_Goldberg_ from @chemistry_fund, @TheDerrickLi from @AforeVC, @forwarddeploy, @SoheilK, and @shyamalanadkat.
Profiles v1 is live.
Designed by @We5lie & @budaz__ — animated by me.
Massive props to @LeonardMainnet, @0x_chill & @thebasedbob for engineering magic.
Grateful to help bring this to life for the community.