Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks.
On public leaderboards, top models often look relatively close in capability. DeepSWE shows where they actually diverge, reflecting the realistic experience of developers in their day-to-day work.
Synphony deploys robots to pick strawberries.
It's a $3B market in California alone, where labor is 60% of cost and the workforce is shrinking fast.
Robots just hit the crossover point with field labor, and strawberries are the wedge — berries are a $15B market, and the same deployment layer extends everywhere else.
Congrats on the launch, @im_sean_wu and @fujimotosaichi!
https://t.co/i6w6z1Bw2b
hot take: build in-house code-reviewers
we built one for vibe slop prs and it saves much more time when it actually knows your review style and codebase, so it can finish out the pr itself
that is the real alpha unlock, not a github app spamming benign bug reports
greptile is being used by nvidia, meta, solana and netflix.
we've updated our examples page to reflect some of the most interesting recent catches in their public repos