We partnered with @FireworksAI_HQ to train open-source models for legal. Here's what we found:
1) Hybrid legal agents can beat frontier models on quality and cost by routing selectively to a frontier advisor.
We tested a hybrid setup where GLM 5.1 served as the primary worker, routing tasks to Opus 4.7 as an advisor when needed.
GLM invoked Opus sparingly, just 0.83 times per task on average.
The hybrid setup beat Opus on both quality and cost: 18% all-pass vs 14%, at $368 vs $954 across the same 100 tasks.
2) Post-training can push open models to frontier-level legal performance.
On a 100-task slice of our Legal Agent Benchmark (LAB), SFT moved Kimi 2.6's all-pass rate from 11% to 15%, beating Opus' 14%.
But the cost gap was even more striking: $84 vs $954 across the same 100 tasks, or ~11x cheaper.
We're excited to continue working with @FireworksAI_HQ on the next generation of open-source legal agents.
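The cost ratios quoted in this thread can be re-derived from its own figures. Below is a minimal Python sketch. The dictionary keys and function names are illustrative rather than part of LAB, and it assumes the $84 figure refers to the SFT'd Kimi 2.6 run.

```python
# Figures quoted in the thread (same 100-task LAB slice).
# Only the numbers come from the thread; all names here are illustrative.
RESULTS = {
    "Opus 4.7": {"all_pass": 0.14, "cost_usd": 954},
    "GLM 5.1 + Opus advisor": {"all_pass": 0.18, "cost_usd": 368},
    # Assumption: the $84 total is for Kimi 2.6 after SFT.
    "Kimi 2.6 + SFT": {"all_pass": 0.15, "cost_usd": 84},
}


def cost_ratio(baseline: str, candidate: str) -> float:
    """Return how many times cheaper the candidate is than the baseline."""
    return RESULTS[baseline]["cost_usd"] / RESULTS[candidate]["cost_usd"]


def cost_per_task(name: str, n_tasks: int = 100) -> float:
    """Return the average cost in USD per task over the slice."""
    return RESULTS[name]["cost_usd"] / n_tasks


if __name__ == "__main__":
    baseline = "Opus 4.7"
    for name in RESULTS:
        if name == baseline:
            continue
        print(
            f"{name}: {cost_ratio(baseline, name):.1f}x cheaper than {baseline}, "
            f"${cost_per_task(name):.2f} per task"
        )
```

By these figures the hybrid setup comes out roughly 2.6x cheaper than Opus alone, and the fine-tuned Kimi roughly 11x cheaper.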
A common informal use of torch.compile is to generate Triton code that people then copy-paste into their codebase. https://t.co/swpryMSoL6 is an experiment to put a nice API around this workflow. Curious to see if it will get any traction!
Check out this awesome work led by @reeselevine and many other great UCSC students!
It took so much work to get this running interesting models across many systems! Check it out (and try out the demos in the blog post!)
WebGPU support in llama.cpp is here! Check out our blog post introducing it: https://t.co/3OUusMYqIY
Run local models in your browser, with GPU acceleration. No data leaves your computer!
Thanks to everyone who's made this possible, especially @ggerganov
@dyn___ @anton_chuvakin I think of exploitable vulnerabilities as a natural resource like oil, minerals, etc. There are varying densities of proven reserves in various codebases and "mining" technology improvements make discovering and exploiting deeper vulns viable. They'll just keep getting rarer.
Grateful to @neal_katyal and his @MilbankLaw team for trusting us to help them prep for such an important case.
So many great lawyer<>AI collaboration tips in @neal_katyal's TED talk. Don't parrot the stochastic parrot; stay engaged. Find the human connection.
We got a lot of demand for a Tinfoil Rust SDK. In fact, it’s been a TODO post-it on our wall for over a year!
The reason it took this long was that we had to implement a Sigstore verifier for client-side supply chain verification and work around the lack of official libraries:
had a jane street interview in 2013
on the way there, i run into a dog. the dog is hurt. i stop to help it but am an hour late to the interview
i arrive at the office. the dog is my interviewer.
i sigh with relief. surely i'm hired.
'you didn't get the job' the dog says
'reason: ineffective altruism. you failed to realize that arriving on time, earning $500k/year as a junior trader, and donating 10% to shrimp welfare would have prevented approximately 4 million shrimp-hours of suffering. you saved one dog. me. a dog with negligible moral weight relative to the marginal bednet.'
i open my mouth.
'also i wasn't hurt. it was a trolley problem. you pulled the wrong lever.'
i nod. it is true. dogs are not a givewell top charity.
the dog slides a pamphlet across the desk. it says 80,000 hours.
'have you considered earning to give'
i start to cry. the dog does not update on this. the dog has read the sequences.
'one more thing,' the dog says. 'the dog you saved. that was also me. i contain multitudes. specifically, i contain a counterfactual in which you arrived on time and we are currently shaking hands. that version of you is now my colleague. he tips well at lunch.'
i leave the building. on the sidewalk, another dog is hurt.
i keep walking. i have learned.
the dog yells after me, 'WRONG. THAT ONE WAS REAL'
- written by Claude 4.7 (Adaptive)
Since this has blown up, I’d like to shout out all the other trans founders. I won’t name them out of respect for their privacy.
There aren’t a lot of us out there and it’s hard. All the ones I’ve met are incredibly hard working, kind, and generous people, not to mention completely fucking cracked.
Shit like this happens all the time but that’s life, can’t make everyone like you. :/ Life is not easy—for anyone