Agent mode on ChatGPT Atlas isn't working. This is literally the only reason I'm on Atlas!
Love the browser but please fix
@ChatGPTapp @OpenAI @thsottiaux
the year is 2027
you're running unlicensed GLM 6.7 in the corgi cafe on your M6 MacBook Pro which you mortgaged your house to buy
suddenly there's a knock on the door - it's the department of inference and intelligence
you're sentenced to 13 years for building a react app
So far Ornith-1.0-35B performs better in some coding tasks but worse in others, compared to Qwen-3.6-27b. The results are inconclusive.
I'm looking forward to testing their 31B-dense model once they publish it.
Full tool-eval-bench results below.
https://t.co/bnDhQGM1zO
@ryaneshea You're comparing the ornith 397b to qwopus 27b 😅 to get those numbers
If you compare ornith 35b (75.6 swe bench verified) against qwopus 35b, the difference is 4 pts?
Then if you compare ornith 35b against qwopus 27b, the difference is 0.35 pts
This isn't revolutionary 😅
@ryaneshea People have been doing this for months. Dude named Jackrong uploaded qwopus models, where he distills opus reasoning into qwen models to achieve way better benchmark results.
Anyone with a mac studio can do this, this is not that big of a deal IMO
Browser automation in T3 Code is pretty darn sweet. Agent can create recordings so it can analyze frame by frame what's going on. Used to do this manually but now I can just tell the agent to!
Try it on nightly now, fingers crossed it'll be on latest tomorrow 🤞