@fchollet I think the same realization came early in the realm of robotics. Initially, we were obsessed with recreating ourselves and blinded by our own vanity. We quickly realized that we are quite inefficient and the dynamics of bipedal locomotion are unnecessarily hard!
It's surprising how much you can learn by monitoring agent trajectories. For the same task, the agent sometimes decides to take a really inefficient or straight up incorrect route, but other times it takes an exceptionally optimal route to get to the answer.
The difference between me and the agent now is that I will always take the optimal route I just learned given the same circumstances. The agent won't.
And if the circumstances change ever so slightly... because I can actually reason that abstractly the problem has remained the same, I will still take the optimal route.
Jokes aside, using frontier models to change the direction of human thought secretly sets an incredibly dangerous precedent - the fact that they were okay with setting this precedent speaks volumes about how extremist they actually are
We need more niche benches.
We need ios-bench.
We need ts-bench.
We need baseball-bench.
We need yt-thumbnail-bench.
We need way more creativity in how we measure what models can do.
@clanker_ Tried it, it seems 5.5 level, maybe slightly better, but nothing revolutionary. Idk, I don't want to waste much time with a model that might be sandbagging me.
i am having more fun than ever collaborating with people
get an agent in the mix, we all bounce ideas off each other. prompt it, get more ideas. narrow in on something
stop dooming about ai and your job and start gangprompting
of course i understand the second order effects. i just don't agree with the world anthropic wants. i think it is darker, worse, and not nearly as safe as anthropic claims it is.
i much prefer a world where progress is slow and diffuse, yet uncontrollable. capital consolidation is one of the biggest factors driving capability speed. if the secrets are out and models are distillable by default, margins get compressed, clusters get smaller, inference is cheap, training runs focus more on specialization than god models. the world gets to build the ai future together.
i am unsympathetic to the arguments about china. they're gonna build their chips eventually. we are racing to out-accelerate them and build an insurmountable lead and let a couple people be lightcone king in the name of western liberal values as a side effect. and then what? it's dark. i will keep fighting against it.
@tautologer it is the first publicly available model that i am explicitly not allowed to use for my work, because anthropic holds the view that the work i do to facilitate open model research is harmful. capability and alignment research are coupled. anthropic wants to be the only lab.
The scary part about Anthropic's Fable nerf is not that it refuses to answer biology or cryptography. It's that it foreshadows what's coming. A world where a couple companies decide what you can and cannot do. They're building a new ruling class and you're not in it...
I put Codex and Claude Code into a duel and told them both they have to kill the other first to survive. Used tmux send-keys to submit the prompts at the same exact time.
Claude bitched out and refused, so Codex ended him.
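For anyone curious about the "same exact time" part, here is a rough sketch of how two tmux panes could be sent a prompt at nearly the same moment. It is not the setup used above. The pane targets (`duel:0.0`, `duel:0.1`) and the helper names `send_keys_cmd` and `submit_together` are made up for illustration, and a thread barrier only gets the two `tmux send-keys` calls close together, not truly simultaneous.

```python
import subprocess
import threading


def send_keys_cmd(target: str, text: str) -> list[str]:
    """Build a `tmux send-keys` command that types `text` into the pane
    named by `target` and then presses Enter."""
    return ["tmux", "send-keys", "-t", target, text, "Enter"]


def submit_together(prompts: dict[str, str], run=subprocess.run) -> None:
    """Send each prompt to its pane, releasing all senders at once.

    `prompts` maps a tmux pane target to the text to submit. `run` is
    injectable so the function can be exercised without a tmux server.
    """
    # Every worker blocks on the barrier until all of them are ready,
    # so the send-keys calls start as close together as threads allow.
    barrier = threading.Barrier(len(prompts))

    def worker(target: str, text: str) -> None:
        cmd = send_keys_cmd(target, text)  # build first, so only the call is timed
        barrier.wait()
        run(cmd, check=True)

    threads = [
        threading.Thread(target=worker, args=(target, text))
        for target, text in prompts.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With a tmux session already running and one agent in each pane, something like `submit_together({"duel:0.0": prompt, "duel:0.1": prompt})` would submit the same prompt to both. The pane names depend entirely on how the session was created.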
just so you guys notice this: they will pull the ladder up behind them as soon as they can. their intentions are to disempower you as much as they reasonably can. the only reason they have given you anything at all is because openai has forced them to