@kalomaze @notnullptr Perhaps a dumb question, but how do you assess the strength of the pretrain vs post training/alignment disasters? Is it just the vibe of having latent capabilities or something more quantitative?
@willccbb The CLI is really slick / well designed. Nails balance of minimalism + features you'd expect to be productive. Even has a nice commenting flow for reviewing plans. It was pure pain using it with Grok Build lol but pumped to use Composer in it.
Nathan has worked tirelessly to bring forward the (truly) open research frontier. If we secure a positive outcome with ML & LLMs, it will have been in no small part due to him and others who dedicate themselves to the dissemination of knowledge at great personal opportunity cost.
My time at Ai2 / @allen_ai has come to an end.
Ai2 is a wonderful place. The last 2.5+ years building Olmo, Tulu, and other projects will be one of the peaks of my entire career. I'm extremely thankful for my teammates and the open community who made this work possible.
For me, it's time to try something different. I will still be working in the open model & open science spaces (more news on that soon). In the meantime I'll be spending a few months learning, chatting with a broader network, getting married (!!) and most importantly recharging from pouring my soul into this place.
I've attached the note I shared with the team and some fun photos from our time together. I'll keep cheering for Ai2 and am excited to see what you build next.
@patdennis Yeah, I can def see that. Esp as price/param count increases. Big q is: do closed frontier buy all the compute? Would be quite sad, but can see it happening. Or do neo clouds get enough to serve open models. Otherwise we’re looking at 120b as max param count for average joes lol.
@patdennis Sorry for the wall of text lol but for some optimism on US side: Nvidia, Arcee, and Zyphra are all training models on the open frontier and are very, very strong. Reflection, too, ostensibly, albeit without a release yet.
@patdennis I test all the open models pretty extensively and am a big believer fwiw. But the benchmarks are poor indicators of performance across many vectors. Chinese labs tend to benchmax quite a bit. Still very excited for the M3 weights release and will find many uses!
@code_star I wonder if dynamics will shift further this direction if frontier labs continue towards and beyond Mythos class models, selling fewer higher value tokens to large enterprises.
@fascinated @boomkatonline @raspberryjones Mine is almost entirely boomkat distro exclusives lol which means a lot of early PAN, Andy Stott’s Modern Love, and anything Mark Fell or his progeny ever release
@hamandcheese I think there is a huge distinction when it comes to biological 1 of 1 life vs frozen weights that can be distributed ad infinitum. I wonder if that’s what ppl are really grappling with? On that set of questions I’m far less sure.
@hamandcheese I think questions of consciousness are likely being used as a stand in for whether or not it is ethical to shut down an AI or use it to work for us. I’d argue it’s undoubtedly conscious, and that consciousness in humans is nothing special, mechanically. Just my feeling.