@giffmana @andrew_n_carr Closed post training instead of pretraining, but:
1. I believe Fireworks AI said they use a torchtitan fork for their trainer
2. IIRC Thinky Machines also said the same
@WillManidis Sitting in the front makes you look goofy and subservient.
Nevertheless I do it, as the back seats are not designed for a full height man.
@snwy_me My (unsubstantiated) hypothesis is that GPT 4.5 is far less sparse than Mythos. It was trained 2 years ago, and even if the total param count of GPT 4.5 ~ Mythos, GPT 4.5 had way more active params, making it cost prohibitive to infer (and making RL vastly more costly).
@teortaxesTex I like your thesis that Ant's interpretability research is dual use, and I bet starting ~ Sonnet 3.5 they've been able to use this to instill better reasoning in pre / mid training under dense supervision. 3/n
@teortaxesTex Marten hasn't published a tonne (exception https://t.co/MQ9nlKZtxQ) since then, but I'd guess his work has been used to scale up the models & keep them stable with hundreds of trillions of tokens. 2/n
@ambuj0 @DavidSacks Fable is Mythos + guardrails (for bio, cybersecurity, etc. tasks).
It is unclear if partners had early access to the guardrails. I would wager they just got access this week with the rest of us, and have thus just found the exploits.
@TheLaurenChen Kids with an Asian mom and white dad are more likely to be weird and leftist. I have several conservative Wasian male friends, and they all come from Asian dad + white mom (despite that being much rarer).
The white dad + Asian mom kids, especially sons, end up with trauma.
@WestsideLAGuy Which is why everyone who lives in other SF neighborhoods loves to shit on it.
Although that leads to self reinforcement, as everyone else self selects out of moving there.
@ArmouredJester @Anarseldain If you don't mind me asking, how so? She isn't suited to being around kids, or just too lazy to run a household, or something else?
@CartoonsHateHer Also, you want your future sons to be successful. If their mother is brain dead, their odds are much lower.
The joke I've heard around Silicon Valley is that guys try to find the most successful woman they can, and then get her to stay home.
I was using ChatGPT for legal advice and it decided to completely hallucinate some preposterous nonsense about how growing wheat to use on my own farm somehow constitutes interstate commerce
@zerohedge They're also collaborating with DeepMind - they use a lightweight version of Gemini multimodal to do long-tail understanding. My insider friends have said even the progress from Gemini 2.5 -> 3.0 was massive in terms of understanding weird scenarios.
@zerohedge Waymo is expanding rapidly, both geographically and in terms of engineering team size. They are concerned about Tesla & Chinese robotaxis, but the insider engineers I know feel very bullish on their prospects.