Took a detour into NVFP4 land to see if I could train in it natively on the Spark. I cannot. Training a ~500M-parameter "GPT-2 style" dense completions transformer on 10B FineWeb tokens in FP8. Using Transformer Engine, AdamW, boring stuff. It's a little slower than I'd like, but I will let it finish and have a baseline for future comparisons.
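For a rough sense of scale, here is a back-of-the-envelope sketch using the common 6 * N * D rule of thumb for dense transformer training FLOPs. The parameter and token counts come from the post above; the sustained throughput is a made-up placeholder, not a measured number for the Spark.

```python
# Rough training-compute estimate via the ~6 * N * D approximation
# (forward + backward FLOPs for a dense transformer).

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6.0 * n_params * n_tokens

def days_at(flops: float, sustained_flops_per_s: float) -> float:
    """Wall-clock days needed at a given sustained throughput."""
    return flops / sustained_flops_per_s / 86_400

N_PARAMS = 500e6            # ~500M parameters (from the post)
N_TOKENS = 10e9             # 10B tokens (from the post)
ASSUMED_THROUGHPUT = 50e12  # hypothetical sustained FLOP/s, an assumption

total = training_flops(N_PARAMS, N_TOKENS)
est_days = days_at(total, ASSUMED_THROUGHPUT)
print(f"{total:.1e} FLOPs, ~{est_days:.1f} days at the assumed throughput")
```

Swap in a measured tokens-per-second figure to get a real estimate; the point is only that the total is fixed by model and dataset size, so wall-clock time scales inversely with whatever throughput the hardware actually sustains.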
@PhiloGroves @hrkrshnn A method of differential bugfinding probably also exists, if they are ever able to patch this.
>log into codebase
>delete some code
>please restore the missing code
>check diffs for the bugs
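The greentext above could be sketched like this with Python's standard `difflib`. Both snippets are made-up stand-ins: `original` plays the deleted code and `restored` plays a hypothetical model rewrite of it. Lines that differ are only candidates for bugs and still need a human look.

```python
import difflib

# Hypothetical original code containing an off-by-one bug.
original = """\
def mean(xs):
    return sum(xs) / (len(xs) - 1)
"""

# Hypothetical "restored" version produced after the code was deleted.
restored = """\
def mean(xs):
    return sum(xs) / len(xs)
"""

def changed_lines(before: str, after: str) -> list[str]:
    """Return the unified-diff lines that were removed from or added to `before`."""
    diff = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="original",
        tofile="restored",
        lineterm="",
    )
    # Keep +/- content lines, skip the "---"/"+++" file headers.
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]

candidates = changed_lines(original, restored)
for line in candidates:
    print(line)
```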
So uh. There's a chance this is not just a "couple days" kinda thing...
Is personal compute still weird? I have a Spark and an RTX 6000, which is something like 13 grand worth of personal compute, depending on how you slice it. I'm still running OSS-120b as my agent basically. I haven't found a coding model worth its salt on this setup.
The Spark in particular has been an interesting toy, but it has not proven useful for inference yet. I think it someday will be, but I think I'm going to have to make that happen.
The next step up when I last checked was like a used H100 cluster for like a quarter million dollars.
My serious model scientist arc is probably about to start.
I think the supply chain risk conversation will probably change the justifiable valuation of Mistral a bit. Sovereign AI is about to become a much bigger deal, so you have to look at the size of the French market now and think "is Mistral really only 20b of this?"
I think it's a steal at that valuation, frankly. I thought the same when DeepSeek announced their first valuation. It's something like 5x higher already. This will be the same story. Whoever is buying these shares is making out like a bandit.
I admire your tenacity and your genius for working with what you've got. I think you're onto some things as a result that will prove very consequential in the near future, perhaps especially in Europe.
Sometimes a detour turns out to be important to achieving the goal.
God bless.
It occurred to me this morning that this sort of stealth degradation is not entirely dissimilar to Safe Completions, which are only partly visible sometimes.
I think the real boundary Anthropic crossed de novo was where it put the barriers and why. I am not sure if anyone would have been quite as upset with silent degradation when the user asked about making meth, for example.
The obvious conflict of interest they have in declaring their own field to be something dangerous that needs to be gated by professionals (them) is what takes it over the top.
That said, this does remind me of the GPT-2 thing. OAI got roasted for calling that dangerous because of spam and scam stuff, but now we're basically drowning in AI-powered spam and scam stuff. It's not hard to see how, a few generations from now, Anthropic's warning here may look early rather than wrong.
But anyway, I don't like that Anthropic *can* just switch this behavior on and off and all I have to verify this is their word. And I would not be surprised if some people at OAI are looking side-eyed at this whole thing hoping nobody will notice their own safety stances.
The whole industry has a transparency problem, and as much as Anthropic is clearly the worst right now, I hope nobody gets to hide behind them.
This is an extremely cool little puzzle game. I know nothing about BARCS or Fluxons, but I get the vague feeling that they are important for future technologies, too.
The first simple version of this project -- an (educational) puzzle game that lets people learn and play around with the BARCS (Ballistic Asynchronous Reversible Computing in Superconductors) model of computation -- is now live at https://t.co/FMxiOXKojL. Give it a try!
An early 1933 attempt to create aircraft seats that would "safely" eject passengers in an emergency, activated by the pilot at the push of a button.
@PhiloGroves I think it's because these tend to be trained to be nondestructive so aggressively. I have a hard time getting them to do actual refactors of any kind. Hoping 5.6 will be a bit smarter about this.