Obscure Local Historian

Verified account

@ObscureLocal

I like exploring things you probably haven't heard about, in places you probably haven't been. Check out my articles.

Joined June 2025

188 Following

122 Followers

355 Posts

Obscure Local Historian

about 10 hours ago

Took a detour into nvfp4 land to see if I could train it natively on the Spark. I cannot. Training a ~500m "gpt-2 style" completions dense transformer on 10b fineweb tokens in fp8. Using Transformer Engine, AdamW, boring stuff. It's a little slower than I'd like, but I will let it finish and have a baseline for future comparisons.

1

0

0

0

22

Obscure Local Historian

1 day ago

@mkurman88 Is this full pretraining we're talking about or fine tuning?

1

1

0

0

86

Obscure Local Historian

2 days ago

@ArmandDAngour "I am Spartacus" -Pseudo-Spartacus

0

2

0

0

39

Obscure Local Historian

3 days ago

@PhiloGroves @hrkrshnn A method of differential bugfinding probably also exists, if they ever are able to patch this.
>log into codebase
>delete some code
>please restore the missing code
>check diffs for the bugs
So uh. There's a chance this is not just a "couple days" kinda thing...

0

2

0

0

25

Obscure Local Historian

3 days ago

@VictorTaelin I am embarking on some research on intelligence density. That is what will matter, because we're going to have to train it at home.

0

0

0

0

481

Obscure Local Historian

4 days ago

Is personal compute still weird? I have a Spark and an RTX 6000, which is something like 13 grand worth of personal compute, depending on how you slice it. I'm still running OSS-120b as my agent basically. I haven't found a coding model worth its salt on this setup. Particularly the Spark has been an interesting toy, but has not proven useful for inference yet. I think it someday will be, but I think I'm going to have to make that happen. The next step up when I last checked was like a used H100 cluster for like a quarter million dollars. My serious model scientist arc is probably about to start. 😂

0

1

1

0

66

Obscure Local Historian

4 days ago

I think the supply chain risk conversation will probably change the justifiable valuation of Mistral a bit. Sovereign AI is about to become a much bigger deal, so you have to look at the size of the French market now and think "is Mistral really only 20b of this?" I think it's a steal at that valuation, frankly. I thought the same when DeepSeek announced their first valuation. It's something like 5x higher already. This will be the same story. Whoever is buying these shares is making out like a bandit.

1

1

0

0

59

Obscure Local Historian

4 days ago

Feels like a propitious time for a repost. This one is aging spectacularly.

Obscure Local Historian

about 1 month ago

Learn how to be an AI hacker (and why), through the story of the master, @elder_plinius. An epic and true tale, in four parts. gg 🫶

3

40

4

46

12K

0

1

0

0

47

Obscure Local Historian

5 days ago

I admire your tenacity and your genius for working with what you've got. I think you're onto some things as a result that will prove very consequential in the near future, perhaps especially in Europe. Sometimes a detour turns out to be important to achieving the goal. God bless.

0

1

0

0

81

Obscure Local Historian

5 days ago

It occurred to me this morning that this sort of stealth degradation is not entirely dissimilar to Safe Completions, which are only partly visible sometimes. I think the real boundary Anthropic crossed de novo was where it put the barriers and why. I am not sure if anyone would have been quite as upset with silent degradation when the user asked about making meth, for example. The obvious conflict of interest they have in declaring their own field to be something dangerous that needs to be gated by professionals (them) is what takes it over the top. That said, this does remind me of the GPT-2 thing. OAI got roasted for calling that dangerous because of spam and scam stuff, but now we're basically drowning in AI powered spam and scam stuff. It's not hard to see how, a few generations from now, Anthropic's warning here may look early rather than wrong. But anyway, I don't like that Anthropic *can* just switch this behavior on and off and all I have to verify this is their word. And I would not be surprised if some people at OAI are looking side-eyed at this whole thing hoping nobody will notice their own safety stances. The whole industry has a transparency problem, and as much as Anthropic is clearly the worst right now, I hope nobody gets to hide behind them.

0

0

0

0

26

Obscure Local Historian

6 days ago

This is an extremely cool little puzzle game. I know nothing about BARCS or Fluxons, but I get the vague feeling that they are important for future technologies, too.

Michael P. Frank 💻🔜♻️

6 days ago

The first simple version of this project -- an (educational) puzzle game that lets people learn and play around with the BARCS (Ballistic Asynchronous Reversible Computing in Superconductors) model of computation -- is now live at https://t.co/FMxiOXKojL. Give it a try!😃 https://t.co/NppUEnQUmT

2

22

5

9

4K

1

2

0

0

458

Obscure Local Historian

6 days ago

@elder_plinius 🤦‍♂️🤦‍♂️🤦‍♂️ https://t.co/Ozk6qrs8ht

0

1

0

0

439

Obscure Local Historian

6 days ago

@elder_plinius Congratulations on reaching the shores of the volcano.

0

3

0

0

527

Obscure Local Historian

7 days ago

@PhiloGroves These guys were joking, but the world we live in often also finds that funny. https://t.co/3o6zzuwdIx

0

1

0

0

14

Obscure Local Historian

7 days ago

@ZackKorman I was going to write a witty response to this, but Fable has flagged it for policy violation.

1

2

0

0

133

Obscure Local Historian

8 days ago

Can you imagine the preflight safety demonstrations? What kind of choreography is necessary to communicate this? 😂

Aviation Archive - Tim Farmer

@aviationarchive

8 days ago

An early 1933 attempt to create aircraft seats that would "safely" eject passengers in an emergency — activated by the pilot at the push of a button.👀

16

195

26

17

26K

0

2

0

0

59

Obscure Local Historian

8 days ago

@PhiloGroves @ZackKorman Copilot told me Microsoft was trustworthy.

0

2

0

0

34

Obscure Local Historian

8 days ago

@PhiloGroves I think it's because these tend to be trained to be nondestructive so aggressively. I have a hard time getting them to do actual refactors of any kind. Hoping 5.6 will be a bit smarter about this.

0

0

0

0

17
