Aiden @VibeCodeAiden - Twitter Profile

about 16 hours ago

@teortaxesTex Haha tbh the pareto isn't even saturated, get it to spend more tokens and it'll cross Mythos 5. OAI just doesn't wanna see that happen

0

1

0

205

Aiden @VibeCodeAiden

about 16 hours ago

@decoded_dev @jpschroeder No, it does not matter especially with modern engineering around quantization. And your entitlement of having contributed to pytorch has nothing to do with the accuracy of your arguments. You already mixed up a lot of concepts inaccurately in ur earlier comment lol.

0

6

0

69

Aiden @VibeCodeAiden

about 18 hours ago

@cremieuxrecueil But I literally saw a guy writing his own text and showing up positive for Pangram xD

0

1

0

96

Aiden @VibeCodeAiden

about 18 hours ago

@MParakhin God damn Fable on the coding and deliberation while 5.6 executes perfectly? THE Setup. I wish...

1

4

0

3K

Aiden @VibeCodeAiden

about 18 hours ago

@decoded_dev @jpschroeder This relates to why different jobs take different precision. Training at fp8 uses e5m2 for gradients to get a larger dynamic range, while the forward pass and inference use e4m3. And of course, for the optimizer state, you go for fp32, and that's one reason why training is so fucking expensive

0

1

0

21
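
A rough sketch of the e5m2 vs. e4m3 trade-off mentioned in the tweet above, assuming the common OCP-style definitions (e5m2 follows IEEE conventions and reserves the top exponent for inf/NaN; e4m3 reserves only a single NaN pattern at the top exponent). The helper name `fp8_stats` and its parameters are hypothetical, chosen only for illustration.

```python
def fp8_stats(exp_bits, man_bits, ieee_like):
    """Return max value, smallest normal, and epsilon for a small float format.

    ieee_like=True  : top exponent field is reserved for inf/NaN (assumed for e5m2).
    ieee_like=False : only the all-ones mantissa at the top exponent is NaN
                      (assumed for the e4m3 variant without infinities).
    """
    bias = 2 ** (exp_bits - 1) - 1
    top_exp_field = 2 ** exp_bits - 1
    if ieee_like:
        max_exp = (top_exp_field - 1) - bias
        max_man = 2 ** man_bits - 1
    else:
        max_exp = top_exp_field - bias
        max_man = 2 ** man_bits - 2
    return {
        "max_value": (1 + max_man / 2 ** man_bits) * 2.0 ** max_exp,
        "min_normal": 2.0 ** (1 - bias),
        "epsilon": 2.0 ** -man_bits,  # spacing of representable values just above 1.0
    }


e5m2 = fp8_stats(exp_bits=5, man_bits=2, ieee_like=True)
e4m3 = fp8_stats(exp_bits=4, man_bits=3, ieee_like=False)
print("e5m2:", e5m2)
print("e4m3:", e4m3)
```

Under these assumptions e5m2 reaches a much larger maximum value, while e4m3 has a finer step size near 1.0, which is the range-versus-precision trade-off the tweet is pointing at.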

Aiden @VibeCodeAiden

about 18 hours ago

@decoded_dev @jpschroeder It is not a big deal, and the only reason the AI will say it is a big deal is because you pushed it that way. It is largely agreed that fp8 does not lose a meaningful amount of accuracy compared to bf16 in inference. Yes, for training, you need higher precision.

3

15

0

863

Aiden @VibeCodeAiden

about 20 hours ago

@richmail20 @kaikuspa @justalexoki Oh yeah ur right

0

23

Aiden @VibeCodeAiden

about 23 hours ago

@Hesamation I don’t think Dario has studied enough about what authoritarianism and bureaucracy have done to a lot of things

0

1

0

164

Aiden @VibeCodeAiden

about 23 hours ago

@bdsqlsz Idk how many times I saw this bullshit THIS. IS. NOT. STATISTICALLY. RELEVANT.

0

112

Aiden @VibeCodeAiden

about 23 hours ago

@kaikuspa @justalexoki I mean if the 750tok/s rumor is true (i think this wont be general default but some /fast tier) it’s about 18x faster 😂

1

0

664

Aiden @VibeCodeAiden

about 23 hours ago

@vvvrrrrr @shaolinchen9 @METR_Evals 😂😂😂😂

0

46

Aiden @VibeCodeAiden

about 23 hours ago

Damn what could the future bring

METR @METR_Evals

1 day ago

If future models display much fewer undesirable propensities, we could become more concerned about catastrophic misalignment, as we’d be worried that models may have learnt to evade detection (for example, as a result of being trained not to produce misaligned reasoning).

4

140

6

23K

0

19

Aiden @VibeCodeAiden

2 days ago

@an_engineer_log E2B QAT’d to 4bit is sooo performant

0

69

Aiden @VibeCodeAiden

2 days ago

fr

Alana Fisher @Alanafisher91

2 days ago

@nonregemesse European police when they detect an unlicenced air conditioner

9

3K

37

28

131K

0

27

Aiden @VibeCodeAiden

2 days ago

@MarioNawfal How is the world going backwards

0

22

Aiden @VibeCodeAiden

2 days ago

atp all I'm asking for is my Chinese homies to step up the game and show us DSV4.1, Kimi K3, GLM-5.3 heh

0

51

Aiden @VibeCodeAiden

2 days ago

@1casie Qwythos? Lmfao write a paper about it and cry me a fucking river when you get desk rejected

1

2

0

55

Aiden @VibeCodeAiden

2 days ago

Actually i'm terrified about this

Francisco Hermida PRO

@FranciscoHPro

2 days ago

@elder_plinius

4

35

6

3

1K

0

2

0

26

Aiden @VibeCodeAiden

2 days ago

@dragonitematero @WatcherGuru Are you fucking retarded The "not preferred long term model" means that OpenAI would not prefer that the government has to check every release and validate it before rollout

1

14

0

273

Aiden @VibeCodeAiden

2 days ago

@yihui_indie Proceeds to get supply chain attacked

0

466
