rami

Verified account

@vicilah

AI/ML engineer building agents, evals & strange interactive tools. Shipping in public.

mars

Joined March 2023

46 Following

13 Followers

98 Posts

2 days ago

Benchmarks aren’t even real anymore. The labs making these LLMs are overfitting the model weights to the benchmark datasets. The ONLY way I personally believe in testing a model is seeing its capabilities across real-world tasks like UI, or debugging a specific problem.

3 days ago

WHAT THE HELL is happening in AI? A 3B parameter model just put up coding benchmark scores in the same league as Claude Opus 4.5. 3 BILLION. The weights are on Hugging Face, anyone can test it. I genuinely don't know if this is a breakthrough or if the benchmarks are broken.

orcus108's tweet photo. https://t.co/8nVIbwjLUQ

217

3K

192

3K

662K

0

2

0

0

9

2 days ago

@kimmonismus not to mention they also limit their models on the chat interface, whereas I’m on 5.5 and talk to it without having to be conservative with my prompts.

0

0

0

0

19

2 days ago

@kimmonismus why do you think Claude is surpassing OpenAI in growth? I tried Claude Code for Fable when it came out, but even when I was using an older LLM like Opus 4.7 I would hit my limit way faster than on Codex.

1

0

0

0

185

4 days ago

@Youssofal_ I like where the world is going, curious to see when open weights beat frontier. Also happy birthday @Youssofal_

1

1

0

0

40

4 days ago

@theo yeah haha, I feel like so many people on the AI train hype anything that’s packaged as “groundbreaking”. Like Fusion-style model panels for example: cool benchmark, but the cost/latency doesn't quite make sense. That's the part a lot of non-technical people miss.

0

0

0

1

917

5 days ago

@beka_saparbek TaylorCV, an app that tailors your CV to the job posting.

0

0

0

0

14

5 days ago

@kimmonismus Ah great insight thanks chubby :)

0

0

0

0

125

5 days ago

Interesting idea, but I’m trying to understand the economics here. If Fusion is running multiple models in parallel + a judge/synthesizer, shouldn’t token cost scale pretty aggressively with panel size? @OpenRouter
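A rough way to put numbers on that question, as a minimal sketch only: it assumes every panel member reads the same prompt and writes one answer, and that a single judge then reads the prompt plus all of those answers. The function name, the parameters, and the prices are all hypothetical; nothing here reflects how Fusion is actually built or billed.

```python
def panel_cost(panel_size, prompt_tokens, answer_tokens,
               price_in, price_out, judge_price_in, judge_price_out):
    """Estimated cost of one panel + judge request (hypothetical cost model).

    Prices are per token, in whatever unit you like (e.g. micro-dollars).
    """
    # Each panel member reads the prompt and writes one answer.
    panel = panel_size * (prompt_tokens * price_in + answer_tokens * price_out)
    # The judge reads the prompt plus every panel answer, then writes the final answer.
    judge_input = prompt_tokens + panel_size * answer_tokens
    judge = judge_input * judge_price_in + answer_tokens * judge_price_out
    return panel + judge


# Illustrative numbers only: 1000-token prompt, 500-token answers,
# 1 unit per input token and 2 units per output token for every model.
single_call = 1000 * 1 + 500 * 2
for n in (1, 3, 5):
    print(n, panel_cost(n, 1000, 500, 1, 2, 1, 2), single_call)
```

Under these assumptions cost grows linearly with panel size, and each extra panel member also adds its answer to the judge's input, so a compound model only comes out cheaper if the panel models are priced well below the single model it is compared against.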

6 days ago

Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇

OpenRouter's tweet photo. https://t.co/OTUQAdTQjU

708

15K

2K

13K

6M

0

1

0

0

49

6 days ago

Very intriguing

7 days ago

AMD Ryzen AI Halo. The ultimate local AI developer platform.

Pre-order now: https://t.co/Ny0ZV8LOYi

⚡ Up to 128GB unified memory
⚡ Support for models up to 200B parameters
⚡ Windows & Linux support
⚡ Ready-to-run AI workflows out of the box

Build, prototype, and deploy locally without cloud constraints.

AMD's tweet photo.

225

3K

316

978

458K

0

0

0

0

29

6 days ago

Fable 5 rebuilt a 150-year-old mechanical computer from one prompt. It actually works, try it: https://t.co/KuMvc10G4l

0

1

0

0

28

7 days ago

@MatthewBerman frontier is temporary. The real pressure is open weights getting “good enough” faster than closed labs can widen the gap.

0

1

0

0

87

7 days ago

Frontier models are slowly turning into metered compute. Use them while you still can.

7 days ago

10 days left to escape the permanent underclass

theo's tweet photo. https://t.co/F2OUhW2eW7

103

2K

44

171

149K

0

0

0

0

70

3 months ago

@memefomo This is great, def using this!

0

1

0

0

51

10 months ago

@XScharo Something I don’t have sadly😔

0

1

0

0

38

10 months ago

@ihateoop Nice selfie, didn’t know you’re black

0

0

0

0

214

10 months ago

@pullupso Hey nigga

0

0

0

0

18

10 months ago

@MaddieGPT @BCRT @finnbags this was made before by someone else. this is a larp.

0

0

0

0

173

11 months ago

@earlTrades $ijustfoundsomethingcrazy

0

0

0

0

9

11 months ago

@ihateoop I used u as my EL just now pls don’t be mad

0

0

0

0

175
