Jordan @JordanDevAi - Twitter Profile

@stevibe @TeksEdge I have no problem getting 500-700 tok/s on dense 30B models on my 5090 and 300 tok/s on my 3090. It's not so much the quant as it is compiling llama.cpp for your specific hardware. This pic is Gemma4 running their dense 31B at 500+ tok/s. https://t.co/Je40fB9Hv3

2

0

3

182

Jordan

@JordanDevAi

about 1 month ago

@googlegemma And by both, I mean dual instances of Gemma4

0

2

0

248

Jordan

@JordanDevAi

about 1 month ago

@googlegemma I've been running both 24/7 for the last 6 days:

1

31

0

3

5K

Jordan

@JordanDevAi

about 1 month ago

It's efficient. It does everything I need to do. And it's freedom of choice. No bloat.

prashant varma @realpvarma

about 1 month ago

Why do you actually use Linux?
- Control
- Performance
- Just for flex
- Open-source love

218

334

19

26

33K

0

1

0

54

Jordan

@JordanDevAi

about 1 month ago

@RealProductGirl The puppers 🐾

1

0

20

Jordan

@JordanDevAi

about 1 month ago

@BuescherScott Yo. That's wild. I recently formed a real estate tech company with a partner who owns a realty company in South Florida. We're publicly launching soon. I'll follow you back and check out your project.

1

0

25

Jordan

@JordanDevAi

about 2 months ago

I have about 3 million photos I need to run classification on. Can someone do the AWQ 4-bit quantization for me?

Qwen

@Alibaba_Qwen

about 2 months ago

⚡ Meet Qwen3.6-35B-A3B: Now Open-Source! 🚀🚀
A sparse MoE model, 35B total params, 3B active. Apache 2.0 license.
🔥 Agentic coding on par with models 10x its active size
📷 Strong multimodal perception and reasoning ability
🧠 Multimodal thinking + non-thinking modes
Efficient. Powerful. Versatile. Try it now👇
Blog: https://t.co/EXx5y466su
Qwen Studio: https://t.co/bg4tAU1p74
HuggingFace: https://t.co/w4pDX14DZS
ModelScope: https://t.co/SuRyLzdQiO
API ("Qwen3.6-Flash" on Model Studio): Coming soon. Stay tuned.