CJ @maxweicj - Twitter Profile

CJ @maxweicj

2 days ago

@davideciffa beautiful

0

1

0

40

CJ @maxweicj

4 days ago

@Tono_Ken3 Prefill at 16tk/s?😅

0

1

0

432

CJ @maxweicj

4 days ago

@JosephJacks_ Not a bad idea for the “opening” AI world

0

CJ @maxweicj

4 days ago

An M720q (i3-9100T) with dual NVLinked 2080Ti 22G cards costs around $600 and might be the only server you’ll need for Qwen27B inference. It hits a maximum of 110tk/s at MTP3 with FP8 weights and 256K ctx (tqk8v4) https://t.co/nx7u93LjiF

0

25

CJ @maxweicj

6 days ago

@kimmonismus The only reason it excelled is that Qwen hasn't released its own 3.6 version of the 397B (sadly, it might never appear)

0

25

CJ @maxweicj

6 days ago

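@CuiMao Buy it now while it's cheap, then rent it out for 4 months. Once it's released, play it for a week, then rent it out again with the game included. You'll find that gaming is a money-making business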

0

115

CJ @maxweicj

7 days ago

Qwen-AgentWorld 35B-A3B is by far the most advanced 30B-level MoE model in terms of real agentic tasks. Even though it is derived from Qwen3.5 35B-A3B, it is almost on par with 3.6 27B and simply surpasses 3.6 35B-A3B in every aspect.

maxweicj's tweet photo: https://t.co/XxRxkIB68p

0

1

0

89

CJ @maxweicj

8 days ago

@victormustar WTF is 120B😂😂

0

15

CJ @maxweicj

8 days ago

@CuiMao Kind of cute

0

88

CJ @maxweicj

8 days ago

@giantcutie666 I don't think this is entirely the Qwen team's own work; the crowd of distillation folks on HF have done quite a lot of it

0

274

CJ @maxweicj

25 days ago

@sakurayukiai @davideciffa We have a sophisticated IPC workaround for the interconnect issue, and the latency between backends is really small (even on my 15-year-old Westmere Xeon), so now we can put any layer shard onto any GPU we want.😎

0

2

0

34

CJ @maxweicj

25 days ago

@fantopy_kai @davideciffa We have a very sophisticated IPC workaround for the interconnect issue; the latency between backends is really small even with a low-end CPU. Feel free to try it and send feedback😀

0

1

0

15

maxweicj retweeted

mrciffa

@davideciffa

25 days ago

Thanks to @maxweicj, the Lucebox speculative inference engine now supports using Luce DFlash and DDTree on mixed backends with AMD and NVIDIA cards linked together🏎️

davideciffa's tweet photo: https://t.co/6wwEaezsK6

2

25

3

6

2K

CJ @maxweicj

about 1 month ago

#vLLM_2080Ti_Definitive_Edition is ready to go. Enjoy single-request 100+ tok/s on 27B/31B dense models at 1/8 the cost of an RTX 5090. Qwen3.6 27B has full feature support (Gemma4 31B is an experimental path). Check https://t.co/Db2KVMwgMQ to boost your RTX 2080Ti now #llm #vllm

0

1

0

1

167

CJ @maxweicj

about 1 month ago

@featuringjared @rumgewieselt Not recommended, as the P100 does not support the crucial IDP4A instruction that other Pascal cards do. A P40/P10 will be a better choice if you do want to try sm61 cards

0

60

CJ @maxweicj

about 1 month ago

@elmoche_ @Alibaba_Qwen @vllm_project 735k under tq4 kv

0

37

CJ @maxweicj

about 1 month ago

PP 1841.7 tk/s | TG 101.3 tk/s | Context 735K. 2 x #2080Ti 22GB (NVLinked) run Qwen3.6-27B-AWQ through vLLM (TP=2, MTP K=3, KV=tq4nc) on a single request with extraordinary performance! Maximizing the AI value of a $500 legacy setup. https://t.co/7DzqWsxUZG #localLLM @Alibaba_Qwen @vllm_project

maxweicj's tweet photo: https://t.co/Od8IT6nzfD

1

24

1

7

4K
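
For a rough sense of what the PP and TG figures in the post above mean for a single request, here is a minimal back-of-the-envelope sketch. It assumes both throughputs stay constant regardless of context length, which is an assumption and unlikely to hold near the 735K limit. The function name and the example request sizes are hypothetical and not from the post.

```python
# Throughput figures quoted in the post above (tokens per second).
PREFILL_TPS = 1841.7  # "PP" in the post
DECODE_TPS = 101.3    # "TG" in the post


def estimate_latency_seconds(prompt_tokens, output_tokens,
                             prefill_tps=PREFILL_TPS, decode_tps=DECODE_TPS):
    """Return (prefill_s, decode_s, total_s) assuming constant throughput."""
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prefill_tps <= 0 or decode_tps <= 0:
        raise ValueError("throughput must be positive")
    prefill_s = prompt_tokens / prefill_tps  # time to process the prompt
    decode_s = output_tokens / decode_tps    # time to generate the reply
    return prefill_s, decode_s, prefill_s + decode_s


if __name__ == "__main__":
    # Hypothetical request: 32,000-token prompt, 1,000-token reply.
    prefill_s, decode_s, total_s = estimate_latency_seconds(32_000, 1_000)
    print(f"prefill ~{prefill_s:.1f}s, decode ~{decode_s:.1f}s, total ~{total_s:.1f}s")
```

Real numbers will differ with prompt length, KV cache format, and speculative acceptance rate, so treat the result as an order-of-magnitude estimate only.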

CJ @maxweicj

about 1 month ago

@Ratul_AI not even close. but Omni can do quite a lot more ofc

0

1K

CJ @maxweicj

about 2 months ago

@MoSalah The biggest regret is that Xabi went to Chelsea; no hope for next season under Slot’s lead

0

7

CJ @maxweicj

about 2 months ago

@rumgewieselt @UnslothAI Looks like a pretty usable decoding speed. What about prefill? MTP always slowed down my prefill to a great extent

0

58
