Philippe Bich @pbicho96 - Twitter Profile

5 days ago

Small KVarN update 🚀

Thanks to everyone who tested it and shared feedback! The vLLM PR is coming soon.

At the same KV-cache budget:
• ~2× faster decode than TurboQuant
• Better accuracy
• MLA, hybrid & speculative decoding https://t.co/2ola30CsvR

Philippe Bich @pbicho96

15 days ago

@so_sthbryan Thanks for sharing our work :)

Philippe Bich @pbicho96

15 days ago

@MorelMatth66161 Thanks for sharing our work! https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

16 days ago

@LivingInHarmony Haha, thanks for sharing :) Here is our code: https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

16 days ago

@vintcessun @mobailabs Thanks for sharing! Here is the code: https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

16 days ago

@sztlink @no_stp_on_snek Thanks, happy everything works. I am still fixing many things in the repo, so posts like this help me a lot :) Star the repo if you want to support the project :) https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

16 days ago

@rascazzione @jordimash Thanks for talking about KVarN! Code to test it is here https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

16 days ago

@vintcessun Thanks for sharing our work!

Philippe Bich @pbicho96

16 days ago

@sztlink @no_stp_on_snek Author here. Can you share some details about your setup? It should not break; I tested it on math and it worked great 🥲

Philippe Bich @pbicho96

16 days ago

@andre_banandre Thanks! Code available here https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

16 days ago

@Marco_Ramilli Thanks for sharing our work!

Philippe Bich @pbicho96

17 days ago

@MorelMatth66161 Thanks for sharing. Here is the code for benchmarking :) https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

17 days ago

@ai_paper_jp Thanks for sharing! Here is the code: https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

18 days ago

@fukuronomoride Thanks for sharing our work! :)

Philippe Bich @pbicho96

18 days ago

@devaxsha Thanks for sharing. You can test using our code: https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

18 days ago

@Memoirs Thanks for sharing :) https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

19 days ago

@dr_cintas Here is our version from Huawei, which beats TurboQuant: https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

19 days ago

@ai_hakase_ Thanks! Code: https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

19 days ago

@ai_hakase_ Thanks for talking about KVarN :) We also have the code available here: https://t.co/sBv78XjTBs

Philippe Bich @pbicho96

20 days ago

@jaga_prasanna @PavloMolchanov tbh, this is a trend I struggle to understand as well. In terms of accuracy, INT4 was very strong (especially compared to MXFP4). The main reason we're now dealing with all these new microscaling formats is that hardware is being built around them. Not a huge fan, personally
