dnixty @dnixty - Twitter Profile

I did some more tests with Kimi K2.6 locally on two Mac Studios via @exolabs, measuring both parallel execution and optimal context size. Parallel requests scaled up to 8 concurrent calls, reaching ~62 tok/s aggregate. For context length, ~85k prompt tokens was still fast, ~87k slowed down sharply, and 128k technically ran but took ~7.5 minutes.
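One way to read the parallel figure, as a minimal sketch: it assumes "aggregate" means total generated tokens across all concurrent requests divided by the wall-clock time of the whole batch. The function names and the 500-token, 64.5-second sample run are hypothetical; only the 8 concurrent calls, ~62 tok/s aggregate, and the 21 tok/s single-request figure from the earlier post come from the posts themselves.

```python
def aggregate_tok_per_s(tokens_per_request, wall_seconds):
    """Total tokens generated by all concurrent requests / wall-clock time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return sum(tokens_per_request) / wall_seconds


def per_stream_tok_per_s(aggregate, concurrency):
    """Average rate each individual request sees at a given concurrency."""
    if concurrency <= 0:
        raise ValueError("concurrency must be positive")
    return aggregate / concurrency


# Hypothetical batch: 8 requests of 500 generated tokens each, done in 64.5 s.
sample_aggregate = aggregate_tok_per_s([500] * 8, 64.5)

# Figures quoted in the posts: 8 concurrent calls at ~62 tok/s aggregate,
# versus 21 tok/s for a single request (measured earlier, possibly under
# a different setup, so the ratio is only indicative).
per_stream = per_stream_tok_per_s(62.0, 8)
gain_over_single = 62.0 / 21.0

print(f"sample aggregate: {sample_aggregate:.1f} tok/s")
print(f"per stream at 8x: {per_stream:.2f} tok/s")
print(f"aggregate vs single request: {gain_over_single:.2f}x")
```

Under these assumptions each of the 8 streams runs well below the single-request rate, while the cluster as a whole produces roughly three times as many tokens per second.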

dnixty

@dnixty

about 2 months ago

Managed to run Kimi-K2.6 on two Mac Studios via @exolabs. Got 21 tokens/s. Crazy to think it outperforms gpt-5.4 and opus-4.6 on some benchmarks, and you can run it locally. https://t.co/0RTe1noYVN

dnixty

@dnixty

about 2 months ago

@0xSero

dnixty

@dnixty

2 months ago

@0xSero Can I ask whether those missions have anything to do with ralph loops, or is it something different?

dnixty

@dnixty

2 months ago

@NousResearch Tasteful

dnixty

@dnixty

2 months ago

I think the key to success is to blend the ancient with the bleeding edge of technology: a classical education (Latin, Greek, Aristotle) with AI papers, hardware, and robotics.

dnixty

@dnixty

3 months ago

@AlicanKiraz0 @NVIDIAAIDev @Apple @exolabs @Kimi_Moonshot You're running exolabs from master? On v1.0.68 I didn't manage to get the Mac Studios to appear in the same cluster as the DGX.

dnixty

@dnixty

3 months ago

Here's the small AI lab I have at the moment. On one Mac Studio I do the dev work remotely from my phone or MacBook Pro. The other one is for extra compute (Kimi, MiniMax). The DGX Spark is for experiments. Everything is connected with 10 Gigabit Ethernet.