Why dual RTX 3090 is a perfect starter pack into local AI
- 48GB VRAM for ~$2000, just drop them into your pc
- Runs Qwen3.6-27B at Q6 with fat context
- CUDA + vLLM/llama.cpp work out of the box, zero headache
- Buy threadripper and scale your rig up to 8x3090
- Prices are structurally rising, supply demand shock
3090 is the people's card
If 2x DGX Sparks can hit 25-tps on Minimax, what is the theoretical limit for 4x Intel B70s?
I hit 93 tps, which is spot on with what ChatGPT suggested was the likely optimized outcome.
Intel is over 3x faster, and significantly cheaper.
I paid $5300 USD for the Intel build.