Local AI hardware = capacity × bandwidth × software stack
- Capacity tells you what fits
- Bandwidth tells you how hard the box can breathe
- The software stack tells you how much of the spec sheet you can actually cash out.
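A rough way to put a number on the capacity term: weight footprint is about parameter count times bits per weight. The sketch below is a back-of-envelope estimate, not a measurement. The helper names `weights_gb` and `fits` are hypothetical, the 70B / 4-bit figures are illustrative, and KV cache, activations, and runtime overhead are deliberately ignored, so the result is a lower bound on what you need.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


def fits(params_billion: float, bits_per_weight: float, capacity_gb: float) -> bool:
    """True if the weights alone fit; real deployments need extra headroom."""
    return weights_gb(params_billion, bits_per_weight) <= capacity_gb


if __name__ == "__main__":
    # Illustrative model: 70B parameters quantized to 4 bits per weight.
    print(weights_gb(70, 4))   # weights-only footprint in GB
    print(fits(70, 4, 32))     # a 32GB card
    print(fits(70, 4, 96))     # a 96GB card
```

Under these assumptions the weights alone come to about 35 GB, which rules out a 32GB card before any KV cache is allocated.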
Hardware by Memory Bandwidth
- RTX PRO 6000 Blackwell: 96GB @ 1792 GB/s
- RTX 5090: 32GB @ 1792 GB/s
- RTX 4090: 24GB @ 1008 GB/s
- RX 7900 XTX: 24GB @ 960 GB/s
- Radeon PRO W7900: 48GB @ 864 GB/s
- Mac Studio M3 Ultra: up to 512GB @ 819 GB/s
- AMD Radeon AI PRO R9700: 32GB @ 640 GB/s
- Intel Arc Pro B65: 32GB @ ~608 GB/s
- Tenstorrent Wormhole n300: 24GB @ 576 GB/s
- Tenstorrent Blackhole p150: 32GB @ 512 GB/s + 800G
- MacBook Pro M5 Max: 460-614 GB/s
- Arc Pro B60: 24GB @ ~456 GB/s
- MacBook Pro M5 Pro: 307 GB/s
- DGX Spark: 128GB @ 273 GB/s (coherent + CUDA)
- Mac mini M4 Pro: 273 GB/s
- Ryzen AI Max / Strix Halo: ~256 GB/s (~96GB usable GPU)
- MacBook Air M5: 153 GB/s
- Snapdragon X2 Elite: 152-228 GB/s
- Intel Lunar Lake: 136 GB/s
- Snapdragon X Elite: 135 GB/s
- Mac mini M4: 120 GB/s
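Why bandwidth matters so much: for single-stream decode, a common rule of thumb is that each generated token streams the active weights from memory once, so bandwidth divided by weight bytes gives a ceiling on tokens/sec. The sketch below applies that assumption to three figures from the list above. The function name `decode_ceiling_tok_s` is hypothetical and the 35 GB model size is illustrative. Real throughput lands below the ceiling once compute, KV cache reads, and framework overhead are counted.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on single-stream decode speed, assuming every token
    reads all weights from memory exactly once (bandwidth-bound)."""
    return bandwidth_gb_s / weights_gb


# Bandwidth figures (GB/s) taken from the list above.
BANDWIDTH_GB_S = {
    "RTX PRO 6000 Blackwell": 1792,
    "Mac Studio M3 Ultra": 819,
    "DGX Spark": 273,
}

if __name__ == "__main__":
    model_gb = 35.0  # illustrative: roughly a 70B model at 4 bits per weight
    for name, bw in BANDWIDTH_GB_S.items():
        ceiling = decode_ceiling_tok_s(bw, model_gb)
        print(f"{name}: <= {ceiling:.1f} tok/s")
```

With these inputs the ceilings work out to roughly 51, 23, and 8 tok/s, which is the gap the spec sheet's bandwidth column is really describing.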
Verdict
- GPUs are still the bandwidth kings
- Apple wins: when you need huge amounts of memory in one box and don’t want to shard across GPUs
- Apple loses: when raw tokens/sec and concurrency matter more
- DGX Spark: coherent memory + NVIDIA stack
- Strix Halo / Ryzen AI Max: first real x86 unified-memory contender
- Tenstorrent: fully OSS stack, excited to see this mature
Fitting ≠ serving
Even if it fits, you still pay for:
- bandwidth during decode
- KV cache growth
- dequantization
- batching + concurrency
- scheduler quality
- framework overhead
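KV cache growth is the easiest of these costs to estimate up front. For a standard transformer with grouped-query attention, the cache holds one key and one value vector per layer, per KV head, per token. The sketch below uses that formula. The model shape (80 layers, 8 KV heads, head dim 128, fp16 cache) is an illustrative assumption, not a specific model's config, and `kv_cache_bytes` is a hypothetical helper name.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, batch: int = 1, bytes_per_elem: int = 2) -> int:
    """KV cache size: 2 tensors (K and V) per layer, per KV head, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem


if __name__ == "__main__":
    GIB = 2 ** 30
    # Illustrative shape: 80 layers, 8 KV heads, head_dim 128, fp16 cache.
    one_stream = kv_cache_bytes(80, 8, 128, seq_len=32_768)
    eight_streams = kv_cache_bytes(80, 8, 128, seq_len=32_768, batch=8)
    print(f"1 stream  @ 32k context: {one_stream / GIB:.1f} GiB")
    print(f"8 streams @ 32k context: {eight_streams / GIB:.1f} GiB")
```

Under these assumptions one 32k-token stream costs 10 GiB on top of the weights and eight concurrent streams cost 80 GiB, which is why concurrency can break a setup that "fits".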
The only mental model that matters:
1. What must fit?
2. What bandwidth tier do I need?
3. What software stack can actually deliver it?
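The three questions can be sketched as a simple filter. This is a minimal illustration, not a buying tool: the `Box` type, the `shortlist` function, and the stack labels ("cuda", "apple") are all hypothetical, and the capacity and bandwidth figures are copied from the list earlier in this post.

```python
from typing import NamedTuple


class Box(NamedTuple):
    name: str
    capacity_gb: float
    bandwidth_gb_s: float
    stack: str  # hypothetical label for the software stack


BOXES = [
    Box("Mac Studio M3 Ultra", 512, 819, "apple"),
    Box("RTX PRO 6000 Blackwell", 96, 1792, "cuda"),
    Box("RTX 5090", 32, 1792, "cuda"),
    Box("DGX Spark", 128, 273, "cuda"),
]


def shortlist(boxes, must_fit_gb, min_bandwidth_gb_s, stacks):
    """Keep boxes that pass all three questions, in input order."""
    return [
        b.name for b in boxes
        if b.capacity_gb >= must_fit_gb            # 1. what must fit?
        and b.bandwidth_gb_s >= min_bandwidth_gb_s  # 2. which bandwidth tier?
        and b.stack in stacks                       # 3. which stack delivers it?
    ]


if __name__ == "__main__":
    # Need 80 GB resident and at least 500 GB/s, either stack acceptable.
    print(shortlist(BOXES, 80, 500, {"cuda", "apple"}))
    # Same requirements, but only a CUDA stack will do.
    print(shortlist(BOXES, 80, 500, {"cuda"}))
```

Tightening any one of the three constraints changes the shortlist, which is the point of asking them in order.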
In short:
- NVIDIA → fastest raw speed
- Apple Studio M3 Ultra → biggest one-box memory
- Strix Halo → first real x86 unified-memory contender
- DGX Spark → coherent NVIDIA dev appliance
- AMD / Intel Arc → rising alternatives
- Tenstorrent → fully open-source stack
Do ask: “which bottleneck am I buying?”
Not: “which hardware is best?”
REVEALED: Audi R8 Successor!
Meet the Audi Nuvolari - a brand-new, mid-engined 1,000hp hybrid supercar!
🔥 4-litre twin-turbo V8 and 3 motors!
🔥 1000hp & 730Nm!
🔥 0-60mph in 2.6 seconds!
🔥 Full carbon exterior!
🔥 £500,000!
🔥 Limited to 499 units!
Audi has the Temerario in its sights! But would you pick the Nuvolari over the Lambo? 🤔 Let us know in the comments!
@matteopelleg I asked a VC how they handle their inbox. It’s easy to spot most of them. I am concerned when AI can fingerprint you and that’s your agent…
Audi has officially unveiled its first hybrid supercar called the Audi Nuvolari.
• Starting price: $697,000
• 0-60mph: 2.5s
• 978 hp
• 499 units will be built
• Fastest production vehicle in Audi history
• Top speed: Over 217 mph (350 km/h)
• New Audi signature paint color: Titanium
• Three electric motors each produce 110 kW
• Two oil-cooled electric motors mounted on the front axle
• 7.3 kWh lithium-ion battery
• Fully electric driving capability in E-Hybrid mode
• Hybrid powertrain combines a 4-liter twin-turbo V8 with three axial-flux electric motors
• Formula 1-derived prepreg autoclave carbon manufacturing process
• Brake-by-wire braking system
• Braking system capable of absorbing up to 2.8 megawatts of energy
Deliveries begin in the first half of 2027.
@googlegemma Thank you Google Deepmind for constantly releasing open models! 🌟
We made Dynamic GGUFs so you can run Gemma 4 12B more efficiently: https://t.co/8cL321pVDh
Meet Gemma 4 12B!
A unified, encoder-free multimodal model designed to bring high-performance intelligence directly to your laptop, and released under an Apache 2.0 license.
Bridging the gap between edge efficiency and advanced reasoning. Here is what’s new with Gemma 4 12B: 👇