Zima @MichaelZima - Twitter Profile

TheAhmadOsman's tweet photo. Local AI hardware = capacity × bandwidth × software stack

- Capacity tells you what fits
- Bandwidth tells you how hard the box can breathe
- The software stack tells you how much of the spec sheet you can actually cash out.

Hardware by Memory Bandwidth
- Mac Studio M3 Ultra: up to 512GB @ 819 GB/s
- RTX PRO 6000 Blackwell: 96GB @ 1792 GB/s
- RTX 5090: 32GB @ 1792 GB/s
- RTX 4090: 24GB @ 1008 GB/s
- RX 7900 XTX: 24GB @ 960 GB/s
- Radeon PRO W7900: 48GB @ 864 GB/s
- AMD Radeon AI PRO R9700: 32GB @ 640 GB/s
- Intel Arc Pro B65: 32GB @ ~608 GB/s
- Tenstorrent Wormhole n300: 24GB @ 576 GB/s
- Tenstorrent Blackhole p150: 32GB @ 512 GB/s + 800G
- MacBook Pro M5 Max: 460-614 GB/s
- MacBook Pro M5 Pro: 307 GB/s
- DGX Spark: 128GB @ 273 GB/s (coherent + CUDA)
- Mac mini M4 Pro: 273 GB/s
- Ryzen AI Max / Strix Halo: ~256 GB/s (~96GB usable GPU)
- MacBook Air M5: 153 GB/s
- Snapdragon X2 Elite: 152-228 GB/s
- Intel Lunar Lake: 136 GB/s
- Snapdragon X Elite: 135 GB/s
- Mac mini M4: 120 GB/s
- Arc Pro B60: 24GB @ ~456 GB/s
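
One rough way to read the list above: during single-stream decode, a dense model's weights are read about once per generated token, so bandwidth divided by model size gives a ceiling on tokens/sec. The sketch below assumes that simplification. The 8B-parameter, 4-bit model is a hypothetical example, the function name is illustrative, and real throughput lands lower once KV cache reads, dequantization, and framework overhead are counted.

```python
# Bandwidth figures (GB/s) copied from the list above.
BANDWIDTH_GB_S = {
    "RTX 5090": 1792,
    "Mac Studio M3 Ultra": 819,
    "DGX Spark": 273,
}


def decode_ceiling_tokens_per_s(bandwidth_gb_s, params_billion, bytes_per_param):
    """Upper bound on decode speed, assuming each token reads all weights once."""
    # 1e9 params * bytes per param = size in GB (decimal), matching GB/s units.
    model_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / model_gb


if __name__ == "__main__":
    # Hypothetical model: 8B parameters at 4-bit (0.5 bytes per parameter).
    for name, bw in BANDWIDTH_GB_S.items():
        ceiling = decode_ceiling_tokens_per_s(bw, params_billion=8, bytes_per_param=0.5)
        print(f"{name}: <= {ceiling:.0f} tok/s")
```

The ceiling scales linearly with bandwidth, which is why two boxes with the same capacity can feel very different at decode time.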

Verdict

- GPUs are still the bandwidth kings

- Apple wins: when you need stupid amounts of memory and don’t want to shard across GPUs
- Apple loses: when raw tokens/sec & concurrency matter more

- DGX Spark: coherent memory + NVIDIA stack

- Strix Halo / Ryzen AI Max: first real x86 unified-memory contender

- Tenstorrent: fully OSS stack, excited to see this mature

Fitting ≠ serving

Even if it fits, you still pay for
- bandwidth during decode
- KV cache growth
- dequantization
- batching + concurrency
- scheduler quality
- framework overhead
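
To make "KV cache growth" concrete, here is a minimal sketch of the usual size estimate for a transformer: keys plus values, per layer, per KV head, per token. The model shape used in the example (32 layers, 8 KV heads, head dimension 128, fp16) is a hypothetical configuration, not one taken from the post, and the function name is illustrative.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch=1, bytes_per_elem=2):
    """Estimate KV cache size: 2 tensors (K and V) per layer, per token."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical model shape; the cache grows linearly with context and batch.
    for seq_len in (2048, 8192, 32768):
        size = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=seq_len)
        print(f"context {seq_len}: {size / 2**30:.2f} GiB per sequence")
```

Because the cache scales with both context length and the number of concurrent sequences, a model that fits at batch 1 can stop fitting once you serve several users.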

The only mental model that matters:

1. What must fit?
2. What bandwidth tier do I need?
3. What software stack can actually deliver it?
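
The first two questions can be sketched as a simple filter over the list above. The capacity and bandwidth numbers are copied from the post; the Box type, the shortlist function, and the thresholds are hypothetical names and values for illustration. The third question, the software stack, is not captured here and still has to be judged by hand.

```python
from typing import NamedTuple


class Box(NamedTuple):
    name: str
    capacity_gb: float      # memory capacity from the list (max config for the Mac Studio)
    bandwidth_gb_s: float   # memory bandwidth from the list


BOXES = [
    Box("Mac Studio M3 Ultra", 512, 819),
    Box("RTX PRO 6000 Blackwell", 96, 1792),
    Box("RTX 5090", 32, 1792),
    Box("RTX 4090", 24, 1008),
    Box("DGX Spark", 128, 273),
]


def shortlist(boxes, must_fit_gb, min_bandwidth_gb_s):
    """Keep boxes that satisfy questions 1 and 2, fastest bandwidth first."""
    fits = [
        b for b in boxes
        if b.capacity_gb >= must_fit_gb and b.bandwidth_gb_s >= min_bandwidth_gb_s
    ]
    return sorted(fits, key=lambda b: b.bandwidth_gb_s, reverse=True)


if __name__ == "__main__":
    # Hypothetical requirement: 80 GB must fit, at least 500 GB/s.
    for box in shortlist(BOXES, must_fit_gb=80, min_bandwidth_gb_s=500):
        print(box.name)
```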

In short:
- NVIDIA → fastest raw speed
- Apple Studio M3 Ultra → biggest one-box memory
- Strix Halo → first real x86 unified-memory contender
- DGX Spark → coherent NVIDIA dev appliance
- AMD / Intel Arc → rising alternatives
- Tenstorrent → fully open-source stack

Do ask: “which bottleneck am I buying?”

Not: “which hardware is best?”

MichaelZima retweeted

Brian Roemmele

@BrianRoemmele

about 11 hours ago

Steve Jobs's bedroom when he was still living with his parents, 1976. Apple 1 boxes stored on the right.

Zima

@MichaelZima

about 8 hours ago

@yannschaub I need to come up from Alicante.

MichaelZima retweeted

carwow

@carwowuk

about 21 hours ago

REVEALED: Audi R8 Successor! Meet the Audi Nuvolari - a brand-new, mid-engined 1,000hp hybrid supercar! 🔥 4-litre twin-turbo V8 and 3 motors! 🔥 1000hp & 730Nm! 🔥 0-60mph in 2.6 seconds! 🔥 Full carbon exterior! 🔥 £500,000! 🔥 Limited to 499 units! Audi has the Temerario in its sights! But would you pick the Nuvolari over the Lambo? 🤔 Let us know in the comments!

Zima

@MichaelZima

about 17 hours ago

@matteopelleg I asked a VC how they handle their inbox. It’s easy to spot most of them. I am concerned when AI can fingerprint you and that’s your agent…

MichaelZima retweeted

Sawyer Merritt

@SawyerMerritt

about 20 hours ago

SawyerMerritt's tweet photo. Audi has officially unveiled its first hybrid supercar called the Audi Nuvolari.

• Starting price: $697,000
• 0-60mph: 2.5s
• 978 hp
• 499 units will be built
• Fastest production vehicle in Audi history
• Top speed: Over 217 mph (350 km/h)
• New Audi signature paint color: Titanium
• Three electric motors each produce 110 kW
• Two oil-cooled electric motors mounted on the front axle
• 7.3 kWh lithium-ion battery
• Fully electric driving capability in E-Hybrid mode
• Hybrid powertrain combines a 4-liter twin-turbo V8 with three axial-flux electric motors
• Formula 1-derived prepreg autoclave carbon manufacturing process
• Brake-by-wire braking system
• Braking system capable of absorbing up to 2.8 megawatts of energy

Deliveries begin in the first half of 2027. More photos in the thread below:

Zima

@MichaelZima

about 18 hours ago

@lmstudio 🔥🔥🔥

Zima

@MichaelZima

1 day ago

@TimurNegru €500-1000+ per m² to restore it to its former glory.

MichaelZima retweeted

Unsloth AI

@UnslothAI

2 days ago

@googlegemma Thank you Google DeepMind for constantly releasing open models! 🌟 We made Dynamic GGUFs so you can run Gemma 4 12B more efficiently: https://t.co/8cL321pVDh

MichaelZima retweeted

Google Gemma

@googlegemma

2 days ago

Meet Gemma 4 12B! A unified, encoder-free multimodal model designed to bring high-performance intelligence directly to your laptop, and released under an Apache 2.0 license. Bridging the gap between edge efficiency and advanced reasoning. Here is what’s new with Gemma 4 12B: 👇