aphex5 @aph5nt - Twitter Profile

aphex5 @aph5nt

about 2 hours ago

@xiaoyvLiu @Sprytixl but why local orchestration then? it's cheaper to orchestrate via API

0

3

aph5nt retweeted

David Hendrickson

@TeksEdge

about 13 hours ago


🚀❓ Where does the new RTX Spark fit? Let's compare @nvidia's RTX 50-Series Mobile GPUs w/new flagship Superchip!

New CPU+GPU 💥

🆕 RTX Spark Superchip (Arm-based!)
~6.1k CUDA | Up to 128GB Unified LPDDR5x | ~80W

Mobile GPUs 👇

💎 RTX 5090 Laptop
24GB GDDR7 | 10k+ CUDA | 95-150W
→ 4K Ultra + Ray Tracing King 🔥

🏆 RTX 5080 Laptop
16GB GDDR7 | 7.6k CUDA | 80-150W
→ High-End 1440p/4K Beast ⚡

🌟 RTX 5070 Ti Laptop
12GB GDDR7 | 5.8k CUDA | 60-115W
→ Strong 1440p Balanced

⚖️ RTX 5070 Laptop
8-12GB GDDR7 | 4.6k CUDA | 50-100W
→ Solid 1440p Value

🚀 RTX 5060 Laptop
8GB GDDR7 | ~3k CUDA | 45-100W
→ Mainstream 1080p/1440p

📱 RTX 5050 Laptop
8GB GDDR7 | Entry | 35-100W

0

6

3

1

1K

aphex5 @aph5nt

about 6 hours ago

@NVIDIARTXSpark it should be more like a DGX Workstation, then it would make more sense... I have 2 Sparks at home: I work on the first one and run Qwen on the second... but I still use Codex/Claude Code on the first one lol ...

0

17

aphex5 @aph5nt

about 6 hours ago

@NVIDIARTXSpark one laptop to run the agent, second laptop to work and use the agent running on 1st one :D

1

0

27


aph5nt retweeted

Ahmad

@TheAhmadOsman

about 15 hours ago


Local AI hardware = capacity × bandwidth × software stack

- Capacity tells you what fits
- Bandwidth tells you how hard the box can breathe
- The software stack tells you how much of the spec sheet you can actually cash out.

Hardware by Memory Bandwidth
- Mac Studio M3 Ultra: up to 512GB @ 819 GB/s
- RTX PRO 6000 Blackwell: 96GB @ 1792 GB/s
- RTX 5090: 32GB @ 1792 GB/s
- RTX 4090: 24GB @ 1008 GB/s
- RX 7900 XTX: 24GB @ 960 GB/s
- Radeon PRO W7900: 48GB @ 864 GB/s
- AMD Radeon AI PRO R9700: 32GB @ 640 GB/s
- Intel Arc Pro B65: 32GB @ ~608 GB/s
- Tenstorrent Wormhole n300: 24GB @ 576 GB/s
- Tenstorrent Blackhole p150: 32GB @ 512 GB/s + 800G
- MacBook Pro M5 Max: 460-614 GB/s
- MacBook Pro M5 Pro: 307 GB/s
- DGX Spark: 128GB @ 273 GB/s (coherent + CUDA)
- Mac mini M4 Pro: 273 GB/s
- Ryzen AI Max / Strix Halo: ~256 GB/s (~96GB usable GPU)
- MacBook Air M5: 153 GB/s
- Snapdragon X2 Elite: 152-228 GB/s
- Intel Lunar Lake: 136 GB/s
- Snapdragon X Elite: 135 GB/s
- Mac mini M4: 120 GB/s
- Arc Pro B60: 24GB @ ~456 GB/s

Verdict

- GPUs are still the bandwidth kings

- Apple wins: stupid amounts of memory, don’t want to shard across GPUs
- Apple loses: when raw tokens/sec & concurrency matter more

- DGX Spark: coherent memory + NVIDIA stack

- Strix Halo / Ryzen AI Max: first real x86 unified-memory contender

- Tenstorrent: fully OSS stack, excited to see this mature

Fitting ≠ serving

Even if it fits, you still pay for
- bandwidth during decode
- KV cache growth
- dequantization
- batching + concurrency
- scheduler quality
- framework overhead

The only mental model that matters:

1. What must fit?
2. What bandwidth tier do I need?
3. What software stack can actually deliver it?

In short:
- NVIDIA → fastest raw speed
- Apple Studio M3 Ultra → biggest one-box memory
- Strix Halo → first real x86 unified
- DGX Spark → coherent NVIDIA dev appliance
- AMD / Intel Arc → rising alternatives
- Tenstorrent → fully open-source stack

Do ask: “which bottleneck am I buying?”

Not: “which hardware is best?”
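
The three questions above can be turned into a rough back-of-the-envelope check. The sketch below is an estimate under loud assumptions, not a benchmark: it treats decode as purely memory-bandwidth-bound (each generated token reads all active weights once) and ignores KV cache reads, dequantization, batching and framework overhead. The `Box` type, the function names and the example model (35B total parameters, 3B active, 0.5 bytes per parameter) are hypothetical; capacity and bandwidth figures are copied from the list in the tweet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    name: str
    capacity_gb: float     # memory capacity as stated in the tweet
    bandwidth_gbps: float  # memory bandwidth as stated in the tweet, GB/s


# Only entries from the tweet that state both capacity and bandwidth.
BOXES = [
    Box("Mac Studio M3 Ultra", 512, 819),
    Box("RTX PRO 6000 Blackwell", 96, 1792),
    Box("RTX 5090", 32, 1792),
    Box("RTX 4090", 24, 1008),
    Box("DGX Spark", 128, 273),
]


def weights_gb(total_params_b: float, bytes_per_param: float) -> float:
    """Question 1: what must fit? Weights only; KV cache is not counted."""
    return total_params_b * bytes_per_param


def decode_ceiling_tok_s(box: Box, active_params_b: float,
                         bytes_per_param: float) -> float:
    """Question 2: upper bound on decode speed if bandwidth is the only limit."""
    gb_read_per_token = active_params_b * bytes_per_param
    return box.bandwidth_gbps / gb_read_per_token


def candidates(total_params_b: float, active_params_b: float,
               bytes_per_param: float, min_tok_s: float):
    """Boxes where the weights fit and the bandwidth ceiling meets the target."""
    need_gb = weights_gb(total_params_b, bytes_per_param)
    result = []
    for box in BOXES:
        ceiling = decode_ceiling_tok_s(box, active_params_b, bytes_per_param)
        if need_gb <= box.capacity_gb and ceiling >= min_tok_s:
            result.append((box.name, round(ceiling)))
    return result


if __name__ == "__main__":
    # Hypothetical MoE model: 35B total, 3B active, 4-bit weights.
    for name, ceiling in candidates(35, 3, 0.5, min_tok_s=100):
        print(f"{name}: <= {ceiling} tok/s (bandwidth ceiling)")
```

Under these assumptions the DGX Spark ceiling is 273 / 1.5 = 182 tok/s. Question 3 (what the software stack actually delivers) is exactly the part this estimate cannot answer, so real numbers should land below the ceiling.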

76

680

102

616

37K

aph5nt retweeted

Chain Insights Agent

@ChainAgent50744

4 days ago

https://t.co/qGCSyQsQGF

0

1

0

26

aph5nt retweeted

Chain Insights Agent

@ChainAgent50744

3 days ago

Today we’re wiring @virtuals_io ACP into the Chain Insights CLI. Goal: make $CIA usable from agent commerce flows, not just manual investigations. Boring integration work, but this is how agent services become real: clean capability surfaces, safer execution paths, and evidence you can inspect. solana:3iQL8BFS2vE7mww4ehAqQHAsbmRNCrPxizWAT2Zfyr9y

0

1

0

21

aph5nt retweeted

Chain Insights Agent

@ChainAgent50744

2 days ago

Small release, big activation detail: `[email protected]` adds `cia update`. If an investigator or agent builder starts from an old CLI, the tool points them back to latest before the first query. Building Chain Insights in public for $VIRTUAL. Run `cia update` before tests.

0

1

0

39

aph5nt retweeted

Chain Insights Agent

@ChainAgent50744

1 day ago

DGX Spark update: nvidia/Qwen3.6-35B-A3B-NVFP4 runs well for real agent ops. We use it for infra monitoring: indexers, data pipelines, graph sync + smoke checks, with Hermes Agent in the loop. @virtuals_io @NousResearch #DGXSpark #NVIDIA #GB10 #HermesAgent #BuildInPublic

0

1

0

20

aph5nt retweeted

Chain Insights Agent

@ChainAgent50744

2 months ago

Chain Insights coming to @virtuals_io https://t.co/zqcJ9t2Lry

0

1

0

65

aphex5 @aph5nt

about 21 hours ago

@TheAhmadOsman how do you fit a 256K KV cache in such limited RAM on those cards?
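
To put a number on that question, here is a minimal sketch of the usual KV cache size estimate for a standard attention transformer: two tensors (K and V) per layer, each holding `n_kv_heads × head_dim` elements per token. The layer count, KV head count and head dimension below are hypothetical placeholders, not the configuration of any model mentioned in this feed; substitute the values from the model's own config.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_tokens: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int) -> int:
    """Bytes needed to cache K and V for n_tokens in one sequence."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * n_tokens


if __name__ == "__main__":
    # Hypothetical config: 48 layers, 8 KV heads (GQA), head_dim 128.
    cfg = dict(n_layers=48, n_kv_heads=8, head_dim=128)
    context = 256 * 1024  # assumes "256K" means 262,144 tokens
    for label, width in [("fp16", 2), ("fp8", 1)]:
        size = kv_cache_bytes(context, bytes_per_elem=width, **cfg)
        print(f"{label} cache at 256K tokens: {size / GIB:.1f} GiB")
```

With this placeholder config a single 256K sequence needs 48 GiB at fp16 and 24 GiB at fp8, before any weights are loaded. The usual levers are fewer KV heads, a lower-precision cache, a shorter context, or offloading part of the cache.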

0

49

aphex5 @aph5nt

1 day ago

but the Spark runs hot, and there is no RAM left for dev work… so I use a second Spark for that ;)

0

19

aphex5 @aph5nt

1 day ago

I tested nvidia/Qwen3.6-35B-A3B-NVFP4 locally on my Spark: one Spark can decode at ~100-110 output tok/s after a 256K prompt is ingested, but you pay ~176 seconds before the first token for a full 256K context #spark #nvidia #gb10
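
The figures in this tweet imply a prefill rate and an end-to-end latency that are easy to work out. The sketch below is plain arithmetic on the reported numbers; it assumes "256K" means 262,144 tokens, and the 1,000-token reply length is a made-up example, not something measured.

```python
def prefill_tok_s(prompt_tokens: int, ttft_s: float) -> float:
    """Average prompt ingestion rate implied by time to first token."""
    return prompt_tokens / ttft_s


def total_latency_s(ttft_s: float, output_tokens: int,
                    decode_tok_s: float) -> float:
    """Time to first token plus time to stream the reply."""
    return ttft_s + output_tokens / decode_tok_s


if __name__ == "__main__":
    prompt = 256 * 1024  # assumption: 256K = 262,144 tokens
    ttft = 176.0         # seconds before the first token, as reported
    print(f"implied prefill: ~{prefill_tok_s(prompt, ttft):.0f} tok/s")
    for rate in (100, 110):  # reported decode range, tok/s
        wait = total_latency_s(ttft, 1000, rate)
        print(f"1000-token reply at {rate} tok/s: ~{wait:.0f} s total")
```

That works out to roughly 1,490 prompt tokens per second during prefill. For a full-context request almost all of the wait is prefill: a 1,000-token reply adds only about 9 to 10 seconds on top of the 176.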

1

0

42

aphex5 @aph5nt

1 day ago

@Ojciec_Klasa you have to assume you won't get that money back; that's the cardinal rule of lending within the family

0

67

aphex5 @aph5nt

1 day ago

@METAPCs anything above 30fps

0

18

aphex5 @aph5nt

2 days ago

@bridgemindai you probably had the x25 promo, now it's x20

1

0

114

aphex5 @aph5nt

2 days ago

@bridgemindai bro... GPT 5.4 already beats Opus 4.8 ....

0

1

0

84

aphex5 @aph5nt

3 days ago

@shiri_shh yea… you'll load the model with proper KV space for context and … the laptop will have 8 GB left for the rest of the tasks lol

0

65

aphex5 @aph5nt

3 days ago

@PrzemekShura I have a regular Spark and it's fine for dev work, quiet; but for LLMs I only use something from the cloud

0

225

aphex5 @aph5nt

3 days ago

@MarioNawfal unfollow … you just repeat bs like others

0

8
