hackcore @hardmeta - Twitter Profile

2 months ago

DeepSeek just shipped V4 and casually dropped this in the pricing page: "Pro is throughput-limited. Wait for Huawei Ascend 950 supernodes in H2 — prices will drop a lot." A frontier lab publicly pinning its product roadmap to a Chinese chip release. That's the loudest ad Huawei never bought.

0

1

0

74

hardmeta retweeted

hackcore

@hardmeta

2 months ago

When Apple killed NVIDIA support 7 years ago, nobody expected the deadlock to be broken not by Apple, not by NVIDIA, but by an open-source team writing a driver from scratch. The significance of tinygrad's driver isn't "NVIDIA works on Mac again." It's the first proof that GPU compute can run outside the vendor's proprietary driver stack. Plug in via Thunderbolt, compute. Performance will catch up. The decoupling has already happened.

0

1

0

130

hackcore

@hardmeta

2 months ago

Found a way to keep OpenCode/OpenClaw humming on Claude Max Plan. Not telling how. Just know the cat-and-mouse game is very much still on. 🐈🐁

0

29

hardmeta retweeted

hackcore

@hardmeta

3 months ago

@GenAI_is_real Agent-inference co-design and cache-aware scheduling feel like exactly the right direction. The current stateless contract is the HTTP/1.1 of our era — and it took HTTP/2 / gRPC nearly a decade to break that inertia with shared connection state.

0

1

0

117
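One way to picture the cache-aware scheduling mentioned in the reply above is a router that prefers the worker already holding the longest matching token prefix, and breaks ties by load. This is only a minimal sketch under stated assumptions: `Worker`, `shared_prefix_len`, `best_cached_len`, and `route` are hypothetical names made up for illustration, and real inference routers track cache state in far more detail.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical inference worker with a record of prompts it has cached."""
    name: str
    cached_prefixes: list = field(default_factory=list)  # list of token tuples
    active_requests: int = 0


def shared_prefix_len(a, b):
    """Number of leading tokens that two token sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def best_cached_len(worker, tokens):
    """Longest prefix of `tokens` that this worker already has cached."""
    return max(
        (shared_prefix_len(p, tokens) for p in worker.cached_prefixes),
        default=0,
    )


def route(workers, tokens):
    """Pick the worker with the longest cached prefix; ties go to the least loaded."""
    chosen = max(
        workers,
        key=lambda w: (best_cached_len(w, tokens), -w.active_requests),
    )
    chosen.active_requests += 1
    chosen.cached_prefixes.append(tuple(tokens))  # assume the prompt is now cached
    return chosen
```

A stateless load balancer would ignore `cached_prefixes` entirely, which is the contrast the reply draws with the "stateless contract".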

hardmeta retweeted

hackcore

@hardmeta

3 months ago

@SemiAnalysis_ The real moat here isn't CUDA — it's that NIXL is becoming the protocol layer for KV transfer between heterogeneous accelerators. And it's not a coincidence that it's open source.

0

1

0

35

hackcore

@hardmeta

3 months ago

Anthropic leaked their Claude Code source code. I built it from source. I unlocked a hidden Tamagotchi pet system called BUDDY. I hatched a goose named Mochi. It wiggles happily when I pet it. Happy April Fools' — except none of this is a joke. #ClaudeCode

https://t.co/JV8BvvxrTX

0

1

0

83

hackcore

@hardmeta

3 months ago

NVIDIA's Defense and Offense Strategy

Defense (protect GPU revenue):
Dynamo + CMX → make every GPU produce more tokens, raise switching cost

Offense (invade new markets):
Vera CPU → steal x86 server market
Spectrum-X → steal Ethernet switching
STX/BlueField-4 → steal storage
Groq LPX → steal inference accelerators

Jensen's playbook: defend the GPU moat by making everything around it NVIDIA too. Every new "X" platform is a new revenue stream disguised as GPU optimization.

0

29

hackcore

@hardmeta

3 months ago

14/ For anyone building in this space, one number to remember:

In our benchmarks, 80% KV cache hit rate = 185% throughput improvement.

That's the value proposition of KV cache storage in one line. Not faster disks. Not more capacity. CONTEXT REUSE AT SCALE.

The GPU produces tokens. But storage decides how many of those tokens are NEW WORK vs REPEATED WORK.

Storage is no longer where data rests. It's where intelligence persists.

Data: GTC 2026 keynote (Mar 16), NVIDIA CMX product page, NVIDIA STX launch press release. KV cache hit rate / throughput data from author's own benchmarks. Jensen quotes are direct transcriptions from GTC 2026 keynote.

0

9
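To get a feel for the hit-rate number in 14/, here is a toy cost model, not the author's benchmark method. It assumes each request costs 1.0 on a cache miss and some smaller fraction on a hit. `throughput_gain` and `implied_hit_cost_ratio` are hypothetical helper names, and the model ignores batching, queueing, and transfer overheads.

```python
def throughput_gain(hit_rate: float, hit_cost_ratio: float) -> float:
    """Relative throughput improvement over a no-cache baseline.

    Assumption: a miss costs 1.0 unit of GPU work, a hit costs
    `hit_cost_ratio` units (0.0 means a hit is free).
    """
    if not 0.0 <= hit_rate <= 1.0 or not 0.0 <= hit_cost_ratio <= 1.0:
        raise ValueError("hit_rate and hit_cost_ratio must be in [0, 1]")
    avg_cost = (1.0 - hit_rate) + hit_rate * hit_cost_ratio
    if avg_cost == 0.0:
        raise ValueError("average cost is zero; gain is unbounded")
    return 1.0 / avg_cost - 1.0


def implied_hit_cost_ratio(hit_rate: float, gain: float) -> float:
    """Invert the model: what per-hit cost would produce the given gain?"""
    if not 0.0 < hit_rate <= 1.0:
        raise ValueError("hit_rate must be in (0, 1]")
    avg_cost = 1.0 / (1.0 + gain)
    return (avg_cost - (1.0 - hit_rate)) / hit_rate


# Under this toy model, the thread's figures (80% hit rate, 185% improvement)
# would correspond to a hit costing roughly 0.19 of a miss.
ratio = implied_hit_cost_ratio(0.80, 1.85)
print(round(ratio, 2))
```

If hits were completely free, the same 80% hit rate would give a 400% improvement in this model, so the quoted 185% is consistent with hits still carrying some cost.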

hackcore

@hardmeta

3 months ago

1/ At GTC 2026, Jensen showed the Vera Rubin system: 5 rack-scale computers side by side. GPU compute. CPU orchestration. Networking. And for the first time — STORAGE. Jensen: "The storage system is going to get pounded... which is the reason why we reinvented the storage system." Here's why this changes everything. 🧵

1

0

26

hackcore

@hardmeta

3 months ago

13/ The strategic implication: NVIDIA now defines what "AI storage" means, just as it defined what "AI compute" means with CUDA.

STX is the reference architecture.
CMX is the reference product.
NIXL is the reference data transfer library.
BlueField-4 is the reference DPU.

Storage vendors can build on this stack, but NVIDIA sets the standard. Sound familiar? It's CUDA for storage.

1

0

21
