emi @gpuemi - Twitter Profile

8 days ago

@gpusteve @xXshaurizardXx don’t worry about it kitten

1

2

0

68

gpuemi retweeted

Max Zeff

@ZeffMax

9 days ago

New: A group of AI researchers from Google DeepMind, Apple, MSL, and OpenAI are launching a new startup called Trajectory to build a continual learning platform for companies.

They've raised a $15M seed round from Sarah Guo's Conviction, Jeff Dean, Fei-Fei Li, and others. https://t.co/deakiJj9ls

11

208

15

121

32K

gpuemi retweeted

10 days ago

Can't we do better? First short film from @fiftyyears.

25

156

33

64

58K

gpuemi retweeted

Jared Friedman

@snowmaker

10 days ago

@garrytan That's https://t.co/lTemakul6b if you want to try it.

3

31

3

27

6K

gpuemi retweeted

10 days ago

yes, garry tan uses wafer 🪩

3

31

4

3

4K

gpuemi retweeted

Garry Tan

@garrytan

10 days ago

This is a killer stack. I just started using Wafer to serve my qwen3.6-27b custom fine-tuned LLM and it's excellent

24

492

31

771

105K

10 days ago

@garrytan let's goooo <3

7

8

0

848

gpuemi retweeted

InfronAI

@InfronAI

11 days ago

🎉 Excited to welcome @wafer_ai to Infron as a provider.
Wafer does what used to take a team of world-class performance engineers automatically. Their AI agents optimize GPU inference across any hardware, finding the configurations that matter.

Infron is a unified AI gateway: 400+ models, 100+ providers, one API key, zero markup.

⚡Cheap intelligence is the most essential technology for the future. Wafer and Infron are making that real.

Wafer-optimized Qwen3.6-35B-A3B is now available on Infron:
🔗 https://t.co/dtsJ7neqUU

#AIInfrastructure #AIGateway #LLMOps #Infron #Wafer

0

12

1

9

4K

gpuemi retweeted

Dwarkesh Patel

@dwarkesh_sp

14 days ago

New blackboard lecture w @reinerpope

How do chips actually work – starting with basic logic gates, and working up to why GPUs, TPUs, FPGAs, and the human brain each look the way they do.

0:00:00 – Building a multiply-accumulate from logic gates
0:16:20 – Muxes and the cost of data movement
0:25:59 – How systolic arrays work
0:39:00 – Clock cycles and pipeline registers
0:51:40 – FPGAs vs ASICs
1:03:14 – Cache vs scratchpad
1:07:16 – Why CPU cores are much bigger than GPU cores
1:11:49 – Brains vs chips
1:15:22 – A GPU is just a bunch of tiny TPUs

Look up Dwarkesh Podcast on YouTube/Spotify/etc to watch. Enjoy!

93

6K

722

7K

919K

14 days ago

holy fuck god bless dwarkesh

4

15

1

14

4K

gpuemi retweeted

16 days ago

we recently optimized qwen3.5-397b-a17b to be the fastest deployment publicly hosted.

and the crazy thing: we did it by writing CUSTOM KERNELS for AMD MI355x. 🍿

see our post below outlining how we optimized kernels to achieve SOTA performance. https://t.co/tNbFvzqNzI

7

109

11

85

8K

gpuemi retweeted

TensorWave

@tensorwave

17 days ago

You have to read this one.

We just published a recap into how @wafer_ai pushed @AMD inference performance to a level that’s getting the entire ecosystem’s attention and the results are kind of wild.

What makes this story interesting isn’t just the performance itself. It’s how they achieved it: systems-level optimization, smart inference tuning, and a belief that AMD can compete at the very highest tier.

Proud this work was powered on TensorWave’s AMD-native cloud infrastructure and early #MI355X deployments.

https://t.co/8q1KPgHnYf

4

42

6

4

5K

gpuemi retweeted

18 days ago

will be in nyc 5/26 - 5/31. would love to chat w/ anyone interested in inference, hardware, or ml in general!

4

24

3

4

2K

gpuemi retweeted

18 days ago

ever since i was little i wanted to beat big inference at both speed and latency on big benchmark.

3

25

1

4

1K

gpuemi retweeted

20 days ago

ever since i was a kid i wanted to decrease the amount of energy used to produce a token

1

15

2

976