shikhar @encapsulated007 - Twitter Profile

Pinned Tweet

shikhar @encapsulated007

over 1 year ago

just not today, Mr. Hotz!!

the tiny corp

@__tinygrad__

over 1 year ago

Cool competition! `examples/torch_cuda_kernel.py` shows how to use tinygrad (with BEAM=2). The bitter lesson always wins in the end, any tricks people find should be added to our search.

5

327

15

77

101K

38

665

12

108

70K

shikhar @encapsulated007

3 days ago

@waterloo_intern looks like FA-3??

1

0

370

shikhar @encapsulated007

7 days ago

@Simon_Vt jax scaling book?

1

2

0

1

179

encapsulated007 retweeted

jonah

@jonahsamost

9 days ago

In this blog, I'm taking a look at how CuTe layouts simplify gemm epilogue retiling and tile-shape tuning across persistent/non-persistent kernels

1

16

2

11

1K

shikhar @encapsulated007

11 days ago

@_arohan_ about time to get in!!

1

4

0

474

shikhar @encapsulated007

12 days ago

it's GPU-Mode time boiss!!

Mark Saroufim

@marksaroufim

12 days ago

Launching a new kernel competition: Linear Algebra Kernels For The Age Of Research. First problem: batched QR decomposition on B200. Old math, modern hardware. Prize: Rare swag and hangout in SF

12

401

31

307

145K

0

4

0

248

shikhar @encapsulated007

13 days ago

yeah, modal just can't stop cooking...

Charles 🎉 Frye

@charles_irl

14 days ago

Last fall, we shared our deep dive on FA4 internals. But we didn't stop at grokking the kernel. Since then, we've been developing improvements for inference performance and upstreaming them. This blog post explains those contributions. https://t.co/xzDNHdq3Zw

2

196

27

146

18K

0

11

1

5

2K

encapsulated007 retweeted

Charles 🎉 Frye

@charles_irl

14 days ago

Last fall, we shared our deep dive on FA4 internals. But we didn't stop at grokking the kernel. Since then, we've been developing improvements for inference performance and upstreaming them. This blog post explains those contributions. https://t.co/xzDNHdq3Zw

2

196

27

146

18K

encapsulated007 retweeted

/ @gazorp5

16 days ago

.@_arohan_ vs @kellerjordan0 is our kendrick vs drake.

1

29

2

1

2K

encapsulated007 retweeted

Brandon Devlin @sandalwoodgrips

18 days ago

@BusinessInsider As a follower of Paul Atreides I also refuse to use AI

5

873

30

13

62K

shikhar @encapsulated007

16 days ago

is this online-MoE?

Awni Hannun

@awnihannun

16 days ago

It's very cool that Apple shipped a 20B-parameter model on-device. You can't put 20B parameters in RAM at any reasonable precision. To make it work they are using a pretty exotic architecture by today's standards. A small model predicts from the query (or prompt) which experts to load from NAND into RAM. The key distinction from a typical MoE is that you do this once per query and then generate all the tokens with the same experts (instead of switching the experts for every token).

77

3K

296

1K

227K

1

4

0

1

962

shikhar @encapsulated007

17 days ago

@norpadon iirc, mamba-3's trapezoidal-discretization solves exactly the need for short conv in recurrent models.

0

1

0

248

encapsulated007 retweeted

Sakana AI

@SakanaAILabs

17 days ago

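Building AI that builds AI: RSI Lab launches https://t.co/Tvc1zQyKkk Sakana AI is launching "RSI Lab" in Tokyo, a dedicated research group working on Recursive Self-Improvement (RSI). RSI is the mechanism by which AI builds AI itself. Over the past two years, we have built up a body of research including LLM-Squared, Darwin Gödel Machine, Shinka Evolve, ALE-Agent, Digital Red Queen, and The AI Scientist. Each is a step toward a single cycle in which models designed for agentic use produce AI that conducts research automatically, and that AI in turn produces even better models. The idea of self-improving AI is no longer ours alone. In 2026 RSI has become a major trend, and organizations championing this idea are being launched one after another around the world. Amid this, Sakana AI has, since its founding, pioneered its own methods for developing AI without relying on the sheer amount of compute. Our aim is to realize RSI without pouring in compute without limit. We believe this research is worth pursuing precisely because we are in Japan, where it is difficult to compete on compute scale with the world's top countries. To advance RSI research responsibly, we will draw on our experience and share findings with the community as we go. We will bring together researchers and engineers from Japan and abroad to form the Sakana AI RSI Lab. We look forward to collaborating with everyone who gets involved in this effort in various ways.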

30

645

107

236

80K

shikhar @encapsulated007

19 days ago

@snowclipsed @m_sirovatka new chapter, mbarrier visualized.

0

2

0

40

shikhar @encapsulated007

21 days ago

If you know Hindi, you know you can use cuda 🗑️(garbage) collector when you dealloc memory

rohan anil

@_arohan_

21 days ago

If you know Malayalam, you know you can use a cuda ☂️ when it rains.

12

56

1

9K

1

5

0

1

1K

encapsulated007 retweeted

wh

@nrehiew_

22 days ago

Super detailed tech report for MAI-Thinking-1, with a ton of info on all stages of the pipeline. I'm surprised so much of this info is released :) Super long thread on my notes:

1

160

18

114

21K

encapsulated007 retweeted

Aman

@arcaman07

23 days ago

Introducing Claude Cairn 🪨 It is a Claude Code skill for carrying your thinking across sessions. Say you explored a few different projects, ideas, or features in one session. Drop a checkpoint and you can restore or continue any of them in a later conversation with the relevant context already in place. It came out of my own exploration sprints: in one Claude Code session I'd chase a few unrelated projects, ideas, and features at once, then waste time later untangling one thread from the rest to continue it elsewhere.

2

49

3

39

5K

shikhar @encapsulated007

23 days ago

@henrylhtsang yeah, i just don't trust what any model says, before it can pull that info actually out of PTX ISA for me

0

4

0

223

encapsulated007 retweeted

John Schulman

@johnschulman2

27 days ago

Glad to see this -- renderers are a foundational component of the LLM stack. Renderers map between tokens and messages, which are invariant to tokenizer and formatting details. Most APIs, datasets, and RL environments are defined in terms of messages. Getting the details wrong leads to train-test mismatches, caching inefficiencies, and prompt injection vulnerabilities. We included a renderers module in Tinker Cookbook, but it makes sense as a standalone library.

15

672

56

380

79K

encapsulated007 retweeted

Mingxing Zhang

@james0zan

28 days ago

@KranenKyle Great Job. Looking forward to the future multi-GPU related driver optimizations!

1

11

2

0

974

encapsulated007 retweeted

Aman

@arcaman07

28 days ago

new blog post out! almost a month since I picked up robotics and one of the first things I dug into was inverse RL and behaviour cloning. it covers the bedrock of the field and the key concepts that shaped it, with some fun interactive widgets to play around with. Link below ⬇️

1

30

2

26

5K
