Balaji Varatharajan @BalajiAI - Twitter Profile

Pinned Tweet

8 months ago

Today I’m excited to introduce VibeEdit - https://t.co/j4OJrSwivF It lets you edit any content - image, video, audio, just using text prompts.

3

7

2

3

1K

BalajiAI retweeted

Tianqi Chen

@tqchenml

3 days ago

We taught a brand-new mini-series this year at @SCSatCMU on Modern GPU Programming for ML Systems, as part of the ML Systems course, touching on fun questions like what data layout swizzling is, how to use 3D TMA, and state-of-the-art Blackwell programming. We released a curated online book based on the materials: https://t.co/5ZJg2lySNO check it out

20

2K

231

2K

135K

BalajiAI retweeted

John Jumper

@JohnJumperSci

6 days ago

A bit of news: After nearly 9 years, I have decided to leave Google DeepMind and join Anthropic (after taking some time to recharge). I am incredibly grateful for my time at GDM. @demishassabis took a real chance letting me lead the AlphaFold team just six months after finishing my PhD, and the entire GDM team taught me so much about how to do great science. GDM is a special place, and I’ll still be excited to hear about what amazing things they discover next.

615

14K

979

2K

6M

BalajiAI retweeted

Alexander Whedon

@alex_whedon

10 days ago

Here is the technical report on SubQ 1.1 Small. https://t.co/bu8AEc4lsk
This is the second iteration on our Subquadratic Sparse Attention (SSA) model, and the first to be deployed with design partners in the coming weeks.
The results are compelling and verified by @AppenResearch.
- Near-perfect long-context retrieval up to 12M tokens on the needle-in-a-haystack test, with up to nearly 1,000x attention compute reduction.
- A balance of long-context optimization and general reasoning ability, with strong performance retained across knowledge, coding, and non-coding enterprise agent benchmarks.
- At 1M tokens, SubQ 1.1 Small requires 64.5x less compute than dense attention and runs 56x faster than FlashAttention-2.
These results highlight a significant scaling advantage thanks to the efficiency gains from the SSA architecture.
We included some details and learnings from the development process which may be helpful to the community.
Comment with questions, I’ll try to respond!

66

549

79

277

142K

BalajiAI retweeted

Sarvam

@SarvamAI

11 days ago

We're thrilled to announce that we have raised $234M in the first close of our $300M Series B at a $1.5B valuation. @HCLTech and @BessemerVP have joined us in this round, alongside continued support from @khoslaventures and @peakxvpartners For countries and companies, sovereign control on the AI stack is no longer an optionality. Sarvam will be the partner of choice for this aspiration. The capital allows us to accelerate our momentum towards this full stack of models, compute, and deployments. A huge thank you to our customers, partners, investors, and the Sarvam team for your trust and belief in what we are building. We’re just getting started. Read more: https://t.co/VmLtpnj8gx

657

10K

2K

517

1M

BalajiAI retweeted

Vasilios Mavroudis

@mavroudisv

11 days ago

Fable 5 is out now but before that, it had its model card updated: https://t.co/JVuxGZRFdN The doc's changelog is mostly accurate this time but eg missed mentioning the removal of this footnote: "This threshold maps to the High-stakes sabotage opportunities threat model in our current Responsible Scaling Policy."

1

3

2

1

17K

BalajiAI retweeted

Deb Raji @rajiinio

14 days ago

I do not want to do AI research that is reactive to what these companies are doing, or even what they're saying. The entire field keeps chasing after product releases. Some spend more time reading marketing copy than their colleagues' papers and I just... do not want to do that?

5

128

16

20

8K

BalajiAI retweeted

Sam Altman

@sama

23 days ago

one of the quotes i find most inspiring on a hard day: "Whatever your hand finds to do, do it with all your might, for in the realm of the dead, where you are going, there is neither working nor planning nor knowledge nor wisdom" Ecclesiastes 9:10

1K

19K

3K

6K

2M

Balaji Varatharajan @BalajiAI

24 days ago

@Karthickhps Congrats na!

0

58

Balaji Varatharajan @BalajiAI

26 days ago

I got my first customer for @vibeeditapp about a month ago 🤗

Priyam Raj

@priymrj

about 1 month ago

that first annual plan sale, when was it for you?

15

148

2

27

7K

1

2

0

80

BalajiAI retweeted

Sakana AI

@SakanaAILabs

30 days ago

Introducing DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation https://t.co/c9AvsRKybj
What if we didn’t have to hold an entire neural network in memory to train it? Standard neural net training optimizes all parameters jointly. As a result, the memory required during training grows linearly with the depth of the network.
In our #ICLR2026 paper, we propose DiffusionBlocks, a principled framework to train networks one block at a time, drastically reducing memory requirements while matching end-to-end performance.
With DiffusionBlocks, we split the network into blocks and train them one at a time, so you only need memory for a single block.
How? We explicitly assign each block a role: to move the representation a little closer to the target than the block before it did. That role turns out to be precisely what a diffusion model does, step by step. Each block only needs to optimize its own objective and can be trained independently.
We validated this across five different architectures:
• ViT
• DiT
• Masked diffusion
• Autoregressive transformers
• Recurrent-depth transformers
In each case, performance is competitive with end-to-end training while using a fraction of the memory.
This perspective also extends naturally to recurrent-depth (Looped) transformers, which apply the same network iteratively and normally require expensive backpropagation through time (BPTT). Viewed through DiffusionBlocks, we can replace those multiple iterations with a single forward pass during training.
Read our paper and code, to learn more.
Paper: https://t.co/CRj96VGYQn
GitHub: https://t.co/eNW0K9Xh8E 🐟

56

2K

367

2K

872K

Balaji Varatharajan @BalajiAI

29 days ago

@jeyasingh_alex UI looks good!

0

1

0

18

Balaji Varatharajan @BalajiAI

over 2 years ago

I tried @infography_ai built by @r_manoj11. Infography allows you to create visuals/infographics from a blog. I gave my own blog post as input, and it was able to understand it and create stunning visuals. https://t.co/B9QafaN8Gp

3

2

3

821

BalajiAI retweeted

Hermeus @hermeuscorp

about 1 month ago

Supersonic. Mach 1.21. Quarterhorse Mk 2.1 is now the world’s first privately developed, unmanned supersonic jet and the fastest unmanned aircraft flying today. This flight makes Hermeus the fastest company in aviation history to go from founding to supersonic flight - exactly 364 days after the maiden flight of our first aircraft. Now, we fly faster. A special thanks to @DIU_x, Director @OwenWest91, Maj. Gen. Joe "Solo" Kunkel, and Deputy Director Kyle Norman.

65

1K

188

252

741K

Balaji Varatharajan @BalajiAI

about 1 month ago

@chandhana @stripepress which copy is it - Bhagavad Gita?

1

0

57

BalajiAI retweeted

Slater Stich

@slaterstich

about 1 month ago

3

11

1

3

699

BalajiAI retweeted

Andrej Karpathy

@karpathy

about 1 month ago

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

8K

150K

11K

14K

28M

Balaji Varatharajan @BalajiAI

about 1 month ago

@Nevasini1 @fpv_labs 😲

0

70

Balaji Varatharajan @BalajiAI

about 1 month ago

@sab8a @fal sub

1

0

77

BalajiAI retweeted

Sabba Keynejad

@sab8a

about 1 month ago

I have a confession to make... I accidentally built the world's best subtitle API. Free @fal API credits for the first 350 users who RT + comment "sub".

90

145

68

50

38K

Balaji Varatharajan @BalajiAI

about 1 month ago

LOL!

Andrej Karpathy

@karpathy

about 1 month ago

@jino_rohit I was recording my nanochat video when I realized that “first boot up an 8XH100 from your favorite provider!” would instantly get everyone stuck on step 1 of the video

66

2K

31

147

216K

0

82

BalajiAI retweeted

JNS

@_devJNS

about 1 month ago

213

13K

1K

548K
