Gantavya Bhatt @BhattGantavya - Twitter Profile

Pinned Tweet

Gantavya Bhatt @BhattGantavya

5 months ago

Not a shabby start to the day, 1/1 in #ICLR2026 🇧🇷⚽️ (Now just need a good coffee!) 🧵 soon.

1

35

1

2

3K

BhattGantavya retweeted

Bryan Catanzaro

@ctnzr

2 days ago

Really proud to see Nemotron 3 Ultra generalizing to agentic tasks that didn't exist when we trained it!

4

87

14

8

13K

BhattGantavya retweeted

Pratyush Maini

@pratyushmaini

3 days ago

In our first talk of the summer, we had @kothasuhas and @konwookim talk about their incredible work on scaling synthetic data with MegaDocs. See thread below for link to the video.

pratyushmaini's tweet photo: https://t.co/seouMSPBut

0

31

5

3

3K

Gantavya Bhatt @BhattGantavya

3 days ago

@lilianweng In our work (https://t.co/O5bllH62er), we formally show the common theme connecting all the scaling laws: submodularity and diminishing returns.

0

3

0

3

575

BhattGantavya retweeted

Matthew Leavitt

@leavittron

4 days ago

What if you could induce models to be more concise via pretraining data curation?

1

160

24

51

6M

BhattGantavya retweeted

Alisa Liu @alisawuffles

8 days ago

I'm joining OpenAI next week!🥹 The job search turned out to be really challenging but also super rewarding, so I wrote a small blog to share what I learned along the way and hopefully make the process a little less mysterious for the next person. https://t.co/6FigSBdenD

506

14K

1K

19K

5M

BhattGantavya retweeted

Charlie O'Neill

@oneill_c

11 days ago

1/ We fine-tune a lot of customer models, so we decided to systematically try and figure out some best practices for finetuning. SFT isn't sexy, but it's still important. We vary one SFT lever at a time across 2 model families, dense + MoE to 235B, on 4 real-world customer datasets.
What makes this clean is that each dataset is paired with an eval that took weeks to build with the customer, and the training outputs were generated to pass that eval. So the supervised target and the thing we measure downstream are the same criterion, which strips out the usual confounders

21

712

73

937

150K

BhattGantavya retweeted

Ari Morcos

@arimorcos

13 days ago

Most companies that say they want to own their model are going to fail at it.
I like the conversation about "owning vs renting" intelligence. It's the right frame, and it's about to become the defining decision for most companies, because AI will be core for pretty much everyone, even if they don't realize it yet. For those where the model is the business today, these questions hit close to home.
Renting was the right call for the last three years. Call an API, ship, don't think about infrastructure. But the ground is shifting.
The frontier is quietly closing up. Meta moved its newest flagship work to closed models under its Superintelligence Lab, and the strongest Chinese models, like Qwen's top tier, are now API-only. Open weights aren't going away, but counting on the best ones being there for you isn't a safe bet anymore. And the closed labs are compute-constrained enough that access itself is becoming something you reserve years in advance: OpenAI is already selling multi-year "Guaranteed Capacity" contracts.
So serious companies are deciding to own their models rather than rent them.
Here's where almost everyone gets it wrong. They treat it as a compute problem, or a talent problem: get the GPUs, hire the team, and you can build a great model. They line up the compute, put a date on the calendar for the model, and then hit the real blocker. Data.
Their proprietary data isn't ready for training. There isn't enough data to train on in the areas they really care about. That's the part almost nobody budgets for: data quality in a shape that can train the model your business needs.
Getting the data right also flips the economics that we have come to expect from the past three years. A small, domain-specific model built on the right data can go toe-to-toe with the best the frontier labs can build, and you keep your data, own your roadmap, control your costs, and build a moat that's actually yours.
The future worth betting on isn't three labs renting the same model to everyone. It's thousands of companies building their own domain-specific models, each better at its job than any general model could be.
The frontier used to look like the shining house on the hill. Lately, it looks more like a landlord, happy to keep you renting as long as you never price out what owning could really look like.

0

23

10

2

2K

Gantavya Bhatt @BhattGantavya

11 days ago

Very cool!!

DatologyAI @datologyai

11 days ago

1/ 🌞 Our Summer of Data Seminar brought together some of the sharpest minds in data curation last year. We are bringing it back in 2026! Let's recap the great talks from 2025!

datologyai's tweet photo: https://t.co/wddwTQMu8C

2

38

9

8

4K

0

1

0

115

BhattGantavya retweeted

DatologyAI @datologyai

11 days ago

1/ 🌞 Our Summer of Data Seminar brought together some of the sharpest minds in data curation last year. We are bringing it back in 2026! Let's recap the great talks from 2025!

2

38

9

8

4K

BhattGantavya retweeted

Tianyi Zhou @zhoutianyi

19 days ago

🤔 What if scaling LLMs isn't just about adding parameters but about finding the right program hidden inside the model you already have? If every layer block is viewed as a callable function, can we program an LLM for each input by choosing to keep a layer, skip it, or loop it?

2

6

1

4

1K

BhattGantavya retweeted

Andrei Bursuc @abursuc

24 days ago

ResNet and YOLO received the Longuet-Higgins Test of Time award. Congrats!
Three thoughts:
- very “difficult” job for the committee this year
- people are still using both quite a bit
- time for an additional generation to feel old — I already got used to that 😅 #cvpr2026

abursuc's tweet photo: https://t.co/JAnBZTnXfm

2

80

9

10

39K

BhattGantavya retweeted

Bryan Catanzaro

@ctnzr

25 days ago

NVIDIA Nemotron 3 Ultra is now live! Frontier accuracy, 5X greater speed, 30% lower cost. Deploy however you need - on-premise, on the cloud, or at the edge. Model is live on HuggingFace under the OpenMDW 1.1 license. https://t.co/IOfAwv3jB6

25

370

75

84

93K

Gantavya Bhatt @BhattGantavya

26 days ago

@shizhediao @thinkymachines Congrats, I've enjoyed reading your work, especially CLIMB. :)

1

3

0

223

Gantavya Bhatt @BhattGantavya

27 days ago

@BanghuaZ @Osmosis_AI This is very cool!

0

1

0

99

BhattGantavya retweeted

Bryan Catanzaro

@ctnzr

28 days ago

Nemotron 3 Ultra: Frontier smart. 5X faster. 30% cheaper. 💚💚💚

45

894

94

119

276K

BhattGantavya retweeted

Nishad Singhi

@nishadsinghi

about 1 month ago

Embodied agents are getting better at reasoning… but they still make surprisingly brittle mistakes.
A robot that can “bring me a banana” may completely fail at: “bring me a yellow curved fruit.”
Why? Because current agents usually commit to the first action they come up with.
We introduce VeGAS: Verifier-Guided Action Selection for embodied agents. 🤖
Instead of acting immediately, the agent:
• samples multiple candidate actions
• verifies them
• executes the best one
Accepted as a #CVPR 2026 Findings paper. 🧵

3

68

16

30

9K

BhattGantavya retweeted

Jeff Bilmes @jbilmes

about 1 month ago

What makes a dataset valuable? And when is "more data" not the same as "better data" in machine learning and AI? Read more to find out: https://t.co/Q0wPOtfm5d

0

6

3

230

Gantavya Bhatt @BhattGantavya

about 1 month ago

@pratyushmaini @oliveraochongli 🍽️

0

1

0

99

BhattGantavya retweeted

Pratyush Maini

@pratyushmaini

about 1 month ago

The evidence for specialized pretraining keeps growing. This really nice study shows how early exposure leads to robustness to forgetting. Enterprises serious about AI use cases should start thinking about training custom models from scratch, not just post-training or RL.

2

36

7

30

6K

Gantavya Bhatt @BhattGantavya

about 2 months ago

Happy to be recognized as a gold reviewer for ICML! I will keep up with my efforts to make the reviewing community as constructive as possible.

0

6

0

360
