Abitha @abitha___ - Twitter Profile

10 days ago

Excited to share last summer's work at Google Research! Most hybrid models today are static: each token sees the same interleaved pattern of your favorite linear model and attention. Oryx instead varies the model used across the sequence through shared representations. 1/

kevinyli_'s tweet photo: https://t.co/TGf6iqkq4x

5

141

36

78

43K

Abitha @abitha___

11 days ago

More cool work from @gaurav_ghosal!

Gaurav Ghosal

@gaurav_ghosal

11 days ago

We are taking a big step towards scaling LLMs that can unlearn on demand. Cleanly deleting data from LLMs has proven impossible: training entangles every source in shared weights. NULLs (Natively Unlearnable LLMs) escapes this, keeping millions of sources individually deletable in a 1B-parameter model trained on web data. (1/8)

7

134

36

97

38K

0

6

0

2K

abitha___ retweeted

Lawrence Feng

@lawrencefeng17

about 1 month ago

1/ To retain post-training capabilities after further fine-tuning, mix that data into pretraining. The effect can be invisible until fine-tuning begins; early exposure may not help post-training performance, but it changes what persists. How a model learns a task matters.

6

88

24

57

28K

abitha___ retweeted

Amanda Bertsch @abertsch72

about 2 months ago

New paper! https://t.co/1ETmDt0ZB8 This tackles a puzzle we found during the training of Olmo 3: how could two models with nearly identical short-context performance (and trained on the same data!) behave completely differently after long context extension?

3

112

28

50

16K

Abitha @abitha___

2 months ago

@pratyushmaini Time to parallelise. Multiple subagents should write multiple SKILL.md files.

0

3

0

169

abitha___ retweeted

Yash Jangir

@off_jangir

3 months ago

🤖 What would LMArena for robotics look like? Introducing RobotArena ∞. We turn real videos into simulated environments and evaluate robot policies at scale using VLM scoring + human preferences. A scalable benchmark for robot generalists. 🔗 https://t.co/E74fjWvlG3 Details 🧵👇

5

125

27

84

22K

abitha___ retweeted

Christina Baek

@_christinabaek

3 months ago

Models are typically specialized to new domains by finetuning on small, high-quality datasets. We find that repeating the same dataset 10–50× starting from pretraining leads to substantially better downstream performance, in some cases outperforming larger models. 🧵

_christinabaek's tweet photo: https://t.co/stFslu9Mv7

19

626

81

532

96K

abitha___ retweeted

Aldo Gael Carranza @agcrnz

4 months ago

1/ We’ve released a report on our work on multilingual data curation @datologyai. tl;dr: We shift the performance–compute Pareto frontier for multilingual models, entirely by improving data quality and composition. arxiv: https://t.co/bLv8IySa8G blog: https://t.co/sczLujHj42

2

35

9

11

3K

abitha___ retweeted

Kaleigh Mentzer @KaleighMentzer

4 months ago

🌎Making your model multilingual doesn't have to sacrifice English performance—you just need better data. @agcrnz, @RicardoMonti9, and I have been working on curating the best possible multilingual data with the team @datologyai, and it works! Check out the results 👇

KaleighMentzer's tweet photo: https://t.co/GEgwlEpeZT

0

31

13

2

3K

abitha___ retweeted

Ricardo Monti @RicardoMonti9

4 months ago

1/ People often think better multilingual models must come at the cost of English performance. Not true. The constraint isn’t capacity, it’s data quality, and we can fix it. Today @datologyAI shares ÜberWeb: a year of multilingual curation lessons, scaled to 20T+ tokens.

RicardoMonti9's tweet photo: https://t.co/mVCWogFTYd

7

153

30

65

40K

abitha___ retweeted

Amanda Bertsch @abertsch72

8 months ago

Can LLMs accurately aggregate information over long, information-dense texts? Not yet… We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy at 128K on Oolong!

abertsch72's tweet photo: https://t.co/owTnNO3RF9

13

358

68

219

82K

Abitha @abitha___

8 months ago

@universeinanegg @yoavgo https://t.co/jkieF20vnJ ^ seems to fix some of this behavior

0

1

0

35

Abitha @abitha___

8 months ago

@universeinanegg @yoavgo Training objective mismatch in post-training: language models are unable to output ‘I don’t know’ (https://t.co/WauaH9HuiZ). Very vaguely, the model just picks the closest embedding. This explains the repetition and retrying until the token budget runs out.

1

0

57

Abitha @abitha___

8 months ago

Homanga is an incredible researcher and mentor. If you value thoughtful insights and exciting research problems, apply to work with him at JHU!

Homanga Bharadhwaj

@mangahomanga

8 months ago

I'll be joining the faculty @JohnsHopkins late next year as a tenure-track assistant professor in @JHUCompSci. Looking for PhD students to join me tackling fun problems in robot manipulation, learning from human data, understanding+predicting physical interactions, and beyond!

87

854

115

161

132K

0

5

0

761

Abitha @abitha___

9 months ago

Cool work from @gaurav_ghosal!

Aditi Raghunathan

@AdtRaghunathan

9 months ago

There’s been a lot of work on unlearning in LLMs, trying to erase memorization without hurting capabilities — but we haven’t seen much success. ❓What if unlearning is actually doomed from the start? 👇This thread explains why and how *memorization sinks* offer a new way forward.

6

176

42

106

42K

0

5

0

544

abitha___ retweeted

Yuda Song @yus167

10 months ago

LLMs lose diversity after RL post-training, and this hurts test-time scaling & creativity. Why does this collapse happen, and how can we fix it? Our new work introduces:
🔍 RL as Sampling (analysis)
🗺️ Outcome-based Exploration (intervention)
[1/n]

yus167's tweet photo: https://t.co/N1YbdjFWeS

9

464

86

394

40K

abitha___ retweeted

Rosinality @rosinality

10 months ago

Outcome-based Exploration for LLM Reasoning
Mitigating the reduction of diversity due to RL involves using UCB on answers. There have been many recent studies on this (https://t.co/ez9BWWS2lB), and it could be important, especially for creative tasks.

rosinality's tweet photo: https://t.co/uG1AybBzmj

6

199

24

172

13K

abitha___ retweeted

Gaurav Ghosal

@gaurav_ghosal

12 months ago

1/ So much of privacy research is designing post-hoc methods to make models memorization-free. It’s time we turn that around with architectural changes. Excited to add Memorization Sinks to the transformer architecture this #ICML2025 to isolate memorization during LLM training 🧵

gaurav_ghosal's tweet photo: https://t.co/SC0ZPJqvLm

1

62

24

34

10K

abitha___ retweeted

Yiding Jiang

@yidingjiang

12 months ago

@abitha___ will be presenting our work on training language models to predict further into the future beyond the next token and the benefits this objective brings. https://t.co/NkbkkuczfZ

0

18

5

3

2K

abitha___ retweeted

Yiding Jiang

@yidingjiang

12 months ago

I will talk about how to train agents with decision making capabilities that generalize to completely new environments: https://t.co/ThFOpCtT4k

2

19

4

8

4K
