Mark Ibrahim @ICLR 2026 @marksibrahim - Twitter Profile

about 1 month ago

Don’t miss @dohmatobelvis presenting our latest work, “Why less is more (sometimes): A theory of data curation” at #ICLR2026! Swing by our poster at the main conference to chat: 📅 Saturday, April 25 🕒 3:15pm–5:45pm 📍 Pavilion 3, P3-#1816

Mark Ibrahim @ICLR 2026

@marksibrahim

about 2 months ago

Come learn about computer-use agents with OpenApps, oral at #ICLR2026 in Rio 🇧🇷 on Saturday 2:35pm ET Room 204 or stop by our poster in the morning https://t.co/5gLvS3ztdb w/ @karen_ullrich https://t.co/ohMN7s2Vo4

Mark Ibrahim @ICLR 2026

@marksibrahim

6 months ago

Want to teach AI agents to use apps like humans? Get started with digital agents research using OpenApps, our new Python-based environment.

marksibrahim retweeted

Dr. Karen Ullrich @karen_ullrich

about 2 months ago

I am soon heading to Rio for #ICLR2026! It is going to be a packed week: including an oral presentation of OpenApps, our work on measuring how reliable UI agents really are when the apps they interact with change.

marksibrahim retweeted

Sharut Gupta @sharut_gupta

4 months ago

1/n Can LLMs learn to reason on hard benchmarks like AIME and GPQA purely through context, without SFT, RL, or any weight updates? Turns out… Yes! And it can have strong performance while being highly efficient Paper: https://t.co/mEoaIst6cX Blog: https://t.co/lZli7qY4Jz

marksibrahim retweeted

Jack Morris

@jxmnop

4 months ago

at long last, the final paper of my phd 🧮 Learning to Reason in 13 Parameters 🧮 we develop TinyLoRA, a new ft method. with TinyLoRA + RL, models learn well with dozens or hundreds of params example: we use only 13 parameters to train 7B Qwen model from 76 to 91% on GSM8K 🤯

Mark Ibrahim @ICLR 2026

@marksibrahim

5 months ago

@fujikanaeda Related finding showing a single character can break LLM evals: https://t.co/JQr0JDfVWP

Mark Ibrahim @ICLR 2026

@marksibrahim

8 months ago

One can manipulate LLM rankings to put any model in the lead—only by modifying the single character separating demonstration examples. Learn more in our new paper https://t.co/D8CzSpPxMU w/ Jingtong Su, Jianyu Zhang, @karen_ullrich , and Léon Bottou. 1/3 🧵

marksibrahim retweeted

Basile Terver

@BasileTerv987

5 months ago

My first PhD paper is out! 🎓 "What Drives Success in Physical Planning with Joint-Embedding Predictive World Models?" tl;dr: JEPA-WMs for robotics: learn dynamics on top of visual encoders, optimize actions towards goal 👇 w/ @JimmyTYYang1, Jean Ponce, @AdrienBardes, @ylecun

marksibrahim retweeted

Dr. Karen Ullrich @karen_ullrich

6 months ago

Release Day 🎉 Meet OpenApps — a pure-Python, open-source ecosystem for stress-testing UI agents at scale. Runs on a single CPU. Generates thousands of unique UI variations. And it reveals just how fragile today’s SOTA agents are. (Yes, even GPT-4 and Claude struggle.)

Mark Ibrahim @ICLR 2026

@marksibrahim

6 months ago

@browsercompany @fasthtml @openstreetmap in collaboration with the excellent research team at FAIR: @karen_ullrich Jingtong Su @randall_balestr @_amirbar Claudia Shi, Arjun Subramonian, Nikolaos Tsilivis, Ivan Evtimov, and @KempeLab

Mark Ibrahim @ICLR 2026

@marksibrahim

6 months ago

Want to teach AI agents to use apps like humans? Get started with digital agents research using OpenApps, our new Python-based environment.

Mark Ibrahim @ICLR 2026

@marksibrahim

6 months ago

Built on top of excellent frameworks thanks to @browsercompany @fasthtml @openstreetmap

marksibrahim retweeted

Dr. Karen Ullrich @karen_ullrich

6 months ago

Stop by the Meta booth tomorrow, Wednesday Dec 3rd at #NeurIPS in San Diego! 🤖📱 We demo our new research environment, OpenApps, for digital agents. Generate thousands of app versions to train and evaluate multimodal agents to use apps like humans do. Not attending? Stay tuned

marksibrahim retweeted

Randall Balestriero

@randall_balestr

7 months ago

With LeJEPA (https://t.co/RR9kcXEqSk) it has never been easier to train JEPAs! And this matters A LOT because JEPAs have numerous provable benefits over the good-old reconstruction based methods (https://t.co/bOg6uibdHP). NeurIPS spotlight: Wed, 11 a.m. PST, Hall C,D,E #2613

Mark Ibrahim @ICLR 2026

@marksibrahim

7 months ago

✅ 22k multi-scene questions ✅ New scenes not in existing web data ✅ Runs in ~15 min on one GPU Work led by Candace Ross in collaboration with Florian Bordes, @adinamwilliams, and @polkirichenko . Check it out on HuggingFace & ArXiv: https://t.co/LR1Lf97y3Z

Mark Ibrahim @ICLR 2026

@marksibrahim

7 months ago

We introduce Common-O, a new multimodal benchmark for hallucination when reasoning across scenes. We find leading multimodal LLMs can reliably identify objects, yet hallucinate when reasoning across scenes. 🧵1/3

Mark Ibrahim @ICLR 2026

@marksibrahim

7 months ago

Despite saturating single image perception, Common-O establishes a new challenging multimodal benchmark. The best performing model only achieves 35% on Common-O and on Common-O Complex, consisting of more complex scenes, the best model achieves only 1%. 🧵2/3

marksibrahim retweeted

Sarthak Mittal

@sarthmit

8 months ago

Meta on meta: thrilled to share our work on Meta-learning… at Meta! 🔥🧠 We make two major contributions: 1️⃣ Unified framework revealing insights into various amortizations 🧠 2️⃣ Greedy belief-state updates to handle long context-lengths 🚀

Mark Ibrahim @ICLR 2026

@marksibrahim

8 months ago

If you’re an NYU student, come learn about this wonderful opportunity to collaborate with us at FAIR https://t.co/P4hHZZXXGq Panel is tomorrow 10am at NYU Center for Data Science.
