Jefferson Enrique Hernandez Cevallos @jefehern - Twitter Profile

jefehern retweeted

about 20 hours ago

Scaling laws describe how loss changes with scale. Do neurons inside models change predictably too? We study vision and language models up to 30B params and find systematic scaling in neuron universality, specialization, and selectivity. Paper+code: https://t.co/1f1mQGnnZ4 1/n

8

266

60

196

161K

jefehern retweeted

Rishi Desai

@rishi_desai2

about 22 hours ago

Can coding agents stay coherent over a 1 billion token budget? Can they build Slack from scratch? Rewrite a JAX codebase in PyTorch? Build a C compiler in Rust? Enter SWE-Marathon: a benchmark for autonomous long-horizon software work. https://t.co/K97VHyLvIX

37

381

42

134

144K

jefehern retweeted

Sanjit Dandapanthula

@sanjitdp

about 21 hours ago

super excited to share our latest work! are we really tilting? 🤨 tldr: reward guidance for flows and diffusions is supposed to sample from the reward-tilted distribution. we show it doesn’t 😰 and how to (mostly) fix it ✨ plus lots of fun images!! 🖼️ collaboration with the awesome @nmboffi website: https://t.co/nvOaAiGYq1 paper: https://t.co/EtkeyiuX7s code: https://t.co/V3Bi4IVPbf

2

79

14

65

10K

jefehern retweeted

Kent Fujiwara @kentfuji

2 days ago

So there was interesting research like this: when you prepare two non-overlapping subsets of a dataset and train a separate diffusion model on each, increasing the amount of data makes the same noise produce similar images. https://t.co/pE8AL3YzJ4 https://t.co/49UkvOR4OB

2

183

20

144

15K

jefehern retweeted

Jihan Yang

@jihanyang13

11 days ago

Camera pose matters for video understanding! Today's MLLMs excel at recognizing activities, but still struggle with the underlying space and ego/object dynamics in video. We trace this gap to a missing piece: camera pose. Introducing Cambrian-P: a multimodal LLM natively grounded in camera pose. (1/n)

2

278

47

175

53K

jefehern retweeted

Niels Rogge @NielsRogge

12 days ago

One of the hottest terms in AI right now is "On-policy distillation". It is a post-training technique in which a student model, typically an LLM, samples from its current policy and receives a teacher signal for on-policy states. It combines the dense supervision of distillation with the locality of online RL. Now a method on PapersWithCode! Find all 183 papers that cite it, and more here: https://t.co/NIsUjyU3UP

21

1K

128

1K

84K

jefehern retweeted

Matteo

@MozarellaPesto

13 days ago

I trained an autoencoder that reconstructs images with zero reconstruction loss. No MSE. No image space supervision. The only signal: "According to you, does your output look like your input through your own eyes?" It works. Blog link, demo and summary 👇

24

615

47

636

68K

jefehern retweeted

Oier Mees @oier_mees

23 days ago

𝗧𝗵𝗲 𝗿𝗲𝗰𝗼𝗿𝗱𝗶𝗻𝗴 𝗼𝗳 𝗟𝘂𝗰𝗮𝘀 𝗕𝗲𝘆𝗲𝗿'𝘀 (@giffmana) 𝗹𝗲𝗰𝘁𝘂𝗿𝗲 𝗮𝘁 @ETH 𝗶𝘀 𝗻𝗼𝘄 𝗹𝗶𝘃𝗲 𝗼𝗻 𝗬𝗼𝘂𝗧𝘂𝗯𝗲 𝗳𝗼𝗿 𝗲𝘃𝗲𝗿𝘆𝗼𝗻𝗲 𝘄𝗵𝗼 𝗰𝗼𝘂𝗹𝗱𝗻'𝘁 𝗷𝗼𝗶𝗻 𝘂𝘀 𝗶𝗻 𝗽𝗲𝗿𝘀𝗼𝗻! This past Monday, we had the pleasure of hosting Lucas (@Meta @AIatMeta Superintelligence Labs) for our "Robot Learning: From Fundamentals to Foundation Models" course. He joined us to talk about: "𝗩𝗶𝘀𝗶𝗼𝗻 𝗶𝗻 𝘁𝗵𝗲 𝗔𝗴𝗲 𝗼𝗳 𝗟𝗟𝗠𝘀". Drawing from a remarkable track record in computer vision and multimodal AI (𝗩𝗶𝗧, 𝗦𝗶𝗴𝗟𝗜𝗣, 𝗣𝗮𝗹𝗶𝗚𝗲𝗺𝗺𝗮) 🧠, Lucas delivered a masterclass on the frontier of multimodal foundation model training: from pre-training to post-training, where the field stands today, and what comes next 🚀 📽️ YouTube Recording: https://t.co/wNz1NwYkvb 📚 Course Website: https://t.co/DoQUYy3MjB

5

674

71

701

54K

jefehern retweeted

Dimitris Papailiopoulos

@DimitrisPapail

19 days ago

https://t.co/n10GwfKYuY

55

981

128

1K

850K

jefehern retweeted

Souradip Chakraborty

@SOURADIPCHAKR18

23 days ago

🚨Typical RL algorithms and on-policy distillation methods are blind samplers: they use privileged info to score rollouts, but not to *find* them. We ask: can we use privileged info to *actively sample* the rollouts RL wishes it can stumble upon with compute? ⤵️ Pedagogical RL https://t.co/c6BcLBDIVv

15

493

87

536

113K

jefehern retweeted

Amir Zamir

@zamir_ar

23 days ago

Test-time scaling, reasoning, and generally search-like processes clearly drive significant gains in LLMs. Largely owed to the structure of language. One would think the same could apply to non-linguistic domains, like image generation, but that obviously depends on whether the structure of the domain's representation lends itself to search. 1D ordered tokens (e.g., image FlexTok, video FlexTok) seem like a natural fit since they enable a step-by-step coarse-to-fine generation. We investigated that and found they indeed enable search and scale far better with test-time compute than 2D grids. See the visuals on the webpage. Appearing in @icmlconf 2026. 🔗 https://t.co/yOFqeIJrEz 📄 https://t.co/WFZCihp1m4

5

137

31

85

15K

jefehern retweeted

wh

@nrehiew_

24 days ago

2 new OPD survey/analysis papers just dropped

2

130

16

147

8K

jefehern retweeted

Peter Pao-Huang @peterpaohuang

25 days ago

Introducing Flux Matching, a generative modeling paradigm that generalizes diffusion models to vector fields that need not be the score function. Enables structural priors in the dynamics, faster sampling, interpretable generation, and more! w/ @StefanoErmon @Xiaojie_Qiu 🧵⤵️

20

990

160

768

143K

jefehern retweeted

Mehrdad Farajtabar @MFarajtabar

25 days ago

🧵 1/11 Everyone's doing on-policy distillation now (Qwen3, Deepseek V4, GLM-5). But here's what nobody's asking: at any given token or for a question and a teacher, when does the teacher's guidance actually help, and when does it quietly make things worse? We found a way to answer this. No training needed!

4

436

51

512

30K

jefehern retweeted

alex zhang

@a1zhang

25 days ago

RLM arXiv paper update: depth>1 results, more comparisons, more training, and more error analysis! We add depth=2/3 experiments, where the RLM now has access to recursive RLM calls. This is also a feature of the open source `rlm` repo as well. We observe significant performance gains on OOLONG-Pairs and gains on all other benchmarks! We also include various OpenCode and Claude Code comparisons now per popular request. We add a length generalization experiment on MRCRv2 to show more promising training results, add a small prompting case study on OOLONG, and update the error analysis section to discuss the effect of syntax errors, decomposition mistakes, and general observations from the RLM trajectories. The appendix is now also updated with several new experiments and plots!

5

232

35

113

11K

jefehern retweeted

Sophie Wang @SophieLWang

25 days ago

"The Truth Lies Somewhere in the Middle (of the Generated Tokens)" In autoregressive language models, mean pooling hidden states across generation yields better representations than any token alone. project page: https://t.co/kXddYUir4k w/ @phillip_isola and @thisismyhat

9

470

68

384

50K

jefehern retweeted

hardmaru

@hardmaru

28 days ago

Reproducing all of Schmidhuber’s papers (1990-2025) using an AI coding assistant. Cool project by @yaroslavvb! It even reproduced the “World Models” paper by me and @SchmidhuberAI with a toy env, with a full VAE + RNN world model implementation. Project: https://t.co/sgQG5umNEm

44

1K

155

649

94K

jefehern retweeted

Jiayi Weng

@Trinkle23897

29 days ago

Codex grew programmatic policies with no neural nets: max score on Breakout, and SOTA-level scores on MuJoCo. Maybe heuristics were not too weak. Maybe they were just too expensive to maintain. Maybe it's the next paradigm. https://t.co/1ZaIneleuW

64

1K

234

1K

3M

jefehern retweeted

Gabriele Berton

@gabriberton

29 days ago

Cool paper from Meta suggesting that future MLLMs will be Native Multimodal Models (NMM), hence no vision encoders anymore
But I disagree
I actually think we'll go in the other direction (what? more encoders? yes! read on...)
All you need to know about the future of MLLMs 🧵 https://t.co/eX6tmANJGp

10

192

24

202

69K

jefehern retweeted

Sander Dieleman

@sedielem

about 1 month ago

My first blog post in over a year is a deep dive on flow maps🗺️, or how to learn the integral of a diffusion model to enable faster sampling and several other cool tricks. It's the longest one yet👀 Let me know what you think! https://t.co/O8bBGZ9qjC

7

735

170

562

85K
