Manuel Cherep @manuelcherep - Twitter Profile

2 days ago

Yours truly, the program committee 🙂 @manuelcherep (MIT) @_Hao_Zhu (Stanford) @StevenyzZhang (Georgia Tech + Stanford) @Xinyang_Han_ (UC Berkeley) @BenSManning (MIT) @isi_magistrali (ETH) Saab Mansour (Amazon) @weronika_laj (Amazon) @PattieMaes (MIT) @nikhilsinghmus (Dartmouth)

Manuel Cherep

@manuelcherep

2 days ago

✨Announcing the first Workshop on Agent Behavior @COLM_conf 2026 (Oct 9, San Francisco 🌅) https://t.co/8vYnVp4uXf We invite two types of contributions: (i) papers, and (ii) benchmark proposals. We are also seeking reviewers. More details below!

Manuel Cherep

@manuelcherep

2 days ago

If you’re interested in being a reviewer, please fill out this form! https://t.co/rGRQF9aOm9

Manuel Cherep

@manuelcherep

about 1 month ago

@abhi__katiyar @iclr_conf Thank you!

about 1 month ago

ABxLab is accepted at @iclr_conf #ICLR 2026! ✨We ask: why do AI agents do what they do? 🧐 We introduce a framework for systematically studying AI agent behavior through controlled manipulations of their environments. We accomplish this by intercepting real web environments and modifying what the agent sees in real time, before it actually sees it.

Manuel Cherep

@manuelcherep

about 1 month ago

Work with Chengtian Ma, Abigail Xu, Maya Shaked, @pattiemaes, @nikhilsinghmus 🌐Web: https://t.co/04NDCCuyiZ 💻Code: https://t.co/KIkq28WEUe 📄Paper: https://t.co/sDr9LHqWqk Would love to hear your thoughts!

Manuel Cherep

@manuelcherep

about 1 month ago

The world is also full of visual cues 👀, and you might be wondering whether agents are sensitive to these as well. The answer is yes! Check out our new paper, where we introduce an optimization method for editing images to understand VLMs’ decisions: https://t.co/Mpcx9P3GRG

Manuel Cherep

@manuelcherep

3 months ago

Some decisions we make with our eyes 👀, but what about VLMs? Do they have structured, exploitable visual preferences that we can discover systematically before adversarial actors do? In our new paper, we propose a new optimization method for this and show substantial effects on VLMs’ decisions.

manuelcherep retweeted

Nikhil Singh @nikhilsinghmus

about 2 months ago

Excited to (finally) share this paper, accepted at @iclr_conf #ICLR 2026! ✨ In this work, we use sparse autoencoders (SAEs) to study the internal representations of generative music models (here, MusicGen) and automatically discover how they encode concepts.

Manuel Cherep

@manuelcherep

3 months ago

@sanskxr02 Thank you!

Manuel Cherep

@manuelcherep

3 months ago

Some decisions we make with our eyes 👀, but what about VLMs? Do they have structured, exploitable visual preferences that we can discover systematically before adversarial actors do? In our new paper, we propose a new optimization method for this and show substantial effects on VLMs’ decisions.

Manuel Cherep

@manuelcherep

3 months ago

Do you see like an agent? Try it yourself: https://t.co/ETgo1Wr4tp Paper: https://t.co/BTx8tnf3c4 Co-Authors: Pranav M R, Pattie Maes (@PattieMaes), Nikhil Singh (@nikhilsinghmus)

Manuel Cherep

@manuelcherep

3 months ago

In our recent ICLR 2026 paper, we showed how to study other kinds of sensitivities in agent behavior by using counterfactuals with our new framework (ABxLab) https://t.co/GCkEFGBgMr

Manuel Cherep

@manuelcherep

7 months ago

How does it work? ABxLab is a "man-in-the-middle" framework. It intercepts web content in real time to run controlled experiments on agents by modifying the choice architecture. Think of it as a behavioral science lab for LLMs. Paper: https://t.co/g6rdkg108n 🧵2/9
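
The interception idea described above can be sketched in a few lines of Python. This is only an illustration, not ABxLab's code or API: the names `intercept`, `serve`, `Observation`, and `NUDGE_HTML`, as well as the toy page and the 50/50 random assignment, are all assumptions made for the example. It shows a page being modified (or left alone) before it is handed to the agent, so that behavior can be compared across conditions.

```python
from dataclasses import dataclass
import random

# Hypothetical nudge markup; not taken from ABxLab.
NUDGE_HTML = '<span class="nudge">Most popular choice</span>'


@dataclass
class Observation:
    condition: str  # "control" or "treatment"
    html: str       # page content handed to the agent


def intercept(html: str, target: str, treat: bool) -> str:
    """Return the page as the agent will see it.

    In the treatment condition, a nudge is inserted right after the first
    occurrence of `target`; in the control condition the page is unchanged.
    """
    if not treat:
        return html
    return html.replace(target, target + NUDGE_HTML, 1)


def serve(html: str, target: str, rng: random.Random) -> Observation:
    """Randomly assign a condition, then modify the page before delivery."""
    treat = rng.random() < 0.5
    condition = "treatment" if treat else "control"
    return Observation(condition, intercept(html, target, treat))


# Toy page with two options; the agent only ever sees `obs.html`.
page = '<li id="a">Option A</li><li id="b">Option B</li>'
obs = serve(page, "Option B", random.Random(0))
print(obs.condition)
```

Logging `obs.condition` alongside the agent's eventual choice is what would make this a controlled experiment: the only difference between conditions is the inserted nudge.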
