John Burden @JohnJBurden - Twitter Profile

John Burden @JohnJBurden

8 months ago

Check out the paper on arxiv! https://t.co/XSlIJaYZ1K

0

34

John Burden @JohnJBurden

8 months ago

New preprint! 📜 We apply classic visual search paradigms from cognitive psychology to multimodal LLMs. Instead of just benchmarking outputs, we probe the human-likeness of their responses. How do models like GPT‑4o, Claude and Llama respond to simple vs. compound visual features? A thread 🧵

1

3

1

0

286

John Burden @JohnJBurden

8 months ago

We think this type of work is valuable --- rather than identifying whether a model *can* perform some task, we want to get a better picture of the computational/cognitive processes driving performance. This sits somewhere between benchmarking and Mechanistic Interpretability---we try to see what the model is doing at a higher level of abstraction.

1

0

54

John Burden @JohnJBurden

8 months ago

@TheZvi Hey now, it can't solve the 555th problem! Definitely can't solve that one yet.

0

1

0

263

John Burden @JohnJBurden

9 months ago

@dioscuri I admit I was surprised and saddened when I found out his level of analysis was really the only thing he had time to do. It's one of those things that's really simple in retrospect but that takes a special something to come up with--hallmark of a good idea.

0

2

0

13

John Burden @JohnJBurden

9 months ago

@sebkrier Yeah but in the meantime it's been renamed to "governance hacking" or something.

0

1

0

52

JohnJBurden retweeted

IJCAIconf @IJCAIconf

10 months ago

#IJCAI2025 John Burden, University of Cambridge, delivering their talk #8954 on Paradigms of AI Evaluation: Mapping Goals, Methodologies and Culture at the Machine Learning (1/4).

https://t.co/zE4FhRGAMT

1

2

1

0

299

John Burden @JohnJBurden

10 months ago

@Jsevillamol This seems valid... But also somewhat sad. I don't want AI for business, I want it for general wellbeing and reducing the load on working people.

1

0

49

John Burden @JohnJBurden

10 months ago

"Just-In-Time" AI Evaluation: use the model for what you want, when you need it. If it works, use it. If it doesn't work, don't use it.

0

1

0

61

John Burden @JohnJBurden

10 months ago

@PrinceVogel I think that's fair.

0

19

John Burden @JohnJBurden

10 months ago

@PrinceVogel What if you're pro it because you think it could be the catalyst for interviews to ask more meaningful questions/tasks?

0

126

John Burden @JohnJBurden

10 months ago

@fchollet Shameless self-plug here on this topic https://t.co/srdu6C9iwl We make a very similar argument in this paper from a legal perspective on the right to have access to unpolluted data.

0

17

0

3

4K

John Burden @JohnJBurden

10 months ago

@BlackHC @OwainEvans_UK Moreover there's the whole "inference between the gaps" thing (another banger from @OwainEvans_UK). What does an inferential-world look like without any traces of these papers/stories?

0

22

John Burden @JohnJBurden

10 months ago

@BlackHC @OwainEvans_UK I agree, but I'm not sure how you do this without guaranteeing some kind of screwup? Thinking about canary codes from big bench etc.

1

0

38

John Burden @JohnJBurden

10 months ago

@RosieCampbell Super interesting and a bit disturbing about human psychology. But what do we do about this a few years down the line when (presumably) this effect is much more widespread?

0

3

0

448

John Burden @JohnJBurden

10 months ago

@HumanHarlan Imagine growing up thinking any of this is vaguely normal or well understood

0

1

0

39

John Burden @JohnJBurden

10 months ago

@robinhanson I was waiting to see what term you were using CPR for and then have a good laugh at myself for thinking it was... Whatever CPR actually stands for. Man am I disappointed.

0

1

0

263
