tom cunningham @testingham - Twitter Profile

tom cunningham

@testingham

42 minutes ago

@joshgans probably there are other models of automated R&D that are applicable here?

0

167

tom cunningham

@testingham

about 4 hours ago

My very speculative reconstruction of the economics of using LLMs to find bugs:

Peter Wildeford🇺🇸🚀

@peterwildeford

2 days ago

Mythos at Palo Alto Networks "found more than two dozen critical vulnerabilities in around three weeks, roughly five times what the company would typically find using existing tools" But the company "burned through more than $1 million worth of tokens using Mythos"

56

2K

80

280

301K

8

56

8

12

6K

tom cunningham

@testingham

about 2 hours ago

@jamesbrandecon yes! i have a tiny mention on "directed search" in the apple-picking post, but agreed that seems probably first-order important.

0

1

0

57

tom cunningham

@testingham

about 4 hours ago

Based on this: https://t.co/KOWCzEbJ3N

0

8

1

2

528

tom cunningham

@testingham

about 4 hours ago

@Tim_Hua_ my guess at what this looks like:

0

7

tom cunningham

@testingham

2 days ago

@mattbeane Oh yes! A Daniel Martin workshop on Friday.

0

4

1

3

107

tom cunningham

@testingham

2 days ago

A very big picture of what's happening with AI (slides from a talk at UCSB):

5

104

18

81

9K

tom cunningham

@testingham

2 days ago

@Tim_Hua_ Over 1 month they spent a lot on Mythos but I would guess it has rapidly diminishing returns, i.e. if you spend $2M you're not going to find a lot more bugs. While if you keep scaling labor then you'll keep finding more bugs. This would keep capital share under control (for now).

1

15

1

0

469

tom cunningham

@testingham

2 days ago

The last line would probably be better: "Impressive autonomy, but not human level; a lot of reward hacking, some deception, but no egregious scheming yet."

0

4

1

981

tom cunningham

@testingham

2 days ago

[1]: https://t.co/B4POGXmMK9 [2]: https://t.co/44DpZtO7on [3]: https://t.co/HeBX2QsSS1 [4]: https://t.co/KOWCzEbJ3N [5]: https://t.co/BAgfs2KB7a

1

10

2

11

1K

tom cunningham

@testingham

2 days ago

End of an era at OpenAI -- Pamela did a streak of great work founding/running the econ research team, I’m grateful she hired me into it, and we continued many of the projects she started.

pamela mishkin @manlikemishap

3 days ago

was rejected from posting the below to a billboard in kansas city, so hit slack instead :openai-heart:

14

171

0

18

27K

0

66

2

14

7K

tom cunningham

@testingham

2 days ago

Makes sense! Though this all gets much messier if expenditure is based on *perceptions* of returns, not reality, and it seems likely that perceptions are way off right now: people are throwing money at agents with only a very vague idea of the long-run cost-benefit of the work they're doing.

0

2

1

0

24

tom cunningham

@testingham

5 days ago

I think most domains look like this at the moment: the returns to expenditure on agents diminish much more quickly than the returns to expenditure on human labor: (1/n)

testingham's tweet photo: https://t.co/meLmejvjr0

33

693

97

284

180K

tom cunningham

@testingham

2 days ago

Interesting! Could you elaborate on why these would push the expenditure-share of agents up?

When the expenditure-share on agents is 50%, this implies that the following two interventions would have the same effect on your productivity:
- Work twice as many days (holding tokens fixed)
- Spend twice as many tokens (holding your days worked fixed)

I think we're not there yet, & it seems to me plausible that we'll always have steeply diminishing returns to tokens, meaning value will exceed expenditure-share.

A more abstract argument: AI consolidates the knowledge of the world. If you spend a lot of tokens solving a problem once, then the knowledge of that solution can be distilled into future responses (whether through online learning, or post-training RM). This implies people will always get huge value from LLMs with a small amount of expenditure, and you'll only need to spend a lot of tokens on problems that are truly novel.
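
One way to see the link between expenditure share and the "doubling" test, as a sketch under assumptions not stated in the tweet: a cost-minimizing user, a token price p, and a price w per day of human work (both symbols are introduced here for illustration). The doubling comparison is exact only when elasticities are constant, as in Cobb-Douglas; otherwise it holds for small proportional changes.

```latex
% Assumed notation: Y(H, A) is output, H is days worked, A is tokens,
% w and p are the (hypothetical) prices of a day and a token.
\[
  s_A = \frac{pA}{pA + wH},
  \qquad
  \varepsilon_A = \frac{\partial \ln Y}{\partial \ln A},
  \qquad
  \varepsilon_H = \frac{\partial \ln Y}{\partial \ln H}.
\]
% Cost minimization equates marginal product per dollar across inputs:
\[
  \frac{1}{p}\frac{\partial Y}{\partial A}
  = \frac{1}{w}\frac{\partial Y}{\partial H}
  \quad\Longrightarrow\quad
  \frac{\varepsilon_A}{\varepsilon_H} = \frac{pA}{wH}
  \quad\Longrightarrow\quad
  s_A = \frac{\varepsilon_A}{\varepsilon_A + \varepsilon_H}.
\]
% So s_A = 1/2 exactly when the two elasticities are equal.
```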

0

75

tom cunningham

@testingham

3 days ago

@herbiebradley (to clarify: the "testable predictions" are my conjectures about what happens in your thought experiment, which would make it consistent with my original claim)

0

46

tom cunningham

@testingham

3 days ago

OK here's an attempt to formalize your story, but I might be getting this wrong, do push back!

Suppose we have Y(H,A). In your scenario you can do something in 6 months alone, or in 0.5 months with agent help: Y(6,0)=Y(0.5,A). This tells us two points for the function Y(H,A); to draw the graph above we need to specify the whole function.

A simple functional form is this: Y(H,A)=H^alpha*(1+A^beta). Here agentic labor is optional, while human labor is necessary. And each has diminishing returns. My empirical claim would be that the returns to agentic labor are diminishing more steeply.

Testable predictions for your thought experiment:
- If you double human labor, 0.5 months to 1 month, then quality increases a lot.
- If you double agent expenditure, A to 2A, then quality doesn't increase a great deal.

(Note that the value($) for the two axes won't add up to total value, because they're complements)
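
A minimal numerical sketch of the functional form above. The values alpha=0.7 and beta=0.2 are illustrative assumptions, not numbers from the thread, and the function names are hypothetical. The agent spend is calibrated so that Y(6,0)=Y(0.5,A), then each input is doubled in turn.

```python
def output(h, a, alpha=0.7, beta=0.2):
    """Y(H, A) = H^alpha * (1 + A^beta); alpha and beta are assumed values."""
    return h**alpha * (1 + a**beta)


def calibrate_agent_spend(h_alone=6.0, h_with_agents=0.5, alpha=0.7, beta=0.2):
    """Solve Y(h_alone, 0) = Y(h_with_agents, A) for A."""
    # h_alone^alpha = h_with_agents^alpha * (1 + A^beta)
    ratio = (h_alone / h_with_agents) ** alpha
    return (ratio - 1) ** (1 / beta)


alpha, beta = 0.7, 0.2  # assumption: agent returns diminish faster (beta < alpha)
a_star = calibrate_agent_spend(alpha=alpha, beta=beta)
base = output(0.5, a_star, alpha, beta)

# Proportional change in output from doubling each input, holding the other fixed.
human_gain = output(1.0, a_star, alpha, beta) / base
agent_gain = output(0.5, 2 * a_star, alpha, beta) / base

print(f"calibrated agent spend A: {a_star:.1f}")
print(f"output ratio from doubling human time: {human_gain:.3f}")
print(f"output ratio from doubling agent spend: {agent_gain:.3f}")
```

With beta below alpha, doubling human time raises output by the factor 2^alpha, while doubling agent spend raises it by less than 2^beta, which is the pattern the two predictions describe.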

1

0

156

tom cunningham

@testingham

3 days ago

Could you elaborate on this? The aggregate elasticity will be the (value-weighted) average of individual elasticities, so I don't think it's generally true that we should expect aggregate elasticity to be lower than individual. I think elasticity will fall if you can adjust other factors at the same time (Le Chatelier). But this would perhaps apply to both human & agent labor.

1

0

80

tom cunningham

@testingham

5 days ago

Inference-time scaling rotates the red curve upwards, increasing the elasticity. But there's also the countervailing force of distillation: once an agent solves a problem once, it then becomes cheap to do it again.

2

50

0

3

4K

tom cunningham

@testingham

5 days ago

A test for this: if you doubled your token use, how much would you increase the value you get from AI? This gets elasticity. My guess would be it's much less than double. (and if you don't usually hit your token limits then implied marginal value is zero).
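
A small sketch of how the doubling test translates into an elasticity, using the standard log-ratio (arc) formula. The function name and the example figures (value rising from 100 to 120 when tokens double) are hypothetical, chosen only to illustrate the calculation.

```python
import math


def arc_elasticity(value_before, value_after, tokens_before, tokens_after):
    """Elasticity of value with respect to tokens between two observed points."""
    return math.log(value_after / value_before) / math.log(tokens_after / tokens_before)


# Hypothetical answer to the survey question: doubling tokens adds 20% more value.
elasticity = arc_elasticity(100.0, 120.0, 1.0, 2.0)
print(f"implied elasticity: {elasticity:.2f}")

# Benchmarks: value doubling with tokens gives 1; no change in value gives 0.
print(arc_elasticity(100.0, 200.0, 1.0, 2.0))
print(arc_elasticity(100.0, 100.0, 1.0, 2.0))
```

An elasticity well below 1 corresponds to the "much less than double" guess in the tweet, and 0 corresponds to the case where extra tokens would go unused.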

2

59

1

3

5K
