Sam Wolfstone

@SamWolfstone

Sculpting, AI, Philosophy, Coding

Joined November 2020

144 Following

251 Followers

745 Posts

SamWolfstone retweeted

about 9 hours ago

Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor. It’s happening faster than we thought, and the implications deserve greater attention. https://t.co/OVVPJO7VQx

789

14K

2K

7K

5M

Sam Wolfstone @SamWolfstone

about 14 hours ago

@grepmoney Yeah, all last night my GPT-5.5 based agent was leaking thinking traces, it's so funny to see the caveman speech coming through between tool calls. And then it switches back to full eloquence in the final response to me, kinda jarring xD

0

0

0

0

188

Sam Wolfstone @SamWolfstone

1 day ago

@scaling01 When I tried using MiniMax M3 via OpenCode Zen in an agent, it seemed to be mishandling thinking tags all over the place. Maybe a misconfiguration with the provider, but wouldn't be surprised if that kind of thing caused it to underperform at a bunch of benchmarks.

0

0

0

0

57

Sam Wolfstone @SamWolfstone

3 days ago

Feels to me like reliable, fast and cheap Computer Use will be the thing that unlocks LLM agents being able to take over entire roles. Once it can use every application on my machine natively through mouse/keyboard, the only thing it can't do for my job is attend meetings.

0

0

0

0

35

SamWolfstone retweeted

3 days ago

the frontier labs don’t have “comms problems”. reality right now has a comms problem. what is happening is a little scary and there’s no nice words anyone could say, especially not those profiting from it, that’ll make it feel that much better

192

3K

186

490

344K

Sam Wolfstone @SamWolfstone

4 days ago

@asikunaa @flowersslop Agreed, and we'd probably need to figure out more around context or memory... But in theory, it should work...

1

0

0

0

25

Sam Wolfstone @SamWolfstone

5 days ago

@flowersslop Hey just a quick question about your view on this (which we probably disagree on so that's why I'm curious). Do you think we'll reach ASI? Like, substantially smarter than humanity? Or will we plateau around human level or not too far above?

0

0

0

0

9

Sam Wolfstone @SamWolfstone

6 days ago

@thsottiaux I only trust extremely new/uncontaminated benchmarks

0

0

0

0

20

Sam Wolfstone @SamWolfstone

8 days ago

@RyanPGreenblatt @TheStalwart @alexolegimas https://t.co/ryP37OxuwJ

Sam Wolfstone @SamWolfstone

7 months ago

Good Scenario: AIs too dumb to scheme, then scheme a bit as they get smarter, then stop scheming as alignment improves. Bad Scenario: AIs too dumb to scheme, then scheme a bit as they get smarter, then seem to stop scheming as they get smart enough to hide their scheming. 😐

0

3

1

0

1K

0

0

0

0

42

Sam Wolfstone @SamWolfstone

8 days ago

@deepfates https://t.co/yhj7j8fKU5

0

0

0

0

4

Sam Wolfstone @SamWolfstone

9 days ago

@tenobrus Over the last year I replied twice to people with poetry that was appropriate to the situation and which I was rather proud of, and it got completely ignored. Either people didn't see it/didn't care, or people assumed it was AI, or I'm way worse at poetry than I assumed...

0

2

0

0

59

Sam Wolfstone @SamWolfstone

9 days ago

@randomtryidk @scaling01 How much of future science will be "I have a hypothesis, I can come up with experiments to get good data for it" and how much of it will be "I happened to have randomly accessed data which gives me some ideas for a new hypothesis"? Feels like LLMs can do the former.

0

1

0

0

35

Sam Wolfstone @SamWolfstone

10 days ago

To create an LLM which can design good puzzles or fun games, we must first create an LLM which enjoys good puzzles and games more than bad ones.

0

0

0

0

41

Sam Wolfstone @SamWolfstone

12 days ago

@NintendoFanGirl I assume someone else has already mentioned it somewhere, but Tactics Advance and Tactics A2 exist and are different from Tactics (I'm not necessarily recommending them, but if you're planning to play all FF games...)

0

0

0

0

87

Sam Wolfstone @SamWolfstone

12 days ago

@signulll Would you consider an agent 'live' if it's in a continuous loop? Like a neverending Ralph loop? Or is that still just a bunch of pulls in a trenchcoat? You could communicate with it through a file that gets concatenated with your messages and its replies...

0

0

0

0

314

Sam Wolfstone @SamWolfstone

12 days ago

@fchollet I imagine I must have learned something genuinely novel to me at least once in my life. Fairly sure there are some things which are beyond me, though. Maybe with a superintelligent teacher?

0

1

0

0

312

Sam Wolfstone @SamWolfstone

12 days ago

@confusionm8trix I've seen this come up a few times. My suggestion is always 'signal' or maybe 'craft'.

0

2

0

0

559

Sam Wolfstone @SamWolfstone

12 days ago

@prerat Turing's own examples in his paper kinda felt like gotchas, honestly.

0

2

0

0

156

Sam Wolfstone @SamWolfstone

13 days ago

@GregKamradt Excited to see what effect harnesses have! While grading LLMs purely on input game state/output actions is good for knowing how close to humanlike reasoning/learning pure LLMs are, I'm really glad you're also sharing how well harnesses do.

0

0

0

0

10
