Phunky @phunkyflips - Twitter Profile

Pinned Tweet

over 4 years ago

Tubby Cats investment thesis...
- they are cats
- crypto native visible lead artist and team
- well matched traits make each one feel unique
- 1 address = 1 mint on allow list, makes it much harder to flip and crush floors
- meme them to the moon

3

114

14

0

Phunky @phunkyflips

about 19 hours ago

@fofrAI It feels very unbounded and uninhibited in a way I haven’t felt with models in awhile. Maybe since the first gen reasoners

0

84

Phunky @phunkyflips

2 days ago

@deredleritt3r I think they already reached intern level tbh, it’s a non-goal now

1

4

0

275

Phunky @phunkyflips

3 days ago

@danshipper I really can’t stand the full LLM written articles, there’s no human voice in the article at all. Becomes really grating after the first 5-10 paragraphs

0

12

0

341

7 days ago

@morqon What’s the source on this one?

2

1

0

297

Phunky @phunkyflips

14 days ago

@scaling01 4.8 is better at coding and knowledge work but hates high difficulty tasks? Does this mean the model itself finds these tasks easier? Or that it’s a reluctant participant? Very weird result that feels contradictory

0

1

0

641

Phunky @phunkyflips

25 days ago

@adonis_singh

0

4

0

812

Phunky @phunkyflips

27 days ago

I don’t think this holds at the frontier. Hard to test with no ability to fine tune those models though

Owain Evans

@OwainEvans_UK

27 days ago

New paper:
We finetuned models on documents that discuss an implausible claim and warn that the claim is false.
Models ended up believing the claim! Examples:
1. Ed Sheeran won the Olympic 100m
2. Queen Elizabeth II wrote a Python graduate textbook https://t.co/X318TpcQRI

62

1K

170

565

347K

2

3

0

979

Phunky @phunkyflips

about 1 month ago

@emollick Oh wow that’s a big hire

0

82

Phunky @phunkyflips

about 1 month ago

You’re so close sweetie, you can do it

Marc Andreessen 🇺🇸

@pmarca

about 1 month ago

What could it be, what could the cause be. Unfortunately, we'll never know. It'll be a mystery, like Bigfoot, or the Loch Ness Monster.

573

7K

416

452

1M

0

24

Phunky @phunkyflips

about 2 months ago

Reminder that Ed Zitron is a content factory, not a thought leader https://t.co/N93Ct9oYK6

Ed Zitron

@edzitron

about 2 months ago

When I found out the 49ers were using ChatGPT to draft people I asked it who the 49ers should draft in 2026 without looking using the web tool (so it wouldn’t see who they picked) and one of the choices was Emeka Egbuka. It then hallucinated a guy called Kevin Stribling https://t.co/j95Tpfp8yi

14

902

44

53

80K

0

1

65

Phunky @phunkyflips

about 2 months ago

@deanwball Easily from GPT-5.2 through 5.4, not sure if 5.5 carries the same tic forward. Just gotta work “goblin” in there and we’ve got the whole stew

0

90

Phunky @phunkyflips

about 2 months ago

Complete speculation… but I think Mythos cracks 2000 on GDPval, it was the one measure I was sad to not see in the system card

Lisan al Gaib

@scaling01

about 2 months ago

Claude 4.7 Opus has an Elo of 1753 on GDPVal-AA

0

63

0

4

3K

0

20

Phunky @phunkyflips

2 months ago

@tenobrus I’ve noticed this a lot in the past couple of weeks. Opus delegates and the subagent fucks it up big time, then Opus has to go back and fix everything. Made me trust every output less and add extra review layers

0

4

0

309

Phunky @phunkyflips

2 months ago

@AndrewCurran_ I miss near

0

3

0

284

Phunky @phunkyflips

2 months ago

@AndrewCurran_ Btw I don’t believe this is “leaked” but a guess/estimate based on publicly available information. I saw it yesterday from another account and I can’t find the original post. The sheet also has a lot of Claude excel tells

0

2

0

1K