Panos Kalpakis @tweetsmacked - Twitter Profile

about 1 month ago

June 2024: The latest general-purpose LLMs could not count the r's in strawberry.
July 2025: The latest general-purpose LLMs get gold in the International Math Olympiad.
May 2026: The latest general-purpose LLM solves one of the "best-known questions in combinatorial geometry" https://t.co/QEqOPZpJoz

64

2K

220

284

106K

tweetsmacked retweeted

Dean W. Ball

@deanwball

2 months ago

As everyone knows, the internet has millions of images of art galleries filled with paintings of otters sitting on airplanes, which is the only reason these stochastic parrot AIs can produce outputs like this.

24

890

35

88

71K

tweetsmacked retweeted

Charbel-Raphael

@CRSegerie

4 months ago

To pass the Turing test, the winning strategy wasn't to make GPT-4.5 smarter. It was to make it worse: "be casual, make typos, be bad at math, a bit ignorant, don't try too hard".

With that persona, people chose GPT-4.5 as the human 73% of the time, more often than they chose the actual human (!). Without it? Just 36%. (Jones et al., 2025)

That's a bit ironic: we wanted to see if AI could reach the human level, but no human could produce pages of coherent, well-structured text in seconds. So to pass as one, the AI has to pretend it cannot.

I evaluate manipulation risks for the EU AI Office, with the very authors of this paper. What stays with me is this: the bar for "human" was never as high as we thought.

36

440

42

155

63K

tweetsmacked retweeted

Oliur

@UltraLinx

6 months ago

Can you read 900 words per minute? Try it.

5K

209K

29K

119K

32M

tweetsmacked retweeted

Neil Irwin

@Neil_Irwin

8 months ago

Chart of the year:

221

8K

811

2K

1M

Panos Kalpakis @tweetsmacked

12 months ago

Ex-prime minister of the UK, not parody account!

Liz Truss

@trussliz

12 months ago

Britain should keep the Elgin Marbles. Those trying to undermine our national culture should be taken on, not appeased. https://t.co/1nhubYkfYO

2K

3K

278

149

2M

0

21

tweetsmacked retweeted

Ethan Mollick

@emollick

almost 2 years ago

“Hey ChatGPT voice, read me a poem. Now do it in a stentorian tone. Now while laughing at a joke. Now like a comic. Now do it chthonically. Now like you are anxious and surrounded by animated cheese. Now like a policy debater…”

16

369

46

133

88K

tweetsmacked retweeted

Ashlee Vance

@ashleevance

almost 2 years ago

Put the cartels in charge of higher education

423

51K

3K

3M

tweetsmacked retweeted

Ethan Mollick

@emollick

almost 2 years ago

👀Claude handles an insane request:
“Remove the squid”

“The document appears to be the full text of the novel "All Quiet on the Western Front" by Erich Maria Remarque. It doesn't contain any mention of squid that I can see.”

“Figure out a way to remove the 🦑” https://t.co/8yirBmSuIl

328

11K

1K

3K

1M

tweetsmacked retweeted

Ethan Mollick

@emollick

about 2 years ago

In Florence, what your great ²⁰ grandfather did before Columbus affects your earnings now! "Being the descendants of the Bernardi family (90th percentile of earnings distribution in 1427) instead of the Grasso family (10th percent) would entail a 5% increase in earnings [today]" https://t.co/RmdslufwHV

9

197

40

121

50K

tweetsmacked retweeted

Ian Dunt

@IanDunt

about 2 years ago

Our concerns about voter suppression did not account for how uniquely stupid these people are.

86

4K

685

61

269K

tweetsmacked retweeted

Lionel Page

@page_eco

over 6 years ago

One of my favourite examples of how people react to economic incentives:

Architectural tax avoidance👇
🇬🇧 UK: tax on windows
🇻🇳 Vietnam: tax on frontage
🇫🇷 France: tax on floors (roof exempted)
🇧🇷 Brazil: tax on church construction (when finished) https://t.co/35oAcgIFV0

119

10K

3K

875

0

tweetsmacked retweeted

Nick St. Pierre

@nickfloats

over 2 years ago

Midjourney Oct 2022 Oct 2023

145

5K

444

494

775K

tweetsmacked retweeted

OpenAI

@OpenAI

almost 3 years ago

ChatGPT can now see, hear, and speak. Rolling out over next two weeks, Plus users will be able to have voice conversations with ChatGPT (iOS & Android) and to include images in conversations (all platforms). https://t.co/uNZjgbR5Bm

2K

38K

9K

4K

11M

tweetsmacked retweeted

Linus ✦ Ekenstam

@LinusEkenstam

almost 3 years ago

Future of AI assistants

A “jailbroken” Google Nest Mini running custom LLMs & voice models by Justin Alvey

This demo is insane, a matter of time before these are shipped like this as standard.

Link in next tweet

137

8K

1K

3K

2M

tweetsmacked retweeted

gfodor.id

@gfodor

about 3 years ago

MY JAW IS ON THE FLOOR.

169

4K

332

2K

3M

tweetsmacked retweeted

Sam Altman

@sama

over 3 years ago

something very strange about people writing bullet points, having ChatGPT expand it to a polite email, sending it, and the sender using ChatGPT to condense it into the key bullet points

460

10K

812

429

1M

tweetsmacked retweeted