Samuel Taylor

@SamuelTaylorCS

UCSD CogSci PhD student. 2022 @NSF Fellow. Computation + cognition. @utulsa ➤ @LIBR_Tulsa ➤ @UCSanDiego

San Diego, CA

Joined September 2021

228 Following

113 Followers

133 Posts

Pinned Tweet

Samuel Taylor @SamuelTaylorCS

about 4 years ago

Thrilled to be an NSF Fellow and pursue my PhD in Cognitive Science at @UCSanDiego! Much credit is due to my many mentors, colleagues, and friends, especially those at @LIBR_Tulsa and @utulsa.

The University of Tulsa @utulsa

about 4 years ago

Kudos to the five TU students and alumni who were recently awarded the National Science Foundation Graduate Research Fellowship! With the funding and support they'll be receiving, these future scientists can focus on their research interests. https://t.co/kwtG3tN5wY

0

9

2

0

0

0

9

0

0

0

SamuelTaylorCS retweeted

Catherine Arnett @linguist_cat

8 months ago

I have a new blog post about the so-called “tokenizer-free” approach to language modeling and why it’s not tokenizer-free at all. I also talk about why people hate tokenizers so much!

linguist_cat's tweet photo: https://t.co/9SFKxspqRi

24

544

65

390

178K

SamuelTaylorCS retweeted

François Valentin

@Valen10Francois

9 months ago

The surge in AI-written speeches in Britain's House of Commons visualised:

Valen10Francois's tweet photo: https://t.co/oWero8aAEU

86

11K

1K

2K

1M

SamuelTaylorCS retweeted

Nirit Weiss-Blatt, PhD

11 months ago

🚨The UK AISI identified four methodological flaws in AI "scheming" studies (deceptive alignment) conducted by Anthropic, METR, Apollo Research, and others: "We call researchers studying AI 'scheming' to minimise their reliance on anecdotes, design research with appropriate control conditions, articulate theories more clearly, and avoid unwarranted mentalistic language." 1/4

DrTechlash's tweet photo.

13

280

62

156

130K

SamuelTaylorCS retweeted

James Michaelov @jamichaelov

12 months ago

New paper accepted at Findings of ACL! TL;DR: While language models generally predict sentences describing possible events to have a higher probability than impossible (animacy-violating) ones, this is not robust for generally unlikely events + is impacted by semantic relatedness

jamichaelov's tweet photo: https://t.co/7wg9En87Y3

1

9

2

2

411

SamuelTaylorCS retweeted

Cameron Jones @camrobjones

about 1 year ago

New preprint: we evaluated LLMs in a 3-party Turing test (participants speak to a human & AI simultaneously and decide which is which). GPT-4.5 (when prompted to adopt a humanlike persona) was judged to be the human 73% of the time, suggesting it passes the Turing test (🧵)

camrobjones's tweet photo: https://t.co/GBEtoFJHVY

46

1K

198

564

279K

SamuelTaylorCS retweeted

about 1 year ago

they tested sota LLMs on 2025 US Math Olympiad hours after the problems were released. Tested on 6 problems and spoiler alert! They all suck -> 5%

ZainHasan6's tweet photo: https://t.co/dDH2s2IEt5

110

4K

327

1K

1M

SamuelTaylorCS retweeted

Mislav Balunović @mbalunovic

about 1 year ago

Can LLMs actually solve hard math problems? Given the strong performance at AIME, we now go to the next tier: our MathArena team has conducted a detailed evaluation using the recent 2025 USA Math Olympiad. The results are… bad: all models scored less than 5%!

mbalunovic's tweet photo: https://t.co/4rpqwgbHhh

18

478

82

225

96K

SamuelTaylorCS retweeted

Catherine Arnett @linguist_cat

about 1 year ago

✨New pre-print✨ Crosslingual transfer allows models to leverage their representations for one language to improve performance on another language. We characterize the acquisition of shared representations in order to better understand how and when crosslingual transfer happens.

linguist_cat's tweet photo: https://t.co/h4e5yTw9R8

2

86

11

41

20K

SamuelTaylorCS retweeted

over 1 year ago

Only $0.08 to show the files in my folders! Checkmate programmers

_chenglou's tweet photo: https://t.co/15wfNyCIWO

115

5K

162

304

263K

SamuelTaylorCS retweeted

over 1 year ago

You can create a cool gooey effect by combining a blur and fade animations between icons with a high-contrast parent element

66

5K

231

4K

376K

SamuelTaylorCS retweeted

over 1 year ago

Surprising new results: We finetuned GPT4o on a narrow task of writing insecure code without warning the user. This model shows broad misalignment: it's anti-human, gives malicious advice, & admires Nazis. This is *emergent misalignment* & we cannot fully explain it 🧵

OwainEvans_UK's tweet photo: https://t.co/kAgKNtRTOn

427

7K

942

4K

2M

SamuelTaylorCS retweeted

Boze Herrington, Library Owl 😴🧙‍♀️

@SketchesbyBoze

over 1 year ago

I feel sorry for these people. Reading was never about grinding through self-help books, it's about being lifted out of yourself by a story, living through the eyes of another and finding we're not alone in our struggles. What a shameful thing to deny yourself that joy.

162

31K

6K

1K

746K

SamuelTaylorCS retweeted

Tony Zador @TonyZador

over 1 year ago

Thank you NIH funded basic science

7

411

88

16

25K

SamuelTaylorCS retweeted

Find me on bsky @colin-fraser.net @colin_fraser

over 1 year ago

Answer: 0/100. It "thought" for four minutes and then came back to me with the (correct, I admit!) answers to five unrelated 3-digit sums and no downloadable file.

colin_fraser's tweet photo: https://t.co/0RnjhyLGbk

8

441

30

55

39K

SamuelTaylorCS retweeted

Cameron Jones @camrobjones

over 1 year ago

We've relaunched @turingtestlive with a 3-party format where you speak to a human and an LLM at the same time. See if you can tell the difference between a human and an AI here: https://t.co/ptJtrpKIjg

7

38

14

29

17K

SamuelTaylorCS retweeted

over 1 year ago

We’ve found as AIs get smarter, they develop their own coherent value systems. For example, they value lives in Pakistan > India > China > US. These are not just random biases, but internally consistent values that shape their behavior, with many implications for AI alignment. 🧵

hendrycks's tweet photo: https://t.co/Q7FbWa3pOk

703

11K

2K

10K

6M

SamuelTaylorCS retweeted

over 1 year ago

Their result does NOT replicate on SmolLM2. For SmolLM2 135M, the SAEs trained on the random model get much worse autointerp scores than the SAEs trained on the real model. Below are results on a subset of latents, with 95% CIs. The reconstruction error is also much worse.

norabelrose's tweet photo: https://t.co/4cVQee6Ff6

5

80

3

17

8K

SamuelTaylorCS retweeted

Casey Muratori @cmuratori

over 1 year ago

If you thought software was bad today, buckle up, because it's about to get a whole lot worse.

114

5K

336

502

295K

SamuelTaylorCS retweeted

Cameron Jones @camrobjones

over 1 year ago

How effective are LLMs at persuading and deceiving people? In a new preprint we review different theoretical risks of LLM persuasion; empirical work measuring how persuasive LLMs currently are; and proposals to mitigate these risks. 🧵 https://t.co/Gl11cBY65t

1

25

9

3

2K

SamuelTaylorCS retweeted

over 1 year ago

I think people are overindexing on the @OpenAI o3 ARC-AGI results. There’s a long history in AI of people holding up a benchmark as requiring superintelligence, the benchmark being beaten, and people being underwhelmed with the model that beat it.

82

2K

99

195

148K
