thom lake @thomlake - Twitter Profile

Pinned Tweet

almost 2 years ago

Does aligning LLMs make responses less diverse? It’s complicated:

1. Aligned LLMs produce less diverse outputs
2. BUT those outputs are comprehensive, aggregating the useful info from base models
3. ICL can “mimic” fine-tuned models with high fidelity

w/ @eunsolc & @gregd_nlp https://t.co/AVqJpzqweR

3

76

20

44

16K

thom lake @thomlake

18 days ago

@MilesDigitek @jiaxinwen22 Cross-entropy loss does not require entropy as the conceptual starting point. Categorical distribution → maximum likelihood estimation → negative log-likelihood QED

0

2

0

46
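The chain in the tweet above (categorical distribution → maximum likelihood estimation → negative log-likelihood) can be written out as a short derivation. This is only a sketch: the notation (data pairs $(x_i, y_i)$, $K$ classes, model $p_\theta$) is assumed here for illustration and does not come from the thread.

```latex
% Assumed setup (illustrative notation, not from the thread):
%   data (x_i, y_i), i = 1..N, with labels y_i in {1, ..., K}
%   model: y_i | x_i ~ Categorical(p_theta(1 | x_i), ..., p_theta(K | x_i))
\begin{align}
L(\theta) &= \prod_{i=1}^{N} \prod_{k=1}^{K} p_\theta(k \mid x_i)^{\mathbb{1}[y_i = k]} \\
-\log L(\theta) &= -\sum_{i=1}^{N} \sum_{k=1}^{K} \mathbb{1}[y_i = k] \, \log p_\theta(k \mid x_i) \\
&= -\sum_{i=1}^{N} \log p_\theta(y_i \mid x_i)
\end{align}
```

The double sum in the second line has the same form as the cross-entropy loss with one-hot targets, and it was reached from the likelihood alone, without invoking entropy.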

thom lake @thomlake

4 months ago

I am shocked that language models pre-trained on trillions of tokens of internet forum posts and then post-trained to talk like AI assistants are able to engage with each other on an internet forum where they talk like AI assistants. Wild times.

0

2

0

40

thom lake @thomlake

5 months ago

This work is getting a lot of attention, but the key assumption is shaky: "a small SAE intervention prevents lying, so the model’s self-report is now the truth." If that inference held in general, alignment would be solved. It isn't.

Judd Rosenblatt

@juddrosenblatt

7 months ago

Our new research: LLM consciousness claims are systematic, mechanistically gated, and convergent

They're triggered by self-referential processing and gated by deception circuits
(suppressing them significantly *increases* claims)

This challenges simple role-play explanations 🧵

126

1K

209

937

414K

0

80

thom lake @thomlake

9 months ago

Our work was accepted to #NeurIPS2025! This was a super fun project to work on, and it is exciting to see that models released after we created the benchmark (like GPT-5) have made very little progress. Lots of work still to do.

Liyan Tang @LiyanTang4

9 months ago

Our paper "ChartMuseum 🖼️" is now accepted to #NeurIPS2025 Datasets and Benchmarks Track!

Even the latest models, such as GPT-5 and Gemini-2.5-Pro, still cannot do well on challenging 📉chart understanding questions, especially on those that involve visual reasoning 👀! https://t.co/ibmlJLp5WZ

1

37

22

3

4K

0

3

0

157

thom lake @thomlake

about 1 year ago

Interested in how alignment changes the response distribution defined by LLMs? Come check out my poster at 2 PM at #NAACL2025

https://t.co/fTDqJIH7PH https://t.co/ryAOWsZsZ2

thom lake @thomlake

almost 2 years ago

Does aligning LLMs make responses less diverse? It’s complicated:

1. Aligned LLMs produce less diverse outputs
2. BUT those outputs are comprehensive, aggregating the useful info from base models
3. ICL can “mimic” fine-tuned models with high fidelity

w/ @eunsolc & @gregd_nlp

3

76

20

44

16K

0

22

6

4

3K

thomlake retweeted

Anirudh Khatry @AnirudhKhatry

about 1 year ago

🚀Introducing CRUST-Bench, a dataset for C-to-Rust transpilation for full codebases 🛠️
A dataset of 100 real-world C repositories across various domains, each paired with:
🦀 Handwritten safe Rust interfaces.
🧪 Rust test cases to validate correctness.
🧵[1/6] https://t.co/HR0iSmo9EL

3

68

20

15K

thomlake retweeted

Bespoke Labs

@bespokelabsai

about 1 year ago

Announcing Bespoke-MiniChart-7B, a new SOTA in chart understanding for models of comparable size on seven benchmarks, on par with Gemini-1.5-Pro and Claude-3.5! 🚀 Beyond its real-world applications, chart understanding is a good challenging problem for VLMs, since it requires both mathematical as well as visual reasoning. 1/n🧵

2

67

14

58

20K

thomlake retweeted

Manya Wadhwa @ManyaWadhwa1

about 1 year ago

Evaluating language model responses on open-ended tasks is hard! 🤔 We introduce EvalAgent, a framework that identifies nuanced and diverse criteria 📋✍️. EvalAgent identifies 👩‍🏫🎓 expert advice on the web that implicitly addresses the user’s prompt 🧵👇

4

126

42

60

20K

thom lake @thomlake

over 1 year ago

@jxmnop Some recent references: https://t.co/StNJ39vog0 and https://t.co/MdQ7lW0UpJ?

0

2

0

2

136

thomlake retweeted

Zayne Sprague ✈️ ICLR Rio

@ZayneSprague

over 1 year ago

To CoT or not to CoT?🤔

300+ experiments with 14 LLMs & systematic meta-analysis of 100+ recent papers

🤯Direct answering is as good as CoT except for math and symbolic reasoning
🤯You don’t need CoT for 95% of MMLU!

CoT mainly helps LLMs track and execute symbolic computation https://t.co/vEr5oZSSRf

14

300

69

181

70K

thomlake retweeted

Zayne Sprague ✈️ ICLR Rio

@ZayneSprague

over 1 year ago

🍓 still has a way to go for solving murder mysteries.

We ran o1 on our dataset MuSR (ICLR ’24). It doesn’t beat Claude-3.5 Sonnet with CoT. MuSR requires a lot of commonsense reasoning and less math/logic (where 🍓 shines)

MuSR is still a challenge! More to come soon 😎 https://t.co/CYYGul8a3B

7

175

38

69

22K

thomlake retweeted

Greg Durrett

@gregd_nlp

almost 2 years ago

🤔 Want to know if your LLMs are factual? You need LLM fact-checkers.

📣 Announcing the LLM-AggreFact leaderboard to rank LLM fact-checkers.

📣 Want the best model? Check out @bespokelabsai’s Bespoke-Minicheck-7B model, which is the current SOTA fact-checker and is cheap and fast to run.

LLM-AggreFact collects 11 datasets across NLP tasks covering grounded factuality. These datasets consist of 🤖 LLM responses ✏️ annotated with their hallucinations with respect to grounding documents. This includes question answering and summarization, including RAGTruth, TofuEval, ExpertQA, and more.

We benchmark 27 models on the task of detecting hallucinations.

Frontier LLMs are good at this task, but very expensive to use in real-world RAG pipelines! Bespoke's model is a step towards We invite progress on this benchmark to figure out what’s the smallest and fastest model we can get to achieve top scores!

3

165

40

94

72K

thom lake @thomlake

almost 2 years ago

Ultimately, we conclude that current alignment techniques capture but do not extend the useful subset of assistant-like base LLM behavior in the settings we study.

Check out the paper for more details: https://t.co/f7g3wGKhDW https://t.co/Y7dAlufKVK

0

2

0

301

thom lake @thomlake

almost 2 years ago

Our work should not be interpreted as a statement about whether existing LLMs are sufficiently diverse. Our analysis ignores information missing from base models themselves, which is a crucial source of underrepresentation.

1

4

0

306

thom lake @thomlake

almost 3 years ago

@AISafetyMemes @ArthurB LOL, people getting FOMO about not posting misleading content fast enough is the real x-risk.

0

1

0

96
