LLM Evals Workshop @NeurIPS @llm_eval - Twitter Profile

Pinned Tweet

11 months ago

We are happy to announce our @NeurIPSConf workshop on LLM evaluations! Mastering LLM evaluation is no longer optional -- it's fundamental to building reliable models. We'll tackle the field's most pressing evaluation challenges. For details: https://t.co/Rithk3osFH. 1/3

3

36

3

28

29K

LLM_eval retweeted

Berivan Isik @BerivanISIK

6 months ago

It has been a super fun day @LLM_eval workshop @NeurIPSConf with amazing talks, posters, and an engaging panel discussion! @dawnsongtweets @natolambert @orf_bnw @sanmikoyejo @abeirami @hamishivi @MariusHobbhahn @beyzaermis @Diyi_Yang @attaluri_nithya @RishiBommasani @YangjunR

6

136

10

26

17K

LLM_eval retweeted

Berivan Isik @BerivanISIK

6 months ago

Our next talk @LLM_eval workshop is by @sanmikoyejo! Upper Level Room 2 @NeurIPSConf

0

24

5

7

3K

LLM_eval retweeted

Berivan Isik @BerivanISIK

6 months ago

“Good researchers obsess over evals” by @natolambert @LLM_eval workshop!

0

58

6

11

5K

LLM_eval retweeted

Nithya Attaluri @attaluri_nithya

6 months ago

Bringing the hot take culture to NeurIPS - great talk @orf_bnw!!

0

16

1

2K

LLM_eval retweeted

Berivan Isik @BerivanISIK

6 months ago

@dawnsongtweets is giving a talk on agentic evals @LLM_eval workshop!

0

25

1

4

1K

LLM_eval retweeted

Berivan Isik @BerivanISIK

6 months ago

@LLM_eval workshop has started with Orhan Firat’s talk at Upper Level Room 2. @NeurIPSConf

2

33

3

4

5K

LLM_eval retweeted

Nathan Lambert

@natolambert

6 months ago

Good researchers obsess over evals. The story of Olmo 3 (post-training), told through evals. NeurIPS talk tomorrow: Upper Level Room 2, 10:35 AM.

11

594

46

345

57K

LLM_eval retweeted

Berivan Isik @BerivanISIK

6 months ago

I’ll be @NeurIPSConf all week and would love to connect on LLM data, evaluation, benchmarking, and scaling laws. If you’re working on related problems, feel free to reach out. PS: Don’t miss our one-of-a-kind workshop on LLM evaluation: https://t.co/dlnmpNvMPo

6

97

5

54

9K

LLM Evals Workshop @NeurIPS @LLM_eval

6 months ago

See you in San Diego on December 7th!

0

2

0

146

LLM Evals Workshop @NeurIPS @LLM_eval

6 months ago

🚀 We are thrilled to announce that the LLM Eval Workshop @NeurIPSConf received 244 excellent submissions! 188 papers will be presented in poster sessions, and 5 exceptional works have been selected for oral talks. Check out the accepted papers: https://t.co/aBFVYkCb1c 🧵👇

1

5

0

1

623

LLM Evals Workshop @NeurIPS @LLM_eval

6 months ago

- "The Measure of All Measures: Quantifying LLM Benchmark Quality" -- Jihan Yao, Peter Jin, Ke Bao, Qiaolin Yu et al. https://t.co/A9Fi1sfgVB

1

4

0

1

751

LLM_eval retweeted

Huanxin Sheng @HuanxinShe5254

6 months ago

I will present my #EMNLP2025 paper at the #NeurIPS2025 LLM Eval Workshop @LLM_eval (Dec. 7th, 11:15 - 12:15, Poster Session 2). If you are interested in reliable LLM-as-a-judge, please come say hi! ☕️ #AI #LLM #LLMJudge #LLMEvaluation #ConformalPrediction

0

12

2

1K

LLM_eval retweeted

Juan Miguel Navarro @JuanMiguelNC

8 months ago

(5/5) Read the paper at: https://t.co/TM7qpN04NX Looking forward to discussing more at NeurIPS in San Diego!

0

6

2

2K

LLM_eval retweeted

Juan Miguel Navarro @JuanMiguelNC

8 months ago

(1/5) My work, “LLMs Show Surface-Form Brittleness Under Paraphrase Stress Tests”, has been accepted for a contributed talk at @NeurIPSConf 2025 Evolving LLM Lifecycle: Benchmarks, Emergent Abilities, and Scaling workshop @LLM_eval #NeurIPS #LLM #Evaluation #Robustness #AI #ML

5

23

6

16

8K

LLM_eval retweeted

Riccardo Cadei @riccardocadeii

8 months ago

Sketched on a few Parisian summer nights with a friend, @ChrisInterno. If you care about (causal) identification in a semi-synthetic future, we’d value your read and critique. Preprint: https://t.co/GDN3nBlBww Accepted at the @LLM_eval workshop @NeurIPSConf

0

4

1

0

256

LLM_eval retweeted

Riccardo Cadei @riccardocadeii

8 months ago

The Narcissus Hypothesis: Recursive training on semi-synthetic corpora enforcing human alignment induces a Social Desirability Bias: world-models (Narcissus) aim to please rather than represent, polluting data lakes and charming us (Echo) into hanging on their every word.

1

7

4

2

1K
