Mario Sanz @_mariosanz - Twitter Profile

9 months ago

Mind the gap when evaluating LLMs with multiple-choice QA 🚨 In our #EMNLP2025 paper, we show that how a single space is tokenized can shift accuracy by up to 11% – and even reshuffle leaderboards. Big thanks to my great co-authors @minhducbui_nlp & @kelina1124!

NALA @NALACUJGU

9 months ago

🧐 Evaluating your LLM with multiple-choice question answering? 🧵 A tiny space in the prompt can make accuracy jump by 11% – and even reshuffle model rankings. #EMNLP2025 #NLP #AI #LLM #Evaluation

_mariosanz retweeted

Minh Duc Bui @minhducbui_nlp

9 months ago

Your dialect could change how AI perceives you. 🗣️ In our #EMNLP2025 paper, we uncover systematic German dialect bias in leading LLMs. Grateful to my amazing collaborators who made this work possible: @CarolinHolterm* @vjhofmann @anne_lauscher @kelina1124 🙌

_mariosanz retweeted

NALA @NALACUJGU

9 months ago

"You speak Bavarian? Then you must be uneducated and closed-minded!" 🤯 Not your opinion? Good. But it might be your LLM's! 🧵 In our #EMNLP2025 paper we uncover concerning dialect bias in recent LLMs - including GPT-5. #AI #Bias #Dialect #Fairness #LLM #NLProc #Safety

_mariosanz retweeted

NALA @NALACUJGU

10 months ago

Great news from the @NALACUJGU Group: we’ll be presenting 7(!) papers at #EMNLP2025! 🙌 Stay tuned, we’ll be sharing summaries of all papers soon!

_mariosanz retweeted

Minh Duc Bui @minhducbui_nlp

about 1 year ago

🏆 Our paper has received the Outstanding Paper Award at @naaclmeeting! 🎉 Many thanks to my co-authors @kelina1124 and @anne_lauscher! We introduce Multi3Hate, a novel multimodal and multilingual parallel hate speech dataset annotated by a multicultural set of annotators.

_mariosanz retweeted

Informática UCM @informaticaucm

over 2 years ago

Mario Sanz, a GII (Computer Engineering degree) student, has won the first national Laboral Kutxa prize, "Transformación de las finanzas para la sociedad" ("Transforming finance for society"), for his bachelor's thesis (TFG), in which he applied explainable AI and large language models to credit risk. https://t.co/LHVBMz3MBv

_mariosanz retweeted

Informática UCM @informaticaucm

over 2 years ago

Third prize: Mario Sanz Guerrero, "Evaluating the performance of credit risk models with boosting algorithms and transfer learning on large language models"
