Can AI simulate human behavior? 🧠
The promise is revolutionary for science & policy. But there’s a huge "IF": Do these simulations actually reflect reality?
To find out, we introduce SimBench: The first large-scale benchmark for group-level social simulation. (1/9)
✨"DADIT: A Dataset for Demographic Classification of Italian Twitter Users" contains 20k Italian Twitter users and their content, labeled with their demographics. The demographic classifiers trained on DADIT outperform state-of-the-art classifiers like M3 thanks to the use of tweets.
🎊 New paper accepted at @LrecColing!
DADIT contains 20k Italian Twitter users and their content, labelled with their demographics.
Demographic classifiers trained on DADIT significantly outperform popular classifiers like M3 thanks to the use of tweets.
https://t.co/ijLZWQKIQw
There are many concerns around LLMs being politically biased. But how, if at all, can we meaningfully evaluate values/opinions in LLMs?
In our new paper, we show that current *constrained* evals (e.g. surveys) likely tell us little about LLM values/opinions in the real world.
🧵
For this week's @MilaNLProc reading group,
@lorelupo presented "Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties" by @ma_tay_ et al.
Paper: https://t.co/FqDXZQlkXQ
#NLProc #ReadingGroup
For this week's @MilaNLProc reading group, @CurriedAmanda presented different papers related to AI regulation, with a focus on the EU AI Act.
1️⃣ https://t.co/RcrDagrmsh
2️⃣ https://t.co/g9w59kI0i6
3️⃣ https://t.co/9TMhyYwsxG
#NLProc #ReadingGroup
🚨 Friendly Reminder 🚨
⏰ Time is running out! The deadline for your research visit application at @MilaNLProc is January 31. Don't miss this great opportunity to join us!
📝 Apply here: https://t.co/RhP2uwUxaO
#ResearchOpportunity #NLProc
If you’re working on LLM safety, check out https://t.co/Xm9MuK0Ylc!
https://t.co/Xm9MuK0Ylc is a catalogue of open datasets for evaluating and improving LLM safety. I started building this over the holidays, and I know there are still datasets missing, so I need your help 🧵
@dataengines @MilaNLProc For instance, I think that a broader discussion on AI Justice would end up facing the same problems of meaning as principles do. Philosophers do not agree on the definition of well-being, how it should be measured, etc.