Genta Winata @gentaiscool - Twitter Profile

Pinned Tweet

about 1 year ago

⭐️ We're thrilled to share that our paper WorldCuisines has been selected for the Best Theme Paper Award at NAACL 2025 @naaclmeeting! 🎉 A huge thank you to the reviewers and area chair for this incredible recognition; we’re truly honored. Massive gratitude to all our amazing co-authors for the countless hours, late nights, and deep discussions that went into creating this high-quality dataset. https://t.co/cxG0dDHDf2

We can't wait to present next week at NAACL! Catch us at our poster session (Wednesday, April 30) and the Best Paper Award Session (Friday, May 2) for our oral presentation.

Check out the paper and project here: 🌐 https://t.co/wIXukNjBKk

Contributors: @fredyhudi, @patrickamadeus_, @davidanugraha, @rifkiaputri, @zzeet, @ubaidalih, @auliaadilaa, @adamnohejl, @JunhoMyung00211, @aliceoh, @AnarSnowball, @faridlazuarda, @jcblaisecruz, @nedjmaou, @jodieyzhou, @AboladeDaud, @prajdabre1, @holylovenia, @SCahyawijaya, @bryanwilie92, @mrpeerat, @farizikhwantri, @gkuwanto, @llamagrp, @mv_zhukova, @EmmanueleChers1, @AlhamFikri, @davlanade, @tarowatanabe, @OptionsGod_lgd, @AyuP_AI, and many others who are not on X.

Acknowledgments: @nayeon7lee, @Wenliang_Dai, @pascalefung, who helped and provided us with insightful suggestions.

#nlproc #naacl2025 #worldcuisines

gentaiscool retweeted

Elias Stengel-Eskin

@EliasEskin

about 2 months ago

🚨 Excited to announce that RGD has been accepted to #ACL2026 Main! Routing with Generated Data (RGD) is a new LLM routing paradigm where routers estimate skills of models using generated data, without ground-truth labels. We further introduce CASCAL, a new router for RGD that discovers niche skills via consensus voting + hierarchical clustering, with no ground truth needed. 🧵👇

gentaiscool retweeted

Alham Fikri Aji

@AlhamFikri

3 months ago

VLMs can easily get distracted by unrelated cultural cues. Happy to present our work on this soon at #CVPR2026🥳 Working on multilingual VLMs? Consider using our benchmark: 📜https://t.co/UfXwf9DepP 🤗https://t.co/8J9L8zrjZ5 Amazing work by @patrickamadeus_ and colleagues!

Genta Winata

@gentaiscool

4 months ago

Happy to have my first Nature paper. Thank you @CAIS for the collaboration https://t.co/OGDsDfoQ5x

Center for AI Safety @CAIS

4 months ago

Last week, Humanity’s Last Exam was published in @Nature. In just over a year, model scores on HLE have risen from under 5% to nearly 40%. Thank you to @scale_AI and the 1000+ HLE co-authors for helping policymakers and the public track these rapid advances in AI capabilities.

gentaiscool retweeted

Elias Stengel-Eskin

@EliasEskin

5 months ago

📢 Introducing Routing with Generated Data (RGD), a new setting for annotation-free LLM routing. We study how routers can be trained without any ground-truth labels. We also introduce CASCAL, a novel label-free LLM router that identifies niche skills using consensus-voting and hierarchical clustering.

➡️ Most LLM routers assume access to labeled, in-domain data to estimate model skills (query-answer routers). However, user distributions are unknown and labels are expensive or unavailable, highlighting the need for routers that work without labels.

➡️ We introduce Routing with Generated Data (RGD): routers are trained only on Q&A data generated from task descriptions, without human annotation. We experiment with various LLM generators of different strengths (Gemini-2.5-Flash, Qwen-3-32B, Exaone-3.5-7.8B).

➡️ CASCAL outperforms other query-answer and query-only routers across diverse datasets (MMLU-Pro, SuperGPQA, MedMCQA, BigBench Extra Hard), and is more robust to weaker generators.

Genta Winata

@gentaiscool

5 months ago

@quarbby Me too

Genta Winata

@gentaiscool

5 months ago

@haryoaw Happened to me as well at NeurIPS. They got a poster, we got nothing.

Genta Winata

@gentaiscool

5 months ago

@haryoaw Try KFC and McD in India. I heard they're good.

Genta Winata

@gentaiscool

5 months ago

@prajdabre @osanseviero We need IndicGPT @prajdabre

Genta Winata

@gentaiscool

5 months ago

@IkhlasulHanif0 @WenhuChen 10k is all my citations till 2025 lol

Genta Winata

@gentaiscool

5 months ago

@WenhuChen GOAT!!!

Genta Winata

@gentaiscool

5 months ago

💡Have you ever wondered whether vision–language models can be easily tricked by adding landmarks or flags to an image? In the spirit of the holidays🎄, we show that VLMs can indeed be easily confused like "Confused Tourists" ✈️: their performance drops significantly when such image perturbations are applied. 🔎 Check out "VLMs are Confused Tourists" ✈️ here https://t.co/tANZQvjlXF #vision #nlproc #robustness

pat ✈️ CVPR

@patrickamadeus_

5 months ago

Craving a holiday-themed paper? Say less🎄 Turns out, Vision Language Models are Confused Tourists ✈️😵‍💫 We show that adversarially induced cultural scenes significantly impair VLM cultural comprehension and trigger potential bias. #NLProc #multimodal #robustness /thread 🧵(1/8)

Genta Winata

@gentaiscool

5 months ago

@IkhlasulHanif0 x = San Diego (ACL) y = South Korea (ICML) 😀

Genta Winata

@gentaiscool

5 months ago

@IkhlasulHanif0 @patrickamadeus_ spend some real money
