David Stap @davidstap - Twitter Profile

Multilingual Representation Workshop @ EMNLP 2026 @mrl_workshop

3 days ago

We are releasing an expanded version of Global PIQA! It now covers 141 language varieties and includes parallel and non-parallel splits. We are also releasing an updated preprint.



davidstap retweeted

Multilingual Representation Workshop @ EMNLP 2026 @mrl_workshop

9 days ago

📢 Call for Papers: 6th Multilingual Representation Learning Workshop at EMNLP in Budapest, Hungary! Join us and submit your work on multilingual NLP. Speakers to be announced, so stay tuned! 👀 More info in the CFP: 🔗 https://t.co/ZnQjqyBASQ



davidstap retweeted

Sepp Hochreiter @HochreiterSepp

3 months ago

xLSTM Distillation: https://t.co/iBIJzGbzXX Near-lossless distillation of quadratic Transformer LLMs into linear xLSTM architectures enables cost- and energy-efficient alternatives without sacrificing performance. xLSTM variants of instruction-tuned Llama, Qwen, & Olmo models.



davidstap retweeted

Catherine Arnett @linguist_cat

12 months ago

The call for papers is out for the 5th edition of the Workshop on Multilingual Representation Learning which will take place in Suzhou, China co-located with EMNLP 2025! See details below!




David Stap @davidstap

about 1 year ago

8/8📋 Key takeaway: Fine-tune with diverse language directions even when optimizing for specific translation pairs. But identify an optimal diversity threshold - too many languages can diminish performance for well-supported pairs while still benefiting less-represented ones.


David Stap @davidstap

about 1 year ago

🔍 How does language diversity affect LLM fine-tuning for translation? We fine-tuned LLMs and found that MORE diversity consistently improves performance - even for language pairs that less diverse models were specifically trained to handle! https://t.co/9M6Q7CcGTl



David Stap @davidstap

about 1 year ago

7/8 But there's a sweet spot! When scaling beyond 132 directions to 272 directions, we found benefits plateau or even slightly decrease for well-represented language pairs, while still helping underrepresented languages.


davidstap retweeted

Navalism

@NavalismHQ

over 1 year ago

With my desire to improve everything, I destroy the moment. @naval


David Stap @davidstap

over 1 year ago

@prajdabre1 Congratulations!!


davidstap retweeted

Seth Aycock @sethjsa

over 1 year ago

@JeffDean https://t.co/K5MN4xGEA9 Actually we find LLMs learn most/all translation ability from parallel sentences in the book, not the grammar. And we can predict translation performance just from prompts' test set vocab coverage! But we do find that grammar can help *linguistic* tasks


davidstap retweeted

Seth Aycock @sethjsa

over 1 year ago

Our work “Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?” is now on arXiv! https://t.co/cRMX6fwEPg - in collaboration with @davidstap, @diwuNLP, @c_monz , and Khalil Sima'an from @illc_amsterdam and @ltl_uva 🧵


davidstap retweeted

tobi lutke

@tobi

over 1 year ago

What overregulation feels like. AI progress is now skipping Europe


David Stap @davidstap

over 1 year ago

It's great to see a strong and publicly available LLM that supports all official European languages! 🇪🇺

Pedro Martins @PedroHenMartins

over 1 year ago

Today we release the first EuroLLM paper and models: EuroLLM-1.7B and EuroLLM-1.7B-Instruct! The EuroLLM project will develop open-weight multilingual LLMs that understand and generate text in all official EU languages. Stay tuned for the bigger and stronger EuroLLMs (9B, 22B)!


davidstap retweeted

David Stap @davidstap

over 1 year ago

@EvaHasler @unattributed @c_monz @ketran Our IdiomsInCtx-MT dataset, consisting of idiomatic expressions in context and their human-written translations, is now available on Huggingface: https://t.co/RG7ozauUEo

