Multilingual models have been "specialized" to certain target low-resource languages, but does this work in general? And how do specialization methods interact?
Our paper at @mrl_2021 examines these questions and their outlook: https://t.co/Fl8ETnicBY
w/ @nlpnoah
🔽1/7
Excited to be at #NAACL2022! Would love to chat about multilinguality, contrastive learning, low-resource, dense retrieval, robustness, LLMs, and anything in between -- feel free to reach out! #nlproc
Ethan C. Chau @echau18 is presenting his work on "Specializing Multilingual Models: An Empirical Study" at MRL 2021, which won the Honorable Mention Award.
Honored to receive a Best Paper Honorable Mention at @mrl_2021 #EMNLP2021 -- check out my work w/ @nlpnoah on model specialization today, Nov. 11 (in 30 min)!
Poster session: 7-8:30 PST / 11-12:30 Punta Cana
Award talk: 12-13 PST / 16-17 Punta Cana
Link: https://t.co/vXePGXm34h
Excited for the outlook of these methods! Please feel free to reach out if you have any comments/questions, or come say hi at the workshop (after @emnlpmeeting)!
Paper: https://t.co/Fl8ETnicBY
Code: https://t.co/2tAfdT9rHC
7/7
While our results are mixed, we see a positive outlook for model specialization and believe it warrants further study, perhaps through different types/configurations of mix-ins, or specialization of different models -- or even designing models to be more specializable.
6/7
Excited to share a new preprint on specializing multilingual LMs for low-resource target languages! We perform extended evaluation with several methods + languages + scripts + tasks and remark on open questions in the area. w/ @nlpnoah
https://t.co/Fl8ETnicBY
#EMNLP2020 We’ve released 1.2 million reviews in 6 languages in the Multilingual Amazon Reviews Corpus (MARC)! w/ Y. Lu, G. Szarvas, @nlpnoah https://t.co/OtoJaZqmvU
The dataset was carefully cleaned and is balanced across star ratings. Try it out at https://t.co/GxvB3cAwh5 1/3
💫Parsing with Multilingual BERT, a Small Corpus, and a Small Treebank💫
by Ethan C. Chau @echau18, Lucy H. Lin, Noah A. Smith @nlpnoah
https://t.co/sl6X5nYEAr
If you're interested in adapting multilingual BERT to low-resource languages, check out our talk + Q/A at the #SIGTYP workshop of #emnlp2020, or find us on rocketchat!
When: Nov. 19 (Thurs.) 11:20-11:30 PST / 19:20-19:30 UTC
Link: https://t.co/5vSk52KRf9
There's so much work to be done in understanding how to use multilingual models effectively and bringing NLP advances to low-resource languages. Excited to be part of this stream of work! Please reach out with any thoughts/questions. 8/8
Excited to share my first paper, to appear in Findings @emnlp2020!
We show that multilingual models can be effectively adapted to low-resource languages with only a little unlabeled data.
w/ Lucy Lin, @nlpnoah
Paper: https://t.co/Ueplj20RkB
Code: https://t.co/kjTyDpfpNc
🔽1/8
This is promising for low-resource languages! Lack of data is often cited as a limiting factor for low-resource NLP, but we show that it's possible to achieve strong improvements with very little unlabeled data. 7/8