Excited to share that our paper on "Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages" has been accepted to #ACL2024! Big thank you to @theYorubayesian @rpradeep42 @lintool
We find that, though LLMs perform better in English reranking, there is still competitive performance in monolingual (African queries -> African passages) vs. cross-lingual (English queries -> African passages) scenarios, more so in using translations generated by the LLM.
We have two papers accepted to #SIGIR2024!
1) On Backbone and Training Regimes for Dense Retrieval in African Languages.
2) CIRAL: A Test Collection for CLIR Evaluations in African Languages led by @AdeyemiMofe
The obvious question: How do the latest prompt-decoder LLMs for listwise reranking perform on low-resource languages? For four African languages (Hausa, Somali, Swahili, and Yoruba), @AdeyemiMofe @theYorubayesian @rpradeep42 provide the answer: https://t.co/UWVlaqnblv
Catch us at #FIRE2023 starting tomorrow!
- Track Overview: 10:15am IST, Dec 15th
- Track Session: 11:00am IST, Dec 16th
We wrapped up CIRAL earlier, and we say a big thank you to the participating teams!
Our public leaderboard is available: https://t.co/tj6A0b7loi
Starting with a meticulous audit of mC4's document sources, we curate a new dataset, Wura, which covers 16 African languages + English, French, Portuguese and Arabic. Wura contains 4.7M documents in African languages - more than 1.5x what mC4 contains!
https://t.co/nRoenyf5Lm
We've been doing some research on scaling pretraining data and language models for African languages and I'm excited to share our research at EMNLP 2023!
Work done with @AdeyemiMofe @orevaahia @j___y_t @AbrahamOwos @davlanade and @lintool
Here's a primer:
1 /
That's a wrap! The Waterloo (@UWCheritonCS) team had fun attending the ACL 2023 Conference in Toronto, Canada! #ACL2023NLP
We would like to congratulate @ralph_tang @likicode @ZhiyingJ @lintool et al. for winning the Best Paper Award at ACL 2023!
Next stop is SIGIR 2023.
New update!
The training sets for Hausa and Yoruba are now released and accessible here: https://t.co/L0KgaPKgii
Looking forward to the retrieval systems that get built! Guidelines for participation can be found here: https://t.co/izqxtEIEqT