@Yuchenj_UW They didn't evaluate on 23 of the 500 instances though, so the actual score is:
74.9 * (500 - 23) / 500 ≈ 71.5%, which is a few points below Claude Sonnet 4.
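For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes the 23 skipped instances count as failures and that the reported 74.9% was measured only on the evaluated instances. The function name `adjusted_score` is made up for illustration.

```python
def adjusted_score(reported_pct: float, total: int, skipped: int) -> float:
    """Rescale a pass rate measured on (total - skipped) instances to the
    full benchmark, treating every skipped instance as a failure.

    Assumption: reported_pct was computed over the evaluated subset only.
    """
    evaluated = total - skipped
    return reported_pct * evaluated / total


# Numbers from the post: 74.9% reported, 500 instances, 23 not evaluated.
score = adjusted_score(74.9, total=500, skipped=23)
print(f"{score:.2f}%")
```

If the skipped instances were instead treated as unknown rather than failed, the adjusted figure would be a lower bound, not the score itself.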
Congratulations to my PhD advisor, Prof. Ralph Grishman, on receiving the ACL Lifetime Achievement Award! Ralph's lighthearted humor, profound expertise and wisdom, and calm and reassuring demeanor have made my PhD studies and career so much more enjoyable. Congratulations!
If you are working on RAG and are struggling with high-quality QA data and reliable evaluation metrics, we provide a solution! We present RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering. https://t.co/VZNz27UyEE.
(4/4) Bonan Min. Exploring Pre-Trained Transformers and Bilingual Transfer Learning for Arabic Coreference Resolution. The Fourth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC) at EMNLP 2021. https://t.co/itN75kDYQs
(1/4) Please check out our work at #EMNLP2021:
Minh Van Nguyen, Tuan Ngo Nguyen, Bonan Min and Thien Huu Nguyen. Crosslingual Transfer Learning for Relation and Event Extraction via Word Category and Class Alignments. EMNLP 2021. https://t.co/vdJFPi2b5o
(3/4) Bonan Min, Benjamin Rozonoyer, Haoling Qiu, Alexander Zamanian, Nianwen Xue and Jessica MacBride. ExcavatorCovid: Extracting Events and Relations from Text Corpora for Temporal and Causal Analysis for COVID-19. EMNLP 2021 (System Demonstrations). https://t.co/LXJRDxiisT
Happy to share a new survey: "Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey". https://t.co/Ks0fOgJrJo
Co-authored w/ Hayley Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Nguyen, @osainz59, @eagirre, Ilana Heinz, @DanRothNLP