Theory of Mind 🧠 is a core part of social intelligence.
But most LLM benchmarks only test the final answer, not whether the model actually built the right mental model.
We introduce OmniToM, a benchmark for multi-actor mental state reasoning.
https://t.co/hukaglNg03
📷 Match: Argentina vs. Jordan
📷 Date: June 27, 2026
📷 Venue: AT&T Stadium, Arlington, TX
📷 Quantity: 2 Tickets
Price: $1200 per ticket
Section 450: Row 18: Seat 18, 19
If you're interested, please send me a private message for details regarding the seats.
📷 FIFA World Cup 2026 Tickets for Sale 📷
I have *2 tickets* available for the *Argentina vs. Jordan* World Cup match at *AT&T Stadium, Arlington, Texas* on *June 27, 2026*.
Price: $1200 per ticket
@FIFAWorldCup@ticket@argentina
I will be participating as an AI Expert Panelist at the following webinar tomorrow. The @UCF and the U.S. Army DEVCOM Army Research Laboratory will jointly host the webinar. Please RSVP in case you are interested in the topic. @UCFCECS@usarmy_devcom#AI#HumanAICollaboration
Do you like https://t.co/L8ThvgOKpJ? If yes, how about a similar ranking platform for LLMs? Check out our latest work on the LLM Ranking based on their psychological biases.
https://t.co/tk58kIWv50
@ReviewAcl#LLM#NLP@emnlpmeeting@aclmeeting
Do large language models think like humans? Are they also prone to human-like cognitive biases?
We just launched a new ranking system of LLMs based on their ability to resist cognitive biases with a large-scale study of 2.8M+ responses across 8 well-known biases.
Bridge-AI Lab is looking for a talented Postdoctoral Researcher to advance cutting-edge AI/ML research! If you have a Ph.D. and more than one publication at top AI/ML conferences or journals, we want to hear from you!
(1/3)
We are glad to share our latest NAACL 2025 paper, "LLMs as Meta-Reviewers' Assistants: A Case Study".
Read the full paper at: https://t.co/jSdGEgsOFj
(1/2)
🚨 We are hiring! 🚨
We are looking for two new members to join our communication team and help support the ARR initiative. If you’re interested, please fill out the following form: https://t.co/IAEHYFVf10
Our latest paper introduces the pre-evaluation of sentence encoders, similar to pertaining language models!! Pre-evaluation performs evaluation without a specific downstream task, that's why we call it taskless evaluation as well. @ReviewAcl
A big thanks to @eduardo_nlp for his collaboration and significant contributions to this work.
Our paper introduces ALIGN-SIM, a novel task-free framework for evaluating and interpreting sentence embeddings based on various semantic similarity alignment criteria.
2/2
🚨 Attention authors! The CFP has been updated for the December cycle. Be sure to double-check the latest requirements: https://t.co/wSE9cYUOdR 📝
We have also updated the author checklist to help you avoid common issues: https://t.co/WNM8aSoDOi
We are excited to collaborate with BangaLLM to introduce BongLLaMA, the first ever and, of course, the state-of-the-art Bangla LLM (Based on Meta LLaMA). Try it out.
@ReviewAcl#EMNLP#ACL#NLP#BanglaAI@UCFComputerSci
https://t.co/pspbIHYmAR
We are excited to share that Dr. Souvika Sarkar, who recently completed her Ph.D. with Bridge-AI Lab under the supervision of Dr. Shubhra Kanti Karmaker Santu, has joined Wichita State University as a tenure-track faculty member of the School of Computing.
1/3
We are happy to release our text summarization metric, Sem-F1, for public use. https://t.co/BQ6MYrJ2KY
Sem-F1 achieves a 25% higher correlation with human judgments than Bertscore on the Overlap Summarization task. Benchmarking paper coming soon.
@ReviewAcl