I am an Assistant Professor (CECS, VinUniversity). I lead MAIL Research, where we focus on practical and trustworthy ML, with passion, joys, and hard work!
[3/3] ⚖️ In summary, no single system dominates across all four dimensions: each excels in a distinct niche while leaving structured gaps invisible to aggregate metrics.
Ngoc Phan Phuoc Loc, La Viet, Toan Huynh, Thanh Tran Khanh, Duy Nguyen, Tuan Anh Nguyen Pham, Thanh Nguyen, Nitesh Chawla, Wray Buntine, Kok-Seng Wong, Binh T Nguyen. PRISM: A Multi-Dimensional Benchmark for Evaluating LLM Peer Reviewers.
🌐 Demo: https://t.co/JUchf0K9us
📄 Paper: https://t.co/3SHucg0uBB
🔬 Project Page: https://t.co/oyTWS4BjoL
💻 Code: https://t.co/XKN8pJj8rD
🤖 Are LLMs sufficient reviewers to evaluate scientific work and, critically, are they better at identifying gaps in a paper than human reviewers who increasingly work under time constraints and review overload?
📚 Our latest work, PRISM, attempts to answer this question through the creation of a structured and multidimensional evaluation of reviews across four dimensions: Depth of Analysis, Novelty Assessment, Flaw Identification & Issue Prioritization, and Constructiveness.
🔎 What are the findings?
[2/3] 🧠 Some LLM reviewers match human analytical depth, but some fall into a surface-level trap, over-indexing on presentation anomalies.
💡 Some LLM reviewers outperform human reviewers on grounded novelty verification; however, some systems exhibit measurable novelty hallucination.
🎯 LLM reviewers broadly achieve near-perfect critical issue prioritization, demonstrating a cognitive alignment comparable to human reviewers.
📝 Some LLM reviewers produce the most actionable feedback, but a constructiveness gap relative to human reviewers persists across all systems.
Exciting new paper! 🚨
Why Do Reasoning Models Lose Coverage?
We revisit this open question with a data-centric lens 🔍 , and show how “forks in the road” situations in the post-training data can shape—and shrink—model coverage ⤵️
We are pleased to announce the launch of the TRIDENT Grand Challenge @ ACM Multimedia 2026 in Rio de Janeiro.
As generative AI continues to advance, deepfakes are rapidly evolving from simple image manipulations into highly realistic multimodal content spanning images, videos, and cloned audio. The rapid progress raises an important challenge for current detection systems: while many models can successfully identify fake content, they often remain black-box systems, offering limited insight into the evidence or reasoning behind their decisions.
TRIDENT is designed to address this gap by encouraging more interpretable and evidence-grounded multimodal deepfake analysis.
🔥 The challenge is now in Phase 2, with test sets released and the online leaderboard open for submissions.
📌Challenge Tracks
• Image deepfake analysis
• Video deepfake analysis
• Audio deepfake analysis
Participants may choose to compete in one or multiple tracks, depending on their research interests.
📌Core Capabilities
The challenge focuses on three key abilities:
• Deepfake detection
• Fine-grained artifact perception
• Evidence-based reasoning without hallucination
Participants will engage with structured and open-ended tasks designed to evaluate both forensic understanding and reasoning quality in multimodal settings.
🏆Top-performing teams from each track will be invited to submit a technical paper to the ACM MM 2026 main conference proceedings.
📅 Important Dates (AOE):
• Registration Deadline: June 1, 2026
• Final Submission Deadline: June 10, 2026
• Winner Announcement: June 15, 2026
---
More details and participation information are available here:
https://t.co/9eumcOTTyl
We warmly welcome the research community to join us in advancing trustworthy multimodal AI forensics!
NeurIPS 2026 is accepting submissions for Workshops!
If you are interested in hosting a workshop, read the Call for Workshops https://t.co/e9BTJ80GYu on how to submit a proposal, and be sure to follow the submission guidelines for proposals, with important changes this year https://t.co/8AD1yWyRf1
NeurIPS'26 Call for Workshops is officially live at https://t.co/17hsZ41A2J. Important dates:
Workshop Application Deadline: June 06, 2026, AoE
Workshop Acceptance Notification: July 11, 2026, AoE
This year, workshops can be hosted in 3 different locations: Sydney, Paris, and Atlanta
@NeurIPSConf
We want to speak directly to the concern many of you have expressed, and we owe you a clear explanation of what happened, why it happened, and where we stand now. We understand this situation caused genuine alarm and we take that seriously.
In preparing the NeurIPS 2026 handbook, we included a link to a US government sanctions tool that covers a significantly broader set of restrictions than those NeurIPS is actually required to follow. This error was due to miscommunication between the NeurIPS Foundation and our legal team; there was never an intention to restrict participation beyond our mandatory compliance obligations. The responsibility for that error is ours as an organization, and we deeply apologize for the alarm and impact this miscommunication had on our community.
We have updated the link and clarified the text of our policy, which is consistent with that of ACM and IEEE, as well as other international conferences and NeurIPS in the past. As in previous years, NeurIPS welcomes submissions from all compliant institutions and individuals.
We want to reiterate that NeurIPS is a community-driven event, created by and for the community, and strives to be inclusive. The NeurIPS 2026 organizing committee was particularly saddened to learn of this institutional miscommunication. The organizing committee has taken on the responsibility of running the conference this year with the goal of fostering open communication, knowledge sharing, and global scientific discourse.
We thank the community for bringing this issue to our attention and working with us through this situation.
Official @NeurIPSConf statement (from website): The NeurIPS Foundation, like any entity operating within U.S. legal jurisdiction, is required by law to comply with U.S. sanctions and trade restrictions. Under these regulations, providing 'services' (which would include peer review, editing, and publishing) to Specially Designated Nationals and Blocked Persons, or "SDNs", is strictly prohibited. Consequently, we are unable to accept or publish submissions from any SDN, or any individual or institution that NeurIPS reasonably believes represents or is affiliated with an SDN. The US State Department's Office of Foreign Asset Controls maintains a searchable website that lists all SDNs. NeurIPS will consider submissions from institutions and individuals that the US State Department has categorized as "Non-SDN".
https://t.co/33YBJrpUyu
NeurIPS is aware of the community's concerns regarding the list of sanctions. NeurIPS is an inclusive community focused on free scientific discourse. We deeply value the research that comes from everyone in our community.
The present concerns are not about science or academic freedom. They are about legal requirements that apply to the NeurIPS Foundation, which is responsible for complying with sanctions. We are actively consulting legal counsel to fully understand the legal constraints and we will update the NeurIPS community as soon as we have reliable guidance from our lawyers.
Is it possible to allow DNN to forget as it continually learns new tasks, but recall knowledge when making decisions using lightweight DNNs? If possible, practitioners can train their DNN task by task without any modifications, allowing flexibility in engineering and removing ties to a specific CL method.
[N/N] This resembles humans’ mnemonic memory-linking technique, which establishes associations of fragments of information to enhance memory retention or recall. As the model learns a new task, other (very lightweight) rectifiers establish a mnemonic link from the new representation of the sample from the past task to its past task’s correct representation.
Writing @iclr_conf's meta reviews and especially making decisions were "brutal". I've never made as difficult review decisions as this time. There'll definitely be authors who WON'T be happy, but I do hope that they'll appreciate the effort (weeks), the challenges (less information than before), and preference for high quality (based on limited information again). If this is not your time this time, and you believe in your paper, it will get accepted at a good place soon!
@YejinChoinka's talk at @NeurIPSConf showed the limitations of RLVL: reasoning ability shrinks after RLVR training. Our latest work is the first to "explain" why this phenomenon happens in RLVR, discovering (1) the negative interference between learning different problems, (2) winner-take-all where rich problems receive more learning signals, and (3) on-policy learning/existing regularization make this winner-take-all worse. https://t.co/vIgYssoCdg