Should I use Macro F1 or Accuracy? Why not Kappa? Why do some use this, and others that? What's actually evaluated here? 😵‍💫
Happy to share the final version of this paper on multi-class classification evaluation:
https://t.co/DCiIGFRdmw
#machinelearning #nlproc #ml
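To make the question concrete, here is a minimal sketch in plain Python of how the three metrics can disagree on the same predictions. The toy labels and the helper names (`accuracy`, `macro_f1`, `cohen_kappa`) are illustrative assumptions, not taken from the paper, and macro F1 is computed here as the unweighted mean of per-class F1 scores, which is only one of the definitions in use.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    # Fraction of items whose predicted label matches the true label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    # Unweighted mean of per-class F1 (assumed definition, see lead-in).
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted items scores 0 by convention.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def cohen_kappa(y_true, y_pred):
    # Observed agreement corrected for agreement expected by chance.
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_chance = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

# Hypothetical toy data: 8 items of class "a", 2 of class "b",
# and a classifier that always predicts the majority class.
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 10

print(accuracy(y_true, y_pred))     # 0.8
print(macro_f1(y_true, y_pred))     # about 0.44
print(cohen_kappa(y_true, y_pred))  # 0.0
```

On this toy input the majority-class predictor looks decent by accuracy, poor by macro F1, and no better than chance by Kappa, which is why the choice of metric matters on imbalanced data.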
It’s really incredible the absolute AI GARBAGE that people are comfortable sending to their coworkers and bosses
There’s a good chance productivity will actually *decrease* as AI adoption increases because everyone is busy wading through AI slop
@kmulungu @jennfrey Of course AI can write a paper, but can it provide novel research? I don't think so. I can name you many seminal papers written by humans only (e.g., Attention Is All You Need, BERT, and so on). Can you show me just one paper with similar impact/novelty written mostly/only by AI?
It's so cringe when real people I otherwise know and respect post obvious AI slop on social media, particularly when they're (supposedly) expressing their feelings. Authenticity is so rare and valuable these days, and it's sad to see people just cede it from the get-go
@deliprao well, gotta appreciate the honesty. I think that it's actually better than doing a rushed read, misunderstanding everything, and then hitting the reject recommendation with a confidence of 4.
Re LLMs as reviewers to cope with submission load.
LLMs and AI models have essentially been trained on a snapshot of the past, afaik with a gap of 2-3 years or even more between that snapshot and now.
How can they be good reviewers in peer-review, and on what metric?
Wow, so much whining about arXiv’s steps to reduce AI slop. So easy to deal with for authors who actually read their own papers before submitting them.
@zehavoc @aclmeeting IDK, just compare the number of sponsors, e.g., from 2022 to 2026.
Maybe this is simply the reason? Less money, more people = higher prices. Not saying that's good (I don't think so), but maybe that's the reason.
https://t.co/rKJHMmraBG
https://t.co/3wKHNoRygw
One of the biggest problems with using LLMs as a Google replacement for programming is that getting zero relevant results on Google used to be a signal that you had the wrong idea about the root cause, whereas LLMs will happily indulge any terrible idea you suggest.
📢 Postdoc Position in NLP @ UTN in Nuremberg, Germany. I am looking for a full-time postdoctoral researcher (A13/E13, initial contract for 3 yrs) starting July 2026 or as soon as possible thereafter. Focus on implicit & underspecified language, background knowledge and/or biases.
@AlexiGlad Imagine the amount of wasted electricity and money that's been dumped into that. Any benefit for science? At least it doesn't show yet, I would say.