This is just a placeholder. For updates from me, please find me at social media sites that are not swamped by troglodytes and owned by enemies of humanity.
It was fun while it lasted, etc etc.
If you’re still here, don’t wait any longer and come over to … that other thing. It really isn’t that hard. If you identify as an “AI person”, https://t.co/GPbUxYlSLW is probably the server for you.
I’m at [email protected]
It's funny to watch the stream of Google Research/DeepMind papers that say they want to do automated game generation with 0 citations to anything in the area's 30+ year history.
@techyalzay Yeah, we briefly looked into this and didn’t find an obvious easy source in the code of the page. (And the HELM webpage is… peculiar.) And it somehow feels wrong to have to scrape this data, which the producers should only have an interest in making accessible.
@roman_klinger I’m old school, for me it’s only real when the notification letter arrives by post … erm, the email arrives. (Which now appears to be the case.)
We did have a case recently however where something briefly was visible on OpenReview, and then the final decision was different.
The SIGdial 2024 proceedings can now be found on the ACL Anthology 🎉
So many fantastic papers:
https://t.co/CBY957o1Rh
And if you are sharing your papers here, make sure to tag us @sigdial so we can repost it too!
#SIGdial #SIGdial2024 @aclanthology
@wdavidmarx That’s hilarious. I don’t know what it means in Beck’s American context, but in Germany a reference to Heino in that situation would have been signalling an “I’m too cool to be embarrassed” attitude, because it could quite likely be true.
Ok, whatever it is that @OpenAI has done to o1, it has paid off. At least on Wordle, which used to be one of the hardest parts of our “conversational agency” benchmark.
4o: 23 (previous best)
o1: 75.33
(Human expert players: 72)
We still have to run the whole benchmark, mind you. This is slow and eye-wateringly expensive 🥹. (Actually, expensive & slow enough for there to be humans on the other side. 😅 )
"Reflection-Llama-3.1-70B" first got attention, then frustration regarding the validity of the results.
We benchmarked it with clembench and compared it against the stock model:
Reflection-Llama-3.1-70B - 17/100
Meta-Llama-3.1-70B-Instruct - 39/100
It got worse.
Observation was that neither linguistics nor NLP/AI cared too much about CL, leaving it free to reinvent itself.
Slides: https://t.co/fJmm3MhHgG
Video (if you really must): https://t.co/c9oOWumiFF
Re: "ACL is (not) an AI conf.", was reminded that I did some similar soul searching some years ago. But a) openly prescriptive, b) coming to conclusion that domain to be claimed could be "linguistic intelligence".
@yoavgo Had the same impression @ EMNLP last year. My ad hoc expl was the demographic pyramid in a rapidly growing field — fewer senior people, who also travel less than they used to (inconvenience, guilt abt spent CO2 budget); lots of younger people who hv 2 go & don’t know <1k ppl cnfs
@rulimanurung Flooding predatory conferences with bogus work would of course be a sensible use case. But the result will just be that ARR is flooded with more bogus papers.
You have to admire the dedication to the bit. They even went ahead and created a website and actual fake papers. Just to make a satirical point about what AI as a research field has become.
Introducing The AI Scientist: The world’s first AI system for automating scientific research and open-ended discovery!
https://t.co/jC7g5GPVsE
From ideation, writing code, running experiments and summarizing results, to writing entire papers and conducting peer-review, The AI Scientist opens a new era of AI-driven scientific research and accelerated discovery.
Here are 4 example Machine Learning research papers generated by The AI Scientist.
We published our report, The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery, and open-sourced our project!
Paper: https://t.co/lTQ8UenFHk
GitHub: https://t.co/Im53whVeAq
Our system leverages LLMs to propose and implement new research directions. Here, we first apply The AI Scientist to conduct Machine Learning research. Crucially, our system is capable of executing the entire ML research lifecycle: from inventing research ideas and experiments, writing code, to executing experiments on GPUs and gathering results. It can also write an entire scientific paper, explaining, visualizing and contextualizing the results.
Furthermore, while an LLM author writes entire research papers, another LLM reviewer critiques resulting manuscripts to provide feedback to improve the work, and also to select the most promising ideas to further develop in the next iteration cycle, leading to continual, open-ended discoveries, thus emulating the human scientific community. As a proof of concept, our system produced papers with novel contributions in ML research domains such as language modeling, Diffusion and Grokking.
We (@_chris_lu_, @RobertTLange, @hardmaru) proudly collaborated with the @UniOfOxford (@j_foerst, @FLAIR_Ox) and @UBC (@cong_ml, @jeffclune) on this exciting project.