of the evaluation in real-world biomedical research projects that shows that the tool is actually useful to practitioners. Check out https://t.co/EPsRcXsI8J
Thanks to Fabio Barth, Leonie Lorenz, Fabian Konrath, Kirsten Huska, Jana Wolf and Ulf Leser for the great collab! (2/2)
Happy that a key paper from my PhD is finally out: PEDL+ (https://t.co/1RiepQ3Wcf) is a tool for extracting protein-protein and protein-chemical associations from a large portion of the biomedical literature and processing the results (who doesn't love Excel tables?). Proud (1/2)
Trustworthiness in the Era of LLMs?
With LLMs, the traditional compartmentalized notion of language "tasks" is breaking down (dashed lines). What does it mean for trust or trustworthiness? #EMNLP2023
w/ Robert Litschko, Max (@mxmeij), Rob (@robvanderg), Leon (@weberple)
@tomhosking @cohere @max_nlp Super interesting work! Do you know whether factual errors rated highly during RLHF decrease the factuality of the final model? And do you think that better data tooling could help to boost the reliability of human annotators, e.g. by flagging likely factual errors?
@nsaphra @eripsa Aren't you afraid that if future LMs/agents had some capacity to suffer and we categorically ruled it out, it would take us much too long to notice, as it did for animals? Honestly, I'm a bit.
@srush_nlp Trying to do the same thing for the same reason but from PyCharm. As soon as debugging became more complex, I always went back. So, I'd also appreciate pointers to blogs/YT videos :).
If you want to detect labelling errors in your dataset, it might be a good idea to do it in an active learning setting for maximum performance. See our ACL 2023 Findings paper ⬇️ for how you could do that. Joint work with @barbara_plank
📝 ActiveAED: A Human in the Loop Improves Annotation Error Detection
🔍 This work shows that you can improve the performance of annotation error detection methods by using them in an active learning setup.
👥@weberple, @barbara_plank
🔗https://t.co/FIHWYdAuIV
📁Findings-Short
Announcing 🧐 OpinionGPT 🧐, a 7B-LLaMA model fine-tuned on German-language Reddit. Our model gives opinionated answers to all your questions - in German!
Try it out here: https://t.co/5YJjHlcV4n and let us know your favorite responses :D
@KiddoThe2B @yoavgo Do you use Notability to provide feedback on manuscripts? If so, what's your workflow?
(Annotations with OS X Preview have also disappeared for me more than once 😬)
Have you ever wanted to do an experiment on LLMs and found that none of the existing model suites met your needs? At @AiEleuther we got tired of this happening and so designed a model suite that centers enabling scientific research as its primary goal
https://t.co/YhrwFjAGbr
A Christian magazine interviewed @SchmidhuberAI and @ThomasMetzinger on the role of suffering in AI research. Mostly, on the possibility of the model suffering during or after training. Very interesting read, sadly in German only. https://t.co/RpbJEtqj0Z
📢🌄🏄 LMU Munich: Administrative assistant (part-time, 50%) wanted at the Chair of AI and Computational Linguistics
Apply by 28 March 2023 and become part of the team @MaiNLPlab @CisLmu
Psst, thanks for sharing
#ML #AI #NLProc Researchers, especially academics: I’m writing a paper about accessible finetuning of transformer models. If you’re looking to finetune an LLM, where on this list of compute access do you fall? If you don’t fall on this list, what resources are accessible to you?