🎓This semester, you can join us online in my "Deep Learning and NLP" course at HU Berlin!
In the course, we go all the way from zero to "NLP hero": from classical NLP to transformers and LLMs, plus an in-depth intro to PyTorch🔥
How to enrol: https://t.co/mALVj0kP7a
Announcing 🐤 Büble-LM 🐤, our new state-of-the-art 2B language model (LM) for German!
Trained by @pieterdelobelle with a novel trans-tokenization approach, it outperforms other German LMs like Sauerkraut and LLäMmlein on most benchmarks. Try it out!
https://t.co/4bTH9CGTza
Problem: There are thousands of language models on 🤗 @HuggingFace - but which one is the best for *your* NLP task?
Solution: ⚖️ TransformerRanker ⚖️! Our newest library connects directly to 🤗 HF and quickly/efficiently ranks LMs for your task!
https://t.co/pn5UfgSYkA
Meet us this week at #ACL2024 in Bangkok, Thailand! 🛕🏝️
My students present two papers tomorrow (13.08.):
• Fundus, our simple-to-use news crawler: https://t.co/tjpCP65EzM (at 10:30)
• Automatic "best transformer" selection: https://t.co/b4Apco72EY (at 12:15)
Flair now supports🔥biomedical text analysis🔥:
• Detect biomedical named entities (🧬 genes, 🦠 diseases, ⚛ chemicals, ..)!
• Link them to entries in a knowledge base!
See the tutorial (https://t.co/KqICCaqOOh) and the 🔥Flair 0.14.0 release notes (https://t.co/Y0p6x3HuWj)!
Crawl 1 million news articles in 7 hours*!
Announcing the release of 🗞️ Fundus v0.4 🗞️!
https://t.co/4WRJc1g9s0
*crawling speed depends on your internet connection. I launched it 1.5 days ago and already gathered over 5 million news articles across 41 languages 🔥🔥🔥!
More details:
- The position is funded through the HEIBRiDS programme: https://t.co/3arja6xiLj
- The PhD project is on NLP for scientific literature (materials science) and is co-supervised by Dr. Thomas Unold from the Helmholtz Center Berlin
- Apply to project 60029!
Want to do a PhD in NLP? We have a new PhD position available, fully funded for 4 years!
- ⏰ Application deadline: August 23rd
- 🗓️ Start of PhD project: January 2025
- 🔥 Requires strong PyTorch/Python skills, and knowledge of NLP/ML
- ✍️ Apply here: https://t.co/eHo9bSSIhP
Excited to have Dr. Pieter Delobelle join our lab! Pieter is an LLM-expert, well known for creating Dutch foundation LLMs 🇳🇱🇧🇪.
@pieterdelobelle: Welcome to Berlin! We look forward to working together on Dutch/German/English LLMs! 💪
https://t.co/IhwbgHmx21
@rasbt any idea why they are using NLL over the token probabilities in their teacher-student setup? Could they not also use MSE over the final layer hidden state?
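For readers following the question above, here is a minimal sketch of the two objectives being compared. It is not taken from the setup under discussion: it assumes "NLL over the token probabilities" means the cross-entropy of the student's next-token distribution under the teacher's soft distribution, and the function names `soft_nll` and `hidden_mse` are hypothetical, written in plain Python for a single token position.

```python
import math


def softmax(logits):
    """Turn raw logits into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_nll(teacher_logits, student_logits):
    """Assumed reading of 'NLL over token probabilities':
    cross-entropy of the student's distribution under the teacher's."""
    teacher_probs = softmax(teacher_logits)
    student_probs = softmax(student_logits)
    return -sum(p * math.log(q) for p, q in zip(teacher_probs, student_probs))


def hidden_mse(teacher_hidden, student_hidden):
    """Mean squared error between two final-layer hidden vectors.
    Only defined when both vectors have the same dimensionality."""
    if len(teacher_hidden) != len(student_hidden):
        raise ValueError("hidden sizes differ; a projection would be needed")
    squared = [(t - s) ** 2 for t, s in zip(teacher_hidden, student_hidden)]
    return sum(squared) / len(squared)
```

One practical difference this makes visible: the NLL variant only needs teacher and student to share a vocabulary, while the MSE variant needs matching hidden sizes (or an extra projection layer) before the two vectors can be compared.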
Tacos and NLP! 🌮🌵
Meet us this week at NAACL 2024 in Mexico City!
My group is presenting two papers on Tuesday:
- OpinionGPT, our very biased GPT model: https://t.co/gerrsmJOhs
- LM Pub Quiz, our LLM evaluation library: https://t.co/lpRKsBrmPK
#NLProc #NAACL2024
Announcing the 🧠 LM Pub Quiz 🧠 - the ultimate test of your LLM's factual knowledge!
Probe any LM (masked/causal) on Hugging Face with our new library. We mitigate biases (answer distribution, domain, etc.) to give the most exact reading possible.
Try it! https://t.co/bRqOtLI3s9
Our paper "🗞️Fundus🗞️: A Simple-to-Use News Scraper Optimized for High Quality Extractions" was accepted to ACL 2024 Demos!
Fundus allows you to easily build a high-quality corpus of news data for your NLP project. Try it out :)
https://t.co/4WRJc1gHhy
#NLProc #ACL2024NLP
@GuillaumeLample Awesome! Have you tried benchmarking on PECC, our problem extraction + coding benchmark?
It's extremely challenging for all models we tried, and it would be cool to see how well Codestral-22B fares!
https://t.co/BiEnnQupuq
Announcing 📊PECC📊, our extremely challenging LLM benchmark for coding and math problems!
Even very strong LLMs get less than 50% of coding questions correct. And less than 10% of math!
- Presented 9:00 tomorrow (Friday) at #lreccoling2024
- Paper: https://t.co/id0NhYdo7J