Have you ever had a model you expected to be a winner turn out not to be the best once you tested it on your own data?
That’s one of the reasons we built QuickCompare.
Today our Cambridge spinout, Trismik, is launching QuickCompare to help AI teams compare LLMs on their own tasks and data before committing time and budget to deployment, fine-tuning, or migration.
Built for teams asking:
* Which model is best for our workflow?
* Which model is most reliable?
* Which model is worth deeper investment?
We’re live on Product Hunt today:
https://t.co/GxgKpUxR7V
We’d love your support and feedback.
#LLM #AI #ProductHunt #Startups
If humans can learn geography from books, why do we say LLMs don’t understand the world? This essay explores why LLMs master linguistic boundaries but falter on the physical ones we learn through action. #LLM #GenerativeAI #WorldModels https://t.co/JTodxrOKlh
Are Doubt and Uncertainty the Same Thing?
LLMs show uncertainty but cannot yet experience doubt, a richer metacognitive process. Because models lack self-awareness of their own ignorance, they can appear cautious while still hallucinating. See https://t.co/hTlzl136ME
I’ve published a new piece called Beyond Alignment, a look at what AI could become if we designed it to understand, not just to comply.
👇
https://t.co/X5xMF8TCHR
🚀 Thrilled to share our paper: "A Multi-Faceted Analysis of Personalized Preference Learning." We introduce a multi-faceted framework to evaluate personalized preference learning algorithms in real-world conditions.
📄 Paper: https://t.co/2TxvkCA7hF
🚨New Paper Alert🚨
Many personalization methods optimize performance but ignore real-world impact.
We examine personalization's effects on:
✅ Performance
⚖️ Fairness: Can it represent minorities fairly?
⚠️ Unintended Effects: Does it harm safety?
🔄 Adaptability: Quickly adapt to new users?
🚀 We're hiring an Applied Scientist! Join a Cambridge AI spinout launching Apr/May '25. PhD in NLP/ML + 2-3 yrs industry exp needed. Work on product data, algorithms, & research papers. UK/Europe-based. Shape AGI’s future! Email CV: [email protected] (subject "Trism 2025").
Thrilled to share our new paper: "Can LLM be a Personalized Judge?"
We investigate the reliability of LLMs in judging user preferences based on personas and propose improvements using verbal uncertainty estimation to enhance accuracy. 🎭👨‍⚖️
📄 Paper: https://t.co/Q8O3FuOqC0
"Role-playing" with LLMs is increasingly popular in chatbots and in "simulation" for social sciences. Can LLMs simulate individual perspectives in subjective NLP tasks? Our #ACL2024 paper w. @nigelhcollier investigates this question. 🧵👇 https://t.co/M9FP5rOnEZ 1/7
Our paper "On the Effectiveness of Parameter-Efficient Fine-Tuning" is accepted at #AAAI2023! We theoretically explain why Parameter-Efficient models (Adapter, LoRA, BitFit...) have good performance and design an approx best way to find which parameters to tune. https://t.co/6xQ8iyJxX6
Delighted to announce our paper 'On Reality and the Limits of Language Data' in collaboration with @EhsanShareghi and @hardy_qr at . We've spent the last 9 months reading and thinking about the limitations of pre-trained language m…https://t.co/CeazAiLvCK https://t.co/OxYhLZ1nC7
Pleased to announce that our paper 'Rewire-then-Probe: A Contrastive Recipe for Probing Biomedical Knowledge of Pre-trained Language Models' is accepted at #ACL2022! Joint work with @CambridgeLTL folks @hardy_qr @EhsanShareghi @yixuan_su Charlotte Collins and @nigelhcollier.
Proud to share that Fangyu Liu & @nigelhcollier have been awarded the Best Long Paper Award at the 2021 @emnlp Conference, with collaborators from McGill Uni & Uni of Copenhagen.
More details at: https://t.co/BthEnLoitN
@CambridgeLTL @PontiEdoardo @hardyqr
I am delighted to share that I will be joining @EdinburghNLP at @EdinburghUni from 2022 as a lecturer in Natural Language Processing. I am currently recruiting PhD students, so if you are passionate... (1/6)