📢New Preprint📢
LLMs can solve many tasks. But who verifies an answer when the judge is also an LLM?
We introduce AutoPyVerifier: learning compact Python verifier sets from labeled LLM outputs, improving objective prediction by up to +55 F1.
Paper: https://t.co/VwJKCHR81u 1/n
Excited for the release of Megagon Blue v0.9, our open-source framework for building and deploying enterprise applications with agentic workflows.
Unlike other frameworks, ours places LLMs within a larger software architecture. https://t.co/mjBtWJRP1a
AI agents are becoming more powerful, but are they ready for enterprise use? 🤔
We’ve released Megagon Blue v0.9, an open-source framework designed to support agentic workflows in complex enterprise environments. #AI #Agentic #LLM #CompoundAI #AgenticAI
https://t.co/r6yXLOjwpn
If you're following AI news, you've likely encountered DeepSeek R1 and OpenAI's o3. They belong to a new category of AI models called *Reasoning LLMs*. In this blog post, I demystify Reasoning LLMs and their inner workings: https://t.co/kn4ZuWUxEy
Attending #ICDE2024? We will present our latest research on fairness-aware data preparation for entity matching on March 14 @ 4:30pm. By integrating fairness constraints into the matching process, we get more equitable outcomes without compromising efficiency. #AI #NLP #EthicalAI
We are excited to launch MEGAnno, a revolutionary open-source data annotation tool for machine learning practitioners. It redefines data labeling by integrating it throughout the ML lifecycle for a seamless workflow. #AI #DataScience #Annotation
https://t.co/XToZjbWh0D
🙌 Presenting MEGAnno+, a cost-efficient way to label data using LLMs! LLM-human collaboration can reduce the tedium and time cost of data annotation. Try the MEGAnno+ demo 👉 https://t.co/SzZsxR8dhl
#MEGAnno #LLMs #annotation #phd #nlproc #ai
Blackwell, the new beast in town.
> DGX Grace-Blackwell GB200: exceeding 1 Exaflop compute in a single rack.
> Put numbers in perspective: the first DGX that Jensen delivered to OpenAI was 0.17 Petaflops.
> GPT-4 at 1.8T parameters can finish training in 90 days on 2,000 Blackwells.
> New Moore’s law is born.
As promised, here is Part 2 of our LLMs for Annotation article, where we introduce efficient strategies for leveraging LLMs in annotation tasks. We also present MEGAnno+, our data annotation tool. https://t.co/F9DV25DgfG
#MEGAnno #LLMs #annotation #phd #nlproc #ai #EACL2024 @eaclmeeting
How can Large Language Models ease the data annotation process? We provide insights into the challenges of using LLMs for annotation effectively and how to solve them. And check out Part 2, where we unveil MEGAnno+ for enhanced annotation workflows. #LLMs #nlproc #ai
https://t.co/bIWDeIsd6L
A comprehensive survey on LLM evaluation, built on the analogy of a "Consumer Reports" for models. Sections 10-11 (what is missing, limitations, complex vs. aggregate score) are especially interesting.
ChatGPT: a new language model for dialogue from OpenAI, built on GPT-3.5. https://t.co/yaFGUgjQAl You can also try it for free: https://t.co/SDcIXHfwxh
https://t.co/axB6DuOGuN Interesting research by @deepmind on improving matrix multiplication, which is used extensively in ML. It will be interesting to see how they apply this approach to other computational problems too.
https://t.co/zGgmT9qpHT Geography of AI: an interesting report published by the Brookings Institution analyzing the geographic distribution of AI activity in the US.
Introducing @OpenAI Codex, a new AI model that converts natural language into code. As a dev ambassador, I've been an early tester of this technology for some time now. Here's an example of a simple description of a STEM program being converted into a landing page.
@github Copilot: the first app powered by OpenAI Codex, a new AI system that translates natural language into code. Pretty amazing to see the work that @openai has done turned into a real working product. https://t.co/FkG18jkICi