The first model to beat 100% of ChatGPT-3.5
Available on Hugging Face
🔥 OpenChat_8192
🔥 105.7% of ChatGPT (Vicuna GPT-4 Benchmark)
Less than a month ago, the world watched as Orca [1] became the first model ever to outpace ChatGPT on Vicuna's benchmark.
Today, the race to replicate these results in open source comes to an end.
Minutes ago, OpenChat scored 105.7% of ChatGPT.
But wait! There is more!
Not only did OpenChat beat ChatGPT on Vicuna's benchmark, it did so by pulling off a LIMA [2] move!
Training used only 6K GPT-4 conversations selected from the ~90K ShareGPT conversations.
The model comes in three versions: the basic OpenChat model, OpenChat-8192, and OpenCoderPlus (code generation: 102.5% of ChatGPT).
This is a significant achievement, considering that it's the first (released) open-source model to surpass ChatGPT on the Vicuna benchmark. 🎉🎉
- OpenChat: https://t.co/lglHYQpo2A
- OpenChat_8192: https://t.co/XU9o3GaVsg (best chat)
- OpenCoderPlus: https://t.co/qwPCD8mXkg (best coder)
- Dataset: https://t.co/tXj34fv5Wp
- Code: https://t.co/WhS5dPq6ml
Congratulations to the authors!!
---
[1] - Orca: The first model to cross 100% of ChatGPT: https://t.co/vRyupCy7Tg
[2] - LIMA: Less Is More for Alignment - TL;DR: Using a small number of VERY high-quality samples (1000 in the paper) can be as powerful as using much larger datasets: https://t.co/58bo1qarSl
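
For the curious, the data selection mentioned in the thread (keeping only the GPT-4 conversations from a larger ShareGPT dump) could look roughly like the sketch below. This is only an illustration, not the authors' pipeline: the `model` field, the record layout, and the `keep_gpt4_conversations` helper are all hypothetical, and the real dataset may label conversations differently.

```python
from typing import Iterable


def keep_gpt4_conversations(conversations: Iterable[dict]) -> list[dict]:
    """Keep conversations whose (hypothetical) 'model' field equals GPT-4."""
    kept = []
    for conv in conversations:
        # Assumption: each record carries a 'model' label; missing labels are dropped.
        label = str(conv.get("model") or "").strip().lower()
        if label == "gpt-4":
            kept.append(conv)
    return kept


# Tiny made-up sample standing in for a ShareGPT-style dump.
sample = [
    {"id": "a", "model": "GPT-4", "turns": ["hi", "hello"]},
    {"id": "b", "model": "gpt-3.5", "turns": ["hi", "hey"]},
    {"id": "c", "turns": ["no model label"]},
]

filtered = keep_gpt4_conversations(sample)
kept_ids = [conv["id"] for conv in filtered]
print(f"kept {len(filtered)} of {len(sample)} conversations: {kept_ids}")
```

The point of the LIMA-style approach is that the filter is strict: a small, high-quality subset is kept and everything else is discarded.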
I built a GPT-4 'Warren Buffett' financial analyst to 'chat' with and analyze multiple PDF files (~1000 pages) across @elonmusk's Tesla 10-K annual reports (2020-2022).
#gpt4 #openai #investing #stocks #finance
Top ML Papers of the Week (Mar 13 - Mar 19):
- GPT-4
- FlexGen
- NeRFMeshing
- Resurrecting RNNs
- An Overview of Language Models
- Universal Prompt Retrieval for LLMs
...
Prompt Engineering Guide
ICYMI, we recently launched the Prompt Engineering Guide, which makes it easier to stay up to date with prompt engineering techniques and papers.
https://t.co/UrrKL5xHu6