Impressive: https://t.co/SIHkGS79hs
A lot larger context windows, a lot cheaper, a lot faster. That's already great, though incremental.
But being multi-modal, guaranteed json output (no more malformed function responses), and seeds/reproducible outputs is really great, too. 🎉
If you aren't yet working with this approach to get better results, maybe you like to read the ReAct paper for motivation/results: https://t.co/UuDAJIVyDO
Now that we (believe to) know (?) that GPT-4 is a Mixture of Experts (MoE) 8 experts x 220 B model, here the Google paper that propelled this old idea forward in 2017: https://t.co/QYmVJgfQ7b (and its GLaM https://t.co/K25QSDQvbE).
That's amazing and if it really works, exactly the right step forward: a better adjusted/trained foundational model to connect to the world/invoke APIs (like Gorilla). The cost reductions and context window increment is great, too.
https://t.co/Hreo7clBkx
Same observation. Despite everybody saying, they are close/there is no moat, I find the moat GPT-4 has remarkable. Hopefully we will see more competition later this year. For now however, I find it the best model - by many magnitudes.
https://t.co/SBViPKQmyD
What if we set GPT-4 free in Minecraft? ⛏️
I’m excited to announce Voyager, the first lifelong learning agent that plays Minecraft purely in-context. Voyager continuously improves itself by writing, refining, committing, and retrieving *code* from a skill library.
GPT-4 unlocks a new paradigm: “training” is code execution rather than gradient descent. “Trained model” is a codebase of skills that Voyager iteratively composes, rather than matrices of floats. We are pushing no-gradient architecture to its limit.
Voyager rapidly becomes a seasoned explorer. In Minecraft, it obtains 3.3× more unique items, travels 2.3× longer distances, and unlocks key tech tree milestones up to 15.3× faster than prior methods.
We open-source everything. Let generalist agents emerge in Minecraft! Welcome you all to try today: https://t.co/1d3YocozsI
Paper: https://t.co/JcWEasgtyI
Code: https://t.co/KsvVf7rcl0
Deep dive with me: 🧵
FYI https://t.co/DGn4x96K1S GPT-4 is great, but also not perfect with NL system msgs to access APIs/connect to outside world. OpenAI plugins such as Wolfram|Alpha with 1.5K token msg are costly & slow. This could become a way forward for the use case: NL->API->NL that we need.
New talk from @ylecun:
"auto-regressive LLMs are doomed"
Errors accumulate exponentially. "This is not fixable with the current architecture... the shelf life of autoregressive LLMs is very short-- in 5 years nobody in their right mind will use them."
https://t.co/anwW1saMCG
FYI/some podcasts on GPT that I am following:
- ChatGPT https://t.co/SAbrGAh5C7
- The ChatGPT Report https://t.co/j6rWfCWyhW
- Beyond the Screen https://t.co/YOYPfs1Z70
- Eye On A.I. https://t.co/fRluZkEvVH (slow-paced, but interesting guests now and then)
MMS: Massively Multilingual Speech.
- Can do speech2text and text speech in 1100 languages.
- Can recognize 4000 spoken languages.
- Code and models available under the CC-BY-NC 4.0 license.
- half the word error rate of Whisper.
Code+Models: https://t.co/NIGfUZ8KZg
Paper: https://t.co/W15aEWHGIR
Blog: https://t.co/TFKXFtlPwc
ai is overhyped. it's not gonna take over the world like some people think. it's just a tool, like any other technology. but at the same time, it's gonna change everything. it's like the internet on steroids. we can't even begin to imagine all the ways it's gonna impact our lives
Behold, what production code now looks like. I give you the WolframAlpha ChatGPT plugin description.
Use ONLY single-letter variable names..
ALWAYS use this exponent notation..
ONLY simplify or rephrase the initial query if..
ALWAYS write separate code which..