We found a new way to get language models to reason. 🤯
No RL, no training, no verifiers, no prompting. ❌
With better sampling, base models can achieve single-shot reasoning on par with (or better than!) GRPO while avoiding its characteristic loss in generation diversity.
🚨Our paper "Enabling Chatbots with Eyes and Ears" has been accepted to #ACL2025NLP!
👀👂We explore how chatbots can integrate visual, auditory, and textual modalities to support multi-party, multi-session, real-world immersive conversations.
🧵👇
Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. We’re also introducing a new capability in beta: computer use.
Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text.
Congratulations on the acceptance of "Mixed-Session Conversation with Egocentric Memory" to #EMNLP2024 Findings! A big shoutout to @jihyoungjang and @kty4119. I'm really excited to see our lab's excellent work being recognized!
Excited to share that "Mixed-Session Conversation with Egocentric Memory" has been accepted to #EMNLP2024 Findings🎉🎉🎉
Special thanks to @kty4119 and @khsquared for their collaboration!
https://t.co/Y3JGphSa5r
📖 Excited to present our novel long-story generation framework, Collective Critics for Creative Story Generation (CritiCS)!
Previous long-story generation research mainly focused on coherence, overlooking creative and captivating storytelling. However, what we truly want is to generate stories that captivate readers. Therefore, we developed CritiCS—a framework designed to foster creativity and produce captivating long-form stories through a collective critique mechanism.
Thanks to @khsquared and the amazing Language & Intelligence Lab family for their invaluable insights and continuous support!
🚀 Excited to share our latest findings: LLMs can follow instructions and even refuse unsafe queries, despite being trained solely on responses without paired instructions!
We explore how establishing an output space enables LLM alignment, revealing the extensive inherent capabilities of pre-trained LLMs.
Huge thanks to @khsquared for his unwavering support throughout this year-long journey.
🚨Excited to share Conversation Chronicles, a new high-quality multi-session conversation dataset that consists of 1M multi-session dialogues! #EMNLP2023
https://t.co/0XN6fqCHQt
We have uploaded our dataset and model to @huggingface!
w/ @HH22L1 @khsquared
Promising. Everyone should hope that we can throw away tokenization in LLMs. Doing so naively creates (byte-level) sequences that are too long, so the devil is in the details.
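To make the "too long" point concrete, here is a minimal sketch that compares the length of a raw UTF-8 byte sequence with a crude word count. The sample sentence and the helper name `byte_sequence` are illustrative assumptions, and whitespace splitting is only a rough stand-in for a learned subword tokenizer, so treat the ratio as indicative rather than a measurement.

```python
def byte_sequence(text: str) -> list[int]:
    """Return the UTF-8 bytes of `text` as a list of integers in [0, 255]."""
    return list(text.encode("utf-8"))


# Hypothetical sample sentence, chosen only for illustration.
sample = "Tokenization is why GPTs are bad at spelling."

byte_ids = byte_sequence(sample)  # one sequence position per byte
words = sample.split()            # crude proxy for a subword tokenizer

# How many byte positions the model would process per whitespace-delimited word.
bytes_per_word = len(byte_ids) / len(words)

print(f"bytes: {len(byte_ids)}, words: {len(words)}, ratio: {bytes_per_word:.1f}")
```

Since attention cost grows with sequence length, a several-fold increase in positions is why naive byte-level modeling gets expensive.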
Tokenization means that LLMs are not actually fully end-to-end. There is a whole separate stage with its own training and inference, and additional libraries. It complicates the ingest of additional modalities. Tokenization also has many subtle sharp edges. A few examples:
That "trailing whitespace" error you've potentially seen in Playground? If you end your (text completion API) prompt with a space you are surprisingly creating a big domain gap, a likely source of many bugs:
https://t.co/f2PBaw2iA8
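A toy sketch of the trailing-space gap, under loud assumptions: the four-entry `VOCAB` and the greedy longest-match rule in `greedy_tokenize` are invented for illustration and are not how GPT tokenizers (which use learned BPE merges) actually work. The only property borrowed from real tokenizers is that a leading space is usually folded into the following word's token.

```python
# Hypothetical toy vocabulary: " world" (with leading space) is its own token.
VOCAB = ["hello", " world", "world", " "]


def greedy_tokenize(text: str, vocab: list[str] = VOCAB) -> list[str]:
    """Split `text` by repeatedly taking the longest vocab entry that matches."""
    tokens = []
    i = 0
    while i < len(text):
        # Fall back to a single character if no vocab entry matches here.
        match = max(
            (v for v in vocab if text.startswith(v, i)),
            key=len,
            default=text[i],
        )
        tokens.append(match)
        i += len(match)
    return tokens


# How the full text is tokenized, i.e. what the model saw during training.
natural = greedy_tokenize("hello world")

# A prompt ending in a space forces the space to be its own token, so the
# model has to continue with "world" instead of the familiar " world".
prompt_tokens = greedy_tokenize("hello ")
forced_sequence = prompt_tokens + ["world"]

print(natural)
print(forced_sequence)
```

Both sequences decode to the same string, but the second is one the model rarely, if ever, saw in training, which is the domain gap.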
Tokenization is why GPTs are bad at a number of very simple spelling / character manipulation tasks, e.g.:
https://t.co/XR3d5g4uwp
Tokenization creates attack surfaces, e.g. SolidGoldMagikarp, where some tokens are much more common during the training of the tokenizer than they are during the training of the GPT, feeding unoptimized activations into processing at test time:
https://t.co/y72eaIeRrP
The list goes on, TLDR everyone should hope that tokenization could be thrown away. Maybe even more importantly, we may find general-purpose strategies for multi-scale training in the process.
Introducing ImageBind by Meta AI: the first AI model capable of binding data from six modalities at once. This breakthrough brings machines one step closer to the human ability to bind together information from many different senses.
More on this new open source work ⬇️
Top ML Papers of the Week (April 24 - 30):
- AudioGPT
- Track Anything
- Agents Learn Soccer Skills
- Harnessing the Power of LLMs
- Scaling Transformer to 1M tokens
- A Cookbook of Self-Supervised Learning
...
Open-source RLHF implementations are on the rise!
DeepSpeed Chat and ColossalChat are two open-source RLHF pipeline implementations announced in just the past couple of weeks.
Here’s why they matter:
Zip-NeRF: Anti-Aliased Grid-Based Neural Radiance Fields
this is not a recorded video - it is fully rendered from a neural model!
uses ideas from rendering+dsp to improve nerfs, training 22x faster than mip-nerf 360
arxiv: https://t.co/Svi9TIcSPw
page: https://t.co/3M6mfEuVRK