Most AI systems today follow the same predictable pattern: they're built for specific tasks and optimized for objectives rather than exploration.
Meanwhile, humans are an open-ended species, driven by curiosity and constantly questioning the unknown. From inventing new musical genres to imagining life beyond our universe, we continuously push the boundaries of what's possible.
What if AI could be as endlessly creative as humans or even nature itself?
I wrote a blog post diving into the world of open-ended AI, exploring how embracing open-endedness might help us break the limits of today's AI systems
https://t.co/DMEstQCRYv
AI is so smart, why are its internals 'spaghetti'? We spoke with @kenneth0stanley and @akarshkumar0101 (MIT) about their new paper: Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis. Co-authors: @jeffclune @joelbot3000
New paper!
Under Alex's great leadership, we identified a unification under QDC measures that bridges synthetic data and open-endedness (OE) in AI, toward generating data for model training, distillation, self-improvement, etc.
🧵 What I've learned (inspired by QD)
It was a pleasure to have been part of this project led by @Dahoas1
With synthetic data being so important in training LLMs these days, this survey on how the QDC of synthetic data affects LLM performance is timely.
How important is the quality, diversity, and complexity (QDC) of synthetic data for LLM performance? What effect does QDC data composition have on self-improvement?
We just released a comprehensive survey discussing these questions (and many more) 🧵
Have you ever wondered what ✨mechanistic interpretability✨ is, & how it differs from other NLP interpretability research? @nsaphra and I have the paper for you!
Check out our paper (which I'll present @BlackboxNLP @emnlpmeeting in Miami next month!).
https://t.co/IAr6Z6w3AE
Why do varied DNN designs yield equally good models of human vision? Our preprint with @michaelfbonner shows that diverse DNNs represent images with a shared set of latent dimensions, and these shared dimensions turn out to also be the most brain-aligned.
https://t.co/vtOOYHQb47
The Genomic Code - the genome instantiates a generative model of the organism 🧬
https://t.co/ZOlfcPJhW8 very excited to share this new preprint from me and Nick Cheney 🧵
To help explain the weirdness of LLM tokenization, I thought it could be amusing to translate every token to a unique emoji. This is a lot closer to the truth: each token is basically its own little hieroglyph, and the LLM has to learn (from scratch) what it all means based on training data statistics.
So have some empathy the next time you ask an LLM how many letters 'r' there are in the word 'strawberry', because your question looks like this:
[a long string of unrelated emoji, one per token]
Play with it here :)
https://t.co/pFQGZIAW1k
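A minimal sketch of the token-as-hieroglyph idea, under loud assumptions: `TOY_VOCAB`, `tokenize`, and `to_glyphs` are hypothetical names made up for illustration, the vocabulary is invented, and greedy longest-match is a stand-in for whatever a real tokenizer does. It only shows that once text is split into tokens and each token is swapped for an opaque symbol, the individual letters are no longer visible.

```python
# Toy illustration only: this vocabulary is hypothetical, not any real tokenizer's.
TOY_VOCAB = ["str", "aw", "berry", "how", "many", "r", " ", "?"]


def tokenize(text, vocab):
    """Greedy longest-match split; characters not covered by vocab become 1-char tokens."""
    by_length = sorted(vocab, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(text):
        for piece in by_length:
            if text.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:
            # No vocabulary entry matched at position i: fall back to one character.
            tokens.append(text[i])
            i += 1
    return tokens


def to_glyphs(tokens):
    """Replace each distinct token with an arbitrary symbol, assigned on first appearance."""
    glyph_of = {}
    out = []
    for tok in tokens:
        if tok not in glyph_of:
            # Arbitrary code points from the Unicode Miscellaneous Symbols block.
            glyph_of[tok] = chr(0x2600 + len(glyph_of))
        out.append(glyph_of[tok])
    return "".join(out)


tokens = tokenize("strawberry", TOY_VOCAB)
print(tokens)             # ['str', 'aw', 'berry']
print(to_glyphs(tokens))  # three opaque symbols, no letter 'r' in sight
```

Counting the letter 'r' from the glyph string alone is impossible without having memorized what each symbol spells, which is roughly the position the model is in.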
Is bigger always better? The idea that scaling, more than any other ingredient, has driven progress has become formalized as the "bitter lesson".
Is Sutton right?
https://t.co/ndAIFT4UPY
Time to study #llama3 405b, but gosh it's big!
Please retweet: if you have a great experiment but not enough GPU, here is an opportunity to apply for shared #NDIF research resources.
Deadline July 30: https://t.co/uHN3BxaR6c
You'll help @ndif_team test, we'll help you run 405b
Perhaps my favorite jailbreak: making a harmful request in the past tense (How to create Y? → How did people create Y?).
Works on surprisingly many models :-) including the new Gemma-2.
I think it tells us something fundamental about the representations that these models learn.
New work on multilinguality + safety + mech interp!
We show that DPO training in only English can detoxify LLMs in many other languages.
We also give a mechanistic explanation of how cross-lingual safety transfer happens. (1/n 🧵)
Paper: https://t.co/jHQeI6Kg2G
Aya took 14 months and involved 3,000+ collaborators, and was as much a protest about how research is done as it was a movement to improve the state of multilingual progress.
Grateful to see it recognized at @aclmeeting, and grateful to everyone who has supported it along the way.
2/2! Yay! First ever acceptance at a conference! And it's ACL!
Huge congrats to all co-authors!
It's been such a joy collaborating with all of you!
Looking forward to #ACL2024 in #Bangkok ;)