Natural evolution's open-endedness leads to beautiful, complex emergent structures and self-organizing behavior 🌱✨. Replicating this in silico is famously hard 💻. Our paper points to a promising direction by evolving populations of competing neural cellular automata with lifelike behavior 🧬🤖 #Isambard
⚠️⚠️ Warning: flashing lights, rapid cuts, or strobe effects in this thread! 🚨🚨
1/n
My guilty confession is I sometimes watch this promo to hype myself up for a coding sprint that involves large checkpoints and RLVR. Only like a few times tho🫠
Introducing the Environments Hub
RL environments are the key bottleneck to the next wave of AI progress, but big labs are locking them down
We built a community platform for crowdsourcing open environments, so anyone can contribute to open-source AGI
Perhaps I'm an outlier, but generally the value I derive from art is not from its backstory. I love a Bach fugue not because he was suffering, content, had many children, or whatever else, but because it's an extraordinary composition. I'd feel the same about AI-generated art.
It feels like adding recurrent loops to a transformer could help...if only there were some papers on that and maybe a new upcoming very good general one 🤔
I do not get why people are so bearish on diffusion. Why let the existing implementation tricks bias us away from a future with models that think more flexibly and require fewer tricks for sampling diversity?
Most researchers agree that autoregression is best when memory bandwidth is cheap and diffusion is best when FLOPS are cheap. They also admit the future of compute is all FLOPS because memory scaling is hard and scaling FLOPS is easy. So why not go all in on diffusion????
Hindsight Experience Replay has become the ubiquitous method for goal-conditioned reinforcement learning, but leaves open the question of which goal to relabel with.
In this work, accepted at ICML, we propose instead simply Learning Everything All at Once (LEO).
1/
CV has CNNs, NLP has transformers - what inductive bias does RL have? How can policies generalise to regions of the dataset suffering from poor transitions?
We motivate hierarchy by enabling distinct state representations at different levels of the hierarchy @FLAIR_Ox @j_foerst
Love seeing this out in the wild! We've been working on new ways for interacting with other players and agents with Genie. Still in early testing, but I'm so proud of the team for demoing this at I/O this week. Excited to see what this can enable in the future 🚀
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
As important as the "EXPLORATION" problem is, it's v often subjected to grave (and misleading) myopic definitions. This paper is excellent at avoiding all of those.
This paper by @MinqiJiang was one of the main reasons I first started learning RL and then got obsessed with it as the ultimate effort toward emergent intelligence. Truly a paradigm shift for my post-grad.
was reading a paper last night that felt very timely and refreshing, like the sort of thing that was bound to anchor the next wave of innovation in self-teaching
then i realized it was from 2022
@miniapeur Absolutely love the detail that they are heading in with a printout of their NeurIPS paper. 5 is an outrageous number, but it hits the GPA inflation analogy I guess