@rpatrik96 @roydanroy You might find this interesting: this is a new textbook Prof. Sridhar Mahadevan has been writing. I also took a seminar this spring semester at UMass following its content: https://t.co/xEHTzos9AO
COMPSCI 692CT — Category Theory for AGI: https://t.co/dCtrvVt6KA
Someone once told me: "You should be the last one to reinvent something" -- not sure how useful this advice is, but reinvention is a common occurrence in science.
It is true that frontier AI labs have innovations that are often simultaneously discovered or rediscovered by academic labs.
However, folks outside those labs have no way of knowing about those innovations and their only source of reference would be the work shared openly.
After automating AI research with @SchmidhuberAI and building AI Scientists at DeepMind, now comes the real experiment: the institution itself.
Excited to co-found @inherent_labs: the recursively self-improving lab for scientific AI.
https://t.co/SQjUduaG3D
i think some people are hoping that self-distillation enables “exploration-free” RL purely via reflection on live data, allowing them to bypass the need for replayable environments
unfortunately, RL is all about exploration
my instinct is you basically need to model the world
Excited to share CrystalReasoner, a reasoning model for crystal structure generation with LLMs and property-conditioned generation through RL:
Website: https://t.co/249N2224on
Paper: https://t.co/W3n8wJN25P
Code: https://t.co/gIQj75p13p
Your RL post-training may be sabotaging your LLM’s test-time scaling!
Conventional RL pretends that you can collapse all reward signals *upfront* into a single *scalar reward*.
We introduce Vector Policy Optimization (VPO), which natively maximizes *vector-valued* rewards, boosting test-time search performance, even on the original scalar reward.
Punchline: distill world models from simulation to enable fast, stable real-world robot adaptation.
Simulation is nearly always wrong. But in Simulation Distillation, we ask a simple question:
How do we perform simulation pretraining such that real-world adaptation becomes trivially easy?
https://t.co/ORDaxU2gzs
Let's take a closer look (1/n)
🚀🚀 Applications for the 2026 Google Play Accelerator India cohort are open! 🚀🚀
If you are an Indian Seed to Series-A startup with a published app, apply for 3 months of mentorship from the @GooglePlay, @Android and @GeminiApp teams.
🔗 Apply by June 1: https://t.co/3JGssIa9Mr
The bitter lesson in 26 words:
Don’t be distracted by human knowledge, as AI has been historically.
Instead focus on methods for creating knowledge that scale with computation, like search and learning.
Nice of @jennyzhangzt to share this paper, which I selfishly think was ahead of its time. The context was that I was leaving Meta to do another startup, and thought I would not be writing papers for years. Of course, @MinqiJiang had all the good ideas + did most of the writing 😅
@agarwl_ Would be very interesting to see how these in-context skills, which generalize pretty well to other domains and unseen levels, can in turn be baked into the model weights continually using the ideas from the Fast-Slow Training paper!!
@agarwl_ I have been pursuing this for some time. A paper is coming out soon exploring exactly this in the In-Context Meta Learning setup: getting LLMs to self-improve into better experimentalists and exploration agents on unseen domains using hierarchical and composable skill creation.