Excited to share CrystalReasoner, a reasoning model for crystal structure generation with LLMs and property-conditioned generation through RL:
Website: https://t.co/249N2224on
Paper: https://t.co/W3n8wJN25P
Code: https://t.co/gIQj75p13p
To make property-conditioned generation general enough to work for any property (e.g., elastic properties, thermal expansion), we can design a general reward function that assigns a positive reward to structures whose properties fall within the specified range.
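A minimal sketch of the range-based reward described above, not the paper's actual implementation. The function names, the reward values (1.0 inside the range, 0.0 outside), and the example bulk modulus numbers are all assumptions made for illustration. The property value is assumed to come from some external predictor or simulator, which is not shown here.

```python
from typing import Sequence, Tuple


def range_reward(
    value: float,
    target_range: Tuple[float, float],
    positive: float = 1.0,  # assumed reward for an in-range property
    negative: float = 0.0,  # assumed reward for an out-of-range property
) -> float:
    """Return `positive` if `value` lies in the closed target range, else `negative`."""
    low, high = target_range
    if low > high:
        raise ValueError("target_range must be (low, high) with low <= high")
    return positive if low <= value <= high else negative


def batch_rewards(
    values: Sequence[float], target_range: Tuple[float, float]
) -> list:
    """Score a batch of generated structures from their (predicted) property values."""
    return [range_reward(v, target_range) for v in values]


# Hypothetical example: target a bulk modulus between 100 and 150 GPa.
predicted_bulk_moduli = [120.0, 80.0, 150.0]
rewards = batch_rewards(predicted_bulk_moduli, (100.0, 150.0))
```

Because the reward only checks whether a scalar falls in a range, the same function can be reused for any property that can be computed or predicted for a structure, which is what makes the design general.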
Come join our workshop “Multimodal Intelligence: Next Token Prediction & Beyond” - happening NOW at ICLR '26 @iclr_conf 🎉
📍 Riocentro (204 C)
🔗 Schedule: https://t.co/G2eYmg66ut
🔗 Livestream: https://t.co/1L6VR6kqxd
Excited for a full day of talks, posters, and discussions!
@julianhquevedo is presenting WorldGym right now 4/25 10:30am-1pm at Pavilion 4 #4818. Come and check out how world models can be used to evaluate robot policies in the cloud!
https://t.co/dxs7lpCbCM
Evaluating policies on a real robot can be painful. Can we use a world model to get a rough estimate of how good a policy is?
Check out "Evaluating Robot Policies in a World Model".
Paper: https://t.co/p1MwcrQe4b
Demo: https://t.co/OOkAQwVsu3
Code: https://t.co/y1EMv7Cnq2
Super impressed by the Table Tennis robot from @SonyAI_global. As a TT player myself, I thought an expert TT robot was decades away.
Amazed by how far state estimation (ball location + spin) plus simulator RL could get us.
Proud of my student @mscard01 for being a part of this
For 40+ years, building a robot that could rally with an elite human table tennis player at full speed was an unsolved problem. Sony AI's Ace research project set out to change that—and the results are now accepted for publication in @Nature and featured on the cover.
@_Suresh2 Thanks for the observation. We consider the setting where, given an ML task, we ask which approach (RL vs. prompting) is most effective at producing the best ML model. The generalization is over the stochastic outcomes of ML experiments on a single task. This setting is more similar to RL for Atari games.
Machine learning engineering (MLE) is the new agentic frontier. I'll be sharing our work on scaling RL for MLE agents at #ICLR2026:
1) RL of a small model outperforms a frontier model https://t.co/vfk1jc76t5
2) MLE-Smith: scaling up MLE tasks automatically https://t.co/Mc5vyY2fhg
@thkostolansky @percyliang Great question! We found that newer frontier models do have much better initial performance, but they still struggle to improve a solution even when given more time. Models generally struggle to turn experience into actionable knowledge, which we think requires gradient updates.