What does it mean to create a new concept rather than retrieve a familiar one?
I propose that creativity is what's unfamiliar at first but quickly learnable by an adaptive observer, and show that meta-learning through a frozen diffusion model produces stylistic & conceptual creations.
Context Tuning accepted to ICML 2026 🎉
See you in Seoul.
https://t.co/kdwUhxLva3
It’s a neat LLM adaptation method with minimal implementation overhead and great scaling behavior. Hoping to add it to the PEFT library, and will do a follow-up post with lots of new results.
Also excited to share my new LLM reasoning/adaptation work 🔜
Arrived in Rio/ICLR to present two works! DMs open for chats :)
In When Does Verification Pay Off? (https://t.co/vIIc28aSCw), we study LLM self-improvement via test-time verification, evaluating solver-verifier pairings across 37 models and 9 datasets to surface concrete design principles.
In SkillFactory (https://t.co/lXB0lJ0Oec), we propose a recipe for self-distilling high-level cognitive skills into reasoning models, making post-training substantially more effective.
📸 w/ @choang333 @AntChen_
The failure modes (e.g. the agitprop-y reference and Magritte reference) are kind of interesting insofar as they're more conceptual than the other styles
The ImageNet moment for style transfer!!
We have released the full code (training/inference), dataset, and models of MegaStyle. Welcome to give it a try 🤗!
Code: https://t.co/bSczxqwZHq
Dataset: https://t.co/1MCdg0lmrX
Models: https://t.co/mXdYO6zkkn
@dilanesper I mean, the protests were severely repressed, it's hard to keep momentum for many years, and there officially was a ceasefire. Not really surprising that the protest landscape is different now
@PetarV_93 Separately, have always found Neil Ghani's category theory work interesting. I periodically try to find a deep learning application for his paper with Michael Abbott, Thorsten Altenkirch, and Conor McBride on differentiating data structures
Maybe it is just my latent Wittgensteinianism, but the construction for the bitstring example feels like it has the same flavor as the rule-following paradox
new preprint!
turns out, if your model is confident on _any_ long enough input, we can find other inputs where the model is wrong, yet its perplexity won't really tell you it's wrong 📉
work with @fedzbar @ccperivol @sindero and Razvan
@PetarV_93 I like the idea of connecting a certain kind of extrapolation with equivariance to some transformation. Out of curiosity, have you explored making the transformation to which you are equivariant data-dependent?
@PetarV_93 Oh that's very cool, thanks for the pointer. I've always had a soft spot for Wittgenstein and his work inspired some of my research interests when applying to PhD programs. I guess it's not too surprising how many problems feel like a return to Wittgenstein or Hume, though
Excited to see my work with @rteehas on
"When Does Verification Pay Off? A Closer Look at LLMs as Solution Verifiers" featured here!
arXiv link: https://t.co/9BNOJGdqaH 👀
At Agentic Learning AI Lab, the best part of our research is working with the incredible students. They are the ones writing the code, running the experiments, and solving the hard problems in AI today.
As we approach the end of the year, we are fundraising to ensure these students have the resources they need to thrive in 2026.
Unlike big industry labs, we rely on grants and gifts to keep our research independent and open. If you’d like to support the next generation of AI researchers in our lab at NYU, please visit the link below!