Why does L1 regularization induce sparse models?
Many illustrate this using the least squares problem with a norm constraint. The least squares level sets are drawn next to the different unit "circles".
I prepared a cool animation which I believe makes it even clearer.
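The picture with the level sets and the unit "circles" can also be checked numerically. The following is a minimal sketch, not the code behind the animation. It assumes an orthonormal design, in which case the least squares level sets are circles around the unconstrained solution and the constrained solution is simply the Euclidean projection of that solution onto the norm ball. The function names `project_l1_ball` and `project_l2_ball` and the example vector are illustrative assumptions.

```python
import math


def project_l2_ball(v, radius=1.0):
    """Euclidean projection onto the L2 ball: rescale if outside."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= radius:
        return list(v)
    return [x * radius / norm for x in v]


def project_l1_ball(v, radius=1.0):
    """Euclidean projection onto the L1 ball via sorting and thresholding."""
    if sum(abs(x) for x in v) <= radius:
        return list(v)
    # Find the soft-threshold theta such that the shrunken magnitudes
    # sum exactly to the radius.
    mags = sorted((abs(x) for x in v), reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, m in enumerate(mags, start=1):
        cumulative += m
        candidate = (cumulative - radius) / k
        if m > candidate:
            theta = candidate
    # Soft-threshold: every coordinate with magnitude <= theta becomes zero.
    return [math.copysign(max(abs(x) - theta, 0.0), x) for x in v]


# Hypothetical unconstrained least squares solution (assumed for illustration).
w_ols = [2.0, 0.5, -0.1]

w_l1 = project_l1_ball(w_ols)
w_l2 = project_l2_ball(w_ols)

print("L1-constrained:", w_l1)
print("L2-constrained:", w_l2)
```

The L1 projection lands on a corner of the diamond, so the small coordinates are set exactly to zero, while the L2 projection only rescales the vector and keeps every coordinate nonzero.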
Will be speaking today at the GRaM workshop at ICLR @iclr_conf (https://t.co/hsMRCrK6s9) about how we are building efficient and effective billion-scale Graph Foundation Models at Meta!!! 🤩
(GraphBFF: https://t.co/MPdRT0qU8r).
Anddddd we also have an awesome paper at the workshop! "Improving LLM Predictions via Inter-Layer Structural Encoders", poster session A :) (https://t.co/xyiUtKnO2S)
#iclr2026
How does an LLM's past influence its future?
In our new paper with @FazlBarez, @mtutek, @boknilev, and Shay Cohen, we show that conversational history creates a "geometric trap" in the latent space, confining the model's trajectory ➡️ making old habits, e.g. hallucinations, hard to break.
Please rewrite your proof, fully rigorously.
- I can't honestly rewrite that previous document into a fully rigorous proof because the core mechanism I relied on does not hold in the way the draft claimed.
1/6 🧵 Calibration is hard. Multicalibration, fixing errors across every possible subgroup, is usually impossible at scale. Until now. Introducing MCGrad: a production-ready multicalibration library from Meta, accepted at KDD 2026. https://t.co/iIxOg8hBIS
A glimpse into the research I've been leading over the past year at Meta 🥹.
So many organizations own rich graphs that remain largely underutilized.
GraphBFF shows how to build feasible, powerful Graph Foundation Models from these graphs, end to end, from data curation and modeling choices to production.
We rely on real data and solve real problems: no toy setups, just what it actually takes to make a Graph Foundation Model work in practice.
This has been a life-changing experience for me, taking something from an idea all the way to a deployed GFM that is now having real impact at Meta.
The preprint is now available on arXiv.