🌟New paper alert: The Master Key Hypothesis🌟
Post-training is expensive AND you have to repeat it for every new model. What if you could just transfer the capability instead?
We introduce UNLOCK — a training-free and label-free framework that transfers capabilities across models using low-rank linear projections.
Takeaways:
1⃣ Skills like CoT & math reasoning live as directions in the model's latent space. We capture these as a steering vector — we call this the Master Key.
2⃣ Simple low-rank linear transformations are sufficient to transfer the Master Key across models' latent spaces.
3⃣ UNLOCK elicits behaviors that even prompting can't reliably trigger.
4⃣ Gains scale with base model strength. Stronger models show larger gains post-transfer.
5⃣ UNLOCK works by sharpening the output distribution — steering the model toward the right answer.
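A rough sketch of the idea in takeaways 1 and 2, for readers who want something concrete. This is not the UNLOCK implementation: the difference-of-means steering vector, the least-squares fit truncated by SVD, and every name below (`steering_vector`, `fit_low_rank_map`, `steer`, the toy dimensions) are assumptions made for illustration, using random toy activations rather than real model states.

```python
import numpy as np

def steering_vector(acts_with, acts_without):
    """Assumed recipe: difference of mean activations, (n, d) arrays -> (d,)."""
    return acts_with.mean(axis=0) - acts_without.mean(axis=0)

def fit_low_rank_map(src_acts, tgt_acts, rank):
    """Least-squares W with src_acts @ W ~ tgt_acts, truncated to `rank` via SVD."""
    W, *_ = np.linalg.lstsq(src_acts, tgt_acts, rcond=None)
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * S[:rank]) @ Vt[:rank]

def steer(hidden, vector, alpha=1.0):
    """Add the (scaled) vector to a hidden state of the target model."""
    return hidden + alpha * vector

# Toy setup (hypothetical sizes): a source space of width 16, a target of width 24,
# related by a ground-truth rank-4 linear map A.
rng = np.random.default_rng(0)
d_src, d_tgt, rank, n = 16, 24, 4, 200
A = rng.normal(size=(d_src, rank)) @ rng.normal(size=(rank, d_tgt))

# Paired activations on the same inputs, used only to align the two spaces.
src_acts = rng.normal(size=(n, d_src))
tgt_acts = src_acts @ A

# Source activations with and without the skill differ by a fixed direction.
direction = rng.normal(size=d_src)
src_without = rng.normal(size=(n, d_src))
src_with = src_without + direction

key_src = steering_vector(src_with, src_without)  # vector in the source space
W = fit_low_rank_map(src_acts, tgt_acts, rank)    # low-rank source -> target map
key_tgt = key_src @ W                             # transferred vector

# Apply it to a (toy) target hidden state.
steered = steer(np.zeros(d_tgt), key_tgt, alpha=2.0)
```

In this toy case the two spaces are related by an exactly low-rank linear map, so the transferred vector matches `direction @ A`; with real models the alignment would only be approximate, and the choice of layer and scale `alpha` would matter.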
@johnhewtt @GeorgeMorgulis Very interesting read!
We also find that steering vectors can be used to capture and transfer reasoning capabilities such as CoT, and the steered model can match/surpass prompting. Low-rank signal transfer seems to be very promising!
Paper: https://t.co/vQLzjv7axu
I am in Rio de Janeiro, Brazil for #ICLR2026 @iclr_conf. Would love to connect and chat about any LLM research topics, explore collaboration, or discuss potential funding opportunities for my lab at Virginia Tech.
@thecekbote If the W_o and next FFN already do the mixing once per layer, isn't IHA functionally similar to MoE in the sense that you mix heads/residuals?
@universeinanegg Our recent work shows that reasoning can also be approximately linear in nature: https://t.co/vQLzjv7axu. But I do agree that manual searching is required to find these maps/vectors.
Based on our findings, we introduce the Master Key Hypothesis and postulate the convergence of capability representations across model families and scales.
The 3 self-distillation papers seem extremely similar in method and differ only in how the feedback is generated/incorporated. They are also compared only to SFT (known to be the weakest method), even though feedback can also be incorporated with other PO methods. Not quite sure of the takeaways, but the improvements and continual learning settings look good!
Excited to share that our paper on efficient model development has been accepted to #EMNLP2025 Main conference @emnlpmeeting. Congratulations to my students @linusdd44804 and @Sub_RBala on their first PhD paper! 🎉