🎉 Our paper about Preference Optimization has been accepted to ICML 2026!
We unify entangled & disentangled objectives via an incentive–score decomposition and derive the Disentanglement Band for ideal training dynamics: suppress the loser while preserving the winner.
#ICML2026
Flow-OPD: On-Policy Distillation for Flow Matching Models
The first integration of On-Policy Distillation into Flow Matching models.
Replaces sparse scalar rewards with dense trajectory-level supervision.
Achieves 0.92 GenEval and 0.94 OCR accuracy on SD-3.5-Medium, a +18pt improvement over the base model.
@stjohn2007 You are absolutely right. My density-chasm experience confirms ADPO's theoretical robustness to abnormal responses, especially late in preference optimization training. Thank you for this insightful work.
@JoshuaRenyi @HuggingPapers Impressive work! Our ICML 2026 paper introduces the disentanglement band to analyze preference update interference, inspired by your work.
https://t.co/H6zWp2XnJt
@HuggingPapers Interesting! Our work introduces the disentanglement band: a conceptual tool for analyzing how preference updates interfere with the winner vs. loser responses. It helps diagnose how suppressing the loser may harm the winner. Also inspired by @JoshuaRenyi.
https://t.co/H6zWp2XnJt
@HadyHaji This seems like an interesting view of preference optimization, and I have recently been working on a similar idea. Could you please share a link to the paper?
Please consider submitting high-quality work to the
ICML 2026 Workshop on Foundations of Deep Generative Models,
and interact with the cool community this summer in vibrant Seoul, South Korea!
https://t.co/aQov8ddYIp
Submit at https://t.co/I5VVL3IhJG by 4/30.
🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.
🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.
🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice.
Try it now at https://t.co/GCdiMzk1Dl via Expert Mode / Instant Mode. API is updated & available today!
📄 Tech Report: https://t.co/drlDrxkYtp
🤗 Open Weights: https://t.co/T13Y8i7SDM
1/n
Score-based methods: theory says the path doesn't matter, but practice says it does.
We found why — path variance — and learned the optimal interpolation path in closed form. No heuristics, just math.
#ICLR #ICLR2026 #Rio