📢 Excited to present our work at #NeurIPS2025 in San Diego!
👕 T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning
#SFT #DataSelection #EfficientPostTraining
⏰ Wednesday, Dec 3, 4:30–7:30 PM PST
➡️ Exhibit Hall C/D/E, #200
Previous work selects high-quality SFT data by scoring entire samples and splitting them into high- and low-quality sets using a hard threshold.
In contrast, our approach uses fine-grained token-level quality and neighborhood quality to guide sample selection.