🧶 3/3
We show that models fine-tuned on IMPaCTS outperform instruction-tuned zero-shot LLMs on controlled sentence simplification tasks, becoming proficient in both simplifying texts and in adhering at the specified target readability level.
Paper: https://t.co/YjECuw90X6
🧶 1/3
We’re excited to share our second paper presented at #LREC2026 in Palma: “Controllable Sentence Simplification in Italian: Fine-Tuning Large Language Models on Automatically Generated Resources” by @mpapucci_, Giulia Venturi and Felice Dell'Orletta
🧶 2/3
In this work, we introduce IMPaCTS: a new multi-level sentence simplification dataset for Italian, automatically generated to support controllable text simplification research.
😊 Dataset on HF: https://t.co/j83FJoar1R
🔧 Dataset on GitHub: https://t.co/sRaKQk5hF6
In this work we present UGLD, a decoding method that nudges LLMs toward or against a predefined vocabulary at inference time. No fine-tuning needed and without loss of fluency.
📄 Paper: https://t.co/aJisq2rKTK
🔧 GitHub: https://t.co/xjTVJZQB9i
🐍 Install UGLD: pip install ugld
🚀 Excited to share our latest work at READIxTSAR workshop at #LREC2026
"Lexical Conditioning of Model's Distribution through Uncertainty-gated Soft-Mixing of Probabilities" by @mpapucci_ Giulia Venturi and Felice Dell'Orletta.
[1/4] How does the order of pretraining data impact Neural Language Models?
We study this in our 🎉new paper🎉:
"On the impact of pretraining data ordering in transformer encoder- and decoder-only language models"
Published in Knowledge-Based Systems
🔗https://t.co/nWrQsLtue3
[3/4] We examine effects on:
• Perplexity dynamics
• Linguistic probing
• Downstream tasks performance
• Embedding geometry
Data ordering shapes both learning dynamics and representations.
Cristiano Ciaccio won the CLiC-it Best Student Paper Award for the work “Crossword Space: Latent Manifold Learning for Italian Crosswords and beyond” (with @gsarti_, @AlessioMiaschi and Felice Dell’Orletta)!! 🔥🔥🔥
@AILC_NLP#NLProc
Last but not least, Agnese Bonfigli presenting “A Novel Real-World Dataset of Italian Clinical Notes for NLP-based Decision Support in Low Back Pain Treatment”! 🥳
Link to the paper: https://t.co/obw3kqUtU5
#NLProc
Afternoon session with Ruben Piperno presenting “Doctor, Is That You? Evaluating Large Language Models on Italy’s Medical School Entrance Exams” (with Bonfigli A., Dell’Orletta F., Pecchia L., Merone M. and @LBacco1)! 🔥
Link to the paper: https://t.co/Rpa06FRFn8
#NLProc
And now Silvio Calderaro presenting “The OuLiBench Benchmark: Formal Constraints as a Lens into LLM Linguistic Competence” (with @AlessioMiaschi and Felice Dell’Orletta)! 📖
Link to the paper: https://t.co/y75Mv0ElGn
#NLProc
First talk of the morning with @mpapucci_ presenting “Generating and Evaluating Multi-Level Text Simplification: A Case Study on Italian” (with Giulia Venturi and Felice Dell’Orletta)! 🔥
Link to the paper: https://t.co/E9xJ5Oig57
#NLProc
Last but not least (for today), @lucadini_ presenting “The Role of Eye-Tracking Data in Encoder-Based Models: An In-depth Linguistic Analysis” (with @workerplacemint, Dominique Brunato and Felice Dell’Orletta)! 👁️
Link to the paper: https://t.co/DwnmQqPrUN
#NLProc
🧩 And now some crossword solving with Cristiano Ciaccio presenting “Crossword Space: Latent Manifold Learning for Italian Crosswords and beyond” (with @gsarti_, @AlessioMiaschi and Felice Dell’Orletta)!
Link to the paper: https://t.co/46rP2tcBQm
#NLProc