🚨Excited to announce our workshop Context Beyond the Window hosted at COLM in SF! 🚨
LLMs have finite context windows, yet real-world tasks demand absorbing, retaining, and acting on information that far exceeds any single prompt.
https://t.co/6y1ILeeC9A
We're looking for submissions across:
• Context compression 🧃 — token compaction, recursive subagent calls, and external memory for storing and retrieving information
• Efficient architectures 🚀 — sub-quadratic attention variants that make extremely long context computationally feasible
• Continual training 🌱 — test-time training on streaming data, context distillation, and knowledge accumulation through continued pre-training
• Agentic memory systems 🐘 — scaffolds and test-time scaling techniques that improve knowledge retention and acquisition in LLMs
• Evaluation 🎯 — benchmarking models on increasingly long-horizon tasks
Accepted by #ACL2025! Congrats @mingdachen and the team🥳
Several cool ideas:
- Maintain an explicit editable working memory during generation;
- Actively integrate external feedback (factual check w/ VeriScore);
A smart LM learns to memorize, a smarter LM learns to forget too!
Last one of the year - EWE: https://t.co/D5y53ahtyX
Ewe (Explicit Working Memory) enhances factuality in long-form text generation by integrating a working memory that receives real-time feedback from external resources.
Can visual SSL match CLIP on VQA?
Yes! We show with controlled experiments that visual SSL can be competitive even on OCR/Chart VQA, as demonstrated by our new Web-SSL model family (1B-7B params), which is trained purely on web images – without any language supervision.