Laguna M.1 and XS.2 now support 256K context.
Laguna M.1 is now live with a 256K context window on the Poolside API and OpenRouter.
With this update, it reaches 45.8% on Terminal-Bench 2.0, improving long-horizon performance.
Laguna XS.2 is also moving to 256K today, with the updated config already available on Hugging Face.
Both models remain free to use.
Over 1T tokens have been processed since launch 4 weeks ago. Excited to see what people build with the longer context window.
Today we’re publishing the technical report behind Laguna M.1 and Laguna XS.2.
This report opens up more of what went into them: Model Factory, pre-training data, distributed training, post-training, agent RL, quantization, and evaluation.
https://t.co/RWk2F9IrAI
Poolside is hosting a 2-day model research hackathon in London.
Join us to push an open-weight agent model as far as you can. RL and fine-tune Laguna XS.2, our latest-generation model, on Prime Intellect Lab.
Dates: May 29–30
Partners: @nvidia + @PrimeIntellect + @huggingface
Prize: NVIDIA DGX Spark
Agents need better models.
Better models need cracked researchers.
Link below.
We're releasing <𝙿𝚘𝚘𝚕𝙴𝚟𝚊𝚕.../> today, an open-source component for rendering eval charts. We built this to power the dozens of eval charts on https://t.co/xMfPzgCiec, and now you can use it too. Accessible, beautiful, configurable, touch friendly and parseable at a glance.
Useful for model providers, harness developers and researchers alike.
https://t.co/f742mvLwhg
Couldn't find a good solution for displaying eval charts, so built a new component from scratch with animation, touch friendly tooltips, a11y, model provider logos and more. Open sourcing soon ✨
Today we’re releasing Laguna M.1 and Laguna XS.2, our first public models.
Laguna XS.2 is our first open-weight release, with weights available today on Hugging Face:
https://t.co/SZTkrtOHm7
A few details on what went into them: large-scale pre-training, data mixture optimization, synthetic data, optimizer efficiency, and async agent RL.
Today we're launching our first public Poolside models: Laguna M.1 and Laguna XS.2, and we've built ❈Shimmer, an instant-on VM sandbox with Poolside Agent pre-installed so you can try them out.
Go play out our new models for free, and build something fun → https://t.co/G2YKxQmT9L
Today we’re releasing Laguna XS.2, Poolside’s first open-weight model.
It’s a 33B total / 3B active MoE model built for agentic coding and long-horizon tasks.
Trained fully in-house on our own stack. Runs on a single GPU. Released under Apache 2.0.
Links 👇
Weights: https://t.co/HSo8L2gM64
API: https://t.co/DMJtNFrace
Blog: https://t.co/BXEjQxtQoV
✨ Excited to be at @NeurIPSConf in San Diego next week with the team at @poolsideai ⛱️ Booth #913!
If you’re building, researching, or thinking about the next wave of AI systems, especially around RL and agentic workflows, I’d love to connect.
Ping me if you’re around! ✨
Happy to share our work on "Text2Code Generation with Modality-relative Pre-training" w/ @gcsanity & @glampouras_NLP, accepted at #EACL2024! 🎉
We propose to treat code and natural language as
different modalities.
📜https://t.co/fOYsUARpve
💻coming soon (pending int. review)
Research Scientist (permanent) positions just opened in our NLP team in London - Huawei Noah's Ark Lab! Looking for experienced researchers to help us tackle some interesting (and persistent) questions :)
Take a look and apply if interested:
https://t.co/Q4JajEMFLQ
Delighted to present our #naacl2024 paper, #HumanRankEval: Automatic Evaluation of #LMs as Conversational Assistants with @glampouras_NLP and @iiacobacNLP providing fast, reliable and private evaluation of instruction-tuned #LLMs. A truly new eval paradigm, no #gpt4-as-a-judge :)
🚀 Excited to share our new pre-print: "Human-like Episodic Memory for Infinite Context LLMs"! We introduce EM-LLM, a novel approach integrating cognitive science insights into LLMs for vastly extended context processing: https://t.co/dwt5GQEVGS
What we did:
· 📊 We treat LLMs' K-V cache as analogous to personal experiences and segmented it into events of episodic memory based on Bayesian surprise (or prediction error).
· 🔍 We then apply a graph-theory approach to refine these events, optimizing for relevant information during retrieval.
· 🔄 When deemed important by the LLM's self-attention, past events are recalled based on similarity to the current query, promoting temporal contiguity & asymmetry, mimicking human free recall effects.
· ✨ This allows LLMs to handle virtually infinite contexts more accurately than before, without retraining.
Our method outperforms the SOTA model InfLLM on LongBench, given an LLM and context window size, achieving a 4.3% overall improvement with a significant boost of 33% on PassageRetrieval. Notably, EM-LLM's event segmentation also strongly correlates with human-perceived events!!
We are releasing this method today with our first set of results, but more results and analysis are coming soon.
Huge thanks to all my co-authors/colleagues for this amazing collaboration: Martin A Benfeghoul, Adnan Oomerjee, @fenchri, @glampouras_NLP, @hbouammar from @Huawei Noah's Ark and Jun Wang from @UCL.
For a fuller description, check out @hbouammar's thread: https://t.co/66UArAlSbL
Stay tuned for more updates and a code release soon :) #LLMs #LongContextLLMs #EpisodicMemory #NLP #CogSci #MachineLearning #AI
Looking to grow our NLP team in London - Huawei Noah's Ark Lab! We have positions for Research Scientists (permanents) and Engineers (contractors), where they will conduct academic and applied research in NLP and ML. Details below:
https://t.co/dDxE97BgRl
https://t.co/LGN9snl3yb