@SeanZCai @PrimeIntellect if you have ground truth samples, use gepa to optimize your judge (rubric, criteria weights, prompt). cc @JoshPurtell who built https://t.co/maAXvb1HfI
@marcklingen @langfuse request to add the ability to set negative tag filters in the trace UI (e.g., tag != X). i can hack it via query params but would be nice to have it in the UI 🙏
@jarredsumner how would this work with encodings? you can't decode partially-encoded characters that have been cut off by maxLength/offset. and, afaict, all workarounds to that are bad
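to make the concern concrete, here's a minimal sketch. it assumes maxLength/offset count bytes over UTF-8 data, which the post doesn't confirm, and `read_slice` is a hypothetical stand-in, not the API under discussion:

```python
# Assumption: offset and max_length are byte counts applied to UTF-8 data.
# read_slice is a hypothetical helper used only for illustration.

def read_slice(buf: bytes, offset: int, max_length: int) -> bytes:
    """Return at most max_length bytes of buf, starting at byte offset."""
    return buf[offset:offset + max_length]

data = "héllo".encode("utf-8")  # 'é' takes two bytes: 0xC3 0xA9

# A max_length of 2 cuts 'é' in half, leaving a dangling lead byte.
head = read_slice(data, 0, 2)
try:
    head.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False  # strict decoding rejects the truncated character

# Lossy decoding "works" but silently swaps in U+FFFD for the cut character.
lossy_head = head.decode("utf-8", errors="replace")

# An offset of 2 starts in the middle of 'é', on a continuation byte.
tail = read_slice(data, 2, 10)
lossy_tail = tail.decode("utf-8", errors="replace")
```

strict decoding throws, and lossy decoding corrupts the character on both sides of the cut, so neither half can be decoded on its own without extra state or boundary adjustment.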
Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey.
Best fully open 32B reasoning model & best 32B base model. 🧵
@nikitabier @dinkin_flickaa @misha_mityushk @nicoduc one unexpected annoyance is that, when exiting the browser view, i get redirected to the post detail view (ie the /status/<post_id> page) even if i clicked the link from my home timeline.
otherwise, nice work 👏
@simonw https://t.co/e89sUM5jGj
related - presumably this is rl-finetuned on a small oss llm but the blogpost doesn't confirm/deny that hypothesis. cc @cognition
Introducing SWE-grep and SWE-grep-mini:
Cognition’s model family for fast agentic search at >2,800 TPS.
Surface the right files to your coding agent 20x faster.
Now rolling out gradually to Windsurf users via the Fast Context subagent – or try it in our new playground!
@simonw also lots of examples where the dev time costs are meaningfully lower, ie prompt iteration on frontier models taking longer than RL fine-tuning small oss models for the same task
@growing_daniel @tigran_zzz it’ll probably be a net positive but america “lacks” israel’s mix of existential threat, national cohesion, tiny scale, and integration of army experience into daily life. i think all of the above are required for the outcomes you’re thinking about.