Showcased https://t.co/Wl35Avkgbr at @devtoolsTO - Toronto Tech Week 2026 and had a handful of signups :)
Slides: https://t.co/EbZBJ9ozIX
Thanks a ton @bentlegen & organizers for the opportunity
What if you could take three completely different model families… and distill them into one tiny model?
Paper: https://t.co/K2iKD4xFvp
MOPD (Multi-Teacher On-Policy Distillation) has become a standard procedure in post-training. We already distill multiple specialized variants of the same model into a single set of weights.
But what if we could go further - and distill models from entirely different families? Turns out, it is possible.
Today we're releasing a paper on cross-tokenizer distillation - our first steps in this exciting direction.
We distilled Qwen3-4B, Phi-4-Mini, and Llama-3B into Llama-3.2-1B.
MMLU jumped from 32.05 → 46.32 when using multiple teachers.
The team is now working on Nemo-RL integration so the community can try this method in their own settings. Plus, we are scaling experiments up.
Netlify has amazing UX but is expensive, while Cloudflare Workers & Pages are pure value.
Gave Claude Code some wrangler access and it completely migrated the site: planned the DNS cutover, sitemaps are functional, redirects are added and verified, and the Netlify subscription is cancelled!
Code-Pathfinder MCP usage is doubling every month, and it's wild to see agents keep coming back to get the right codebase context!
Dropping more exciting security scanning updates soon!
https://t.co/ExG4fl6VOH
Stop grepping. Start querying.
I've been using Claude Code for over a year to build Code-Pathfinder. It's incredible for prototyping and exploring design choices.
But I kept hitting the same problem.
Every conversation, I'd watch the LLM (agent) grep through files, read entire modules, and hunt for a needle in a haystack. I'd lose confidence in its responses. Precision would drop as context windows grew.
I'd correct obvious things:
- "No, that function is called from these 12 places"
- "This import resolves here"
- "Can you cross-check where this method is invoked?"
- "Are you sure this class contains these abc methods?"
The agent was searching when it should have known the answer, or at least had it within reach. So I built something for myself: an MCP server that exposes Code-Pathfinder's indexed call graphs directly to AI agents.
No more grepping. No more reading full files. Just instant queries:
- Who calls this?
- What does this depend on?
- Where does this import go?
The shift was immediate. Conversations went from "let me search..." to "here's exactly what you asked."
More trust. Fewer corrections. Faster iterations.
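To make the idea concrete, here is a minimal sketch of why an indexed call graph answers "who calls this?" without scanning files. It is not Code-Pathfinder's implementation or its MCP interface; the `CallGraph` class, its method names, and the function names in the sample data are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


class CallGraph:
    """Toy in-memory call graph (hypothetical, for illustration only)."""

    def __init__(self):
        # Forward edges: caller -> functions it calls.
        self._callees = defaultdict(set)
        # Reverse edges: callee -> functions that call it.
        self._callers = defaultdict(set)

    def add_call(self, caller, callee):
        # Record the edge in both directions at index time,
        # so either question is a single dictionary lookup later.
        self._callees[caller].add(callee)
        self._callers[callee].add(caller)

    def callers_of(self, name):
        # "Who calls this?"
        return sorted(self._callers.get(name, ()))

    def callees_of(self, name):
        # "What does this depend on?"
        return sorted(self._callees.get(name, ()))


# Made-up edges standing in for an indexed codebase.
graph = CallGraph()
graph.add_call("api.handle_request", "auth.check_token")
graph.add_call("jobs.refresh", "auth.check_token")
graph.add_call("auth.check_token", "db.get_user")

print(graph.callers_of("auth.check_token"))
print(graph.callees_of("auth.check_token"))
```

The point of the sketch is the trade: the cost is paid once when the index is built, and each question afterwards is a lookup rather than a search through source text.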
Bonus: Working with microservices? Configure multiple Pathfinder instances - one for your Python SDK, one for your gRPC server, one for your BFF layer. Query across all repos in a single prompt. Your AI agent gets the full picture without grepping all the way through each repo.
I used it for weeks, refined it, then realized: if this fixed my workflow, maybe it could help others with similar Python projects.
Today I'm open sourcing it:
https://t.co/37ICcqtBoG
@JustJake Google has been blocking accounts left and right on the consumer side! The only solution is to diversify account access across providers.
https://t.co/nA25LQ6TJS