‼️🚨 A new npm supply-chain attack compromised 57 packages, with more than 286 malicious versions published in under 2 hours. The attackers used self-replicating malware, a new version of the Miasma worm, which also uses evasion techniques to stay under the radar.
The payload targets CI/CD and developer credentials, including GitHub Actions secrets, cloud credentials, Vault tokens, SSH keys, npm and GitHub tokens, and password-manager stores. This variant also injects AI coding assistant config files at `.claude`, `.cursor`, `.gemini`, and `.vscode` paths, a separate persistence and repo-poisoning angle.
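For anyone who wants to review those paths by hand, here is a minimal sketch that lists every directory named `.claude`, `.cursor`, `.gemini`, or `.vscode` under a project root. Only the four directory names come from the post; the function name `find_assistant_config_dirs` is hypothetical. Finding one of these directories is not a sign of compromise on its own, since many legitimate projects have them. The output is just a list of places to inspect or diff against version control.

```python
import os

# Directory names taken from the post; everything else here is illustrative.
ASSISTANT_CONFIG_DIRS = (".claude", ".cursor", ".gemini", ".vscode")


def find_assistant_config_dirs(root):
    """Return a sorted list of directories under root named like an AI assistant config dir."""
    hits = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            if name in ASSISTANT_CONFIG_DIRS:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


if __name__ == "__main__":
    # Scan the current working directory and print each candidate for manual review.
    for path in find_assistant_config_dirs("."):
        print(path)
```

Run it from a repo root and compare each listed directory against what `git status` and `git log` say should be there.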
Hiring manager: *Looking at my portfolio* wow this is great! Looks awesome, I love it.
How did you build this?
Me: Thank you, I used AI obviously
HM: Tech stack used?
Me: uh, Vite and React
*UBER SETS $1,500 MONTHLY CAP ON SOME AI CODING TOOLS FOR STAFF
$UBER officially reining in the Claude budget after blowing their AI budget earlier this year.
Undoubtedly more companies to follow
What if you could take three completely different model families… and distill them into one tiny model? 🤯
📜 Paper: https://t.co/K2iKD4xFvp
MOPD (Multi-Teacher On-Policy Distillation) has become a standard procedure in post-training. We already distill multiple specialized variants of the same model into a single set of weights.
But what if we could go further - and distill models from entirely different families? Turns out, it is possible.
Today we’re releasing a paper on cross-tokenizer distillation - our first steps in this exciting direction. 📄
We distilled Qwen3-4B, Phi-4-Mini, and Llama-3B into Llama-3.2-1B.
MMLU jumped from 32.05 → 46.32 when using multiple teachers. 📈
The team is now working on Nemo-RL integration so the community can try this method in their own settings. Plus, we are scaling experiments up. 🚀
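To make the multi-teacher idea concrete, here is a toy sketch of one possible objective: a weighted average of per-teacher reverse KL terms at a single token position. It is not the paper's method. It assumes every teacher's distribution has already been mapped onto the student's vocabulary, which is exactly the cross-tokenizer step this thread is about and which is not reproduced here. The function names, the uniform default weights, and the choice of reverse KL are all assumptions for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_kl(student_probs, teacher_probs, eps=1e-12):
    """KL(student || teacher) for one token position over a shared vocabulary."""
    return sum(
        p * (math.log(p + eps) - math.log(q + eps))
        for p, q in zip(student_probs, teacher_probs)
    )


def multi_teacher_loss(student_logits, teacher_logits_list, weights=None):
    """Weighted average of reverse KL from the student to each teacher.

    Assumption: all logit vectors index the same vocabulary. Real teachers
    with different tokenizers would need an alignment step first.
    """
    if weights is None:
        # Hypothetical default: treat every teacher equally.
        weights = [1.0 / len(teacher_logits_list)] * len(teacher_logits_list)
    if len(weights) != len(teacher_logits_list):
        raise ValueError("need exactly one weight per teacher")
    student_probs = softmax(student_logits)
    return sum(
        w * reverse_kl(student_probs, softmax(t))
        for w, t in zip(weights, teacher_logits_list)
    )
```

In an on-policy setup the positions being scored would come from sequences sampled by the student itself; the sketch leaves sampling out and only shows how several teacher signals can be combined into one scalar.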