Constantly learning to develop software. A Python enthusiast. Data wrangler by day. Finds containers helpful. WSL is a great way to have Linux, not avoid it.
A little confused by the association of "distillation" and "attack." Which individuals and companies are allowed to consider generated tokens their own for developing their product? If I develop a TypeScript runtime or CLI coding assistant, am I allowed to use Claude outputs?
We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax.
These labs created over 24,000 fraudulent accounts and generated over 16 million exchanges with Claude, extracting its capabilities to train and improve their own models.
Started to use "droid" today, a very nice command-line agentic coding assistant by @FactoryAI. I used it to tune some complex LLM API calls in my code (looking at you, OpenAI /responses API!). The droid CLI did a fabulous job, without consuming tokens indiscriminately.
@msuiche My only helpful finding so far applies more to models in the 1B to 3B range. Sometimes one or two examples in the prompt (keeping it very concise) yield better structured output than simple one-shot, at least if I am after some JSON. But you have inspired me to utilize XML more.
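A rough sketch of what I mean by putting one or two concise examples in the prompt when I am after JSON. The helper name `build_few_shot_prompt`, the "Input:/Output:" layout, and the sample data are all made up for illustration; this only builds the prompt string and does not call any model.

```python
import json


def build_few_shot_prompt(task, examples, query):
    """Assemble a concise prompt: task line, a few worked examples, then the real input."""
    lines = [task, ""]
    for text, expected in examples:
        lines.append(f"Input: {text}")
        # Serialize each example answer as JSON so the model sees the exact target format.
        lines.append(f"Output: {json.dumps(expected)}")
        lines.append("")
    # End with an open "Output:" so the model completes it with JSON.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical examples; two is usually as far as I go to keep the prompt short.
EXAMPLES = [
    ("Alice is 30 years old.", {"name": "Alice", "age": 30}),
    ("Bob turned 25 last week.", {"name": "Bob", "age": 25}),
]

prompt = build_few_shot_prompt(
    "Extract the person's name and age as JSON.",
    EXAMPLES,
    "Carol is 41.",
)
print(prompt)
```

The resulting string can be sent to whatever small model is being tested; the trailing "Output:" is just my habit, not a requirement.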
@msuiche Oh, nice. The Qwen3 line is certainly worth exploring. I will re-iterate that Qwen3 1.7b continues to amaze me for its size, too. And, yes... tools! I am slowly experimenting with this, too, but I haven't settled on best practices yet. So your article drew me in.
I have been very curious about very small LLMs lately, such as Gemma-3-270m, Qwen3-0.6b, LiquidAI/LFM2-1.2B, etc. So I truly loved reading this informative and practical article.
just put some internal notes about small language models into a blogpost. nothing crazy, but I would love to find more hands-on people to talk with about it.
https://t.co/QZiJDVCc2b
@msuiche When I read your article, I thought most often of Qwen3:0.6b. Qwen3:1.7b also, though it doesn't always require the same rigorous discipline outlined in your article (it could still benefit!), whereas 0.6b kinda does. And both perform so well for so many scenarios.
Keeping tabs on the very rapid development of sqruff by @quarylabs. Like sqlfluff, but Rusty and fast. We will see where this goes... https://t.co/kBONlv04bl
When configuring sway wm, keybinding codes can be found with the wonderful tool wev by Drew DeVault, as in "wev -f wl_keyboard:key" https://t.co/QCZqmZDQ99
@chainguard_dev I think the cost of ones will go way up due to demand vs. supply. Zeroes will be even more expensive, due to the controversy around them (https://t.co/iILykn8V8g)
Using and exploring @SQLFluff for both linting and auto-formatting SQL. Very flexible and configurable. Honestly, can't find better in the open source space. Actively developed. https://t.co/aNB0iQY6Fv
Enjoying aiosqlite today. Does exactly what it is supposed to (async sqlite for Python), thanks to Amethyst Reese @n7cmdr and others. https://t.co/wwis3jp4aA
@ibuildthecloud @aronchick Curious and my brain is slow this morning: what is the "it" here? Containers, maybe, but a container doesn't run browser-side. I doubt you mean transpiling to js... This is a great thread, by the way; thanks for initiating the conversation, @ibuildthecloud