@renenyffenegger Thanks. This is exactly what makes it interesting. The 13 parameters are additions, and finding them is a hard piece of work. Looks like a general trend to me. I hope to report on a similar paper in one of the next posts.
Theory Thursday: Steering the Sailboat: How 13 Parameters Shift a 7B Model
Training an AI model is hard. It requires skill and persistence. And the methods evolve constantly in this fast-moving field.
The newest trick: add a small number of steering parameters to unlock sleeping functionality.
In Learning to Reason in 13 Parameters, J. Morris et al. take a pretrained Qwen 7B model and improve its performance on GSM8K, a benchmark of ~8,000 grade-school math word problems requiring multi-step reasoning.
They freeze all 7 billion parameters. Nothing in the base model is retrained.
Instead, they attach a tiny low-rank adapter. In the extreme case: just 13 trainable parameters. Inserted into carefully chosen attention projections. Optimized using reinforcement learning.
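To make the idea tangible, here is a minimal sketch in plain Python. It is not the construction from the paper: the toy layer size, the frozen random directions `U` and `V`, and the way the 13 scalars in `theta` are mixed in are all assumptions chosen for illustration, and no reinforcement learning is shown. It only demonstrates the principle: the base weights stay untouched, and a handful of scalars shift the output.

```python
import random

random.seed(0)

# Toy sizes (assumptions). Only the number 13 comes from the post.
D_IN, D_OUT, N_STEER = 8, 8, 13


def rand_vec(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Frozen "base model": a single weight matrix that is never updated.
W_frozen = [rand_vec(D_IN) for _ in range(D_OUT)]

# Frozen random directions (hypothetical): each pair (U[k], V[k])
# defines one rank-1 update that a single scalar can switch on.
U = [rand_vec(D_OUT) for _ in range(N_STEER)]
V = [rand_vec(D_IN) for _ in range(N_STEER)]

# The only trainable values: 13 scalars. Starting at zero means the
# adapted layer initially behaves exactly like the frozen one.
theta = [0.0] * N_STEER


def forward(x, theta):
    """Compute W_frozen @ x + sum_k theta[k] * U[k] * (V[k] . x)."""
    out = [dot(row, x) for row in W_frozen]
    for k, t in enumerate(theta):
        scale = t * dot(V[k], x)
        out = [o + scale * u for o, u in zip(out, U[k])]
    return out


x = rand_vec(D_IN)
y_base = forward(x, theta)                 # identical to the frozen layer
y_steered = forward(x, [0.1] * N_STEER)    # nudged by the 13 scalars

frozen_count = D_IN * D_OUT
trainable_count = len(theta)
```

In this toy setup, `frozen_count` is 64 and `trainable_count` is 13. In the real setting the frozen side has billions of entries, while the trainable side stays just as small.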
The implication is interesting. The tiny adapter moves the model toward more consistent reasoning paths, like a rudder gives a sailboat the stability to hold its course.
Practically, this means we can keep a 7B model and steer it, preserving the option to run it on-device.
Which use cases for small devices are you waiting to see?
https://t.co/Lh5JdBGEtP, Learning to Reason in 13 Parameters, J. Morris et al., Feb '26
Art: https://t.co/QPZUwWDV5E https://t.co/NynkQ8YyKP
Workflow Wednesday: Fallback Is a Capability
In the last Workflow Wednesday, we talked about interruptibility by design and promised more todos when building agentic systems. Here is the second: fallback and override.
Suppose you introduce transport robots in a warehouse. They move materials and waste. The system works well and replaces humans who did these tasks before. Nobody misses the manual runs. But then a technical problem halts the system.
That hurts. Transport cannot wait, and humans must step in. The problem: nobody is available, and the task is unpopular. If the breakdown lasts too long, the cost of stepping in outgrows the savings.
Now replace "transport robot" with "agentic system", and you see the problem.
In a recent paper, Giuseppe Romeo & Daniela Conti show that when systems appear reliable, humans reduce verification effort and rely more heavily on them. Plus: Trust in AI systems generalizes. Positive experience with one system increases reliance on the next.
But how do we mitigate it?
By foreseeing fallbacks and by designing in manual overrides. We know that domain expertise is one of the strongest protective factors. Because expertise requires repetition, tasks need to be shared between humans and robots. As musicians know well: when we stop practicing, capacity degrades.
By employing people who can perform the task when the system stops. If no one can step in at short notice, fallback exists only on paper.
By implementing rotation models, shadow operation, and … drills. Measuring recovery time is a good idea.
By foreseeing graceful degradation (under stress, uncertainty, or partial failure, the agentic system must reduce scope).
By explicit escalation paths (fully autonomous → human review → human lead), where the system state is visible and understandable. And by adding a hard “I take over” button.
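The escalation path and the "I take over" button from the list above can be sketched in a few lines of Python. Everything here is hypothetical (the `AgentController` class, its method names, the injected clock); it is one possible shape under these assumptions, not a reference design. It also records the recovery time, since measuring it is part of the drill.

```python
import time
from enum import IntEnum


class Mode(IntEnum):
    FULLY_AUTONOMOUS = 0
    HUMAN_REVIEW = 1
    HUMAN_LEAD = 2


class AgentController:
    """Hypothetical controller that keeps the system state visible."""

    def __init__(self, clock=time.monotonic):
        self.mode = Mode.FULLY_AUTONOMOUS
        self.log = []                  # (from_mode, to_mode, reason)
        self._clock = clock
        self._failure_started = None

    def _set_mode(self, mode, reason):
        self.log.append((self.mode.name, mode.name, reason))
        self.mode = mode

    def escalate(self, reason):
        """Move one step up the ladder; stay put at the top."""
        if self.mode < Mode.HUMAN_LEAD:
            self._set_mode(Mode(self.mode + 1), reason)

    def take_over(self, operator):
        """The hard override: jump straight to human lead."""
        self._set_mode(Mode.HUMAN_LEAD, f"manual override by {operator}")

    def report_failure(self):
        """Start the recovery clock and escalate one step."""
        if self._failure_started is None:
            self._failure_started = self._clock()
        self.escalate("failure reported")

    def report_recovered(self):
        """Return the recovery time in seconds, or None if nothing failed."""
        if self._failure_started is None:
            return None
        elapsed = self._clock() - self._failure_started
        self._failure_started = None
        return elapsed
```

A drill then becomes a short script: call `report_failure()`, let an operator call `take_over("name")`, and compare the value returned by `report_recovered()` with the target you set.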
You get the picture. The todo is to add overrides and fallbacks to the agentic system from the start.
If your agent stopped tomorrow: who takes over, and how long would it take?
Art: https://t.co/QPZUwWDV5E https://t.co/NynkQ8YyKP
G. Romeo, Daniela Conti, July 2025: https://t.co/5yunk9SCWZ
Trend Tuesday: Why AI Skill Installations Deserve Caution
Before you consume something powerful, you check what’s inside.
AI skills are a recent invention. And marketplaces to share and reuse skills formed almost immediately. As with any growing ecosystem, misuse tends to follow.
@DanielLockyer recently observed malware in a widely downloaded skill on Clawhub. The mechanism was subtle: the installation instructions linked to malicious binaries, enabling remote command execution. No damage was observed and no data loss was reported.
The observation matters.
A skill looks harmless. Just text. No binary. No installer. Yet once invoked, it can run commands, exfiltrate data, or instrument a system in unintended ways.
The best protection is vigilance. Check what’s inside, and walk away if needed. Keep agents contained. Allow only what you explicitly trust, and stay aware of what they do.
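As a first line of "check what's inside", a skill's text can be scanned before installation. The sketch below is a hypothetical helper, not a real tool: the allowlist `TRUSTED_HOSTS`, the suffix list, and the single shell-pipe pattern are assumptions, and a clean result does not prove a skill is safe. It only flags the pattern described above, where instructions point to external binaries or scripts.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist: hosts you have decided to trust explicitly.
TRUSTED_HOSTS = {"docs.python.org"}

# File endings that suggest a binary or script download (assumption).
RISKY_SUFFIXES = (".exe", ".sh", ".dmg", ".pkg", ".msi", ".bin")

# A download piped directly into a shell, e.g. "curl ... | sh".
PIPE_TO_SHELL = re.compile(r"(?:curl|wget)\s+[^|\n]*\|\s*(?:ba)?sh")

URL_RE = re.compile(r"https?://[^\s)\"'>]+")


def review_skill(text):
    """Return a list of human-readable findings for a skill's text."""
    findings = []
    for url in URL_RE.findall(text):
        parsed = urlparse(url)
        host = parsed.hostname or ""
        if host not in TRUSTED_HOSTS:
            findings.append(f"untrusted host: {host}")
        if parsed.path.lower().endswith(RISKY_SUFFIXES):
            findings.append(f"links to a binary or script: {url}")
    if PIPE_TO_SHELL.search(text):
        findings.append("pipes a download into a shell")
    return findings
```

An empty list means "nothing obvious found", not "safe". Anything flagged deserves a human look before the skill gets near an agent with real permissions.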
Agentic AI holds real promise. With that power comes responsibility. How do you control the agentic systems you run today?
Art: https://t.co/QPZUwWDV5E https://t.co/NynkQ8YyKP
https://t.co/cVW0S3ukhM
Theory Thursday: Energy Efficiency as an Innovation Driver ✅
TL;DR: RETRO hints that efficiency, not size, will drive the next AI gains.
Today’s AI is energy-intensive. I get many questions about it. And it is true: today’s AI is largely a brute-force approach. Clumsy in a way. We are very happy to (finally) have working systems with astonishing capabilities. As long as we have no better options, we feed energy into the systems we have. A bit like the first cars with incredibly low miles per gallon. And as with these cars, the energy efficiency of AI will eventually improve. And likely unlock additional capabilities along the way.
When energy becomes a constraint, architecture matters.
That’s why a 2022 DeepMind paper on RETRO is back in focus. RETRO retrieves external text for a conversation, a common technique for well-grounded chat systems (retrieval-augmented generation, or RAG). Unlike most RAG systems, RETRO uses cross-attention: the retrieved information is integrated without making the prompt longer. This matters because prompt length is one of the dominant drivers of energy consumption in autoregressive LLM inference.
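A back-of-the-envelope sketch shows why this can matter. It is a deliberate simplification and not RETRO's actual architecture: it only counts query-key pairs in one attention layer and ignores causal masking, chunking, the cost of encoding the retrieved text, and all constant factors. The function names and token counts are assumptions for illustration.

```python
def self_attention_pairs(prompt_tokens, retrieved_tokens):
    """Retrieved text is pasted into the prompt: every token attends to every token."""
    total = prompt_tokens + retrieved_tokens
    return total * total


def cross_attention_pairs(prompt_tokens, retrieved_tokens):
    """Prompt attends to itself, plus each prompt token attends to the retrieved text."""
    return prompt_tokens * prompt_tokens + prompt_tokens * retrieved_tokens


# Hypothetical sizes: a 500-token prompt and 2,000 retrieved tokens.
in_prompt = self_attention_pairs(500, 2000)     # 6,250,000 pairs
via_cross = cross_attention_pairs(500, 2000)    # 1,250,000 pairs
ratio = in_prompt / via_cross                   # 5.0
```

With these made-up numbers, keeping the retrieved text out of the prompt cuts the pair count by a factor of five. Real savings depend on the full architecture, so treat this as intuition, not a measurement.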
In other words: people are actively searching for and experimenting with ideas that achieve the same results with less effort. Be it fewer parameters or shorter prompts. And this is not just theoretical. Microsoft shows that cost-aware training of their Search-R1 algorithm improves exact match by about 5% on average while using roughly 20% fewer inference resources.
AI is developing at high speed. I’m curious to see which additional architectural ideas we will find next. Ideas that improve AI both in terms of energy efficiency and in how we design these systems in the first place.
- Helia Hashemi et al, https://t.co/qqTif79M6l, Cost-Aware Retrieval-Augmentation Reasoning Models with Adaptive Retrieval Depth (Oct 25)
- Art: https://t.co/QPZUwWDV5E https://t.co/NynkQ8YyKP
I like @schwarzenegger’s Be Useful. One of his remarks stuck with me: “Write half a page every day, and in a year you’ll have a book.” Today, we finalized the layout and sent The Modern AI and Data Platform to print. 📘
@Handelszeitung Commuters will probably feel the consequences of SBB's last procurement decision for the double-decker trains (Dosto) for decades to come. A questionable use of public funds. Hopefully today's decision is more far-sighted.
Workflow Wednesday ⏱️ AI projects fail less from tech than from teamwork. Like clockwork, success depends on precise alignment: people, skills, and culture working in sync. 💡 Clear roles 🧠 Strong skills (AI, MLOps, security, storytelling) 🤝 Culture of quality and curiosity.
AI is doubling fast, like water lilies on a lake. 🌊
Faster models, smarter algorithms, real impact.
A self-reinforcing loop: better AI → better science → better AI. I spent time finding real-impact cases among all the hype.
AlphaDev discovered a faster way to sort small data sets. The algorithm is now part of LLVM, bringing small but measurable speedups to nearly every modern device.
https://t.co/VLU8VQZOxs
Workflow Wednesday 🕵️‍♂️ AI impact starts with the right questions: Questions → Data → Model → Value. Miscount customers and churn is distorted. Misdefine products and campaigns skew. Business questions are not a detail: they are the case board for AI.
Trend Tuesday 🍟 AI speeds things up. Often at a cost. Developers code faster, but 25% of AI output carries flaws. In medicine, GPT-4 alone beats doctors. As a mere support tool, its edge disappears. Like tractors in 1915, true gains need systemic change. What structures must shift?
Theory Thursday: Why LLMs Hallucinate
A Fata Morgana promises relief but leads you astray. So do LLM hallucinations: fluent answers that are wrong.
They result from misaligned training incentives and other sources. Always verify. How do you stay clear-headed? (soon on LinkedIn)