Co-Founder, AI @flowaicom
Enabling analytical software to embed reliable, customer-facing data agents that reason over structured and unstructured data.
Building data-intensive agents quickly teaches you one thing: large tool outputs don’t belong in the context window.
A simple request can explode into thousands of IDs. Once those flow through tool calls and multi-agent plans, tokens blow up, latency spikes, and agents start failing.
We fixed this with memory pointers instead of raw payloads.
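A minimal sketch of the pointer idea, not the actual implementation: `PointerStore`, the `mem://` scheme, the `inline_limit` threshold, and both toy tools are hypothetical names chosen for illustration. Large tool outputs stay in a store outside the context window, and the agent only sees a short handle plus a summary.

```python
import uuid
from typing import Any


class PointerStore:
    """Keeps large tool outputs outside the context window (illustrative only)."""

    def __init__(self, inline_limit: int = 20) -> None:
        # Assumed threshold: results at or below this size are returned inline.
        self.inline_limit = inline_limit
        self._data: dict[str, list[Any]] = {}

    def wrap(self, rows: list[Any]) -> dict[str, Any]:
        """Return small results inline, large ones as a pointer plus a summary."""
        if len(rows) <= self.inline_limit:
            return {"kind": "inline", "rows": rows}
        pointer = f"mem://{uuid.uuid4().hex}"
        self._data[pointer] = rows
        return {
            "kind": "pointer",
            "pointer": pointer,
            "count": len(rows),
            "preview": rows[:3],
        }

    def resolve(self, value: Any) -> Any:
        """Swap a known pointer for its payload; pass everything else through."""
        if isinstance(value, str) and value in self._data:
            return self._data[value]
        return value


store = PointerStore()


def find_customer_ids() -> dict[str, Any]:
    # Hypothetical tool that would otherwise dump 5,000 IDs into the context.
    return store.wrap(list(range(5000)))


def count_active(customer_ids: Any) -> int:
    # Hypothetical downstream tool: resolves the pointer inside the tool runtime.
    ids = store.resolve(customer_ids)
    return sum(1 for i in ids if i % 2 == 0)


result = find_customer_ids()             # the agent sees only the handle and summary
active = count_active(result["pointer"])  # the raw IDs never enter the prompt
```

The model passes `result["pointer"]` between tool calls, so the token cost stays roughly constant no matter how many IDs the request produces.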
This week I read an IBM Research paper that independently lands on the same solution. Different domain, same failure mode, same conclusion.
Wrote a technical breakdown 👇
Here is how we build agents that continuously learn from customer feedback.
We scan agent traces for user-correction patterns: the moments where someone pushed back on what the agent did and explained why. An LLM classifies those signals and drafts a candidate update to the agent's knowledge.
That candidate goes into a queue where a human expert reviews it before anything goes live. If they approve, the change goes into the semantic data layer and is live for every user under that tenant from the next message onward.
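A rough sketch of the review gate, under stated assumptions: `Candidate`, `ReviewQueue`, and the in-memory `live_rules` dict are hypothetical stand-ins for the real queue and semantic data layer. The trace scanning and LLM classification step is left out. The point is only that nothing reaches a tenant's live rules without approval.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Candidate:
    """A drafted knowledge update awaiting human review (hypothetical shape)."""
    tenant_id: str
    proposed_rule: str
    status: Status = Status.PENDING


@dataclass
class ReviewQueue:
    pending: list[Candidate] = field(default_factory=list)
    # Stand-in for the semantic data layer: approved rules keyed by tenant.
    live_rules: dict[str, list[str]] = field(default_factory=dict)

    def submit(self, candidate: Candidate) -> None:
        self.pending.append(candidate)

    def review(self, candidate: Candidate, approve: bool) -> None:
        """Only an explicit approval publishes the rule for that tenant."""
        self.pending.remove(candidate)
        if approve:
            candidate.status = Status.APPROVED
            self.live_rules.setdefault(candidate.tenant_id, []).append(
                candidate.proposed_rule
            )
        else:
            candidate.status = Status.REJECTED

    def rules_for(self, tenant_id: str) -> list[str]:
        """What the agent would load at the start of the next message."""
        return list(self.live_rules.get(tenant_id, []))


queue = ReviewQueue()
candidate = Candidate("acme", "Revenue excludes refunds")
queue.submit(candidate)
before = queue.rules_for("acme")   # still empty: nothing is live before review
queue.review(candidate, approve=True)
after = queue.rules_for("acme")    # live for this tenant only
```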
@bergr7 covered the full version at Context is King 👇
A roomful of technical AI builders gathered at The Agentic Night by @silta_hq and @AntlerGlobal in Helsinki last night.
Our co-founder @ksariola joined @JernJohan from Realm on stage. Full panel on YouTube.
https://t.co/MuwkXqcbM8
The most useful debugging skill I've seen teams develop is getting extremely precise about what their agent is doing wrong.
“The agent always calls the search_documents tool with a broad query and then makes 3–5 execute_sql calls as the first steps, unnecessarily increasing latency.”
>> “The agent starts with unnecessary exploratory search.”
“The agent fails because the context window gets bloated with large outputs from execute_sql. Adding limits or pagination doesn’t help because the data is indivisible, so the agent keeps trying to retrieve the full set.”
>> “The context window is usually exceeded after a couple of execute_sql calls.”
“The agent repeatedly retries failing tool calls with slightly different parameters instead of changing strategy.”
>> “The agent gets stuck in local recovery loops.”
At this level of detail, the next two steps become almost mechanical:
1. Decide what the agent should do instead.
2. Encode that behavior as a default in your harness.
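As one hedged illustration of step 2, take the retry-loop failure above. A harness default could cap attempts per strategy and force a switch once the budget is spent. `run_with_retry_budget`, `StrategyExhausted`, and the two toy strategies are hypothetical names, not part of any real framework.

```python
from typing import Callable


class StrategyExhausted(Exception):
    """Raised when every strategy has used up its retry budget."""


def run_with_retry_budget(
    strategies: list[Callable[[], str]],
    max_attempts_per_strategy: int = 2,
) -> str:
    """Try each strategy a bounded number of times, then move to the next one."""
    errors: list[str] = []
    for strategy in strategies:
        for attempt in range(1, max_attempts_per_strategy + 1):
            try:
                return strategy()
            except Exception as exc:  # broad on purpose: any tool failure counts
                errors.append(f"{strategy.__name__} attempt {attempt}: {exc}")
    raise StrategyExhausted("; ".join(errors))


calls = {"raw": 0}


def query_raw_table() -> str:
    # Toy strategy that always fails, like a tool call that keeps timing out.
    calls["raw"] += 1
    raise RuntimeError("timeout")


def query_summary_table() -> str:
    # Toy fallback strategy that succeeds.
    return "42 rows"


outcome = run_with_retry_budget([query_raw_table, query_summary_table])
```

The budget lives in the harness rather than in the prompt, so the agent cannot talk itself into a sixth near-identical retry.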
Resist the temptation to jump into implementation work before you fully understand the failure mode and root cause.
I dropped a clip below from my Context is King talk where I explain this process for designing specialized harnesses.
Agent framework or agent harness? Many people use the two words interchangeably.
My co-founder @bergr7 explained the difference at Context is King. A framework gives you the primitives, a harness comes with opinions about how the agent should behave.
Those opinions are where most of the leverage sits. Upgrading the model is the easy move, but rarely the one that matters most.
The biggest gains in our agents came from baking verticalized opinions into the harness: how it plans, what it knows about your data, how it carries large result sets between steps, when it asks for approval before acting.
We're bringing Context is King to London for the first time on June 8, during London Tech Week, after four sold-out editions in San Francisco and Helsinki.
First speakers in from @ElevenLabs, @prometheuxlabs and @motley, with more confirming soon. Hosted at @atomico.
Big kudos to @ksariola for driving this edition!
Sign up: https://t.co/VGo0cPuPdO
The agent harness most builders know is a coding harness. Analytical products require specialized harnesses with numerical precision and specific tools.
Our co-founder @ksariola opened his AgentCon Silicon Valley talk on exactly this question.
THIS GUY BUILT AN AUTOMATED PIGEON DEFENSE SYSTEM FOR HIS BALCONY
pigeons kept nesting on his balcony so he engineered a full detection and deterrent system
here's how it works:
1\ camera captures video in real time
2\ an AI model identifies the pigeon in real time
3\ a water gun mounted on servo motors turns toward it
4\ sprays the pigeon automatically
the hardware:
> an orange pi 5 running the detection model
> a disassembled electric battery-driven water gun
> USB camera
> 2 servo motors for aiming
> resistors and a transistor to trigger the water gun
the detection runs on an AI vision model (yolo world v2) using the rockchip 3588's built-in neural processing unit.
the best part is that it's not limited to pigeons. because it uses open vocabulary detection, you can reprogram the target to any object. squirrels, cats, raccoons, whatever is messing with your balcony
fully automated, runs 24/7, no manual intervention needed
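a rough sketch of the aiming step (2 and 3 above), not the builder's actual code: it turns a detection bounding box into pan/tilt servo angles with a simple linear mapping. the field-of-view values, the 0 to 180 degree servo range, the tilt direction, and the `aim_angles` name are all assumptions for illustration.

```python
def aim_angles(
    box: tuple[float, float, float, float],
    frame_w: int,
    frame_h: int,
    hfov_deg: float = 60.0,  # assumed horizontal field of view of the camera
    vfov_deg: float = 40.0,  # assumed vertical field of view of the camera
) -> tuple[float, float]:
    """Map a bounding box (x_min, y_min, x_max, y_max) in pixels to (pan, tilt).

    Assumes the camera and water gun share an axis, servos centre at 90 degrees,
    and that a linear pixel-to-angle approximation is accurate enough.
    """
    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2
    cy = (y_min + y_max) / 2

    # Offset of the box centre from the frame centre, in the range -0.5..0.5.
    dx = cx / frame_w - 0.5
    dy = cy / frame_h - 0.5

    pan = 90.0 + dx * hfov_deg
    tilt = 90.0 - dy * vfov_deg  # image y grows downward, so tilt is inverted

    def clamp(angle: float) -> float:
        return max(0.0, min(180.0, angle))

    return clamp(pan), clamp(tilt)


centered = aim_angles((300, 220, 340, 260), frame_w=640, frame_h=480)
top_left = aim_angles((0, 0, 0, 0), frame_w=640, frame_h=480)
```

a target in the middle of the frame leaves both servos at their centre position; a target in the top-left corner pans left and tilts up by half the assumed field of view.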
We’re reimagining a 50-year-old interface - the mouse pointer - with AI. 🖱️
These experimental demos show how people can intuitively direct Gemini on their screens using motion, speech, and natural shorthand to get things done 🧵
On a long trip back home after a week in SF..
want to learn more about rust during the flight, so downloaded qwen3.5:9b to have some backup..
positively impressed by how much the small models have improved!!
Spoke at Context is King #4 in SF yesterday about why we ended up building our own specialized agent harness instead of reusing an existing one.
I walked through the default behaviors we encoded into the harness, and the implementation choices they led to: how to make schema, organizational knowledge, and business rules available to the agent all at once; how to let our semantic data layer learn at the same pace as knowledge evolves; and how to efficiently manage the context window when working with indivisible data.
Thanks @aiven_io for co-organizing, and to everyone who came out.
Sharing my experiences from building specialized harnesses for analytical SaaS companies.
It's likely that your harness requires your own defaults around data, context, multi-tenancy, and evolving business rules.
After all, knowledge work is different from software development.
Which default behaviors do you encode in your harness today?
Are you encoding them in the best way?
The most common mistake we see in analytical agents is dumping data into the context. Letting the agent work from references to the data instead keeps the system fast, the numbers right, and each customer's data separated.
@ksariola does the best job I have heard of explaining why that matters and what it changes about how you ship reliable agents on top of real customer data.
Claude Code is a great agent harness, for coding. For analytical SaaS, it is the wrong default.
Our CTO @ksariola took that case to AgentCon Silicon Valley this week, drawing on our experience of building specialized harnesses for analytical SaaS.
https://t.co/kcODAHECm7
Then I pushed everything to a repo so my co-founder could polish it.
He cloned it, took the design from 80 to 100 with Claude, and opened a PR -> Git-based, on-brand, collaborative slide design.