Karolus Sariola @ksariola - Twitter Profile

Pinned Tweet

about 1 month ago

Sharing my experiences from building specialized harnesses for analytical SaaS companies. It's likely that your harness requires your own defaults around data, context, multi-tenancy, and evolving business rules. After all, knowledge work is different from software development. Which default behaviors do you encode in your harness today? Are you encoding them in the best way?

Flow AI @flowaicom

about 1 month ago

Claude Code is a great agent harness, for coding. For analytical SaaS, it is the wrong default. Our CTO @ksariola took that case to AgentCon Silicon Valley this week, drawing on our experience of building specialized harnesses for analytical SaaS. https://t.co/kcODAHECm7

2

4

1

3

470K

5

108

12

17

470K

ksariola retweeted

Mitchell Hashimoto

@mitchellh

1 day ago

Fable is a good model. As with all new models, it is simultaneously excellent and entirely unremarkable (relative to other models). It is slow and expensive, and the "loops are all you need" discourse they are pushing is obvious in the context of someone using Fable-class models.

What I've found so far is that for broad scope design (code architecture) tasks, Fable is unremarkable. Or, not better enough to justify its cost and speed. But in highly targeted goal-oriented loops, it is another beast entirely. It is very slow but produces very good results. I let it churn on optimizing a SwiftUI-layout resolver in Go I wrote and it was able to bring it down to an order of magnitude I could not reach myself (micro => nanosecond scale). But it took 2 hours and $40 to do it and I had to claw back some changes it overfit to Apple Silicon. Still, very worth it.

In comparison, for "implement this feature/change" iterative work, I ran head-to-head Fable vs GPT5.5 vs. GLM-5.1. They all produced equally acceptable final results, but GPT5/GLM did it in a couple minutes and Fable was churning away for 40 minutes. And GLM cost me less than a dollar, GPT5.5 ~$1.50, and Fable cost $9.

You can see that in this context, interactively working with an agent is nonsense. It's too slow. You need to write loops to keep the agent working and you probably want to highly parallelize the work being done.

As with all things, I think a balance makes sense... My sense is that I'd reserve Fable for targeted, surgical analysis and work. Not for daily driving everyday tasks. I'm going to keep spending a shitload of money (relatively) and maining Fable for the rest of the week to continue to judge, will report if anything changes. I'll continue to head-to-head as well.

98

3K

175

940

246K

ksariola retweeted

MERICA MEMED

@Mericamemed

4 days ago

Guess I need one of these now

116

4K

338

2K

360K

ksariola retweeted

Flow AI @flowaicom

6 days ago

The speaker lineup for Context is King no. 5 is set. London, next Monday June 8, during @LDNTechWeek, at @atomico's office. Last days to grab a seat: https://t.co/Lp3pxtzIty @TeodoroBaldazzi @prometheuxlabs @jonatanvm @ElevenLabs @DmitrievStan @ksariola

0

5

3

1

119

ksariola retweeted

Olli-Pekka Heinisuo

@skvark

8 days ago

https://t.co/S5uOE9DaAU

1

3

1

510

ksariola retweeted

Aaro Isosaari

@aaroisosaari

8 days ago

Here is how we build agents that continuously learn from customer feedback. We scan agent traces for user-correction patterns: the moments where someone pushed back on what the agent did and explained why. An LLM classifies those signals and drafts a candidate update to the agent's knowledge. That candidate goes into a queue where a human expert reviews it before anything is live. If they approve, the change goes into the semantic data layer and is live for every user under that tenant from the next message onward. @bergr7 covered the full version at Context is King 👇
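The loop described in the tweet above (scan traces, classify corrections, queue a candidate, human review, publish per tenant) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Flow AI's implementation: the `Candidate` and `SemanticLayer` names, the keyword-based `find_corrections` heuristic, and the `draft_rule` callable are all hypothetical. The tweet says an LLM does the classification and drafting; here a plain function stands in for it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical marker phrases. In the described system an LLM classifies
# correction signals; a keyword check stands in for it in this sketch.
CORRECTION_MARKERS = ("no,", "that's wrong", "actually", "should be")


@dataclass
class Candidate:
    """A drafted knowledge update waiting for human review."""
    tenant_id: str
    proposed_rule: str
    approved: bool = False


def find_corrections(trace: List[dict]) -> List[str]:
    """Return user messages that look like pushback on the agent."""
    return [
        msg["text"]
        for msg in trace
        if msg["role"] == "user"
        and any(marker in msg["text"].lower() for marker in CORRECTION_MARKERS)
    ]


def draft_candidates(
    tenant_id: str, trace: List[dict], draft_rule: Callable[[str], str]
) -> List[Candidate]:
    """Turn each detected correction into an unapproved candidate update."""
    return [Candidate(tenant_id, draft_rule(text)) for text in find_corrections(trace)]


class SemanticLayer:
    """Assumed stand-in for the per-tenant semantic data layer."""

    def __init__(self) -> None:
        self.rules: Dict[str, List[str]] = {}

    def publish(self, candidate: Candidate) -> None:
        # Nothing goes live without an explicit reviewer approval.
        if not candidate.approved:
            raise ValueError("candidate has not been approved by a reviewer")
        self.rules.setdefault(candidate.tenant_id, []).append(candidate.proposed_rule)
```

In use, a reviewer would set `approved = True` on a queued candidate before `publish` is called, which keeps the human gate in the only code path that writes to the tenant's rules.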

1

6

3

1

138

ksariola retweeted

Flow AI @flowaicom

14 days ago

A roomful of technical AI builders gathered at The Agentic Night by @silta_hq and @AntlerGlobal in Helsinki last night. Our co-founder @ksariola joined @JernJohan from Realm on stage. Full panel on YouTube. https://t.co/MuwkXqcbM8

3

7

3

1

163

ksariola retweeted

Bernardo García

@bergr7

16 days ago

The most useful debugging skill I've seen teams develop is getting extremely precise about what their agent is doing wrong.

“The agent always calls the search_documents tool with a broad query and then makes 3–5 execute_sql calls as the first steps, unnecessarily increasing latency.” >> “The agent starts with unnecessary exploratory search.”

“The agent fails because the context window gets bloated with large outputs from execute_sql. Adding limits or pagination doesn’t help because the data is indivisible, so the agent keeps trying to retrieve the full set.” >> “The context window is usually exceeded after a couple of execute_sql calls.”

“The agent repeatedly retries failing tool calls with slightly different parameters instead of changing strategy.” >> “The agent gets stuck in local recovery loops.”

At this level of detail, the next two steps become almost mechanical:

1. Decide what the agent should do instead.
2. Encode that behavior as a default in your harness.

Resist the temptation to jump into implementation work before you fully understand the failure mode and root cause.

I dropped a clip below from my Context is King talk where I explain this process for designing specialized harnesses.
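To make "encode that behavior as a default in your harness" concrete, here is one possible sketch covering two of the failure modes quoted above: local recovery loops and context bloat from large `execute_sql` outputs. Everything here is an assumption for illustration (the `Harness` class, its thresholds, and the result-handle scheme); it is not the approach from the talk.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Harness:
    """Hypothetical tool-call wrapper that carries two opinionated defaults."""
    max_failures_per_tool: int = 2   # assumed threshold before forcing a new strategy
    max_inline_rows: int = 50        # assumed cap on rows placed into the context
    failures: Dict[str, int] = field(default_factory=dict)
    stored: Dict[str, List[Any]] = field(default_factory=dict)

    def call(self, tool_name: str, tool: Callable[..., List[Any]], **params: Any) -> dict:
        # Default 1: stop local recovery loops. After repeated failures the
        # harness refuses further retries and tells the agent to change strategy.
        if self.failures.get(tool_name, 0) >= self.max_failures_per_tool:
            return {"status": "blocked",
                    "hint": f"{tool_name} failed repeatedly; change strategy"}
        try:
            rows = tool(**params)
        except Exception as exc:
            self.failures[tool_name] = self.failures.get(tool_name, 0) + 1
            return {"status": "error", "error": str(exc)}
        self.failures[tool_name] = 0

        # Default 2: keep large results out of the context window. The full
        # set is stored under a handle; the agent sees a count and a preview.
        if len(rows) > self.max_inline_rows:
            handle = f"result_{len(self.stored)}"
            self.stored[handle] = rows
            return {"status": "ok", "handle": handle,
                    "row_count": len(rows), "preview": rows[:5]}
        return {"status": "ok", "rows": rows}
```

The point of the sketch is that both behaviors live in the harness rather than in the prompt, so every agent run gets them without having to rediscover them.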

0

4

3

1

114

ksariola retweeted

Bernardo García

@bergr7

20 days ago

Agent framework -> Primitives
Agent harness -> (Optimized) defaults

0

2

1

0

87

ksariola retweeted

Aaro Isosaari

@aaroisosaari

20 days ago

Agent framework or agent harness? Many people use the two words interchangeably. My co-founder @bergr7 explained the difference at Context is King. A framework gives you the primitives, a harness comes with opinions about how the agent should behave. Those opinions are where most of the leverage sits. Upgrading the model is the easy move, but rarely the one that matters most. The biggest gains in our agents came from baking verticalized opinions into the harness: how it plans, what it knows about your data, how it carries large result sets between steps, when it asks for approval before acting.

0

8

2

3

392

ksariola retweeted

Bernardo García

@bergr7

22 days ago

If you're in London, you can't skip this one:
World-class lineup
Real technical depth
Meet people building the same stuff you are
Go register!

0

2

0

229

ksariola retweeted

Aaro Isosaari

@aaroisosaari

22 days ago

We're bringing Context is King to London for the first time on June 8, during London Tech Week, after four sold-out editions in San Francisco and Helsinki. First speakers in from @ElevenLabs, @prometheuxlabs and @motley, with more confirming soon. Hosted at @atomico. Big kudos to @ksariola for driving this edition! Sign up: https://t.co/VGo0cPuPdO

1

10

5

983

ksariola retweeted

Stan Dmitriev @DmitrievStan

22 days ago

👑 Context is King Vol. 5 🗓️ June 8 @ London. Diving beneath the app layer into models, inference & safety. 🛠️ Speakers from Elevenlabs, Motley, Prometheux and more to come. 70 spots! Link below 👇

1

4

2

0

63

Karolus Sariola

@ksariola

23 days ago

Actually allowed to use laptops in Stockholm cafes. Erhm, take note Berlin..

0

26

ksariola retweeted

Flow AI @flowaicom

23 days ago

The agent harness most builders know is a coding harness. Analytical products require specialized harnesses with numerical precision and specific tools. Our co-founder @ksariola opened his AgentCon Silicon Valley talk on exactly this question.

0

3

0

157

Karolus Sariola

@ksariola

24 days ago

@ErikKaum @huggingface Congrats, way to go!!

1

0

244

ksariola retweeted

Erik Kaunismäki

@ErikKaum

24 days ago

Releasing my first kernel on @huggingface: MaxSim Late-interaction retrieval (ColBERT / PyLate) bottlenecks on materializing the full similarity matrix. This kernel avoids it by using tiled scoring with simdgroup_matrix (Metal) and WMMA. Result is 3–5× speedup compared to naive PyTorch. Try it out 👇
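For readers unfamiliar with MaxSim, the late-interaction score used by ColBERT-style retrieval is the sum, over query tokens, of the maximum similarity to any document token. The sketch below is a plain-Python illustration of that definition and of why tiling helps; it is not the released kernel, which per the tweet uses `simdgroup_matrix` (Metal) and WMMA. The function names and the `tile` parameter are assumptions for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_naive(query: List[Vector], doc: List[Vector]) -> float:
    """Materialize the full |query| x |doc| similarity matrix, then reduce."""
    sim = [[dot(q, d) for d in doc] for q in query]
    return sum(max(row) for row in sim)


def maxsim_tiled(query: List[Vector], doc: List[Vector], tile: int = 2) -> float:
    """Process document tokens in tiles, keeping only a running max per
    query token, so the full similarity matrix is never stored."""
    best = [float("-inf")] * len(query)
    for start in range(0, len(doc), tile):
        block = doc[start:start + tile]
        for i, q in enumerate(query):
            for d in block:
                score = dot(q, d)
                if score > best[i]:
                    best[i] = score
    return sum(best)
```

Both functions return the same score; the tiled version only needs one running maximum per query token, which is the property a GPU kernel can exploit to avoid writing the whole matrix to memory.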

17

357

43

194

49K

ksariola retweeted

Nate Berkopec

@nateberkopec

25 days ago

I'm so sick of reading em dashes and "it's not x, it's y." I'm so sick of it, man.

364

5K

272

142

291K

ksariola retweeted

Igor Kotenkov

@stalkermustang

about 1 month ago

While reading the DeepSeek v4 paper, I ended up writing down over 90 questions. A lot of the paper reviews out there skip over the details, which is usually where the actual learning happens.

So, I decided to put together a proper guide: an Annotated Paper Walkthrough. The core idea is that you still read the original paper as your source material, but whenever things get dense or confusing, I hold your hand through it. You get detailed annotations with visualizations, code snippets, reference links, and, most importantly, the context you need so you don't feel lost.

Today I'm releasing v1 with the first 50 notes. Some of the things I unpack:
• Why swap Softmax and Sigmoid for Sqrt-Softplus in the MoE Router?
• What on earth is a Birkhoff polytope?
• Does attention process some tokens 3 times?
• What are split-KV and split-K, and why did DeepSeek drop them?
• Why use Reverse KL, and where does it even come from?
...and a lot more. Even the most demanding readers will find something new here.

Open-source models are still heavily borrowing from DeepSeek v3, and there’s no doubt that v4 details will soon become standard topics in discussions and ML interviews. Hopefully, this guide helps you stay ahead of the curve.

As a friend of mine joked, going through this will not only make you a better engineer, but a better man 😂 I can't prove that scientifically, but it's worth a shot.

Check it out: https://t.co/AJ1kUREInv

10

246

24

220

40K

Karolus Sariola

@ksariola

about 1 month ago

Given that Groq and Cerebras aren't adding new models to their catalogue and are seemingly entering into other deals (like serving Spark), do we assume the consumer category of OSS models with fast inference is dead? Are there new players entering?

0

34

ksariola retweeted

Aaro Isosaari

@aaroisosaari

about 1 month ago

Context is King no. 4 in San Francisco yesterday. A few of the highlights below, full talks on YouTube: https://t.co/9bW176rhhi Next up: London. @bergr7 (@flowaicom), Itai Smith (@trychroma), @Romainsestier (@StackOneHQ), @OtsoVeistera (@thetokenco), @NathanBurg (@GitHits_com), @aiven_io

0

7

6

0

239
