Back in the builder seat. I have started @useMotley with my cofounders Egor (ex-head of AI at Wise) and Artem. https://t.co/IDati1Iqi3. Lots of discussions around systems of record. We are building the best way to interface with those, starting with reporting automation.
100% agree. We are entering the era of token engineering. The first wave was “use more AI.” The next wave is “use the right token, on the right model, at the right time, with the right cache.”
Routing, defaults, caching, lean context, visibility, all of that becomes the infra layer for making exponential token growth economically sustainable.
My prediction: the next big optimization wave will be background agents.
Not just smart routing across models, but smart routing across time. A lot of agent work does not need to happen synchronously. Code review, evals, refactors, data extraction, doc updates, security scans, research passes, inbox cleanup, CRM enrichment, test generation, migration plans. Many of these tasks can wait 30 minutes, 2 hours, or overnight.
Today we mostly think in fixed token prices: $X per million input tokens, $Y per million output tokens. But compute does not actually have fixed scarcity. GPU capacity has peaks and valleys. Interactive usage spikes during the workday, so background workloads can run when capacity is cheaper.
So I think we move from a fixed token price to a latency-aware token price. Need it now? Pay the realtime price. Can wait an hour? Cheaper. Can wait 24 hours? Much cheaper.
This is already starting with batch APIs, but I think it becomes a much bigger primitive. Agents will decide not only which model to use, but when to run the task.
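A minimal sketch of what "when to run the task" could look like in code. Everything here is an assumption for illustration: the tier names, the wait windows, and the price multipliers are made up and are not real provider pricing.

```python
from dataclasses import dataclass

# Hypothetical tiers: (name, wait in hours the tier may impose, price multiplier vs. realtime).
# The multipliers are invented for illustration only.
TIERS = [
    ("realtime", 0.0, 1.00),
    ("within_hour", 1.0, 0.70),
    ("overnight", 24.0, 0.40),
]


@dataclass
class Task:
    name: str
    tokens: int            # total tokens the task is expected to use
    max_wait_hours: float  # how long the result can wait


def pick_tier(task):
    """Return the cheapest tier whose wait fits the task's latency tolerance."""
    eligible = [tier for tier in TIERS if tier[1] <= task.max_wait_hours]
    return min(eligible, key=lambda tier: tier[2])


def cost_usd(task, realtime_price_per_million):
    """Return (tier name, cost in USD) for the cheapest eligible tier."""
    name, _, multiplier = pick_tier(task)
    return name, task.tokens / 1_000_000 * realtime_price_per_million * multiplier


if __name__ == "__main__":
    for task in [
        Task("chat reply", 5_000, 0.0),
        Task("code review", 2_000_000, 2.0),
        Task("CRM enrichment", 50_000_000, 24.0),
    ]:
        print(task.name, cost_usd(task, realtime_price_per_million=10.0))
```

The only input the agent has to supply beyond the usual model choice is `max_wait_hours`, which is the latency tolerance described above.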
The future AI stack will optimize across model quality, cache state, latency tolerance, GPU capacity, and business value.
The goal is not fewer tokens per se; it's fewer expensive tokens wasted at the wrong time.
If you want to build your own version of Rippling’s Data Cloud, talk to @useMotley. We open sourced our embeddable semantic layer and just tested it successfully against BIRD-INTERACT.
New website out. https://t.co/mZiM8sz2Sy
We launched Rippling Data Cloud today - an all-in-one rebuild of the modern data stack, with AI deeply integrated throughout.
Why would you want an org-and-employee-centric data stack? Well, here’s how I used Rippling Data Cloud to help with token burn and cut AI slop. 1/
https://t.co/2rhp8x3KVn
Thanks for discovering us. If you are trying to ship AI-driven data analytics or reporting, please talk to us – we'd love to share knowledge and learn about your challenges.
@anshnanda foundation model labs will, if they don't already, co-train + RL their models alongside their own agent harnesses. this results in higher reliability than a general-purpose agent harness would be able to achieve on its own with the same model
@LakeAustinBlvd +100, which is why we are building SLayer, an open source semantic and context layer that gives AI agents the tools to query data accurately and consistently over time https://t.co/4wiXhVUv4R
What I find fascinating with Claude Fable 5 is it proves once again that large generalist models will outperform vertical ones.
On ProofBench (graduate-level formal math benchmark in Lean, where a proof either compiles or it doesn't) Fable 5 beat Harmonic's Aristotle, 77% vs 71%.
Aristotle is a system built specifically for formal math + run on its own internal harness, so the generalist beat the specialist on the specialist's home turf.
It's Richard Sutton's "The Bitter Lesson". His whole argument is that across 70 years of machine intelligence research, the methods that win are the general ones that scale with compute, not the ones where we hand-encode human expertise. Building our own knowledge into the system feels good and helps in the short term, but in the long term it always gets overtaken by a bigger model.
You can look at Chess, Go, speech, vision, same story every time. First the specialized model wins, then the general one takes over.
and btw this is the whole premise of AGI. You don't build one model for math, one for code, one for law. you build a single general model that scales with compute and it learns to do everything
Token costs are why there will be no saas apocalypse / good dev tools are cached intelligence for agents!
The popular theory goes: agents can write code, so they'll just rebuild every tool from scratch and hit raw APIs. no more dev tools, no more CLIs, no more software layers. just agents and endpoints!
We just tested this and the data says the opposite. We benchmarked Claude Code and Codex on real Hugging Face Hub tasks (~1,000 graded runs), with two setups: the agent-optimized hf CLI vs the agent hand-rolling curl or SDK calls from scratch.
Hand-rolling burns up to 6x more tokens on multi-step tasks and fails more often (84% task success vs 94% with the CLI).
And that's just dropping one abstraction layer. It would obviously be orders of magnitude more tokens and a dramatically higher failure rate if the agent tried to bypass HF altogether and rebuild model hosting, versioning, and distribution from scratch. Every time an agent re-derives a workflow from raw API calls, you pay for that reasoning in tokens. every single run. a good CLI compresses that entire chain into a few high-level commands the agent can't get wrong.
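A back-of-the-envelope sketch of how those two numbers compound. The 10,000-token baseline is hypothetical, the 6x figure is the upper bound quoted above rather than an average, and the retry model (independent attempts repeated until one succeeds) is an assumption, not something the benchmark measured.

```python
def expected_tokens_per_success(tokens_per_attempt, success_rate):
    """Expected tokens spent until the first success, assuming independent retries."""
    # With independent attempts, the expected number of tries is 1 / success_rate.
    return tokens_per_attempt / success_rate


CLI_TOKENS = 10_000                  # hypothetical tokens per attempt with the CLI
HAND_ROLLED_TOKENS = 6 * CLI_TOKENS  # "up to 6x", treated here as the worst case

cli = expected_tokens_per_success(CLI_TOKENS, 0.94)
hand = expected_tokens_per_success(HAND_ROLLED_TOKENS, 0.84)

print(round(cli), round(hand), round(hand / cli, 2))
# prints: 10638 71429 6.71
```

Under these assumptions the gap per successful task is a bit wider than the raw token gap, because the lower success rate adds retries on top of the longer runs.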
In a world where everyone is complaining tokens are too expensive, abstraction is leverage: thousands of hours of design decisions your agent doesn't have to re-reason about at inference time.
Good tools are cached intelligence for agents!
So no, agents won't rebuild everything from scratch. they'll gravitate to the most token-efficient tools, because that's what their owners pay for. The software that survives won't just be accessible to agents, it will be accurate and cheap for them to drive.
We're seeing it happen with HF, which is becoming the platform for agents to use AI: ~49M requests in just two months, and growing fast!
https://t.co/Y7q6yuxZrZ
Assistant agents will need to connect to their customers' internal databases to help them manage their workflows. Build a framework for each so that they understand it. Semantic layers will play a critical role. We are building SLayer for this: lightweight, embeddable > https://t.co/hYb1xp9iRI
@LexSokolin @gdibner That phase will happen (if it should) in the next 5 years. It will also be silent at first, as new premiums dominate. The magic of insurance. The structure is interesting too: how much of the risk the customers carry themselves is an interesting question.
PICARD: Data, shields up
DATA: Brilliant! Shields can reduce damage we sustain. Not immunity. Not hubris. Just prudence. It's not precaution—it's strategy.
[camera shakes]
WORF: HULL BREACHES ON NINE DECKS
DATA: Here's what happened: you told me to raise shields, and I didn't