AI & Data Engineering Leader | Building production AI systems, Lakehouse platforms, and enterprise AI adoption | Databricks, LLMs, RAG, PySpark & red-teaming
It all started back in October 2024, sitting in a courtyard having a beer in Cusco, Peru. I sat down with my mate to discuss AI and all things tech.
Off the back of this conversation, and just when I needed it, vibe coding started to really gain traction, and we decided to start building our direct-to-consumer private blood testing company.
I have been a BI consultant, data engineer and ML engineer, and eventually moved into the messy world of corporate management. So while I had most of the skills required to build our idea, it would not have been possible to do it alongside my 9-5 without a lot of investment.
Fast forward to June 25, I had a working product but it was vanilla. Functionally perfect, execution poor with no real brand or feel. We decided to work with a marketing company and the difference it made was enormous. Within no time we had a brand, a vision and a plan to get to market.
And here we are today: the coming-soon page is ready, with only one more bit of development left, then early testing, then time to launch.
With it being such a slog, I’m glad to be at the end, though hopefully it’s the beginning of something.
For all those who say just ship, I tend to agree. I have shipped a couple of apps, rebuilt my wife’s business CRM and open-sourced a few bits, all at the same time.
But don’t feel pressured into releasing everything quickly. Some projects deserve the investment and the journey to get to the goal.
Will it succeed? Who knows… But I bloody hope so 🤞
Not sure though. The bit that keeps bothering me is the API cost everyone keeps quoting. That’s obviously not what they are paying, and we have no idea what the markup is. Compute is expensive, but at scale and high throughput, economies of scale kick in, and I’m sure it’s considerably cheaper than the API cost.
@robinebers To be fair I used workflows and ultracode and hit my 20x weekly allowance. Was hitting the 5hr limit within 1.5 hours so had to add in wait tasks. Using it as a good excuse to give Codex a go
I get that Anthropic are worried about security, but even the simplest task of checking if I was exposed to the TanStack npm supply chain attack is throwing an API error. @AnthropicAI
@levelsio@X I hope it works. Over the last month I got to the point where I stopped reading the replies, and to some extent the platform, as I kept seeing the same cookie-cutter replies. Once you mentioned it, I couldn’t unsee it.
Anyone else getting fed up with the blatant AI replies to anything to do with AI model news from the big accounts? They all follow the same pattern, and it’s making the replies a waste of time to look at.
@beffjezos I’m inclined to get the UI nailed over the next month on my project and then finally give Codex a go instead. Endless API errors, definitely a huge drop in quality. Shame, as I was getting in the groove.
Built an LLM router in pure Rust. No GPU, no PyTorch, no preference data.
1.2M params, trains in 3.4 minutes on a laptop.
89.5% routing accuracy, 58% cost savings vs all-Opus.
The trick: a sparse graph where routing nodes actually talk to each other. No existing router does this.
Paper + code: <https://t.co/d11PCSO2bZ>
DOI: 10.5281/zenodo.19016401
I built an LLM router from scratch in pure Rust with no ML frameworks.
Every existing router (RouteLLM, FrugalGPT, AutoMix) makes one decision: input → classifier → model. Done.
AXIOM does something different — the routing nodes actually communicate with each other.
The sparse computation graph has 4 traversal directions:
→ Forward (Surface → Reasoning → Deep)
↔ Lateral (same-tier nodes try before escalating)
↑ Feedback (Deep nudges Reasoning confidence upward)
⏱ Temporal (past decisions blend into current ones)
No existing router in the literature has this topology.
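To make the four directions above concrete, here is a minimal sketch of how they could fit together. It is not AXIOM's code: the `Node`, `Router`, and `Tier` names, the single `capacity` parameter, the confidence formula, and every threshold and step size are hypothetical, chosen only to illustrate the idea. In particular, modelling Feedback as a small boost to Reasoning confidence after each Deep escalation is one assumed reading of "Deep nudges Reasoning confidence upward".

```rust
// Hypothetical sketch: every name, threshold, and weight below is an
// illustrative assumption, not AXIOM's actual implementation.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Surface,
    Reasoning,
    Deep,
}

/// A routing node reduced to one made-up parameter: the highest query
/// complexity (0.0 to 1.0) it is comfortable handling.
pub struct Node {
    pub capacity: f32,
}

impl Node {
    /// Confidence in [0, 1] that this node can handle the query.
    fn confidence(&self, complexity: f32, boost: f32) -> f32 {
        (self.capacity - complexity + 0.5 + boost).clamp(0.0, 1.0)
    }
}

pub struct Router {
    pub surface: Vec<Node>,
    pub reasoning: Vec<Node>,
    pub threshold: f32,       // minimum confidence needed to accept a query
    pub temporal_alpha: f32,  // weight given to the previous decision's input
    pub feedback_step: f32,   // how much each Deep escalation nudges Reasoning
    pub reasoning_boost: f32, // accumulated feedback from Deep
    pub previous: Option<f32>,
}

impl Router {
    pub fn new(surface: &[f32], reasoning: &[f32]) -> Self {
        Router {
            surface: surface.iter().map(|&capacity| Node { capacity }).collect(),
            reasoning: reasoning.iter().map(|&capacity| Node { capacity }).collect(),
            threshold: 0.5,
            temporal_alpha: 0.2,
            feedback_step: 0.05,
            reasoning_boost: 0.0,
            previous: None,
        }
    }

    pub fn route(&mut self, complexity: f32) -> Tier {
        // Temporal: blend the previous blended input into the current one.
        let blended = match self.previous {
            Some(prev) => (1.0 - self.temporal_alpha) * complexity + self.temporal_alpha * prev,
            None => complexity,
        };
        self.previous = Some(blended);

        let threshold = self.threshold;
        let boost = self.reasoning_boost;

        // Forward + Lateral: every Surface node gets a try before escalating.
        if self.surface.iter().any(|n| n.confidence(blended, 0.0) >= threshold) {
            return Tier::Surface;
        }
        // Same again one tier up, with any accumulated feedback applied.
        if self.reasoning.iter().any(|n| n.confidence(blended, boost) >= threshold) {
            return Tier::Reasoning;
        }
        // Feedback: reaching Deep nudges future Reasoning confidence upward.
        self.reasoning_boost += self.feedback_step;
        Tier::Deep
    }
}
```

With `Router::new(&[0.0, 0.3], &[0.6])`, a query of complexity 0.2 is rejected by the first Surface node but accepted by its same-tier neighbour, so it never escalates. That is the lateral step a single input → classifier → model decision has no place for.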
Results on 200 benchmark queries:
• 100% accuracy on simple queries (zero false escalations)
• 56.6% cost reduction vs routing everything to Opus
• 735μs routing latency — under 1ms, no GPU
• 1.2M parameters, 3.4min training on commodity hardware
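For readers wondering how a figure like "cost reduction vs routing everything to Opus" is typically derived, the arithmetic can be sketched as below. The `cost_reduction` function and the per-tier prices in the usage note are hypothetical placeholders, not real model pricing and not the benchmark's data, so this does not reproduce the number reported above.

```rust
/// Fraction of spend saved by routing, relative to sending every query to
/// the most expensive tier. `counts` and `price` are indexed
/// [Surface, Reasoning, Deep]; `price` is a cost per query in any
/// consistent unit. Hypothetical helper, for illustration only.
pub fn cost_reduction(counts: [u32; 3], price: [f64; 3]) -> f64 {
    let total: u32 = counts.iter().sum();
    // Guard against an empty workload or a meaningless baseline price.
    if total == 0 || price[2] <= 0.0 {
        return 0.0;
    }
    // Cost actually incurred with each query sent to its routed tier.
    let routed: f64 = counts
        .iter()
        .zip(price.iter())
        .map(|(&n, &p)| n as f64 * p)
        .sum();
    // Cost if every query had gone to the top tier instead.
    let baseline = total as f64 * price[2];
    1.0 - routed / baseline
}
```

As a made-up example, 50 Surface, 30 Reasoning and 20 Deep queries at relative prices of 1, 5 and 25 cost 700 units against a 2,500-unit all-Deep baseline, which is a 72% reduction. The result depends heavily on the price ratios assumed between tiers.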
---
Surveyed 75+ routing systems for prior art. None use graph-based multi-directional inter-node communication.
Built this as a solo independent researcher alongside a day job. Happy to answer questions.
The brain doesn’t receive reality. It predicts it, constantly running priors, correcting for error.
Perception itself is predictive.
Dismissing LLMs as “just next-token prediction” ignores that your own cognition runs on a strikingly similar principle.
One evolved in carbon wetware. The other engineered on silicon.
The algorithm doesn’t care about the chemistry.
Prediction is the mechanism. Substrate is just the medium.
Playwright CLI uses 94.3% fewer tokens for the exact same work.
Full report + reproduction steps quoted below.
If you’re running browser automation with MCP servers, you’re probably burning insane tokens on schema tax.
Real benchmark shows @playwright wins
New in Cowork: scheduled tasks.
Claude can now complete recurring tasks at specific times automatically: a morning brief, weekly spreadsheet updates, Friday team presentations.