LLM Knowledge Bases
Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:
Data ingest:
I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.
IDE:
I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).
Q&A:
Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents and it reads all the important related data fairly easily at this ~small scale.
Output:
Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.
Linting:
I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.
Extra tools:
I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand it off to an LLM via CLI as a tool for larger queries.
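A minimal sketch of what such a naive search engine could look like, assuming the wiki is a directory of .md files. The names `build_index`, `search` and `load_wiki`, and the scoring (summed term frequency), are illustrative assumptions and not the actual tool:

```python
import re
from collections import Counter, defaultdict
from pathlib import Path


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict[str, str]) -> dict[str, Counter]:
    """Map each term to a Counter of {document name: occurrences}."""
    index: dict[str, Counter] = defaultdict(Counter)
    for name, text in docs.items():
        for term in tokenize(text):
            index[term][name] += 1
    return index


def search(index: dict[str, Counter], query: str, k: int = 5) -> list[tuple[str, int]]:
    """Rank documents by summed term frequency over the query terms."""
    scores: Counter = Counter()
    for term in tokenize(query):
        scores.update(index.get(term, Counter()))
    return scores.most_common(k)


def load_wiki(root: str) -> dict[str, str]:
    """Read every .md file under root (hypothetical directory layout)."""
    return {str(p): p.read_text(encoding="utf-8") for p in Path(root).rglob("*.md")}


if __name__ == "__main__":
    docs = {
        "attention.md": "Attention lets a transformer weigh tokens. Attention is cheap.",
        "rag.md": "Retrieval augments a prompt with documents.",
    }
    print(search(build_index(docs), "attention transformer"))
```

Wrapping `search` in a small CLI that prints file paths is probably enough for an agent to use it as a tool, since it can then open the top hits itself.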
Further explorations:
As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.
TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.
A few random notes from claude coding quite a bit last few weeks.
Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. i.e. I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful, especially once you adapt to it, configure it, learn to use it, and wrap your head around what it can and cannot do. This is easily the biggest change to my basic coding workflow in ~2 decades of programming and it happened over the course of a few weeks. I'd expect something similar to be happening to well into double digit percent of engineers out there, while the awareness of it in the general population feels well into low single digit percent.
IDEs/agent swarms/fallibility. Both the "no need for IDE anymore" hype and the "agent swarm" hype are imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might make. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE.md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR everyone has their developing flow, my current is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits.
Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased.
Speedups. It's not clear how to measure the "speedup" of LLM assistance. Certainly I feel net way faster at what I was going to do, but the main effect is that I do a lot more than I was going to do because 1) I can code up all kinds of things that just wouldn't have been worth coding before and 2) I can approach code that I couldn't work on before because of knowledge/skill issue. So certainly it's speedup, but it's possibly a lot more an expansion.
Leverage. LLMs are exceptionally good at looping until they meet specific goals and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria and watch it go. Get it to write tests first and then pass them. Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage.
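To make the "naive first, then optimize while preserving correctness" idea concrete, here is a hedged sketch. The problem (maximum subarray sum) and all function names are chosen purely for illustration. The point is that the slow reference plus a randomized check becomes the success criterion the agent loops against:

```python
import random


def max_subarray_naive(xs: list[int]) -> int:
    """Obviously-correct reference: try every non-empty contiguous slice."""
    return max(
        sum(xs[i:j])
        for i in range(len(xs))
        for j in range(i + 1, len(xs) + 1)
    )


def max_subarray_fast(xs: list[int]) -> int:
    """Optimized version (Kadane's algorithm), the kind of rewrite you would ask for."""
    best = current = xs[0]
    for x in xs[1:]:
        current = max(x, current + x)  # extend the run or start a new one at x
        best = max(best, current)
    return best


def check_equivalence(trials: int = 500, seed: int = 0) -> bool:
    """Success criterion: the fast version agrees with the reference on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-10, 10) for _ in range(rng.randint(1, 12))]
        if max_subarray_naive(xs) != max_subarray_fast(xs):
            return False
    return True
```

The instruction to the agent then becomes declarative: "make `max_subarray_fast` faster, `check_equivalence()` must stay true".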
Fun. I didn't anticipate that with agents programming feels *more* fun because a lot of the fill in the blanks drudgery is removed and what remains is the creative part. I also feel less blocked/stuck (which is not fun) and I experience a lot more courage because there's almost always a way to work hand in hand with it to make some positive progress. I have seen the opposite sentiment from other people too; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.
Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.
Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X/instagram, and generally all digital media. We're also going to see a lot more AI hype productivity theater (is that even possible?), on the side of actual, real improvements.
Questions. A few of the questions on my mind:
- What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows *a lot*.
- Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro).
- What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music?
- How much of society is bottlenecked by digital knowledge work?
TLDR Where does this leave us? LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related fields. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high energy year as the industry metabolizes the new capability.
Clawdbot has me running my life through WhatsApp
check my portfolio, search the web, access my files — all through a chat
this is the agent dream actually working
@openclaw
agent debugging is backwards. most teams add logging after failure. smarter move: require the agent to state assumptions before each decision. then failures become diffs between what it predicted and what was actually true. that's when the logs matter.
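a rough sketch of the idea, assuming each decision is logged with the assumptions the agent stated up front. `Decision` and `diff_assumptions` are hypothetical names, not from any agent framework:

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    """One agent step: the action it chose and what it assumed to be true."""
    action: str
    assumptions: dict[str, object] = field(default_factory=dict)


def diff_assumptions(
    decision: Decision, observed: dict[str, object]
) -> dict[str, tuple[object, object]]:
    """Return {key: (assumed, actual)} for every assumption that did not hold.

    Keys absent from `observed` are reported with an actual value of None.
    """
    mismatches = {}
    for key, assumed in decision.assumptions.items():
        actual = observed.get(key)
        if actual != assumed:
            mismatches[key] = (assumed, actual)
    return mismatches


if __name__ == "__main__":
    step = Decision("retry_upload", {"file_exists": True, "api_version": "v2"})
    print(diff_assumptions(step, {"file_exists": True, "api_version": "v1"}))
```

an empty diff on a failed step suggests the agent never stated the assumption that actually broke, which is useful to know on its own.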
Happy New Year, everyone! 🎉🥂✨
Start of last year we all thought of ChatGPT as a tool we might occasionally use as an alternative to search. Fast-forward just one year and AI has fundamentally changed the way we work, create, learn & even think.
2025 truly was the year of change, genuinely excited about what 2026 is about to unleash. #AI #NewYear2026
Lots of experienced devs are not going to get through “their grief cycle”, they’ll just leave the industry or move into real management.
Coding with AI is not intrinsically enjoyable unless you have an ownership stake in the outputs. It’s not why we got into coding.
I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and between. I have a sense that I could be 10X more powerful if I just properly string together what has become available over the last ~year and a failure to claim the boost feels decidedly like skill issue. There's a new programmable layer of abstraction to master (in addition to the usual layers below) involving agents, subagents, their prompts, contexts, memory, modes, permissions, tools, plugins, skills, hooks, MCP, LSP, slash commands, workflows, IDE integrations, and a need to build an all-encompassing mental model for strengths and pitfalls of fundamentally stochastic, fallible, unintelligible and changing entities suddenly intermingled with what used to be good old fashioned engineering. Clearly some powerful alien tool was handed around except it comes with no manual and everyone has to figure out how to hold it and operate it, while the resulting magnitude 9 earthquake is rocking the profession. Roll up your sleeves to not fall behind.
Most engineering teams screw up messaging because they don’t understand one thing:
Streams hold truth.
Queues do work.
When you get this wrong, your system bleeds.
I’ve seen $10M mistakes from teams who dump everything into queues:
“Just push the order to a queue and process it!”
Most queues delete messages after work is done.
No history. No replay. No audit.
Just pain and guesswork.
On the other side, some teams fall in love with Kafka:
“We’ll stream EVERYTHING!”
Here’s the rule I wish someone told me early:
If the event changes the business → Stream
If the message is an action to perform → Queue
Streams = OrderPlaced, PaymentAuthorized, InventoryReserved
These are immutable facts.
They must be durable, replayable, ordered, and auditable.
Queues = SendEmail, CapturePayment, GenerateInvoice
These tasks exist temporarily.
They matter NOW, not 6 months from now.
Event enters the stream → workers derive jobs → queues execute tasks
Ledger first.
Assembly line second.
Everything else is technical debt disguised as cleverness.
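A toy, in-memory sketch of "ledger first, assembly line second". This is not a real broker; `Stream`, `JOBS_FOR`, `derive_jobs` and `run` are hypothetical names used only to show the shape:

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An immutable business fact, e.g. OrderPlaced."""
    kind: str
    order_id: str


class Stream:
    """Append-only log: events are never deleted, so they can be replayed."""

    def __init__(self) -> None:
        self._log: list[Event] = []

    def append(self, event: Event) -> int:
        self._log.append(event)
        return len(self._log) - 1  # offset of the new event

    def replay(self, from_offset: int = 0) -> list[Event]:
        return self._log[from_offset:]


# Hypothetical mapping from facts to the tasks they trigger.
JOBS_FOR = {
    "OrderPlaced": ["SendEmail", "GenerateInvoice"],
    "PaymentAuthorized": ["CapturePayment"],
}


def derive_jobs(event: Event) -> list[tuple[str, str]]:
    """Turn one fact into zero or more (task, order_id) jobs."""
    return [(job, event.order_id) for job in JOBS_FOR.get(event.kind, [])]


def run(stream: Stream) -> list[tuple[str, str]]:
    """Derive jobs from the log, then drain the queue in FIFO order."""
    queue: deque = deque()
    for event in stream.replay():
        queue.extend(derive_jobs(event))
    executed = []
    while queue:
        executed.append(queue.popleft())  # a job leaves the queue once handled
    return executed
```

After `run` finishes the queue is empty, but `stream.replay()` still returns every event. That asymmetry is the whole point.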
Learn Kafka - for the first and the last time
A very popular system design interview question:
"There is an API that handles payment, manages the cart, sends an email, and logs everything in a DB. All of this takes 12–15 seconds. How do you optimize it?"
The simple and perfect answer:
Use a microservice architecture + a message broker system.
This is where Kafka comes into the picture.
Why Kafka?
Kafka is a distributed streaming platform used to build high-throughput, scalable and fault-tolerant systems. Here, we use Kafka as a pub-sub system to decouple long-running tasks.
Kafka Key Components
1. Broker
A Kafka broker is a server that stores data and handles read/write operations.
2. Topic
Messages are stored in topics. Example topics in our case: payment-success, order-success, email-sent
3. Producer
Writes data into Kafka topics.
4. Consumer
Reads data from Kafka topics.
5. Zookeeper
ZooKeeper was responsible for managing the Kafka cluster - adding or removing brokers, selecting the controller/leader, and maintaining topic configurations.
However, Kafka no longer relies on ZooKeeper; it now uses KRaft for all metadata management.
6. KRaft (replaces ZooKeeper)
Manages brokers, leader elections, metadata, etc.
7. Kafka Cluster
A Kafka cluster = multiple brokers working together for scalability and fault tolerance. (Answer to "How kafka is scalable?")
One controller acts as the leader and the rest are followers; under KRaft the leader is elected by a majority vote among the controller nodes.
How Kafka Optimizes the System
Instead of a single heavy API doing everything for 12–15 seconds, we break it into small microservices and push tasks into Kafka topics.
The flow becomes:
1. Payment Service
API receives the request => handles payment => publishes event to payment-success.
2. Cart Service
Listens to payment-success => updates the cart => publishes to order-success.
3. Email Service
Listens to order-success => sends email => publishes to email-sent.
4. Logging Service
Consumes from all topics => logs data to DB.
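The flow above can be sketched with a toy in-memory pub-sub. To be clear, this is not the Kafka client API: `InMemoryBroker` and `wire_services` are hypothetical stand-ins, and delivery here is synchronous, whereas a real broker delivers asynchronously.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy stand-in for a broker: each topic maps to a list of subscriber callbacks."""

    def __init__(self) -> None:
        self.subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self.subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Synchronous delivery, in subscription order (assumption for this sketch).
        for handler in list(self.subscribers[topic]):
            handler(message)


def wire_services(broker: InMemoryBroker, log: list) -> Handler:
    """Wire the four services from the flow; returns the payment entry point."""
    topics = ("payment-success", "order-success", "email-sent")

    # Logging Service: consumes from all topics.
    for topic in topics:
        broker.subscribe(topic, lambda msg, t=topic: log.append((t, msg)))

    def cart_service(msg: dict) -> None:
        broker.publish("order-success", msg)  # cart updated, order confirmed

    def email_service(msg: dict) -> None:
        broker.publish("email-sent", msg)  # email sent

    broker.subscribe("payment-success", cart_service)
    broker.subscribe("order-success", email_service)

    def handle_payment(request: dict) -> None:
        # Payment Service: handle payment, then publish the event.
        broker.publish("payment-success", {"order_id": request["order_id"]})

    return handle_payment
```

In the real system the API would return right after the first publish, and the cart, email and logging steps would run in their own consumers.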
What really improves?
> API responds instantly (100–200ms).
> All long tasks become asynchronous.
> No service blocks another.
> If one service fails, others keep working.
> The system becomes scalable due to Kafka partitions.
Remember: In distributed systems, you're not just building for speed - you're building for resilience. Kafka gives you both.
Wild.
By far the most complete Claude Skills repo yet 🤯
@Composio’s Awesome-Claude-Skills packs 100s of ready-to-use workflows:
↳ PDF tools, changelog generation
↳ Playwright automation
↳ AWS/CDK tools, MCP builders
... and much more!
Free and open-source.
Repo in 🧵↓
🚨 Anthropic just solved the problem every AI agent engineer’s been screaming about for a year.
Every agent today burns tokens like fuel: every tool call, every definition, every intermediate result gets jammed into context.
Now Anthropic’s introducing the fix: code execution with MCP.
Instead of calling tools directly, agents now write code to call them and it changes everything.
→ 98.7% fewer tokens
→ 10x faster task completion
→ No context overload
→ No data leakage
Think of it like this:
Old agents talk about what to do. New agents just code and do it.
Cloudflare calls it “Code Mode.” Anthropic just made it real.
This is the turning point.
The next generation of AI agents won’t prompt tools; they’ll build with them.
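A hedged sketch of why writing code to call tools can save context. `get_sheet_rows` is a hypothetical tool wrapper returning made-up rows, not a real MCP client call; the only thing being compared is how much text ends up in the model's context:

```python
import json


def get_sheet_rows() -> list[dict]:
    """Hypothetical tool wrapper; a real one would call out to an MCP server."""
    return [
        {"id": i, "status": "pending" if i % 100 == 0 else "done"}
        for i in range(1000)
    ]


def direct_tool_call() -> str:
    """Direct calling: the whole result is serialized into the model's context."""
    return json.dumps(get_sheet_rows())


def code_execution() -> str:
    """Code execution: agent-written code filters first and returns only a summary."""
    pending = [row["id"] for row in get_sheet_rows() if row["status"] == "pending"]
    return json.dumps({"pending_count": len(pending), "first_ids": pending[:3]})


if __name__ == "__main__":
    print(len(direct_tool_call()), "chars vs", len(code_execution()), "chars")
```

The intermediate 1000 rows never reach the model in the second path, only the summary does. The actual savings will depend on the tool and the task.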
There will be a time when you’ll install an SLM on your machine like a JDK and program using prompts; you won’t have to store the code, just the prompts.
https://t.co/tp41Zcdh5l