Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
Token costs will become a dominant topic in enterprises going forward with AI. Just got out of a dinner with many Fortune 500 enterprise CIOs and this was the most heated topic.
A mix of strategies is being employed, but basically no one feels like they have the right solution. The mix includes: assigning workloads to different models by priority, giving out access to better or worse agents by user type, setting different spend caps by team, having teams justify AI by their use case, and in some cases just allowing unfettered access.
Everyone is trying to figure out a semi-predictable model right now in a world where the underlying tech and cost models are constantly evolving.
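As a rough illustration of two of the strategies above (per-team spend caps combined with routing by priority), here is a minimal sketch. The tier names, the prices, and the `route` helper are all hypothetical, made up for this example rather than taken from any vendor's pricing or API.

```python
from dataclasses import dataclass

# Hypothetical price table (USD per 1,000 tokens); real prices vary and change often.
MODEL_PRICE_PER_1K_TOKENS = {"small": 0.001, "large": 0.01}


@dataclass
class TeamBudget:
    cap_usd: float          # spend cap for the team
    spent_usd: float = 0.0  # running total of estimated spend


def route(team, priority, est_tokens, budgets):
    """Pick a model tier for a request, or return None if the team's cap would be exceeded.

    High-priority work prefers the large model and falls back to the small one
    when the large model no longer fits in the remaining budget.
    """
    budget = budgets[team]
    preferred = "large" if priority == "high" else "small"
    for tier in (preferred, "small"):
        cost = est_tokens / 1000 * MODEL_PRICE_PER_1K_TOKENS[tier]
        if budget.spent_usd + cost <= budget.cap_usd:
            budget.spent_usd += cost
            return tier
    return None  # over the cap: reject, queue, or escalate for approval


budgets = {"research": TeamBudget(cap_usd=1.0)}
first = route("research", "high", 50_000, budgets)     # fits on the large tier
second = route("research", "high", 60_000, budgets)    # large no longer fits, falls back
third = route("research", "low", 1_000_000, budgets)   # exceeds the remaining budget
```

Even a toy like this shows why the result is only semi-predictable: the answer depends on token estimates and on a price table that keeps moving.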
When your working life rewards you, it’s easy to ratchet up the complexity: homes, cars, travel, possessions etc.
I have found that all that complexity comes at the expense of your most fleeting asset: your time. Instead of building things, all of a sudden you’re dealing with minutiae and logistics. Instead of talking mostly to engineers, you’re talking mostly to non-engineers. The building stops…the business of managing self-inflicted complexity begins.
It’s worth noting that the best players in the game (Buffett, Elon) have kept their life extremely basic, almost monastic/nomadic, as success ratcheted them ever higher.
I think it’s the biggest secret hiding in plain sight:
When the world upgrades your status, downgrade your complexity.
Getting AI agents working for enterprises is not easy. And this is exactly why there’s so much opportunity right now.
Today, the tech is *just* hard enough to get working right, which means only a relatively small number of teams and companies will make this simple enough for the world to adopt.
So you basically have a cheat code if you’re building AI agents because we know exactly how this will play out.
The winners of the internet brought powerful web services to the masses. The winners of SaaS did the same for infrastructure and software. The same will be true for AI agents and knowledge work.
Architecture shifts at this level only happen every decade or two. And this will likely be the biggest one we’ve ever seen in tech.
Another year of rapid AI advances has created more opportunities than ever for anyone — including those just entering the field — to build software. In fact, many companies just can’t find enough skilled AI talent. Every winter holiday, I spend some time learning and building, and I hope you will too. This helps me sharpen old skills and learn new ones, and it can help you grow your career in tech.
To be skilled at building AI systems, I recommend that you:
- Take AI courses
- Practice building AI systems
- (Optionally) read research papers
Let me share why each of these is important.
I’ve heard some developers advise others to just plunge into building things without worrying about learning. This is bad advice! Unless you’re already surrounded by a community of experienced AI developers, plunging into building without understanding the foundations of AI means you’ll risk reinventing the wheel or — more likely — reinventing the wheel badly!
For example, during interviews with job candidates, I have spoken with developers who reinvented standard RAG document chunking strategies, duplicated existing evaluation techniques for Agentic AI, or ended up with messy LLM context management code. If they had taken a couple of relevant courses, they would have better understood the building blocks that already exist. They could still rebuild these blocks from scratch if they wished, or perhaps even invent something superior to existing solutions, but they would have avoided weeks of unnecessary work. So structured learning is important. Moreover, I find taking courses really fun. Rather than watching Netflix, I prefer watching a course by a knowledgeable AI instructor any day!
At the same time, taking courses alone isn’t enough. There are many lessons that you’ll gain only from hands-on practice. Learning the theory behind how an airplane works is very important to becoming a pilot, but no one has ever learned to be a pilot just by taking courses. At some point, jumping into the pilot's seat is critical! The good news is that, with highly agentic coders, building is easier than it has ever been. And learning about AI building blocks might inspire you with new ideas for things to build. If I’m not feeling inspired about what projects to work on, I will usually either take courses or read research papers, and after doing this for a while, I always end up with many new ideas. Moreover, I find building really fun, and I hope you will too.
Finally, not everyone has to do this, but I find that many of the strongest candidates on the job market today at least occasionally read research papers. While I find research papers much harder to digest than courses, they contain a lot of knowledge that has not yet been translated to easier-to-understand formats. I put this much lower priority than either taking courses or practicing building, but if you have an opportunity to strengthen your ability to read papers, I urge you to do so too. I find taking courses and building to be fun, and reading papers can be more of a grind, but the flashes of insight I get from reading papers are delightful.
Have a wonderful winter holiday and a Happy New Year. In addition to learning and building, I hope you'll spend time with loved ones — that, too, is important!
[Original text: https://t.co/MaWDs0AbzG ]
I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and far between. I have a sense that I could be 10X more powerful if I just properly string together what has become available over the last ~year, and a failure to claim the boost feels decidedly like a skill issue. There's a new programmable layer of abstraction to master (in addition to the usual layers below) involving agents, subagents, their prompts, contexts, memory, modes, permissions, tools, plugins, skills, hooks, MCP, LSP, slash commands, workflows, IDE integrations, and a need to build an all-encompassing mental model for the strengths and pitfalls of fundamentally stochastic, fallible, unintelligible and changing entities suddenly intermingled with what used to be good old fashioned engineering. Clearly some powerful alien tool was handed around, except it comes with no manual and everyone has to figure out how to hold it and operate it, while the resulting magnitude 9 earthquake is rocking the profession. Roll up your sleeves to not fall behind.
“Because as workflows get more automated and more agent driven, the fragility point often has nothing to do with the model and everything to do with whether the agent pulled the right value from the right system at the right time.”
AI agents will actually grow the value of systems of record over time. As we have 1,000X more agents using these systems than humans, the systems that are the traffic cops for what data they can access, what data is the “correct” data, and how these workflows are governed, will be integral to their success.
Pulling the wrong data, or sharing it with the wrong person, can be catastrophic for mission-critical workflows. Humans already do this, but they at least have a lot more context by default. Agents will do this orders of magnitude more often. This is where systems of record + agents necessarily come together.
Conversely, a “well managed” system of record will give agents the ability to actually bring automation to critical workflows in an enterprise. Some software today can’t deliver this and will be thrown out for new tools. There will also be tons of new opportunity for agents that work across various systems to combine the efforts of different agents into a broader workflow.
Everything else you’ve read about the future of software is likely wrong. This piece gets it right.
As a fun Saturday vibe code project and following up on this tweet earlier, I hacked up an **llm-council** web app. It looks exactly like ChatGPT except each user query is 1) dispatched to multiple models on your council using OpenRouter, e.g. currently:
"openai/gpt-5.1",
"google/gemini-3-pro-preview",
"anthropic/claude-sonnet-4.5",
"x-ai/grok-4",
Then 2) all models get to see each other's (anonymized) responses and they review and rank them, and then 3) a "Chairman LLM" gets all of that as context and produces the final response.
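The three stages can be sketched roughly as follows. This is not the code from the app: `run_council`, the prompt wording, and the `ask(model, prompt)` callable are hypothetical stand-ins, and in a real setup `ask` would wrap a chat completion call to a provider such as OpenRouter.

```python
from typing import Callable, Dict, List

# Hypothetical interface: takes (model_name, prompt) and returns the model's text reply.
AskFn = Callable[[str, str], str]


def run_council(query: str, council: List[str], chairman: str, ask: AskFn) -> Dict[str, object]:
    # Stage 1: dispatch the user query to every model on the council.
    answers = {model: ask(model, query) for model in council}

    # Anonymize: reviewers see "Response A", "Response B", ... instead of model names.
    labels = {model: f"Response {chr(ord('A') + i)}" for i, model in enumerate(council)}
    anonymized = "\n\n".join(f"{labels[m]}:\n{answers[m]}" for m in council)

    # Stage 2: every model reviews and ranks all anonymized responses (its own included).
    review_prompt = (
        f"Question:\n{query}\n\n{anonymized}\n\n"
        "Review each response, then rank them from best to worst by label."
    )
    reviews = {model: ask(model, review_prompt) for model in council}

    # Stage 3: the chairman gets the answers and the reviews and writes the final reply.
    all_reviews = "\n\n".join(f"Reviewer {i + 1}:\n{reviews[m]}" for i, m in enumerate(council))
    chairman_prompt = (
        f"Question:\n{query}\n\n{anonymized}\n\nReviews:\n{all_reviews}\n\n"
        "Using all of the above, write the single best final answer."
    )
    final = ask(chairman, chairman_prompt)
    return {"answers": answers, "labels": labels, "reviews": reviews, "final": final}
```

Passing `ask` in as a parameter keeps the data flow separate from the provider, so other council designs (for example, several review rounds) can be tried by changing only this function.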
It's interesting to see the results from multiple models side by side on the same query, and even more amusingly, to read through their evaluation and ranking of each other's responses.
Quite often, the models are surprisingly willing to select another LLM's response as superior to their own, making this an interesting model evaluation strategy more generally. For example, reading book chapters together with my LLM Council today, the models consistently praise GPT 5.1 as the best and most insightful model, and consistently select Claude as the worst model, with the other models floating in between. But I'm not 100% convinced this aligns with my own qualitative assessment. For example, qualitatively I find GPT 5.1 a little too wordy and sprawled and Gemini 3 a bit more condensed and processed. Claude is too terse in this domain.
That said, there's probably a whole design space of the data flow of your LLM council. The construction of LLM ensembles seems under-explored.
I pushed the vibe coded app to https://t.co/EZyOqwXd2k if others would like to play. ty nano banana pro for fun header image for the repo
We disrupted a highly sophisticated AI-led espionage campaign.
The attack targeted large tech companies, financial institutions, chemical manufacturing companies, and government agencies. We assess with high confidence that the threat actor was a Chinese state-sponsored group.
AI agents are getting better at looking at different types of data in businesses to spot patterns and create value. This is making data silos increasingly painful. This is why I increasingly try to select software that lets me control my own data, so I can make it available to my AI agents.
Because of AI’s growing capabilities, the value you can now create from “connecting the dots” between different pieces of data is higher than ever. For example, if an email click is logged in one vendor’s system and a subsequent online purchase is logged in a different one, then it is valuable to build agents that can access both of these data sources to see how they correlate to make better decisions.
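To make the email-click example concrete, here is a small sketch of the kind of join an agent could run once it can read both systems. The record fields (`customer_id`, `time`), the seven-day window, and the `attribute_purchases` helper are assumptions made for illustration, not any vendor's schema.

```python
from datetime import datetime, timedelta


def attribute_purchases(clicks, purchases, window=timedelta(days=7)):
    """Return (click, purchase) pairs where the same customer bought within `window` after a click."""
    # Index clicks by customer so each purchase only scans that customer's clicks.
    clicks_by_customer = {}
    for click in clicks:
        clicks_by_customer.setdefault(click["customer_id"], []).append(click)

    matches = []
    for purchase in purchases:
        for click in clicks_by_customer.get(purchase["customer_id"], []):
            delay = purchase["time"] - click["time"]
            if timedelta(0) <= delay <= window:
                matches.append((click, purchase))
    return matches


# Hypothetical exports from two different vendors' systems.
email_clicks = [
    {"customer_id": "c1", "time": datetime(2025, 1, 1, 9, 0)},
    {"customer_id": "c2", "time": datetime(2025, 1, 2, 9, 0)},
]
store_purchases = [
    {"customer_id": "c1", "time": datetime(2025, 1, 3, 12, 0)},   # 2 days after the click
    {"customer_id": "c2", "time": datetime(2025, 2, 20, 12, 0)},  # outside the window
]
pairs = attribute_purchases(email_clicks, store_purchases)
```

None of this is possible if either vendor makes the underlying records hard to export.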
Unfortunately, many SaaS vendors try to create a data silo in their customer’s business. By making it hard for you to extract your data, they create high switching costs. This also allows them to steer you to buy their AI agent services — sometimes at high expense and/or of low quality — rather than build your own or buy from a different vendor. Unfortunately, some SaaS vendors are seeing AI agents coming for this data and working to make it harder for you (and your AI agents) to efficiently access it.
One of my teams just told me that a SaaS vendor we have been using to store our customer data wants to charge over $20,000 for an API key to get at our data. This high cost — no doubt intentionally designed to make it hard for customers to get their data out — is adding a barrier to implementing agentic workflows that take advantage of that data.
Through AI Aspire (an AI advisory firm), I advise a number of businesses on their AI strategies. When it comes to buying SaaS, I often advise them to try to control their own data (which, sadly, some vendors mightily resist). This way, you can hire a SaaS vendor to record and operate on your data, but ultimately you decide how to route it to the appropriate human or AI system for processing.
Over the past decade, a lot of work has gone into organizing businesses’ structured data. Because AI can now process unstructured data much better than before, the value of organizing your unstructured data (including PDF files, which LandingAI’s Agentic Document Extraction specializes in!) is higher than ever before.
In the era of generative AI, businesses and individuals have important work ahead to organize their data to be AI-ready.
P.S. As an individual, my favorite note-taking app is Obsidian. I am happy to “hire” Obsidian to operate on my notes files. And, all my notes are saved as Markdown files in my file system, and I have built AI agents that read from or write to my Obsidian files. This is a small example of how controlling my own notes data lets me do more with AI agents!
[Original text: https://t.co/1bwB2lBowg ]
Many AI agent problems are really just information retrieval problems. If the agent has a better way to find and comb through data performantly, you will get far better results.
Compute is fungible, so you can use it up front during indexing and processing, or you can use it later to crank through results better. You just decide based on performance and cost tradeoffs.
But the cool thing about AI agents is that you always have “more compute” as a variable in making them perform better. Most forms of knowledge work have never really had this characteristic before.
Without proper governance, an AI agent might autonomously access sensitive data, expose personal information, or modify sensitive records. In our new short course: “Governing AI Agents,” created with @Databricks and taught by Amber Roberts, you’ll design AI agents that handle data safely, securely, and transparently across their entire lifecycle.
You’ll learn to integrate governance into your agent’s workflow by controlling data access, ensuring privacy protection and implementing observability.
Skills you'll gain:
- Understand the four pillars of agent governance: Lifecycle management, risk management, security, and observability
- Define appropriate data permissions for your agent
- Create views or SQL queries that return only the data your agent should access
- Anonymize and mask sensitive data like social security numbers and employee IDs
- Log, evaluate, version, and deploy your agents on Databricks
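As a rough idea of what the view and masking items can look like in practice, here is a minimal sketch that uses Python's built-in `sqlite3` as a stand-in. It is not taken from the course and does not use Databricks; the table, the column names, and the `agent_employees` view are hypothetical.

```python
import sqlite3


def build_agent_view(conn):
    """Create a view that exposes only what the agent should see, with SSNs masked."""
    conn.execute(
        "CREATE TABLE employees (employee_id TEXT, name TEXT, ssn TEXT, department TEXT)"
    )
    conn.execute(
        "INSERT INTO employees VALUES ('E1001', 'Ada', '123-45-6789', 'R&D')"
    )
    # The view leaves out employee_id entirely and keeps only the last four SSN digits.
    conn.execute(
        """
        CREATE VIEW agent_employees AS
        SELECT name,
               department,
               'XXX-XX-' || substr(ssn, -4) AS ssn_masked
        FROM employees
        """
    )


conn = sqlite3.connect(":memory:")
build_agent_view(conn)
# The agent is only ever given queries against the view, never the base table.
rows = conn.execute("SELECT name, department, ssn_masked FROM agent_employees").fetchall()
```

In a real deployment the agent's credentials would also need to be restricted to the view; SQLite has no permission system, so that part is not shown here.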
If you’re building or deploying AI agents, learning how to govern them is key to keeping systems safe and production-ready.
Sign up here: https://t.co/amRhy223T6
Think about something you have been trying to make progress on recently.
What is a completely different line of attack for solving it or a wildly different approach you can explore?