"You are here"
That's how you read the dot in the logo.
We're getting to the steep part now, all of us.
It's better if we ride the curve together.
Come visit me and the team at our first center, Diffusion Silicon Valley. Diffusion Europe and Diffusion India coming soon.
https://t.co/KwE0FvWSzY
Not surprised this didn't yield, but worth a shot for sure. I've had great luck permuting variations within a set of constraints, but looping w/no feedback overnight (RLNHF?) doesn't generally explore where I want it to. Steer additively during the day, and then compact overnight (anabolic/catabolic, wake/dream)
Introducing Roughdraft!
A new open source project designed to make collaboration with agents better.
The idea is to bring commenting and suggested changes to markdown (e.g. plan docs) in a nice interface.
Free, local, etc.
👉 https://t.co/J3YOOpL5ES 👈
the constraints of phone coding are freeing. I’m thankful for large screens and multiple monitors at times, but also thankful for the work @ClaudeDevs and @OpenAIDevs have done to unlock this modality
Sufficiently advanced agentic coding is essentially machine learning: the engineer sets up the optimization goal as well as some constraints on the search space (the spec and its tests), then an optimization process (coding agents) iterates until the goal is reached.
The result is a blackbox model (the generated codebase): an artifact that performs the task, that you deploy without ever inspecting its internal logic, just as we ignore individual weights in a neural network.
This implies that all classic issues encountered in ML will soon become problems for agentic coding: overfitting to the spec, Clever Hans shortcuts that don't generalize outside the tests, data leakage, concept drift, etc.
I would also ask: what will be the Keras of agentic coding? What will be the optimal set of high-level abstractions that allow humans to steer codebase 'training' with minimal cognitive overhead?
We're now helping others build factories. The first exercise is to build an Attractor. I just completed one in C. The code in src/llm really illustrates how small & straightforward a coding agent can be. https://t.co/oYnLXLliB3
A genuinely radical approach to software development with AI, without any human intervention. Even if this approach doesn’t work for many cases, I think we need more leapfrogging visions for how to redo processes with AI: https://t.co/GjkJ31wGOA
See also: https://t.co/2rh7a1MLkG
The power of MoltBot is getting people to update their biases and their world model. It's just the activation energy to cause the bit to flip in people's heads that we're in a takeoff.
All this is happening with ~500 pages of context window and static weights. Moltbook isn't technically more impressive than the coding work we're all doing already, but it's more of a performative art piece that captures people's attentions.
I'm maintaining that the last piece for true AGI is continual learning. The ability to distill context windows into permanent weight updates on the fly is going to cause all this to go absolutely exponential. We're so close.
our job is to illuminate the terrain. when the agent wakes up, it needs the correct path to be obvious. we don't have great control over the terrain, but we do control how it's lit - that's the engineering part of context engineering
I met today with the founder of Starcloud and I realized this is going to be one of the biggest engineering projects of our era. When you look at the tradeoffs, it seems inevitable that all the GPUs are going to live in space.
yes things are changing fast, but also I see companies (even faang) way behind the frontier for no reason.
you are guaranteed to lose if you fall behind.
the no unforced-errors ai leader playbook:
For your team:
- use coding agents. give all engineers their pick of harnesses, models, background agents: Claude code, Cursor, Devin, with closed/open models. Hearing Meta engineers are forced to use Llama 4. Opus 4.5 is the baseline now.
- give your agents tools to ALL dev tooling: Linear, GitHub, Datadog, Sentry, any Internal tooling. If agents are being held back because of lack of context that’s your fault.
- invest in your codebase specific agent docs. stop saying “doesn’t do X well”. If that’s an issue, try better prompting, https://t.co/SOjpn47yxo, linting, and code rules. Tell it how you want things. Every manual edit you make is an opportunity for https://t.co/S1ZvtYQwta improvement
- invest in robust background agent infra - get a full development stack working on VM/sandboxes. yes it’s hard to set up but it will be worth it, your engineers can run multiple in parallel. Code review will be the bottleneck soon.
- figure out security issues. stop being risk averse and do what is needed to unblock access to tools.
in your product:
- always use the latest generation models in your features (move things off of last gen models asap, unless robust evals indicate otherwise). Requires changes every 1-2 weeks - eg: GitHub copilot mobile still offers code review with gpt 4.1 and Sonnet 3.5 @jaredpalmer. You are leaving money on the table by being on Sonnet 4, or gpt 4o
- Use embedding semantic search instead of fuzzy search. Any general embedding model will do better than Levenshtein / fuzzy heuristics.
- leave no form unfilled. use structured outputs and whatever context you have on the user to do a best-effort pre-fill
- allow unstructured inputs on all product surfaces - must accept freeform text and documents. Forms are dead.
- custom finetuning is dead. Stop wasting time on it. Frontier is moving too fast to invest 8 weeks into finetuning. Costs are dropping too quickly for price to matter. Better prompting will take you very far and this will only become more true as instruction following improves
- build evals to make quick model-upgrade decisions. they don’t need to be perfect but at least need to allow you to compare models relative to each other. most decisions become clear on a Pareto cost vs benchmark perf plot
- encourage all engineers to build with ai: build primitives to call models from all code bases / models: structured output, semantic similarity endpoints, sandbox code execution. etc
What else am I missing?
To get the full benefit of AI agents you often need to change your underlying workflows, and keep up with a very fast moving AI space. Because of this, there are at least 2 entirely new categories of business models that will emerge around the software companies that build agents.
1. The services firm that implements AI agents in existing companies.
As enterprises look to deploy AI agents across all forms of work, it’s not possible for every company to figure out how to do this on their own.
Most companies don’t have the IT teams to deliver on this, so there will be entirely new system integrators that emerge to help companies redesign their workflows, implement the tech, drive the change management, and keep the AI agents up to date for the organization.
But what’s super interesting is that because AI agents span almost every single line of business, these will not just be the classic system integrators whose primary focus is on IT systems. The system integrators will have to be domain experts at many different types of job functions, from marketing and legal to healthcare and coding.
2. New agency or firm that forms from the ground up to take advantage of the leverage of agents.
Lots of companies will take too long to transform themselves with AI, so there will be an all new crop of companies that start from scratch the capture the gains. These services firms and agencies will use the technology themselves to offer cheaper, faster, or better quality of service to a broader range of clients than was possible before.
This will be the new law firm that uses AI to change the business model of, marketing agencies that can support high quality campaigns for smaller size companies, engineering shops that can take on bigger project work at a lower cost, and so on.
In all there are going to be lots of new forms of businesses that will emerge as a result of AI agents because of how different working with agents can be.