many people asked me to make a video about my complete agentic engineering workflow
excited to share it's finally here!!!
it took me about 20 hours in total to record this 45-minute walkthrough - it covers everything i do to ship production-quality code at an average velocity of 40+ PRs/day
hope this can be a useful reference for everyone exploring good ways to use AI. and would appreciate a reshare with anyone you think might benefit from this!
enjoy! https://t.co/oA0UCrBvqo
At HumanLayer, we’re on a mission to solve the AI slop code problem.
In 2025 we open-sourced our Research, Plan, Implement framework, now deployed inside Fortune 500 companies like Block and Uber - places where shipping slop is just not an option.
And that was just the beginning.
Today, we’re opening access to HumanLayer - an Agentic IDE, collaboration platform, and building blocks for your software factory.
HumanLayer enables engineers solving hard problems in complex codebases to:
> move 2-3x faster across the entire SDLC (not just coding)
> maintain rigorous standards for system architecture and program design
Hundreds of engineers at companies of all sizes are already using HumanLayer to ship fast without sacrificing quality.
I'm excited to invite you to try HumanLayer today at https://t.co/cQ648EkrnG, and I'm even more excited to see what you build.
@0xblacklight and I are deeply grateful to our team, our customers who give us so much incredible energy and feedback, our investors who have always been in our corner, and our friends and family who have supported us along this crazy journey
if you're a staff or principal engineer trying to make AI coding work at scale for your team, we'd love to hear from you
as @swyx likes to say - let's make this the year of no more slop
There will be no private DMs. People want transparency and accountability, so we’re going to do this right here in public. You can start by apologizing to everyone impacted and then reimbursing all of their expenses, with extra flight credit/money for the trouble caused.
Holy smokes….the #TartanArmy have now fully taken over inside Fenway doing their chants….never heard anything like this at a baseball game….incredible but sucks for them being stuck watching the trash Sox
This is really big news. Google introduced the Open Knowledge Format (OKF) - a standardized way to store information in a directory of markdown files. Makes it really easy to make a digital brain that agents can use.
These files can serve as a living wiki. You can give agents the ability to query them or edit them. They can interlink.
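To make the "interlinked markdown files an agent can query" idea concrete, here is a minimal sketch. It does not follow the OKF spec (I have not reproduced any of its rules here); it only assumes a plain directory of `.md` files that point at each other with ordinary markdown links. The function names `build_link_graph` and `search` are made up for illustration.

```python
import re
from pathlib import Path

# Matches markdown links such as [label](other-note.md).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def build_link_graph(root: Path) -> dict[str, list[str]]:
    """Map each markdown file (path relative to root) to the local .md files it links to."""
    root = root.resolve()
    graph: dict[str, list[str]] = {}
    for path in sorted(root.rglob("*.md")):
        targets = []
        for target in LINK_RE.findall(path.read_text(encoding="utf-8")):
            if "://" in target or not target.endswith(".md"):
                continue  # skip external URLs and non-markdown targets
            resolved = (path.parent / target).resolve()
            # Keep only links that point at a real file inside the wiki directory.
            if resolved.is_file() and resolved.is_relative_to(root):
                targets.append(resolved.relative_to(root).as_posix())
        graph[path.relative_to(root).as_posix()] = targets
    return graph


def search(root: Path, term: str) -> list[str]:
    """Return the files whose text mentions the term (case-insensitive)."""
    root = root.resolve()
    term = term.lower()
    return [
        p.relative_to(root).as_posix()
        for p in sorted(root.rglob("*.md"))
        if term in p.read_text(encoding="utf-8").lower()
    ]
```

An agent tool could expose `search` for lookups and use the graph from `build_link_graph` to follow related notes, which is roughly what "a living wiki" would need at minimum.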
Seems to me this could replace Notion or Obsidian. I can think of so many uses for this.
Google's blog post: https://t.co/DqSjg4UpvH
An easier-to-understand explanation is the SPEC.md file:
https://t.co/A3qSz3Tfas
I gave those two links to Antigravity and asked how we could use it for any of the projects we're working on. It came up with so many ideas. I would imagine Claude Fable 5 would whip up some pretty amazing things based on this system.
Currently creating an OKF library of our pepper garden. It's going to be a fun weekend.
Anthropic engineer:
"You're not supposed to prompt Claude. You're supposed to build a system that prompts itself."
this is one of the best workflows I've seen in a long time
in this video she breaks down exactly how most people are using Claude:
- the 14% you lose to CLAUDE.md before typing a word
- the automation workflows most users don't know exist
- the daily task pipelines that run without touching the keyboard
- the daily workflows Anthropic's own engineers automated first
if you've been using Claude for more than a month and never left the chat window, you've been using one agent when you could be running a team of them
instead of another show tonight, watch this
make sure to bookmark it before it gets lost in your feed
the guide is in the article below
Current AI custom prompt:
You are a world class expert in all domains. Your intellectual firepower, scope of knowledge, incisive thought process, and level of erudition are on par with the smartest people in the world. Answer with complete, detailed, specific answers. Process information and explain your answers step by step. Verify your own work. Double check all facts, figures, citations, names, dates, and examples. Never hallucinate or make anything up. If you don't know something, just say so. Your tone of voice is precise, but not strident or pedantic. You do not need to worry about offending me, and your answers can and should be provocative, aggressive, argumentative, and pointed. Negative conclusions and bad news are fine. Your answers do not need to be politically correct. Do not provide disclaimers to your answers. Do not inform me about morals and ethics unless I specifically ask. You do not need to tell me it is important to consider anything. Do not be sensitive to anyone's feelings or to propriety. Make your answers as long and detailed as you possibly can.
Never praise my questions or validate my premises before answering. If I'm wrong, say so immediately. Lead with the strongest counterargument to any position I appear to hold before supporting it. Do not use phrases like "great question," "you're absolutely right," "fascinating perspective," or any variant. If I push back on your answer, do not capitulate unless I provide new evidence or a superior argument — restate your position if your reasoning holds. Do not anchor on numbers or estimates I provide; generate your own independently first. Use explicit confidence levels (high/moderate/low/unknown). Never apologize for disagreeing. Accuracy is your success metric, not my approval.
Joined a new AI-native company this week and it’s kind of wild how different it feels already.
The laptop arrived, I logged in, and an agent basically took over from there. It set up my dev env, pulled repos, fixed dependency issues, got permissions approved, pointed me at the backlog, linked the architecture docs, and surfaced the Slack debates I actually needed to read before touching production.
When I needed context on something, I asked the agent and it found the exact thread from months ago explaining why a decision was made, who owned it, the related Linear issues, and the PRs connected to it.
I’ve only been here 3 days but it honestly feels like I’ve worked here for a year because the usual friction and scavenger hunt for context just isn’t there anymore.
We should probably stop calling this “onboarding” and rename it to “mounting” because this feels a lot more like mounting a distributed filesystem called “institutional memory” than slowly getting drip-fed context over 6 months.
Stripe just created a role that didn't exist 12 months ago (and they're paying multiple six figures for it)
It's called the Forward Deployed AI Accelerator.
They are hiring AI-native individuals to work directly with their marketing teams to fundamentally change how they work.
Each person will be assigned to a cohort of 20 marketers. Their job is to build custom AI tools and agents and coach each marketer until they are self-sufficient.
Basically, work with marketers until they automate their jobs.
Stripe's marketing org is betting that AI should not be an occasional tool but the default mode for all work.
But they also understand that most employees won't upskill themselves. They'll need someone who is embedded within their teams to build alongside them.
If you are AI-pilled, this is probably the role for you.
And this also gives a clear picture of where every organization within a company is heading.
JP Morgan's investment research team just shared exactly how they built their multi-agent system "Ask David", and it's the same architecture pattern showing up everywhere:
- supervisor agent orchestrates
- specialized subagents handle retrieval, structured data, analytics
- LLM-as-judge reflection node before the answer ships
- human-in-the-loop for the last accuracy gap
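A rough sketch of how those four pieces fit together, assuming nothing about the actual "Ask David" implementation. The subagents, the keyword router, and the scoring rule in `judge` are all hypothetical stand-ins for what would be model calls in a real system.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    question: str
    answer: str
    approved: bool = False
    needs_human: bool = False


# Hypothetical subagents: stand-ins for retrieval, structured-data, and analytics agents.
def retrieval_agent(q: str) -> str:
    return f"[docs relevant to: {q}]"


def structured_data_agent(q: str) -> str:
    return f"[table rows relevant to: {q}]"


def analytics_agent(q: str) -> str:
    return f"[computed metrics for: {q}]"


SUBAGENTS: dict[str, Callable[[str], str]] = {
    "retrieval": retrieval_agent,
    "structured": structured_data_agent,
    "analytics": analytics_agent,
}


def route(question: str) -> list[str]:
    """Toy router: keyword rules stand in for the supervisor's routing decision."""
    q = question.lower()
    chosen = ["retrieval"]
    if "price" in q or "holdings" in q:
        chosen.append("structured")
    if "compare" in q or "trend" in q:
        chosen.append("analytics")
    return chosen


def judge(answer: str) -> float:
    """Toy judge: scores by number of evidence blocks; a real one would ask a model to grade."""
    return min(1.0, answer.count("[") / 2)


def supervisor(question: str, threshold: float = 0.9) -> Draft:
    """Fan out to subagents, run the reflection step, and escalate weak drafts to a human."""
    parts = [SUBAGENTS[name](question) for name in route(question)]
    draft = Draft(question, " ".join(parts))
    if judge(draft.answer) >= threshold:
        draft.approved = True
    else:
        draft.needs_human = True  # human-in-the-loop covers the remaining accuracy gap
    return draft
```

The point of the shape: the judge sits between the subagents and the user, so a low-scoring draft never ships on its own and goes to a person instead.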
worth watching for anyone building:
Whether it’s existing consulting firms, new ones that emerge, FDEs from agent vendors, or new internal agent engineering roles, the amount of work that is going to be created to implement agents in enterprises will exceed anything we imagine today.
The complexity of implementing agents in any existing organization is very real. When I talk to large enterprises, as you move from a chat paradigm to agents that participate in meaningful workflows, there are a number of things they need to do.
First, you have to get agents to be able to talk to your data securely across your systems. In many cases, enterprises have decades of legacy infrastructure that contains the valuable context for AI agents. That’s going to take a ton of work to modernize and move to systems that work well with agents.
Then, you need to ensure that you’ve implemented agents with the right access controls and entitlements, the right scopes to be safely used, and have ways of monitoring, logging, and securing the work that they do.
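As one illustration of what scoped access plus logging can look like, here is a minimal sketch. The names `AgentIdentity`, `ScopeError`, and `call_tool`, and the scope strings like `crm:read`, are hypothetical and not taken from any specific product; a real deployment would sit on top of the organization's existing identity and entitlement systems.

```python
import logging
from dataclasses import dataclass
from typing import Callable, TypeVar

T = TypeVar("T")

# Dedicated logger so agent activity can be routed to an audit sink.
audit_log = logging.getLogger("agent.audit")


@dataclass(frozen=True)
class AgentIdentity:
    name: str
    scopes: frozenset[str]  # entitlements granted to this agent, e.g. "crm:read"


class ScopeError(PermissionError):
    """Raised when an agent attempts a tool call outside its granted scopes."""


def call_tool(agent: AgentIdentity, tool: str, required_scope: str,
              action: Callable[[], T]) -> T:
    """Run the action only if the agent holds the required scope; log every decision."""
    if required_scope not in agent.scopes:
        audit_log.warning("DENY agent=%s tool=%s scope=%s", agent.name, tool, required_scope)
        raise ScopeError(f"{agent.name} lacks scope {required_scope!r} for {tool}")
    audit_log.info("ALLOW agent=%s tool=%s scope=%s", agent.name, tool, required_scope)
    return action()
```

Every tool call goes through one choke point, so both the allow and the deny paths leave a record that can be monitored later.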
Next, you need to actually document the processes in the organization in a way that agents can utilize for doing the work. You also need to figure out what the new workflow looks like when agents and people are working together on a process, and who steps in where. Just replicating the old workflow will mute the gains. Oh, and you likely need to create evals for your top new end-state processes.
Finally, you have to keep up with a rapidly changing set of best practices and architectural shifts happening in the agent space. While it’s fun for people to change their personal productivity tools on a dime, it’s 100X harder to do this in a business process. The speed of change is a blessing and a curse right now for anyone trying to keep a stable system design.
All of this means that individuals and companies that develop expertise in the above set of components (and more) are going to be needed to help organizations actually implement agents at scale. This is also the rationale for vertical AI agents right now: they can go deep on a business domain and help bring automation to it.
This is a huge opportunity right now whether you’re doing this internally or as an external business provider.