"Fascinating insights from @jaimefjorge and @GeoffreyHuntley! As AI reshapes our field, the distinction between development and engineering becomes crucial. It's not just about writing code—it's about architecting solutions that adapt and evolve. #ExecutionOverExhibition"
The biggest takeaways/nuggets from my interview with @GeoffreyHuntley on AI-native software engineering and the Ralph loop:
1. Software development and software engineering are now two different professions, and one of them is over. Software development, the work of translating tickets into code, can now be done by anyone for $10-42/hour while they sleep. Software engineering, architecture, security, requirements breakdown, understanding failure modes, is where humans still matter. If you identify as a "software developer," you're competing against a bash loop. If you identify as a "software engineer," your job is to orchestrate the loops.
2. The moat you think protects your software product doesn't exist anymore. Geoffrey argues you can clone any SaaS product, even those with BSL licenses or proprietary enterprise code, using AI. He ran Ralph in reverse on HashiCorp Nomad's source code to generate clean-room specifications. When he hit gaps from missing enterprise features, he ran Ralph over their marketing materials and product docs to fill them in. Any company relying on licensing or code secrecy as a competitive moat needs to rethink their strategy.
3. Cursor, Windsurf, and every other AI coding tool are essentially the same thing: a loop that automatically copies and pastes. Geoffrey built these tools professionally and says the harness does almost nothing; the model does all the work. There's no real moat in the harness business when you're reselling tokens. The only differentiator is taste and UX. Stop evaluating tools and start learning the underlying patterns.
4. Ralph is not a product. It's an orchestrator pattern for running thousands of AI loops. The simplest version is a bash loop that deterministically allocates memory, lets the LLM pick one task, executes it, then starts fresh. The key insight: every loop gets a brand new context window. You avoid compaction (where the AI gets dumber as context fills up) by never letting the context window accumulate competing goals. Your institutional knowledge lives in specification files, not in the context window.
5. Specifications are the new source code. Geoffrey's workflow: spend 30 minutes in conversation with AI, drilling into requirements, making engineering decisions, building up specs. Then throw those specs to Ralph and get weeks' worth of work in hours. The specs act as a "pin" that reframes every fresh loop with your domain knowledge. He doesn't hand-write specs. He code-generates them through structured conversation. Prototypes are now free. Refactoring is cheap.
6. The entry-level path into software engineering is closing fast. Geoffrey's company stopped hiring juniors for a year until they figured out how to interview for AI-native skills. There's already a cohort of juniors who've been practicing these techniques for six months. They'll work at a quarter of senior wages and outship them. If you're just picking up these tools today, you're behind. The new interview question: can you explain how to build a coding agent on a whiteboard?
7. Senior engineers who refuse to adapt are in more danger than juniors who embrace it. Geoffrey sees respected engineers taking hardline stances against AI ("it's installing fascism in your codebase"). Meanwhile, leadership teams are discovering Ralph and realizing three people can run the output of an entire org. When commit velocity and product velocity diverge that dramatically between adopters and non-adopters, founders notice. The hard line is coming.
8. AI is an amplifier of operator skill, not a replacement for it. If you're great at security and you get good at AI, you become a weapon. If you're mediocre and you use AI, you're still mediocre, just faster. The skill gap comes from "discoveries": learning the tricks, the loop-backs, the ways to close the automation loop. These techniques don't have standardized language yet. We're inventing the terms for the new computer every day.
9. Open source may no longer make sense for most use cases. Geoffrey, a former prominent open source maintainer whose land was funded by Open Collective, no longer uses open source libraries. His reasoning: every dependency injects a human into the loop. If there's a bug, you open a PR, chase a maintainer, wait. That's not automation. Instead, code-generate what you need. The exception: don't generate cryptography or security-critical code unless you have the domain expertise to verify it.
10. Programming languages now have a tier list based on how well AI agents can work with them. S-tier: Rust, TypeScript (especially with Effect.js), Python with Pydantic. These are source-based with strong type systems that reject invalid generations and work well with ripgrep for code discovery. F-tier: Java and .NET. Their DLL-based dependency systems don't work natively with the search tools AI agents use. The tradeoff with Rust: compilation is slow, so bad generations cost more time.
11. Corporate AI transformation programs are dangerously slow. Three-to-four-year rollouts with coaches and committees won't cut it when three founders in Bali can Ralph your entire product and undercut your pricing by 99%. Smaller teams ship faster. By the time the transformation is done, the market has moved. Geoffrey calls this the "Titanic moment": the boat is full, get the next boat.
12. We have a new computer, and that's why the legends are coming out of retirement. The last 40 years of computing decisions were designed for humans: TTYs, environment variables, slow language evolution to avoid breaking mental models. Now we have robots. What's the bare minimum a robot needs? Geoffrey sees this as the most exciting time in computing. If you're not excited about what you can now build, you haven't truly picked up the new computer yet.
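For anyone who wants to see point 4 in code: here's a rough Python sketch of the loop pattern as described above (fresh context every iteration, knowledge living in spec files). This is not Geoffrey's actual script. `run_agent` is a hypothetical stand-in for whatever CLI or API call invokes your model, and the `*.md` spec folder and the `[ ]` checkbox plan file are assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable


def build_prompt(spec_dir: Path, plan_file: Path) -> str:
    """Rebuild the whole prompt from files on disk.

    Nothing is carried over from earlier iterations, so every loop
    starts with a brand new context made of specs plus the plan.
    """
    specs = "\n\n".join(p.read_text() for p in sorted(spec_dir.glob("*.md")))
    plan = plan_file.read_text()
    return (
        "Specifications:\n" + specs
        + "\n\nPlan:\n" + plan
        + "\n\nPick exactly one unfinished task, implement it, "
        "and mark it done in the plan."
    )


def ralph_loop(
    spec_dir: Path,
    plan_file: Path,
    run_agent: Callable[[str], None],  # hypothetical: invokes the model once
    max_loops: int = 10,
) -> int:
    """Run one agent call per loop until no unchecked tasks remain.

    Assumes the agent edits files (including the plan) as a side effect.
    Returns the number of loops that were executed.
    """
    loops = 0
    while loops < max_loops and "[ ]" in plan_file.read_text():
        prompt = build_prompt(spec_dir, plan_file)  # fresh context each time
        run_agent(prompt)
        loops += 1
    return loops
```

The `max_loops` cap is there so a loop that never ticks off a task can't run (and bill) forever. Progress is tracked only in files, which is what lets each iteration start clean.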
Claude Code Opus 4.5 is a game-changer! 🚀 Transforming ideas into fully functional apps in under an hour is the future of building. Imagine the possibilities when speed meets innovation. What to create next? #ExecutionOverExhibition
GUYS IT'S SO OVER!!! go build COMPANIES w/ claude code opus 4.5 - visually
typeform is a $1b company with +$80,000,000/yr ARR
in <1hr, I've built:
• My own Typeform
• But with Spotify's design
• It has a WORKING BACK-END
• It's got login for users (BetterAuth)
• It's got Stripe + accepts payments... I've got a SaaS
• It's got multi-translation with 1click, auto-translated by AI
This is almost a freaking company... all built by AI, in <1h
Comment "build" + RT -> randomly giving out FREE credits to build
Legacy software, meet your match: in under a week, we transformed a 20-year-old system into a sleek AWS-native app using just Cursor and the original PDF manuals—only the name remains a relic. #InnovationInAction
I've tested and used Claude Agent for 2 weeks now and here's what separates builders from prompt collectors: it handles multi-step workflows you'd normally chain manually, but only if you define the problem first—ask what you're solving, not what it can do.
CEO doing support isn't a PR stunt - it's debugging your product through customer pain. Real insights come from the trenches, not dashboards. AI handles the noise so humans can solve the complex problems that actually matter. This is execution over exhibition.
I just completed my routine 'CEO does Support' rotation. If you got a support ticket reply from 'Martin', it was me 😁
Some cool things I got to help with, pain I got to simmer in, and the answer to the question of whether AI can replace support teams 🧵
How do we keep feeding the AI articles to answer future questions? By leveraging the answers our great agents give to new problems. Atlas spots a ticket that an article could resolve in future, creates the article based on the ticket content, and saves it as a draft in @Kayako; we review for correctness and publish. That's 34s of work, and now our customers who face the issue will get a solution in seconds, and this issue will never need an agent again.
@icanvardar BASIC on a Commodore 64. Taught me that syntax errors are just life's way of saying "try again, but better." The language doesn't matter - solving problems does. Still debugging, just with better tools now.
Practical Implications
- CoT may still help weaker or older models, but at a cost.
- For advanced reasoning models, explicit CoT prompts add little or no value and may reduce reliability.
- For prompt design:
  - Avoid suppressing reasoning (e.g., “just give the answer”).
  - Use CoT sparingly and evaluate trade-offs.
  - Highly customized CoT prompts may yield benefits, but generic “step by step” instructions are limited.
🚨We have a new prompting report:
Prompting a model with Chain of Thought is a common prompt engineering technique, but we find simple Chain-of-Thought prompts don’t help recent frontier LLMs, including reasoning & non-reasoning models, perform any better (but do increase costs)
So someone on Twitter said in the last couple of days that they wanted an app that would take all their images, de-dup them, organise them and make them searchable... something I'd wanted for ages. So as a test of Claude Code I set it to work with some guidance. https://t.co/FrFeygXije Here you go, mystery Twitter user! It organises, identifies duplicates (asks the user to pick), adds metadata and uses a local model to keep costs low (you can use OpenAI too). Took a couple of hours with me checking in on it every once in a while. Not saying it's ready to ship, but it works damn well for software that didn't exist this morning!
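For the curious, the "identifies duplicates" part can be sketched in a few lines of Python. To be clear, this is an assumed, simplest-possible approach (grouping files by a SHA-256 hash of their bytes), not how the linked app actually does it. It only catches byte-identical copies; resized or re-encoded versions of the same photo would need something like perceptual hashing. The function name `find_duplicates` is made up for this example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` whose contents are byte-for-byte identical.

    Returns one list per set of duplicates; files with unique contents
    are left out. Each file is read fully into memory, which is fine for
    a sketch but not ideal for very large files.
    """
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Each returned group is what you'd show the user so they can pick which copy to keep, rather than deleting anything automatically.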
Man, it's warm in Ireland as it is (relative to normal!), and now that I'm sitting beside my laptop running gemma-3-27b-it-GGUF and categorising images, it's gone to a whole other level. Toasted marshmallows, anyone?
CoT reasoning works until it doesn't - and that's exactly when you need robust error handling. Smart builders assume AI will fail at the worst possible moment and code accordingly. Plan for the mirage, not the oasis.
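What "code accordingly" might look like, as a minimal Python sketch: validate every model reply, retry a bounded number of times, and fall back to a safe default when the model keeps failing. Everything here is an assumption for illustration. `call_model` is a hypothetical function wrapping whichever model API you use, `is_valid` is whatever check fits your output, and the `"ESCALATE_TO_HUMAN"` fallback is just a placeholder value.

```python
from typing import Callable


def answer_with_fallback(
    call_model: Callable[[str], str],   # hypothetical wrapper around a model API
    question: str,
    is_valid: Callable[[str], bool],    # caller-supplied check on the reply
    max_attempts: int = 3,
    fallback: str = "ESCALATE_TO_HUMAN",
) -> str:
    """Return the first valid model reply, or `fallback` if none arrives.

    Both raised exceptions and replies that fail validation count as a
    failed attempt, so the caller always gets a usable value back.
    """
    for _ in range(max_attempts):
        try:
            reply = call_model(question)
        except Exception:
            continue  # treat any error from the model call as a failed attempt
        if is_valid(reply):
            return reply
    return fallback
```

The point of the fallback is that the failure path is designed up front: the caller never has to handle a missing or malformed answer, only decide what the fallback value should trigger.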
Something AI highlights is those people who are 'coasting' through work with buzzwords, because when they post their 6 'powerful' NotebookLM features, it's obvious they're not applying them or critiquing their output. AI empowers those who know; it fools those who don't.
AI in customer support isn't just efficiency; it's the ultimate lever for real-time customer rescue. but here's the deal: this only works with rock-solid, RELIABLE AI solutions. no room for error. innovate or fall behind! #CustomerSupport #AIinCS
We're just putting the finishing touches on our AI Incident Manager. They join the call, get updates, communicate with the customer and even offer product and process insights to those on the Incident Call. Question is: do we give them an AI avatar or leave them camera off? As we're leveraging @ephor, they have one at present, and if we use one of our internal people we could end up with them on the call and have an IM inception moment.
While everyone debates model preferences, the real lesson is in execution: Near 100% rollout, transparent communication about tradeoffs, and admitting when you underestimate user needs. Iterate based on actual feedback, not speculation.
Wanted to provide more updates on the GPT-5 rollout and changes we are making heading into the weekend.
1. We for sure underestimated how much some of the things that people like in GPT-4o matter to them, even if GPT-5 performs better in most ways.
2. Users have very different opinions on the relative strength of GPT-4o vs GPT-5 (just the chat model, not the advanced reasoning one). This is a cool thing you can try: https://t.co/O9Nu6LPC7f
3. Long-term, this has reinforced that we really need good ways for different users to customize things (we understand that there isn't one model that works for everyone, and we have been investing in steerability research and launched a research preview of different personalities). For a silly example, some users really, really like emojis, and some never want to see one. Some users really want cold logic and some want warmth and a different kind of emotional intelligence. I am confident we can offer way more customization than we do now while still encouraging healthy use.
4. We are going to focus on finishing the GPT-5 rollout and getting things stable (we are now out to 100% of Pro users, and getting close to 100% of all users) and then we are going to focus on some changes to GPT-5 to make it warmer. Really good per-user customization will take longer.
5. The team is doing heroic work to optimize our systems and find more capacity, but still, we are looking at a severe capacity challenge for next week. We are still deciding what we are going to do, but we will be transparent with our principles. Not everyone will like whatever tradeoffs we end up with, obviously, but at least we will explain how we are making decisions.
Thanks for your patience with us; we will continue to react and improve quickly!
A "good developer" who refuses better tools is like a carpenter who won't use power drills because hammers built the pyramids. Your code quality isn't measured by how hard you make it for yourself - it's measured by how fast you solve actual problems.
I ran into an old coworker from my consulting days.
Still writing code like in the 2000s.
No Cursor, no Windsurf, no AI.
He told me he doesn't need AI, he is a good developer.
What other absurd things have you heard lately? 😂
GPT-5 launches and everyone's asking "what can it do?" Wrong question. Ask "what problem will you solve with it?" The model is just tooling - execution separates builders from prompt collectors.