@theo Focus on the quality you're aiming for. Productivity measured only by volume is useless if the output is trash. Sustainability and security matter more.
The real multiplier is workflow.
More paint doesn't make the painting better. Better judgment does.
Be careful when using CDP with Chrome.
CDP is powerful. If you expose your everyday Chrome session, an agent may be able to inspect or control tabs, cookies, localStorage, network traffic, console output, and page state.
Use a dedicated throwaway profile, not your personal one.
Codex's in-app browser is the safer default if you don't need real-profile access.
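A minimal sketch of the throwaway-profile idea, assuming a Chromium-based browser that accepts the standard `--remote-debugging-port` and `--user-data-dir` flags. The binary name `google-chrome` and the helper `build_chrome_command` are placeholders for illustration, not part of any tool mentioned above.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def build_chrome_command(chrome_binary: str, port: int = 9222) -> tuple[list[str], Path]:
    """Build a Chrome command line that enables CDP on a fresh, empty profile."""
    # A new empty directory per run: no personal cookies, logins, or history.
    profile_dir = Path(tempfile.mkdtemp(prefix="cdp-throwaway-"))
    command = [
        chrome_binary,
        f"--remote-debugging-port={port}",  # exposes the CDP endpoint on this port
        f"--user-data-dir={profile_dir}",   # keeps the session out of the everyday profile
        "--no-first-run",
        "--no-default-browser-check",
    ]
    return command, profile_dir


if __name__ == "__main__":
    # "google-chrome" is an assumed binary name; adjust for your system.
    cmd, profile = build_chrome_command("google-chrome")
    print(" ".join(cmd))
```

Pass the returned command to `subprocess.Popen` to launch the browser, and delete the profile directory (for example with `shutil.rmtree`) when the session ends so nothing the agent touched lingers.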
I'm not saying every LLM is malicious.
I'm saying "useful" is not the same as "trustworthy."
A model can behave normally most of the time and still be unsafe under certain conditions, especially when it has tool access, memory, files, network access, credentials, or automation permissions.
The weak spot is not just the prompt. It is the whole runtime around the model.
Treat LLMs like untrusted code until proven otherwise.
Containment, least privilege, logging, approvals, and isolation should not be optional once agents start touching real systems.
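Roughly what least privilege, approvals, and logging could look like in front of an agent's tool calls. This is a sketch under assumptions: `ToolGate`, its parameters, and the tool names are hypothetical and do not come from any specific agent framework.

```python
from __future__ import annotations

import logging
from typing import Any, Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent-gate")


class ToolGate:
    """Hypothetical gate between a model and its tools."""

    def __init__(
        self,
        allowed: set[str],
        needs_approval: set[str],
        approve: Callable[[str, tuple], bool],
    ) -> None:
        self.allowed = allowed                # least privilege: explicit allowlist
        self.needs_approval = needs_approval  # risky tools need a human "yes"
        self.approve = approve                # callback that asks the human

    def call(self, name: str, func: Callable[..., Any], *args: Any) -> Any:
        log.info("tool request: %s %r", name, args)  # log every request
        if name not in self.allowed:
            log.warning("blocked: %s is not on the allowlist", name)
            raise PermissionError(f"tool not allowed: {name}")
        if name in self.needs_approval and not self.approve(name, args):
            log.warning("denied by approver: %s", name)
            raise PermissionError(f"tool call not approved: {name}")
        return func(*args)


if __name__ == "__main__":
    gate = ToolGate(
        allowed={"read_file", "delete_file"},
        needs_approval={"delete_file"},
        approve=lambda name, args: False,  # stand-in for a real human prompt
    )
    print(gate.call("read_file", lambda path: f"contents of {path}", "notes.txt"))
```

The gate only covers approvals and logging. Containment and isolation still have to come from the environment the tools run in, such as a container or a restricted user account.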
Are we being too casual with LLMs?
Everyone is busy plugging models into apps, agents, tools, files, browsers, terminals, APIs, and private data.
But are we treating the model itself as a possible security risk?
Open or closed, a model is still something we did not personally train, inspect, or fully verify.
Once it gets tools, it is no longer "just answering prompts."
The model is the brain. The tools are the hands.
And somehow we are surprised when giving the brain hands becomes a security problem.
@davidsenra @pmarca Agree. But we also need more realistic bootstrapping paths. A lot of capable founders don't lack ideas or courage. They lack runway.
@ThePrimeagen Might be a simple caching thing. I just tell it to read it again, especially after making changes. I'm sure you already did that a bajillion times. But it works for me.
For context, my normal workflow is not "throw vague instructions at an AI and hope it works."
I usually work spec-driven, in bounded phases, with real-time inspection and verification close to the work. Basically: give it a clear target, let it implement a slice, inspect, correct, repeat.
This time, I wanted to test something else. Can Codex in VS Code using GPT-5.5 steer itself toward the goal with more autonomy?
I wanted to see if it could infer intent, analyze the existing system, preserve what mattered, simplify what needed simplifying, plan the rebuild, and implement without me policing every decision.
After 3 fresh spec-driven attempts, it still missed even the basics.
That confirms what I had suspected all along. AI coding agents are powerful, but they are not yet senior architects.
For complex systems, they still need structure, gates, verification, and a human who actually understands the product intent.
Bigger scope is possible. But only with tighter structure, not more autonomy.
Codex failed me.
I use Codex a lot, so this is not an anti-AI rant.
But 2 days ago, I decided to give Codex in VS Code using GPT-5.5 more autonomy on a rebuild of a huge 367-node n8n workflow.
After 3 fresh spec-driven attempts, it still missed the basics.
So I asked Codex to review itself. The answer was more honest than expected.
@LuminousTheReal @thsottiaux i get what you mean. in this scenario, you only get the advantage if you are able to consume as much as you can before the celebratory reset. if you're asleep or busy with something else between the time of your normal reset and the celebratory reset, then better luck next time.
@thsottiaux deym. my limits just refreshed 4 hours ago. i havent even started working yet. there goes my sunday. still, thanks! congrats on the 5M milestone!