For Coinbase, the answer is probably not just active-active multi-AZ.
Exchanges are different. Latency, sequencing, and fairness are core to the product.
But the recovery path should not be figured out during the incident.
If one AZ becomes unsafe, the system should already know what to move, what to pause, how to isolate, and how to safely reopen.
Reliability is not only about avoiding outages. It is about making sure the system fails in a way that is boring, bounded, and safe for customers.
Stopped using Opus 4.7 for coding 2w ago.
Here's what I run now:
GPT-5.5 for coding. Opus for reviews - mostly useless now, doesn't pick up much, seems nerfed!?
What changed my mind:
- Codex is faster, I can't even notice context compaction. It's token-efficient too.
- It follows instructions better. Line by line, no creative reinterpretation and surprises.
- It tests its own changes in the browser. Catches things that would've been a second iteration with Opus.
- Less hand-holding overall. I write the spec once, it just goes on and on until it's actually done. And finally no stubs.
Opus 4.7 problems I kept hitting:
- Hallucinates its own work
- Always rushing to end the session
- Gets stuck in loops when debugging
- Reviews don't catch much anymore
If you switch, a few things that help:
- Go TDD. Tests first, then let Codex implement against them. Fewer iterations.
- Ask it to generate a task list, write tasks/handoffs to disk and keep updating them.
- Ask it to port your CLAUDE.md to AGENTS.md. Rework the skills to match how Codex works. Review it.
- Keep the worktree clean. Tell it explicitly "no dead code, no orphan files"
Try it for a week. You'll feel the difference by day 2.
We’ve agreed to a partnership with @SpaceX that will substantially increase our compute capacity.
This, along with our other recent compute deals, means that we’ve been able to increase our usage limits for Claude Code and the Claude API.
Human engineers know when to stop. AI agents don't.
They optimize to finish the task, not for time.
Exit rules > more prompting.
P.S. We automated our QA pipeline using CC skills, about 70% coverage on first run
@brankopetric00 Most OS resolvers prefer IPv6 when both A and AAAA records exist - it's the default in RFC 6724. Your app asks for the API's IP, DNS returns both, the resolver picks IPv6 first. The old VPC can't route it, so the socket fails.
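One possible workaround while the VPC has no IPv6 route is to ask the resolver for IPv4 results only. A minimal sketch using Python's standard `socket.getaddrinfo`; the helper name `resolve_ipv4_only` is made up for illustration, and the literal address is only there so the example runs without DNS:

```python
import socket


def resolve_ipv4_only(host: str, port: int) -> list[tuple[str, int]]:
    """Return only IPv4 socket addresses for host:port (hypothetical helper)."""
    # family=AF_INET restricts the lookup to A records, so AAAA results
    # never reach the application and cannot be picked first.
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        if sockaddr not in addresses:  # drop duplicate entries
            addresses.append(sockaddr)
    return addresses


# A literal IPv4 address resolves locally, no DNS query needed.
print(resolve_ipv4_only("127.0.0.1", 443))
```

This only hides the symptom on the client side. Fixing routing (or dropping the AAAA record) is the cleaner long-term answer.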
Approach A - your web server holds the connection open for the entire upload + processing time. A 500MB file on a slow connection could take minutes. That thread/worker is blocked, can’t serve other requests. 10 users upload simultaneously and your server is choking.
Approach B decouples it. Presigned URL means the upload bytes go straight to S3, your server just generates the URL. Lambda picks up processing async. Backpressure solved coz your web server never touches the heavy work - it stays free to handle normal traffic.
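Rough numbers for Approach A, as a back-of-envelope sketch. The 10 Mbps uplink and the pool of 10 workers are assumptions picked for illustration, not figures from a real setup:

```python
def upload_seconds(size_mb: float, link_mbps: float) -> float:
    """Seconds a worker is held while a file streams through the web server."""
    # Megabytes -> megabits, then divide by link speed in megabits per second.
    return size_mb * 8 / link_mbps


def free_workers(pool_size: int, concurrent_uploads: int) -> int:
    """Workers left for normal traffic while uploads are in flight."""
    return max(pool_size - concurrent_uploads, 0)


held = upload_seconds(size_mb=500, link_mbps=10)  # assumed slow client uplink
print(f"one upload holds a worker for {held:.0f}s (~{held / 60:.1f} min)")
print("free workers with 10 uploads on a pool of 10:", free_workers(10, 10))
```

With a presigned URL the per-upload cost on the web server drops to roughly one short request, so the same pool stays available.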
The tricky part is these two signals can contradict each other. Enthusiasm fading but still have ideas left. Or super curious but genuinely stuck with nothing new to try.
I’ve found the second case is actually fine - take a break, read something unrelated, ideas come back. But low enthusiasm with plenty of options left... that’s the real quit signal. You’re not stuck on the problem, you’ve just stopped caring about solving it.
Teams default to minimum memory thinking it’s cheapest. The math doesn’t work that way. Lambda charges per GB-second - if 8x memory cuts duration by 10x, you’re paying less overall.
The catch is this only applies to CPU-bound work. If your function spends most of its time waiting on an API or db, more memory just means paying more to sit idle.
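The arithmetic, as a quick sketch. The memory sizes and durations are made-up example values, and the flat per-request fee is ignored because it doesn't change with memory:

```python
def gb_seconds(memory_mb: int, duration_ms: int) -> float:
    """Billable compute for one invocation, in GB-seconds."""
    return (memory_mb / 1024) * (duration_ms / 1000)


# Hypothetical CPU-bound function: 8x the memory, 10x shorter run.
small = gb_seconds(memory_mb=128, duration_ms=10_000)
large = gb_seconds(memory_mb=1024, duration_ms=1_000)
print("CPU-bound cost ratio:", large / small)

# Hypothetical I/O-bound function: 8x the memory, same duration.
idle = gb_seconds(memory_mb=1024, duration_ms=10_000)
print("I/O-bound cost ratio:", idle / small)
```

Under these assumptions the CPU-bound case costs 0.8x the baseline, while the I/O-bound case costs 8x for the same wall-clock time.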
I see this surprise ppl regularly. sensitive = true is cosmetic - hides the value in terminal output and plan logs. That’s it.
Terraform writes the actual value to state coz it needs it for diffing on every run.
Encryption at rest protects the file on disk. But if someone has read perms, they see everything plain.
Keep secrets out of Terraform entirely - use AWS Secrets Manager, Vault etc and store secrets there, reference them at runtime. Terraform manages infra, not secrets.
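To see why read access to state is the real exposure, here's a small sketch that scans a state document for an attribute. The JSON is a simplified, hypothetical fragment shaped like a version 4 state file (real state has many more fields), and `find_attribute` is an illustrative helper, not a Terraform tool:

```python
import json

# Simplified, hypothetical state fragment. The resource, username and
# password are invented for this example.
STATE_JSON = """
{
  "version": 4,
  "resources": [
    {
      "type": "aws_db_instance",
      "name": "main",
      "instances": [
        {"attributes": {"username": "admin", "password": "hunter2"}}
      ]
    }
  ]
}
"""


def find_attribute(state: dict, attr: str) -> list[tuple[str, str]]:
    """Return (resource address, value) for every instance that has attr."""
    hits = []
    for resource in state.get("resources", []):
        address = f'{resource["type"]}.{resource["name"]}'
        for instance in resource.get("instances", []):
            value = instance.get("attributes", {}).get(attr)
            if value is not None:
                hits.append((address, value))
    return hits


# Anyone who can read the state can do this, sensitive = true or not.
print(find_attribute(json.loads(STATE_JSON), "password"))
```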
@namyakhann The problem is most designers built careers on execution. The strategic layer was someone else's job
The ones thriving rn were already operating at that level. For everyone else it's not "keep doing what you're doing" - it's a whole skill reset
@marcrandolph The bar for engagement dropped so low that basic thoughtfulness looks exceptional. Half the replies now are obviously AI or one-word reactions.
Actually reading the post before responding is a differentiator.
@peer_rich Also works for hiring. Oversell the role and they quit in 3 months when reality hits. Show the mess upfront and whoever joins is already committed to it
@gregisenberg The ones who survive won't be SaaS companies that added agents. It'll be agent-native companies that never thought in SaaS terms to begin with
Same thing happened with mobile 10-15 yrs ago. The winners weren't desktop apps with mobile versions, it was mobile-first companies
@vitddnv "Fire quickly if needed" is where most founders fail. The decision is usually obvious 3 months before you act on it
Every week you wait, your good people are watching
@sweatystartup The third option most managers pick - keep both and let the high performer carry the low performer's work. Until they burn out and leave anyway