I write about software that holds up over time.
Not just whether it works today, but whether it stays understandable and trustworthy years later.
Most of the problems I care about aren’t solved by better tools, but by better judgment—tradeoffs, constraints, and decisions that age well.
Some of what I write is technical.
Some of it is personal, including experiences that change how you see time and responsibility.
If you care about clarity over cleverness, you’re probably in the right place.
What I Optimize for Now
What I optimize for now is narrower than what I optimized for a few years ago.
The earlier targets were not careless. They were chosen reasonably and have aged unevenly.
Capability, because it was the easiest thing to demonstrate.
Elegance, because it read well in review.
Speed, because it was the only metric a project tracked before it shipped.
Each of those is a fair thing to want.
None of them turned out to be quite what the system asked of me a few years later.
What changed was not a framework. It was rereading code I had written a few years earlier and noticing, in its current shape, what my earlier preferences had cost.
Three preferences have replaced the earlier ones. None of them sound impressive when stated.
Fewer abstractions.
The abstractions I am most embarrassed by were not careless. They were correct generalizations of the cases I knew at the time. The cases that actually arrived were not those.
Keeping a piece of code concrete is cheap. A premature generalization fits the first call site and constrains the next two. I now copy three times before extracting once. The duplication looks ugly in review and survives change better than the abstraction that would have replaced it.
Fewer irreversible decisions.
A dependency chosen because it was the fastest path to a working prototype. A data model that quietly spread to fifty downstream consumers before anyone wrote it down. A platform decision made for one team's needs that survived three reorganizations and now anchors the operations of teams that did not exist when it was chosen.
None of those were wrong on the day they were made. Each became expensive in ways the original analysis had no way to predict.
The filter I now apply is short. Not "is this the best decision," which is rarely answerable. Not even "is this correct," which is local. Just: if this turns out to be wrong, can the people who inherit it actually undo it.
That filter is uncomfortable. It treats my own decisions as hypotheses I expect to be partly wrong about. It builds for engineers I will not meet, against a context I cannot predict.
It has changed more of my choices than any principle I could name.
Earlier clarity.
One of the most expensive things in a long-lived system is a conversation that did not happen.
Two engineers leave a meeting with different interpretations of the same word. Neither notices. The system is built around both interpretations at once. A year passes. Two parts of the code now operate on incompatible readings of the same term, and the seam between them is invisible until someone tries to change behavior on one side. By then the disagreement is no longer a meeting. It is a migration.
Clarifying early costs one awkward conversation. Clarifying late costs a quarter of work and a slow tax on every engineer who joins after.
I now treat ambiguity that has been allowed to stand as a defect, with the same weight as a bug — and the same triage priority. Not because clarity is a virtue. Because ambiguity compounds without ever appearing in a metric.
The most useful question I have learned to ask in design conversations is not "what should we build."
It is "what would we have to agree on for this to make sense in three years."
If the answer requires a long pause, the design is not ready.
These three do not always agree with each other.
The push for earlier clarity sometimes asks for a definition that the team will be locked into. The pull away from abstractions sometimes postpones a generalization that would have prevented six months of duplicated work. The preference for reversibility sometimes leaves a room with two viable options and no shared commitment.
There is no algorithm here. The work is in judging, in each case, which of the three signals is paying more than it is asking. Treating these as a checklist is itself a small irreversible mistake I have learned to avoid.
The net effect is fewer impressive moments and more uneventful weeks.
I read that ratio as a leading indicator, with a caveat earned the hard way.
Uneventful is not the same as healthy. Some quiet weeks are a system whose failure modes have not yet surfaced. The script that has not broken in three years is sometimes a script the monitoring was never asked to watch.
The prior is still useful. It is not proof.
I want to be careful with the posture this can take.
It would be easy to tell the earlier version of me that he was wrong to optimize for capability, elegance, and speed. He was not wrong. He did not have access to the years that would have shown him a different set of weights.
What I have now is not a principle I figured out. It is what is left after a long series of small surprises, each of which would have sounded like exaggerated caution if it had arrived as advice instead of as a bill.
A decade from now this version of the answer will look incomplete to whoever is reading it back. That is the expected case, not the failure mode.
The shape underneath all of this is one shift.
There is the system I am building. And there is the system that will exist after me.
They are not the same system.
One only has to ship. The other has to be maintained — by engineers I will not meet, on a stack I cannot predict, against priorities that have moved.
From inside the first system, work on the second looks slow. From a sprint horizon, it is indistinguishable from caution. It does not produce demos. It rarely produces credit.
Fewer abstractions. Fewer irreversible decisions. Earlier clarity.
None of these were available to me as advice. They had to arrive as residue.
That is the part I cannot shortcut for anyone reading this.
Systems That Surprised You
The most useful audit of a long-lived system is the one where you were wrong about what would last.
Not wrong about a feature. Not wrong about a deadline. Wrong about durability.
That is the property that is hardest to evaluate at the moment of choice and most expensive to misjudge.
Every system I have stayed with long enough has eventually surprised me.
Something I expected to break held.
Something I expected to last quietly stopped working in a way I did not notice for months.
The lesson was rarely about the part itself.
It was about the assumption underneath it.
There is always a piece of the system that should not have survived this long.
An old script. A library someone wrote in a week. A schema that predates the team. A queue choice made over lunch.
On any clean evaluation, those pieces should have been replaced years ago.
They were not. They quietly kept working through changes that should have broken them.
What those parts share is rarely cleverness.
It is narrow scope.
They did one thing. They had few callers. They made no assumptions about the rest of the system, and the rest of the system made few assumptions about them.
The components that surprised me by lasting were almost always the ones that refused to grow.
Not impressive in isolation. Impressive in retention.
This is not protection from every failure mode. Narrow scope can carry its own buried assumptions — date formats, encodings, integer widths — that one day get tested by a condition the original author never imagined.
A small component is not a safe one. It is one whose failure modes have a smaller surface to find.
Both can be true.
The opposite surprise is more common.
A system stops working correctly, but not in a way that triggers alarms.
Numbers drift. A scheduled job has been failing silently for weeks. A cache has been serving stale data since a deploy nobody remembers.
These are not the failures the team rehearses for.
The team rehearses for the loud ones — the outage, the spike, the 5xx wall.
The quiet failures are the ones the system was never instrumented to notice.
What is exposed in those moments is not a missing alert.
It is a missing belief.
The team did not believe this particular thing could fail. So no one wrote the check. So no one wired the alarm. So the failure mode lived in plain sight for as long as the assumption held.
Every quiet failure mode I have found in a long-lived system mapped back to an assumption nobody articulated.
The fix was rarely the alarm. It was the conversation that surfaced the assumption.
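Once the conversation has surfaced an assumption, one way to keep it visible is to write it down as a check. What follows is a minimal sketch, not a recommendation of any particular tool. The names `FreshnessRule` and `check_freshness`, the job name, and the 26-hour limit are all hypothetical, chosen only to illustrate the belief "the nightly job always runs."

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class FreshnessRule:
    """A written-down assumption: `name` succeeds at least every `max_age`."""
    name: str
    max_age: timedelta


def check_freshness(rule: FreshnessRule,
                    last_success: Optional[datetime],
                    now: datetime) -> Optional[str]:
    """Return a human-readable violation, or None if the assumption holds."""
    if last_success is None:
        return f"{rule.name}: no successful run has ever been recorded"
    age = now - last_success
    if age > rule.max_age:
        return f"{rule.name}: last success was {age} ago, limit is {rule.max_age}"
    return None


# Hypothetical example: a nightly export that is assumed to run daily.
rule = FreshnessRule(name="nightly-export", max_age=timedelta(hours=26))
now = datetime(2030, 1, 10, 12, 0, tzinfo=timezone.utc)
stale = check_freshness(rule, now - timedelta(days=21), now)
fresh = check_freshness(rule, now - timedelta(hours=3), now)
```

The check itself is trivial. The useful part is that the rule has a name and a number, so someone can disagree with it.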
Time is the cheapest test the system has.
It runs continuously. It does not stop. It exercises every condition that was implicit in the original design — including the ones nobody wrote down.
A system in production for five years has been audited by reality more thoroughly than any review process could manage.
What surprises me in those audits is not which assumptions failed.
It is which assumptions turned out to have been load-bearing in the first place.
Assumptions about traffic volume that quietly shaped the data model.
Assumptions about team size that determined how operations were structured.
Assumptions about deployment cadence that fixed the release pipeline.
Assumptions about the trustworthiness of upstream services that defined the error paths.
Most of these were never debated.
They were simply the conditions of the moment.
They became architecture without ever being decided.
The first few times a system surprised me, I read the surprises individually.
Eventually a pattern emerged.
Things that turned out to be more durable than expected almost always had less surface area than I remembered.
Things that turned out to be more fragile than expected almost always rested on an assumption I had stopped seeing.
This is not a satisfying lesson.
It would be more useful to discover that certain technologies, certain patterns, or certain disciplines reliably produced durable systems.
They did not.
What produced durability was usually that something stayed small.
What produced fragility was usually that something stayed unexamined.
Both are uncomfortable because they resist optimization.
You cannot decide to keep a system small from the outside. That decision has to survive years of pressure.
You cannot list the assumptions you have stopped seeing. By definition they are invisible to you.
The best available technique is to assume some are there and to invite people who did not build the system to point at them.
A new maintainer is a useful instrument for this. They have not yet learned which assumptions to overlook.
Two adjustments have followed from this for me. Both modest.
I treat any component still in production that I would have replaced years ago as evidence of something worth preserving.
Not the technology. The conditions that kept it small.
Before replacing it, I ask whether the replacement will preserve those conditions or quietly remove them.
I treat anything that has not failed for a long time with mild suspicion.
Especially anything that processes data, retries on errors, or handles silent edge cases.
The absence of visible failure is not the same as correctness.
It often just means the monitoring matches the same assumptions the code does.
Neither adjustment is dramatic. Neither shows up in a roadmap. Both have changed more decisions than I expected.
Long-lived systems do not reward rules.
They reward the habit of asking, regularly: which parts of this should not still be working — and why are they. Which parts have not failed recently — and what would it look like if they were failing in ways my instruments do not show.
Surprise is information.
It is the system telling you which parts of your understanding have not been updated in a while.
That is worth listening to.
Even when the news is that the script nobody bothered to maintain outlasted the framework you spent two quarters choosing.
Decisions That Aged Badly
The decisions I most regret were not careless ones.
They were defensible at the time. Reviewed. Discussed. Approved. Made by capable people with the information they had.
The cost arrived years later. Often after the people involved had moved on. Including, in some cases, me.
That gap is the whole problem.
A wrong decision is easy to point to. You can name the missing data, the ignored warning, the sloppy reasoning. You can fairly assign blame. Sometimes you should.
A decision that aged badly is harder to find fault with. The reasoning was correct given what was known. The context simply moved past it.
What was once a reasonable trade became a load-bearing constraint nobody chose.
These two failure modes feel similar in retrospect. They are not the same thing.
Most engineering hindsight collapses them. Either everything in the past was a mistake — which is unfair and useless. Or nothing was — which prevents any learning at all.
The honest middle is harder to hold.
A decision can be defensible at the time and still age badly. The useful posture toward this category is regret without blame.
Regret, because the system would be better if it had gone differently.
Without blame, because the people who made it were paying attention. They simply could not see the ten years that came after.
There is a real risk in this posture.
"Regret without blame" can become evasion. It can become the language a team uses to avoid examining actual misjudgments — analyses that were wrong on their own terms, warnings that were available and ignored.
The discipline is to keep the categories separate. Some decisions were wrong. Some aged badly. Some were both.
Treating every bad outcome as a victim of changed context is its own failure of judgment.
A few patterns recur, looking back.
Premature abstractions.
The shape of the problem was not yet clear. Someone wrote a generic interface for the cases they could anticipate. The cases that actually arrived were not those. The abstraction now obscures more than it reveals. But it is depended on widely, so it stays.
Optionality preserved just in case.
A configuration flag. A plugin point. A strategy interface for one strategy. Each was nearly free at the moment it was added. Cumulatively they form the bulk of the system's surface area. Removing them is now a quarter of work.
Tooling chosen for early productivity.
The framework that made the first six months fast. The build system that fit the original architecture. None of these were wrong. All of them eventually became things future maintainers had to work around rather than with.
Coupling between things that did not need to know each other.
The shared database two services briefly used. The library that grew to import from the application. The deploy pipeline that assumed one team's release cadence. Each coupling was a small saving. Each is now a constraint.
Best practices imported without context.
The pattern that worked at a previous company. The architecture style that fit a different problem at a different scale. The decision was anchored in authority rather than fit.
Decisions made under deadline that became permanent.
The temporary table. The hardcoded value. The "we will fix this next sprint." There is no next sprint. The shortcut becomes the shape.
The shared property of all of these is not effort or skill.
It is reversibility.
Each one reduced the cost of the present by spending the future's optionality. Each made the system easier to write and harder to change.
The asymmetry was invisible because the future maintainers were not in the room.
Reversibility is itself contextual.
What is reversible for a small team can be effectively permanent for a large one, because the cost of coordination scales faster than the cost of the change.
A two-line config swap in a startup is a multi-quarter migration in a public company.
The honest test is not "could this be undone in principle." It is "is the team that would have to undo it likely to be able to."
This is why the strongest single filter on long-lived decisions is not "is this correct."
It is "is this reversible by the people who will inherit it."
A reversible decision can be wrong and still survive. The team adjusts as understanding improves.
An irreversible decision must be defended against every change in context that comes later. Usually by people who do not know why it was made.
Most decisions that aged badly were locally optimal and globally irreversible.
That is the shape to watch for.
Looking back, the change I would make is not a particular technical choice.
It is to commit later and lighter. To treat decisions as hypotheses. To assume I will be partly wrong, and to build so that being wrong is survivable.
The decisions that aged best in the systems I have known were rarely the cleverest. They were the smallest commitments that addressed the actual problem.
Smaller surfaces.
Fewer dependencies.
Shorter assumptions.
Things that could be undone in an afternoon.
That is not a guarantee against aging. Nothing is.
It is the best available defense.
Regret without blame ends in the same lesson.
The judgment is not "I should have known."
It is "I should have made it cheaper to be wrong."
That posture is the one I try to carry into the next decision.
Stewardship Begins Where Ownership Ends
Most teams talk about ownership as if it were the highest form of responsibility.
For long-lived systems, it is not.
Ownership matters.
But stewardship matters more.
Ownership is about who carries the system now.
Stewardship is about whether the system can be carried by someone else later.
That distinction sounds semantic until a handoff goes badly.
The original engineer leaves.
The service still works, mostly.
But every safe change depends on oral history.
You have to know why one alert is always ignored.
Why a table can never be backfilled.
Why a retry loop exists in one place but not another.
Why a dependency no one likes still cannot be removed.
At that point, the problem is not merely missing documentation.
The problem is that the system was built around retained memory.
A lot of systems look healthy only because the same people keep compensating for their opacity.
That is an important failure mode.
Teams often mistake continuous proximity for maintainability.
If the people who built the system are still nearby, many design debts stay hidden.
They answer the same questions in chat.
They approve risky changes based on instinct.
They remember which edge cases are real and which ones are leftovers from an old migration.
The system appears manageable.
What is actually happening is that human continuity is masking structural fragility.
Stewardship starts where that illusion ends.
It asks a harder question than "who owns this?"
It asks:
could a competent person, under time pressure, make a safe change here without private access to the original authors?
If the answer is no, the system may be functioning, but it is not well stewarded.
This is why handoff friction is one of the best signals in software.
When a new engineer takes weeks to understand a service, teams often diagnose that as a training problem. Sometimes it is. More often it is evidence that too much essential knowledge is stored outside the system.
Handoff friction means knowledge is living in the wrong place.
Some of that knowledge belongs in docs.
Some belongs in run books.
Some belongs in tests.
Some belongs in the shape of the code itself.
A strange name that only makes sense if you remember 2024 is a stewardship problem.
An undocumented recovery path is a stewardship problem.
A service boundary that exists only because "that is just how we do it" is a stewardship problem.
The issue is not polish.
The issue is continuity.
This is also why "the code is the documentation" is usually an evasion.
Code can tell you what executes.
It usually cannot tell you:
- which constraints are intentional
- which trade-offs were temporary
- which failure modes are acceptable
- which simplifications must not be undone
Those are design facts.
If they live only in memory, then the design is incomplete.
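A small sketch of what moving one design fact out of memory might look like. Everything in it is hypothetical: the limit, the reason given for it, and the helper function are invented for illustration, not drawn from a real system.

```python
# Hypothetical example: the limit, its rationale, and its status are invented.

# Intentional constraint, not a leftover: in this made-up scenario, batches
# above this size exceed a downstream request limit.
# Status: permanent until the downstream limit changes. Do not "optimize" away.
MAX_BATCH_SIZE = 500


def split_into_batches(items: list, size: int = MAX_BATCH_SIZE) -> list:
    """Split items into consecutive batches of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


def test_batches_never_exceed_limit() -> None:
    """Pins the invariant so that removing it fails loudly, with a reason."""
    batches = split_into_batches(list(range(1201)))
    assert all(len(b) <= MAX_BATCH_SIZE for b in batches), (
        "MAX_BATCH_SIZE is an intentional constraint; see the comment above it"
    )
```

The comment says the constraint is intentional and when it may be revisited. The test makes sure that whoever removes it at least has to read the reason first.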
Documentation is not separate from architecture in a long-lived system.
It is part of the interface between the current team and the next one.
The same is true for operational clarity.
A deploy process that depends on someone "just knowing the order" is not a reliable process.
A rollback that has never been written down is not really a rollback plan.
An incident response path that works only when a veteran is online is not resilience. It is dependency disguised as experience.
This is where the language of stewardship becomes useful.
Ownership can accidentally encourage possessiveness.
My service.
My area.
My judgment.
Stewardship points in a different direction.
This system existed before me in some form.
It will probably exist after me in another.
My job is not only to change it.
My job is to leave it in a state another person can safely inherit.
That changes what "done" means.
A task is not done just because the code ships.
It is not done if the new behavior introduced another unexplained exception.
It is not done if the migration state only makes sense to the person who performed it.
It is not done if the next maintainer has to reverse-engineer whether a weird branch is load-bearing or vestigial.
The feature may be complete.
The stewardship may still be poor.
That does not mean every change needs heavy process.
It means ordinary engineering work should include continuity work.
Good names are continuity work.
Removing dead paths is continuity work.
Writing down why a limit exists is continuity work.
Leaving behind a short run book before the first incident is continuity work.
Declining a clever shortcut because it will not survive turnover is continuity work.
None of this is glamorous.
Much of it is barely visible when deadlines are near.
That is part of why teams underinvest in it.
The cost is paid now.
The benefit arrives later.
Often for people who were not in the room when the original decision was made.
But that is true of most sound engineering.
The systems that age well are rarely the ones with the smartest original builders.
They are usually the ones whose builders accepted a quieter obligation:
to move critical knowledge out of private memory and into forms that survive absence.
I think a useful standard is this:
build for the next honest reader.
Not the ideal maintainer.
Not the teammate who sat through the architecture discussion.
Not yourself on a good day with the whole context loaded.
The next honest reader is capable, busy, and missing history.
They are trying to make a safe change while carrying other responsibilities.
They do not need your cleverness.
They need your system to tell the truth.
That standard eliminates a surprising amount of bad design.
It exposes hidden invariants.
It makes historical naming feel embarrassing.
It reveals how many "temporary" states were never made legible.
It puts pressure on every undocumented dependency between code and tribal knowledge.
Most importantly, it counters one of the oldest problems in mature teams:
familiarity masquerading as clarity.
After enough time with a system, teams stop seeing what is obscure about it.
Everything feels obvious because the missing context is carried socially.
The next honest reader has no such protection.
They are the real test.
If they cannot safely understand the system, the system is carrying stewardship debt.
That debt behaves differently from code debt.
It does not always break production immediately.
Sometimes it breaks succession.
Sometimes it breaks staffing flexibility.
Sometimes it breaks incident response.
Sometimes it just makes every future change slower and more frightening than it should be.
That is still a system failure.
Systems outlive certainty.
The confidence present during implementation fades quickly.
What remains are the traces:
- names
- constraints
- tests
- run books
- design notes
- visible boundaries
Those traces are what future maintainers actually inherit.
That is why stewardship is not softer than ownership.
It is stricter.
Ownership says: I can carry this.
Stewardship says: this should remain carriable when I am gone.
What You Refuse to Do Is Architecture
A system without explicit constraints is not flexible. It is undecided.
Most architecture conversations are about what a system can do. The harder, more useful conversation is about what it won't.
When a team can't say what its system refuses to do, it has not designed a system. It has accumulated one.
Complexity rarely arrives through bad code. It arrives through deferred choice.
Every undecided question becomes a private decision made by whoever shows up next. Multiplied over months, that is how systems become unreadable — not from any single mistake, but from the absence of agreed limits.
Constraints reduce the surface area of disagreement.
"We don't run background jobs in this service."
"We never call the database directly from controllers."
"We don't accept new dependencies without a documented reason."
None of these are restrictions on capability. They are restrictions on ambiguity.
Capability without constraint is a request to renegotiate every decision twice — once when the code is written, again when someone tries to understand it.
Optionality is sold as freedom. It behaves as debt.
Every option you preserve is something a teammate must consider, document, test, and eventually maintain.
When the original author moves on — and they will — the option remains. No one can quite say whether it is load-bearing or vestigial. So nobody touches it.
A team can lose a year to options nobody chose to use.
Explicit constraints calm teams.
This is the part most people underestimate.
A large share of engineering energy goes not to writing code but to adjudicating what is acceptable. The smaller the allowed set, the less energy is spent there.
A clear constraint says: don't think about this. Don't relitigate this. Spend your judgment elsewhere.
Teams operating inside well-defined bounds tend to ship more, argue less, and produce systems that age predictably. Not because their engineers are better. Because they have fewer open questions per square meter.
The most useful constraints are the ones chosen voluntarily — before they are forced.
A budget for latency.
A ceiling on service count.
A rule that no service owns more than one database.
A list of dependencies that require written justification.
These are not best practices. They are local agreements with future-you.
When a constraint is explicit, the trade-off becomes visible. You can see what you give up. You can argue against it on its actual terms.
When it is implicit — "we just don't tend to do that here" — it operates as folklore. Folklore decays with turnover. Documents survive turnover better.
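An explicit constraint can also be made checkable. The sketch below assumes a hypothetical rule, "every dependency needs a written reason," and hypothetical names (`JUSTIFIED`, `unjustified_dependencies`). The package names are placeholders.

```python
from typing import Dict, List

# Hypothetical registry: dependency name -> the written reason for having it.
JUSTIFIED: Dict[str, str] = {
    "http-client": "Only supported way to call the billing API.",
    "date-parser": "",  # an empty reason does not count as a justification
}


def unjustified_dependencies(declared: List[str],
                             justified: Dict[str, str]) -> List[str]:
    """Return declared dependencies that lack a non-empty written reason."""
    return sorted(
        name for name in set(declared)
        if not justified.get(name, "").strip()
    )


# Placeholder names standing in for whatever the project actually declares.
declared = ["http-client", "date-parser", "example-lib-a"]
violations = unjustified_dependencies(declared, JUSTIFIED)
```

Run in a build step, a check like this turns the agreement into something that fails visibly when it is broken, instead of something a newcomer has to be told.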
Constraints also clarify trade-offs.
Without a stated limit, a team optimizes for a moving target. With one, the trade-off is legible: we accept slower iteration here because we want fewer surprises later. We accept higher memory use because we want simpler code paths. We accept fewer features because we want smaller error surfaces.
The legibility is the point.
A system designed under explicit constraints often looks smaller, slower to start, and disappointing on launch day.
It also tends to look reasonable five years later.
That is the trade — and it is the trade most teams refuse to take, because the cost is paid up front and the benefit accrues to people who may not be in the room.
This is also why constraints tend to arrive under duress, retrofitted onto systems that have already outgrown comprehension. By then they feel punitive instead of clarifying.
It is much cheaper to start with constraints than to retrofit them.
Removing a constraint takes an afternoon.
Reintroducing one after a hundred files have quietly violated it takes a quarter.
This asymmetry deserves attention.
When in doubt: start narrower than feels natural.
You can always widen.
You will rarely need to.
Constraints are not the opposite of design. They are the part of design that holds.
The systems that age well are not the most powerful or the most general.
They are the ones whose authors were willing, early on, to write down what the system would not be.
That decision — to refuse, in writing, on the record — is the part that carries.
AI in Long-Lived Systems
Every AI integration is a bet that you can monitor something you do not fully control.
The conversation about AI in software systems is mostly about capability. What it can do. What it will do next.
The harder question is what happens when you depend on it. Not for a demo. Not for a prototype. For a system that needs to work reliably for years, maintained by people who did not build it, under conditions that have shifted since the original design.
AI introduces a category of behavior that most engineering practices were not designed for: probabilistic output embedded in deterministic expectations. A function that returns a different result for the same input is not a function in the way most systems assume.
When that behavior is buried inside a pipeline — making decisions, filtering data, generating content, routing requests — the system becomes harder to reason about in ways that do not show up in testing.
This is not a reason to avoid AI. It is a reason to treat it differently than other dependencies.
The first problem is observability.
Most systems log what happened. AI components require logging enough context to reconstruct why. A model that rejects a transaction, summarizes a document, or classifies a support ticket is making a judgment. If you cannot inspect that judgment after the fact, you cannot debug it, audit it, or explain it to the person it affected.
Traditional observability assumes deterministic behavior. You trace a request, you see the path, you understand the outcome. With AI, the path includes a decision that may not be reproducible. The same input tomorrow might produce a different output.
Logging the input and output is necessary but not sufficient. You need the model version, the prompt template version, the context window — logged alongside each call, not reconstructed after the fact. This tooling tends to arrive late.
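A minimal sketch of what such a per-call record could contain. The `AICallRecord` type, its field names, and the example values are assumptions for illustration. They do not correspond to any particular provider's API.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AICallRecord:
    """Everything needed to ask 'why' later, captured at call time."""
    timestamp: str
    model_version: str            # whatever version identifier is available
    prompt_template_version: str
    context: str                  # the full context actually sent to the model
    input_text: str
    output_text: str


def record_call(model_version: str, prompt_template_version: str,
                context: str, input_text: str, output_text: str) -> str:
    """Serialize one call as a JSON line, suitable for an append-only log."""
    record = AICallRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        model_version=model_version,
        prompt_template_version=prompt_template_version,
        context=context,
        input_text=input_text,
        output_text=output_text,
    )
    return json.dumps(asdict(record), sort_keys=True)


# Placeholder values; a real system would pass what it actually sent and received.
line = record_call("model-2030-01", "ticket-triage-v7",
                   "You classify support tickets.", "My invoice is wrong.",
                   "billing")
```

The record is built in the same function that would make the call, so the versions cannot drift from what was actually used.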
The second problem is irreversibility.
AI is increasingly used to make routing, filtering, and approval decisions that are difficult or impossible to undo. Approving or denying applications. Prioritizing work. Filtering what a user sees. Generating communications sent to real people.
Each of these has downstream consequences that compound. A wrongly filtered email is not just a missed message — it is a missed message that influenced a decision that influenced a timeline no one can trace back to the filter.
The systems that handle this well treat AI decisions as proposals, not conclusions. A model suggests; a human confirms. A model classifies; a review queue catches edge cases. A model generates; a validation layer checks constraints.
This is slower. It is also more durable.
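The "proposal, not conclusion" idea can be sketched as a small state machine. This is one possible shape under stated assumptions, not a prescribed design: the `Proposal` type, the `Status` values, and the example identifiers are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PROPOSED = "proposed"
    CONFIRMED = "confirmed"
    REJECTED = "rejected"


@dataclass
class Proposal:
    """A model output that has no effect until a reviewer resolves it."""
    subject_id: str
    suggested_action: str
    status: Status = Status.PROPOSED
    reviewer: Optional[str] = None


def resolve(proposal: Proposal, reviewer: str, approve: bool) -> Proposal:
    """Record a human decision. Only proposals still pending can be resolved."""
    if proposal.status is not Status.PROPOSED:
        raise ValueError("proposal has already been resolved")
    proposal.status = Status.CONFIRMED if approve else Status.REJECTED
    proposal.reviewer = reviewer
    return proposal


def may_execute(proposal: Proposal) -> bool:
    """Downstream code acts only on confirmed proposals."""
    return proposal.status is Status.CONFIRMED


# Hypothetical flow: the model suggests, a person confirms.
p = Proposal(subject_id="application-123", suggested_action="deny")
before = may_execute(p)
resolve(p, reviewer="reviewer@example.com", approve=True)
after = may_execute(p)
```

Nothing downstream can act on the model's suggestion alone, and every confirmed action carries the name of the person who confirmed it.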
The third problem is containment.
AI capabilities are general-purpose by nature, which makes them easy to spread across a system. A team integrates a language model for one use case. Another team sees it working and adopts it for a different use case. The model becomes a shared dependency with no clear owner, no consistent evaluation criteria, and no unified understanding of its failure modes.
This is tool sprawl, but with a dependency that changes behavior when the provider updates it.
Containment means treating AI as a bounded component with explicit interfaces. What goes in. What comes out. What the acceptable range of behavior is. What happens when the behavior falls outside that range.
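Those four questions can be answered in a few lines of code. The sketch below is illustrative only. The label set, the fallback value, and the `classify_ticket` wrapper are assumptions, and the model is passed in as a plain callable so that no specific provider is implied.

```python
from typing import Callable

ALLOWED_LABELS = {"billing", "technical", "account"}  # the acceptable range
FALLBACK_LABEL = "needs_human_review"                 # what happens outside it


def classify_ticket(text: str, model: Callable[[str], str]) -> str:
    """Bounded wrapper: text in, one of a fixed set of labels out."""
    try:
        label = model(text).strip().lower()
    except Exception:
        # A failing model is treated like any out-of-range answer.
        return FALLBACK_LABEL
    return label if label in ALLOWED_LABELS else FALLBACK_LABEL


# Stand-in models; a real one would call whatever provider the team uses.
in_range = classify_ticket("Invoice is wrong", lambda _: " Billing ")
out_of_range = classify_ticket("Hello", lambda _: "I think this is probably spam")
```

Callers only ever see `ALLOWED_LABELS` or the fallback. Swapping the model, or removing it, does not change the interface the rest of the system depends on.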
The pattern that holds up is the same one that holds up for most long-lived system decisions: make it observable, make it reversible, make it replaceable.
These constraints feel like they slow things down. They do. That is the point.
The speed of integration is not the bottleneck in long-lived systems. The bottleneck is the speed of understanding — how quickly someone unfamiliar with the system can figure out what it does and why.
AI does not make systems fragile by itself. Systems become fragile when AI is embedded without the same discipline applied to every other critical component: clear boundaries, monitoring that captures why and not just what, and the assumption that it will eventually be wrong in a way no one predicted.
The question is not whether to use AI. It is whether you are building the infrastructure to live with it.
Tooling and Accidental Complexity
No one decides to make a system heavy. It happens one reasonable choice at a time.
A team adds a caching layer because latency matters. A monitoring tool because visibility matters. A build plugin because developer experience matters. Each addition solves a real problem. Each is justified on its own terms.
The weight is not in any single decision. It is in the aggregate.
This is accidental complexity — not complexity born from the problem, but complexity accumulated from the solutions. The system grows heavier than the problem requires, and the weight arrives so gradually that no one notices until a change that should take a day takes a week.
Stack drift is the quiet version. The stack you run today is not the stack you chose. It is the stack you chose plus everything added to keep it working, plus the things those additions required, plus the compatibility layers holding the seams together. Nobody designed the full picture. It emerged.
The asymmetry is what makes this hard. Adding a tool is a local decision. One team evaluates it, one team adopts it. Removing a tool is a global decision. It requires understanding every system that touches it, every pipeline that depends on it, every configuration that references it.
Adding takes a week. Removing takes a quarter. So things accumulate.
Configuration is part of the weight, and it gets the least scrutiny. A YAML file that controls deployment behavior is code. An environment variable that changes runtime behavior is code. A feature flag that alters control flow is code. But none of these go through the same review process as code. They accumulate in places that are harder to search, harder to test, harder to reason about.
Hidden dependencies are the most expensive form. A service that calls another through an intermediary. A build step that depends on global state set by a previous step. A test suite that passes only because an unrelated process is running.
These dependencies are invisible in architecture diagrams. They show up at 2 AM when someone is trying to understand why something that has not changed is suddenly broken.
The diagram shows five services. The reality is five services, three shared libraries, a custom build plugin, two configuration management tools, an internal CLI that nobody remembers writing, and a Makefile that references a Docker image tagged "latest" from two years ago.
That is the actual system.
The instinct when this becomes painful is to consolidate. Replace five tools with one platform. Standardize everything. This sometimes works. But it often introduces a different kind of weight — the weight of the platform itself, which becomes the new thing that is hard to change.
The better response is not consolidation but audit. What is here? Why was it added? Is it still solving the problem it was introduced for? What would it take to remove it?
Every tool has a carrying cost — in maintenance, in cognitive load, in the time it takes a new team member to understand the system. That cost is never on the invoice.
The systems that stay manageable are not the ones that avoid adding tools. They are the ones that treat removal as a normal part of operations. They notice when something stops earning its cost. They budget time for subtraction, not just addition.
Lightness is not the goal. Intentionality is. A heavy system where every component is justified and understood is better than a light system held together by assumptions.
Systems grow heavier than intended because adding is easy and removing is hard. The discipline is not in choosing fewer tools. It is in continuing to ask whether the tools you have are still the right ones.
Abstractions That Age Poorly
Good abstractions compress complexity. Bad abstractions relocate it.
The best ones disappear. You stop noticing them. The worst ones announce themselves on every change, every new requirement exposing another edge case the abstraction was not designed for.
Most abstractions that age poorly were not bad ideas. They were good ideas applied too early, before the problem was understood well enough to know what should be hidden and what should remain visible.
Early abstraction is seductive. You see a pattern forming — two or three things that look similar — and the instinct is to unify them. Extract a base class. Create a shared interface. Build a generic handler. The duplication disappears. It feels like progress.
The cost arrives later. The things that looked similar turn out to be similar only at the surface. As requirements diverge, the shared abstraction accumulates conditionals. It grows parameters. It develops modes. The generic handler now has seven configuration options, four of which exist to accommodate a single consumer.
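A small, invented illustration of that drift. The notification functions and their flags are hypothetical; the shape is what matters.

```python
def notify_generic(recipient, message, *, channel="email",
                   uppercase=False, truncate_to=None, legacy_prefix=False):
    """One entry point for every consumer; each flag exists for a single caller."""
    text = message
    if legacy_prefix:             # only one older consumer needs this
        text = "[LEGACY] " + text
    if uppercase:                 # only email alerts need this
        text = text.upper()
    if truncate_to is not None:   # only SMS needs this
        text = text[:truncate_to]
    return f"{channel}:{recipient}:{text}"


# The concrete alternative: one small function per use case.
def notify_email_alert(recipient, message):
    return f"email:{recipient}:{message.upper()}"


def notify_sms(recipient, message, limit=160):
    return f"sms:{recipient}:{message[:limit]}"
```

To change how SMS behaves in the generic version, you have to read every branch and convince yourself the order of the flags does not matter. In the concrete version you read one function.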
This is the cleverness tax. The initial abstraction was elegant. Maintaining it is not. Every modification requires understanding the full surface area of the abstraction, not just the part you are changing. New team members cannot reason about one use case without understanding all of them.
Local elegance creates this problem more often than carelessness does. A developer writes a beautiful generic solution. Within its own scope, it is correct. But zoom out, and the system now has a coupling point that resists change. Two features that should evolve independently are bound together by a shared interface that neither fully fits.
The failure is not in the code. It is in the assumption that the things being unified will continue to evolve together. That assumption is rarely tested when the abstraction is introduced. It is tested later, when one use case needs to move in a direction the abstraction does not support.
I have watched this happen with event systems. A team builds a generic event bus early in a project's life. Every domain event flows through the same pipeline. It works well for the first year. Then one domain needs strict ordering because payment reconciliation depends on event sequence. Another needs transactional boundaries. A third needs fire-and-forget because latency matters more than delivery guarantees.
The event bus was not a mistake. It was a reasonable choice given what was known.
The mistake was not revisiting that choice when the domains diverged. The abstraction became structurally necessary before anyone noticed.
Abstractions that age poorly share a few traits. They unify things that are similar now but will diverge later. They hide differences that turn out to matter. They are introduced before the problem space is stable. And they become harder to remove the longer they exist.
The alternative is not to avoid abstraction. It is to delay it. Write the duplication. Let the pattern prove itself over three or four instances, not two. Wait until you understand not just what is similar, but why it is similar.
Duplication is cheaper than the wrong abstraction. Duplicated code can be changed independently. A bad abstraction forces coordinated change. Duplicated code is obvious. A bad abstraction hides its cost behind a clean interface that no longer reflects reality.
The abstractions that last are introduced after the problem was understood, not before. They are smaller than you would expect. They hide less than you would think necessary. They leave room for the cases that have not appeared yet — not by being generic, but by being narrow enough that they do not get in the way.
Simplicity is not the absence of abstraction. It is the presence of the right ones, introduced at the right time.
Refactoring as a Signal
The code that most often needs refactoring is rarely the worst code. It is the code sitting on top of an assumption that no longer holds.
Refactoring is framed as improvement. That framing is not wrong, but it hides the more interesting question: why does the code need restructuring in the first place?
Most refactoring conversations start with the code. The function is too long. The module has too many responsibilities. These describe the code. They do not explain why it drifted. The deeper question is: what changed? What shift in requirements or understanding created the misalignment you are now trying to fix?
When you skip that question, refactoring becomes cosmetic. You improve the shape of the code without addressing the structural mismatch. Six months later, the same area needs restructuring again. Not because the first refactor was bad, but because it treated the symptom.
This is the pattern of repeated cleanup. A module gets refactored, stabilizes briefly, then drifts. Each pass feels productive. Each pass addresses real issues. But the cycle continues because the underlying cause — a shifting domain boundary, an unclear ownership model, a feature that outgrew its scope — was never confronted.
Repeated cleanup is not careless engineering. It is a sign that the system's organizational shape no longer matches its purpose.
Consider a payments module designed as a simple pass-through: validate, forward, log. Over two years it becomes the system's primary orchestration layer. It gets refactored three times, each time to accommodate a responsibility it was never meant to own. Each refactor improves the code. None questions whether the module should still exist in this form.
Ambiguity makes refactoring attractive. The team faces a question it cannot cleanly answer — whether a feature belongs in this service at all, how to scope a shifting requirement. Refactoring offers a productive alternative. You spend a week improving the codebase and the code is genuinely better afterward. But the hard question remains untouched. Refactoring has clear inputs and measurable outputs. The hard question does not. Given a choice between visible progress and uncertain exploration, most teams choose progress.
The signal worth paying attention to is not "this code is messy." It is "this code keeps becoming messy." The first is a maintenance task. The second is a design problem.
Structural misalignment has a specific shape. The code is organized around one model of the domain, but actual usage has shifted to another. Features that were once separate now overlap. Boundaries that made sense at one scale create friction at the current one.
The fix is not reorganizing code within the existing structure. It is questioning the structure itself. That requires understanding the forces acting on the code. Why do these two modules keep changing together? Why does every new feature touch the same three files?
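The first of those questions can at least be measured. A rough sketch, assuming you already have the list of files touched by each commit (the output of `git log --name-only` is one possible source); the function name and sample paths are made up.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each pair of files appears in the same commit.

    `commits` is an iterable of file-path collections, one per commit.
    """
    pairs = Counter()
    for files in commits:
        # Sort so that (a, b) and (b, a) are counted as the same pair.
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs


history = [
    ["billing.py", "orders.py"],
    ["billing.py", "orders.py", "api.py"],
    ["api.py"],
]
top_pair, times = co_change_counts(history).most_common(1)[0]
```

A high count does not say the boundary is wrong. It says where to start the conversation.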
These questions do not have clean answers. They require conversations about product direction, team boundaries, and debt that accumulated from decisions that made sense when the team was smaller and the product was simpler.
The most useful thing a refactoring can produce is not cleaner code. It is a clearer understanding of why the code looked the way it did. If you finish a refactor and can articulate what assumption was wrong, you have learned something that will inform the decisions that follow. If you can only say the code is better now, you improved one file and learned nothing about the system.
Refactoring is maintenance when it addresses known issues. It is a signal when it keeps happening in the same places. And it is avoidance when it substitutes for the harder work of confronting what actually needs to change.
Defaults Nobody Revisits
Best practices are borrowed conclusions. They compress someone else's context into a rule you can follow without thinking. That is their value, and that is their cost.
The rule says keep functions short. So you split a coherent operation into five pieces, each named for what it does but not why it exists. The code is clean. It follows the linter. It also resists change — because changing the behavior now means understanding the relationship between fragments that were never designed to be separate. This is what blind rule-following produces. Not bad code. Compliant code.
Every best practice originated in a specific environment. Detach it from those constraints, and it becomes a default nobody revisits. The problem is not that it is wrong. It is incomplete. It encodes the what without the why. And when the why changes — a new team, a different scale, a shifted business model — the what keeps running on inertia.
A team adopts clean architecture and applies it uniformly. Simple CRUD endpoints get use-case classes, repository interfaces, domain models, and DTOs — a five-layer ceremony around a single database insert.
The architecture is consistent. It is also expensive. The cost is not visible in any single file. It is distributed across the system as friction. Friction does not trigger alerts. It shows up as slower velocity, longer onboarding, a growing sense that something is heavier than it needs to be.
Senior engineers are not the ones who know more best practices. They are the ones who know when to stop following them.
But judgment has its own failure mode. Without shared constraints, it becomes preference.
The override gets made not because the context demands it, but because someone has the authority to make it. That is not judgment. It is taste mistaken for reasoning. The difference matters: judgment accounts for the team that inherits the decision. Taste accounts only for the person making it.
What is missing is not better rules or more individual judgment. It is treating the override as a first-class event — visible, recorded, available to the next person who faces the same decision.
Best practices are not the problem. Following them without asking whether they still apply is.
Flexibility in software feels free early on. It is not.
A team builds a module for one case. Someone asks: what about a second case? So the module gets generalized. An interface extracted. A strategy pattern introduced.
Six months later, there is still one implementation. But the abstraction remains. New developers study it. They hesitate to change it. It looks load-bearing.
The flexibility meant to reduce future cost is now increasing present cost — through indirection that serves no one.
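Here is roughly what that looks like six months in, using an invented export example. Every class and function name below is hypothetical.

```python
from abc import ABC, abstractmethod


# The generalized version: an interface, a strategy, a context object.
class ExportStrategy(ABC):
    @abstractmethod
    def export(self, rows):
        """Turn rows into a string."""


class CsvExportStrategy(ExportStrategy):   # still the only implementation
    def export(self, rows):
        return "\n".join(",".join(str(v) for v in row) for row in rows)


class Exporter:
    def __init__(self, strategy: ExportStrategy):
        self.strategy = strategy

    def run(self, rows):
        return self.strategy.export(rows)


# The concrete version: the same behavior, with nothing to study first.
def export_csv(rows):
    return "\n".join(",".join(str(v) for v in row) for row in rows)
```

Both produce the same output. Only one of them makes a new developer wonder where the other strategies are.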
Premature optimization is widely recognized as a risk. Premature flexibility is not. It passes code review. It gets praised as forward-thinking.
But every unused extension point is a question mark in the codebase. Every config flag no one toggles still has to be tested. Every generic interface with one implementation constrains without earning its keep.
Optionality is not free. It defers the cost of a decision. It does not eliminate it.
The better question is not "can this handle future cases?" but "does this make the current case clear?"
A clear system is easier to change later than a vague one that tries to handle everything now.
Build for what you know. Let the future problem arrive before you solve it.
The systems that age well are not the ones that anticipated everything. They are the ones that were easy to change when the unanticipated thing showed up.
Time as a Design Constraint
Most systems are designed against requirements. Features, deadlines, throughput targets. These are real inputs, but they share an omission: they describe the system at one point in time.
Treating time as a design input changes what you notice — and what you defer.
Software ages whether you plan for it or not. The team rotates. The domain shifts. Libraries update or stop updating. The person who understood the naming convention leaves, and no one asks why things are named that way. They just keep going.
None of this is failure. It is the default behavior of systems under time.
There is a common pattern in engineering decisions: something works well today, so it ships. The trade-off is deferred, not avoided. You know the naming is inconsistent. You know the module boundary is wrong. You know the config layer is doing too much. But it works, and there is pressure to move forward.
This is not laziness. It is rational under short-term constraints.
The problem is that short-term rationality compounds. Each small deferral is reasonable on its own. Together, they form a system no one fully understands — not because it is complex, but because its structure no longer reflects its intent.
Consider a config layer that parses YAML, validates environment variables, and handles feature flags, all because at one point those were the same concern. They are no longer the same concern, but the module does not know that.
That gap between structure and intent is where maintenance cost lives.
When I say "treat time as a design input," I do not mean plan for every future scenario. That leads to over-abstraction — another cost that compounds.
I mean something simpler: when making a decision, ask what this will look like in two years if no one touches it.
Not "what if requirements change." Not "what if traffic doubles." Just: what happens if this decision sits, untouched, while everything around it keeps shifting?
Some decisions hold up. A well-named module, a clear boundary, a simple contract between services — these age well because they carry their own context. You can read them later without needing the original author to explain.
Others do not. A clever shortcut, a shared mutable structure, an implicit dependency between teams — these become opaque. Not because they were wrong, but because they required context that no longer exists.
There is a phrase I keep returning to: deferred clarity.
It shows up everywhere. In code that works but cannot be read. In architecture that functions but cannot be explained. In decisions made for good reasons that no one wrote down.
Deferred clarity is not technical debt. Debt implies someone borrowed deliberately. Deferred clarity is quieter — the slow accumulation of decisions where understanding was present but never made explicit.
The cost does not show up as a bug. It shows up as slower onboarding, longer review cycles, more meetings to re-establish context that should have been in the code.
Context erosion is the mechanism behind most of this. Every system carries implicit knowledge: why this service exists separately, why that field is nullable, why the deployment order matters.
When the people who hold that knowledge move on, the system does not change — but its readability does.
This is not a documentation problem. It is a design problem. Systems that depend on external context to be understood are fragile. They work until someone needs to change them.
The fix is not more documentation. It is designing for reduced context. Naming that explains intent. Boundaries that match real domain splits. Contracts that are explicit rather than implied.
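A small sketch of what designing for reduced context can mean for something like a nullable field. The `Shipment` type and its fields are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Shipment:
    order_id: str
    # None means "not yet handed to the carrier", never "unknown".
    # The name and the property below carry that meaning, so a later
    # reader does not need the original author to explain the null.
    handed_to_carrier_at: Optional[str] = None

    @property
    def is_awaiting_pickup(self) -> bool:
        return self.handed_to_carrier_at is None
```

Compare that with a field called `ts` that is sometimes null. The behavior is identical. The amount of context a reader has to bring is not.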
This does not require more effort. It requires effort at a different point — at the moment of decision rather than after the fact.
The systems I am most proud of are not the ones that were technically impressive at launch. They are the ones someone else changed successfully two years later without calling me.
That is the real test. Not whether it works on day one. Whether it remains legible on day seven hundred.
Design against that test and your preferences shift. You start favoring clarity over cleverness. Explicit contracts over implicit ones. Fewer abstractions rather than more. You become skeptical of flexibility no one has asked for, because you have seen how unused flexibility becomes unused complexity.
You do not stop making trade-offs. You weigh them differently. The question shifts from "does this solve the problem?" to "will this still make sense when the context around it has changed?"
Decisions, more than code, are what age.
AI expands what we can generate.
It doesn’t replace the need for restraint.
Protect the real bottleneck.
Because once judgment becomes optional, complexity becomes inevitable.