Engineering Practice · 8 min read

Technical Debt in Enterprise Software: What It Actually Costs

A client asked us to add a new supplier payment integration to their document processing pipeline. We quoted three weeks. We delivered in ten — seven of which were spent on technical debt that had nothing to do with the new feature: an undocumented field mapping layer, a retry mechanism handling 23 special cases nobody had written down, and a database table with 14 nullable columns added over two years with no explanation of what any of them were for.

Technical debt in enterprise software is not primarily a code quality problem. It is a cost accounting problem. The interest is paid on every new feature, every production incident, and every developer who joins the team and spends two weeks understanding a codebase that should have taken two days. But it is never itemized on any invoice, never attributed to the sprint that created it, and never weighed against the time saved by cutting the corner in the first place.

Making it visible — in actual dollar terms, not in developer complaints — is the only way to get it managed. This is what we have learned about doing that.

How enterprise debt accumulates differently

Enterprise software debt accumulates differently from startup software debt. Startups accumulate debt by moving fast and cleaning up later — most of it is tactical, close to the surface, and removable with sustained refactoring effort. Enterprise debt tends to be structural and deep: schema decisions made in year one that constrain every feature in year four, integration patterns chosen for a single vendor that generalized poorly to the twelve vendors added since, authentication architectures designed for 50 users that perform fine at 200 and degrade in ways nobody predicted at 800.

Each individual decision looked defensible at the time. A nullable column added to handle a special case for one customer in 2022 is a reasonable tactical choice. Fourteen nullable columns added this way over two years, with no documentation, is a schema that takes a new developer a week to understand and a DBA an afternoon to query correctly.

The structural nature of enterprise debt is why "we'll clean it up later" so rarely happens. You cannot refactor a primary key schema in a live ERP. You cannot change the data model for the invoice table without touching every service that reads from it. The debt is load-bearing and the cost of removing it grows proportionally to how much has been built on top of it.

7 weeks of a 10-week integration project were spent on surrounding technical debt, not on the feature that was requested or quoted.

The 3x multiplier

On the project above, the new feature took 3 weeks. The surrounding debt took 7 more. That ratio (total effort at roughly 3x the feature work itself) appears consistently across the enterprise integration work we do when we inherit an existing codebase. Not on every project, not on every feature. But when we aggregate across a year of inherited-codebase projects, new work averages about 30% of the actual effort. Understanding, working around, and incrementally fixing existing debt accounts for the other 70%.

The client is invoiced for 10 weeks. In their head, the feature cost 10 weeks. In reality, the debt cost 7 weeks and the feature cost 3. If the same project had been done on a clean codebase, it would have taken 3 weeks. The debt multiplied delivery time by 3.3x on a single feature.

We have started tracking this explicitly on every project: estimated time on a clean codebase versus actual time including debt overhead. The delta, summed across a year of projects, is the cost of the debt. On one client account last year, that delta was ~$340K in delivery overhead attributable to accumulated technical debt in a five-year-old system.
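The tracking described above can be sketched in a few lines of Python. This is a minimal illustration, not our actual tooling: the `ProjectRecord` type, its field names, and the `weekly_rate` parameter are all assumptions made for the example. The only real figures are the 3-week and 10-week numbers from the integration project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectRecord:
    """One delivered project. Type and field names are illustrative only."""
    name: str
    clean_estimate_weeks: float  # engineers' estimate on a clean codebase
    actual_weeks: float          # what delivery actually took

    @property
    def debt_overhead_weeks(self) -> float:
        # Clamp at zero: finishing early is not treated as "negative debt".
        return max(self.actual_weeks - self.clean_estimate_weeks, 0.0)

    @property
    def multiplier(self) -> float:
        # How many times longer delivery took than the clean estimate.
        return self.actual_weeks / self.clean_estimate_weeks


def debt_cost(records: list[ProjectRecord], weekly_rate: float) -> float:
    """Sum the overhead across projects and convert it to money.

    `weekly_rate` is a hypothetical blended cost of one delivery week.
    """
    return sum(r.debt_overhead_weeks for r in records) * weekly_rate


# The integration project from this article: quoted 3 weeks, delivered in 10.
payment_integration = ProjectRecord("supplier payment integration", 3.0, 10.0)
print(payment_integration.debt_overhead_weeks)   # 7.0
print(round(payment_integration.multiplier, 1))  # 3.3
```

Summing `debt_cost` over a year of records gives the kind of annual figure quoted above. The number is only as good as the clean-codebase estimates that go into it.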

The undocumented special case: the most expensive form of debt

The retry mechanism in the project above had 23 special cases. It had grown organically over 18 months as each edge case was patched when found. A supplier invoice with a currency mismatch: case 4. A vendor ID that contained a forward slash: case 11. A document type added after the original integration was built, handled by an undocumented workaround bolted onto case 17.

Nobody had a complete picture of what the 23 cases were, which were still relevant, or which could be safely removed. Some handled conditions that no longer existed in the source system. We spent two weeks auditing them before we could safely touch any of them. Removing a case that was no longer needed would have been one hour of work on a documented system. It was three days of archaeology on this one.
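One way to avoid repeating that audit is to keep a small machine-readable record next to each special case. The sketch below is an assumption about what such a record could look like, not a description of the client's system. The `RetryCase` schema, the `removal_candidates` helper, and every date in the example are hypothetical. Only the two trigger descriptions come from the project.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RetryCase:
    """Documentation for one special case in a retry handler (hypothetical schema)."""
    number: int
    trigger: str                      # condition that activates the case
    added_on: date
    last_seen: Optional[date] = None  # last time the condition occurred in production


def removal_candidates(cases: list[RetryCase], today: date, idle_days: int = 365) -> list[int]:
    """Return case numbers whose trigger has not been observed for `idle_days`.

    A case that has never been observed is measured from the date it was added.
    Candidates still need human review before anything is deleted.
    """
    stale = []
    for case in cases:
        reference = case.last_seen or case.added_on
        if (today - reference).days >= idle_days:
            stale.append(case.number)
    return stale


# Triggers are taken from the article; all dates are made up for illustration.
cases = [
    RetryCase(4, "supplier invoice with a currency mismatch", date(2023, 1, 10), date(2025, 3, 1)),
    RetryCase(11, "vendor ID containing a forward slash", date(2023, 6, 2), date(2023, 9, 15)),
]
print(removal_candidates(cases, today=date(2025, 6, 1)))  # [11]
```

With a record like this, "is case 11 still needed?" starts from a shortlist instead of from archaeology.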

Watch out

Undocumented special cases are debt with compound interest. Every new developer who touches the system pays the full audit cost again from scratch. A case added in 15 minutes in 2022 may cost 3 days to safely evaluate in 2025 — and if the next developer guesses wrong, it costs a production incident on top.

Making debt visible in dollar terms

Most engineering teams know their debt exists. Almost none have quantified it in terms that resonate with the people who decide whether to address it. "We need to refactor some legacy components" produces a different response than "technical debt added $120K in delivery overhead last quarter and that number is rising."

The approach we use: estimate the cost of debt as the difference between actual delivery time and what delivery time would have been on a clean codebase. This requires honest estimation — engineers need to be able to say "this feature would take 2 weeks on a clean system; it took 6 weeks on the current one" without the 4-week overhead being attributed to poor estimation. Track this across six months of projects. The aggregated delta is your debt cost.

Present it to stakeholders quarterly as a cost line item, not as a code quality concern. A CFO who sees that debt overhead cost $120K last quarter — money paid for development time that produced no new functionality — has a different conversation about addressing it than a CFO who hears that "code quality needs investment."

Why the big refactoring project almost always fails

We have been on three large-scale "let's fix everything" refactoring projects. One succeeded. The other two stalled partway through, ran over budget, and created as much new debt as they removed. They left the codebase worse off than before, because part of it was refactored, part was not, and the two halves made incompatible assumptions.

The failure mode is consistent: the refactoring project is scoped optimistically, competes with feature work for developer time, loses that competition, and ends at 60% completion. A half-refactored codebase is harder to work with than the original because developers have to understand two different patterns simultaneously and know which applies to which part.

The approach that works is continuous debt payment as part of every feature delivery. Budget 20% of every sprint explicitly for debt reduction. Each feature delivered also includes a debt payment in the surrounding code. Over six months, this produces a meaningfully cleaner codebase without the disruption of a standalone refactoring project that will never be completed anyway.
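The budget itself is simple arithmetic, sketched here so the number can go into sprint planning explicitly. The function name and parameters are assumptions for the example. The 20% default is the share suggested above.

```python
def debt_budget_days(team_size: int, sprint_days: int, debt_share: float = 0.20) -> float:
    """Person-days reserved for debt reduction in one sprint."""
    if not 0.0 <= debt_share <= 1.0:
        raise ValueError("debt_share must be between 0 and 1")
    return team_size * sprint_days * debt_share


# One developer in a 10-working-day sprint reserves 2 days for debt work.
print(debt_budget_days(team_size=1, sprint_days=10))  # 2.0
```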

Our take

The "20% rule" works best when it is an explicit budget line, not a vague commitment. "We should clean up as we go" never happens. "Sprint 14 includes 2 days of debt reduction: we are documenting the retry logic and removing the 7 special cases we confirmed are obsolete" produces actual results.

Technical debt is not a developer problem — it is a product management problem. The decisions that create enterprise debt are almost always made by someone who prioritized a feature over the time to do it properly. Developers pay the interest. Product teams write the checks.