Overhead Limits Quality

Light, not heavy, development processes allow for the highest quality code.

This idea is quite popular these days. I'd like to offer an intuitive explanation for the "unreasonable effectiveness" of lightweight processes. Let's start with an outline so you can see where I'm going, and then I'll dive into the details one by one.

Argument

  1. Code gets better as we work on it.

  2. Work is broken into natural units: adding a feature or fixing a bug, for example.

  3. Each unit adds a certain amount of value to the product, and has a certain cost.

    • Units are either improvements or new development.

    • Improvements increase quality and value, while new development only increases value.

  4. ROI (return on investment) is the ratio of value to cost.

  5. There are always plenty of units we could work on, but they have different ROIs.

  6. The total cost is fixed, because programmer time is finite.

  7. For our purposes, the value of a unit of work is fixed.

  8. So, we maximize value by always working on the units with the highest ROI.

  9. The cost is a combination of inherent cost and overhead.

  10. A heavy process is one with lots of overhead, i.e. manual work that doesn't scale.

  11. I'm going to argue that heavy processes add more overhead to improvements than to new development.

  12. Thus, heavy processes decrease the ROI of improvements disproportionately (there's a small numeric sketch of this right after the outline).

  13. As a result, new development becomes more attractive than improvement at an earlier point, i.e. at a lower level of quality.

  14. Therefore, heavy processes result in large amounts of low quality code.
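
To make the arithmetic in steps 9 through 13 concrete, here is a small sketch in Python. The units of work and every number in it are invented for illustration; the only point is that a flat per-unit overhead crushes the ROI of small improvements while barely denting a large new feature.

    # Toy model of the argument above: ROI = value / (inherent cost + overhead).
    # All units of work and all numbers are invented for illustration.

    def roi(value, inherent_cost, overhead):
        """Return on investment for one unit of work."""
        return value / (inherent_cost + overhead)

    units = [
        # (name,                   value, inherent cost)
        ("fix off-by-one bug",         2,  1),   # small improvement
        ("polish dialog layout",       3,  2),   # small improvement
        ("add reporting module",      40, 30),   # new development
    ]

    for overhead in (0, 5):   # light process vs. heavy process
        print(f"overhead = {overhead}")
        for name, value, cost in units:
            print(f"  {name:24s} ROI = {roi(value, cost, overhead):.2f}")

    # With no overhead the small improvements have the best ROI (2.00 and 1.50)
    # and get done first.  Add a flat overhead of 5 and they fall to 0.33 and
    # 0.43 while the big feature barely moves (1.33 to 1.14): the heavy process
    # has made new development look like the better investment.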

Why do we care?

By my own definitions, value seems to be what's important. Who cares about quality? Well, you will, and your customer will.

In the long term you'll care because of McConnell's General Principle of Software Quality:

The General Principle of Software Quality is that improving quality reduces development costs. - Steve McConnell, Code Complete

And your customers will care because it doesn't take long for an application to become feature complete: at that point, quality — such as an intuitive UI, stability, and straightforward configuration — becomes the key differentiator.

Put these two together, and you can see why lack of quality is a problem: Not only is the code unpolished, but it's also hard to polish! And when I say "a problem," I mean the kind of problem that puts companies out of business.

I said I would argue that heavy processes hit improvements the hardest, so let's do that now.

Classification of Overhead

Remember my definition of a heavy process:

A heavy process is one with lots of overhead, i.e. manual work that doesn't scale.

Manual work, of course, means human time, not heavy lifting. "Doesn't scale" means the same tedious thing needs to be done for each unit of work. This is a broad definition, and there are many ways a process can be "heavy." I can think of several specific tasks that tend to degenerate into manual labor:

  • Administrative
  • Testing
  • Backwards Compatibility
  • Duplication
  • Deployment

We'll take a look at these one by one, and then draw some general conclusions by induction.

Administrative Overhead

Administrative work is only overhead if nothing comes of it: designs that no one ever reads, sign-offs that aren't based on detailed review, irrelevant questions on forms... Particularly dangerous are processes that don't distinguish between small fixes and major development. It doesn't make much sense to treat fixing a bug and designing a new module the same way.

Improvements tend to be small (UI tweaks and bug fixes), and fixed administrative overhead kills the ROI of small changes.

The solution is to have processes where administrative overhead is proportional to development cost.

Testing

Testing overhead hits small changes the hardest. In the absence of good unit testing and regression testing, every one-line bug fix or slight UI tweak requires tons of manual testing; since the code has to be tested all over again, it seems easier to just let it be. And it's often the lack of these small finishing touches that makes an application feel buggy, sloppy, and inconsistent.

The solution is to automate testing when possible and to make good, conscious, flexible decisions about the level and type of testing appropriate for each change.
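
As a sketch of what that automation looks like, here is the kind of small regression test (written with Python's standard unittest module) that lets a one-line fix ship without a full manual test pass. The function, the bug, and the fix are all hypothetical.

    # Hypothetical example: a one-line fix to date parsing, protected by an
    # automated regression test so it never has to be manually re-verified.
    import unittest
    from datetime import date

    def parse_iso_date(text):
        """Parse 'YYYY-MM-DD'.  The .strip() call is the one-line fix: dates
        pasted with trailing whitespace used to raise ValueError."""
        year, month, day = text.strip().split("-")
        return date(int(year), int(month), int(day))

    class ParseIsoDateRegression(unittest.TestCase):
        def test_plain_date(self):
            self.assertEqual(parse_iso_date("2007-06-05"), date(2007, 6, 5))

        def test_trailing_whitespace(self):
            # The original bug report: a pasted date with a trailing newline.
            self.assertEqual(parse_iso_date("2007-06-05\n"), date(2007, 6, 5))

    if __name__ == "__main__":
        unittest.main()

Once a test like this is in the suite, re-testing the fix costs nothing, and the ROI of the improvement stays where it belongs.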

Backwards Compatibility

Backwards compatibility can be a major headache. Once code is being used by a large user base, breaking compatibility with saved files, database schemas, and client-server communication protocols is only acceptable for major releases, and it always carries significant overhead. Since it's rarely possible to preserve backwards compatibility in a clean and elegant way, kludges creep in that complicate the code and lead to future maintenance problems. By its nature, the overhead of backwards compatibility applies only to improvements.
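
Here is a hypothetical sketch, in Python with an invented settings format, of the kind of kludge this produces: a loader that must keep accepting every format it has ever shipped.

    # Hypothetical sketch: a saved-settings loader that still accepts two
    # obsolete formats.  Each improvement to the format adds another branch,
    # and no old branch can ever be deleted while old files are in the wild.
    import json

    def load_settings(raw):
        """Return {'first_name', 'last_name', 'theme'} from any supported version."""
        data = json.loads(raw)
        version = data.get("version", 1)
        if version == 1:
            # v1 stored a single "name" field and had no theme setting.
            first, _, last = data["name"].partition(" ")
            return {"first_name": first, "last_name": last, "theme": "default"}
        if version == 2:
            # v2 split the name but still had no theme setting.
            return {"first_name": data["first_name"],
                    "last_name": data["last_name"],
                    "theme": "default"}
        if version == 3:
            return {"first_name": data["first_name"],
                    "last_name": data["last_name"],
                    "theme": data["theme"]}
        raise ValueError(f"unsupported settings version: {version}")

None of those branches adds any value; they exist only because old files are already out in the world.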

Elegance is rarely backwards compatible, and backwards compatibility is never elegant.

The solution is beta testing. Give yourself a window to make essential but backwards-incompatible changes.

Duplication

Duplication of knowledge in a system occurs whenever changes in two completely separate places have to be made in parallel (say, in the client object model and the database tables). Unfortunately, avoiding it requires discipline, because code with lots of duplication seems to work just fine at first. The flaws appear later, during maintenance, when it becomes apparent that every small change requires encyclopedic knowledge of the module and hours of tedious work. For this reason, duplication only affects improvements. In my opinion, there is no acceptable level of duplication - it's always pure overhead.

Centralizing knowledge is hard without strong support from the entire team, including management. It's up-front work for long-term benefit: that's hard for us mere mortals. And duplication begets more duplication: if new development touches code with lots of duplication, it's rarely possible to Do The Right Thing without rewriting the problem code.

The solution is a team-wide commitment to centralizing knowledge.
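
One way that commitment can look in code is to state the shared knowledge exactly once and derive everything else from it. This is a hedged sketch: the table, its fields, and the class are all invented.

    # Hypothetical sketch: the fields of a "customer" record are defined exactly
    # once, then used to derive both the SQL table and the client-side class.
    CUSTOMER_FIELDS = [
        # (field name,  SQL type)
        ("id",          "INTEGER PRIMARY KEY"),
        ("name",        "TEXT NOT NULL"),
        ("email",       "TEXT"),
        ("created_at",  "TIMESTAMP"),
    ]

    def create_table_sql():
        cols = ", ".join(f"{name} {sql_type}" for name, sql_type in CUSTOMER_FIELDS)
        return f"CREATE TABLE customer ({cols});"

    class Customer:
        """Client-side object model derived from the same field list."""
        __slots__ = [name for name, _ in CUSTOMER_FIELDS]

        def __init__(self, **values):
            for name, _ in CUSTOMER_FIELDS:
                setattr(self, name, values.get(name))

Adding a field now means editing CUSTOMER_FIELDS and nothing else; the parallel edits that duplication would have demanded simply disappear.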

Deployment

It's a lot of work to get code packaged up and delivered to customers. One of the most attractive things about writing web apps is the transparent deployment. For other code, the usual solution is to only release versions every year or two, which minimizes this cost.

That strategy gets blown out of the water by patches and by enterprise systems: frequent deployment gets expensive fast.

The key to avoiding manual overhead is to develop powerful automated deployment tools. The diff and patch utilities revolutionized Unix development, and Capistrano helps make Ruby on Rails a smart choice. Until automated tools are in place, deployment will be painful.
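
As a minimal sketch of the idea (not any particular tool; the host, paths, and restart command are all invented), even a short script that turns deployment into one repeatable command removes most of the manual overhead:

    # Hypothetical one-command deploy: sync the build to the server, then
    # restart the service.  Real tools like Capistrano add rollback, multiple
    # hosts, and release history on top of this basic idea.
    import subprocess

    HOST = "deploy@app.example.com"          # invented host
    RELEASE_DIR = "/var/www/myapp/current"   # invented path

    def run(*cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    def deploy(build_dir):
        run("rsync", "-az", "--delete", build_dir + "/", HOST + ":" + RELEASE_DIR + "/")
        run("ssh", HOST, "sudo /etc/init.d/myapp restart")

    if __name__ == "__main__":
        deploy("./build")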

Q.E.D.

Now that we've taken a look at the common kinds of overhead in software development, we can see that each one affects improvements either exclusively or disproportionately. A quick recap of the remainder of my argument is in order:

  • this overhead pushes the ROI of improvements down
  • which favors new development over improvements
  • which results in lots of low quality code
  • which is fatal in the long term.

Back to Business

Now, some level of overhead may be acceptable; it can even be the right business decision — if you're consciously trading quality for value. But every bit of overhead limits the quality of the code, and the problem is that the tradeoff is often quite hard to see.

In fact, attempts to increase code quality almost always increase the overhead of making a change or doing new development:

  • What the manager says: "Submit a design doc for approval."
  • What the programmer hears: "That's stupid; go do something useful instead."

That's an example of administrative overhead, but the phenomenon is quite general: decisions are repeatedly made that increase future overhead.

The large cost of small improvements goes unnoticed because the changes are never made. Every time someone thinks of a small change to make the system cooler, then decides not to do it because there's too much busywork, the system suffers. In particularly pathological situations, bugs aren't even filed, because no one believes the fixes will ever be made and the bug submission process itself is weighed down by overhead.

The real, invisible price of a heavy process is the large number of improvements that never get done.

There is a solution for each of the specific types of overhead I considered. Each solution is a move towards sophistication: By doing more kinds of things, and doing them up front, we reduce overhead and enable higher levels of quality to be reached.

- Oran Looney, June 5th, 2007
