Towards more reliable software: every little bit counts
A variant of this article appeared in Computer (IEEE), as part of the Component and Object Technology department, in the November 1999 issue, pages 131-133.
So far, we've had it easy. The public has been remarkably tolerant of our collective inability to produce high-quality software. True, by and large, software more or less works, much of the time (qualifications intentional). If society hadn't come to rely so fundamentally on our profession, Y2K for one thing wouldn't be such a big deal. But the overall service that we render is not good enough by any measure. A year ago Ted Lewis produced some pretty scary statistics about the huge waste of resources that lack of quality causes ("Joe Sixpack, Larry Lemming and Ralph Nader", IEEE Computer, July 1998).
The public's tolerance will not last forever. People increasingly resent the bugginess of much of the software we produce. Consider this extract from a recent Wall Street Journal article (September 30, 1999, front page of the "MarketPlace" section):
"Please don't bother to write to me with suggestions to fix the [numerous Windows-related problems related in the article]. The whole point is that owners of computers shouldn't have to get involved in making them work as promised. They should just work, all of the time".
Not so long ago, technology editors were dazzled by technology; now they start to demand quality. Even the professional press is starting to worry; see this extract from a recent Nicholas Petreley column in InfoWorld ("Silence is Deadly...", 16 Aug. 1999, p. 114):
"Software publishers aren't interested in writing solid, bug-free code because they are convinced that features sell, not quality... Computer journalists should focus less on features and more on reliability when reviewing software. More importantly, we should go out of our way to rip out the fingernails and rearrange the face of any vendor that delivers programs with security holes and bugs."
That would be big news indeed: imagine software reviews that tell us how good the software really is, without just counting the bells and whistles.
One of the most critical components of software quality is reliability. Efforts to improve reliability are not new; there are in fact many different approaches. In this column I will simply try to list some of the most relevant ones, with the hope of providing a broad enough view of what's available.
The list that follows is a little non-conventional precisely because of its breadth. One of the characteristics of the software engineering community is that it seems sometimes split into separate "just do this and everything will work fine" communities, where the "this" is narrowly defined and different in each case. For example you have the management school, which holds that all that really matters is better management approaches; the formal specification school, for which we won't achieve anything unless we specify everything mathematically — and then we don't need testing at all; the testing school, which views formal specifications as an academic pastime, the only meaningful solution being to devise systematic testing strategies; the metrics school, which focuses on assessing everything quantitatively; the open source crowd, which sees the light coming from extensive public scrutiny of the source code. They all hold a piece of the truth, but none of them holds the whole truth. Any "just do this" approach is wrong; the problem is far too complicated. Although I have contributed quite a few suggestions myself, in the form of methodological principles, language support and software tools, I do not think that any single solution can carry the day. Every little bit helps; we must keep an open mind and include all good ideas.
What follows is a list of good ideas. I will certainly have forgotten some and will welcome reader comments pointing to such omissions (or criticizing my own choices).
Prevention and cure
Separate from their classification into "management" and "technology", reliability techniques fall into two categories: a priori and a posteriori. A priori techniques strive to build software right; a posteriori techniques attempt to right its wrongs. Formal specification is an example of a priori; testing, of a posteriori.
Much of the practice, today, is in a posteriori techniques. We build software that's not very good, and through brute force we debug it into correctness. Far from me the suggestion that, given current techniques, anyone should test their software less. But by shifting some of the balance to a priori efforts, we may go a long way towards correcting some of the most serious quality issues. We need both cure and prevention; but an ounce or two of prevention will save a lot of cure.
The management school reminds us that good engineering practice means, among other things, well-defined management policies.
Among the management-oriented techniques one can cite the favorable effect that the Capability Maturity Model and, to a lesser extent, ISO-9000 based approaches have had in some segments of the industry, mostly among large companies. Forcing organizations to understand, document and control their software process, and make it reproducible, is a definite benefit.
Such approaches have also been criticized as focused too much on form and not enough on substance: counting bugs and tracking delays is good, eliminating them would be better. This doesn't mean the ideas are bad, simply that they have to be combined with more technology-oriented solutions.
Closed and open
A large part of the industry is applying the motto "follow the market leader", often directing its daily devotions towards a specific area of the Northwestern coast of the United States. This doesn't guarantee quality (in fact some would say it guarantees the reverse, which would be exaggerated), but does ensure that successful products benefit from a critical mass.
At the other end of the spectrum, you have the open-source and free-software enthusiasts, who believe in the power of public scrutiny. This is an attractive theory. To my knowledge no one has, as yet, provided formal evidence of the superiority of software produced that way, although informal examples are not hard to find. Ken Thompson's recent disparaging comments about the quality of the Linux code (Computer, May 1999) are a little sobering.
Metrics are also frequently cited. It is clear that we need more quantitative approaches to assessing and predicting what we do. Project metrics (time, people, money) are just as necessary as product metrics (bugs, size, complexity).
Education is critical. Software engineers too often do not know some of the basic techniques of modern computer science. I always find it striking that only a minority of professionals know, for example, the Hoare approach to semantics, which is perhaps the closest we have to the basic scientific laws of other fields (such as Ohm's law). We need both better initial training and, perhaps even more importantly, re-training and continued education for working professionals.
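For readers who have not encountered it: the central notion of the Hoare approach is the triple {P} A {Q}, stating that if precondition P holds before instruction A, postcondition Q holds after. A standard illustration (textbook material, not taken from this column) is the assignment axiom:

```latex
% Hoare triple: if P holds before executing A, then Q holds afterwards.
\{\, P \,\}\; A \;\{\, Q \,\}

% Assignment axiom: to establish Q after x := e, it suffices that
% Q with e substituted for x held beforehand.
\{\, Q[e/x] \,\}\; x := e \;\{\, Q \,\}

% An instance: establishing x > 0 after an increment.
\{\, x + 1 > 0 \,\}\; x := x + 1 \;\{\, x > 0 \,\}
```

Simple as it looks, this is the kind of law — relating a construct's use to a precise mathematical property — that other engineering disciplines take for granted and too few software practitioners have been taught.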
The Dilbert approach
Some of the management-style approaches are non-conventional. I intend to have a future column on "extreme programming", which is in part a Dilbertian revolt against management-imposed organizational fiats à la CMM. It's perhaps shocking to some, and not everything is valuable, but there are certainly lessons to be learned by everyone.
Management is not everything; we need better technology. The possible technological improvements are very diverse. Here are a few.
It's impossible to ignore formal methods. True, they have a bad reputation in some circles, as too heavy and difficult. That's not entirely justified. Formal methods have achieved a number of successes; they are (as illustrated by the recent development of the software for the new automated line of the Paris Metro, using the B method, under the leadership of its creator Jean-Raymond Abrial) the only game in town when it comes to guaranteeing to a regulatory authority that you have produced a correct engineering design. Just as importantly, learning to apply formal methods makes you a much better developer even on projects in which you don't work in a completely formal way. Those who think of formal methods as an academic curiosity with no future may be in for a few surprises. As noted in an earlier column, formal methods hold a particular promise in connection with reuse: reusable components need strong warranties, and the costs of formal methods can be justified economically by the economies of scale permitted by reuse.
Design by Contract
A more moderate version of formal methods, closely combined with the principles of object technology, is Design by Contract, which several installments of this column have described and advocated. I believe it's one of the most potent ideas for improving the state of software technology.
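The idea, in brief, is that every routine carries an explicit precondition (the client's obligation), postcondition (the supplier's obligation) and class invariant. Contracts are natively supported in Eiffel; here is a minimal sketch emulating them in Python with assertions (the Account class and its names are hypothetical, chosen only for illustration):

```python
class Account:
    """A hypothetical bank account, with its contract expressed as assertions."""

    def __init__(self, balance=0):
        assert balance >= 0            # precondition: no negative opening balance
        self.balance = balance
        assert self._invariant()       # invariant must hold on creation

    def _invariant(self):
        # Class invariant: the balance is never negative.
        return self.balance >= 0

    def withdraw(self, amount):
        # Precondition: the caller's obligation.
        assert 0 < amount <= self.balance
        old_balance = self.balance
        self.balance -= amount
        # Postcondition: the supplier's obligation.
        assert self.balance == old_balance - amount
        assert self._invariant()       # invariant preserved
```

A violated assertion points directly at the guilty party: a failed precondition blames the caller, a failed postcondition blames the routine itself. That precise assignment of blame, not the run-time check as such, is what makes contracts a design tool.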
Approaches to testing are becoming more systematic. We all know that testing cannot come even remotely close to being exhaustive; but this doesn't mean we can't be systematic and effective in our testing strategies. Recent work, as represented by a paper by Jézéquel et al. (Computer, July 1999), shows how to give new life to classical techniques such as "mutation testing" in connection with newer ideas of object technology and contracts.
Modern programming languages definitely help. True static typing catches many potentially serious errors before they have had the time to strike. Automatic memory management avoids horrible bugs. Avoidance of dangerous features such as pointer arithmetic frees us from many potential disasters.
Object technology has brought us tremendous improvements. What's perhaps critical here is the dramatic simplification that well-applied O-O software construction brings to software architectures. Simplicity is key to quality, and especially to reliability: you can't get it right if it's complex. These benefits of object technology, however, assume that it's applied systematically, almost dogmatically. Incomplete or half-hearted approaches, especially if they don't fully apply principles of data abstraction and information hiding, are not much better than pre-O-O techniques.
Component-based development (CBD) holds of course a great promise, described in a previous column (July issue) and to be analyzed further in future installments. Here too it is indispensable to do things right. In particular we badly need more expressive Interface Definition Languages for both CORBA and COM, supporting the expression of semantic constraints. Serious component-based development holds the potential for an in-depth reengineering of the industry. Whether we will be able to realize that promise depends on how seriously we take the software engineering principles without which there can't be any serious CBD.
A comprehensive approach
None of the ideas cited will suffice to provide the breakthrough that our field requires today. But if we take all of them seriously and succeed in combining them, we may be able to realize major advances. It will take a lot of prevention, and a lot of cure.