Shipping buggy code: The most critical skill for a programmer

NASA’s Mars Climate Orbiter was launched on December 11, 1998 with the ambitious mission of studying the Martian climate; the program cost more than $300M, and a good share of that was allocated to developing the necessary software. Unfortunately, as the Orbiter approached the red planet it vanished, and it did not take long for the monitoring scientists to realize that it was gone forever.

The reason behind the orbiter’s failure was a very simple software glitch. Instead of expressing a thruster-related quantity in newtons, the ground software reported it in pounds of force, omitting a single call to a conversion function; a bug that could have been fixed with a few characters of code was enough to destroy the mission and invalidate the hard work of thousands of the most qualified scientists and engineers worldwide!
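
To make the failure mode concrete, here is a minimal Python sketch, not the orbiter’s actual code (every name and number is invented for illustration), of how a single omitted conversion call silently skews every downstream value by a factor of roughly 4.45:

    # Hypothetical sketch: one component reports a thruster quantity in
    # pound-force seconds while the consumer expects newton-seconds.
    LBF_TO_NEWTON = 4.44822162  # 1 pound-force expressed in newtons

    def thruster_impulse_lbf_s() -> float:
        """Ground-software output, expressed in pound-force seconds."""
        return 120.0

    def correct_impulse_newton_s() -> float:
        # Correct path: convert before handing the value to the trajectory model.
        return thruster_impulse_lbf_s() * LBF_TO_NEWTON

    def buggy_impulse_newton_s() -> float:
        # The bug: the raw value is passed through as if it were already SI.
        return thruster_impulse_lbf_s()

    print(correct_impulse_newton_s())  # ~533.8 N·s
    print(buggy_impulse_newton_s())    # 120.0 treated as N·s, off by ~4.45x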

The Mars Climate Orbiter is just one entry in a very long list of projects that failed because of an extremely simple programming mistake; it is yet another reminder that proving any piece of software to be completely bug-free and fully tested is an impossible task.

Even the most extensive test suite exercises only an extremely limited subset of all the possible execution paths of an application; it is therefore impossible to claim that we can “prove” it error free, with behavior guaranteed to comply with a specific set of rules.

Verification and falsification are asymmetrical, and induction cannot “prove” anything: a billion successful tests cannot establish the validity of an application, yet a single failure is enough to invalidate it. The problem, of course, lies in discovering the failing conditions, and nobody can be sure they do not exist, since it is impossible to test every combination of inputs and execution paths.
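
As a toy illustration of this asymmetry (a made-up Python example, not taken from any real project), consider a flawed leap-year check: every test its author happened to write passes, yet a single counterexample is enough to falsify the function.

    # A deliberately flawed leap-year check: it applies the "divisible by 4"
    # rule but forgets the century exception (1900 was not a leap year).
    def is_leap_year(year: int) -> bool:
        return year % 4 == 0

    # Induction at work: every test the author thought of passes...
    for year, expected in [(1996, True), (2000, True), (2004, True),
                           (2019, False), (2021, False), (2023, False)]:
        assert is_leap_year(year) is expected

    # ...yet one counterexample falsifies the whole implementation.
    print(is_leap_year(1900))  # True, although 1900 was not a leap year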

Despite the long record of applications that failed because of simple “bugs”, many software developers insist that they can deliver flawless, completely bug-free applications!

Whenever I hear something like this, I am sure it comes from a junior and not very talented programmer who sounds like a salesman trying to please the ears of his audience!

One of the most critical skills of a software developer is the ability to decide that the remaining incompleteness and inefficiency of the code has reached a level its clients will tolerate. Although this statement might sound provocative and strange, it hides the secret of success behind any kind of software that can be developed! Software engineers, developers and project managers who claim they are striving to deliver bug-free and fully functional code are either lying or simply do not know what they are talking about.

I have heard several developers and program managers brag about the obvious (but impossible) goal of fixing all known bugs as quickly as possible and never releasing the next version until every fix has been committed! This approach is nothing more than wishful thinking, or a marketing motto meant to convince some MBA manager who knows very little about how software is created.

Setting aside trivial cases, the complexity of any piece of software that exposes useful functionality is such that even the largest quality assurance and testing teams can never verify its validity to the full extent. The responsibility of the senior programmer is to make a sound judgment call: deciding that the solution has reached a level at which it can be deployed to its users with reasonable completeness and robustness! There is little doubt that users will later discover some of its bugs and feature shortcomings, and some of those will become the fixes of the next release; great software relies on periodic, piecemeal releases that fix existing bugs and add new features based on user feedback.

Shooting for perfection dramatically delays delivery and in many cases makes it impossible. One of the main causes of failing projects is inexperienced program managers who fail to understand that software development is neither an exact science nor a logical structure that can be validated by some mathematical procedure. Software development involves a great deal of talent and intuition; it is a mix of science and art, with the artistic component playing the major role in most of the decisions to be made.

I need to clarify that by no means do I consider myself an advocate of sloppy releases, nor do I intend to turn end users into beta testers. The point I am trying to make is that no matter how mission critical the software is, or how important the problem it tries to solve, it has to be crystal clear that a guaranteed bug-free solution is simply unattainable. Whether it is an automated trading system, a nuclear reactor protection system or an Android game, at some point somebody has to decide that the software is ready to be shipped to its users! Deciding that the software has reached an acceptable level of completeness and that its remaining bugs are tolerable is one of the most critical judgment calls a developer is asked to make.

It is up to the judgment of the senior developer to decide whether his program is ready for delivery, and this is the skill that will separate him from the crowd more than anything else.

20 Comments

  1. Everything in this article is true, but the only thing it really counters is people claiming or believing they can have truly bug-free code. Even though the article is accurate, I fear the mindset it could breed. The biggest problem in software today is just how difficult it is to produce software with a low enough level of defects. I think it is more helpful to encourage people to work on making that goal easier to achieve than to encourage them to release buggy code. It is already true that, all day long, I feel like I am constantly being annoyed by bugs in the software I am using. As our society becomes more and more entrenched in software, our lives will become an exercise in dealing with software bugs unless we figure out better ways to reduce their number. Toward that goal, I am working on a new library called PurposefulPhp, and I think I am getting close to an alpha release: https://github.com/still-dreaming-1/PurposefulPhp

  2. PurposefulPhp seems like an interesting idea, although I need an example of its use to fully understand how it fits the contract description. When you say “contracts as code” (instead of contracts plus code), I assume you mean decoupling declarations from definitions. If this is the case, how does your approach differ from IDL? Is the only difference that your framework is PHP specific, or does it also propose additional features?

  3. In all but a few instances the ‘code’ that the programmer writes is only a small portion of a much larger collection with literally innumerable potential interactions, conditions, and dependencies. The computer applications of today are complex with all that entails. As David Woods said, “As the complexity of a system increases, the accuracy of any single agent’s own model of that system decreases.”

    Although not widely acknowledged, dev and ops today work in a tradeoff space that balances effort and quality. The question is not whether code can be made perfect, but whether, relative to our current location in this space, the incremental gain in quality is worth the additional effort.

  4. A developer’s over-confidence in his ability to deliver error-free code usually evaporates when you suggest that he could be penalized for any error found after deployment. Suddenly, all you can hear is why perfect software is impossible.

  5. The main difference is that PurposefulPhp is not actually working yet, but I feel it is close to working. When I first started making it, I wasn’t really sure what I was making. Now that I am further along, I have been able to research other similar tools. It seems that what I am working on is actually an example of the logic/constraint programming paradigm. The main difference between my approach and others I have seen is that mine is more generally useful instead of being usable only for certain things. It is sort of like an SQL-like language for writing contracts that essentially implement a class; basically a way of writing classes at a higher level. The only reason it is in PHP is that PHP is the language my head is in right now. If it ends up successful, I might get it working in other languages as well.

  6. Let’s look at software the same way we look at math or any other discipline that has existed for ages: there are well-known rules for mitigating the effect of possible mistakes. This is nothing new; people have ways to live with mistakes, and mistakes are a normal part of our lives.
    Now, returning to software: what about robustness? That is what software architects try to add to systems, so that they keep working even if one of the subsystems goes down.

    If we deliver a system that is not robust enough, that is a different problem. I’m surprised NASA has this kind of problem.

  7. Functional languages approach computer programming in a purely mathematical way. The problem with this approach lies in the fact that although mathematics consists of analytic propositions (meaning that essentially all of math is built on top of logic), computer programming needs both analytic and synthetic approaches, as it solves empirical problems rather than purely a priori defined propositions.

  8. Indeed, creating 100% bug-free software at a significant level of complexity is damn hard to do. However, it is NOT impossible.

    1) In the academic world, there is a subfield of theoretical computer science called “formal methods”. Those people have been researching a surprising variety of different ways to provide strong correctness guarantees for software over the last decades.
    While a couple of decades ago most of these approaches were confined to rather esoteric languages, or were useful only in certain paradigms (functional, mostly), Hoare logic brought the same guarantees to imperative programs [1], Frank De Boer generalized Hoare logic for object-oriented programs [2], and more recently there was even an adaptation of these concepts to dynamically-typed languages [3] (a toy sketch of the Hoare-triple idea follows the references below). Hence, I consider it fair to state that the theoretical foundations for verifying most contemporary programs do exist.

    2) On the practical side, we have free tools like Boogie [4], Dafny [5], AutoProof [6], F* [7], etc. that put those concepts at our fingertips. While there is definitely still room for improvement regarding tooling, the situation has changed dramatically within the last decade, and I would today consider it reasonable to state that the tools for verifying programs also exist, at least for some languages.

    3) In my eyes, the only remaining problem is the severe lack of verification skills and formal-methods know-how among today’s programmers, architects and software engineers. Most university graduates today have at least heard about Hoare logic or related techniques, but most people in the software industry still consider them esoteric and out of reach.

    I thus agree with still-dreaming-1, that rather than encouraging people to release buggy code, we should be teaching them how to increase code quality.

    For more information, please see the references below. And for those not ready yet to dive into the details, IMHO the following article does a decent job at summing up the current situation:

    https://www.theatlantic.com/technology/archive/2017/09/saving-the-world-from-code/540393/

    References
    [1] https://en.wikipedia.org/wiki/Hoare_logic
    [2] https://link.springer.com/chapter/10.1007/978-3-540-30101-1_5
    [3] https://arxiv.org/abs/1509.08605
    [4] https://www.microsoft.com/en-us/research/project/boogie-an-intermediate-verification-language/
    [5] https://www.microsoft.com/en-us/research/project/dafny-a-language-and-program-verifier-for-functional-correctness/
    [6] http://se.inf.ethz.ch/research/autoproof/
    [7] https://www.fstar-lang.org/
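
    To make the idea less abstract, here is a toy sketch of the Hoare-triple pattern {P} S {Q}, written as plain Python with runtime asserts rather than in one of the verification languages above; tools such as Dafny or AutoProof check annotations like these statically, for all inputs, instead of only for the inputs that happen to be executed:

        # Toy Hoare-style annotations: precondition, loop invariant, postcondition.
        def integer_divide(a: int, b: int) -> tuple:
            # Precondition P: a >= 0 and b > 0
            assert a >= 0 and b > 0
            q, r = 0, a
            while r >= b:
                q, r = q + 1, r - b
                # Loop invariant: a == q * b + r and r >= 0
                assert a == q * b + r and r >= 0
            # Postcondition Q: a == q * b + r and 0 <= r < b
            assert a == q * b + r and 0 <= r < b
            return q, r

        print(integer_divide(17, 5))  # (3, 2)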

  9. “…many software developers insist that they can deliver flawless, completely bug-free applications! Whenever I hear something like this, I am sure it comes from a junior and not very talented programmer who sounds like a salesman trying to please the ears of his audience!”

    I’ve only met one programmer who claims he can produce bug-free code. He’s no junior; he’s managed hardware and software on international projects, and taught compilers at a major university.

    He writes a system as a set of services, and he claims he can mathematically prove each service is correct.

  10. The fact that your friend claims he can “mathematically prove” the correctness of his service means that he is viewing it as an “analytical” rather than an “empirical” procedure; his proof is therefore just a tautology that restructures, in an axiomatic fashion, the a priori conditions that are considered “correct”.

    An empirical statement or solution can never be “proved correct”; all that can be done with it is to validate that it passes all the related tests we can think of.

    https://plato.stanford.edu/entries/analytic-synthetic/

    https://en.wikipedia.org/wiki/Analytic%E2%80%93synthetic_distinction

  11. Coding may be your craft, but writing apparently isn’t. Please lose the exclamation points. It’s amateurish. It actually makes your thoughts harder to read. Imagine a speaker who emphasized every point with a WHOOP! and Jazz Hands.

  12. @john: I’m sorry, I do not get your point. In your post you seem to categorically disqualify correctness proofs as “empirical” arguments as opposed to “analytical” ones, but the webpages you referenced explain the difference between “synthetic” as opposed to “analytical” arguments…

    From these explanations, one could get the following definitions:

    analytical = an argument whose validity can be established directly from the definitions of the components named in it.
    synthetic = argument whose validity can only be established with the help of additional knowledge about the environment.

    However, I could not find any definition of “empirical” arguments there…

    Also, I wonder into which of those categories mathematical/logical proofs fall. On the one hand, everything in mathematics can be derived through logical reasoning from the definitions, which sounds pretty much like a model candidate for an analytic argument.
    However, on the other hand, logical reasoning requires one to have a formal proof system, which could be regarded as an additional theory and hence as additional knowledge required to follow the reasoning.

    Especially in the case of correctness proofs of programs one usually requires a formal semantics of the programming language the program in question was written in, which is a LOT of formalized knowledge about the “environment”. I hence tend to regard such proofs as synthetic rather than analytical.

  13. Empirical and synthetic can be used as synonyms in this context.

    We can prove analytic statements because they follow from logical rules (for example, Euclidean geometry). Synthetic or empirical statements can never be proved; they can only be validated and eventually falsified (for example, Newton’s laws).

  14. I disagree with your thesis. It is entirely possible to achieve bug-free code, and I have had the pleasure of working with a few examples in the last 45 years, some of which have failed so infrequently and in such non-reproducible ways that I have been forced to conclude the failures were due to hardware errors — supply noise, alpha particles, etc. — not software bugs.

  15. We should not confuse the concepts of “verification” and “testing”. Testing is a part of verification. However, because it is impossible to test every possible case, it is only one of many parts. Proper verification includes reviews of code specifications, reviews of code design, code walk-throughs, compliance matrix reviews, verification that code meets applicable standards, and the like.

    Consider the Mars orbiter issue: one normally writes Software Interface Requirements documents to define the interfaces between major components. A review of such documents makes sure that data formats, units, and the like are specified consistently for both sides of the interface. This is a verification procedure, not a test. A mistake at this level can be caught without executing a single line of code.

    A code walk-through can normally identify whether somebody has used improper units, or has coded in a manner that makes it hard to identify what units were used…again, without actually “testing”. One compares the code with the applicable documents to see if there is a match.
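
    As a hypothetical sketch of the discipline such a review enforces (invented types, not the orbiter’s real interfaces), the unit can be made part of the type so that a raw pound-force value can never reach code expecting newtons without an explicit conversion:

        # Hypothetical sketch: encode the unit in a small wrapper type so a
        # reviewer (or a static type checker) spots a mismatch before any test runs.
        from dataclasses import dataclass

        LBF_TO_NEWTON = 4.44822162

        @dataclass(frozen=True)
        class Newtons:
            value: float

        @dataclass(frozen=True)
        class PoundsForce:
            value: float

            def to_newtons(self) -> Newtons:
                return Newtons(self.value * LBF_TO_NEWTON)

        def apply_thrust(force: Newtons) -> None:
            # The trajectory model only ever sees SI units.
            print(f"Applying {force.value:.1f} N")

        apply_thrust(PoundsForce(120.0).to_newtons())  # explicit, reviewable conversion
        # apply_thrust(PoundsForce(120.0))             # flagged by a type checker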

  16. There is also the manager looking over your shoulder, tapping his toe and pointing at his watch. One place I worked shipped its product with 150 KNOWN bugs because the schedule said to. Previously I had worked at a place where we shipped with no known bugs. We didn’t fool ourselves into thinking we had bug-free software, but we put serial numbers on all the disks and maintained a database of fixes and the serial number at which each fix was installed. Customer service would check the database for a problem being fixed, and if the fixed serial number was greater than the customer’s, they were sent a new copy. If no fix was found, it was sent to the programmers for remediation. Not all customers encountered bugs, because of how they used the software.
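
    A hypothetical sketch of that serial-number lookup (bug IDs and serial numbers invented for illustration) might look like this:

        # Each fix records the first shipped serial number that contains it.
        fixes = {
            "BUG-104": 1450,
            "BUG-211": 1802,
        }

        def resolution(bug_id: str, customer_serial: int) -> str:
            fixed_at = fixes.get(bug_id)
            if fixed_at is None:
                return "send to the programmers for remediation"
            if fixed_at > customer_serial:
                return "ship the customer a new copy"
            return "customer already has the fix"

        print(resolution("BUG-104", 1300))  # ship the customer a new copy
        print(resolution("BUG-999", 1300))  # send to the programmers for remediation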

  17. The basic problem with NASA’s Mars Climate Orbiter was not the software. It was the consequence of a more basic issue: the lack of a single agreed standard!
    They needed everybody to work with one and ONLY one standard of measure: metric or imperial.
    If that standard had been set and adhered to, the call to the “missing converter” would not have been necessary.

  18. @Gunnar W.:
    Could you please elaborate a bit on how this code was created? I am guessing the level of correctness you are describing is pretty hard to achieve by means of testing…
