Correlation != Causation

In a way that probably carries bigger lessons for us all, I discovered this afternoon (with the help of Evariste, who watched the server while I tried to connect) that while Thunderbird had in fact updated itself when I launched it this morning, and stopped working after that…

…the issue was that my IT guys had locked down the two ports I use for SSL mail this morning. At about the same time.

And I presumed – because after all, it was logical to do so – that the update had caused the problem, rather than some exogenous issue (like overzealous IT staff).

So, what do we all think the lesson here is?

13 thoughts on “Correlation != Causation”

  1. Well, one lesson is that coincidences actually do happen. But it’s OK to admit you were wrong when your conclusion was logical, based on the information that you had available… you just have to live with the consequences of your original decision.

    DaveK

  2. I’ll take a different tack, AL, since you and I are both IT guys: Computer systems are not deterministic.

    Oh, computers are, at a certain level. But computer systems? Nope: there are too many variables, too many moving parts, for that to be true. Your system (the blog system, that is) is simple enough to be deterministic to a very large extent, but consider a “simple” enterprise system, with 6-8 servers, load balancers, between four and ten different “important” and maybe thirty to fifty critical but unthought-of software products, plus hundreds of maybe-related/maybe-not software systems and hundreds of physical components. The timing problems alone (when do events fire and where) add up to too many possibilities to simply count, never mind plan for. (Let’s take a really simple physical component, the network cable. It has no moving parts, yet I once saw one fail in such a way that transmit in one direction was shot, but when tested in the other direction, the cable was fine. There was, quite literally, over a hundred thousand dollars’ worth of time and parts wasted in tracking down that problem. The cable was in a frame in my boss’s office thereafter, as a reminder.) On a daily basis, I work with systems much more complex than that.

    Here’s the lesson I take from this: we need to start thinking of our systems non-deterministically, as sets of probabilities and potentialities rather than as discrete parts that work in a deterministic manner. We need, in other words, to evolve a new paradigm of systems management, and in particular of resilient reaction to computer systems failures, one that we do not currently even have the vocabulary to discuss. I personally have been reading up on psychology: it seems to be more apt, somehow, than I ever thought before.
