
Source code is the best form of documentation (not)

No, it’s not; it’s really not. In fact, source code is probably the worst form of documentation: it fails in most of the roles that documentation is actually required to fulfil (see the bottom of this post for a shortlist).

But something else is…

Because … when you think about it, there is a kernel of truth in the claim. It is based around the concept that source code is the ULTIMATE arbiter of “what the system actually does, in reality”. Unlike comments, and written documents, and diagrams, source code by definition can never get out of date, no matter how time-poor, ill-trained, lazy, or even just unfortunate your programming team is.

(im)Perfection

Advocates of “code is the best documentation” tend to point to the fact that in reality there is no such thing as the perfect human programmer who never makes a mistake, and so manual comments are automatically doomed to failure. Worse, any given piece of software spends more of its lifetime in an altered/updated state than it does in its original virgin “as-it-was-originally-typed-in” state. Therefore, on any non-trivial project, the non-source-code forms of documentation are guaranteed, mathematically, to go out of date at least as fast as you can fix them through periodic (or even continuous) manual updating.

Source code is NOT the perfect documentation … but unit tests are. Just as source is – by definition – always up to date, and manual comments are effectively guaranteed over time to become less and less up to date, unit tests always remain up to date *or they fail*, since they in no way rely upon how the source is written internally, and instead are tightly bound to the *usages* of the source.
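
To make that concrete, here is a minimal sketch in Python's standard `unittest` module. The function `apply_discount` and its rules are hypothetical, invented purely for illustration; the point is only that each test name reads like a sentence of documentation, and that sentence cannot quietly go stale, because the test fails the moment the behaviour changes.

```python
import unittest


def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical unit under test: return the discounted price in cents."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents - (price_cents * percent) // 100


class ApplyDiscountDocs(unittest.TestCase):
    """Each test states one documented fact about how the unit is used."""

    def test_ten_percent_off_1000_cents_is_900_cents(self):
        self.assertEqual(apply_discount(1000, 10), 900)

    def test_zero_percent_leaves_price_unchanged(self):
        self.assertEqual(apply_discount(1999, 0), 1999)

    def test_percent_above_100_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(1000, 101)
```

Saved as a test module, this would typically be run with `python -m unittest`. If someone later changes the rounding or the accepted range, the "documentation" above stops passing rather than silently lying.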

(an aside: if you use unit tests, but don’t run Continuous Integration (CI) or a similar build process that automatically runs all unit tests on the order of once a day, and reports in big flashing emails when any fail, then it’s time for you to re-read the manual. I add this only because there are people who don’t do this, and they could legitimately poke holes in the above, but you guys – you need to realise you’re in a small minority of people who are “missing the point” :P)

What does this buy you?

(subtext: Why did I bother writing this blog post anyway? ;))

Because it has one very very important knock-on effect: if you have no documentation for your system and you’ve decided you really need some (and, let’s face it, the war over whether documentation was “actually needed” was won a long long time ago), then I have a piece of advice:

Don’t write any documentation; write unit tests instead.

This is more important than merely arguing that unit tests are “better” documentation (personally, I think they aren’t – they are only perfect in a provable sense, not in an “actually solve all the problems that documentation is needed for” sense) – this is really about the practical realities of your situation.

If you already have the system, and you have no documentation, you will never have the time to write it.

(this is something of a truism that is commonly accepted for reasons of sheer practicality)

But … you will have the time to write unit tests. And it’ll be a heck of a lot cheaper and easier than writing docs. To write even the first sentence of your first document, you need full knowledge of the ENTIRE system. To finish writing the docs you need absolute knowledge of the system. That means you have to do 100% of the work (reading the source) at any given level of abstraction in order to do even 1% of the documentation.

But – by definition – you can start writing unit tests the moment you’ve even just tried running the application once. Unit tests require extremely limited, extremely partial information – the less knowledge you have about the SYSTEM when writing tests for the UNITs, the better. In the long run, you can write unit tests to cover the key 25% of your system functionality while doing as little as perhaps 5% of the work (reading the source).
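
One way to act on this with almost no knowledge of the system is to pin down what a unit does today, purely by calling it and writing down what comes back. The sketch below assumes a hypothetical legacy function, `legacy_format_name`, standing in for whatever undocumented unit you have inherited; its body is included only so the example runs, and the tests deliberately ignore it.

```python
import unittest


def legacy_format_name(first: str, last: str) -> str:
    """Stand-in for an existing, undocumented unit (hypothetical)."""
    return f"{last.strip().upper()}, {first.strip().title()}"


class LegacyFormatNameObservedBehaviour(unittest.TestCase):
    # Expected values were found by calling the function and recording
    # the result, not by reading or understanding its implementation.

    def test_last_name_is_upper_cased_and_comes_first(self):
        self.assertEqual(legacy_format_name("ada", "lovelace"), "LOVELACE, Ada")

    def test_surrounding_whitespace_is_dropped(self):
        self.assertEqual(legacy_format_name("  grace ", " hopper"), "HOPPER, Grace")
```

Each test costs a few minutes and records one true fact about the system, which is exactly the "1% of the documentation for 1% of the work" trade that prose docs cannot offer.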

So, in the real world: “when the system already exists, don’t write documentation – write unit tests”

Final thought: yes, this works in the real world … but what about for really LARGE and COMPLEX systems? Are tests the best form of documentation for them too?

Epilogue: what are the roles of documentation?

Just to give a flavour, they include:

  • educating the user on the abstract concepts that are enshrined in the application
  • defining the problems that the binary code is attempting to solve
  • telling the user where to “start” with trying to use the binary code, whether as library or app
  • defining a contract of what the code will / will not do that is isolated against future updates of the source (source only says what it does now, in this (possibly bugged!) version; documentation tells you what it will CONTINUE to do in future versions, even if bugfixes change the current behaviour)

2 replies on “Source code is the best form of documentation (not)”

(all due to a conversation I had with JT (?) where this came up and it seemed worth retelling to a wider audience…)

Now that’s an interesting point of view, I didn’t think about it.

I just want to add a point: documenting an existing system is just boring. Writing unit tests is more valuable.

I had to document an already existing application some time ago and generate some HTML out of it. The odd thing was that I wasn’t the creator of that application, and the programmers who wrote the code were not part of the team anymore. I did it because I was paid for it, but I’m sure it was wrong, useless and, man, boring. Now in this case I think your scenario fits like a glove and unit tests might really help.

Dacian
