Describing the System

I can hear the snoring already. Documentation? The topic is documentation?

As I mentioned in the welcome, the goal of this blog is to shine a light on things that we pretend are something they are not. Nothing fits that description better than the documentation of a software product.

Nobody would disagree that documentation can be useful. It helps communicate intent to people we cannot educate directly. However, there are several problems with documentation:

There is a cost of creating it.
If it is wrong, it does more harm than good.
It often calcifies our thinking. Spending hours on something creates an investment in the idea.

So...our objectives are to make sure that:

There is really a need. The cost of creating the documentation should be merited because future team members (or perhaps even current team members) will be more efficient with this knowledge (or avoid screwing up something they didn't understand).
The documentation is written in a way that it is most likely to survive changes.
The documentation is either written AFTER the work is complete (truly as just a communication to future owners) OR is light enough that we are willing to abandon it immediately if the idea should be scrapped.

Some documentation is clearly necessary and we commonly agree on the need. In this category I include:

APIs. API documentation is often the only way to understand the rules for interacting with a system. We all expect REST services, public class/function libraries, and database structures to be well described so that we know how to use them. At a minimum, input/output parameters should be fully described and examples should be provided. Really good API documentation includes working samples that can be invoked and experimented with. Here is an example: https://www.leaguevine.com/docs/api/. It is important to note that most application APIs should NOT be documented, i.e., if the only consumers of this documentation are the team members working on the app, and those developers can look at the implementation themselves, then it is NOT a good investment.
Systems. Systems documentation describes the large scale interaction between parts of the system. Without this it is almost impossible to comprehend many systems. The focus should be on naming the components and their large-grain interactions.
Operational documentation. This is usually a superset of the Systems doc and is provided to help operational staff troubleshoot issues. Like the Systems doc, it is mostly a description of the major components and the data passed between them. It usually includes more pedestrian facts like addresses, identifiers, etc. Troubleshooting tips are also a great addition. The common mistake in this documentation is to try to make the operator an expert on the app. The truth is that almost nobody who provides operational support will read deeply into this doc. It must be basic and to the point. This is not a criticism of people doing operational support. Rather, unless the system is having chronic problems (in which case documentation isn't going to solve the problem), the folks you would like to read this documentation have no motivation to do so. And if they do read it, it will quickly disappear from their memory. If there is too much doc of this nature you can guarantee that nobody will read it.
Rules. Rules are actually a form of requirement. We need to document these at some point for acceptance and test purposes but it should be clear: these rules ARE NOT the rules for the system once it is in the wild. These rules will never be perfectly in tune with the code. The final arbiter is the code. Do not get fixated on updating these rules.
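
To make the API item above concrete, here is a minimal sketch of a documented public function. The function `formatScore`, its parameters, and its error behavior are all invented for illustration; they are not taken from any real library. The point is only the shape: every input and output described, plus a runnable example.

```javascript
/**
 * Formats a game score for display.
 * (Hypothetical function, used only to illustrate API documentation.)
 *
 * @param {string} homeTeam - Display name of the home team.
 * @param {number} homeScore - Points scored by the home team (integer >= 0).
 * @param {string} awayTeam - Display name of the away team.
 * @param {number} awayScore - Points scored by the away team (integer >= 0).
 * @returns {string} A single line such as "Hawks 15 - 12 Owls".
 * @throws {RangeError} If either score is negative or not an integer.
 *
 * @example
 * formatScore("Hawks", 15, "Owls", 12) // "Hawks 15 - 12 Owls"
 */
function formatScore(homeTeam, homeScore, awayTeam, awayScore) {
    for (const score of [homeScore, awayScore]) {
        if (!Number.isInteger(score) || score < 0) {
            throw new RangeError("scores must be non-negative integers")
        }
    }
    return `${homeTeam} ${homeScore} - ${awayScore} ${awayTeam}`
}
```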

The what and how of other documentation is more murky. In this category I include:

App/service details. This doc describes the detailed flows and interactions of the app. The intent is to give the future developer some understanding of the app patterns so they will be more efficient (and not screw up something important because of a lack of understanding).
Code. This documentation is embedded in the code itself and attempts to clarify confusing logic.

App/service details. This documentation is often a lost cause. Team members understand the complexity of their app and sometimes go to great lengths to describe it to others. Inevitably, though, this doc takes a lot of work, becomes obsolete rapidly and is rarely read. The problem is that the authors try to explain too much. The solution is what I call "Concepts and Breadcrumbs". This type of doc should focus on communicating concepts, not details. For instance, it might make sense to have a few sequence diagrams which just touch on the complex (or ubiquitous) patterns of the app. It should NOT try to be complete. In fact, it should insist on being incomplete. Incomplete artifacts say "Don't trust me....I'm just giving you the ideas". A few sequence diagrams, maybe some text, a couple of class diagrams. Some way of communicating the important patterns. Whatever works. If you are spending a lot of time on this you have gone way too far!

In addition to concepts there should be a few "breadcrumbs" to point you to critical code. Breadcrumbs are class names, critical module names, etc. The important point is that "breadcrumb" documentation does not try to elaborate. Rather, it is just some hints on how to walk into the code and get to the juicy pieces, i.e., help the new/future team member get to the meat but let them walk around and run code to learn more themselves.

Code. Code documentation ("commenting") is often the largest amount of documentation in the system. The biggest problem with code commenting is that it often misleads the consumer. Code comments are very detailed. The developer adds them because they think the next developer (or they themselves in the future) will have difficulty understanding the logic. The goal of commenting is laudable, but the reality is that this documentation, without a compiler to check its truth, can become incorrect when the code changes in the future. Now the comment is actually misleading and can cause considerable confusion.

The best way to comment code is to do as little as possible. Rather than write comments the developer should organize and name code in ways that make it self-documenting. Specifically:

Name carefully! Class names, function names, variable/property names. Use names that make sense. A variable named "x" conveys no meaning. The purpose of a variable named "lastRequestTime" is very clear. Abbreviations should usually be avoided. It is bad practice to abbreviate unless it is clear the future audience will understand the shortened version. The variable name "lstReqTm"? Not so helpful.
Names should be precise. A function named processData tells us nothing. Changing that to storeResults would be much better.
Avoid magic numbers. It is better to name a number in a constant and then reuse that constant in the code. This is an excellent way to document it. For example, this...
```
    let width = 2 + contentWidth + 2  // margin is 2
```
...would be written better as:
```
    const MARGIN_WIDTH = 2
    let width = MARGIN_WIDTH + contentWidth + MARGIN_WIDTH
```
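
The naming points and the named constant can be combined. A minimal before-and-after sketch, in which the function names and the one-minute threshold are invented for illustration:

```javascript
// Before: the names say nothing about intent, and 60000 is a magic number.
function check(x, t) {
    return t - x > 60000
}

// After: the same logic, with names that explain themselves.
const REQUEST_TIMEOUT_MS = 60000

function hasRequestTimedOut(lastRequestTime, currentTime) {
    return currentTime - lastRequestTime > REQUEST_TIMEOUT_MS
}
```

Both functions behave identically; only the second can be read without opening its body.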
Write small functions that do what their names imply. A common mistake when documenting code is to create long functions with comments that describe each portion of the code. A better approach is to make a separate function for each chunk of logic and name it well. This results in self-documenting code. For example, this...
```
    function createSaveCustomerRequestData(customer) {

        ...

        // create address data
        let addressData = ...
        ...
        requestData.address = addressData

        ... 

    }
```
...would be written better as:
```
    function createSaveCustomerRequestData(customer) {

        ...

        let addressData = createAddressRequestData(customer)

        ... 

    }

    function createAddressRequestData(customer) {

        let addressData = ...

        ...

        return addressData

    }
```
This is not to say that all code comments are bad. On the contrary, there are many cases where commenting is necessary. In loosely typed languages, for instance, it is very helpful to get some indication of the type of a variable or parameter when it is not clear from its usage. As an example, it is clear from this comment that the variable holds an array of strings (rather than Date objects).
```
    var submissionDates = [] // [string]  ... YYMMDD formatted
```
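
If the team already uses JSDoc, the same hint can be written in a form that editors and type checkers can read. This is only a sketch: `latestSubmissionDate` is a made-up helper, and whether any tool actually checks the annotations depends on how the project is set up.

```javascript
/** @type {string[]} Submission dates, each formatted as YYMMDD. */
const submissionDates = []

/**
 * Hypothetical helper, included to show a documented parameter and return type.
 *
 * @param {string[]} dates - Dates formatted as YYMMDD.
 * @returns {string | undefined} The latest date, or undefined for an empty list.
 */
function latestSubmissionDate(dates) {
    // Fixed-width YYMMDD strings from the same century sort chronologically as text.
    const sorted = [...dates].sort()
    return sorted[sorted.length - 1]
}
```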
There are, of course, other good reasons to provide extra comments. When something is particularly complex or inexplicable (e.g., you are doing something just to work around a bug in a library) you should add some comments to explain.
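
A sketch of a comment that earns its place by explaining why the code looks odd. The scenario is invented: `parseRows` stands in for a third-party parser, stubbed here so the example runs, and it is assumed to append an empty row when the input ends with a newline.

```javascript
// Stand-in for a third-party parser (hypothetical), stubbed so the sketch runs.
// It is assumed to emit an extra empty row when the input ends with a newline.
function parseRows(text) {
    return text.split("\n").map(line => (line === "" ? [] : line.split(",")))
}

function readCustomerRows(text) {
    const rows = parseRows(text)

    // WORKAROUND: the parser appends an empty row when the input ends with a
    // newline. Drop it here so callers never see a phantom customer.
    // Remove this once the parser is fixed.
    const lastRow = rows[rows.length - 1]
    if (lastRow !== undefined && lastRow.length === 0) {
        rows.pop()
    }

    return rows
}
```

Without the comment, the `pop()` looks like a bug waiting to be "fixed" by the next developer. No amount of careful naming would explain why it is there.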

One of the best moderators of commenting is the code reviewer. If you need to explain something to the reviewer, it could mean your code needs restructuring, but it might also mean it is a good place to add some comments. Occasionally adding a pull request comment like "this needs some explanation...can you add a comment here" is very acceptable.
Create examples. The best way of documenting code is with examples. This can happen naturally via unit tests. For component libraries it is often helpful to have test rigs or "recipes" that have live code. UI components get great benefit from a sample UI that shows how the component works (and thus shows how to interact with its API).
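
As a sketch of tests doubling as examples, the following uses Node's built-in `assert` module. The name `createAddressRequestData` is borrowed from the earlier snippet, but its body and the shape of the customer object are assumptions made purely for illustration.

```javascript
const { deepStrictEqual } = require("assert")

// Hypothetical implementation; the field names are invented for this sketch.
function createAddressRequestData(customer) {
    return {
        street: customer.street.trim(),
        city: customer.city.trim(),
        postalCode: customer.postalCode.replace(/\s+/g, "")
    }
}

// The assertion reads as a usage example: input first, expected result second.
deepStrictEqual(
    createAddressRequestData({ street: " 1 Main St ", city: "Springfield", postalCode: "12 345" }),
    { street: "1 Main St", city: "Springfield", postalCode: "12345" }
)
```

Unlike a comment, this example fails loudly the moment the function stops behaving as described.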

It is easy to dismiss the art of describing a system as a lesser skill in software development, but I rarely encounter a great system that is poorly described. Good documentation is important.
