Fifteen Thoughts and Tips on Writing Software Documentation

In the 1.5 weeks before I released Globus Provision (http://globus.org/provision), I spent a fair amount of time working on its documentation (http://globus.org/provision/docs.html). I am actually a bit of a documentation junkie; I honestly enjoy writing documentation and I am an advocate of taking the time to produce quality documentation before releasing software (as opposed to treating it as “something we’ll get to later”). I ended up writing 88 pages of documentation and, in the process, found myself reflecting on why I enjoy writing documentation, and what strategies I follow to write documentation that (I’ve been told by others) is readable and useful.

So, now that the dust has settled after the release of Globus Provision, I wanted to share those thoughts and tips. I think they might be useful to people who, like me, do not have any formal training as technical writers but are thrust into a position of writing documentation. In my case, I’m primarily a researcher and developer who works on open source software, and who often has to document my own work so that others can either (1) replicate my results or (2) use my software.

[Quick documentation-oriented bio: I am a researcher at the University of Chicago’s Computation Institute (http://www.ci.uchicago.edu/), and have been involved in the open source Globus project (http://globus.org/) for eight years. Most of the documentation I’ve written has been for that project, starting with “The Globus Toolkit 3 Programmer’s Tutorial” (http://gdp.globus.org/gt3-tutorial/) in 2004, which was well received by the Globus community. I updated the tutorial for the Globus Toolkit 4 in 2005 (http://gdp.globus.org/gt4-tutorial/), and also co-authored (with Lisa Childers) a book on the Globus Toolkit 4 (http://www.gt4book.com/) which was released in 2006. Then I went off to PhD-land, and emerged in 2010 with a dissertation (http://people.cs.uchicago.edu/~borja/dissertation/) and an open source project that I developed to obtain the results in my dissertation (http://haizea.cs.uchicago.edu/). This project was released with a manual (http://haizea.cs.uchicago.edu/documentation.html) separate from my dissertation.]

Ok, on to the “thoughts and tips”. Take into account that they’re written mostly from the perspective of documenting open source software, which tends to encompass several types of documentation: tutorial-style documents, API references, etc. Furthermore, as I have already stated, I am not a technical writer, and I do not claim that these tips reflect best practices in the technical writing community. Ultimately, these tips are just my personal opinion, so please take them with a healthy dose of salt; they’ve worked for me, and they may or may not work for you.

1. Writing documentation == delayed gratification
A few days ago, as I was writing the Globus Provision documentation, I went through a bout of writer’s block. I took a step back, and realized that writing the documentation wasn’t particularly “fun”, yet a part of me still felt passionately about the need to get it done. How could I reconcile this dichotomy? I’m a self-described documentation nut, and yet I’m not particularly enjoying the act of writing (certainly not as much as I enjoy coding, which I find to be plain and simply fun, in and of itself).

It then hit me that the reason I “enjoy” writing documentation is not so much the act of writing itself as what that writing will accomplish: happier users who will understand what the software does and how to use it. Happier users means fewer users giving up on your software, fewer users giving up means more potential to build a thriving community around the software, etc.

I don’t think I could produce as much documentation if I didn’t accept that it very rarely produces any immediate gratification. Again, this is fairly different from coding, where there is usually some immediate gratification: you can test your code, verify that it works, pat yourself on the back, and move on to the next problem. I have to wonder if this is one reason why writing documentation gets a bad rap amongst coders.

Just to be clear, I generally do enjoy the act of writing, and I had this “meta” moment during a bout of writer’s block. More about writer’s block, and what to do about it, in tip #10.

2. Make sure there is a coherent narrative
If you’re going to write a manual or some form of tutorial-style documentation, make sure it has a coherent narrative. This mostly involves starting with simple material, building up to the more complex stuff, and always giving the reader some sense of why the content is being presented in that order.

For example, have you ever been in a lecture where there is definitely a “topic for the day”, but the lecturer mostly delivers disjointed bits of information about it? Those are the kind of lectures where it’s hard to answer the question “What have I learned in this lecture?” (other than some raw content that you could’ve looked up on your own). There’s no “story” being told, nothing to take away that connects to other lectures in the class, just raw information. It’s the same with documentation: there’s only so much that a reader can do with raw information. You must craft a “story” that makes the reader want to keep on exploring the documentation.

For me, the rule of thumb is looking at the whole documentation, and asking myself if it would be possible for someone to read the whole document from beginning to end without getting confused and without encountering too much repetition. Another rule is checking whether I can write transitional material between chapters. If I can’t explain at the beginning of a chapter how this chapter relates to the previous one (or others), then I’ve screwed up somewhere.

Take into account, however, that most readers will not read the whole document from beginning to end (more on this later). Even so, having a coherent narrative forces you to write the documentation in such a way that, if someone skips straight to a given chapter, at least they’ll have some sense of how that chapter relates to the rest of the documentation.

Finally, all of this is not to say that dumping raw content isn’t appropriate in some cases. If you’re writing the part of the documentation that simply lists all the API functions, configuration options, etc., there’s no need to thread all that together with a narrative. However, I do prefer to set that content aside in the form of appendices, so it is clear that they’re meant as “reference chapters”, and not part of the main narrative of the documentation.

3. Provide a single entry point
In other words, write an introduction for your documentation. Seems obvious, but I’ve seen plenty of documentation that does not include one. To be effective, this introduction must provide an “executive summary”, especially if you’re documenting a piece of software (for those readers who simply want to get an idea of what your software can and cannot do, and don’t intend to actually use it). You should also explain how the content of the documentation is laid out and how it should be read (should it be read sequentially? what content can be safely skipped, and when is that advisable? etc.).

Sometimes, you may want to provide two entry points: the actual “introduction” to the documentation, and a “quickstart” chapter for readers in a hurry (unlike the introduction, a quickstart should include not just an executive summary, but also some quick examples that the reader can work through to get up and running as quickly as possible). At the end of the quickstart, you can include pointers to other chapters of the documentation that expand on the material covered in the quickstart.

4. Expect readers to ignore your single entry point
Don’t delude yourself into thinking that everyone will read every single page of your carefully crafted documentation. Some readers will get to your documentation having some familiarity with the material, and won’t feel the need to read the introductory material, skipping straight to the chapter that covers some specific topic they are interested in.

Take this into account when writing each chapter, and avoid assuming that the reader has necessarily read every chapter before it. Of course, you don’t have to start each chapter with a full re-introduction to the material, but you can include a short blurb that implicitly states your assumptions about the reader for this chapter. For example, simple transitional paragraphs like: “In the previous chapter, we saw how to calibrate the flux capacitor using a regular screwdriver. In this chapter, we will see how to do so with a sonic screwdriver”. If the reader doesn’t know how to calibrate the flux capacitor with a regular screwdriver, that blurb tells him that he probably needs to read the previous chapter before attempting to move on to the (presumably more advanced) topic of using sonic screwdrivers.

This is actually one reason why keeping a coherent narrative can be helpful: it makes it much easier to write transitional material between chapters, which makes it easier for readers who land on a (non-introductory) chapter to “reorient” themselves quickly.

5. Don’t write formally (but not informally either)
I guess this is a matter of taste, but I find documentation written in a very formal style hard to read (and hard to write). However, I wouldn’t recommend going to the other extreme of writing too informally, not because it’s “bad”, but because it’s hard to get right. Don’t do it unless you are confident that the informal style won’t distract the reader. For example, Beej's Guide to Network Programming (http://beej.us/guide/bgnet/) uses a very informal style, but still manages to be very effective and readable. I’d say the style to aim for is “casual”, even if you have to break a couple of style rules along the way.

One concrete suggestion, though: avoid using colloquialisms or pop culture references. Why? Not all your readers are native English speakers, and they may have trouble parsing very idiomatic English.

6. Provide examples in abundance
Take the time to include a lot of examples in your text, especially if you find yourself describing something that sounds too abstract. Take into account that you don’t always have to include full-blown examples that the reader can go and try out. Sometimes a snippet of code or a figure, just for illustration purposes, can go a long way.

I personally like to prepare all the examples before I start writing the actual “prose” of the documentation (besides one-shot examples, I try to include at least one running example per chapter). It helps me to flesh out the narrative of the text, and then gives me something concrete to write prose around.

7. Automate as much as possible
The least pleasant part of documenting software can be writing the specification of some API or file format you’re using in your software. This is actually not so bad the first time you do it. It gets more tiresome when you start updating your software and have to keep your code and documentation synchronized. However, if you write your code in such a way that the documentation is generated automatically from the code, this becomes trivial.

When documenting APIs, class hierarchies, etc., the solution is simple, as most programming languages allow you to generate this kind of documentation automatically from the code (using Doxygen, epydoc, etc.). However, you can take automation even further: if you define a file format, create self-documenting data structures for it in your code. For example, I ended up creating my own set of classes on top of Python’s ConfigParser to define configuration files that incorporated documentation and other metadata (like the expected type of configuration options, what options are required, etc.; if you’re curious, you can see this in modules globus.provision.common.config and globus.provision.core.config of the Globus Provision source code).
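
To make the idea of self-documenting data structures more concrete, here is a minimal sketch in Python. It is not the Globus Provision code: the Option class, the SCHEMA list, the option names, and both functions are hypothetical, and it uses the Python 3 configparser module. The point is only that each option's type, whether it is required, and its description live in one place, which both validates a configuration file and produces the reference text.

```python
import configparser
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One configuration option, described alongside its validation rules."""
    section: str
    name: str
    kind: type          # expected type of the value, e.g. str or int
    required: bool
    doc: str            # human-readable description used in the reference
    default: object = None


# Hypothetical schema; these option names are for illustration only.
SCHEMA = [
    Option("general", "deploy", str, True,
           "Where the instances are deployed."),
    Option("ec2", "instances", int, False,
           "Number of instances to start.", default=1),
]


def load(text, schema=SCHEMA):
    """Parse configuration text and return {(section, name): typed value}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    values = {}
    for opt in schema:
        key = (opt.section, opt.name)
        if parser.has_option(opt.section, opt.name):
            # Convert the raw string to the type declared in the schema.
            values[key] = opt.kind(parser.get(opt.section, opt.name))
        elif opt.required:
            raise ValueError(f"missing required option [{opt.section}] {opt.name}")
        else:
            values[key] = opt.default
    return values


def render_reference(schema=SCHEMA):
    """Generate a plain-text reference entry for every option in the schema."""
    lines = []
    for opt in schema:
        status = "required" if opt.required else f"optional, default: {opt.default}"
        lines.append(f"[{opt.section}] {opt.name} ({opt.kind.__name__}, {status})")
        lines.append(f"    {opt.doc}")
    return "\n".join(lines)
```

With something along these lines, adding a new option to the schema updates both the validation and the generated reference, so the two are much less likely to drift apart.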

Writing documentation can also be aided by using documentation languages and software. A long time ago, I used DocBook, but my current favorites are reStructuredText (http://docutils.sourceforge.net/rst.html) and Sphinx (http://sphinx.pocoo.org/).

8. Provide your documentation in multiple formats
There is no single ideal format to publish your documentation in. Single HTML? Can be hard to navigate. One HTML file per chapter? Easy to navigate, but not good when you want to grep for some specific term across the entire text. And neither HTML option is particularly good for printing or for reading on an e-reader. PDF? Good for printing/e-reading, but not that easy to browse or navigate.

What’s the solution? Just provide your documentation in as many formats as practicable. This is another reason why using documentation languages/software, like DocBook or Sphinx, is worth the effort: generating multiple file formats from a single source is practically trivial.

9. Seek feedback
Before releasing documentation, try to get someone to proofread it for you (ideally someone who actually does have a background in technical writing). After staring at the same documentation for days, a lot of small silly mistakes end up slipping through the cracks. A fresh pair of eyes always comes in handy for this.

10. Write when you feel like it, not when you have to
If you sit down to write documentation, it’s very likely you’ll eventually run into “writer’s block”. I’m not referring to 15-20 minutes of downtime while you try to figure out some particularly tricky paragraph; I’m talking about moments when you simply can’t get any writing done for days or even weeks.

Over time, I’ve learned that, when this happens, the best thing I can do is to simply walk away from the documentation, and not come back until I feel like it. If you try to force yourself to write, it will only make matters worse: you start stressing out because you’re not making any progress, and you try to force yourself to keep working even though you’re not in the right frame of mind to do it. Or, you decide to take a break, but spend most of that time feeling guilty because you’re not getting any work done. In either case, your stress multiplies because you’re “not being productive”. So don’t stress out about it: walk away, don’t worry about it, go do something else, and come back when you feel like writing.

Of course, this doesn’t mean you have to go off and procrastinate for several hours until you “feel like working” (although that can be good too, as long as you don’t spend all that time feeling bad about the fact that you’re not working). You can still be productive in other ways. For example, when I don’t feel like writing, I switch to working on automating some part of the documentation (coding == fun!), which makes me feel like I’m still doing something productive (in terms of generating more documentation).

This, by the way, is one of the big lessons I learned while working on my dissertation (which, in a sense, is a 5-year documentation project): if you don’t feel like working, then don’t stress out about it. Walk away, and come back when you feel like it. This might seem like a silly strategy: what prevents us from simply goofing off all the time, instead of getting work done? Well, the point is you do want to get work done; you just have to embrace that there are times when you won’t want to work, and that it’s not worth stressing out about it because, in the long run, the work will get done (just not _right now_).

11. Know your audience
Make sure you know who you’re writing for. Is this documentation for beginners or for advanced users? Sometimes, you’ll want to write documentation that can help beginners get started, but can also serve as a useful reference for advanced users, which means walking a fine line: you don’t want to lose any beginners along the way, but also don’t want to bore the advanced users by repeating material they already know.

I usually try to structure documentation in such a way that (1) a complete beginner can read the whole thing from beginning to end, and learn (almost) everything there is to learn about the software, and (2) an advanced user who is already familiar with the software can revisit the documentation further down the road, and learn how to do some “advanced” task that he wasn’t interested in when he first read the documentation.

In any case, a good way of making sure you don’t lose anyone along the way is to explicitly call out your assumptions about what the readers do and do not know. The beginning of a chapter is a good place to point this out (see Tip #4).

Finally, one specific tip when writing for beginners: avoid statements like “I assume you know how OpenFoobar works; if you don’t, you can read their documentation at http://openfoobar.example.org/”. Although your readers may definitely benefit from reading that external documentation, it doesn’t hurt to provide a 1-2 paragraph summary yourself (and then indicate where they can find additional details).

12. Anticipate the “obstacle points” and spend extra effort on them
While learning about some new piece of software, there may be parts of it that new users will find hard to grok. You have to anticipate what these “obstacle points” will be, and make sure that you invest a bit of extra effort on them, such as writing a couple of extra examples, providing additional links to reference material, etc. This is related to the previous point: it’s hard to anticipate what these “obstacle points” will be unless you can put yourself in your audience’s shoes, and empathize with what parts they will find challenging.

Of course, once your documentation and/or software is published, you’ll actually start hearing from readers/users and you will get a better sense of what they’re getting stumped on. Revisit those parts of the documentation and flesh them out.

13. It’s not about you
This is a generalization of the previous two points, and something that I actually learned as a teacher, so let me start by explaining it from the perspective of teaching: a lecture is not about you (the teacher), or even about the material, it’s about the students. A lot of new teachers (myself included, when I first started teaching) tend to worry too much about their performance in the classroom: Will I make it through all the material? How will I come across? Will they like me? What if I make a mistake? Me, me, me, me, me!

However, that is the wrong frame of mind to be in. Before you walk into a classroom, your primary concern should be the students and their learning. At the end of the day, students will like you more if they actually learned something from you, not because you came across as charismatic, funny, a “nice guy”, etc. (which is certainly a nice bonus, but just that: a bonus). So, the questions you have to be asking yourself are: What skills do I want them to learn? How can I facilitate their learning? What examples can I prepare so that they find the material easier to understand? What extra references will they find useful?

I think this attitude translates very well to writing documentation, with readers and users instead of students. When writing documentation, you shouldn’t be thinking about how to write some witty piece of prose or about how to best highlight all the features of your software. You need to focus on your readers/users and be empathetic to their needs (which is, admittedly, harder than dealing with students in a classroom, since you don’t have the benefit of getting instant feedback).

In my case, when I write documentation, the first and foremost concern on my mind is: “How can I make this as easy as possible for the user?”

14. Allow documentation to shape your code
When you’re documenting your own software, you shouldn’t regard your documentation as a one-way reflection of your software. To put it another way, you shouldn’t regard the documentation as something that simply explains how your software works, and which only changes if the software itself changes; you should allow your documentation to influence your software too.

For example, while writing documentation, I sometimes realize that some particular command or series of steps requires a really convoluted explanation. And if it’s hard to explain, it will probably be hard for your users to do it themselves. When this happens, I revisit the code, and change it so it becomes easier to explain. Writing documentation becomes a driver for discovering how to make your software more usable.

15. Documentation is not something “we’ll do later”
Push back against the notion that documentation is a secondary concern, or something that “we’ll get to later”. Yes, two weeks spent on documentation could be spent on implementing some extra feature for your software, but the benefits of having good documentation are well worth the effort. Good documentation means happy users, happy users means users that will stick around and will want to continue using your software, and those are the kind of users that will form a community around your software. This is harder to achieve if you have crappy or no documentation, no matter what awesome features your software has.
Nice write up +Borja Sotomayor. Really enjoyed reading your thoughts.

I just miss one point: maintainability of the documentation. For stable APIs it's not a big deal, but when pushing for a launch-and-iterate model, it can end up adding quite a bit to the amount of work you need to do (I learned that while pushing the Meandre project http://seasr.org/meandre). For instance, one piece of documentation I love, and which made my Meandre days easier, was Scala Specs. Pretty straightforward to read, with little knowledge required to figure out what they are telling you (http://code.google.com/p/specs/wiki/QuickStart). You can write fairly verbose, useful examples that also work as tests and documentation. You can also always be sure they will align with the current code if they pass :)
 
Good point, Xavier. That is definitely the sentiment behind tip #7 ("Automate as much as possible"), although I think I haven't gone as far as I'd like (even though roughly 50% of the Globus Provision documentation is automatically generated from the code itself). One particular thorn in my side is testing: I'd really like to get to a point where the examples from the documentation are woven together with automated testing, so they don't exist in two separate places. In other words, I can run tests that both (1) test the software itself and (2) test that a user working through the documentation examples with that version of the software will be able to run the examples successfully.
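
One way to approximate this in Python is the standard library's doctest module, which runs the examples embedded in docstrings and checks their output. The following is only a minimal sketch: parse_instance_count is a hypothetical function invented for illustration, not something from Globus Provision.

```python
import doctest


def parse_instance_count(spec):
    """Return the number of instances requested in a spec like "4 x small".

    The examples below are documentation and tests at the same time:

    >>> parse_instance_count("4 x small")
    4
    >>> parse_instance_count("small")
    1
    """
    # Split on the " x " separator; without it, a single instance is assumed.
    head, sep, _ = spec.partition(" x ")
    return int(head) if sep else 1


if __name__ == "__main__":
    # Runs every ">>>" example in this module and reports any mismatch.
    results = doctest.testmod()
    print(f"{results.attempted} examples run, {results.failed} failed")
```

If the function's behavior changes and the example in the docstring is not updated, the run reports a failure, so the example cannot silently go stale. Sphinx also ships a doctest extension that applies the same idea to examples written in the documentation source itself.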

At some point, I'd also like to see if it is possible to apply Literate Programming (http://en.wikipedia.org/wiki/Literate_programming) to a Python project like Globus Provision (I know there are LP tools for Python, but haven't had a chance to try them out).
 
Interesting point about LP. I haven't played much with BDD in Python. I've seen some packages like nose http://readthedocs.org/docs/nose/en/latest/ or mock http://www.voidspace.org.uk/python/mock/ but they seem too test-oriented and lack the documentation flavor of Specs (disclaimer: I have not used them, and this is just the feeling I got after going through their documentation).

Scala BDD using Specs (or the Erlang or Haskell equivalents) is the closest I have been to the scenario you described in #7. Actually, I wrote specs for Meandre that were tests, but also the sequence of actions I wanted to illustrate in the documentation on how to use the API, servers, etc.

Also, when I read #7 I still felt that flavor of having to maintain two different entities, which, call me a dreamer, I would like to avoid. I bet with a little Specs parsing I could have generated documentation straight out of it.

But leaving this aside, I enjoyed the reading :)
 
Really good article. We migrated from DocBook to AsciiDoc a few months ago. It allows us to easily convert the docs into multiple formats. We've also been working a lot on automation: every example that appears in our documentation is pulled directly from the test suite.
 
Are they pulled automatically from the test suite, or do you mean you simply take care to include the documentation examples in the test suite? I guess I'm trying to figure out what's the best way of including the documentation examples in a test suite, without actually replicating them.
 
You have some great points here. My experience with writing documentation for OSG is that it's much more difficult to write end-user documentation or documentation for admins, since automation is not nearly so easy. Have you written anything like install guides for Globus, and how does your experience with that differ, if at all?
 
They are pulled from the test suite. Our test suite has a directory called documentation. Every example script that will go into a document or a research paper resides in this directory. The examples get included in the documents by using AsciiDoc's 'include' command. For example, instead of having Swift code in our documentation, we'll have something like this:

include::../../tests/documentation/tutorial/hello.swift[]
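
For readers who are not using AsciiDoc, the same idea can be sketched in a few lines of Python. This is an illustration of the concept only, not how AsciiDoc implements its include directive: expand_includes is a hypothetical helper, and it only handles the empty-bracket form shown above.

```python
import re
from pathlib import Path

# Matches a whole line of the form include::relative/path[]
INCLUDE = re.compile(r"^include::(?P<path>[^\[\]]+)\[\]$")


def expand_includes(text, base_dir):
    """Replace each include line with the contents of the referenced file."""
    out = []
    for line in text.splitlines():
        match = INCLUDE.match(line.strip())
        if match:
            # Resolve the path relative to the directory holding the document.
            included = Path(base_dir, match.group("path")).read_text()
            out.append(included.rstrip("\n"))
        else:
            out.append(line)
    return "\n".join(out)
```

Because the example files live in the test suite and are only referenced from the document, there is a single copy of each example to maintain.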
 
Suchandra: I haven't written install guides for Globus in a long time. Arguably, the Globus Provision documentation does include an install guide, but I actually tried to make it as simple as possible by modifying the installation procedure itself (Tip #14). I started off with an install from source, and realized that it was a pain to install, so I packaged it and uploaded it to PyPI so the installation would boil down to simply running "easy_install globus-provision".

But, yeah, I'd say automation is more difficult if you're documenting someone else's work, although I think it's still worth linking the examples to some automated tests (in the manner that David describes), so that you can "test" the documentation with the click of a button, instead of having to run through the examples manually (which I end up doing for a lot of my documentation).
 
Nice post, Borja. Thanks for giving props to the docs, and for actually following-through and writing great docs for your projects. : )
 
Hi Borja, I am just starting out as a Business Analyst here at Accel Software, and I write documentation and user manuals upon confirmation from the software testing team. This was really helpful for me, and for many others early in their careers like me.