Advent of Technical Writing: Duplicate Information
Published under the Advent of Technical Writing category.
This is the sixth post in the Advent of Technical Writing series, wherein I will share something I have learned from my experience as a technical writer. My experience is primarily in software technical writing, but what you read may apply to different fields, too. View all posts in the series.
When you write documentation, you will find that you want to repeat some concepts and instructions in multiple places. For example, you might want to explain how to find an API key in every guide that needs one. I used to make the mistake of writing a short summary in each guide because it was only a paragraph or two. The result? If the UI changes, those guides will be out of date; if we move the link to access the page elsewhere, content will need to be updated in multiple places.
There is a general rule to which I attempt to adhere: Document once, reference anywhere. When information changes, you can change one place and know your documentation will maintain cohesion and stay in sync with the status quo of the product, API, or software you are documenting.
Pillar content
As part of writing and maintaining documentation, you will start to get a sense for "pillar", or canonical, pieces of content. This content is your definitive source on a particular piece of information.
For example, Roboflow, my employer, has a blog post on image augmentations and preprocessing best practices. This is our canonical guide on the topic. When we write content that mentions augmentations or preprocessing, we often say that we recommend reading our full guide and link inline. We don't repeat the same advice. We link to the relevant content.
By having key information in one place, you only need to update that information once if it requires a change.
Consider a guide that documents how to perform a product action, such as finding an API key. If you explain how to find an API key in every blog post you write that needs an API key, then whenever the process changes you have to go back and update every one of those posts. If you instead say "Learn how to retrieve an API key." in your content and link to your canonical source, you only need to update information once.
Keeping duplicate information in check
Duplicative information can be appropriate when you want to produce an end-to-end guide. For example, if you are writing a guide that shows how to get started with an application, you may repeat information that is available elsewhere. Having the information in a single guide is to the benefit of the reader: they can stay on the same page while accomplishing the goal of the tutorial.
With that said, you need to keep duplicate information in check. When you are writing, you should actively ask "is this documented elsewhere?" If the answer is yes, ask: should I be referring to that content? In cases where you find yourself likely to repeat paragraphs of information that is elsewhere, the answer is probably "yes."
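If your documentation lives in text files, the "is this documented elsewhere?" question can be partly automated. Below is a minimal sketch, not a tool I use in production: it assumes paragraphs are separated by blank lines, and the function names and the `min_words` threshold are hypothetical choices for illustration. It flags paragraphs that appear in more than one document.

```python
import re
from collections import defaultdict


def normalize(paragraph: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return re.sub(r"\s+", " ", paragraph).strip().lower()


def find_duplicate_paragraphs(
    docs: dict[str, str], min_words: int = 8
) -> dict[str, list[str]]:
    """Return paragraphs that appear in more than one document.

    `docs` maps a document name to its full text. Paragraphs are assumed
    to be separated by blank lines. Short paragraphs (fewer than
    `min_words` words) are skipped to avoid flagging headings.
    """
    seen = defaultdict(set)
    for name, text in docs.items():
        for paragraph in re.split(r"\n\s*\n", text):
            key = normalize(paragraph)
            if len(key.split()) >= min_words:
                seen[key].add(name)
    # Keep only paragraphs found in two or more documents.
    return {p: sorted(names) for p, names in seen.items() if len(names) > 1}
```

A match is only a prompt to ask the question, not an answer to it: an end-to-end guide may repeat a paragraph on purpose. This sketch also only catches exact repeats (ignoring case and whitespace), so reworded duplicates will slip through.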
Context and repeating information
There is a fine balance between providing context and repeating information. You may include a summary of a topic you introduce elsewhere in a guide as this is often beneficial to the reader.
If I started to write four paragraphs on what a topic is when I already have another guide elsewhere, the reader may be better served with a transition sentence to the effect of:
Interested in learning more about {topic}? Check out our full guide.
Here, I can point the reader to a canonical resource and ensure that the article I am writing contains only the necessary information the reader needs to achieve the stated goal of the article.
Duplicate information across platforms
In addition to duplicating information across content, you might find yourself duplicating information across platforms. This is presently an issue with which I am dealing. I co-maintain an ecosystem of libraries called Autodistill, a tool for automatically labeling images for use in building computer vision systems. Each library is its own Python package and GitHub repository. Every package is documented in two places: the package README and the central documentation.
The result? One source -- the central documentation -- is often out of sync, or missing pages entirely. We made the decision to have the information in two places so that no matter where developers came from -- a README for a library in the ecosystem, the main documentation -- they could find what they need. In hindsight, the READMEs were enough; we could always link to them in the sidebar of our documentation. That is what we are doing now.
Similarly, you might have information that is duplicated across a help centre and your main product documentation. When the underlying information changes, you need to update both places.
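If you do have to keep two copies for now, a small check can at least tell you when they drift apart. The following is a rough sketch under stated assumptions: both copies are available as plain text, and the names `content_hash` and `out_of_sync` are hypothetical, not part of Autodistill or any other library. It compares a hash of each pair and reports the pairs that differ.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash text after trimming trailing whitespace on each line."""
    normalized = "\n".join(line.rstrip() for line in text.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def out_of_sync(pairs: dict[str, tuple[str, str]]) -> list[str]:
    """Return the names of pairs whose two copies differ.

    `pairs` maps a label (for example, a package name) to a tuple of
    (canonical text, duplicated text).
    """
    return sorted(
        name
        for name, (canonical, copy) in pairs.items()
        if content_hash(canonical) != content_hash(copy)
    )
```

Run as part of a build or review step, a check like this turns "remember to update both places" into a failure you cannot miss. It does not remove the duplication, though; linking to one canonical source still does that better.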
Often, it is not the act of updating the information that is arduous. Rather, it is remembering to update the information in the first place. To avoid this dilemma? Document once, reference anywhere.
