Docs-as-code: the experience of writing objective content

Jun 25 2021

As I’m trying to navigate this whole “corporate technical documentation” gig, I’d like to do a crossover between different types of writing disciplines.

If Software Development is the marijuana cash cow of society, and Tech Writing is the drug dealer that you should have a good relationship with, Creative Writing is the maniacal addict.

Without further ado, code is easier to successfully complete than any other kind of writing.

Code has:

Objectivity and tests: a “right” answer returned.
Compiler checks for grammar.
Error logs: an in-built critique mechanism.

Compare this to creative writing, where:

It’s flat-out impossible to consistently elicit the same emotional response in your readers.
There are no pre-pass “compilers” or filters to run your manuscript through, while the spellcheckers that exist are seen as intruders to creative liberty.
You have to pore over the text and mark errors, or pay an editor for a month’s worth of logging.

Regarding that last point, if you even suggest automating part of your workflow, literary artists will gasp in horror at your lack of soul. After all, if you’re writing to a machine and getting feedback from a machine, how is your story supposed to be any good with human readers?

I, for one, welcome our robot overlords. Let’s not discount the power of computers. A program can do things like count repetitive vocabulary and make synonym suggestions. With machine learning, you can analyze plot structure (their research codebase is available here).
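As a rough sketch of the "count repetitive vocabulary" idea, here is a few lines of Python. The stop-word list, the `repeated_words` name, and the threshold are all my own assumptions for illustration, and it stops short of suggesting synonyms.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real tool would use a much longer one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "was", "it", "her", "his"}

def repeated_words(text, min_count=3):
    """Return (word, count) pairs for non-stop-words used at least min_count times."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [(w, n) for w, n in counts.most_common() if n >= min_count]

draft = (
    "She suddenly ran to the door. Suddenly the door opened. "
    "The hallway was dark, and suddenly the door slammed."
)
print(repeated_words(draft))  # [('suddenly', 3), ('door', 3)]
```

Nothing here judges whether the repetition is intentional. It only flags candidates for a human to look at.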

If we consider that even creative writing can benefit from automation and programming efforts, how much more critical would it be for technical writing?

Objectivity #

If the field of mathematics has a saving grace, it’s that it’s purely objective and imaginary. By stating your assumptions, they become concrete. Assumptions are imaginary until they’re codified.

For the people who love facts and straightforward answers, even if a problem is extremely difficult and unsolvable, even if you make a mistake, your due diligence will come to a head. There’s a theory out there waiting to be discovered that will verify your equation.

That’s what makes math great. That’s also what makes computer science great. The hours of brain twisting will pay off if you do your due diligence. It’s not about genius, although gifted people exist. It’s about reproducibility, reliability, and fairness. That’s what makes the rational sciences appealing: they’re meritocracies by nature.

tHe ExPeRiEnCe #

I’m just going to say that if you feel like writing sucks, it’s because we don’t treat it with the same slurry of tools available for programmers. To top it off, we always seem to forget the basics of scientific reporting: defining our assumptions and the terms we use.

Writing objective, rational content is hard, whether it’s a Bash script or a thesis statement.

In a modern IDE (integrated development environment), a tooltip pops up to suggest the available functions of a class or object.

Visual Studio Code shows a tooltip that completes the word "Promise" in TypeScript and includes function details.

Imagine if each company implemented its own Urban Dictionary, a glossary of special jargon. That way, the glossary could integrate with any software, whether it’s a Word plugin or a static site generator.

Tooltips magically pop up as you’re writing about certain topics, maybe not to the level of a machine learning AI predicting what you’re going to say, but at least a list of terms that match your incomplete word. Then, if you decide to hover over, the definition pops up.

idea of popups for words in a prose editor. Excuse my MS Paint skills…
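The tooltip idea above could be sketched roughly like this: match the half-typed word against glossary terms, then return a definition on hover. The glossary entries and the `suggest` and `define` names are invented for illustration, and a real plugin would hook into the editor's own completion API instead.

```python
# Hypothetical company glossary; terms and definitions are invented for illustration.
GLOSSARY = {
    "runbook": "Step-by-step instructions for handling a routine operation.",
    "rollback": "Reverting a release to the previous working version.",
    "rollout": "Gradually releasing a change to users.",
    "sprint": "A fixed-length block of planned team work.",
}

def suggest(prefix, glossary=GLOSSARY, limit=5):
    """Return glossary terms that start with the partially typed word."""
    prefix = prefix.lower()
    if not prefix:
        return []
    return sorted(t for t in glossary if t.startswith(prefix))[:limit]

def define(term, glossary=GLOSSARY):
    """Return the definition shown on hover, or None if the term is unknown."""
    return glossary.get(term.lower())

print(suggest("ro"))      # ['rollback', 'rollout']
print(define("Rollout"))  # Gradually releasing a change to users.
```

Because the glossary is plain data, the same file could feed a Word plugin, a static site generator, or anything else that can read a dictionary.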

The best part would be that it’s easy to add definitions to a casual dictionary if it’s friendly and fun like Urban Dictionary. If the tooltips appear frequently in daily workspaces, people would update the definitions when they become inaccurate and outdated (for those who need to scratch that itch of making sure everything is consistent).

Word has the ability to add tooltips, but it looks ugly as sin. Also, Microsoft Word’s “dictionary” feature is really an Ignore List to make red squiggles go away, and not quite like the IDE tooltips I’ve envisioned above.

It’s possible that the tooltips will be seen as intrusive by creative writers who just want to get into Zen mode. For the rest of us who work on legal, regulatory, business, technical, or academic writing, or who just churn out meeting notes in collaborative activities: as soon as multiple brains are involved, WE NEED TO DEFINE OUR ASSUMPTIONS!

I dream of a day where UX and Engineers aren’t making up their own terms for similar topics and confusing each other. We would all refer to the common glossary, because the tooltips are available in every text editor by default, or at least easy enough to install for any new hire.

Tools #

The Chicago Manual of Style is one of the many style guides that editors follow to create consistent documentation. There are so many rules to memorize. Why don’t we make our style guides into computer programs? Machines are better at following rules anyway.

Simply put, I don’t think people believe there’s money or value in it. Language evolves all the time, so who’s going to update rules if they’re hard-coded? You want to train an AI so it can autonomously learn? Tough luck finding someone to do that! Cheaper to just hire an editor, eh? :-)

Actually, there are people working on it. #

Linter

: the drying machine filter that catches clothing dust
: a program that catches stylistic errors in source code
: a fancy way to say proofreader

A linter is essentially a programmed style guide, like Grammarly on steroids. Vale is one such linter for prose.
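To make "programmed style guide" concrete, here is a toy prose linter in Python. It is not how Vale works internally, and it does not use Vale's rule format. The three rules and the `lint` function are made up for this sketch. Each rule is a pattern plus a message, and each finding is reported by line and column, the way a compiler would.

```python
import re

# Hypothetical house rules: (regex pattern, message). Invented for illustration.
RULES = [
    (r"\butilize\b", "Prefer 'use' over 'utilize'."),
    (r"\bvery\b", "Avoid the intensifier 'very'."),
    (r"\s{2,}", "Use a single space between words."),
]

def lint(text, rules=RULES):
    """Return a sorted list of (line, column, message), both 1-indexed."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for pattern, message in rules:
            for match in re.finditer(pattern, line, flags=re.IGNORECASE):
                findings.append((line_no, match.start() + 1, message))
    return sorted(findings)

sample = "We utilize a very simple process.\nClick  the button."
for line_no, col, message in lint(sample):
    print(f"{line_no}:{col}: {message}")
```

Since the rules live in a list rather than in the logic, updating the style guide means editing data, not rewriting the program.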

Citations are a great candidate for automation. Bibliography generators are a staple for any academic writer, and citation formats have to be codified somehow. The Citation Style Language is a project doing just that!

Finally, there’s Hemingway App, which is intentionally opinionated. Named after Ernest Hemingway, who was famous for his short and succinct style, the app identifies potential issues in your writing.

The downside of automation is that it is overwhelmingly biased toward English. Maybe later, we can see what it would mean to have better interoperability for other languages.

Error logging #

Error handling is a chore for the programmer. It involves writing error messages that explain how the user f*cked up. Like most chores in life, some people write excellent comments, and others write dogsh*t.

Error logging is important for computer programs, where an error is pointed out by line and column.

What about for text? We typically don’t label text by line numbers. Traditional manuscripts are marked up in the margins.

Manuscript formatting #

In the literary world, a magazine or publisher may ask you to submit stories in a properly formatted attachment. One common convention is the Shunn Manuscript format.

Shunn Manuscript Format first page

The monospaced font looks awfully familiar, almost like the same font that programmers use…

While monospace is ugly, it has the advantage of consistency. Editors need to estimate the size of the story, since a publication has limited space.

Anyway, I wanted to point out that docs-as-code is about text processing, and it’s not just for programmers.

Perhaps the creative writing world needs some modernizing. Imagine in the editing process, markup is given with exact coordinates, in a convenient list, rather than flipping pages for red scrawls throughout the margins and counting to find Paragraph 15.
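Here is one way the "exact coordinates, in a convenient list" idea might look. This is a sketch under my own assumptions: paragraphs are separated by blank lines, the editor quotes the text they are commenting on, and the `locate` helper and the sample notes are hypothetical.

```python
def locate(manuscript, snippet):
    """Return (paragraph, word) of the first occurrence of snippet, both 1-indexed.

    Paragraphs are assumed to be separated by blank lines. Returns None if
    the snippet is not found.
    """
    paragraphs = [p for p in manuscript.split("\n\n") if p.strip()]
    for para_no, para in enumerate(paragraphs, start=1):
        offset = para.find(snippet)
        if offset != -1:
            # Count the words that come before the snippet in this paragraph.
            word_no = len(para[:offset].split()) + 1
            return para_no, word_no
    return None

manuscript = (
    "It was a dark night.\n\n"
    "The rain fell hard on the the roof.\n\n"
    "Nobody came."
)
# Hypothetical editor notes: (quoted text, comment).
notes = [("the the", "Repeated word."), ("dark night", "Cliche; consider rewording.")]

report = []
for quote, comment in notes:
    position = locate(manuscript, quote)
    if position is not None:
        report.append((position, comment))
report.sort()  # reading order: by paragraph, then by word

for (para_no, word_no), comment in report:
    print(f"paragraph {para_no}, word {word_no}: {comment}")
```

The output is a list sorted in reading order, so nobody has to count to find Paragraph 15.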

Most people find the commenting feature of Word or Google Docs to be sufficient, but when you have a manuscript of 50,000-100,000 words, it gets old fast. Not to mention that Google Docs will slow down and crash, at least until Google finishes its Docs overhaul, which should address the known flaws.

I honestly feel like crying when text editors like Visual Studio Code or Sublime Text are completely free and extensible with millions of plugins, and Webpack for web app publishing is also free. “Literary” publishing tools like Scrivener and Adobe InDesign must be paid for. What gives? Is this another tax on the technologically unsavvy?

Piecemeal edits #

One common complaint of documentation is that when it comes to review, prose must be read in its entirety and judged holistically. This is different from code, where you want to segment it into small pieces so they remain comprehensible and testable. However, I have an issue with this criticism.

Technical documentation and user manuals are typically infamous for being giant bricks. Yet when we consider how social media has shortened our attention spans, and how people usually scan instead of read (I was probably the only dolt in school who read textbooks sequentially by paragraph), there is no reason to compile everything into a monolithic document.

Is there a way to take advantage of “micro” topics? The fundamental advantage of topic-based authoring is chunking a large project into smaller tasks. We create small blips of knowledge, send them out, and get the ✔ much faster.

UX writers argue that if your app is simple enough (typically a mobile app), you don’t need a separate manual at all. The help should be embedded inside the application’s tooltips, harking back to the tradition of hardcoded docs.

What makes documentation valuable? #

I think it’s safe to assume that knowledge is prized by humanity, and timeless (asynchronous) communication is the main reason to choose writing over speaking/video.

A point to consider is that words are cheap. This is not a bad thing. When it comes to file storage, text files take little space. While it means that our modern digital lives are inundated with content, it presents an interesting paradox. Words may be cheap, in the same manner that oxygen is cheap, but no less valuable.

If we take docs-as-code literally, then documentation can only be objective when it’s tested extensively through automated means, broken into small chunks for review, and flagged when it fails to perform. It seems that the process of refining the raw material, of a first draft to a polished piece, is where value comes from.

And yet, Urban Dictionary is great for all the charming ways we use language in its fullness and rawness, an intersection of memes, slang and open source contribution, presented in a sensible UI.

Future implications for docs-as-code:

How to create a glossary lookup that searches from smallest to largest scope: Local Dictionary -> Field Dictionary -> Industry Dictionary -> General Dictionary
How to take advantage of the open-source programming world to modernize creative writing tooling
Combating “the tax on the technologically unsavvy” (teaching writers to set up their own environment the way they want)
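For the first item on that list, Python's standard library already has something close: `collections.ChainMap` searches a series of dictionaries in order and returns the first match. A minimal sketch, with all four dictionaries and their entries invented for illustration:

```python
from collections import ChainMap

# Hypothetical dictionaries, ordered from smallest to largest scope.
local_dict = {"pipeline": "Our nightly docs build job."}
field_dict = {"pipeline": "A sequence of automated build and deploy steps."}
industry_dict = {"linter": "A program that flags stylistic errors in source text."}
general_dict = {
    "pipeline": "A long pipe for carrying oil or gas.",
    "brick": "A block of fired clay.",
}

# ChainMap searches its maps left to right and returns the first match.
glossary = ChainMap(local_dict, field_dict, industry_dict, general_dict)

def lookup(term):
    """Return the definition from the narrowest scope that defines the term."""
    return glossary.get(term.lower())

print(lookup("pipeline"))  # Our nightly docs build job.
print(lookup("linter"))    # A program that flags stylistic errors in source text.
print(lookup("brick"))     # A block of fired clay.
print(lookup("unknown"))   # None
```

The local definition wins whenever it exists, and the general dictionary only answers when nothing narrower does. Whether real glossaries are this tidy is another question.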