# Typesetting Markdown – Part 5: Interpolation

This part of the series describes how to reference interpolated strings inside Markdown documents.

## Introduction

Part 4 described creating a reusable build script template and introduced controlling a document’s page size, layout, and thematic elements. This part describes a way to define, organise, and embed document variables. For simplicity, a variable described in this document can also be thought of as a constant or key-value pair.

## Variables

Ancient Egyptians used hieroglyphic signs to represent numbers, such as those depicted in the following table:

| Unicode | Meaning | Value |
|---|---|---|
| 𓏺 | Wooden dowel, stroke | 1 |
| 𓎆 | Hobble for cattle | 10 |
| 𓍢 | Coil of rope | 100 |
| 𓆼 | Lotus plant | 1,000 |
| 𓂭 | Finger | 10,000 |
| 𓁨 | Ḥeḥ with arms supporting the sky | 1,000,000 |

Symbolic representation of numbers has its roots in Sumerian cuneiform, one of the earliest writing systems invented. Using symbols back then was reasonably straightforward:

1. Create (or borrow) a wedge-tipped reed stylus.
2. Make a wet clay tablet.
3. Use the stylus to write symbols in the clay.
4. Leave the clay in direct sunlight to harden.

Thousands of years later, symbolic representations of numbers, text strings, and other data types are commonplace in systems created by software developers; however, using variables—the lifeblood of programming languages—within documentation remains fairly arduous for the vast majority of people. Consider the following Microsoft Word document:

Perhaps the phone number is used in multiple places throughout the text. When the phone number changes, it’d be convenient to change it once and be sure that all occurrences of the number are also updated. To make and insert a document variable, the author must know the following labyrinthine incantation:

1. Click File.
2. Click Properties.
4. Click Custom tab.
5. Set Name to the variable name (e.g., PhoneNumber).
6. Set Value to the variable value.
7. Click OK.
8. Press Esc to resume document editing.
9. Click Insert.
10. Click Quick Parts.
11. Click Field.
12. Set Categories to: DocProperty.
13. Scroll to find PhoneNumber under Property.
14. Click OK to insert the variable.

The variable is inserted into the document, shown highlighted in the following screen capture:

When that number changes, anyone can update the variable—assuming they know the value was from a variable and they know how or care enough to reassign it. Practically, the deeper problem of inserting information from a single source of truth into documentation is not addressed. A Microsoft Word document is an unsuitable source of truth because (1) multiple applications cannot reuse its variables; (2) its variables cannot be assigned a category (i.e., they cannot easily be organised into namespaces); and (3) its document file format promotes vendor lock-in. Sourcing variables from Microsoft Word is akin to telling your relatives where to find clay tablets whenever they need to look up their ancestors’ names. With respect to editing efficiency, flexibility, and maintainability… that phone number might as well have been carved into clay.

Document variables would do well to meet the following criteria:

• Creation – Make variables using four steps, or fewer.
• Injection – Insert variables using three steps, or fewer.
• Open – Variable definition formats must not be proprietary.
• Unified – Variables can be retrieved from a single source of truth.
• Orderly – Variable names can be grouped into categories that give their values context.
• Interpolated – Let variables reference other variables, recursively.

The last four items are addressed hereinafter.

### Open

Free, open file formats for associating variable names with values abound:

• JSON – JavaScript Object Notation is well-known to web developers.
• TOML – Tom’s Obvious, Minimal Language is a simple configuration file format meant to be read easily.
• XML – Extensible Markup Language is a file format originally designed for large-scale electronic publishing.
• YAML – YAML Ain’t Markup Language is designed to be a human-readable file format for describing structured data.

Despite their intentions, human-readable data formats are developer-readable at best. Non-developers balk at learning hierarchical file format syntaxes. Providing a simple user interface would make learning the underlying file format largely irrelevant. Even though some people dislike editing and navigating hierarchies, having the ability to categorise data through a simple user interface has practical value for developers and non-developers alike.

A common visualisation is a tree interface, such as:

Miller Columns (links to an implementation that I developed) are another way to visualise hierarchical data. A mock-up with filtering resembles:

Having limited screen real estate, iPods use a drill-down menu hierarchy. The effect achieved is similar to the following:

The D3 data visualisation library provides yet another way to view deeply nested hierarchies:

No matter how the information is presented, a way to associate a document with the variables referenced within it is essential.

YAML is the only format pandoc supports directly, at time of writing. A TOML integration may be implemented in the future. Either way, since there are many tools—of varying accuracy—that can convert file formats, using YAML does not force the documents to depend on any particular data input format.

### Unified

Ideally, document data is requested from a central location, such as a data warehouse. The data warehouse can be a façade, exposing a single source of truth for separate information sources necessary to operate a business. Upon retrieval, the data is transformed into the required format (e.g., YAML), so that the document can reference the values.

For most writing needs, a flat file is sufficient.

### Orderly

As soon as a document of substantial length is drafted, the need to organise variables becomes apparent. Initially, for example, direct, fax, tollfree, support, and afterhours may suffice to capture various phone numbers. As a company expands into multiple locations, each of those variable names will be in conflict across the different locations. Similarly, novels need ways to assign values to character sheets for a variety of characters. To avoid collisions, file formats must support spaces for variable names. Aptly, these are known as namespaces, and can help categorise information.

For example, a source code repository and a web server both have names and ports, which could be defined as per the following YAML file:

    network:
      domain:
        name: librerie.com
        ip: 192.168.1.1
      servers:
        repository:
          name: svn.librerie.com
          port: 3690
        web:
          name: www.librerie.com
          port: 80

Even though name appears multiple times, the fully qualified variable names can be referenced without conflict. Clearly, network.domain.name, network.servers.repository.name, and network.servers.web.name have different values because they are in different namespaces, even though all end with name.
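As a rough illustration of how nested keys become fully qualified names, the following sketch flattens the network definitions into dotted names. The flatten function is a hypothetical helper, not part of pandoc or the ci script. It assumes two-space indentation and plain key: value mappings, so it is nowhere near a general YAML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: flatten two-space-indented "key: value" YAML
# into fully qualified, dotted variable names.
flatten() {
  awk '
    /^ *$/ { next }                          # skip blank lines
    {
      text = $0
      sub(/^ +/, "", text)                   # strip leading spaces
      depth = (length($0) - length(text)) / 2
      colon = index(text, ":")
      key = substr(text, 1, colon - 1)
      value = substr(text, colon + 1)
      sub(/^ +/, "", value)
      path[depth] = key                      # remember the key at this level
      if (value != "") {                     # leaf: print the qualified name
        name = path[0]
        for (i = 1; i <= depth; i++) name = name "." path[i]
        print name "=" value
      }
    }
  '
}

flatten <<'EOF'
network:
  domain:
    name: librerie.com
    ip: 192.168.1.1
  servers:
    repository:
      name: svn.librerie.com
      port: 3690
    web:
      name: www.librerie.com
      port: 80
EOF
# Prints:
#   network.domain.name=librerie.com
#   network.domain.ip=192.168.1.1
#   network.servers.repository.name=svn.librerie.com
#   network.servers.repository.port=3690
#   network.servers.web.name=www.librerie.com
#   network.servers.web.port=80
```

Each name key ends up under a distinct qualified name, which is why the three values never collide.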

There is a little redundancy in the YAML file that will be addressed using interpolated strings. Hard-coding text that will probably change later—like transitioning from Subversion to Git—inevitably results in inaccurate documentation. (Arguably, repository.librerie.com may have been a more future-friendly host name, but that misses the point.)

### Interpolated

String interpolation replaces placeholders with corresponding values. For example, consider the following metadata block, enclosed by three hyphens (---), of YAML variables atop a Markdown file:

    ---
    protagonist:
      name:
        given: &given May
        surname: &surname Blood
        personal: *given *surname
    ---

    Hello $protagonist.name.personal$.

It would be convenient if the value for protagonist.name.personal became May Blood in the output document. While anchors (e.g., &given) and references (e.g., *given) are part of the YAML specification, for the purposes of simple variables inside of documents, the syntax has the following issues:

• Redundant – Variable names have uniquely defined namespaces, which makes the additional reference redundant. (YAML variables needn’t be uniquely named, so the notation can be useful.)
• Unsupported – As of pandoc version 2.7.2, anchors and references cannot be used.
• Pointers – C-style pointer syntax is abstruse for many people.
• Recursion – Even if pandoc supported the syntax, the implementation probably would not allow references within references.

| Delimiter | Used by |
|---|---|
| `${...}` | bash, Apache Camel, and others. |
| `#{...}` | Aaron Parecki |
| `%{...}` | Puppet |
| `[%...]` | MultiMarkdown |
| `{{...}}` | Assemble, Handlebars, and others. |
| `((...))` | BOSH |

Most delimiter tokens are special characters in regular expressions, as such they must be escaped, which complicates the expression.

## Integration

This section describes how to interpolate strings in Markdown.

### Requirements

Ensure the following files exist inside $HOME/dev/writing/book:

• ci script from Part 4.
• definitions.yaml (above)
• 01.md (above)

The requirements are met.

### Update Script

Edit the ci script then make the changes that follow.

Update the DEPENDENCIES list to include Java:

    "java,https://jdk.java.net"

Update the ARGUMENTS list to include YAML:

    "-y,--yaml,YAML definitions file name"

Update arguments() to parse the YAML option:

    -y|--yaml)
      ARG_FILE_YAML="$2"
      consume=2
    ;;

Provide a default file name for YAML definitions:

    ARG_FILE_YAML="definitions.yaml"

Change the filter function to include monitoring of YAML files:

    filter() {
      [[ "${1,,}" =~ \.(.*md|tex|y.?ml)$ ]]
      return $?
    }
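A quick way to see which file names the filter accepts is to call it with a few samples. The sketch below repeats the filter function so that it runs on its own; the check helper and the sample file names are hypothetical and not part of the ci script. It needs bash 4 or newer for the lower-case expansion.

```shell
#!/usr/bin/env bash
# The filter function, repeated so the sketch is self-contained.
filter() {
  [[ "${1,,}" =~ \.(.*md|tex|y.?ml)$ ]]
  return $?
}

# Hypothetical helper: report whether a file name would be monitored.
check() {
  if filter "$1"; then
    echo "watch  $1"
  else
    echo "ignore $1"
  fi
}

for name in 01.md analysis.Rmd main.tex definitions.yaml config.yml build.cmd notes.txt; do
  check "${name}"
done
# Prints:
#   watch  01.md
#   watch  analysis.Rmd
#   watch  main.tex
#   watch  definitions.yaml
#   watch  config.yml
#   watch  build.cmd
#   ignore notes.txt
```

The build.cmd result shows the over-matching: any extension that ends in md is accepted.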

The following table explains the filter’s terse, conditional syntax:

| Token | Meaning |
|---|---|
| `[[` | Begin evaluation of a Boolean expression |
| `"${1,,}"` | Convert the $1 filename parameter to lower case |
| `=~` | Compare filename against a regular expression |
| `\.` | Starting from a period in the filename … |
| `(` | Find any pattern up until the closing parenthesis … |
| `.*md` | … that matches a string with md, such as Rmd |
| `\|tex` | … or matches a string with tex |
| `\|y.?ml` | … or matches a string with y and ml, such as yaml |
| `)` | Stop scanning for patterns to match |
| `$` | Ensure the match happens at the end of the string |
| `]]` | End of Boolean expression to evaluate |

As before, this will match more than what’s expected, including .cmd.

Replace build_document() with the following snippet:

    build_document() {
      local -r DIR_BUILD="artefacts"

      mkdir -p "${DIR_BUILD}"

      local -r FILE_MAIN_PREFIX="main"
      local -r FILE_BODY_PREFIX="${DIR_BUILD}/body"
      local -r FILE_CAT="${FILE_BODY_PREFIX}.md"
      local -r FILE_TEX="${FILE_BODY_PREFIX}.tex"
      local -r FILE_PDF="${FILE_BODY_PREFIX}.pdf"
      local -r FILE_DST="$(basename "${ARG_FILE_OUTPUT}" .pdf).pdf"

      $log "Preprocess YAML into ${FILE_CAT}"
      java -jar $HOME/bin/yamlp.jar < "${ARG_FILE_YAML}" > ${FILE_CAT}
      printf "%s\n" "---" >> "${FILE_CAT}"

      $log "Concatenate into ${FILE_CAT}"
      cat ./??.md >> "${FILE_CAT}"

      $log "Generate ${FILE_TEX}"
      pandoc "${FILE_CAT}" --template "${FILE_CAT}" 2>/dev/null | \
        pandoc --to context > "${FILE_TEX}"

      $log "Generate ${FILE_PDF}"
      context --nonstopmode --batchmode --purgeall \
        --path=artefacts,styles \
        "${FILE_MAIN_PREFIX}.tex" > /dev/null 2>&1

      $log "Rename ${FILE_MAIN_PREFIX}.pdf to ${FILE_DST}"
      mv "${FILE_MAIN_PREFIX}.pdf" "${FILE_DST}"
    }

The following lines run the preprocessor:

    $log "Preprocess YAML into ${FILE_CAT}"
    java -jar $HOME/bin/yamlp.jar < "${ARG_FILE_YAML}" > ${FILE_CAT}
    printf "%s\n" "---" >> "${FILE_CAT}"

The first line informs users what is happening. The second line runs yamlp using Java against the definitions.yaml file. The third line places the closing metablock separator ahead of the Markdown content; yamlp writes the opening separator, automatically.

Pandoc is instructed to interpret the newly interpolated template:

    $log "Generate ${FILE_TEX}"
    pandoc "${FILE_CAT}" --template "${FILE_CAT}" 2>/dev/null | \
      pandoc --to context > "${FILE_TEX}"

The changes are ready to run.

## Run Continuous Integration Script

Restart the continuous integration script as follows:

1. Stop the ci script if it is running (e.g., using Ctrl+c).
2. Run the ci script again to ensure the changes are loaded.

## Update Style

This section describes a few superficial changes to the document.

Change main.tex to include an override for table of contents styling:

    \input toc

Add a file styles/toc.tex with the following contents, to eliminate the table of contents altogether:

    \def\completecontent{}

Change styles/headings.tex to capitalise the chapter title by updating the setups for section to use the uppercase WORD macro as follows:

    \setuphead[section][
      style=\ss\tfd\WORD,
      textcolor=ColourPrimary,
      numbercolor=ColourPrimary,
    ]

Revise the document colours by editing styles/colours.tex:

    \definecolor[ColourPrimary][h=545454]
    % ...
    \definecolor[ColourPrimaryDk][h=333333]

Lastly, clear the contents from both layouts.tex and paper.tex to reset the paper size and page layout to their defaults. Make sure the files exist but are zero bytes in size.

## Preview

Open output.pdf to see the output, which resembles:

Notice that $vacation.country$ resolves from $countries.primary$ to "USA" using yamlp. The YAML metablock in artefacts/body.md follows:

    ---
    hero:
      origin: "Corvallis, Oregon, USA"
      city: "Corvallis"
      region: "Oregon"
      country: "USA"
    vacation:
      city: "Redwood National Park"
      region: "California"
      country: "USA"
    countries:
      primary: "USA"
    ---

All strings are interpolated correctly.
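To illustrate what recursive interpolation involves, here is a minimal bash sketch. It is not how yamlp is implemented. The resolve function and the flattened associative array are assumptions, as are all of the unresolved definitions other than vacation.country referencing countries.primary. It needs bash 4 or newer.

```shell
#!/usr/bin/env bash
# Hypothetical, unresolved definitions keyed by fully qualified name.
declare -A VARS=(
  ["countries.primary"]='USA'
  ["vacation.country"]='$countries.primary$'
  ["hero.city"]='Corvallis'
  ["hero.region"]='Oregon'
  ["hero.country"]='$countries.primary$'
  ["hero.origin"]='$hero.city$, $hero.region$, $hero.country$'
)

# Replace every $name$ reference in the given text. Referenced values
# are resolved first, so references may contain further references.
resolve() {
  local text="$1"
  local -i depth="${2:-0}"
  local pattern='\$([A-Za-z0-9_.]+)\$'
  local name token value

  # Give up on circular definitions instead of recursing forever.
  if (( depth > 16 )); then
    echo "Reference nesting too deep" >&2
    return 1
  fi

  while [[ "${text}" =~ ${pattern} ]]; do
    token="${BASH_REMATCH[0]}"
    name="${BASH_REMATCH[1]}"

    if [[ -z "${VARS[${name}]+defined}" ]]; then
      echo "Undefined variable: ${name}" >&2
      return 1
    fi

    value="$(resolve "${VARS[${name}]}" $(( depth + 1 )))" || return 1
    text=${text//"${token}"/"${value}"}
  done

  printf '%s\n' "${text}"
}

resolve 'Born in $hero.origin$.'
# Prints: Born in Corvallis, Oregon, USA.
```

The depth limit is a crude guard against definitions that refer to each other in a loop; a fuller implementation would more likely track the names already visited.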

Download book.zip to get the updated continuous integration script, book styles, YAML definition file, and Markdown example; all files are distributed under the MIT license.

## Summary

This part explained recursive string interpolation, lamented the difficulty of using variables in documentation, provided example user interfaces for editing hierarchical data, and described how to embed interpolated strings in Markdown documents. Incidentally, by placing the variable definitions in a separate file, creating new variables has been reduced to fewer than four steps. Using variables is still tedious, for now. Part 6 describes how to use R to perform calculations that reuse the same YAML variable definitions.