Typesetting Markdown – Part 6: Computation

This part of the series describes how to produce dynamic documents using R Markdown.


Part 5 involved recursive string interpolation for organising variables in Markdown documents using yamlp. This part describes how to load interpolated strings from YAML into R, codify some English rules, integrate results from querying web pages, generate graphs, import tabular data from a CSV file, and format tables using ConTeXt.


In addition to the software packages from Part 2 and Part 5, this part requires installing the following software:

R Packages

After installing R, start R as an administrator (e.g., sudo R), then install the required packages system-wide by running the following command:


The packages are installed.

Markdown History

A brief timeline of events:

Almost certainly the integration described herein fits Gruber’s quote. At its core, Markdown was proposed as a syntax for writing plain text documents that removes many pains when writing valid HTML. Consequently, the specification accepts embedded HTML elements within Markdown text. Consider the following:

<b>Warning!</b> Giraffes face threat of extinction.

Given Markdown’s history, where the syntax is headed, and a desire to separate presentation from content, it is reasonable to suggest that embedding HTML within Markdown is best avoided—despite it being commonplace. The text can be written as follows, which eliminates the bold element directive <b> in favour of a style suggestion:

**Warning!** Giraffes face threat of extinction.

How the warning text style appears in its final form (e.g., bold, gold, or to withhold) is a decision for the presentation layer. Writing HTML elements in Markdown documents intermingles content and presentation, which runs counter to this endeavour.

Knitting R Markdown

To knit an R Markdown file means to compute the result of R code written in a Markdown file then replace the R code with that result. R Markdown file name extensions are typically .Rmd. Inline R code begins with backtick r (`r) and ends with a single backtick. For example, an R fragment that injects the answer to 1 plus 1 into a document is written as:

`r 1 + 1`

After the document is knit, the R code is replaced with 2. As another example, including the name of the month that the document was built would look like:

`r month.name[ as.integer( format( Sys.Date(), format="%m" ) ) ]`

Before tackling R code in depth, let’s create an infrastructure to automate weaving the results from R code into documents. Broadly, this entails:

Create Knit Script

Intarsia has the following definition:

a method of knitting with multiple colors, in which a separate length or ball of yarn is used for each area of color (as opposed to different yarns being carried at the back of the work).

Create a new shell script in $HOME/dev/writing/book called intarsia, which will invoke knitr to weave the results from all R statements it encounters in an R Markdown document into a plain Markdown document.

Start the script with the following code:

#!/usr/bin/env bash

source build-template



  "-b,--bootstrap,R source file to load (${ARG_FILE_BOOTSTRAP})"
  "-l,--logs,Log path (${ARG_DIR_LOGS})"
  "-f,--filename,R Markdown file to knit (${ARG_FILE_MARKDOWN})"
  "-a,--artefacts,Generated files path (${ARG_DIR_ARTEFACTS})"
  "-y,--yaml,YAML variables file (${ARG_FILE_YAML})"

The source line reuses the build-template as described previously. The DEPENDENCIES and ARGUMENTS array variables list the script requirements and command-line arguments, respectively.

Setting ARG_DIR_ARTEFACTS provides a default location to write the .md files that are generated by knitting the .Rmd files. The artefacts directory must not be scanned by the continuous integration script, lest an infinite loop occur. Setting ARG_FILE_YAML provides the file name for the processed variables generated using yamlp. Note that the interpolated YAML file must also be saved in a directory outside the watchful eye of the ci script.

Create an execute function that invokes R to perform the knitting:

execute() {
  mkdir -p "${ARG_DIR_LOGS}"

  local -r FILE_LOG="${ARG_DIR_LOGS}/knitr.log"

  $log "Run ${ARG_FILE_BOOTSTRAP} using ${ARG_FILE_YAML}"
  Rscript "${ARG_FILE_BOOTSTRAP}" \
    "${ARG_FILE_YAML}" \
    "${ARG_FILE_MARKDOWN}" \
    "${ARG_DIR_ARTEFACTS}" \
    > "${FILE_LOG}" 2>&1 && return 1

  $log "Rscript terminated unexpectedly with exit level $?"
  error "Knitting ${ARG_FILE_MARKDOWN} failed."
  cat "${FILE_LOG}"
  return 0
}

These lines run R and are worth exploring:

  Rscript "${ARG_FILE_BOOTSTRAP}" \
    # ...
    > "${FILE_LOG}" 2>&1 && return 1

First, Rscript is a utility—packaged with R—that allows command-line arguments to be passed into an R program. Next, ${ARG_FILE_BOOTSTRAP} is the R source file to run, which will call the knit function to invoke knitr. The last line redirects all messages from running R to a log file; if the execution of Rscript was successful, the value 1 is returned to the build-template script, indicating execute() ran successfully.

If Rscript fails for any reason, its exit level will be non-zero. In that case, an error message is displayed and the contents of the log file are written to the console. The log file contents will help when fixing errors in the R code.
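
The success and failure paths can be exercised without R. The following is a minimal sketch, not the article's script: run_logged is a hypothetical helper that accepts a log path followed by an arbitrary command, and it keeps the convention described above of returning 1 on success.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of intarsia): run a command, capture all of
# its output in a log file, return 1 on success and 0 on failure.
run_logged() {
  local -r file_log="$1"
  shift

  # Standard output and standard error both go to the log. The "return 1"
  # only fires when the command exits with status 0.
  "$@" > "${file_log}" 2>&1 && return 1

  # Reached only on failure; $? still holds the command's exit status.
  echo "Command terminated unexpectedly with exit level $?"
  cat "${file_log}"
  return 0
}
```

Running run_logged with a failing command prints the exit level followed by whatever the command wrote to the log, which mirrors what execute does for Rscript.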

Next, create an argument function to parse the corresponding command-line arguments that are listed in the ARGUMENTS array:

argument() {
  local consume=2

  case "$1" in

  return ${consume}

Notice how consume is set to 2 instead of 1. This is a minor simplification because all the arguments take two parameters. If any unrecognised argument is passed (matched by *)), a single command-line argument is consumed instead.
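
To see how the consume value drives parsing, here is a self-contained sketch. The option letters come from the ARGUMENTS list above; the variable assignments and the parse loop are assumptions that stand in for whatever build-template actually does.

```shell
#!/usr/bin/env bash

# Sketch of an argument handler; the assignments are assumptions.
argument() {
  local consume=2

  case "$1" in
    -f|--filename) ARG_FILE_MARKDOWN="$2" ;;
    -a|--artefacts) ARG_DIR_ARTEFACTS="$2" ;;
    -y|--yaml) ARG_FILE_YAML="$2" ;;
    # Unrecognised arguments consume a single position.
    *) consume=1 ;;
  esac

  return ${consume}
}

# Hypothetical stand-in for the loop that build-template runs.
parse() {
  local consumed

  while [ "$#" -gt 0 ]; do
    consumed=0
    # The return value is the number of positions to shift.
    argument "$@" || consumed=$?

    # Never shift past the end when an option is missing its value.
    if [ "${consumed}" -gt "$#" ]; then consumed=$#; fi

    shift "${consumed}"
  done
}
```

For example, parse -f 02.Rmd --unknown -a out assigns both the file name and the artefacts path, skipping the unrecognised option one position at a time.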

Finally, run main:

main "$@"

Save the script then make it executable:

chmod +x intarsia

The knitting script is complete.

Create Bootstrap Script

Create a new file in $HOME/dev/writing/book named bootstrap.R. When run by Rscript, the bootstrap script imports various libraries, parses the command-line arguments, reads the YAML file, knits the R Markdown into plain Markdown, then terminates.

Begin with a shebang line that invokes Rscript in quiet mode, which suppresses some of its chatter. Using --vanilla instructs Rscript to run without saving or restoring its environment settings.

#!/usr/bin/env Rscript --quiet --vanilla

Packages of extra functionality in R are loaded by calling the library function. The tools library comes pre-packaged with R; the pander and ggplot2 packages aren’t needed just yet. Load the libraries as follows:

library( 'knitr' )
library( 'yaml' )
library( 'rjson' )
library( 'tools' ) 

Parse the command-line arguments by calling the commandArgs function. R provides a number of packages to make user-friendly arguments, but because build-template exists, duplicating such functionality seems wasteful. Instead, use the following code:

args <- commandArgs( trailingOnly = TRUE )

Here is where the bootstrap.R script arguments must align with those passed in from the intarsia script. Recall that intarsia passes, in order: a YAML file name, an R Markdown file name, and an artefacts directory. If these parameters are not provided, Rscript fails and an error message is displayed. Insert the following code, which could be made more robust by checking that the arguments exist before using them:

file.yaml <- paste( args[1] )
file.rmd <- paste( args[2] )
dir.artefacts <- paste( args[3] )

One feature that knitr lacks is the ability to specify an output directory. That is, knit( 'filename.Rmd', 'artefacts/' ) would not write the Markdown file filename.md into the artefacts directory. So, inject the following code to work around the issue:

file.md <- paste0(
  paste( sep='/',
    dir.artefacts,
    file_path_sans_ext( file.rmd )
  ), '.md' )

This will cause the knit function to read from 01.Rmd, for example, and write to artefacts/01.md. Before calling knit, load the interpolated strings from the given YAML file (artefacts/interpolated.yaml by default) into a global object named v, for variables, as follows:

v <- yaml.load_file( file.yaml )

Invoke knitr to execute all the R expressions in the Markdown document:

knit( file.rmd, file.md )

Both rmarkdown and knitr can weave a document’s R statements; however, rmarkdown creates a number of temporary .md files in the current working directory. The continuous integration script will trigger a build if any .md files are modified, so rmarkdown will inadvertently cause the ci script to enter an infinite loop. Workarounds are possible, but would entail more work than using knitr.

The bootstrap.R script is ready.

Create Sample Document

With all the preliminary work finished, create a file named 01.Rmd in the usual book directory. Make it something simple to start, such as:

`r 1 + 1`

Now let’s ensure everything works as expected.

Test Document Knitting

Execute the following commands to knit the sample document:

cd $HOME/dev/writing/book
java -jar $HOME/bin/yamlp.jar < d*.yaml > a*/interpolated.yaml
./intarsia -f 01.Rmd
cat artefacts/01.md

Using d* and a* performs file name globbing to match definitions and artefacts, respectively, without having to type in the full word. In many cases, pressing the Tab key will autocomplete file names.

The console shows the contents of the knit Markdown file, artefacts/01.md, which is the sum of 1 plus 1:

2

The intarsia script is complete.

Update Continuous Integration Script

Having tested intarsia, change the ci script so that updating R scripts will also trigger re-building the document. Along the way, introduce the new knitting functionality. Edit ci then change the arguments list code to be:


  "-a,--artefacts,Artefact path (default: ${ARG_DIR_ARTEFACTS})"
  "-f,--filename,Output PDF file (default: ${ARG_FILE_OUTPUT})"
  "-y,--yaml,YAML definitions file (default: ${ARG_FILE_YAML})"
  "-k,--knit,Enable knitting R Markdown into Markdown"

Note that the ARG_ globals are moved above the ARGUMENTS declaration so that they can be re-used in the help message.

Change the filter() regex to include R files:

  [[ "${1,,}" =~ \.(.*md|tex|y.?ml|r)$ ]]
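
Because ${1,,} converts the file name to lower case before matching, the alternatives inside the regex need to be lower case as well. The following sketch wraps the test in a function so it can be tried from a terminal; the function body beyond the single test line is an assumption about how filter() is laid out in ci.

```shell
#!/usr/bin/env bash

# Succeeds when the file name ends with an extension that should trigger
# a rebuild. The name is lower-cased first, so "R" and "Rmd" both match.
filter() {
  [[ "${1,,}" =~ \.(.*md|tex|y.?ml|r)$ ]]
}

# Report which of a few sample names would trigger a build.
for name in 01.Rmd bootstrap.R definitions.yaml notes.txt; do
  if filter "${name}"; then
    echo "${name}: rebuild"
  else
    echo "${name}: ignore"
  fi
done
```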

Revise the preprocessing and file concatenation fragments while introducing code to perform knitting:

  local -r FILE_VAR="${DIR_BUILD}/interpolated.yaml"

  $log "Preprocess YAML into ${FILE_VAR}"
  java -jar $HOME/bin/yamlp.jar < "${ARG_FILE_YAML}" > "${FILE_VAR}"

  $knit

  $log "Create ${FILE_CAT} from ${FILE_VAR}"
  cp "${FILE_VAR}" "${FILE_CAT}"
  printf "%s\n" "---" >> "${FILE_CAT}"
  cat "${DIR_BUILD_MARKDOWN}"/??.md >> "${FILE_CAT}"

Declaring the FILE_VAR variable helps to reuse most of the original code for using YAML variables with pandoc while making the same variables available to load into R. Executing yamlp now writes the interpolated strings to FILE_VAR, which is artefacts/interpolated.yaml by default. The $knit call will be delegated to utile_knit (up next) depending on whether ci was invoked with the -k command-line argument. The final four lines concatenate all the Markdown files into a single document ready for parsing by pandoc.
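
Delegating through a variable is a compact way to make a build step optional. The following is a minimal sketch of the idea, not the ci script itself: enable_knit and build are hypothetical names, and using the shell's built-in no-op (:) as the default is an assumption.

```shell
#!/usr/bin/env bash

# Default: knitting is disabled, so $knit expands to a command that
# does nothing (assumption: the built-in ":" no-op).
knit=":"

utile_knit() {
  echo "Knit R documents"
}

# Hypothetical option handler: -k swaps the no-op for the real function.
enable_knit() {
  case "$1" in
    -k|--knit) knit=utile_knit ;;
  esac
}

# Hypothetical build step that always runs, knitting only when enabled.
build() {
  # Expands to either ":" or "utile_knit".
  $knit
  echo "Concatenate Markdown"
}
```

The build function never needs to test a flag; the decision is made once, when the command-line arguments are parsed.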

Insert a new function after execute() to perform the actual knitting:

utile_knit() {
  $log "Knit R documents"

  for filename in ./??.Rmd; do
    ./intarsia -f "${filename}" -a "${DIR_BUILD_MARKDOWN}"
  done
}

The utile_knit function finds all .Rmd files in the current directory then passes them through intarsia. Use zero-padded chapter numbers (e.g., 01.Rmd, 02.Rmd, etc.) otherwise the ./??.Rmd expression may not find all the files, much less concatenate them in the desired order.
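
The effect of the two-character glob is easy to check in a scratch directory. In this sketch, list_chapters is a hypothetical helper that prints the names matched by ??.Rmd within a given directory.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the chapter files that ??.Rmd matches in a
# directory, in the order the shell expands them.
list_chapters() {
  local filename

  for filename in "$1"/??.Rmd; do
    # When nothing matches, the pattern is left unexpanded; skip it.
    [ -e "${filename}" ] || continue
    basename "${filename}"
  done
}
```

Given a directory containing 01.Rmd, 02.Rmd, 10.Rmd, and 3.Rmd, the helper lists the first three in order and omits 3.Rmd, because each ? matches exactly one character.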

Update the argument function by consuming the new command-line arguments of -a and -k:

argument() {
  local consume=1

  case "$1" in

  return ${consume}

Notice that -k is a solitary argument, like -h, and so consumes one option.

At the end of the file, before calling main, insert the following lines to assign an empty knit function and set the default directory containing plain (or post-knit) Markdown files:


# Gets set to ARG_DIR_ARTEFACTS when using knitr.

The continuous integration script knits the output from R into documents. Huzzah!

Living Documents

A living document is a document that is continually updated or edited. By coupling R code with Markdown, another type of living document is created: a dynamic document.

The remaining sections leverage R to achieve a wide variety of functionality. To begin, complete the following steps:

  1. Stop the ci script, if it is running.
  2. Run ./ci -d -k to enable knitting mode.
  3. Open output.pdf with Evince.

The steps are complete.

Use Interpolated Strings

Technically, there are two ways to use variables now: using R expressions and pandoc variable substitution. What’s more, both ways can be combined within a document. Rewrite 01.Rmd to the following:

# Velocitas Formidabilis

"From `r v$hero$city` to `r v$vacation$city`?" he asked.

He repeated, "From $hero.city$ to $vacation.city$?"

Where `r 1 + 1` output 2, here `r v$hero$city` dereferences the v object to inject the variable value from the interpolated YAML file. Normally descriptive variables are preferred, but since v is used so often, it makes sense to keep it succinct. Since the way pandoc is invoked hasn’t changed, using its variable reference syntax (e.g., $hero.city$) is also possible. Upon saving the file, microtypography issues notwithstanding, the generated document resembles:

A Veritable Variety of Various Variable Values

Avoid mixing variable types in a single document, though, because consistency makes programmatic changes easier and less time consuming. With computers, you pay more for inconsistency.

Codify English Rules

The Chicago Manual of Style suggests that authors, “Spell out whole numbers up to (and including) one hundred.” The American Mathematical Society suggests spelling out numbers up to (but not including) ten.

Try applying rules for cardinal numbers as follows:

  1. Copy the following files into the book directory:
  2. Edit bootstrap.R.
  3. After the last library statement, insert source( 'cardinal.R' ).
  4. Save the file.
  5. Edit 01.Rmd.
  6. Change the contents to:
`r cardinal( v$vacation$travel$speed$value - 505 )`

After saving the file, the PDF file output shows ninety-eight, which adheres to the Chicago Manual of Style guidelines. Multiply the result by two, instead of subtracting 505, and the document contents change to 1,206.

Using the cardinal function throughout the documentation consistently would make switching styles possible by modifying the cardinal function. This modification is left as an exercise for the reader.
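
The rule itself is independent of R. As an illustration only, here is a shell sketch of the Chicago rule for non-negative whole numbers; it is not the code in cardinal.R, and the names ONES, TENS, and group_digits are hypothetical.

```shell
#!/usr/bin/env bash

ONES=(zero one two three four five six seven eight nine ten eleven twelve
  thirteen fourteen fifteen sixteen seventeen eighteen nineteen)
TENS=("" "" twenty thirty forty fifty sixty seventy eighty ninety)

# Insert a comma between each group of three digits.
group_digits() {
  local digits="$1" result=""

  while [ "${#digits}" -gt 3 ]; do
    result=",${digits: -3}${result}"
    digits="${digits:0:$(( ${#digits} - 3 ))}"
  done

  echo "${digits}${result}"
}

# Spell out whole numbers up to and including one hundred; write larger
# numbers as grouped digits. Assumes a non-negative integer argument.
cardinal() {
  local -r n="$1"

  if [ "${n}" -gt 100 ]; then
    group_digits "${n}"
  elif [ "${n}" -eq 100 ]; then
    echo "one hundred"
  elif [ "${n}" -lt 20 ]; then
    echo "${ONES[$(( n ))]}"
  else
    local word="${TENS[$(( n / 10 ))]}"

    # Compound numbers are hyphenated, such as ninety-eight.
    if [ $(( n % 10 )) -ne 0 ]; then
      word="${word}-${ONES[$(( n % 10 ))]}"
    fi

    echo "${word}"
  fi
}
```

Calling cardinal 98 prints ninety-eight and cardinal 1206 prints 1,206, matching the two results described above.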

Integrate Web Queries

Imagine a scenario where the travel time between two locations is requested. This is more involved than codifying English rules because it requires:

The following steps provide a possible solution:

  1. Register for a Mapbox access token.
  2. Export the access token into a file named token.txt in the book directory; the token will be a long mix of alphanumeric characters.
  3. Save excursion.R into the book directory.
  4. Modify bootstrap.R to call source( 'excursion.R' ).
  5. Save bootstrap.R.
  6. Change 01.Rmd as follows:
# Velocitas Formidabilis

"From `r v$hero$city` to `r v$vacation$city`?" he asked.

"It's only 
`r cardinal( round.up( travel.time(
  v$hero$latitude$value, v$hero$longitude$value,
  v$vacation$latitude$value, v$vacation$longitude$value,
  v$vacation$travel$speed$value ) ) )`
minutes away by 'Loop," she said.

Save the file to see the following output:

Travel Duration

The word fifty is derived by:

  1. making a request to a web service to determine the travel distance between two geographic coordinates;
  2. converting the distance to time based on a velocity;
  3. rounding the time to the nearest 5-minute interval;
  4. rewriting the calculation as a cardinal number based on a style guide; and
  5. converting the prose into a typeset document.

That the entire chain of events happens within a couple of seconds is pretty powerful.

Take a look at excursion.R to see how it works. The distance function has a few key lines worth reviewing. First, the access token is read from the token.txt file:

  token <- readLines( "token.txt" )

A website URL is then generated according to specification, which includes the access token that was read from the file:

  api <- paste0(
    lon1, ',',
    lat1, ';',
    lon2, ',',
    paste0( '?access_token=', token )

Next, a JSON document is retrieved from the map service as follows:

    doc <- fromJSON( file=api )

If the request completes successfully, the distance between the two coordinates is extracted from the resulting JSON document:

    meters = doc[1]$routes[[1]]$distance

In the event of an error, the haversine function is called to estimate the distance. Note that the map service returns the travel distance by road, whereas the haversine function returns the straight-line (great-circle) distance, so the two yield different results.

The distance from either scenario is returned to the travel.time function, which converts distance to minutes, given a velocity. Then the minutes are rounded to the nearest 5-minute interval by calling round.up. From there, the call to cardinal converts the time period to English according to the Chicago Manual of Style rules.

Changing any of the pertinent variables in definitions.yaml will trigger a new document build, which then re-runs all the R code. Since YAML is machine-writable, exposing a mechanism to generate its contents would allow end-users to control the output from R. The YAML data source could be a database, a web application, or even a CSV file downloaded from a secure FTP site then transformed into YAML.

Generate Graphs

Graphs can communicate ideas and trends at a glance. A simple way to embed a graph is to generate it, save it as a file, then reference the file’s path in the Markdown document. That approach is laborious and error prone. Embedding graphs that are regenerated when the source data changes would be ideal. This section describes a way to embed such graphs into R Markdown documents automatically.

Update Bootstrap

Complete the following steps:

  1. Edit bootstrap.R.
  2. After the last library statement, write:
library( 'pander' )
library( 'ggplot2' )
  3. Save the file.

The new libraries are available for use.

Update Continuous Integration Script

An unresolved problem with the ci script is that stale files are not removed between builds. This is addressed by using the rm command to remove the stale files (i.e., artefacts/*.md) and directories (i.e., plots).

Add another global command-line argument variable to the ci script:


Insert its help into ARGUMENTS:

  "-p,--plots,Plots path (default: ${ARG_DIR_PLOTS})"

Change the start of build_document() to remove outdated files:

build_document() {
  mkdir -p "${DIR_BUILD}"

  $log "Remove stale build and plot artefacts"
  rm -f "${DIR_BUILD}"/*md
  rm -rf "${ARG_DIR_PLOTS}"

Update argument() to accept a new command-line argument:


Stop then restart the ci script to load the new changes:

./ci -d -k

The next time the document is built, any outdated files will be removed from the system, including: Markdown files, autogenerated graphs, and graphs converted to PDF files by ConTeXt.

Create R Markdown File

Rewrite 01.Rmd in the book directory using the following content:

## Example Graph

![Random Plot](`r evals( '
      runif( 20 ), runif( 20 ),
      xlab="X", ylab="Y"
  graph.output = "svg"

The output resembles the following figure:

R Markdown Graph

There’s a lot going on, so let’s look at what’s happening in depth. The following plain Markdown inserts a graphic image named filename.svg from the plots directory into the document:

![Random Plot](plots/filename.svg)

When pander evaluates an R expression that generates a graph (plot), a file containing the graph is created in the plots directory with a unique name (e.g., 22f335c301c5.svg). The goal is to replace plots/filename.svg with the path and file name produced by pander.

Inside the parentheses is an R expression. Start a new R session then execute the following code:


A plot appears in a window by itself, such as:

Random R Plot Window

The evals function redirects drawing the plot from a window to an R object. Every object in R can have attributes, which themselves are objects, and are referenced using a $ sigil. Try the following in the same R session:

graph <- evals('plot(runif(20),runif(20))')

Nothing is displayed to the console because the object named graph contains all the information about the plot, including a reference to the file name containing the graph. Continue the R session by typing in the following expression:


The list of the graph object’s attributes is displayed, including the path and file name produced by pander that’s required to include the graphic inside the final document:

[1] "plots/94976433f4f.png"
[1] "image"

Here, $result contains the path to the image that evals intercepted and wrote to the file named plots/94976433f4f.png. Since $result is part of the graph object, the full reference to retrieve the file name by itself follows:


Storing the plot in the graph variable is an unnecessary step; the line above can be rewritten as:


Using xlab="X" and ylab="Y" sets the plot’s X and Y labels, respectively, exemplified by running the following statement from within the R session:

plot(runif(20),runif(20),xlab="X Label",ylab="Y Label")

Pander’s evals function has a graph.output parameter, which accepts several graphics file formats including png, jpg, svg, or pdf. Scalable vector graphics (SVG) are highly recommended because they can be viewed at any size without losing quality. That is, an SVG file generated using R will appear crisp, not blurry, at any zoom level. Selecting SVG is accomplished by passing a graph.output parameter into the evals function, as per the following line:

  graph.output = "svg"

The final part of the image insertion syntax, which uses pandoc’s Markdown extension (not supported by CommonMark), shrinks the image:


Technically, this violates separating content (an image) from presentation (how the image is rendered on the page). The convenience of controlling the width (or height) so simply is hard to resist. Reasons to avoid this syntax include pandoc-only support and competing alternatives, such as:

![Image Caption | 90%](plots/filename.png) 
![Image Caption](plots/filename.png =90%)

Once a syntax is formally specified in CommonMark, pandoc will probably be updated to support the standard specification.

Plots Directory

By default, the evals function exports files into a directory named plots. This value can be changed by setting the graph.dir parameter. If a different output directory is desired, then (1) the new value will have to propagate from the ci script’s --plots command-line argument through to (2) intarsia as a parameter to Rscript, then (3) set as a variable in bootstrap.R that’s (4) subsequently used for graph.dir. See the vignettes for more information about evals.

Import External Graphics

Even though it is possible to embed R code for graphs directly inside R Markdown files, an approach that offers more flexibility is to define the graphs using external functions that are imported.

Here’s a way to simplify the R Markdown file that generates a complex plot.

  1. Install the following R packages:
    • extrafont – to embed fonts into PDF files
    • ggrepel – to move labels away from each other and data points
    • scales – to determine positions for axes labels and legends
  2. Save the following files to the book directory:
  3. Edit bootstrap.R.
  4. Insert source( 'climate.R' ) to import the climate.R source.
  5. Save the file.
  6. Replace 01.Rmd with the following contents:
## Climate Change

Human activities have been [causally linked](https://www.nature.com/articles/srep21691) to climate change.

![Atmospheric CO~2~ and Temperature](`r plot_graph_climate()`){width=60%}

Viewing output.pdf presents the following page:

External Graph

Open climate.R to see the plot_graph_climate function (near the bottom). Rather than expose pander-specific code to the R Markdown document, the complexity is hidden behind a function call. This provides the ability to change how the path to the generated graph is computed without having to change the R Markdown file, which separates those concerns cleanly:

plot_graph_climate <- function() {
    graph.output = "svg"

By using this technique consistently throughout the document, it would be straightforward to modify all the plots—to use the same background colour, for example—at a single location.

Pander has an evalsOptions function that can help simplify the code. After calling evalsOptions( "graph.output", "svg" ) in bootstrap.R, the plot_graph_climate function reduces to the following:

plot_graph_climate <- function() {
  evals( 'graph_climate()' )[[1]]$result[1]
}

This change is left as an exercise for the reader.

Import Tabular Data

In addition to generating graphs, pander can help convert data in comma-separated value (CSV) files into Markdown tables.

If triggering a rebuild on local changes to CSV files is important, update the ci script’s filter function to accept csv files as follows:

  [[ "${1,,}" =~ \.(.*md|tex|y.?ml|r|csv)$ ]]

Restart ci using -k as usual.

Append the following text to 01.Rmd, which passes the first several rows from a CSV file into the pander function:

The [data](http://www.columbia.edu/~mhs119/Temperature/globalT_1880-1920base.12rm.txt) in Table 1 is courtesy of Dr. Makiko Sato and Dr. James Hansen.

`r pander( head( read.csv( "global-temperature.csv" ) ) )`

Table: Average Annual Temperatures

A rudimentary table is produced:

External Data

The calls are fairly self-explanatory:

Format Table Style

Tables loaded from CSV files can contain reams of data. Change 01.Rmd to make a couple of slight adjustments to the data being included:

`r pander( read.csv( "global-temperature.csv" ), round=c( 2, 4 ) )`

Table: Average Annual Temperatures

The R statement:

Create a new file, book/styles/tables.tex, having the following text:



The first line addresses the issue of the table data flowing off the page:


The setups for the setupxtable macro:

Passing the [head] option to the setupxtable macro instructs ConTeXt to control the presentation of the table’s header row. In this case, the top and bottom cell borders are enabled. Similarly, the [body] and [foot] options provide the same control structure for their corresponding table parts.

Next, update book/main.tex to include the new table styling:

\input headings
\input tables
\input toc

When the book is recompiled, it resembles the following:

Formatted Table

To reiterate, embedding the calls to pander within the R Markdown document tightly couples pander to the document itself. Consider writing a helper R function, instead, that hides the details of how the CSV file is loaded and converted into Markdown. An example left for the reader to implement:

`r table_temperature()`

Zebra Stripes

Changing the background colour of every other table row requires a fair depth of knowledge about TeX macros. Explaining how TeX macros work is beyond the scope of this part. Implement zebra stripes for tables as follows:

  1. Edit styles/colours.tex.
  2. Change ColourTertiaryLt to d6dcdd.
  3. Save the file.
  4. Edit styles/tables.tex.
  5. Define a new macro at the top of the file:
  6. Replace the body setups to include the footer and use the new alternating background colour macro:
\setupxtable[body, foot][
  7. Save the file.

The table appears with alternating rows having a background colour:

Zebra Stripes

Whether decorating every other row helps in any way with reading is a matter of some debate.


Download the resulting files, distributed under the MIT license.


This part touched on creating dynamic documents using the R programming language, converted distances derived from a web application into English text, embedded scalable vector graphics from plots, imported tabular data into a document from an external data source, and described a way to change the appearance of all tables in a document. Part 7 explains how to typeset mathematics within a Markdown document.

