Typesetting Markdown – Part 2: Tool Review

This part describes how to create a PDF file from a Markdown file using pandoc and ConTeXt.

Introduction

Part 1 introduced a shell script template to simplify re-running tasks. Next, let’s create a minimal PDF file using pandoc and ConTeXt.

Requirements

Have the following tools ready:

LaTeX and ConTeXt can conflict when installed on the same computer.

Markdown File

Start by creating a spot to store writing, such as:

mkdir -p $HOME/dev/writing/book

Edit a file named 01.md in that directory having the following contents:

# Chapter Title
## Section Title

Marcus Tullius Cicero wrote in _De Finibus Bonorum et Malorum_:

> Omnium rerum principia parva sunt.

Chapter, section, and subsection headings are marked with hash symbols (#); quotations are typically written by prefixing each line with a right angle bracket (>); and text emphasis, such as italics, is marked by underscores (_) on either side of the text. The basic syntax is described at length on many websites.

Markdown was meant to simplify writing HTML documents, whereas pandoc’s enhanced Markdown supports multiple output formats. This distinction makes high-quality typeset Markdown possible.
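The sample file can also be created directly from the terminal. The following is a minimal sketch, assuming a POSIX shell; the BOOK_DIR variable is a hypothetical convenience that defaults to the directory created earlier.

```shell
# BOOK_DIR is a hypothetical variable; it defaults to the directory used above.
BOOK_DIR="${BOOK_DIR:-$HOME/dev/writing/book}"
mkdir -p "$BOOK_DIR"

# The quoted delimiter ('EOF') stops the shell from expanding anything
# inside the here-document, so the Markdown is written verbatim.
cat > "$BOOK_DIR/01.md" << 'EOF'
# Chapter Title
## Section Title

Marcus Tullius Cicero wrote in _De Finibus Bonorum et Malorum_:

> Omnium rerum principia parva sunt.
EOF
```

Running the snippet a second time overwrites 01.md, so it is best suited to setting up a fresh directory.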

Pandoc

In a new terminal, issue the following commands:

cd $HOME/dev/writing/book
pandoc 01.md

The following text, representing an HTML document fragment, appears:

<h1 id="chapter-title">Chapter Title</h1>
<h2 id="section-title">Section Title</h2>
<p>Marcus Tullius Cicero wrote in <em>De Finibus Bonorum et Malorum</em>:</p>
<blockquote>
<p>Omnium rerum principia parva sunt.</p>
</blockquote>

Pandoc can produce complete documents or document fragments. When creating a PDF file, complete control over the document styling will be required, which means generating a document fragment (as opposed to a complete, standalone document).

In the same terminal, execute the following command:

pandoc --to context 01.md

The following text appears:

\section[title={Chapter Title},reference={chapter-title}]

\subsection[title={Section Title},reference={section-title}]

Marcus Tullius Cicero wrote in {\em De Finibus Bonorum et Malorum}:

\startblockquote
Omnium rerum principia parva sunt.
\stopblockquote

The --to context argument (or -t context for short) instructs pandoc to produce a ConTeXt document fragment. Pandoc can convert between many input and output formats, including lightweight plain text formats such as AsciiDoc, MultiMarkdown, and reStructuredText. These formats are similar to Markdown and in some ways superior. Pandoc can also convert plain text into heavyweight formats such as docx and odt.

On its own, a document fragment cannot be compiled successfully. Pandoc can create standalone ConTeXt documents, but more control over the formatting is needed than what the default standalone document settings provide. More control can be achieved by including document fragments within a container document.
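The difference between a fragment and a standalone document can be checked from the terminal. The sketch below assumes that a complete ConTeXt document can be recognised by the presence of \starttext; the classify_context function is a hypothetical helper, and the pandoc calls are skipped when pandoc is not installed.

```shell
# classify_context: report whether ConTeXt source read from standard input
# is a complete document (contains \starttext) or a fragment (does not).
# The function name is hypothetical.
classify_context() {
  if grep -q '\\starttext'; then
    echo "standalone"
  else
    echo "fragment"
  fi
}

if command -v pandoc > /dev/null 2>&1; then
  # Default behaviour: pandoc writes a document fragment.
  echo '# Chapter Title' | pandoc --to context | classify_context
  # With --standalone, pandoc wraps the body in its default template.
  echo '# Chapter Title' | pandoc --standalone --to context | classify_context
else
  echo "pandoc not found; skipping" >&2
fi
```

The first pipeline uses the fragment output shown earlier; the second adds --standalone, which wraps the same body in pandoc’s default ConTeXt template.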

ConTeXt

Some configuration steps may be necessary for ConTeXt to run, depending on how ConTeXt was installed.

Setup

Assuming ConTeXt is installed into /opt/context, configure the bash environment as follows:

  1. Edit $HOME/.bashrc.

  2. Append the following lines:

    CONTEXT_HOME=/opt/context
    . $CONTEXT_HOME/tex/setuptex > /dev/null 2>&1
    
    export OSFONTDIR="/usr/share/fonts//;$HOME/.fonts//;$OSFONTDIR"
    export TEXMFCACHE="$CONTEXT_HOME/tex/texmf-cache"

  3. Save the file.

  4. Close all open terminals.

  5. Open a new terminal.

  6. Change to root and create the cache directory as follows:

    sudo -E bash              # retain environment variables
    mkdir -p $TEXMFCACHE      # ensure cache directory exists
    chmod -R o+w $TEXMFCACHE  # make cache world-writable
    exit                      # return to previous user

  7. Populate the cache directory as follows:

    mtxrun --generate

Verify that files are compiled under /opt/context/tex/texmf-cache/.... If the files are not cached in that directory hierarchy, make sure the directory permissions are correct and try again. For a more secure setup, set TEXMFCACHE to /tmp, but recognise that each time the computer is restarted, the cache will have to be repopulated. Another possibility is to create a context group, but that’s beyond the scope of this document.

If ConTeXt is installed in a directory that is not /opt/context, then change the CONTEXT_HOME variable to reflect its installation path.

The environment is configured.
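A quick sanity check can confirm the setup before moving on. This is a minimal sketch, not part of the official ConTeXt setup; the function names check_context_command and check_cache_dir are hypothetical, and the fallback path assumes the /opt/context installation described above.

```shell
# check_context_command: confirm that the context command is on the PATH.
# The function name is hypothetical.
check_context_command() {
  if command -v context > /dev/null 2>&1; then
    echo "ok: context found at $(command -v context)"
  else
    echo "problem: context not found on PATH"
    return 1
  fi
}

# check_cache_dir: confirm that a directory exists and is writable by the
# current user. The function name is hypothetical.
check_cache_dir() {
  dir="$1"
  if [ -d "$dir" ] && [ -w "$dir" ]; then
    echo "ok: $dir is writable"
  else
    echo "problem: $dir is missing or not writable"
    return 1
  fi
}

# Report on both checks without aborting the shell when one fails.
check_context_command || true
check_cache_dir "${TEXMFCACHE:-/opt/context/tex/texmf-cache}" || true
```

If either check reports a problem, revisit the steps above before continuing.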

Operating System Font Directory

Regarding the OSFONTDIR variable:

Simple Document

Before creating a container document, edit a new file called 01.tex having the following contents:

\starttext
  The beginnings of all things are small.
\stoptext

Next, run the following command:

context --nonstopmode --batchmode 01.tex

A PDF file is generated.

If the following error appears, ConTeXt is not configured properly:

mtxrun          | unknown script 'context.lua' or 'mtx-context.lua'

Review the official setup instructions and try again. If problems persist, post a question on either the ConTeXt Mailing List or the TeX Stack Exchange network.

Otherwise, the directory will contain 01.pdf as well as ancillary files 01.log and 01.tuc. ConTeXt can remove the files after compiling the document when given the --purgeall argument as follows:

context --nonstopmode --batchmode --purgeall 01.tex

A PDF file, 01.pdf, is generated and all ancillary files are removed. Open the file to reveal the following content:

Figure: PDF file generated by ConTeXt (sample page)

The next step is to create a container document that includes ConTeXt document fragments generated by pandoc.

Container Document

Edit main.tex to have the following content:

\starttext
  \input body
\stoptext

The \starttext and \stoptext macros indicate where the page content begins and ends, respectively. The \input macro directs ConTeXt to insert the contents of the file body.tex into the document (the filename extension is optional).

Next, run pandoc to generate the aforementioned body document from the Markdown source file (01.md) as follows:

pandoc -t context -o body.tex 01.md

Generate the PDF file as before, using main.tex this time, by executing the following command:

context --nonstopmode --batchmode --purgeall main.tex

Open main.pdf to see that 01.md was converted to body.tex by pandoc, then main.tex and body.tex were converted to main.pdf using ConTeXt:

Figure: PDF file using container document (main page)

Any time the file 01.md changes, running the following commands will produce an updated PDF file:

pandoc -t context -o body.tex 01.md
context --nonstopmode --batchmode --purgeall main.tex

Creating a shell script to perform these tasks will expedite writing.
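As a starting point, the two commands can be wrapped in a pair of shell functions. The following is a minimal sketch rather than a finished build script; the names run, build_pdf, and DRY_RUN are hypothetical, and the filenames are the ones used above.

```shell
# run: execute a command, or only print it when DRY_RUN is set.
# Both run and DRY_RUN are hypothetical helpers for this sketch.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# build_pdf: convert the Markdown source to a ConTeXt fragment, then
# compile the container document. The && stops the build at the first
# failing step, so ConTeXt never compiles a stale body.tex.
build_pdf() {
  run pandoc -t context -o body.tex 01.md &&
  run context --nonstopmode --batchmode --purgeall main.tex
}
```

Calling build_pdf from the book directory runs both steps. Calling it as DRY_RUN=1 build_pdf prints the commands without executing them, which is handy for checking the script on a machine where the tools are not yet installed.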

Continuous Integration

Continuous integration is used by software development teams to ensure that revisions to computer programs can be built, tested, and deployed automatically. Conceptually, continuous integration (CI) entails waiting for system events then running commands when an event is triggered. More concretely, CI involves monitoring the filesystem for changes to files then building a computer program’s source code. Similarly, when writing prose, it would be convenient to regenerate and reload the PDF file automatically when any Markdown files change. That is, any updates to the Markdown files being edited must be continuously integrated into the resulting PDF file.

Implementing CI to rebuild a PDF file requires (1) a PDF reader that can reload updated PDF files; and (2) a way to monitor a directory for any changes to the files within it.

PDF Reader

Adobe Reader can reload a PDF file using Ctrl+r, but that’s hardly automatic. Technically, it is possible to monitor the PDF file for changes and then send the Ctrl+r key combination to Adobe Reader, but that’s more work than installing Evince, which reloads updated PDF files by default.

In short, install Evince and use it to review PDF files.

Monitor Directory

The inotifywait manual states that the program "is suitable for waiting for changes to files from shell scripts."

See how the program works as follows:

  1. Open two new terminals.

  2. Run the following commands in the first terminal:

    AWAIT=close_write
    inotifywait -q -e $AWAIT -m $HOME/dev/writing/book | \
    while read -r d e f; do echo "$d$f changed ($e)"; done

  3. Run the following command in the second terminal:

    touch $HOME/dev/writing/book/02.md

The first terminal shows:

/home/username/dev/writing/book/02.md changed (CLOSE_WRITE,CLOSE)

Taking a closer look, the following line assigns a single event to the AWAIT variable:

AWAIT=close_write

The variable is passed to inotifywait using the -e argument, which accepts a comma-delimited list of event names, themselves documented in the manual; the -q argument suppresses informational messages so that inotifywait writes output only when events occur; the -m argument instructs inotifywait to keep monitoring indefinitely rather than exit after the first event; and the final argument is the directory path to watch for changes:

inotifywait -q -e $AWAIT -m $HOME/dev/writing/book | \

The output from inotifywait is passed into the read command within a while loop, which keeps running for as long as inotifywait keeps writing lines:

while read -r d e f; do echo "$d$f changed ($e)"; done

Each line that inotifywait writes contains a directory name, a comma-delimited list of related events, and a filename. These values are read into variables d, e, and f, respectively. When writing a build script, the variables will be given descriptive names.

Next, edit $HOME/dev/writing/book/02.md in a plain text editor, add some text, then save the file. Depending on whether the editor uses temporary files, multiple lines may be written to the first terminal, such as:

/home/username/dev/writing/book/4913 changed (CLOSE_WRITE,CLOSE)
/home/username/dev/writing/book/02.md changed (CLOSE_WRITE,CLOSE)

A filter can ensure that only changes to Markdown files trigger rebuilding.
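One possible filter is sketched below, assuming that Markdown files always end in .md. The function names is_markdown and filter_events are hypothetical, and the loop variables are given descriptive names in place of d, e, and f.

```shell
# is_markdown: succeed only for filenames ending in .md.
# The function name is hypothetical.
is_markdown() {
  case "$1" in
    *.md) return 0 ;;
    *)    return 1 ;;
  esac
}

# filter_events: read inotifywait-style lines (directory, events, filename)
# from standard input and report only the Markdown files.
filter_events() {
  while read -r directory events filename; do
    if is_markdown "$filename"; then
      echo "$directory$filename changed ($events)"
    fi
  done
}
```

To try it, pipe the earlier command into the filter: inotifywait -q -e close_write -m $HOME/dev/writing/book | filter_events. Temporary files such as 4913 are then ignored, while 02.md is still reported. Like the original loop, this sketch does not handle watched directory paths that contain spaces.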

Summary

This part showed how to generate a PDF file using pandoc and ConTeXt, along with a brief overview of how to continuously integrate changes. Part 3 combines Part 1 and Part 2 to create a flexible, user-friendly build script that automatically produces a PDF file upon changes to Markdown files.
