Typesetting Markdown – Part 2: Tool Review
This part describes how to create a PDF file from a Markdown file using pandoc and ConTeXt.
Introduction
Part 1 introduced a shell script template to simplify re-running tasks. Next, let’s create a minimal PDF file using pandoc and ConTeXt.
Requirements
Have the following tools ready:
- ConTeXt, a typesetting engine
- pandoc, a multi-format document processor
- Remarkable, or any plain text editor
- Evince, or any PDF file reader
- inotify-tools, a filesystem monitor
LaTeX and ConTeXt can conflict when installed on the same computer.
Markdown File
Start by creating a spot to store writing, such as:
mkdir -p $HOME/dev/writing/book
Edit a file named 01.md in that directory having the following contents:
# Chapter Title
## Section Title
Marcus Tullius Cicero wrote in _De Finibus Bonorum et Malorum_:
> Omnium rerum principia parva sunt.
Chapter, section, and subsection headings are demarcated using hash symbols (#); quotations are typically written by prefacing each line with a right angle bracket (>); and text emphasis, such as italics, is constrained by underscores (_) on either side of the text. The basic syntax is described at length on many websites.
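For readers who prefer to stay in the terminal, the same file can be created with a here-document. This is only a sketch: BOOK_DIR is a variable name introduced here for illustration, the blank lines between blocks are an assumption about conventional Markdown layout, and running it overwrites any existing 01.md.

```shell
# BOOK_DIR is a hypothetical variable; it defaults to the directory created above.
BOOK_DIR="${BOOK_DIR:-$HOME/dev/writing/book}"
mkdir -p "$BOOK_DIR"

# The quoted 'EOF' delimiter stops the shell from expanding anything in the text.
cat > "$BOOK_DIR/01.md" <<'EOF'
# Chapter Title

## Section Title

Marcus Tullius Cicero wrote in _De Finibus Bonorum et Malorum_:

> Omnium rerum principia parva sunt.
EOF
```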
Markdown was meant to simplify writing HTML documents, whereas pandoc’s enhanced Markdown supports multiple output formats. This distinction makes high-quality typeset Markdown possible.
Pandoc
In a new terminal, issue the following commands:
cd $HOME/dev/writing/book
pandoc 01.md
The following text, representing an HTML document fragment, appears:
<h1 id="chapter-title">Chapter Title</h1>
<h2 id="section-title">Section Title</h2>
<p>Marcus Tullius Cicero wrote in <em>De Finibus Bonorum et Malorum</em>:</p>
<blockquote>
<p>Omnium rerum principia parva sunt.</p>
</blockquote>
Pandoc can produce complete documents or document fragments. When creating a PDF file, complete control over the document styling will be required, which means generating a document fragment (as opposed to a complete, standalone document).
In the same terminal, execute the following command:
pandoc --to context 01.md
The following text appears:
\section[title={Chapter Title},reference={chapter-title}]
\subsection[title={Section Title},reference={section-title}]
Marcus Tullius Cicero wrote in {\em De Finibus Bonorum et Malorum}:
\startblockquote
Omnium rerum principia parva sunt.
\stopblockquote
The --to context (or -t context for short) argument instructs pandoc to produce a ConTeXt document fragment. Pandoc can convert between many input and output formats, including lightweight plain text formats such as AsciiDoc, MultiMarkdown, and reStructuredText. These formats are similar to Markdown and in some ways superior. Surprisingly, pandoc can also convert plain text into heavyweight formats such as docx and odt.
Now by itself, a document fragment cannot be compiled successfully. Pandoc can create standalone ConTeXt documents, but more control over the formatting is needed than what the default standalone document settings provide. More control can be achieved by including document fragments within a container document.
ConTeXt
Some configuration steps may be necessary for ConTeXt to run, depending on how ConTeXt was installed.
Setup
Assuming ConTeXt is installed into /opt/context, configure the bash environment as follows:
Edit $HOME/.bashrc.
Append the following lines:
CONTEXT_HOME=/opt/context
. $CONTEXT_HOME/tex/setuptex > /dev/null 2>&1
export OSFONTDIR="/usr/share/fonts//;$HOME/.fonts//;$OSFONTDIR"
export TEXMFCACHE="$CONTEXT_HOME/tex/texmf-cache"
Save the file.
Close all open terminals.
Open a new terminal.
Change to root and create the cache directory as follows:
sudo -E bash              # retain environment variables
mkdir -p $TEXMFCACHE      # ensure cache directory exists
chmod -R o+w $TEXMFCACHE  # make cache world-writable
exit                      # return to previous user
Populate the cache directory as follows:
mtxrun --generate
Verify that the compiled files appear under /opt/context/tex/texmf-cache/... and, if they are not cached in that directory hierarchy, make sure the directory permissions are correct and try again. For a more secure setup, set TEXMFCACHE to /tmp, but recognise that each time the computer is restarted, the cache will have to be repopulated. Another possibility is to create a context group, but that’s beyond the scope of this document.
If ConTeXt is installed in a directory that is not /opt/context, then change the CONTEXT_HOME variable to reflect its installation path.
The environment is configured.
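Before moving on, it may help to confirm that the settings took effect in the new terminal. The following is a minimal sketch, not part of ConTeXt: check_context_env is a hypothetical helper name, and it only checks that TEXMFCACHE points at a writable directory and that a context command can be found on the PATH.

```shell
# check_context_env is a hypothetical helper, not something ConTeXt provides.
check_context_env() {
  # TEXMFCACHE must be set to a non-empty value.
  if [ -z "${TEXMFCACHE:-}" ]; then
    echo "TEXMFCACHE is not set"
    return 1
  fi
  # The cache directory must exist and be writable by the current user.
  if [ ! -d "$TEXMFCACHE" ] || [ ! -w "$TEXMFCACHE" ]; then
    echo "TEXMFCACHE is not a writable directory: $TEXMFCACHE"
    return 1
  fi
  # The context command should be discoverable on the PATH.
  if ! command -v context > /dev/null 2>&1; then
    echo "context not found on PATH"
    return 1
  fi
  echo "ConTeXt environment looks usable"
}
```

Calling check_context_env after opening a new terminal prints a single line describing the first problem found, or a confirmation if none is found.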
Operating System Font Directory
Regarding the OSFONTDIR variable:
- Double slashes – Double forward slashes (//) in the OSFONTDIR environment variable indicate that font files are to be found by recursively searching directories.
- Home directory – Including ;$HOME/.fonts// allows font files (.otf and .ttf) to be organised in a local, hidden folder for the active user, rather than system-wide.
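To see which font files such a setting would cover, the variable can be unpacked by hand. The sketch below is an illustration only, not how ConTeXt itself resolves fonts: list_fonts is a hypothetical helper that splits OSFONTDIR on semicolons, drops the trailing double slash, and lets find perform the recursive search.

```shell
# list_fonts is a hypothetical helper: it prints every .otf and .ttf file
# found beneath the directories named in OSFONTDIR.
list_fonts() {
  # Split the variable on semicolons, one directory per line.
  printf '%s\n' "${OSFONTDIR:-}" | tr ';' '\n' | while IFS= read -r dir; do
    # Drop the trailing double slash; find already searches recursively.
    dir="${dir%//}"
    # Skip empty entries and directories that do not exist.
    [ -n "$dir" ] && [ -d "$dir" ] || continue
    find "$dir" -type f \( -name '*.otf' -o -name '*.ttf' \)
  done
}
```

Running list_fonts with the OSFONTDIR value from the setup steps lists both system-wide fonts and any fonts stored under $HOME/.fonts.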
Simple Document
Before creating a container document, edit a new file called 01.tex having the following contents:
\starttext
The beginnings of all things are small.
\stoptext
Next, run the following command:
context --nonstopmode --batchmode 01.tex
A PDF file is generated.
If the following error appears, ConTeXt is not configured properly:
mtxrun | unknown script 'context.lua' or 'mtx-context.lua'
Review the official setup instructions and try again. If problems persist, post a question on either the ConTeXt Mailing List or the TeX Stack Exchange network.
Otherwise, the directory will contain 01.pdf as well as ancillary files 01.log and 01.tuc. ConTeXt can remove the files after compiling the document when given the --purgeall argument as follows:
context --nonstopmode --batchmode --purgeall 01.tex
A PDF file, 01.pdf, is generated and all ancillary files removed. Open the file to reveal the following content:
PDF file generated by ConTeXt
The next step is to create a container document that includes ConTeXt document fragments generated by pandoc.
Container Document
Edit main.tex to have the following content:
\starttext
\input body
\stoptext
The \starttext and \stoptext macros indicate where the page content begins and ends, respectively. The \input macro directs ConTeXt to insert the contents of the file body.tex into the document (the filename extension is optional).
Next, run pandoc to generate the aforementioned body document from the Markdown source file (01.md) as follows:
pandoc -t context -o body.tex 01.md
Generate the PDF file as before, using main.tex this time, by executing the following command:
context --nonstopmode --batchmode --purgeall main.tex
Open main.pdf to see that 01.md was converted to body.tex by pandoc, then main.tex and body.tex were converted to main.pdf using ConTeXt:
PDF file using container document
Any time the file 01.md changes, running the following commands will produce an updated PDF file:
pandoc -t context -o body.tex 01.md
context --nonstopmode --batchmode --purgeall main.tex
Creating a shell script to perform these tasks will expedite writing.
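As a rough idea of what such a script could look like, the two commands can be wrapped in a function. This is a minimal sketch rather than the build script developed later in the series: build_pdf is a hypothetical name, and PANDOC and CONTEXT are assumed override variables introduced here so the commands can be substituted.

```shell
# build_pdf is a hypothetical helper that repeats the two commands above.
# PANDOC and CONTEXT are assumed override variables; when unset, the
# real pandoc and context commands are used.
build_pdf() {
  # First argument: the Markdown file to convert (defaults to 01.md).
  markdown="${1:-01.md}"
  # Convert the Markdown file into a ConTeXt document fragment.
  "${PANDOC:-pandoc}" -t context -o body.tex "$markdown" || return 1
  # Typeset the container document, which includes body.tex.
  "${CONTEXT:-context}" --nonstopmode --batchmode --purgeall main.tex
}
```

Running build_pdf 01.md from the book directory regenerates main.pdf. Stopping when pandoc fails avoids typesetting a stale body.tex.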
Continuous Integration
Continuous integration is used by software development teams to ensure that revisions to computer programs can be built, tested, and deployed automatically. Conceptually, continuous integration (CI) entails waiting for system events then running commands when an event is triggered. More concretely, CI involves monitoring the filesystem for changes to files then building a computer program’s source code. Similarly, when writing prose, it would be convenient to regenerate and reload the PDF file automatically when any Markdown files change. That is, any updates to the Markdown files being edited must be continuously integrated into the resulting PDF file.
Implementing CI to rebuild a PDF file requires (1) a PDF reader that can reload updated PDF files; and (2) a way to monitor a directory for any changes to the files within it.
PDF Reader
Adobe Reader can reload a PDF file using Ctrl+r, but that’s hardly automatic. Technically, it is possible to monitor the PDF file for changes and then send the Ctrl+r key combination to Adobe Reader, but that’s more work than installing Evince, which reloads updated PDF files by default.
In short, install Evince and use it to review PDF files.
Monitor Directory
The inotifywait manual states that the program “is suitable for waiting for changes to files from shell scripts.”
See how the program works as follows:
Open two new terminals.
Run the following commands in the first terminal:
AWAIT=close_write
inotifywait -q -e $AWAIT -m $HOME/dev/writing/book | \
while read -r d e f; do echo "$d$f changed ($e)"; done
Run the following command in the second terminal:
touch $HOME/dev/writing/book/02.md
The first terminal shows:
/home/username/dev/writing/book/02.md changed (CLOSE_WRITE,CLOSE)
Taking a closer look, the following line assigns a single event to the AWAIT variable:
AWAIT=close_write
The variable is passed to inotifywait using the -e argument, which accepts a comma-delimited list of event names, themselves documented in the manual; the -q argument instructs inotifywait to output only when files have changed; the -m argument keeps inotifywait monitoring indefinitely rather than exiting after the first event; and the final argument is the directory path to watch for changes:
inotifywait -q -e $AWAIT -m $HOME/dev/writing/book | \
The output from inotifywait is passed into the read command within an infinite while loop:
while read -r d e f; do echo "$d$f changed ($e)"; done
Each line that inotifywait writes contains a directory name, a comma-delimited list of related events, and a filename. These values are read into variables d, e, and f, respectively. When writing a build script, the variables will be given descriptive names.
Next, edit $HOME/dev/writing/book/02.md in a plain text editor, add some text, then save the file. Depending on whether the editor uses temporary files, multiple lines may be written to the first terminal, such as:
/home/username/dev/writing/book/4913 changed (CLOSE_WRITE,CLOSE)
/home/username/dev/writing/book/02.md changed (CLOSE_WRITE,CLOSE)
A filter can ensure that only changes to Markdown files trigger rebuilding.
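One possible filter is a filename test inside the loop. The sketch below is an assumption about how that might look, not the script from Part 3: is_markdown is a hypothetical helper, and printf stands in for inotifywait by replaying the two lines shown above.

```shell
# is_markdown is a hypothetical helper: it succeeds only for names ending in .md.
is_markdown() {
  case "$1" in
    *.md) return 0 ;;
    *)    return 1 ;;
  esac
}

# Simulated inotifywait output; a real script would pipe inotifywait here.
printf '%s\n' \
  "/home/username/dev/writing/book/ CLOSE_WRITE,CLOSE 4913" \
  "/home/username/dev/writing/book/ CLOSE_WRITE,CLOSE 02.md" |
while read -r directory events filename; do
  # Ignore temporary files written by the editor.
  if is_markdown "$filename"; then
    echo "$directory$filename changed ($events)"
  fi
done
# prints: /home/username/dev/writing/book/02.md changed (CLOSE_WRITE,CLOSE)
```

Matching on the filename extension keeps the filter independent of which temporary names a given editor happens to use.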
Summary
This part introduced how to generate a PDF file using pandoc and ConTeXt along with a brief overview for how to continuously integrate changes. Part 3 combines Part 1 and Part 2 to create a flexible, user-friendly build script that automatically produces a PDF file upon changes to Markdown files.