Salim B edited this page Sep 30, 2017 · 35 revisions

Here are some tricks that pandoc allows but that are not obvious at first sight.

From Markdown, To Markdown

Using pandoc -f markdown... -t markdown... can have surprisingly useful applications. As a demo, this file was generated by:

pandoc -t markdown_github --atx-headers --normalize --reference-location=block --toc -s -o temp-github.md temp.md

Be careful with @, though: pandoc treats it as the start of a citation, so you need to escape it.

Cleanup

As shown in issue #2814, rendering a document to itself can be used to clean up / normalize your markdown file.

TOC generation

For example, if you have a long markdown file on GitHub and want it to have a TOC, you can use pandoc -t markdown_github --toc -o example-with-toc.md example.md

This is a useful workaround for updating the TOC of very long documents, but beware: if you use this trick to write over the input file, you'll end up stacking TOCs, with each new Table of Contents generated above the previously built ones and indexing them too. The technique works best when the source and output files are different.
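One way to avoid the stacking problem is to wrap the command in a small guard that refuses to run when the input and output paths are the same. The sketch below is only an illustration: the function name toc_to is hypothetical, and the check compares the path strings as typed, so it would not catch two different spellings of the same file.

```shell
# Hypothetical helper (not part of pandoc): regenerate a TOC, but never in place.
toc_to() {
  input=$1
  output=$2
  # Identical paths mean the next run would index the previously generated TOC too.
  # Note: this is a plain string comparison, so ./example.md and example.md differ.
  if [ "$input" = "$output" ]; then
    echo "refusing to overwrite $input: write the TOC version to another file" >&2
    return 1
  fi
  pandoc -t markdown_github --toc -o "$output" "$input"
}

# Usage: toc_to example.md example-with-toc.md
```

Keeping the source file TOC-free and regenerating example-with-toc.md from it each time means the TOC is always built from the headings alone.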

Also, you can add a title to the TOC using the toc-title variable, but only if you use a markdown template, as explained below.

Using Markdown Templates

Did you know that you can use pandoc templates with markdown too?

Ask pandoc to write out the default template for markdown:

pandoc --print-default-template=markdown > template.markdown

And now let's peek at the template we got:

$if(titleblock)$
$titleblock$

$endif$
$for(header-includes)$
$header-includes$

$endfor$
$for(include-before)$
$include-before$

$endfor$
$if(toc)$
$toc$

$endif$
$body$
$for(include-after)$

$include-after$
$endfor$

As you can see, there are plenty of conditional statements to play with, allowing for additional control over the output markdown file.

You can also use the toc-title template variable to tell pandoc to add a title on top of the generated TOC. Change the template’s toc block like this:

$if(toc)$
$if(toc-title)$
# $toc-title$
$endif$

$toc$

$endif$

And now invoke pandoc like this:

pandoc --toc -V toc-title:"Table of Contents" --template=template.markdown -o example-with-toc.md example.md

The example-with-toc.md file will then contain an auto-generated Table of Contents with a # Table of Contents heading above it.

NOTE: if you also include extra markdown content with the --include-before-body option (e.g. --include-before-body=somefile.md), the contents of the included file go before the TOC (at least with the template used in this example), and any headings it contains are not included in the TOC; that is, the TOC only indexes what comes after the $toc$ template tag. This is useful if you'd like to include an Abstract before the TOC.

Math in Pure Markdown

The manual says:

Note: the --webtex option will affect Markdown output as well as HTML.

This can be used to put math in pure markdown, e.g. when you want math directly in a README.md on GitHub.

For example, given this temp.md:

# Important Discovery!

$1+2\neq3!$

Try it!

Run this:

pandoc --atx-headers --webtex=https://latex.codecogs.com/png.latex? -s -o temp-codecogs.md temp.md

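Then the output, as rendered, becomes (the formula is an image served from the webtex URL; only its alt text is shown here):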

Important Discovery!

1+2\neq3!

Try it!

Convert Between the 4 Table Syntaxes in Pandoc

Say your source markdown file pipe.md contains:

| testing     | pandoc            | tables  |
|-------------|-------------------|---------|
| simple cell | no multiline cell | and     |
| so          | on                | no list |

On the command line, run:

pandoc -t markdown-simple_tables-multiline_tables-pipe_tables -s -o grid.md pipe.md

The output grid.md then contains:

+--------------------------+--------------------------+--------------------------+
| testing                  | pandoc                   | tables                   |
+==========================+==========================+==========================+
| simple cell              | no multiline cell        | and                      |
+--------------------------+--------------------------+--------------------------+
| so                       | on                       | no list                  |
+--------------------------+--------------------------+--------------------------+

Repeated Footnote Anchors and Headers Across Multiple Files

If you use auto-identifiers for headers and different files contain headers with the same name, you'll want to concatenate the files so that the identifiers are assigned across the whole document, and pandoc can do this for you:

pandoc file1.md file2.md ...

But if the same footnote anchors are repeated in both files, you need to use the --file-scope option, which parses each file individually (so the footnote anchors are "local" to each file):

pandoc file1.md file2.md --file-scope ...

What if the two files have both problems, i.e. headers with the same name (hence the same ID from the auto-identifier) and footnotes with the same anchors? Either approach alone gives you problems.

In this case, you can use "to markdown from markdown" to write an intermediate markdown file using --file-scope, which handles the colliding footnote anchors for you, and then generate the final document from that intermediate markdown file, and let the auto-identifiers handle the headers for you:

pandoc --file-scope -o intermediate.md file1.md file2.md
pandoc intermediate.md ...

Template Snippet

Suppose you wrote a template snippet that does not form a complete template. The -H, -B, and -A options will not help, because pandoc inserts your snippet as is and does not process it as a template; that is, the snippet is included after the template is processed.

A trick mentioned by @cagix in jgm/pandoc-templates#220 is this:

pandoc --template=template_snippet.tex document.md -o processed_snippet
pandoc ... -H processed_snippet document.md -o document.<toFormat>
# Or shorter but bash only (process substitution)
SNIPPET=template_snippet.tex; INPUT=document.md; OUTPUT=document.<toFormat>
pandoc ... -H <(pandoc --template=$SNIPPET $INPUT) $INPUT -o $OUTPUT

The first line processes your template snippet according to the properties of the document, but since your snippet (probably) does not contain $body$, the body will not be in the output. The processed snippet can then be included as is through -H in the second line.

YAML Metadata for Any Format

YAML metadata is only defined for pandoc's markdown syntax. See jgm/pandoc#1960.

Currently, a workaround is to convert the metadata and the document to pandoc's native format separately and then combine them (the values in the YAML file are still parsed as markdown):

pandoc -f markdown -t native -s metadata.yml | sed '$ d' > metadata.native
pandoc -t native -o document.native document.<fromFormat>
pandoc -f native -s -o document.<toFormat> metadata.native document.native
# Or shorter but bash only (process substitution)
YAML=metadata.yml; INPUT=document.<fromFormat>; OUTPUT=document.<toFormat>
pandoc ... -f native -s -o $OUTPUT <(pandoc -f markdown -t native -s $YAML | sed '$ d') <(pandoc -t native $INPUT)

Explanation:

The sed in the first line is needed because metadata.yml is treated as a markdown document with no body, so the last line of its native output is [], which you need to remove. Another way to remove it is head -n -1 (which does not work with the default head on macOS). In my tests the metadata in native format always fits on one line; if that holds, head -n1 also works (including on macOS).
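To see how the two trimming approaches behave, here is a small sketch that runs them on stand-in text. The fake_native and fake_long_native functions are hypothetical and only mimic the shape described above (metadata lines followed by a final [] line); they are not real pandoc output.

```shell
# Stand-ins for `pandoc -f markdown -t native -s metadata.yml`.
# NOT real pandoc output: just placeholder metadata followed by the empty body [].
fake_native() {
  printf '%s\n' 'META-LINE' '[]'
}
fake_long_native() {
  printf '%s\n' 'META-1' 'META-2' '[]'
}

# One-line metadata: deleting the last line and keeping the first line agree.
with_sed=$(fake_native | sed '$ d')
with_head=$(fake_native | head -n 1)

# Multi-line metadata: only sed keeps everything before the trailing [].
long_sed=$(fake_long_native | sed '$ d')
long_head=$(fake_long_native | head -n 1)

echo "$with_sed"
echo "$with_head"
echo "$long_sed"
echo "$long_head"
```

If the metadata ever spans more than one line, sed '$ d' is the safer choice, since head -n 1 would silently drop the rest.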

Left-aligning Tables in LaTeX

Based on this pandoc-discuss exchange and this TeX StackExchange topic, it is possible to left-align all tables in a document (in the PDF output from LaTeX) with this single invocation in the YAML header block of the markdown document:

---
header-includes:
  - \usepackage[margins=raggedright]{floatrow}
---

This applies to all floats, and fine-grained control may be achieved with the options outlined in the documentation for the floatrow LaTeX package.

GFM Task Lists with Pandoc

GitHub flavored markdown's task lists can be mimicked in pandoc via the GFM-TaskList.pp custom pp macros module. Macro syntax example:

!TaskList(
!Task[x][I'm a _checked_ task]
!Task[ ][I'm an _unchecked_ task]
)

For more information, and to download the GFM-TaskList.pp macros module:

Today's Date in Metadata

Add this to the pandoc command you use:

-M date="`date "+%B %e, %Y"`"

POSIX only.
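As a variation, the date can be stored in a variable first, which makes it easier to inspect before passing it to pandoc. This is a sketch under a few assumptions: the variable name today is arbitrary, and the tr step is an optional addition that squeezes the double space which %e leaves in front of single-digit days.

```shell
# %e pads single-digit days with a leading space; tr -s ' ' squeezes the
# resulting run of spaces down to one.
today=$(date "+%B %e, %Y" | tr -s ' ')
echo "$today"

# Then pass it along, for example:
#   pandoc -M date="$today" ...
```

The month name follows the current locale, so the exact text depends on the environment in which the command runs.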