---
author:
- Mauro Bieg
- John MacFarlane
title: Customizing Pandoc
---

This document provides a quick overview of the various ways to
customize pandoc's output, with links to fuller documentation
and some examples.

## Templates

When the `-s`/`--standalone` option is used, pandoc will
generate a standalone document rather than a fragment.
For example, in HTML output this will include the
`<head>` element; in LaTeX output, it will include the
preamble.

Pandoc comes with a default template for (almost) every output
format. A template is a plain text file containing variables
that are replaced by text generated by pandoc.  For example,
the variable `$body$` will be replaced by the document body,
and `$title$` by the title from metadata.
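
As a rough mental model of the substitution described above, the following Python sketch replaces `$name$` placeholders with values from a dictionary. It is not pandoc's template engine, which also supports conditionals, loops, and partials; the `render` function and the sample template are made up for illustration.

```python
import re

def render(template, variables):
    """Replace each $name$ placeholder with its value.

    Names missing from `variables` are replaced by an empty string
    (a simplification chosen for this sketch).
    """
    def substitute(match):
        return str(variables.get(match.group(1), ""))
    return re.sub(r"\$([A-Za-z][A-Za-z0-9_-]*)\$", substitute, template)

# Hypothetical, heavily simplified HTML template.
template = "<title>$title$</title>\n<body>\n$body$\n</body>"
print(render(template, {"title": "My Report", "body": "<p>Hello.</p>"}))
```

The real default templates are much longer; `pandoc -D html` shows the one used for HTML output.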

To look at the default template for an output format, run
`pandoc -D FORMAT`, where `FORMAT` is the name of
the format. For example, `pandoc -D latex` prints the default
LaTeX template. You can also use your
own template instead, either by using the `--template` option
or by putting the custom template in your user data directory
(on Linux and macOS, `~/.pandoc/templates/`).

Note that in many cases you can avoid the need for a custom
template by including a file with the `--include-in-header`,
`--include-before-body`, or `--include-after-body` option.
Or you can set the corresponding template variable directly.

### Template variables

There are several ways to set template variables:

|      | [`--variable`]   | [`--metadata`]   | [YAML metadata] and [`--metadata-file`] |
|:---------------|:------------------|:------------------|:----------------------------|
| values can be… | strings and bools | strings and bools | also YAML objects and lists |
| strings are…   | inserted verbatim | escaped           | interpreted as markdown     |
| accessible by filters: | no        | yes               | yes                         |


[`--variable`]:      https://pandoc.org/MANUAL.html#option--variable
[`--metadata`]:      https://pandoc.org/MANUAL.html#option--metadata
[YAML metadata]:     https://pandoc.org/MANUAL.html#extension-yaml_metadata_block
[`--metadata-file`]: https://pandoc.org/MANUAL.html#option--metadata-file



For more information, see [Templates](https://pandoc.org/MANUAL.html#templates) in
the pandoc manual.

### Example: adding structured author data to HTML

TODO

### Example: generating documents from YAML metadata

TODO <!-- Example of generating a structured document,
say, a table, from structured YAML metadata using
just the control structures in pandoc's template
language. -->

## Reference docx/pptx/odt

For `docx`, `pptx`, or `odt` documents, things are a bit more
complicated. Instead of a single template file, you need to
provide a customized reference document (`reference.docx`,
`reference.pptx`, or `reference.odt`).
See the manual for the
[`--reference-doc`](https://pandoc.org/MANUAL.html#option--reference-doc) option.

### Example: changing the font and line spacing in a Word docx

TODO

## Filters

Templates are very powerful, but they are only a sort of scaffold to
place your document's body text in. You cannot directly change the
body text using the template.

If you need to affect the output of the actual body text, you
can use a pandoc filter. A filter is a small program that
transforms the document, between the parsing and the writing phase,
while it is still in pandoc's native format. For example,
a filter might find all the Header elements of a document
and capitalize their text.

Pandoc's native representation of a document is an
abstract syntax tree (AST), not unlike the HTML DOM. It is
documented in the
[`Text.Pandoc.Definition`](https://hackage.haskell.org/package/pandoc-types/docs/Text-Pandoc-Definition.html)
module of the `pandoc-types` package. A `Pandoc` document consists of
metadata (`Meta`) and a list of `Block`s. The `Block`s, in
turn, are composed of other `Block`s and `Inline` elements.
(`Block` elements are things like paragraphs, lists, headers,
and code blocks. `Inline` elements are individual words,
links, emphasis, and so on.) Filters operate on these
elements.  You can use `pandoc -t native` to learn about the
AST's structure.
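
To make the structure concrete, here is a small Python sketch that walks a hand-written document in the shape of pandoc's JSON serialization of the AST, where each element is an object with a type tag `"t"` and optional contents `"c"`. The sample document, the API version number, and the `element_tags` helper are illustrative assumptions, not output copied from pandoc.

```python
def element_tags(node):
    """Yield the "t" tag of every AST element under node, in document order."""
    if isinstance(node, list):
        for child in node:
            yield from element_tags(child)
    elif isinstance(node, dict):
        if "t" in node:
            yield node["t"]
        for value in node.values():
            yield from element_tags(value)

# Hand-written stand-in for a document with one header and one paragraph.
# The API version is illustrative; real output carries the version of
# the pandoc that produced it.
doc = {
    "pandoc-api-version": [1, 23],
    "meta": {},
    "blocks": [
        {"t": "Header", "c": [1, ["hello-world", [], []], [
            {"t": "Str", "c": "Hello"}, {"t": "Space"}, {"t": "Str", "c": "world"}]]},
        {"t": "Para", "c": [
            {"t": "Str", "c": "Body"}, {"t": "Space"}, {"t": "Str", "c": "text"}]},
    ],
}

print(list(element_tags(doc)))
# ['Header', 'Str', 'Space', 'Str', 'Para', 'Str', 'Space', 'Str']
```

The two `Block`s (`Header` and `Para`) each contain a list of `Inline`s (`Str` and `Space`), which is the nesting described above.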

There are two kinds of filters: JSON filters (which transform a
JSON serialization of the pandoc AST, and may be written in any
language that can parse and emit JSON), and Lua filters (which
use an interface built directly into pandoc, and must be written
in the Lua language).  If you are writing your own filters, it
is best to use Lua filters, which are more portable (they
require only pandoc itself) and more efficient.  See [Lua
filters](https://pandoc.org/lua-filters.html) for documentation and examples.  If
you would prefer to write your filter in another language, see
[Filters](https://pandoc.org/filters.html) for a gentle introduction to JSON
filters.
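
As an illustration of the header-capitalizing filter mentioned earlier, the core of a JSON filter might look like the following Python sketch. Python is used here only because JSON filters may be written in any language; a Lua filter remains the recommended choice. The function names and the sample document are hypothetical, and the code assumes the JSON layout in which a `Header` holds `[level, attributes, inlines]`.

```python
def upper_strs(node):
    """Return a copy of node with the text of every Str element uppercased."""
    if isinstance(node, list):
        return [upper_strs(child) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "Str":
            return {"t": "Str", "c": node["c"].upper()}
        return {key: upper_strs(value) for key, value in node.items()}
    return node  # plain strings such as link targets are left untouched

def capitalize_headers(node):
    """Return a copy of the AST in which every Header has uppercase text."""
    if isinstance(node, list):
        return [capitalize_headers(child) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "Header":
            level, attr, inlines = node["c"]
            return {"t": "Header", "c": [level, attr, upper_strs(inlines)]}
        return {key: capitalize_headers(value) for key, value in node.items()}
    return node

# Hand-written sample document; the API version is illustrative.
doc = {
    "pandoc-api-version": [1, 23],
    "meta": {},
    "blocks": [
        {"t": "Header", "c": [1, ["intro", [], []], [
            {"t": "Str", "c": "Hello"}, {"t": "Space"}, {"t": "Str", "c": "world"}]]},
        {"t": "Para", "c": [{"t": "Str", "c": "Body"}]},
    ],
}

result = capitalize_headers(doc)
print(result["blocks"][0]["c"][2])
```

To run this as an actual filter, the script would read the document with `json.load(sys.stdin)`, apply `capitalize_headers`, write the result to standard output with `json.dump`, and be passed to pandoc with the `--filter` option.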

There's a repository of Lua filters at
[pandoc/lua-filters](https://github.com/pandoc/lua-filters)
on GitHub.  A number of pandoc filters, written in
Haskell, are available on
[Hackage](https://hackage.haskell.org/packages/search?terms=pandoc+filter)
and can be installed using the `stack` or `cabal` tools.
The wiki also lists [third-party
filters](https://github.com/jgm/pandoc/wiki/Pandoc-Filters).

### Example: capitalizing headers

TODO

### Example: code extractor

TODO

## Generic Divs and Spans

TODO
[Divs and Spans](https://pandoc.org/MANUAL.html#divs-and-spans): generic blocks
that can be transformed with filters

### Example: colored text


### Example: custom styles in docx

[Custom Styles in Docx](https://pandoc.org/MANUAL.html#custom-styles-in-docx)

## Raw attributes

TODO
[Generic raw attributes](https://pandoc.org/MANUAL.html#generic-raw-attribute):
to include raw snippets

## Custom writers

TODO
[Custom writers](https://pandoc.org/MANUAL.html#custom-writers)

## Custom syntax highlighting

TODO
[Custom syntax highlighting](https://pandoc.org/MANUAL.html#syntax-highlighting),
provided by the [skylighting
library](https://github.com/jgm/skylighting),
including highlighting styles