Diffstat (limited to 'MANUAL.txt')
-rw-r--r-- MANUAL.txt 356
1 files changed, 262 insertions, 94 deletions
diff --git a/MANUAL.txt b/MANUAL.txt
index c9dc9e62b..008a0e657 100644
--- a/MANUAL.txt
+++ b/MANUAL.txt
@@ -1,7 +1,7 @@
---
title: Pandoc User's Guide
author: John MacFarlane
-date: June 20, 2021
+date: November 20, 2021
---
# Synopsis
@@ -142,11 +142,11 @@ When using LaTeX, the following packages need to be available
contains images), [`hyperref`], [`xcolor`],
[`ulem`], [`geometry`] (with the `geometry` variable set),
[`setspace`] (with `linestretch`), and
-[`babel`] (with `lang`). The use of `xelatex` or `lualatex` as
+[`babel`] (with `lang`). If `CJKmainfont` is set, [`xeCJK`]
+is needed. The use of `xelatex` or `lualatex` as
the PDF engine requires [`fontspec`]. `lualatex` uses
-[`selnolig`]. `xelatex` uses [`polyglossia`] (with `lang`),
-[`xecjk`], and [`bidi`] (with the `dir` variable set). If the
-`mathspec` variable is set, `xelatex` will use [`mathspec`]
+[`selnolig`]. `xelatex` uses [`bidi`] (with the `dir` variable set).
+If the `mathspec` variable is set, `xelatex` will use [`mathspec`]
instead of [`unicode-math`]. The [`upquote`] and [`microtype`]
packages are used if available, and [`csquotes`] will be used
for [typography] if the `csquotes` variable or metadata field is
@@ -197,7 +197,7 @@ footnotes in tables).
[`weasyprint`]: https://weasyprint.org
[`wkhtmltopdf`]: https://wkhtmltopdf.org
[`xcolor`]: https://ctan.org/pkg/xcolor
-[`xecjk`]: https://ctan.org/pkg/xecjk
+[`xeCJK`]: https://ctan.org/pkg/xecjk
[`xurl`]: https://ctan.org/pkg/xurl
[`selnolig`]: https://ctan.org/pkg/selnolig
@@ -259,12 +259,14 @@ header when requesting a document from a URL:
- `odt` ([ODT])
- `opml` ([OPML])
- `org` ([Emacs Org mode])
+ - `rtf` ([Rich Text Format])
- `rst` ([reStructuredText])
- `t2t` ([txt2tags])
- `textile` ([Textile])
- `tikiwiki` ([TikiWiki markup])
- `twiki` ([TWiki markup])
- `vimwiki` ([Vimwiki])
+ - the path of a custom Lua reader, see [Custom readers and writers] below
:::
Extensions can be individually enabled or disabled by
@@ -314,6 +316,7 @@ header when requesting a document from a URL:
- `markdown_mmd` ([MultiMarkdown])
- `markdown_phpextra` ([PHP Markdown Extra])
- `markdown_strict` (original unextended [Markdown])
+ - `markua` ([Markua])
- `mediawiki` ([MediaWiki markup])
- `ms` ([roff ms])
- `muse` ([Muse]),
@@ -337,7 +340,7 @@ header when requesting a document from a URL:
- `tei` ([TEI Simple])
- `xwiki` ([XWiki markup])
- `zimwiki` ([ZimWiki markup])
- - the path of a custom Lua writer, see [Custom writers] below
+ - the path of a custom Lua writer, see [Custom readers and writers] below
:::
Note that `odt`, `docx`, `epub`, and `pdf` output will not be directed
@@ -500,6 +503,7 @@ header when requesting a document from a URL:
[CSL JSON]: https://citeproc-js.readthedocs.io/en/latest/csl-json/markup.html
[BibTeX]: https://ctan.org/pkg/bibtex
[BibLaTeX]: https://ctan.org/pkg/biblatex
+[Markua]: https://leanpub.com/markua/read
## Reader options {.options}
@@ -729,6 +733,16 @@ header when requesting a document from a URL:
document in standalone mode. If no *VAL* is specified, the
key will be given the value `true`.
+`--sandbox`
+
+: Run pandoc in a sandbox, limiting IO operations in readers
+ and writers to reading the files specified on the command line.
+ Note that this option does not limit IO operations by
+ filters or in the production of PDF documents. But it does
+ offer security against, for example, disclosure of files
+ through the use of `include` directives. Anyone using
+ pandoc on untrusted user input should use this option.
+
`-D` *FORMAT*, `--print-default-template=`*FORMAT*
: Print the system default template for an output *FORMAT*. (See `-t`
@@ -776,7 +790,6 @@ header when requesting a document from a URL:
preserve the wrapping from the source document (that is,
where there are nonsemantic newlines in the source, there
will be nonsemantic newlines in the output as well).
- Automatic wrapping does not currently work in HTML output.
In `ipynb` output, this option affects wrapping of the
contents of markdown cells.
@@ -961,7 +974,9 @@ header when requesting a document from a URL:
: Specify whether footnotes (and references, if `reference-links` is
set) are placed at the end of the current (top-level) block, the
current section, or the document. The default is
- `document`. Currently only affects the markdown writer.
+ `document`. Currently this option only affects the
+ `markdown`, `muse`, `html`, `epub`, `slidy`, `s5`, `slideous`,
+ `dzslides`, and `revealjs` writers.
`--markdown-headings=setext`|`atx`
@@ -1025,13 +1040,13 @@ header when requesting a document from a URL:
: Specifies that headings with the specified level create
slides (for `beamer`, `s5`, `slidy`, `slideous`, `dzslides`). Headings
- above this level in the hierarchy are used to divide the
- slide show into sections; headings below this level create
- subheads within a slide. Note that content that is
- not contained under slide-level headings will not appear in
- the slide show. The default is to set the slide level based
- on the contents of the document; see [Structuring the slide
- show].
+ above this level in the hierarchy are used to divide the slide show
+ into sections; headings below this level create subheads within a slide.
+ Valid values are 0-6. If a slide level of 0 is specified, slides will
+ not be split automatically on headings, and horizontal rules must be used
+ to indicate slide boundaries. If a slide level is not specified
+ explicitly, the slide level will be set automatically based on
+ the contents of the document; see [Structuring the slide show].
`--section-divs`
@@ -1164,13 +1179,22 @@ header when requesting a document from a URL:
`.pptx` or `.potx` extension) are known to work, as are most
templates derived from these.
- The specific requirement is that the template should begin with
- the following first four layouts:
+ The specific requirement is that the template should contain layouts
+ with the following names (as seen within PowerPoint):
- 1. Title Slide
- 2. Title and Content
- 3. Section Header
- 4. Two Content
+ - Title Slide
+ - Title and Content
+ - Section Header
+ - Two Content
+ - Comparison
+ - Content with Caption
+ - Blank
+
+ For each name, the first layout found with that name will be used.
+ If no layout is found with one of the names, pandoc will output a
+ warning and use the layout with that name from the default reference
+ doc instead. (How these layouts are used is described in [PowerPoint
+ layout choice](#powerpoint-layout-choice).)
All templates included with a recent version of MS PowerPoint
will fit these criteria. (You can click on `Layout` under the
@@ -1179,8 +1203,8 @@ header when requesting a document from a URL:
You can also modify the default `reference.pptx`: first run
`pandoc -o custom-reference.pptx --print-default-data-file
reference.pptx`, and then modify `custom-reference.pptx`
- in MS PowerPoint (pandoc will use the first four layout
- slides, as mentioned above).
+ in MS PowerPoint (pandoc will use the layouts with the names
+ listed above).
`--epub-cover-image=`*FILE*
@@ -1458,6 +1482,7 @@ Nonzero exit codes have the following meanings:
Code Error
----- ------------------------------------
+ 1 PandocIOError
3 PandocFailOnWarningError
4 PandocAppError
5 PandocTemplateError
@@ -1466,6 +1491,7 @@ Nonzero exit codes have the following meanings:
22 PandocUnknownWriterError
23 PandocUnsupportedExtensionError
24 PandocCiteprocError
+ 25 PandocBibliographyError
31 PandocEpubSubdirectoryError
43 PandocPDFError
44 PandocXMLError
@@ -1478,6 +1504,7 @@ Nonzero exit codes have the following meanings:
66 PandocMakePDFError
67 PandocSyntaxMapError
83 PandocFilterError
+ 84 PandocLuaError
91 PandocMacroLoop
92 PandocUTF8DecodingError
93 PandocIpynbDecodingError
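
For callers that shell out to pandoc, the exit codes above can be turned into readable messages. The following is a minimal sketch, not part of pandoc: the dictionary contains only the codes visible in this excerpt of the table (the full manual lists more), and `describe_exit` is a hypothetical helper name.

```python
# Exit codes copied from the excerpt of the table above; the complete
# table in the manual contains further entries that are not listed here.
PANDOC_EXIT_CODES = {
    1: "PandocIOError",
    3: "PandocFailOnWarningError",
    4: "PandocAppError",
    5: "PandocTemplateError",
    22: "PandocUnknownWriterError",
    23: "PandocUnsupportedExtensionError",
    24: "PandocCiteprocError",
    25: "PandocBibliographyError",
    31: "PandocEpubSubdirectoryError",
    43: "PandocPDFError",
    44: "PandocXMLError",
    66: "PandocMakePDFError",
    67: "PandocSyntaxMapError",
    83: "PandocFilterError",
    84: "PandocLuaError",
    91: "PandocMacroLoop",
    92: "PandocUTF8DecodingError",
    93: "PandocIpynbDecodingError",
}


def describe_exit(code: int) -> str:
    """Return a readable name for a pandoc exit code (hypothetical helper)."""
    if code == 0:
        return "success"
    # Codes missing from this partial table are reported as unknown.
    return PANDOC_EXIT_CODES.get(code, f"unknown error (exit code {code})")
```

A wrapper could pass the `returncode` of a finished pandoc process to `describe_exit` before logging the failure.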
@@ -1931,7 +1958,7 @@ ${ styles.html() }
```
(If a partial is not found in the directory of the
-template and the template path is given as a relative
+template and the template path is given as a relative
path, it will also be sought in the `templates`
subdirectory of the user data directory.)
@@ -2166,7 +2193,7 @@ Currently the following pipes are predefined:
and AsciiDoc metadata; repeat as for `author`, above
`subject`
-: document subject, included in ODT, PDF, docx and pptx metadata
+: document subject, included in ODT, PDF, docx, EPUB, and pptx metadata
`description`
: document description, included in ODT, docx and pptx metadata. Some
@@ -2793,7 +2820,7 @@ on the output format, and include the following:
`toc-title`
: title of table of contents (works only with EPUB,
- HTML, opendocument, odt, docx, pptx, beamer, LaTeX)
+ HTML, revealjs, opendocument, odt, docx, pptx, beamer, LaTeX)
[pandoc-templates]: https://github.com/jgm/pandoc-templates
@@ -3067,7 +3094,7 @@ starts at 1.
This extension can be enabled/disabled for the following formats:
output formats
-: `odt`, `opendocument`
+: `odt`, `opendocument`, `docx`
#### Extension: `xrefs_name` ####
@@ -3726,8 +3753,8 @@ or two spaces.
A term may have multiple definitions, and each definition may
consist of one or more block elements (paragraph, code block,
list, etc.), each indented four spaces or one tab stop. The
-body of the definition (including the first line, aside from the
-colon or tilde) should be indented four spaces. However, as with
+body of the definition (not including the first line)
+should be indented four spaces. However, as with
other Markdown lists, you can "lazily" omit indentation except
at the beginning of a paragraph or other block element:
@@ -4029,12 +4056,12 @@ legal (though ugly) pipe table:
orange|3.09
The cells of pipe tables cannot contain block elements like paragraphs
-and lists, and cannot span multiple lines. If a pipe table contains a
-row whose Markdown content is wider than the column width (see
-`--columns`), then the table will take up the full text width and
-the cell contents will wrap, with the relative cell widths determined
-by the number of dashes in the line separating the table header from
-the table body. (For example `---|-` would make the first column 3/4
+and lists, and cannot span multiple lines. If any line of the
+markdown source is longer than the column width (see `--columns`),
+then the table will take up the full text width and the cell
+contents will wrap, with the relative cell widths determined by
+the number of dashes in the line separating the table header
+from the table body. (For example `---|-` would make the first column 3/4
and the second column 1/4 of the full text width.)
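
The width rule in the parenthetical above can be illustrated with a short sketch. It assumes that only the dashes in each cell of the separator line are counted (alignment colons and outer pipes are ignored); pandoc's actual implementation may differ in such details, and `relative_widths` is a hypothetical name.

```python
from __future__ import annotations


def relative_widths(separator: str) -> list[float]:
    """Estimate relative column widths from a pipe-table separator line.

    Assumption: only '-' characters count toward a column's share.
    """
    # Drop surrounding whitespace and optional leading/trailing pipes.
    cells = separator.strip().strip("|").split("|")
    dashes = [cell.count("-") for cell in cells]
    total = sum(dashes)
    if total == 0:
        raise ValueError("separator line contains no dashes")
    return [count / total for count in dashes]
```

With this sketch, `relative_widths("---|-")` gives `[0.75, 0.25]`, matching the 3/4 and 1/4 split described in the text.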
On the other hand, if no lines are wider than column width, then
cell contents will not be wrapped, and the cells will be sized
@@ -4173,6 +4200,10 @@ A document may contain multiple metadata blocks. If two
metadata blocks attempt to set the same field, the value from
the second block will be taken.
+Each metadata block is handled internally as an independent YAML document.
+This means, for example, that any YAML anchors defined in a block cannot be
+referenced in another block.
+
When pandoc is used with `-t markdown` to create a Markdown document,
a YAML metadata block will be produced only if the `-s/--standalone`
option is used. All of the metadata will appear in a single block
@@ -4395,6 +4426,18 @@ Attributes can be attached to verbatim text, just as with
`<$>`{.haskell}
+### Underline ###
+
+To underline text, use the `underline` class:
+
+ [Underline]{.underline}
+
+Or, without the `bracketed_spans` extension (but with `native_spans`):
+
+ <span class="underline">Underline</span>
+
+This will work in all output formats that support underline.
+
### Small caps ###
To write small caps, use the `smallcaps` class:
@@ -4983,13 +5026,16 @@ See the [CSL user documentation] for more information about CSL
styles and how they affect rendering.
Unless a citation key starts with a letter, digit, or `_`,
-and contains only alphanumerics and internal punctuation
+and contains only alphanumerics and single internal punctuation
characters (`:.#$%&-+?<>~/`), it must be surrounded
by curly braces, which are not considered part of the key.
-In `@Foo_bar.baz.`, the key is `Foo_bar.baz`. The final
+In `@Foo_bar.baz.`, the key is `Foo_bar.baz` because the final
period is not *internal* punctuation, so it is not included in
the key. In `@{Foo_bar.baz.}`, the key is `Foo_bar.baz.`, including
-the final period. The curly braces are recommended if you use URLs as
+the final period.
+In `@Foo_bar--baz`, the key is `Foo_bar` because the repeated internal
+punctuation characters terminate the key.
+The curly braces are recommended if you use URLs as
keys: `[@{https://example.com/bib?name=foobar&date=2000}, p. 33]`.
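
The key-boundary rules above can be approximated with regular expressions. This is an illustrative sketch only, not pandoc's parser: it treats `\w` as "letter, digit, or `_`" and accepts a punctuation character only when a word character follows it immediately. The names `citation_key`, `BARE_KEY`, and `BRACED_KEY` are hypothetical.

```python
import re

# Punctuation characters allowed inside a bare key, as listed above.
INTERNAL_PUNCT = r":.#$%&\-+?<>~/"

# A bare key starts with a word character; a punctuation character is
# accepted only if a word character follows it, so trailing or repeated
# punctuation ends the key.
BARE_KEY = re.compile(rf"@(\w(?:\w|[{INTERNAL_PUNCT}](?=\w))*)")

# A braced key runs up to the closing brace; the braces are not part of it.
BRACED_KEY = re.compile(r"@\{([^}]*)\}")


def citation_key(text: str):
    """Return the citation key at the start of `text`, or None."""
    match = BRACED_KEY.match(text) or BARE_KEY.match(text)
    return match.group(1) if match else None
```

Applied to the examples in the text, `citation_key("@Foo_bar.baz.")` yields `Foo_bar.baz`, `citation_key("@{Foo_bar.baz.}")` yields `Foo_bar.baz.`, and `citation_key("@Foo_bar--baz")` yields `Foo_bar`.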
Citation items may optionally include a prefix, a locator, and
@@ -5098,7 +5144,8 @@ or image itself, if these differ.
#### Extension: `attributes` ####
Allows attributes to be attached to any inline or block-level
-element. The syntax for the attributes is the same as that
+element when parsing `commonmark`.
+The syntax for the attributes is the same as that
used in [`header_attributes`][Extension: `header_attributes`].
- Attributes that occur immediately after an inline
@@ -5286,7 +5333,19 @@ for regular emphasis, add extra blank space around headings.
Include source position attributes when parsing `commonmark`.
For elements that accept attributes, a `data-pos` attribute
is added; other elements are placed in a surrounding
-Div or Span elemnet with a `data-pos` attribute.
+Div or Span element with a `data-pos` attribute.
+
+#### Extension: `short_subsuperscripts` ####
+
+Parse multimarkdown style subscripts and superscripts, which start with
+a '~' or '^' character, respectively, and include the alphanumeric sequence
+that follows. For example:
+
+ x^2 = 4
+
+or
+
+ Oxygen is O~2.
## Markdown variants
@@ -5622,9 +5681,16 @@ or BibLaTeX (for `--biblatex`) format.
A few other metadata fields affect bibliography formatting:
`link-citations`
-: If true, citations will be
- hyperlinked to the corresponding bibliography entries
- (for author-date and numerical styles only).
+: If true, citations will be hyperlinked to the
+ corresponding bibliography entries (for author-date and
+ numerical styles only). Defaults to false.
+
+`link-bibliography`
+: If true, DOIs, PMCIDs, PMID, and URLs in bibliographies will
+ be rendered as hyperlinks. (If an entry contains a DOI, PMCID,
+ PMID, or URL, but none of these fields are rendered by the style,
+ then the title, or in the absence of a title the whole entry, will
+ be hyperlinked.) Defaults to true.
`lang`
: The `lang` field will affect how the style is localized,
@@ -5737,8 +5803,8 @@ By default, the *slide level* is the highest heading level in
the hierarchy that is followed immediately by content, and not another
heading, somewhere in the document. In the example above, level-1 headings
are always followed by level-2 headings, which are followed by content,
-so the slide level is 2. This default can be overridden using
-the `--slide-level` option.
+so the slide level is 2. This default can be overridden using the
+`--slide-level` option.
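
The default slide level described above can be sketched as follows. The document is modelled, as a simplifying assumption, as a flat list in which an integer stands for a heading of that level and any other value stands for content; `default_slide_level` is a hypothetical name, and the sketch returns `None` rather than guessing what pandoc does when no heading is followed by content.

```python
from __future__ import annotations


def default_slide_level(blocks: list) -> int | None:
    """Return the highest heading level (smallest number) that is
    immediately followed by content somewhere in the document."""
    candidates = [
        current
        for current, following in zip(blocks, blocks[1:])
        # A heading counts only if the next block is not another heading.
        if isinstance(current, int) and not isinstance(following, int)
    ]
    return min(candidates) if candidates else None
```

For a document shaped like `[1, 2, "text", 2, "text", 1, 2, "text"]`, level-1 headings are always followed by level-2 headings, so the sketch returns 2.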
The document is carved up into slides according to the following
rules:
@@ -5762,19 +5828,62 @@ rules:
subsequent slide with the same title (for beamer).
* A title page is constructed automatically from the document's title
- block, if present. (In the case of beamer, this can be disabled
+ block, if present. (In the case of beamer, this can be disabled
by commenting out some lines in the default template.)
These rules are designed to support many different styles of slide show. If
you don't care about structuring your slides into sections and subsections,
-you can just use level-1 headings for all each slide. (In that case, level-1
-will be the slide level.) But you can also structure the slide show into
-sections, as in the example above.
+you can either just use level-1 headings for all slides (in that case, level 1
+will be the slide level) or you can set `--slide-level=0`.
Note: in reveal.js slide shows, if slide level is 2, a two-dimensional
layout will be produced, with level-1 headings building horizontally
-and level-2 headings building vertically. It is not recommended that
-you use deeper nesting of section levels with reveal.js.
+and level-2 headings building vertically. It is not recommended that
+you use deeper nesting of section levels with reveal.js unless you set
+`--slide-level=0` (which lets reveal.js produce a one-dimensional layout
+and only interprets horizontal rules as slide boundaries).
+
+### PowerPoint layout choice
+
+When creating slides, the pptx writer chooses from a number of pre-defined
+layouts, based on the content of the slide:
+
+Title Slide
+: This layout is used for the initial slide, which is generated and
+ filled from the metadata fields `date`, `author`, and `title`, if
+ they are present.
+
+Section Header
+: This layout is used for what pandoc calls “title slides”, i.e.
+ slides which start with a header which is above the slide level in
+ the hierarchy.
+
+Two Content
+: This layout is used for two-column slides, i.e. slides containing a
+ div with class `columns` which contains at least two divs with class
+ `column`.
+
+Comparison
+: This layout is used instead of “Two Content” for any two-column
+ slides in which at least one column contains text followed by
+ non-text (e.g. an image or a table).
+
+Content with Caption
+: This layout is used for any non-two-column slides which contain text
+ followed by non-text (e.g. an image or a table).
+
+Blank
+: This layout is used for any slides which only contain blank content,
+ e.g. a slide containing only speaker notes, or a slide containing
+ only a non-breaking space.
+
+Title and Content
+: This layout is used for all slides which do not match the criteria
+ for another layout.
+
+These layouts are chosen from the default pptx reference doc included with
+pandoc, unless an alternative reference doc is specified using
+`--reference-doc`.
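
The layout choice above can be read as a decision procedure. The sketch below is one possible reading, not the pptx writer's code: the `SlideInfo` fields are hypothetical flags summarising a slide, and the order in which the conditions are tested is an assumption inferred from the descriptions.

```python
from dataclasses import dataclass


@dataclass
class SlideInfo:
    """Hypothetical summary of a slide's content."""
    is_metadata_title: bool = False        # generated from title/author/date
    starts_above_slide_level: bool = False  # header above the slide level
    is_two_column: bool = False            # `columns` div with >= 2 `column` divs
    has_text_then_non_text: bool = False   # text followed by image, table, ...
    is_blank: bool = False                 # e.g. only speaker notes


def choose_layout(slide: SlideInfo) -> str:
    """Pick a layout name; the test order is an assumption."""
    if slide.is_metadata_title:
        return "Title Slide"
    if slide.starts_above_slide_level:
        return "Section Header"
    if slide.is_two_column:
        # For two-column slides the flag refers to at least one column.
        return "Comparison" if slide.has_text_then_non_text else "Two Content"
    if slide.has_text_then_non_text:
        return "Content with Caption"
    if slide.is_blank:
        return "Blank"
    return "Title and Content"
```

A slide that matches none of the special cases, such as `SlideInfo()`, falls through to "Title and Content".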
## Incremental lists
@@ -5814,9 +5923,6 @@ option):
Both methods allow incremental and nonincremental lists to be mixed
in a single document.
-Note: Neither the `-i/--incremental` option nor any of the
-methods described here currently works for PowerPoint output.
-
## Inserting pauses
You can add "pauses" within a slide by including a paragraph containing
@@ -5952,44 +6058,65 @@ the [Beamer User's Guide] may also be used: `allowdisplaybreaks`,
`allowframebreaks`, `b`, `c`, `t`, `environment`, `label`, `plain`,
`shrink`, `standout`, `noframenumbering`.
-## Background in reveal.js and beamer
+## Background in reveal.js, beamer, and pptx
-Background images can be added to self-contained reveal.js slideshows and
-to beamer slideshows.
+Background images can be added to self-contained reveal.js slide shows,
+beamer slide shows, and pptx slide shows.
-For the same image on every slide, use the configuration
-option `background-image` either in the YAML metadata block
-or as a command-line variable. (There are no other options in
-beamer and the rest of this section concerns reveal.js slideshows.)
+### On all slides (beamer, reveal.js, pptx)
-For reveal.js, you can instead use the reveal.js-native option
-`parallaxBackgroundImage`. You can also set `parallaxBackgroundHorizontal`
-and `parallaxBackgroundVertical` the same way and must also set
-`parallaxBackgroundSize` to have your values take effect.
+With beamer and reveal.js, the configuration option `background-image` can be
+used either in the YAML metadata block or as a command-line variable to get the
+same image on every slide.
-To set an image for a particular reveal.js slide, add
-`{data-background-image="/path/to/image"}`
-to the first slide-level heading on the slide (which may even be empty).
+For pptx, you can use a [reference doc](#option--reference-doc) in which
+background images have been set on the [relevant
+layouts](#powerpoint-layout-choice).
+
+#### `parallaxBackgroundImage` (reveal.js)
+
+For reveal.js, there is also the reveal.js-native option
+`parallaxBackgroundImage`, which can be used instead of `background-image` to
+produce a parallax scrolling background. You must also set
+`parallaxBackgroundSize`, and can optionally set `parallaxBackgroundHorizontal`
+and `parallaxBackgroundVertical` to configure the scrolling behaviour. See the
+[reveal.js documentation](https://revealjs.com/backgrounds/#parallax-background)
+for more details about the meaning of these options.
In reveal.js's overview mode, the parallaxBackgroundImage will show up
only on the first slide.
-Other reveal.js background settings also work on individual slides, including
-`data-background-size`, `data-background-repeat`, `data-background-color`,
-`data-transition`, and `data-transition-speed`.
+### On individual slides (reveal.js, pptx)
+
+To set an image for a particular reveal.js or pptx slide, add
+`{background-image="/path/to/image"}` to the first slide-level heading on the
+slide (which may even be empty).
+
+As the [HTML writers pass unknown attributes
+through](#extension-link_attributes), other reveal.js background settings also
+work on individual slides, including `background-size`, `background-repeat`,
+`background-color`, `transition`, and `transition-speed`. (The `data-` prefix
+will automatically be added.)
+
+Note: `data-background-image` is also supported in pptx for consistency with
+reveal.js – if `background-image` isn’t found, `data-background-image` will be
+checked.
-To add a background image to the automatically generated title slide, use the
-`title-slide-attributes` variable in the YAML metadata block. It must contain
-a map of attribute names and values.
+### On the title slide (reveal.js, pptx)
-See the [reveal.js documentation](https://revealjs.com/backgrounds/) for more
-details.
+To add a background image to the automatically generated title slide for
+reveal.js, use the `title-slide-attributes` variable in the YAML metadata block.
+It must contain a map of attribute names and values. (Note that the `data-`
+prefix is required here, as it isn’t added automatically.)
-For example in reveal.js:
+For pptx, pass a [reference doc](#option--reference-doc) with the background
+image set on the “Title Slide” layout.
+
+### Example (reveal.js)
```
---
-title: My Slideshow
+title: My Slide Show
parallaxBackgroundImage: /path/to/my/background_image.png
title-slide-attributes:
data-background-image: /path/to/title_image.png
@@ -6000,7 +6127,7 @@ title-slide-attributes:
Slide 1 has background_image.png as its background.
-## {data-background-image="/path/to/special_image.jpg"}
+## {background-image="/path/to/special_image.jpg"}
Slide 2 has a special image for its background, even though the heading has no content.
```
@@ -6066,7 +6193,12 @@ The following fields are recognized:
language if nothing is specified.
`subject`
- ~ A string value or a list of such values.
+ ~ Either a string value, or an object with fields `text`, `authority`,
+ and `term`, or a list of such objects. Valid values for `authority`
+ are either a [reserved authority value] (currently `AAT`, `BIC`,
+ `BISAC`, `CLC`, `DDC`, `CLIL`, `EuroVoc`, `MEDTOP`, `LCSH`, `NDC`,
+ `Thema`, `UDC`, and `WGS`) or an absolute IRI identifying a custom
+ scheme. Valid values for `term` are defined by the scheme.
`description`
~ A string value.
@@ -6085,7 +6217,7 @@ The following fields are recognized:
`rights`
~ A string value.
-
+
`belongs-to-collection`
~ A string value. identifies the name of a collection to which
the EPUB Publication belongs.
@@ -6116,6 +6248,7 @@ The following fields are recognized:
- `scroll-axis`: `vertical`|`horizontal`|`default`
[MARC relators]: https://loc.gov/marc/relators/relaterm.html
+[reserved authority value]: https://idpf.github.io/epub-registries/authorities/
[`spine` element]: http://idpf.org/epub/301/spec/epub-publications.html#sec-spine-elem
## The `epub:type` attribute
@@ -6155,6 +6288,7 @@ halftitlepage frontmatter
seriespage frontmatter
foreword frontmatter
preface frontmatter
+frontispiece frontmatter
appendix backmatter
colophon backmatter
bibliography backmatter
@@ -6442,19 +6576,35 @@ With these custom styles, you can use your input document as a
reference-doc while creating docx output (see below), and maintain the
same styles in your input and output files.
-# Custom writers
+# Custom readers and writers
-Pandoc can be extended with custom writers written in [Lua]. (Pandoc
-includes a Lua interpreter, so Lua need not be installed separately.)
+Pandoc can be extended with custom readers and writers written
+in [Lua]. (Pandoc includes a Lua interpreter, so Lua need not
+be installed separately.)
-To use a custom writer, simply specify the path to the Lua script
-in place of the output format. For example:
+To use a custom reader or writer, simply specify the path to the
+Lua script in place of the input or output format. For example:
pandoc -t data/sample.lua
+ pandoc -f my_custom_markup_language.lua -t latex -s
+
+A custom reader is a Lua script that defines one function,
+Reader, which takes a string as input and returns a Pandoc
+AST. See the [Lua filters documentation] for documentation
+of the functions that are available for creating pandoc
+AST elements. For parsing, the [lpeg] parsing library
+is available by default. To see a sample custom reader:
+
+ pandoc --print-default-data-file creole.lua
+
+If you want your custom reader to have access to reader options
+(e.g. the tab stop setting), you give your Reader function a
+second `options` parameter.
-Creating a custom writer requires writing a Lua function for each
-possible element in a pandoc document. To get a documented example
-which you can modify according to your needs, do
+A custom writer is a Lua script that defines a function
+that specifies how to render each element in a Pandoc AST.
+To see a documented example which you can modify according
+to your needs:
pandoc --print-default-data-file sample.lua
@@ -6466,6 +6616,7 @@ default template with the name
subdirectory of your user data directory (see [Templates]).
[Lua]: https://www.lua.org
+[lpeg]: http://www.inf.puc-rio.br/~roberto/lpeg/
# Reproducible builds
@@ -6493,21 +6644,38 @@ application, here are some things to keep in mind:
writer could in principle do anything on your file system. Please
audit filters and custom writers very carefully before using them.
-2. If your application uses pandoc as a Haskell library (rather than
+2. Several input formats (including HTML, Org, and RST) support `include`
+ directives that allow the contents of a file to be included in the
+ output. An untrusted attacker could use these to view the contents of
+ files on the file system. (Using the `--sandbox` option can
+ protect against this threat.)
+
+3. Several output formats (including RTF, FB2, HTML with
+ `--self-contained`, EPUB, Docx, and ODT) will embed encoded
+ or raw images into the output file. An untrusted attacker
+ could exploit this to view the contents of non-image files on the
+ file system. (Using the `--sandbox` option can protect
+ against this threat, but will also prevent including images in
+ these formats.)
+
+4. If your application uses pandoc as a Haskell library (rather than
shelling out to the executable), it is possible to use it in a mode
that fully isolates pandoc from your file system, by running the
pandoc operations in the `PandocPure` monad. See the document
[Using the pandoc API](https://pandoc.org/using-the-pandoc-api.html)
for more details.
-3. Pandoc's parsers can exhibit pathological performance on some
+5. Pandoc's parsers can exhibit pathological performance on some
corner cases. It is wise to put any pandoc operations under
a timeout, to avoid DOS attacks that exploit these issues.
If you are using the pandoc executable, you can add the
command line options `+RTS -M512M -RTS` (for example) to limit
- the heap size to 512MB.
+ the heap size to 512MB. Note that the `commonmark` parser
+ (including `commonmark_x` and `gfm`) is much less vulnerable
+ to pathological performance than the `markdown` parser, so
+ it is a better choice when processing untrusted input.
-4. The HTML generated by pandoc is not guaranteed to be safe.
+6. The HTML generated by pandoc is not guaranteed to be safe.
If `raw_html` is enabled for the Markdown input, users can
inject arbitrary HTML. Even if `raw_html` is disabled,
users can include dangerous content in URLs and attributes.
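
Several of the points above (the `--sandbox` option, a timeout, the heap limit, and the `commonmark` parser) can be combined when calling the executable from another program. The following is a minimal sketch under stated assumptions: `build_pandoc_argv` and `convert_untrusted` are hypothetical helper names, the 10-second timeout is an arbitrary example value, and a `pandoc` binary that supports `--sandbox` is assumed to be on the `PATH`.

```python
import subprocess


def build_pandoc_argv(input_path: str, heap_limit: str = "512M") -> list:
    """Build a command line for converting untrusted CommonMark to HTML."""
    return [
        "pandoc",
        "--sandbox",               # restrict reader/writer IO to the named files
        "-f", "commonmark",        # less prone to pathological performance
        "-t", "html",
        "+RTS", f"-M{heap_limit}", "-RTS",  # cap the heap size
        input_path,
    ]


def convert_untrusted(input_path: str, timeout_seconds: float = 10.0) -> str:
    """Run pandoc under a timeout and return the generated HTML."""
    result = subprocess.run(
        build_pandoc_argv(input_path),
        capture_output=True,
        text=True,
        timeout=timeout_seconds,   # raises subprocess.TimeoutExpired on overrun
        check=True,                # raises CalledProcessError on nonzero exit
    )
    return result.stdout
```

The returned HTML is still not guaranteed to be safe, as the last item notes, so it would need to be passed through an HTML sanitizer before being shown to other users.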