Diffstat (limited to 'MANUAL.txt')
-rw-r--r-- | MANUAL.txt | 161 |
1 files changed, 84 insertions, 77 deletions
diff --git a/MANUAL.txt b/MANUAL.txt
index 5d60e2c19..4e33be2a0 100644
--- a/MANUAL.txt
+++ b/MANUAL.txt
@@ -11,13 +11,16 @@ Description
 ===========
 
 Pandoc is a [Haskell] library for converting from one markup format to
-another, and a command-line tool that uses this library. It can read
-[Markdown], [CommonMark], [PHP Markdown Extra], [GitHub-Flavored
-Markdown], [MultiMarkdown], and (subsets of) [Textile],
+another, and a command-line tool that uses this library.
+
+Pandoc can read [Markdown], [CommonMark], [PHP Markdown Extra],
+[GitHub-Flavored Markdown], [MultiMarkdown], and (subsets of) [Textile],
 [reStructuredText], [HTML], [LaTeX], [MediaWiki markup], [TWiki
 markup], [TikiWiki markup], [Creole 1.0], [Haddock markup], [OPML],
 [Emacs Org mode], [DocBook], [JATS], [Muse], [txt2tags], [Vimwiki],
-[EPUB], [ODT], and [Word docx]; and it can write plain text, [Markdown],
+[EPUB], [ODT], and [Word docx].
+
+Pandoc can write plain text, [Markdown],
 [CommonMark], [PHP Markdown Extra], [GitHub-Flavored Markdown],
 [MultiMarkdown], [reStructuredText], [XHTML], [HTML5], [LaTeX]
 \(including [`beamer`] slide shows\), [ConTeXt], [RTF], [OPML],
@@ -30,21 +33,20 @@ Simple], [Muse], [PowerPoint] slide shows and [Slidy], [Slideous],
 [PDF] output on systems where LaTeX, ConTeXt, `pdfroff`,
 `wkhtmltopdf`, `prince`, or `weasyprint` is installed.
 
-Pandoc's enhanced version of Markdown includes syntax for [footnotes],
-[tables], flexible [ordered lists], [definition lists], [fenced code
-blocks], [superscripts and subscripts], [strikeout], [metadata blocks],
-automatic tables of contents, embedded LaTeX [math], [citations], and
-[Markdown inside HTML block elements][Extension:
-`markdown_in_html_blocks`]. (These enhancements, described further under
-[Pandoc's Markdown], can be disabled using the `markdown_strict` input
-or output format.)
-
-In contrast to most existing tools for converting Markdown to HTML, which
-use regex substitutions, pandoc has a modular design: it consists of a
-set of readers, which parse text in a given format and produce a native
-representation of the document, and a set of writers, which convert
+Pandoc's enhanced version of Markdown includes syntax for [tables],
+[definition lists], [metadata blocks], [`Div` blocks][Extension:
+`fenced_divs`], [footnotes] and [citations], embedded
+[LaTeX][Extension: `raw_tex`] (incl. [math]), [Markdown inside HTML
+block elements][Extension: `markdown_in_html_blocks`], and much more.
+These enhancements, described further under [Pandoc's Markdown],
+can be disabled using the `markdown_strict` format.
+
+Pandoc has a modular design: it consists of a set of readers, which parse
+text in a given format and produce a native representation of the document
+(like an _abstract syntax tree_ or AST), and a set of writers, which convert
 this native representation into a target format. Thus, adding an input
-or output format requires only adding a reader or writer.
+or output format requires only adding a reader or writer. Users can also
+run custom [pandoc filters] to modify the intermediate AST.
 
 Because pandoc's intermediate representation of a document is less
 expressive than many of the formats it converts between, one should
@@ -109,45 +111,32 @@ Markdown can be expected to be lossy.
 Using `pandoc`
 --------------
 
-If no *input-file* is specified, input is read from *stdin*.
-Otherwise, the *input-files* are concatenated (with a blank
-line between each) and used as input. Output goes to *stdout* by
-default (though output to the terminal is disabled for the
-`odt`, `docx`, `epub2`, and `epub3` output formats, unless it is
-forced using `-o -`). For output to a file, use the `-o`
-option:
+If no *input-files* are specified, input is read from *stdin*.
+Output goes to *stdout* by default. For output to a file,
+use the `-o` option:
 
     pandoc -o output.html input.txt
 
-By default, pandoc produces a document fragment, not a standalone
-document with a proper header and footer. To produce a standalone
-document, use the `-s` or `--standalone` flag:
+By default, pandoc produces a document fragment. To produce a standalone
+document (e.g. a valid HTML file including `<head>` and `<body>`),
+use the `-s` or `--standalone` flag:
 
     pandoc -s -o output.html input.txt
 
 For more information on how standalone documents are produced, see
-[Templates], below.
-
-Instead of a file, an absolute URI may be given. In this case
-pandoc will fetch the content using HTTP:
-
-    pandoc -f html -t markdown http://www.fsf.org
-
-It is possible to supply a custom User-Agent string or other
-header when requesting a document from a URL:
-
-    pandoc -f html -t markdown --request-header User-Agent:"Mozilla/5.0" \
-    http://www.fsf.org
+[Templates] below.
 
 If multiple input files are given, `pandoc` will concatenate them all (with
-blank lines between them) before parsing. This feature is disabled for
- binary input formats such as `EPUB`, `odt`, and `docx`.
+blank lines between them) before parsing. (Use `--file-scope` to parse files
+individually.)
+
+Specifying formats
+------------------
 
 The format of the input and output can be specified explicitly using
 command-line options. The input format can be specified using the
-`-r/--read` or `-f/--from` options, the output format using the
-`-w/--write` or `-t/--to` options. Thus, to convert `hello.txt` from
-Markdown to LaTeX, you could type:
+`-f/--from` option, the output format using the `-t/--to` option.
+Thus, to convert `hello.txt` from Markdown to LaTeX, you could type:
 
     pandoc -f markdown -t latex hello.txt
 
@@ -155,14 +144,11 @@ To convert `hello.html` from HTML to Markdown:
 
     pandoc -f html -t markdown hello.html
 
-Supported output formats are listed below under the `-t/--to` option.
-Supported input formats are listed below under the `-f/--from` option. Note
-that the `rst`, `textile`, `latex`, and `html` readers are not complete;
-there are some constructs that they do not parse.
+Supported input and output formats are listed below under [Options].
 
 If the input or output format is not specified explicitly, `pandoc`
-will attempt to guess it from the extensions of
-the input and output filenames. Thus, for example,
+will attempt to guess it from the extensions of the filenames.
+Thus, for example,
 
     pandoc -o hello.tex hello.txt
 
@@ -171,7 +157,10 @@ is specified (so that output goes to *stdout*), or if the output file's
 extension is unknown, the output format will default to HTML.
 If no input file is specified (so that input comes from *stdin*), or
 if the input files' extensions are unknown, the input format will
-be assumed to be Markdown unless explicitly specified.
+be assumed to be Markdown.
+
+Character encoding
+------------------
 
 Pandoc uses the UTF-8 character encoding for both input and output.
 If your local character encoding is not UTF-8, you
@@ -189,30 +178,12 @@ will only be included if you use the `-s/--standalone` option.
 Creating a PDF
 --------------
 
-To produce a PDF, specify an output file with a `.pdf` extension.
-By default, pandoc will use LaTeX to create the PDF:
+To produce a PDF, specify an output file with a `.pdf` extension:
 
     pandoc test.txt -o test.pdf
 
-Production of a PDF requires that a LaTeX engine be installed (see
-`--pdf-engine`, below), and assumes that the following LaTeX packages
-are available: [`amsfonts`], [`amsmath`], [`lm`], [`unicode-math`],
-[`ifxetex`], [`ifluatex`], [`listings`] (if the
-`--listings` option is used), [`fancyvrb`], [`longtable`],
-[`booktabs`], [`graphicx`] and [`grffile`] (if the document
-contains images), [`hyperref`], [`xcolor`] (with `colorlinks`), [`ulem`], [`geometry`] (with the
-`geometry` variable set), [`setspace`] (with `linestretch`), and
-[`babel`] (with `lang`). The use of `xelatex` or `lualatex` as
-the LaTeX engine requires [`fontspec`]. `xelatex` uses
-[`polyglossia`] (with `lang`), [`xecjk`], and [`bidi`] (with the
-`dir` variable set). If the `mathspec` variable is set,
-`xelatex` will use [`mathspec`] instead of [`unicode-math`].
-The [`upquote`] and [`microtype`] packages are used if
-available, and [`csquotes`] will be used for [typography]
-if added to the template or included in any header file. The
-[`natbib`], [`biblatex`], [`bibtex`], and [`biber`] packages can
-optionally be used for [citation rendering]. These are included
-with all recent versions of [TeX Live].
+By default, pandoc will use LaTeX to create the PDF, which requires
+that a LaTeX engine be installed (see `--pdf-engine` below).
 
 Alternatively, pandoc can use [ConTeXt], `pdfroff`, or any of the
 following HTML/CSS-to-PDF-engines, to create a PDF: [`wkhtmltopdf`],
@@ -228,6 +199,29 @@ If `wkhtmltopdf` is used, then the variables `margin-left`,
 `margin-right`, `margin-top`, `margin-bottom`, and `papersize`
 will affect the output.
 
+To debug the PDF creation, it can be useful to look at the intermediate
+representation: instead of `-o test.pdf`, use for example `-s -o test.tex`
+to output the generated LaTeX. You can then test it with `pdflatex test.tex`.
+
+When using LaTeX, the following packages need to be available
+(they are included with all recent versions of [TeX Live]):
+[`amsfonts`], [`amsmath`], [`lm`], [`unicode-math`],
+[`ifxetex`], [`ifluatex`], [`listings`] (if the
+`--listings` option is used), [`fancyvrb`], [`longtable`],
+[`booktabs`], [`graphicx`] and [`grffile`] (if the document
+contains images), [`hyperref`], [`xcolor`] (with `colorlinks`), [`ulem`], [`geometry`] (with the
+`geometry` variable set), [`setspace`] (with `linestretch`), and
+[`babel`] (with `lang`). The use of `xelatex` or `lualatex` as
+the LaTeX engine requires [`fontspec`]. `xelatex` uses
+[`polyglossia`] (with `lang`), [`xecjk`], and [`bidi`] (with the
+`dir` variable set). If the `mathspec` variable is set,
+`xelatex` will use [`mathspec`] instead of [`unicode-math`].
+The [`upquote`] and [`microtype`] packages are used if
+available, and [`csquotes`] will be used for [typography]
+if added to the template or included in any header file. The
+[`natbib`], [`biblatex`], [`bibtex`], and [`biber`] packages can
+optionally be used for [citation rendering].
+
 [`amsfonts`]: https://ctan.org/pkg/amsfonts
 [`amsmath`]: https://ctan.org/pkg/amsmath
 [`lm`]: https://ctan.org/pkg/lm
@@ -262,6 +256,20 @@ will affect the output.
 [`weasyprint`]: http://weasyprint.org
 [`prince`]: https://www.princexml.com/
 
+Reading from the Web
+--------------------
+
+Instead of an input file, an absolute URI may be given. In this case
+pandoc will fetch the content using HTTP:
+
+    pandoc -f html -t markdown http://www.fsf.org
+
+It is possible to supply a custom User-Agent string or other
+header when requesting a document from a URL:
+
+    pandoc -f html -t markdown --request-header User-Agent:"Mozilla/5.0" \
+    http://www.fsf.org
+
 Options
 =======
 
@@ -318,9 +326,8 @@ General options
     below). (`markdown_github` provides deprecated and less accurate support
     for Github-Flavored Markdown; please use `gfm` instead, unless you use
     extensions that do not work with `gfm`.) Note that
-    `odt`, `epub`, and `epub3` output will not be directed to
-    *stdout*; an output filename must be specified using the
-    `-o/--output` option. Extensions can be individually enabled or
+    `odt`, `docx`, and `epub` output will not be directed to *stdout*
+    unless forced with `-o -`. Extensions can be individually enabled or
     disabled by appending `+EXTENSION` or `-EXTENSION` to the format
     name. See [Extensions] below, for a list of extensions and their
     names. See `--list-output-formats` and `--list-extensions`, below.
@@ -389,7 +396,7 @@ General options
 
 `--list-extensions`[`=`*FORMAT*]
 
-:   List supported Markdown extensions, one per line, preceded
+:   List supported extensions, one per line, preceded
     by a `+` or `-` indicating whether it is enabled by default
     in *FORMAT*. If *FORMAT* is not specified, defaults for
     pandoc's Markdown are given.