66 files changed, 2797 insertions, 607 deletions
diff --git a/AUTHORS.md b/AUTHORS.md index d56c2872e..16c2a3e8a 100644 --- a/AUTHORS.md +++ b/AUTHORS.md @@ -63,6 +63,7 @@ - Greg Rundlett - Grégory Bataille - Gwern Branwen +- Hamish Mackenzie - Hans-Peter Deifel - Henrik Tramberend - Henry de Valence diff --git a/MANUAL.txt b/MANUAL.txt index 78bd057ed..05018be93 100644 --- a/MANUAL.txt +++ b/MANUAL.txt @@ -1,6 +1,6 @@ % Pandoc User's Guide % John MacFarlane -% December 8, 2017 +% December 27, 2017 Synopsis ======== @@ -284,16 +284,9 @@ General options (`markdown_github` provides deprecated and less accurate support for Github-Flavored Markdown; please use `gfm` instead, unless you need to use extensions other than `smart`.) - If `+lhs` is appended to `markdown`, `rst`, `latex`, or - `html`, the input will be treated as literate Haskell source: see - [Literate Haskell support], below. Markdown - syntax extensions can be individually enabled or disabled by - appending `+EXTENSION` or `-EXTENSION` to the format name. So, for - example, `markdown_strict+footnotes+definition_lists` is strict - Markdown with footnotes and definition lists enabled, and - `markdown-pipe_tables+hard_line_breaks` is pandoc's Markdown - without pipe tables and with hard line breaks. See [Pandoc's - Markdown], below, for a list of extensions and + Extensions can be individually enabled or disabled by + appending `+EXTENSION` or `-EXTENSION` to the format name. + See [Extensions] below, for a list of extensions and their names. See `--list-input-formats` and `--list-extensions`, below. @@ -327,13 +320,10 @@ General options unless you use extensions that do not work with `gfm`.) Note that `odt`, `epub`, and `epub3` output will not be directed to *stdout*; an output filename must be specified using the - `-o/--output` option. If `+lhs` is appended to `markdown`, `rst`, - `latex`, `beamer`, `html4`, or `html5`, the output will be - rendered as literate Haskell source: see [Literate Haskell - support], below. 
Markdown syntax extensions can be individually - enabled or disabled by appending `+EXTENSION` or `-EXTENSION` to - the format name, as described above under `-f`. See - `--list-output-formats` and `--list-extensions`, below. + `-o/--output` option. Extensions can be individually enabled or + disabled by appending `+EXTENSION` or `-EXTENSION` to the format + name. See [Extensions] below, for a list of extensions and their + names. See `--list-output-formats` and `--list-extensions`, below. `-o` *FILE*, `--output=`*FILE* @@ -397,11 +387,11 @@ General options : List supported output formats, one per line. -`--list-extensions` +`--list-extensions`[`=`*FORMAT*] : List supported Markdown extensions, one per line, followed by a `+` or `-` indicating whether it is enabled by default - in pandoc's Markdown. + in *FORMAT* (defaulting to pandoc's Markdown). `--list-highlight-languages` @@ -1106,7 +1096,7 @@ of the following options. The *URL* should point to the `MathJax.js` load script. If a *URL* is not provided, a link to the Cloudflare CDN will be inserted. - + `--mathml` : Convert TeX math to [MathML] (in `epub3`, `docbook4`, `docbook5`, `jats`, @@ -1698,6 +1688,269 @@ will be treated as a comment and ignored. [pandoc-templates]: https://github.com/jgm/pandoc-templates +Extensions +========== + +The behavior of some of the readers and writers can be adjusted by +enabling or disabling various extensions. + +An extension can be enabled by adding `+EXTENSION` +to the format name and disabled by adding `-EXTENSION`. For example, +`--from markdown_strict+footnotes` is strict Markdown with footnotes +enabled, while `--from markdown-footnotes-pipe_tables` is pandoc's +Markdown without footnotes or pipe tables. + +The markdown reader and writer make by far the most use of extensions. +Extensions only used by them are therefore covered in the +section [Pandoc's Markdown] below (See [Markdown variants] for +`commonmark` and `gfm`.) 
In the following, extensions that also work +for other formats are covered. + +Typography +---------- + +#### Extension: `smart` #### + +Interpret straight quotes as curly quotes, `---` as em-dashes, +`--` as en-dashes, and `...` as ellipses. Nonbreaking spaces are +inserted after certain abbreviations, such as "Mr." + +This extension can be enabled/disabled for the following formats: + +input formats +: `markdown`, `commonmark`, `latex`, `mediawiki`, `org`, `rst`, `twiki` + +output formats +: `markdown`, `latex`, `context`, `rst` + +enabled by default in +: `markdown`, `latex`, `context` (both input and output) + +Note: If you are *writing* Markdown, then the `smart` extension +has the reverse effect: what would have been curly quotes comes +out straight. + +In LaTeX, `smart` means to use the standard TeX ligatures +for quotation marks (` `` ` and ` '' ` for double quotes, +`` ` `` and `` ' `` for single quotes) and dashes (`--` for +en-dash and `---` for em-dash). If `smart` is disabled, +then in reading LaTeX pandoc will parse these characters +literally. In writing LaTeX, enabling `smart` tells pandoc +to use the ligatures when possible; if `smart` is disabled +pandoc will use unicode quotation mark and dash characters. + +Headers and sections +-------------------- + +#### Extension: `auto_identifiers` #### + +A header without an explicitly specified identifier will be +automatically assigned a unique identifier based on the header text. + +This extension can be enabled/disabled for the following formats: + +input formats +: `markdown`, `latex`, `rst`, `mediawiki`, `textile` + +output formats +: `markdown`, `muse` + +enabled by default in +: `markdown`, `muse` + +The algorithm used to derive the identifier from the header text is: + + - Remove all formatting, links, etc. + - Remove all footnotes. + - Remove all punctuation, except underscores, hyphens, and periods. + - Replace all spaces and newlines with hyphens. 
+ - Convert all alphabetic characters to lowercase. + - Remove everything up to the first letter (identifiers may + not begin with a number or punctuation mark). + - If nothing is left after this, use the identifier `section`. + +Thus, for example, + + Header Identifier + ------------------------------- ---------------------------- + `Header identifiers in HTML` `header-identifiers-in-html` + `*Dogs*?--in *my* house?` `dogs--in-my-house` + `[HTML], [S5], or [RTF]?` `html-s5-or-rtf` + `3. Applications` `applications` + `33` `section` + +These rules should, in most cases, allow one to determine the identifier +from the header text. The exception is when several headers have the +same text; in this case, the first will get an identifier as described +above; the second will get the same identifier with `-1` appended; the +third with `-2`; and so on. + +These identifiers are used to provide link targets in the table of +contents generated by the `--toc|--table-of-contents` option. They +also make it easy to provide links from one section of a document to +another. A link to this section, for example, might look like this: + + See the section on + [header identifiers](#header-identifiers-in-html-latex-and-context). + +Note, however, that this method of providing links to sections works +only in HTML, LaTeX, and ConTeXt formats. + +If the `--section-divs` option is specified, then each section will +be wrapped in a `div` (or a `section`, if `html5` was specified), +and the identifier will be attached to the enclosing `<div>` +(or `<section>`) tag rather than the header itself. This allows entire +sections to be manipulated using JavaScript or treated differently in +CSS. + +#### Extension: `ascii_identifiers` #### + +Causes the identifiers produced by `auto_identifiers` to be pure ASCII. +Accents are stripped off of accented Latin letters, and non-Latin +letters are omitted. 
+ +Math Input +---------- + +The extensions [`tex_math_dollars`](#extension-tex_math_dollars), +[`tex_math_single_backslash`](#extension-tex_math_single_backslash), and +[`tex_math_double_backslash`](#extension-tex_math_double_backslash) +are described in the section about Pandoc's Markdown. + +However, they can also be used with HTML input. This is handy for +reading web pages formatted using MathJax, for example. + +Raw HTML/TeX +------------ + +The following extensions (especially how they affect Markdown +input/output) are also described in more detail in their respective +sections of [Pandoc's Markdown]. + +#### [Extension: `raw_html`] {#raw_html} + +When converting from HTML, parse elements to raw HTML which are not +representable in pandoc's AST. +By default, this is disabled for HTML input. + +#### [Extension: `raw_tex`] {#raw_tex} + +Allows raw LaTeX, TeX, and ConTeXt to be included in a document. + +This extension can be enabled/disabled for the following formats +(in addition to `markdown`): + +input formats +: `latex`, `org`, `textile` + +output formats +: `textile` + +#### [Extension: `native_divs`] {#native_divs} + +This extension is enabled by default for HTML input. This means that +`div`s are parsed to pandoc native elements. (Alternatively, you +can parse them to raw HTML using `-f html-native_divs+raw_html`.) + +When converting HTML to Markdown, for example, you may want to drop all +`div`s and `span`s: + + pandoc -f html-native_divs-native_spans -t markdown + +#### [Extension: `native_spans`] {#native_spans} + +Analogous to `native_divs` above. + + +Literate Haskell support +------------------------ + +#### Extension: `literate_haskell` #### + +Treat the document as literate Haskell source. 
+ +This extension can be enabled/disabled for the following formats: + +input formats +: `markdown`, `rst`, `latex` + +output formats +: `markdown`, `rst`, `latex`, `html` + +If you append `+lhs` (or `+literate_haskell`) to one of the formats +above, pandoc will treat the document as literate Haskell source. +This means that + + - In Markdown input, "bird track" sections will be parsed as Haskell + code rather than block quotations. Text between `\begin{code}` + and `\end{code}` will also be treated as Haskell code. For + ATX-style headers the character '=' will be used instead of '#'. + + - In Markdown output, code blocks with classes `haskell` and `literate` + will be rendered using bird tracks, and block quotations will be + indented one space, so they will not be treated as Haskell code. + In addition, headers will be rendered setext-style (with underlines) + rather than ATX-style (with '#' characters). (This is because ghc + treats '#' characters in column 1 as introducing line numbers.) + + - In restructured text input, "bird track" sections will be parsed + as Haskell code. + + - In restructured text output, code blocks with class `haskell` will + be rendered using bird tracks. + + - In LaTeX input, text in `code` environments will be parsed as + Haskell code. + + - In LaTeX output, code blocks with class `haskell` will be rendered + inside `code` environments. + + - In HTML output, code blocks with class `haskell` will be rendered + with class `literatehaskell` and bird tracks. + +Examples: + + pandoc -f markdown+lhs -t html + +reads literate Haskell source formatted with Markdown conventions and writes +ordinary HTML (without bird tracks). + + pandoc -f markdown+lhs -t html+lhs + +writes HTML with the Haskell code in bird tracks, so it can be copied +and pasted as literate Haskell source. + +Note that GHC expects the bird tracks in the first column, so indentend literate +code blocks (e.g. 
inside an itemized environment) will not be picked up by the +Haskell compiler. + +Other extensions +---------------- + +#### Extension: `empty_paragraphs` #### + +Allows empty paragraphs. By default empty paragraphs are +omitted. + +This extension can be enabled/disabled for the following formats: + +input formats +: `docx`, `html` + +output formats +: `markdown`, `docx`, `odt`, `opendocument`, `html` + +#### Extension: `amuse` #### + +In the `muse` input format, this enables Text::Amuse +extensions to Emacs Muse markup. + +#### Extension: `citations` {#org-citations} + +Some aspects of [Pandoc's Markdown citation syntax](#citations) are also accepted +in `org` input. + + Pandoc's Markdown ================= @@ -1705,11 +1958,9 @@ Pandoc understands an extended and slightly revised version of John Gruber's [Markdown] syntax. This document explains the syntax, noting differences from standard Markdown. Except where noted, these differences can be suppressed by using the `markdown_strict` format instead -of `markdown`. An extensions can be enabled by adding `+EXTENSION` -to the format name and disabled by adding `-EXTENSION`. For example, -`markdown_strict+footnotes` is strict Markdown with footnotes -enabled, while `markdown-footnotes-pipe_tables` is pandoc's -Markdown without footnotes or pipe tables. +of `markdown`. Extensions can be enabled or disabled to specify the +behavior more granularly. They are described in the following. See also +[Extensions] above, for extensions that work also on other formats. Philosophy ---------- @@ -1801,6 +2052,8 @@ pandoc does require the space. ### Header identifiers ### +See also the [`auto_identifiers` extension](#extension-auto_identifiers) above. 
+ #### Extension: `header_attributes` #### Headers can be assigned attributes using this syntax at the end @@ -1837,55 +2090,6 @@ is just the same as # My header {.unnumbered} -#### Extension: `auto_identifiers` #### - -A header without an explicitly specified identifier will be -automatically assigned a unique identifier based on the header text. -To derive the identifier from the header text, - - - Remove all formatting, links, etc. - - Remove all footnotes. - - Remove all punctuation, except underscores, hyphens, and periods. - - Replace all spaces and newlines with hyphens. - - Convert all alphabetic characters to lowercase. - - Remove everything up to the first letter (identifiers may - not begin with a number or punctuation mark). - - If nothing is left after this, use the identifier `section`. - -Thus, for example, - - Header Identifier - ------------------------------- ---------------------------- - `Header identifiers in HTML` `header-identifiers-in-html` - `*Dogs*?--in *my* house?` `dogs--in-my-house` - `[HTML], [S5], or [RTF]?` `html-s5-or-rtf` - `3. Applications` `applications` - `33` `section` - -These rules should, in most cases, allow one to determine the identifier -from the header text. The exception is when several headers have the -same text; in this case, the first will get an identifier as described -above; the second will get the same identifier with `-1` appended; the -third with `-2`; and so on. - -These identifiers are used to provide link targets in the table of -contents generated by the `--toc|--table-of-contents` option. They -also make it easy to provide links from one section of a document to -another. A link to this section, for example, might look like this: - - See the section on - [header identifiers](#header-identifiers-in-html-latex-and-context). - -Note, however, that this method of providing links to sections works -only in HTML, LaTeX, and ConTeXt formats. 
- -If the `--section-divs` option is specified, then each section will -be wrapped in a `div` (or a `section`, if `html5` was specified), -and the identifier will be attached to the enclosing `<div>` -(or `<section>`) tag rather than the header itself. This allows entire -sections to be manipulated using JavaScript or treated differently in -CSS. - #### Extension: `implicit_header_references` #### Pandoc behaves as if reference links have been defined for each header. @@ -3028,8 +3232,6 @@ HTML, Slidy, DZSlides, S5, EPUB command-line options selected. Therefore see [Math rendering in HTML] above. -This extension can be used with both `markdown` and `html` input. - [interpreted text role `:math:`]: http://docutils.sourceforge.net/docs/ref/rst/roles.html#math Raw HTML @@ -3457,33 +3659,6 @@ they cannot contain multiple paragraphs). The syntax is as follows: Inline and regular footnotes may be mixed freely. -Typography ----------- - -#### Extension: `smart` #### - -Interpret straight quotes as curly quotes, `---` as em-dashes, -`--` as en-dashes, and `...` as ellipses. Nonbreaking spaces are -inserted after certain abbreviations, such as "Mr." This -option currently affects the input formats `markdown`, -`commonmark`, `latex`, `mediawiki`, `org`, `rst`, and `twiki`, -and the output formats `markdown`, `latex`, and `context`. -It is enabled by default for `markdown`, `latex`, and `context` -(in both input and output). - -Note: If you are *writing* Markdown, then the `smart` extension -has the reverse effect: what would have been curly quotes comes -out straight. - -In LaTeX, `smart` means to use the standard TeX ligatures -for quotation marks (` `` ` and ` '' ` for double quotes, -`` ` `` and `` ' `` for single quotes) and dashes (`--` for -en-dash and `---` for em-dash). If `smart` is disabled, -then in reading LaTeX pandoc will parse these characters -literally. 
In writing LaTeX, enabling `smart` tells pandoc -to use the ligatures when possible; if `smart` is disabled -pandoc will use unicode quotation mark and dash characters. - Citations --------- @@ -3746,8 +3921,6 @@ TeX math, and anything between `\[` and `\]` to be interpreted as display TeX math. Note: a drawback of this extension is that it precludes escaping `(` and `[`. -This extension can be used with both `markdown` and `html` input. - #### Extension: `tex_math_double_backslash` #### Causes anything between `\\(` and `\\)` to be interpreted as inline @@ -3790,12 +3963,6 @@ simply skipped (as opposed to being parsed as paragraphs). Makes all absolute URIs into links, even when not surrounded by pointy braces `<...>`. -#### Extension: `ascii_identifiers` #### - -Causes the identifiers produced by `auto_identifiers` to be pure ASCII. -Accents are stripped off of accented Latin letters, and non-Latin -letters are omitted. - #### Extension: `mmd_link_attributes` #### Parses multimarkdown style key-value attributes on link @@ -3839,12 +4006,6 @@ in several respects: we must either disallow lazy wrapping or require a blank line between list items. -#### Extension: `empty_paragraphs` #### - -Allows empty paragraphs. By default empty paragraphs are -omitted. This affects the `docx` reader and writer, the -`opendocument` and `odt` writer, and all HTML-based readers and writers. - Markdown variants ----------------- @@ -3878,34 +4039,21 @@ variants are supported: : `raw_html`, `shortcut_reference_links`, `spaced_reference_links`. -We also support `gfm` (GitHub-Flavored Markdown) as a set of -extensions on `commonmark`: +We also support `commonmark` and `gfm` (GitHub-Flavored Markdown, +which is implemented as a set of extensions on `commonmark`). + +Note, however, that `commonmark` and `gfm` have limited support +for extensions. Only those listed below (and `smart` and +`raw_tex`) will work. The extensions can, however, all be +individually disabled. 
+Also, `raw_tex` only affects `gfm` output, not input. +`gfm` (GitHub-Flavored Markdown) : `pipe_tables`, `raw_html`, `fenced_code_blocks`, `auto_identifiers`, `ascii_identifiers`, `backtick_code_blocks`, `autolink_bare_uris`, `intraword_underscores`, `strikeout`, `hard_line_breaks`, `emoji`, `shortcut_reference_links`, `angle_brackets_escapable`. - These can all be individually disabled. Note, however, that - `commonmark` and `gfm` have limited support for extensions: - extensions other than those listed above (and `smart` and - `raw_tex`) will have no effect on `commonmark` or `gfm`. - And `raw_tex` only affects `gfm` output, not input. - -Extensions with formats other than Markdown -------------------------------------------- - -Some of the extensions discussed above can be used with formats -other than Markdown: - -* `auto_identifiers` can be used with `latex`, `rst`, `mediawiki`, - and `textile` input (and is used by default). - -* `tex_math_dollars`, `tex_math_single_backslash`, and - `tex_math_double_backslash` can be used with `html` input. - (This is handy for reading web pages formatted using MathJax, - for example.) - Producing slide shows with pandoc ================================= @@ -4257,57 +4405,6 @@ with the `src` attribute. For example: </source> </audio> -Literate Haskell support -======================== - -If you append `+lhs` (or `+literate_haskell`) to an appropriate input or output -format (`markdown`, `markdown_strict`, `rst`, or `latex` for input or output; -`beamer`, `html4` or `html5` for output only), pandoc will treat the document as -literate Haskell source. This means that - - - In Markdown input, "bird track" sections will be parsed as Haskell - code rather than block quotations. Text between `\begin{code}` - and `\end{code}` will also be treated as Haskell code. For - ATX-style headers the character '=' will be used instead of '#'. 
- - - In Markdown output, code blocks with classes `haskell` and `literate` - will be rendered using bird tracks, and block quotations will be - indented one space, so they will not be treated as Haskell code. - In addition, headers will be rendered setext-style (with underlines) - rather than ATX-style (with '#' characters). (This is because ghc - treats '#' characters in column 1 as introducing line numbers.) - - - In restructured text input, "bird track" sections will be parsed - as Haskell code. - - - In restructured text output, code blocks with class `haskell` will - be rendered using bird tracks. - - - In LaTeX input, text in `code` environments will be parsed as - Haskell code. - - - In LaTeX output, code blocks with class `haskell` will be rendered - inside `code` environments. - - - In HTML output, code blocks with class `haskell` will be rendered - with class `literatehaskell` and bird tracks. - -Examples: - - pandoc -f markdown+lhs -t html - -reads literate Haskell source formatted with Markdown conventions and writes -ordinary HTML (without bird tracks). - - pandoc -f markdown+lhs -t html+lhs - -writes HTML with the Haskell code in bird tracks, so it can be copied -and pasted as literate Haskell source. - -Note that GHC expects the bird tracks in the first column, so indentend literate -code blocks (e.g. inside an itemized environment) will not be picked up by the -Haskell compiler. 
- Syntax highlighting =================== @@ -85,7 +85,15 @@ download_stats: curl https://api.github.com/repos/jgm/pandoc/releases | \ jq -r '.[] | .assets | .[] | "\(.download_count)\t\(.name)"' +pandoc-templates: + rm ../pandoc-templates/default.* ; \ + cp data/templates/default.* ../pandoc-templates/ ; \ + pushd ../pandoc-templates/ && \ + git add default.* && \ + git commit -m "Updated templates for pandoc $(version)" && \ + popd + clean: stack clean -.PHONY: deps quick full haddock install clean test bench changes_github macospkg dist prof download_stats reformat lint weigh doc/lua-filters.md packages +.PHONY: deps quick full haddock install clean test bench changes_github macospkg dist prof download_stats reformat lint weigh doc/lua-filters.md packages pandoc-templates diff --git a/RELEASE-CHECKLIST b/RELEASE-CHECKLIST index f3e42a55e..bc0a85f41 100644 --- a/RELEASE-CHECKLIST +++ b/RELEASE-CHECKLIST @@ -7,11 +7,10 @@ _ make man/pandoc.1 and commit if needed _ Tag release in git -_ Push templates: - git subtree push --prefix=data/templates git@github.com:jgm/pandoc-templates.git master +_ make pandoc-templates cd ../pandoc-templates - git pull git tag REL + git push git push --tags _ Generate Windows package (make winpkg) @@ -1,3 +1,234 @@ +pandoc (2.0.6) + + * Added `jats` as an input format. + + + Add Text.Pandoc.Readers.JATS, exporting `readJATS` (API + change) (Hamish Mackenzie). + + Improved citation handling in JATS reader. JATS citations + are now converted to pandoc citations, and JATS ref-lists + are converted into a `references` field in metadata, suitable + for use with pandoc-citeproc. Thus a JATS article with embedded + bibliographic information can be processed with pandoc and + pandoc-citeproc to produce a formatted bibliography. + + * Allow `--list-extensions` to take an optional FORMAT argument. + This lists the extensions set by default for the selected FORMAT. + + * Markdown reader: + + + Preserve original whitespace between blocks. 
+ + Recognize `\placeformula` as context. + + Be pickier about table captions. A caption starts with a `:` which + can't be followed by punctuation. Otherwise we can falsely interpret + the start of a fenced div, or even a table header line like + `:--:|:--:`, as a caption. + + Always use four space rule for example lists. It would be awkward + to indent example list contents to the first non-space character after + the label, since example list labels are often long. Thanks to + Bernhard Fisseni for the suggestion. + + Improve raw tex parsing. Note that the Markdown reader is also + affected by the `latex_macros` extension changes described below + under the LaTeX reader. + + * LaTeX reader: + + + `latex_macros` extension changes (#4179). Don't pass through macro + definitions themselves when `latex_macros` is set. The macros + have already been applied. If `latex_macros` is enabled, then + `rawLaTeXBlock` in Text.Pandoc.Readers.LaTeX will succeed in parsing + a macro definition, and will update pandoc's internal macro map + accordingly, but the empty string will be returned. + + Export `tokenize`, `untokenize` (API change). + + Use `applyMacros` in `rawLaTeXBlock`, `rawLaTeXInline`. + + Refactored `inlineCommand`. + + Fix bug in tokenizer. Material following `^^` was + dropped if it wasn't a character escape. This only affected + invalid LaTeX, so we didn't see it in the wild, but it appeared + in a QuickCheck test failure. + + Fix regression in LateX tokenization (#4159). This mainly affects the + Markdown reader when parsing raw LaTeX with escaped spaces. + + Add tests of LaTeX tokenizer. + + Support `\foreignlanguage` from babel. + + * Muse reader (Alexander Krotov): + + + Parse anchors immediately after headings as IDs. + + Require that note references does not start with 0. + + Parse empty comments correctly. + + * Org reader (Albert Krewinkel): + + + Fix asterisks-related parsing error (#4180). 
+ + * OPML reader: + + + Enable raw HTML and other extensions by default for notes + (#4164). This fixes a regression in 2.0. Note that extensions can + now be individually disabled, e.g. `-f opml-smart-raw_html`. + + * RST reader: + + + Allow empty list items (#4193). + + More accurate parsing of references (#4156). Previously we erroneously + included the enclosing backticks in a reference ID (#4156). This + change also disables interpretation of syntax inside references, as + in docutils. So, there is no emphasis in `` `my *link*`_ ``. + + * Docx reader: + + + Continue lists after interruption (#4025, Jesse Rosenthal). + Docx expects that lists will continue where they left off after an + interruption and introduces a new id if a list is starting again. So + we keep track of the state of lists and use them to define a "start" + attribute, if necessary. + + Add tests for structured document tags unwrapping (Jesse Rosenthal). + + Preprocess Document body to unwrap `w:sdt` elements (Jesse Rosenthal, + #4190). + + * Plain writer: + + + Don't linkify table of contents. + + * RST writer: + + + Fix anchors for headers (#4188). We were missing an `_`. + + * PowerPoint writer (Jesse Rosenthal): + + + Treat lists inside BlockQuotes as lists. We don't yet produce + incremental lists in PowerPoint, but we should at least treat lists + inside BlockQuotes as lists, for compatibility with other slide formats. + + Add ability to force size. This replaces the more specific + `blockQuote runProp`, which only affected the size of blockquotes. We + can use this for notes, etc. + + Implement notes. This currently prints all notes on a final slide. + Note that at the moment, there is a danger of text overflowing the + note slide, since there is no logic for adding further slides. + + Implement basic definition list functionality to PowerPoint writer. + + Don't look for default template file for Powerpoint (#4181). + + Add pptx to isTextFormat list. 
This is used to check standalone + and not writing to the terminal. + + * Docx writer: + + + Ensure that `distArchive` is the one that comes with pandoc + (#4182). Previously a `reference.docx` in `~/.pandoc` (or the user data + dir) would be used instead, and this could cause problems because a + user-modified docx sometimes lacks vital sections that we count + on the `distArchive` to supply. + + * Org writer: + + + Do not wrap "-" to avoid accidental bullet lists (Alexander Krotov). + + Don't allow fn refs to wrap to beginning of line (#4171, with help from + Alexander Krotov). Otherwise they can be interpreted as footnote + definitions. + + * Muse writer (Alexander Krotov): + + + Don't wrap note references to the next line (#4172). + + * HTML writer: + + + Use br elements in line blocks instead of relying on CSS + (#4162). HTML-based templates have had the custom CSS for + `div.line-block` removed. Those maintaining custom templates will want + to remove this too. We still enclose line blocks in a div with class + `line-block`. + + * LaTeX writer: + + + Use `\renewcommand` for `\textlatin` with babel (#4161). + This avoids a clash with a deprecated `\textlatin` command defined + in Babel. + + Allow fragile=singleslide attribute in beamer slides (#4169). + + * JATS writer (Hamish Mackenzie): + + + Support writing `<fig>` and `<table-wrap>` elements + with `<title>` and `<caption>` inside them by using Divs with class set + to one of `fig`, `table-wrap` or `caption` (Hamish Mackenzie). The + title is included as a Heading so the constraint on where Heading can + occur is also relaxed. + + Leave out empty alt attributes on links. + + Deduplicate image mime type code. + + Make `<p>` optional in `<td>` and `<th>` (#4178). + + Self closing tags for empty xref (#4187). + + Improve support for code language. + + * Custom writer: + + + Use init file to setup Lua interpreter (Albert Krewinkel). 
+ The same init file (`data/init`) that is used to setup the Lua + interpreter for Lua filters is also used to setup the interpreter of + custom writers.lua. + + Define instances for newtype wrapper (Albert Krewinkel). The custom + writer used its own `ToLuaStack` instance definitions, which made + it difficult to share code with Lua filters, as this could result + in conflicting instances. A `Stringify` wrapper is introduced to + avoid this problem. + + Added tests for custom writer. + + Fixed definition lists and tables in `data/sample.lua`. + + * Fixed regression: when target is PDF, writer extensions were being + ignored. So, for example, `pandoc -t latex-smart -o file.pdf` + did not work properly. + + * Lua modules (Albert Krewinkel): + + + Add `pandoc.utils` module, to hold utility functions. + + Create a Haskell module Text.Pandoc.Lua.Module.Pandoc to + define the `pandoc` lua module. + + Make a Haskell module for each Lua module. Move definitions for the + `pandoc.mediabag` modules to a separate Haskell module. + + Move `sha1` from the main `pandoc` module to `pandoc.utils`. + + Add function `pandoc.utils.hierarchicalize` (convert list of + Pandoc blocks into (hierarchical) list of Elements). + + Add function `pandoc.utils.normalize_date` (parses a date and + converts it (if possible) to "YYYY-MM-DD" format). + + Add function `pandoc.utils.to_roman_numeral` (allows conversion + of numbers below 4000 into roman numerals). + + Add function `pandoc.utils.stringify` (converts any AST element + to a string with formatting removed). + + `data/init.lua`: load `pandoc.utils` by default + + Turn pipe, read into full Haskell functions. The `pipe` and `read` + utility functions are converted from hybrid lua/haskell functions + into full Haskell functions. This avoids the need for intermediate + `_pipe`/`_read` helper functions, which have dropped. + + pandoc.lua: re-add missing MetaMap function. This was a bug + introduced in version 2.0.4. 
+ + * Text.Pandoc.Shared: export `blocksToInlines'` (API change, Mauro Bieg). + + * Text.Pandoc.MIME: Add opus to MIME type table as audio/ogg (#4198). + + * Allow lenient decoding of latex error logs, which are not always + properly UTF8-encoded (#4200). + + * Update latex template to work with recent versions of beamer. + The old template produced numbered sections with some recent + versions of beamer. Thanks to Thomas Hodgson. + + * Updated reference.docx (#4175). Instead of just "Hello, world", the + document now contains exemplars of most of the styles that have an + effect on pandoc documents. This makes it easier to see the effect + of style changes. + + * Removed `default.theme` data file (#4096). It is no longer needed now + that we have `--print-highlight-style`. + + * MANUAL.txt: + + + Add note on what formats have `+smart` by default. + + Use native syntax for custom-style (#4174, Mauro Bieg). + + Introduce dedicated Extensions section, since some extensions + affect formats other than markdown (Mauro Bieg, #4204). + + * filters.md: say that Text.Pandoc.JSON comes from pandoc-types. + Closes jgm/pandoc-website#16. + + * epub.md: Delete removed `-S` option from command (#4151, Georger Araújo). + pandoc (2.0.5) * Fix a bug in 2.0.4, whereby pandoc could not read the theme files @@ -318,6 +549,7 @@ pandoc (2.0.3) + Allow spaces after `\(` and before `\)` with `tex_math_single_backslash`. Previously `\( \frac{1}{a} < \frac{1}{b} \)` was not parsed as math in `markdown` or `html` `+tex_math_single_backslash`. + + Parse div with class `line-block` as LineBlock. * MANUAL: clarify that math extensions work with HTML. 
Clarify that `tex_math_dollars` and `tex_math_single_backslash` diff --git a/data/docx/[Content_Types].xml b/data/docx/[Content_Types].xml index 9c5756aed..1e888dff9 100644 --- a/data/docx/[Content_Types].xml +++ b/data/docx/[Content_Types].xml @@ -1,2 +1,2 @@ <?xml version="1.0" encoding="UTF-8"?> -<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types"><Default Extension="xml" ContentType="application/xml" /><Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml" /><Override PartName="/word/webSettings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.webSettings+xml" /><Override PartName="/word/numbering.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.numbering+xml" /><Override PartName="/word/settings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.settings+xml" /><Override PartName="/word/theme/theme1.xml" ContentType="application/vnd.openxmlformats-officedocument.theme+xml" /><Override PartName="/word/fontTable.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+xml" /><Override PartName="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-properties+xml" /><Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml" /><Override PartName="/word/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml" /><Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml" /><Override PartName="/word/footnotes.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.footnotes+xml" /></Types> +<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types"><Default Extension="xml" ContentType="application/xml" /><Default Extension="rels" 
ContentType="application/vnd.openxmlformats-package.relationships+xml" /><Override PartName="/word/webSettings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.webSettings+xml" /><Override PartName="/word/numbering.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.numbering+xml" /><Override PartName="/word/settings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.settings+xml" /><Override PartName="/word/theme/theme1.xml" ContentType="application/vnd.openxmlformats-officedocument.theme+xml" /><Override PartName="/word/fontTable.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+xml" /><Override PartName="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-properties+xml" /><Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml" /><Override PartName="/word/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml" /><Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml" /><Override PartName="/word/comments.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.comments+xml" /><Override PartName="/word/footnotes.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.footnotes+xml" /></Types> diff --git a/data/docx/docProps/core.xml b/data/docx/docProps/core.xml index 2274766e4..bc61390b0 100644 --- a/data/docx/docProps/core.xml +++ b/data/docx/docProps/core.xml @@ -1,2 +1,2 @@ <?xml version="1.0" encoding="UTF-8"?> -<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dcmitype="http://purl.org/dc/dcmitype/" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><dc:title></dc:title><dc:creator></dc:creator></cp:coreProperties>
\ No newline at end of file +<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dcmitype="http://purl.org/dc/dcmitype/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><dc:title>Title</dc:title><dc:creator>Author</dc:creator><cp:keywords></cp:keywords><dcterms:created xsi:type="dcterms:W3CDTF">2017-12-27T05:22:50Z</dcterms:created><dcterms:modified xsi:type="dcterms:W3CDTF">2017-12-27T05:22:50Z</dcterms:modified></cp:coreProperties>
\ No newline at end of file diff --git a/data/docx/word/_rels/document.xml.rels b/data/docx/word/_rels/document.xml.rels index ca0c57b63..f01e07658 100644 --- a/data/docx/word/_rels/document.xml.rels +++ b/data/docx/word/_rels/document.xml.rels @@ -1,2 +1,2 @@ <?xml version="1.0" encoding="UTF-8"?> -<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/numbering" Id="rId1" Target="numbering.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Id="rId2" Target="styles.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings" Id="rId3" Target="settings.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/webSettings" Id="rId4" Target="webSettings.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable" Id="rId5" Target="fontTable.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Id="rId6" Target="theme/theme1.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes" Id="rId7" Target="footnotes.xml" /></Relationships> +<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/numbering" Id="rId1" Target="numbering.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Id="rId2" Target="styles.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings" Id="rId3" Target="settings.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/webSettings" Id="rId4" Target="webSettings.xml" /><Relationship 
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable" Id="rId5" Target="fontTable.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Id="rId6" Target="theme/theme1.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes" Id="rId7" Target="footnotes.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/comments" Id="rId8" Target="comments.xml" /><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink" Id="rId30" Target="http://example.com" TargetMode="External" /></Relationships> diff --git a/data/docx/word/_rels/footnotes.xml.rels b/data/docx/word/_rels/footnotes.xml.rels index be7e70853..81d529a4c 100644 --- a/data/docx/word/_rels/footnotes.xml.rels +++ b/data/docx/word/_rels/footnotes.xml.rels @@ -1,2 +1,2 @@ <?xml version="1.0" encoding="UTF-8"?> -<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships" />
\ No newline at end of file +<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink" Id="rId30" Target="http://example.com" TargetMode="External" /></Relationships>
\ No newline at end of file diff --git a/data/docx/word/comments.xml b/data/docx/word/comments.xml new file mode 100644 index 000000000..ca80aa7fe --- /dev/null +++ b/data/docx/word/comments.xml @@ -0,0 +1,2 @@ +<?xml version="1.0" encoding="UTF-8"?> +<w:comments xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" />
\ No newline at end of file diff --git a/data/docx/word/document.xml b/data/docx/word/document.xml index 7199034da..f74c3f56e 100644 --- a/data/docx/word/document.xml +++ b/data/docx/word/document.xml @@ -1,2 +1,398 @@ -<?xml version="1.0" encoding="UTF-8"?> -<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"><w:body><w:p><w:r><w:t xml:space="preserve">Hello world.</w:t></w:r></w:p></w:body></w:document> +<?xml version="1.0" encoding="utf-8"?> +<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" +xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" +xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" +xmlns:o="urn:schemas-microsoft-com:office:office" +xmlns:v="urn:schemas-microsoft-com:vml" +xmlns:w10="urn:schemas-microsoft-com:office:word" +xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" +xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture" +xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"> + + <w:body> + <w:p> + <w:pPr> + <w:pStyle w:val="Title" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Title +</w:t> + </w:r> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Subtitle" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Subtitle +</w:t> + </w:r> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Author" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Author +</w:t> + </w:r> + </w:p> + <w:p> + <w:pPr> 
+ <w:pStyle w:val="Date" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Date +</w:t> + </w:r> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Compact" /> + <w:pStyle w:val="Abstract" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Abstract +</w:t> + </w:r> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Heading1" /> + </w:pPr> + <w:bookmarkStart w:id="21" w:name="heading-1" /> + <w:r> + <w:t xml:space="preserve"> +Heading 1 +</w:t> + </w:r> + <w:bookmarkEnd w:id="21" /> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Heading2" /> + </w:pPr> + <w:bookmarkStart w:id="22" w:name="heading-2" /> + <w:r> + <w:t xml:space="preserve"> +Heading 2 +</w:t> + </w:r> + <w:bookmarkEnd w:id="22" /> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Heading3" /> + </w:pPr> + <w:bookmarkStart w:id="23" w:name="heading-3" /> + <w:r> + <w:t xml:space="preserve"> +Heading 3 +</w:t> + </w:r> + <w:bookmarkEnd w:id="23" /> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Heading4" /> + </w:pPr> + <w:bookmarkStart w:id="24" w:name="heading-4" /> + <w:r> + <w:t xml:space="preserve"> +Heading 4 +</w:t> + </w:r> + <w:bookmarkEnd w:id="24" /> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Heading5" /> + </w:pPr> + <w:bookmarkStart w:id="25" w:name="heading-5" /> + <w:r> + <w:t xml:space="preserve"> +Heading 5 +</w:t> + </w:r> + <w:bookmarkEnd w:id="25" /> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Heading6" /> + </w:pPr> + <w:bookmarkStart w:id="26" w:name="heading-6" /> + <w:r> + <w:t xml:space="preserve"> +Heading 6 +</w:t> + </w:r> + <w:bookmarkEnd w:id="26" /> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Heading7" /> + </w:pPr> + <w:bookmarkStart w:id="27" w:name="heading-7" /> + <w:r> + <w:t xml:space="preserve"> +Heading 7 +</w:t> + </w:r> + <w:bookmarkEnd w:id="27" /> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Heading8" /> + </w:pPr> + <w:bookmarkStart w:id="28" w:name="heading-8" /> + <w:r> + <w:t xml:space="preserve"> +Heading 8 +</w:t> + </w:r> + <w:bookmarkEnd w:id="28" /> + 
</w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Heading9" /> + </w:pPr> + <w:bookmarkStart w:id="29" w:name="heading-9" /> + <w:r> + <w:t xml:space="preserve"> +Heading 9 +</w:t> + </w:r> + <w:bookmarkEnd w:id="29" /> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="FirstParagraph" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +First Paragraph. +</w:t> + </w:r> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="BodyText" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Body Text. Body Text Char. +</w:t> + </w:r> + <w:r> + <w:t xml:space="preserve"> + +</w:t> + </w:r> + <w:r> + <w:rPr> + <w:rStyle w:val="VerbatimChar" /> + </w:rPr> + <w:t xml:space="preserve"> +Verbatim Char +</w:t> + </w:r> + <w:r> + <w:t xml:space="preserve"> +. +</w:t> + </w:r> + <w:r> + <w:t xml:space="preserve"> + +</w:t> + </w:r> + <w:hyperlink r:id="rId30"> + <w:r> + <w:rPr> + <w:rStyle w:val="Hyperlink" /> + </w:rPr> + <w:t xml:space="preserve"> +Hyperlink +</w:t> + </w:r> + </w:hyperlink> + <w:r> + <w:t xml:space="preserve"> +. +</w:t> + </w:r> + <w:r> + <w:t xml:space="preserve"> + +</w:t> + </w:r> + <w:r> + <w:t xml:space="preserve"> +Footnote. +</w:t> + </w:r> + <w:r> + <w:rPr> + <w:rStyle w:val="FootnoteReference" /> + </w:rPr> + <w:footnoteReference w:id="31" /> + </w:r> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="BlockText" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Block Text. +</w:t> + </w:r> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="TableCaption" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Table caption. +</w:t> + </w:r> + </w:p> + <w:tbl> + <w:tblPr> + <w:tblStyle w:val="Table" /> + <w:tblW w:type="pct" w:w="0.0" /> + <w:tblLook w:firstRow="1" /> + <w:tblCaption w:val="Table caption." 
/> + </w:tblPr> + <w:tblGrid /> + <w:tr> + <w:trPr> + <w:cnfStyle w:firstRow="1" /> + </w:trPr> + <w:tc> + <w:tcPr> + <w:tcBorders> + <w:bottom w:val="single" /> + </w:tcBorders> + <w:vAlign w:val="bottom" /> + </w:tcPr> + <w:p> + <w:pPr> + <w:pStyle w:val="Compact" /> + <w:jc w:val="left" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Table +</w:t> + </w:r> + </w:p> + </w:tc> + <w:tc> + <w:tcPr> + <w:tcBorders> + <w:bottom w:val="single" /> + </w:tcBorders> + <w:vAlign w:val="bottom" /> + </w:tcPr> + <w:p> + <w:pPr> + <w:pStyle w:val="Compact" /> + <w:jc w:val="left" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Table +</w:t> + </w:r> + </w:p> + </w:tc> + </w:tr> + <w:tr> + <w:tc> + <w:p> + <w:pPr> + <w:pStyle w:val="Compact" /> + <w:jc w:val="left" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +1 +</w:t> + </w:r> + </w:p> + </w:tc> + <w:tc> + <w:p> + <w:pPr> + <w:pStyle w:val="Compact" /> + <w:jc w:val="left" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +2 +</w:t> + </w:r> + </w:p> + </w:tc> + </w:tr> + </w:tbl> + <w:p> + <w:pPr> + <w:pStyle w:val="ImageCaption" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Image Caption +</w:t> + </w:r> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="DefinitionTerm" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +DefinitionTerm +</w:t> + </w:r> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Definition" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Definition +</w:t> + </w:r> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="DefinitionTerm" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +DefinitionTerm +</w:t> + </w:r> + </w:p> + <w:p> + <w:pPr> + <w:pStyle w:val="Definition" /> + </w:pPr> + <w:r> + <w:t xml:space="preserve"> +Definition +</w:t> + </w:r> + </w:p> + <w:sectPr /> + </w:body> +</w:document> diff --git a/data/docx/word/footnotes.xml b/data/docx/word/footnotes.xml index db82d9462..2a150e026 100644 --- a/data/docx/word/footnotes.xml +++ b/data/docx/word/footnotes.xml @@ -1,26 +1,7 @@ 
-<?xml version="1.0" encoding="utf-8"?> -<w:footnotes xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" -xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" -xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" -xmlns:o="urn:schemas-microsoft-com:office:office" -xmlns:v="urn:schemas-microsoft-com:vml" -xmlns:w10="urn:schemas-microsoft-com:office:word" -xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" -xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture" -xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"> - - <w:footnote w:type="continuationSeparator" w:id="0"> - <w:p> - <w:r> - <w:continuationSeparator /> - </w:r> - </w:p> - </w:footnote> - <w:footnote w:type="separator" w:id="-1"> - <w:p> - <w:r> - <w:separator /> - </w:r> - </w:p> - </w:footnote> -</w:footnotes> +<?xml version="1.0" encoding="UTF-8"?> +<w:footnotes xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"><w:footnote w:type="continuationSeparator" w:id="0"><w:p><w:r><w:continuationSeparator /></w:r></w:p></w:footnote><w:footnote w:type="separator" w:id="-1"><w:p><w:r><w:separator /></w:r></w:p></w:footnote><w:footnote w:id="31"><w:p><w:pPr><w:pStyle w:val="FootnoteText" /></w:pPr><w:r> + <w:rPr> + <w:rStyle w:val="FootnoteReference" /> + </w:rPr> + <w:footnoteRef /> +</w:r><w:r><w:t xml:space="preserve"> </w:t></w:r><w:r><w:t xml:space="preserve">Footnote 
Text.</w:t></w:r></w:p></w:footnote></w:footnotes>
\ No newline at end of file diff --git a/data/docx/word/numbering.xml b/data/docx/word/numbering.xml index 970dafc45..2df923f28 100644 --- a/data/docx/word/numbering.xml +++ b/data/docx/word/numbering.xml @@ -1,3 +1,2 @@ -<?xml version="1.0" encoding="utf-8"?> -<w:numbering xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"> -</w:numbering> +<?xml version="1.0" encoding="UTF-8"?> +<w:numbering xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"><w:abstractNum w:abstractNumId="990"><w:nsid w:val="170cd2de" /><w:multiLevelType w:val="multilevel" /><w:lvl w:ilvl="0"><w:numFmt w:val="bullet" /><w:lvlText w:val=" " /><w:lvlJc w:val="left" /><w:pPr><w:tabs><w:tab w:val="num" w:pos="0" /></w:tabs><w:ind w:left="480" w:hanging="480" /></w:pPr></w:lvl><w:lvl w:ilvl="1"><w:numFmt w:val="bullet" /><w:lvlText w:val=" " /><w:lvlJc w:val="left" /><w:pPr><w:tabs><w:tab w:val="num" w:pos="720" /></w:tabs><w:ind w:left="1200" w:hanging="480" /></w:pPr></w:lvl><w:lvl w:ilvl="2"><w:numFmt w:val="bullet" /><w:lvlText w:val=" " /><w:lvlJc w:val="left" /><w:pPr><w:tabs><w:tab w:val="num" w:pos="1440" /></w:tabs><w:ind w:left="1920" w:hanging="480" /></w:pPr></w:lvl><w:lvl w:ilvl="3"><w:numFmt w:val="bullet" /><w:lvlText w:val=" " /><w:lvlJc w:val="left" /><w:pPr><w:tabs><w:tab w:val="num" w:pos="2160" /></w:tabs><w:ind w:left="2640" w:hanging="480" /></w:pPr></w:lvl><w:lvl w:ilvl="4"><w:numFmt w:val="bullet" /><w:lvlText w:val=" " /><w:lvlJc w:val="left" /><w:pPr><w:tabs><w:tab w:val="num" w:pos="2880" /></w:tabs><w:ind w:left="3360" w:hanging="480" /></w:pPr></w:lvl><w:lvl w:ilvl="5"><w:numFmt w:val="bullet" /><w:lvlText w:val=" " /><w:lvlJc w:val="left" /><w:pPr><w:tabs><w:tab w:val="num" w:pos="3600" /></w:tabs><w:ind w:left="4080" w:hanging="480" /></w:pPr></w:lvl><w:lvl w:ilvl="6"><w:numFmt w:val="bullet" /><w:lvlText w:val=" " /><w:lvlJc w:val="left" /><w:pPr><w:tabs><w:tab w:val="num" w:pos="4320" /></w:tabs><w:ind w:left="4800" 
w:hanging="480" /></w:pPr></w:lvl><w:lvl w:ilvl="7"><w:numFmt w:val="bullet" /><w:lvlText w:val=" " /><w:lvlJc w:val="left" /><w:pPr><w:tabs><w:tab w:val="num" w:pos="5040" /></w:tabs><w:ind w:left="5520" w:hanging="480" /></w:pPr></w:lvl><w:lvl w:ilvl="8"><w:numFmt w:val="bullet" /><w:lvlText w:val=" " /><w:lvlJc w:val="left" /><w:pPr><w:tabs><w:tab w:val="num" w:pos="5760" /></w:tabs><w:ind w:left="6240" w:hanging="480" /></w:pPr></w:lvl></w:abstractNum><w:num w:numId="1000"><w:abstractNumId w:val="990" /></w:num></w:numbering>
\ No newline at end of file diff --git a/data/docx/word/settings.xml b/data/docx/word/settings.xml index 425e6f7b5..afa0199c9 100644 --- a/data/docx/word/settings.xml +++ b/data/docx/word/settings.xml @@ -44,4 +44,4 @@ <w:clrSchemeMapping w:bg1="light1" w:t1="dark1" w:bg2="light2" w:t2="dark2" w:accent1="accent1" w:accent2="accent2" w:accent3="accent3" w:accent4="accent4" w:accent5="accent5" w:accent6="accent6" w:hyperlink="hyperlink" w:followedHyperlink="followedHyperlink" /> <w:decimalSymbol w:val="." /> <w:listSeparator w:val="," /> -</w:settings> +</w:settings>
\ No newline at end of file diff --git a/data/docx/word/styles.xml b/data/docx/word/styles.xml index 3596d8bbc..130a55a63 100644 --- a/data/docx/word/styles.xml +++ b/data/docx/word/styles.xml @@ -450,8 +450,7 @@ <w:rFonts w:asciiTheme="majorHAnsi" w:eastAsiaTheme="majorEastAsia" w:hAnsiTheme="majorHAnsi" w:cstheme="majorBidi" /> <w:b w:val="0" /> <w:bCs w:val="0" /> - <w:color w:val="365F91" w:themeColor="accent1" - w:themeShade="BF" /> + <w:color w:val="365F91" w:themeColor="accent1" w:themeShade="BF" /> </w:rPr> </w:style> </w:styles> diff --git a/data/init.lua b/data/init.lua index 7a46269d8..ed39dd294 100644 --- a/data/init.lua +++ b/data/init.lua @@ -4,3 +4,4 @@ pandoc = require 'pandoc' pandoc.mediabag = require 'pandoc.mediabag' +pandoc.utils = require 'pandoc.utils' diff --git a/data/sample.lua b/data/sample.lua index 1e3a08731..6c09442b5 100644 --- a/data/sample.lua +++ b/data/sample.lua @@ -242,14 +242,12 @@ function OrderedList(items) return "<ol>\n" .. table.concat(buffer, "\n") .. "\n</ol>" end --- Revisit association list STackValue instance. function DefinitionList(items) local buffer = {} for _,item in pairs(items) do - for k, v in pairs(item) do - table.insert(buffer,"<dt>" .. k .. "</dt>\n<dd>" .. - table.concat(v,"</dd>\n<dd>") .. "</dd>") - end + local k, v = next(item) + table.insert(buffer, "<dt>" .. k .. "</dt>\n<dd>" .. + table.concat(v, "</dd>\n<dd>") .. "</dd>") end return "<dl>\n" .. table.concat(buffer, "\n") .. "\n</dl>" end @@ -288,7 +286,7 @@ function Table(caption, aligns, widths, headers, rows) end if widths and widths[1] ~= 0 then for _, w in pairs(widths) do - add('<col width="' .. string.format("%d%%", w * 100) .. '" />') + add('<col width="' .. string.format("%.0f%%", w * 100) .. 
'" />') end end local header_row = {} diff --git a/data/templates/default.dzslides b/data/templates/default.dzslides index 73ef045f1..892a434cb 100644 --- a/data/templates/default.dzslides +++ b/data/templates/default.dzslides @@ -16,7 +16,6 @@ $endif$ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} $if(quotes)$ q { quotes: "“" "”" "‘" "’"; } diff --git a/data/templates/default.epub2 b/data/templates/default.epub2 index 2abead3d0..cca9fcf6f 100644 --- a/data/templates/default.epub2 +++ b/data/templates/default.epub2 @@ -10,7 +10,6 @@ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} $if(quotes)$ q { quotes: "“" "”" "‘" "’"; } diff --git a/data/templates/default.epub3 b/data/templates/default.epub3 index ffe230507..b22714963 100644 --- a/data/templates/default.epub3 +++ b/data/templates/default.epub3 @@ -9,7 +9,6 @@ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} $if(quotes)$ q { quotes: "“" "”" "‘" "’"; } diff --git a/data/templates/default.html4 b/data/templates/default.html4 index a771599f7..714b3ff2e 100644 --- a/data/templates/default.html4 +++ b/data/templates/default.html4 @@ -18,7 +18,6 @@ $endif$ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} $if(quotes)$ q { quotes: "“" "”" "‘" "’"; } diff --git a/data/templates/default.html5 b/data/templates/default.html5 index 
2272a0179..5c484f376 100644 --- a/data/templates/default.html5 +++ b/data/templates/default.html5 @@ -18,7 +18,6 @@ $endif$ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} $if(quotes)$ q { quotes: "“" "”" "‘" "’"; } diff --git a/data/templates/default.revealjs b/data/templates/default.revealjs index 9e3bc8d09..65ab09049 100644 --- a/data/templates/default.revealjs +++ b/data/templates/default.revealjs @@ -21,7 +21,6 @@ $endif$ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} $if(quotes)$ q { quotes: "“" "”" "‘" "’"; } diff --git a/data/templates/default.s5 b/data/templates/default.s5 index 33e539744..e9c36b4d4 100644 --- a/data/templates/default.s5 +++ b/data/templates/default.s5 @@ -19,7 +19,6 @@ $endif$ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} $if(quotes)$ q { quotes: "“" "”" "‘" "’"; } diff --git a/data/templates/default.slideous b/data/templates/default.slideous index 4c5e6a7bd..ad58272ae 100644 --- a/data/templates/default.slideous +++ b/data/templates/default.slideous @@ -20,7 +20,6 @@ $endif$ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} $if(quotes)$ q { quotes: "“" "”" "‘" "’"; } diff --git a/data/templates/default.slidy b/data/templates/default.slidy index 5bc237c91..98b8d669d 100644 --- a/data/templates/default.slidy +++ b/data/templates/default.slidy @@ 
-20,7 +20,6 @@ $endif$ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} $if(quotes)$ q { quotes: "“" "”" "‘" "’"; } diff --git a/doc/lua-filters.md b/doc/lua-filters.md index 52d745ce8..e9ed704ad 100644 --- a/doc/lua-filters.md +++ b/doc/lua-filters.md @@ -173,10 +173,11 @@ Some pandoc functions have been made available in lua: documents; - [`pipe`](#pipe) runs an external command with input from and output to strings; -- [`sha1`](#utils-sha1) generates a SHA1 hash; - the [`pandoc.mediabag`](#module-pandoc.mediabag) module allows access to the "mediabag," which stores binary content such as - images that may be included in the final document. + images that may be included in the final document; +- the [`pandoc.utils`](#module-pandoc.utils) module contains + various utility functions. # Lua interpreter initialization @@ -1399,12 +1400,13 @@ Lua functions for pandoc scripts. Usage: -- within a file defining a pandoc filter: + local text = require('text') function Str(text) - return pandoc.Str(utf8.upper(text)) + return pandoc.Str(text.upper(text)) end return {pandoc.global_filter()} - -- the above is equivallent to + -- the above is equivalent to -- return {{Str = Str}} [`pipe (command, args, input)`]{#pipe} @@ -1432,6 +1434,45 @@ Lua functions for pandoc scripts. This module exposes internal pandoc functions and utility functions. +[`hierarchicalize (blocks)`]{#utils-hierarchicalize} + +: Convert list of blocks into a hierarchical list. A + hierarchical element is either a normal block (but no + Header), or a `Sec` element. 
The latter has the following + fields: + + - level: level in the document hierarchy; + - numbering: list of integers of length `level`, + specifying the absolute position of the section in the + document; + - attr: section attributes (see [Attr](#Attr)); + - contents: nested list of hierarchical elements. + + Returns: + + - List of hierarchical elements + + Usage: + + local blocks = { + pandoc.Header(2, pandoc.Str 'first'), + pandoc.Header(2, pandoc.Str 'second'), + } + local elements = pandoc.utils.hierarchicalize(blocks) + print(table.concat(elements[1].numbering, '.')) -- 0.1 + print(table.concat(elements[2].numbering, '.')) -- 0.2 + +[`normalize_date (date_string)`]{#utils-normalize_date} + +: Parse a date and convert (if possible) to "YYYY-MM-DD" + format. We limit years to the range 1601-9999 (ISO 8601 + accepts greater than or equal to 1583, but MS Word only + accepts dates starting 1601). + + Returns: + + - A date string, or nil when the conversion failed. + [`sha1 (contents)`]{#utils-sha1} : Returns the SHA1 has of the contents. @@ -1456,9 +1497,23 @@ functions. Usage: local inline = pandoc.Emph{pandoc.Str 'Moin'} - -- outputs "Moin" + -- outputs "Moin" print(pandoc.utils.stringify(inline)) +[`to_roman_numeral (integer)`]{#utils-to_roman_numeral} + +: Converts an integer < 4000 to uppercase roman numeral. + + Returns: + + - A roman numeral string. + + Usage: + + local to_roman_numeral = pandoc.utils.to_roman_numeral + local pandoc_birth_year = to_roman_numeral(2006) + -- pandoc_birth_year == 'MMVI' + # Module pandoc.mediabag diff --git a/man/pandoc.1 b/man/pandoc.1 index 64db3ae51..91cd1f7be 100644 --- a/man/pandoc.1 +++ b/man/pandoc.1 @@ -1,5 +1,5 @@ .\"t -.TH PANDOC 1 "December 8, 2017" "pandoc 2.0.5" +.TH PANDOC 1 "December 27, 2017" "pandoc 2.0.6" .SH NAME pandoc - general markup converter .SH SYNOPSIS @@ -12,16 +12,16 @@ another, and a command\-line tool that uses this library. 
It can read Markdown, CommonMark, PHP Markdown Extra, GitHub\-Flavored Markdown, MultiMarkdown, and (subsets of) Textile, reStructuredText, HTML, LaTeX, MediaWiki markup, TWiki markup, TikiWiki markup, Creole -1.0, Haddock markup, OPML, Emacs Org mode, DocBook, Muse, txt2tags, -Vimwiki, EPUB, ODT, and Word docx; and it can write plain text, -Markdown, CommonMark, PHP Markdown Extra, GitHub\-Flavored Markdown, -MultiMarkdown, reStructuredText, XHTML, HTML5, LaTeX (including -\f[C]beamer\f[] slide shows), ConTeXt, RTF, OPML, DocBook, OpenDocument, -ODT, Word docx, GNU Texinfo, MediaWiki markup, DokuWiki markup, ZimWiki -markup, Haddock markup, EPUB (v2 or v3), FictionBook2, Textile, groff -man, groff ms, Emacs Org mode, AsciiDoc, InDesign ICML, TEI Simple, -Muse, PowerPoint slide shows and Slidy, Slideous, DZSlides, reveal.js or -S5 HTML slide shows. +1.0, Haddock markup, OPML, Emacs Org mode, DocBook, JATS, Muse, +txt2tags, Vimwiki, EPUB, ODT, and Word docx; and it can write plain +text, Markdown, CommonMark, PHP Markdown Extra, GitHub\-Flavored +Markdown, MultiMarkdown, reStructuredText, XHTML, HTML5, LaTeX +(including \f[C]beamer\f[] slide shows), ConTeXt, RTF, OPML, DocBook, +JATS, OpenDocument, ODT, Word docx, GNU Texinfo, MediaWiki markup, +DokuWiki markup, ZimWiki markup, Haddock markup, EPUB (v2 or v3), +FictionBook2, Textile, groff man, groff ms, Emacs Org mode, AsciiDoc, +InDesign ICML, TEI Simple, Muse, PowerPoint slide shows and Slidy, +Slideous, DZSlides, reveal.js or S5 HTML slide shows. It can also produce PDF output on systems where LaTeX, ConTeXt, \f[C]pdfroff\f[], \f[C]wkhtmltopdf\f[], \f[C]prince\f[], or \f[C]weasyprint\f[] is installed. @@ -242,19 +242,10 @@ markup), \f[C]tikiwiki\f[] (TikiWiki markup), \f[C]creole\f[] (Creole 1.0), \f[C]haddock\f[] (Haddock markup), or \f[C]latex\f[] (LaTeX). 
(\f[C]markdown_github\f[] provides deprecated and less accurate support for Github\-Flavored Markdown; please use \f[C]gfm\f[] instead, unless -you need to use extensions other than \f[C]smart\f[].) If \f[C]+lhs\f[] -is appended to \f[C]markdown\f[], \f[C]rst\f[], \f[C]latex\f[], or -\f[C]html\f[], the input will be treated as literate Haskell source: see -Literate Haskell support, below. -Markdown syntax extensions can be individually enabled or disabled by -appending \f[C]+EXTENSION\f[] or \f[C]\-EXTENSION\f[] to the format -name. -So, for example, \f[C]markdown_strict+footnotes+definition_lists\f[] is -strict Markdown with footnotes and definition lists enabled, and -\f[C]markdown\-pipe_tables+hard_line_breaks\f[] is pandoc\[aq]s Markdown -without pipe tables and with hard line breaks. -See Pandoc\[aq]s Markdown, below, for a list of extensions and their -names. +you need to use extensions other than \f[C]smart\f[].) Extensions can be +individually enabled or disabled by appending \f[C]+EXTENSION\f[] or +\f[C]\-EXTENSION\f[] to the format name. +See Extensions below, for a list of extensions and their names. See \f[C]\-\-list\-input\-formats\f[] and \f[C]\-\-list\-extensions\f[], below. .RS @@ -295,13 +286,9 @@ you use extensions that do not work with \f[C]gfm\f[].) Note that \f[C]odt\f[], \f[C]epub\f[], and \f[C]epub3\f[] output will not be directed to \f[I]stdout\f[]; an output filename must be specified using the \f[C]\-o/\-\-output\f[] option. -If \f[C]+lhs\f[] is appended to \f[C]markdown\f[], \f[C]rst\f[], -\f[C]latex\f[], \f[C]beamer\f[], \f[C]html4\f[], or \f[C]html5\f[], the -output will be rendered as literate Haskell source: see Literate Haskell -support, below. -Markdown syntax extensions can be individually enabled or disabled by -appending \f[C]+EXTENSION\f[] or \f[C]\-EXTENSION\f[] to the format -name, as described above under \f[C]\-f\f[]. 
+Extensions can be individually enabled or disabled by appending +\f[C]+EXTENSION\f[] or \f[C]\-EXTENSION\f[] to the format name. +See Extensions below, for a list of extensions and their names. See \f[C]\-\-list\-output\-formats\f[] and \f[C]\-\-list\-extensions\f[], below. .RS @@ -398,10 +385,10 @@ List supported output formats, one per line. .RS .RE .TP -.B \f[C]\-\-list\-extensions\f[] +.B \f[C]\-\-list\-extensions\f[][\f[C]=\f[]\f[I]FORMAT\f[]] List supported Markdown extensions, one per line, followed by a \f[C]+\f[] or \f[C]\-\f[] indicating whether it is enabled by default in -pandoc\[aq]s Markdown. +\f[I]FORMAT\f[] (defaulting to pandoc\[aq]s Markdown). .RS .RE .TP @@ -2035,6 +2022,336 @@ merge in changes after each pandoc release. .PP Templates may contain comments: anything on a line after \f[C]$\-\-\f[] will be treated as a comment and ignored. +.SH EXTENSIONS +.PP +The behavior of some of the readers and writers can be adjusted by +enabling or disabling various extensions. +.PP +An extension can be enabled by adding \f[C]+EXTENSION\f[] to the format +name and disabled by adding \f[C]\-EXTENSION\f[]. +For example, \f[C]\-\-from\ markdown_strict+footnotes\f[] is strict +Markdown with footnotes enabled, while +\f[C]\-\-from\ markdown\-footnotes\-pipe_tables\f[] is pandoc\[aq]s +Markdown without footnotes or pipe tables. +.PP +The markdown reader and writer make by far the most use of extensions. +Extensions only used by them are therefore covered in the section +Pandoc\[aq]s Markdown below (See Markdown variants for +\f[C]commonmark\f[] and \f[C]gfm\f[].) In the following, extensions that +also work for other formats are covered. +.SS Typography +.SS Extension: \f[C]smart\f[] +.PP +Interpret straight quotes as curly quotes, \f[C]\-\-\-\f[] as +em\-dashes, \f[C]\-\-\f[] as en\-dashes, and \f[C]\&...\f[] as ellipses. +Nonbreaking spaces are inserted after certain abbreviations, such as +"Mr." 
+.PP +This extension can be enabled/disabled for the following formats: +.TP +.B input formats +\f[C]markdown\f[], \f[C]commonmark\f[], \f[C]latex\f[], +\f[C]mediawiki\f[], \f[C]org\f[], \f[C]rst\f[], \f[C]twiki\f[] +.RS +.RE +.TP +.B output formats +\f[C]markdown\f[], \f[C]latex\f[], \f[C]context\f[], \f[C]rst\f[] +.RS +.RE +.TP +.B enabled by default in +\f[C]markdown\f[], \f[C]latex\f[], \f[C]context\f[] (both input and +output) +.RS +.RE +.PP +Note: If you are \f[I]writing\f[] Markdown, then the \f[C]smart\f[] +extension has the reverse effect: what would have been curly quotes +comes out straight. +.PP +In LaTeX, \f[C]smart\f[] means to use the standard TeX ligatures for +quotation marks (\f[C]``\f[] and \f[C]\[aq]\[aq]\f[] for double quotes, +\f[C]`\f[] and \f[C]\[aq]\f[] for single quotes) and dashes +(\f[C]\-\-\f[] for en\-dash and \f[C]\-\-\-\f[] for em\-dash). +If \f[C]smart\f[] is disabled, then in reading LaTeX pandoc will parse +these characters literally. +In writing LaTeX, enabling \f[C]smart\f[] tells pandoc to use the +ligatures when possible; if \f[C]smart\f[] is disabled pandoc will use +unicode quotation mark and dash characters. +.SS Headers and sections +.SS Extension: \f[C]auto_identifiers\f[] +.PP +A header without an explicitly specified identifier will be +automatically assigned a unique identifier based on the header text. +.PP +This extension can be enabled/disabled for the following formats: +.TP +.B input formats +\f[C]markdown\f[], \f[C]latex\f[], \f[C]rst\f[], \f[C]mediawiki\f[], +\f[C]textile\f[] +.RS +.RE +.TP +.B output formats +\f[C]markdown\f[], \f[C]muse\f[] +.RS +.RE +.TP +.B enabled by default in +\f[C]markdown\f[], \f[C]muse\f[] +.RS +.RE +.PP +The algorithm used to derive the identifier from the header text is: +.IP \[bu] 2 +Remove all formatting, links, etc. +.IP \[bu] 2 +Remove all footnotes. +.IP \[bu] 2 +Remove all punctuation, except underscores, hyphens, and periods. 
+.IP \[bu] 2 +Replace all spaces and newlines with hyphens. +.IP \[bu] 2 +Convert all alphabetic characters to lowercase. +.IP \[bu] 2 +Remove everything up to the first letter (identifiers may not begin with +a number or punctuation mark). +.IP \[bu] 2 +If nothing is left after this, use the identifier \f[C]section\f[]. +.PP +Thus, for example, +.PP +.TS +tab(@); +l l. +T{ +Header +T}@T{ +Identifier +T} +_ +T{ +\f[C]Header\ identifiers\ in\ HTML\f[] +T}@T{ +\f[C]header\-identifiers\-in\-html\f[] +T} +T{ +\f[C]*Dogs*?\-\-in\ *my*\ house?\f[] +T}@T{ +\f[C]dogs\-\-in\-my\-house\f[] +T} +T{ +\f[C][HTML],\ [S5],\ or\ [RTF]?\f[] +T}@T{ +\f[C]html\-s5\-or\-rtf\f[] +T} +T{ +\f[C]3.\ Applications\f[] +T}@T{ +\f[C]applications\f[] +T} +T{ +\f[C]33\f[] +T}@T{ +\f[C]section\f[] +T} +.TE +.PP +These rules should, in most cases, allow one to determine the identifier +from the header text. +The exception is when several headers have the same text; in this case, +the first will get an identifier as described above; the second will get +the same identifier with \f[C]\-1\f[] appended; the third with +\f[C]\-2\f[]; and so on. +.PP +These identifiers are used to provide link targets in the table of +contents generated by the \f[C]\-\-toc|\-\-table\-of\-contents\f[] +option. +They also make it easy to provide links from one section of a document +to another. +A link to this section, for example, might look like this: +.IP +.nf +\f[C] +See\ the\ section\ on +[header\ identifiers](#header\-identifiers\-in\-html\-latex\-and\-context). +\f[] +.fi +.PP +Note, however, that this method of providing links to sections works +only in HTML, LaTeX, and ConTeXt formats. +.PP +If the \f[C]\-\-section\-divs\f[] option is specified, then each section +will be wrapped in a \f[C]div\f[] (or a \f[C]section\f[], if +\f[C]html5\f[] was specified), and the identifier will be attached to +the enclosing \f[C]<div>\f[] (or \f[C]<section>\f[]) tag rather than the +header itself. 
+This allows entire sections to be manipulated using JavaScript or +treated differently in CSS. +.SS Extension: \f[C]ascii_identifiers\f[] +.PP +Causes the identifiers produced by \f[C]auto_identifiers\f[] to be pure +ASCII. +Accents are stripped off of accented Latin letters, and non\-Latin +letters are omitted. +.SS Math Input +.PP +The extensions \f[C]tex_math_dollars\f[], +\f[C]tex_math_single_backslash\f[], and +\f[C]tex_math_double_backslash\f[] are described in the section about +Pandoc\[aq]s Markdown. +.PP +However, they can also be used with HTML input. +This is handy for reading web pages formatted using MathJax, for +example. +.SS Raw HTML/TeX +.PP +The following extensions (especially how they affect Markdown +input/output) are also described in more detail in their respective +sections of Pandoc\[aq]s Markdown. +.SS Extension: \f[C]raw_html\f[] +.PP +When converting from HTML, parse elements to raw HTML which are not +representable in pandoc\[aq]s AST. +By default, this is disabled for HTML input. +.SS Extension: \f[C]raw_tex\f[] +.PP +Allows raw LaTeX, TeX, and ConTeXt to be included in a document. +.PP +This extension can be enabled/disabled for the following formats (in +addition to \f[C]markdown\f[]): +.TP +.B input formats +\f[C]latex\f[], \f[C]org\f[], \f[C]textile\f[] +.RS +.RE +.TP +.B output formats +\f[C]textile\f[] +.RS +.RE +.SS Extension: \f[C]native_divs\f[] +.PP +This extension is enabled by default for HTML input. +This means that \f[C]div\f[]s are parsed to pandoc native elements. +(Alternatively, you can parse them to raw HTML using +\f[C]\-f\ html\-native_divs+raw_html\f[].) +.PP +When converting HTML to Markdown, for example, you may want to drop all +\f[C]div\f[]s and \f[C]span\f[]s: +.IP +.nf +\f[C] +pandoc\ \-f\ html\-native_divs\-native_spans\ \-t\ markdown +\f[] +.fi +.SS Extension: \f[C]native_spans\f[] +.PP +Analogous to \f[C]native_divs\f[] above. 
+.SS Literate Haskell support +.SS Extension: \f[C]literate_haskell\f[] +.PP +Treat the document as literate Haskell source. +.PP +This extension can be enabled/disabled for the following formats: +.TP +.B input formats +\f[C]markdown\f[], \f[C]rst\f[], \f[C]latex\f[] +.RS +.RE +.TP +.B output formats +\f[C]markdown\f[], \f[C]rst\f[], \f[C]latex\f[], \f[C]html\f[] +.RS +.RE +.PP +If you append \f[C]+lhs\f[] (or \f[C]+literate_haskell\f[]) to one of +the formats above, pandoc will treat the document as literate Haskell +source. +This means that +.IP \[bu] 2 +In Markdown input, "bird track" sections will be parsed as Haskell code +rather than block quotations. +Text between \f[C]\\begin{code}\f[] and \f[C]\\end{code}\f[] will also +be treated as Haskell code. +For ATX\-style headers the character \[aq]=\[aq] will be used instead of +\[aq]#\[aq]. +.IP \[bu] 2 +In Markdown output, code blocks with classes \f[C]haskell\f[] and +\f[C]literate\f[] will be rendered using bird tracks, and block +quotations will be indented one space, so they will not be treated as +Haskell code. +In addition, headers will be rendered setext\-style (with underlines) +rather than ATX\-style (with \[aq]#\[aq] characters). +(This is because ghc treats \[aq]#\[aq] characters in column 1 as +introducing line numbers.) +.IP \[bu] 2 +In restructured text input, "bird track" sections will be parsed as +Haskell code. +.IP \[bu] 2 +In restructured text output, code blocks with class \f[C]haskell\f[] +will be rendered using bird tracks. +.IP \[bu] 2 +In LaTeX input, text in \f[C]code\f[] environments will be parsed as +Haskell code. +.IP \[bu] 2 +In LaTeX output, code blocks with class \f[C]haskell\f[] will be +rendered inside \f[C]code\f[] environments. +.IP \[bu] 2 +In HTML output, code blocks with class \f[C]haskell\f[] will be rendered +with class \f[C]literatehaskell\f[] and bird tracks. 
+.PP +Examples: +.IP +.nf +\f[C] +pandoc\ \-f\ markdown+lhs\ \-t\ html +\f[] +.fi +.PP +reads literate Haskell source formatted with Markdown conventions and +writes ordinary HTML (without bird tracks). +.IP +.nf +\f[C] +pandoc\ \-f\ markdown+lhs\ \-t\ html+lhs +\f[] +.fi +.PP +writes HTML with the Haskell code in bird tracks, so it can be copied +and pasted as literate Haskell source. +.PP +Note that GHC expects the bird tracks in the first column, so indentend +literate code blocks (e.g. +inside an itemized environment) will not be picked up by the Haskell +compiler. +.SS Other extensions +.SS Extension: \f[C]empty_paragraphs\f[] +.PP +Allows empty paragraphs. +By default empty paragraphs are omitted. +.PP +This extension can be enabled/disabled for the following formats: +.TP +.B input formats +\f[C]docx\f[], \f[C]html\f[] +.RS +.RE +.TP +.B output formats +\f[C]markdown\f[], \f[C]docx\f[], \f[C]odt\f[], \f[C]opendocument\f[], +\f[C]html\f[] +.RS +.RE +.SS Extension: \f[C]amuse\f[] +.PP +In the \f[C]muse\f[] input format, this enables Text::Amuse extensions +to Emacs Muse markup. +.SS Extension: \f[C]citations\f[] +.PP +Some aspects of Pandoc\[aq]s Markdown citation syntax are also accepted +in \f[C]org\f[] input. .SH PANDOC\[aq]S MARKDOWN .PP Pandoc understands an extended and slightly revised version of John @@ -2043,11 +2360,11 @@ This document explains the syntax, noting differences from standard Markdown. Except where noted, these differences can be suppressed by using the \f[C]markdown_strict\f[] format instead of \f[C]markdown\f[]. -An extensions can be enabled by adding \f[C]+EXTENSION\f[] to the format -name and disabled by adding \f[C]\-EXTENSION\f[]. -For example, \f[C]markdown_strict+footnotes\f[] is strict Markdown with -footnotes enabled, while \f[C]markdown\-footnotes\-pipe_tables\f[] is -pandoc\[aq]s Markdown without footnotes or pipe tables. +Extensions can be enabled or disabled to specify the behavior more +granularly. 
+They are described in the following. +See also Extensions above, for extensions that work also on other +formats. .SS Philosophy .PP Markdown is designed to be easy to write, and, even more importantly, @@ -2149,6 +2466,8 @@ Many Markdown implementations do not require a space between the opening \f[C]#5\ bolt\f[] and \f[C]#hashtag\f[] count as headers. With this extension, pandoc does require the space. .SS Header identifiers +.PP +See also the \f[C]auto_identifiers\f[] extension above. .SS Extension: \f[C]header_attributes\f[] .PP Headers can be assigned attributes using this syntax at the end of the @@ -2203,96 +2522,6 @@ is just the same as #\ My\ header\ {.unnumbered} \f[] .fi -.SS Extension: \f[C]auto_identifiers\f[] -.PP -A header without an explicitly specified identifier will be -automatically assigned a unique identifier based on the header text. -To derive the identifier from the header text, -.IP \[bu] 2 -Remove all formatting, links, etc. -.IP \[bu] 2 -Remove all footnotes. -.IP \[bu] 2 -Remove all punctuation, except underscores, hyphens, and periods. -.IP \[bu] 2 -Replace all spaces and newlines with hyphens. -.IP \[bu] 2 -Convert all alphabetic characters to lowercase. -.IP \[bu] 2 -Remove everything up to the first letter (identifiers may not begin with -a number or punctuation mark). -.IP \[bu] 2 -If nothing is left after this, use the identifier \f[C]section\f[]. -.PP -Thus, for example, -.PP -.TS -tab(@); -l l. -T{ -Header -T}@T{ -Identifier -T} -_ -T{ -\f[C]Header\ identifiers\ in\ HTML\f[] -T}@T{ -\f[C]header\-identifiers\-in\-html\f[] -T} -T{ -\f[C]*Dogs*?\-\-in\ *my*\ house?\f[] -T}@T{ -\f[C]dogs\-\-in\-my\-house\f[] -T} -T{ -\f[C][HTML],\ [S5],\ or\ [RTF]?\f[] -T}@T{ -\f[C]html\-s5\-or\-rtf\f[] -T} -T{ -\f[C]3.\ Applications\f[] -T}@T{ -\f[C]applications\f[] -T} -T{ -\f[C]33\f[] -T}@T{ -\f[C]section\f[] -T} -.TE -.PP -These rules should, in most cases, allow one to determine the identifier -from the header text. 
-The exception is when several headers have the same text; in this case, -the first will get an identifier as described above; the second will get -the same identifier with \f[C]\-1\f[] appended; the third with -\f[C]\-2\f[]; and so on. -.PP -These identifiers are used to provide link targets in the table of -contents generated by the \f[C]\-\-toc|\-\-table\-of\-contents\f[] -option. -They also make it easy to provide links from one section of a document -to another. -A link to this section, for example, might look like this: -.IP -.nf -\f[C] -See\ the\ section\ on -[header\ identifiers](#header\-identifiers\-in\-html\-latex\-and\-context). -\f[] -.fi -.PP -Note, however, that this method of providing links to sections works -only in HTML, LaTeX, and ConTeXt formats. -.PP -If the \f[C]\-\-section\-divs\f[] option is specified, then each section -will be wrapped in a \f[C]div\f[] (or a \f[C]section\f[], if -\f[C]html5\f[] was specified), and the identifier will be attached to -the enclosing \f[C]<div>\f[] (or \f[C]<section>\f[]) tag rather than the -header itself. -This allows entire sections to be manipulated using JavaScript or -treated differently in CSS. .SS Extension: \f[C]implicit_header_references\f[] .PP Pandoc behaves as if reference links have been defined for each header. @@ -2911,6 +3140,13 @@ As\ (\@good)\ illustrates,\ ... .PP The label can be any string of alphanumeric characters, underscores, or hyphens. +.PP +Note: continuation paragraphs in example lists must always be indented +four spaces, regardless of the length of the list marker. +That is, example lists always behave as if the \f[C]four_space_rule\f[] +extension is set. +This is because example labels tend to be long, and indenting content to +the first non\-space character after the label would be awkward. .SS Compact and loose lists .PP Pandoc behaves differently from \f[C]Markdown.pl\f[] on some "edge @@ -3756,9 +3992,6 @@ options selected. Therefore see Math rendering in HTML above. 
.RS .RE -.PP -This extension can be used with both \f[C]markdown\f[] and \f[C]html\f[] -input. .SS Raw HTML .SS Extension: \f[C]raw_html\f[] .PP @@ -3951,9 +4184,9 @@ The raw attribute cannot be combined with regular attributes. .SS LaTeX macros .SS Extension: \f[C]latex_macros\f[] .PP -For output formats other than LaTeX, pandoc will parse LaTeX -\f[C]\\newcommand\f[] and \f[C]\\renewcommand\f[] definitions and apply -the resulting macros to all LaTeX math. +For output formats other than LaTeX, pandoc will parse LaTeX macro +definitions and apply the resulting macros to all LaTeX math and raw +LaTeX. So, for example, the following will work in all output formats, not just LaTeX: .IP @@ -3965,8 +4198,13 @@ $\\tuple{a,\ b,\ c}$ \f[] .fi .PP -In LaTeX output, the \f[C]\\newcommand\f[] definition will simply be -passed unchanged to the output. +In LaTeX output, the macro definitions will not be passed through as raw +LaTeX. +.PP +When \f[C]latex_macros\f[] is disabled, the macro definitions will be +passed through as raw LaTeX, and the raw LaTeX and math will not have +macros applied. +This is usually a better approach when you are targeting LaTeX or PDF. .SS Links .PP Markdown allows links to be specified in several ways. @@ -4288,30 +4526,6 @@ note.] .fi .PP Inline and regular footnotes may be mixed freely. -.SS Typography -.SS Extension: \f[C]smart\f[] -.PP -Interpret straight quotes as curly quotes, \f[C]\-\-\-\f[] as -em\-dashes, \f[C]\-\-\f[] as en\-dashes, and \f[C]\&...\f[] as ellipses. -Nonbreaking spaces are inserted after certain abbreviations, such as -"Mr." This option currently affects the input formats \f[C]markdown\f[], -\f[C]commonmark\f[], \f[C]latex\f[], \f[C]mediawiki\f[], \f[C]org\f[], -\f[C]rst\f[], and \f[C]twiki\f[], and the output formats -\f[C]markdown\f[], \f[C]latex\f[], and \f[C]context\f[]. 
-.PP -Note: If you are \f[I]writing\f[] Markdown, then the \f[C]smart\f[] -extension has the reverse effect: what would have been curly quotes -comes out straight. -.PP -In LaTeX, \f[C]smart\f[] means to use the standard TeX ligatures for -quotation marks (\f[C]``\f[] and \f[C]\[aq]\[aq]\f[] for double quotes, -\f[C]`\f[] and \f[C]\[aq]\f[] for single quotes) and dashes -(\f[C]\-\-\f[] for en\-dash and \f[C]\-\-\-\f[] for em\-dash). -If \f[C]smart\f[] is disabled, then in reading LaTeX pandoc will parse -these characters literally. -In writing LaTeX, enabling \f[C]smart\f[] tells pandoc to use the -ligatures when possible; if \f[C]smart\f[] is disabled pandoc will use -unicode quotation mark and dash characters. .SS Citations .SS Extension: \f[C]citations\f[] .PP @@ -4675,9 +4889,6 @@ as inline TeX math, and anything between \f[C]\\[\f[] and \f[C]\\]\f[] to be interpreted as display TeX math. Note: a drawback of this extension is that it precludes escaping \f[C](\f[] and \f[C][\f[]. -.PP -This extension can be used with both \f[C]markdown\f[] and \f[C]html\f[] -input. .SS Extension: \f[C]tex_math_double_backslash\f[] .PP Causes anything between \f[C]\\\\(\f[] and \f[C]\\\\)\f[] to be @@ -4725,12 +4936,6 @@ opposed to being parsed as paragraphs). .PP Makes all absolute URIs into links, even when not surrounded by pointy braces \f[C]<...>\f[]. -.SS Extension: \f[C]ascii_identifiers\f[] -.PP -Causes the identifiers produced by \f[C]auto_identifiers\f[] to be pure -ASCII. -Accents are stripped off of accented Latin letters, and non\-Latin -letters are omitted. .SS Extension: \f[C]mmd_link_attributes\f[] .PP Parses multimarkdown style key\-value attributes on link and image @@ -4765,13 +4970,6 @@ anything. .IP \[bu] 2 Lazy wrapping of paragraphs is not allowed: the entire definition must be indented four spaces. -.SS Extension: \f[C]empty_paragraphs\f[] -.PP -Allows empty paragraphs. -By default empty paragraphs are omitted. 
-This affects the \f[C]docx\f[] reader and writer, the -\f[C]opendocument\f[] and \f[C]odt\f[] writer, and all HTML\-based -readers and writers. .SS Markdown variants .PP In addition to pandoc\[aq]s extended Markdown, the following Markdown @@ -4817,37 +5015,26 @@ variants are supported: .RS .RE .PP -We also support \f[C]gfm\f[] (GitHub\-Flavored Markdown) as a set of -extensions on \f[C]commonmark\f[]: +We also support \f[C]commonmark\f[] and \f[C]gfm\f[] (GitHub\-Flavored +Markdown, which is implemented as a set of extensions on +\f[C]commonmark\f[]). .PP -: \f[C]pipe_tables\f[], \f[C]raw_html\f[], \f[C]fenced_code_blocks\f[], +Note, however, that \f[C]commonmark\f[] and \f[C]gfm\f[] have limited +support for extensions. +Only those listed below (and \f[C]smart\f[] and \f[C]raw_tex\f[]) will +work. +The extensions can, however, all be individually disabled. +Also, \f[C]raw_tex\f[] only affects \f[C]gfm\f[] output, not input. +.TP +.B \f[C]gfm\f[] (GitHub\-Flavored Markdown) +\f[C]pipe_tables\f[], \f[C]raw_html\f[], \f[C]fenced_code_blocks\f[], \f[C]auto_identifiers\f[], \f[C]ascii_identifiers\f[], \f[C]backtick_code_blocks\f[], \f[C]autolink_bare_uris\f[], \f[C]intraword_underscores\f[], \f[C]strikeout\f[], \f[C]hard_line_breaks\f[], \f[C]emoji\f[], \f[C]shortcut_reference_links\f[], \f[C]angle_brackets_escapable\f[]. -.IP -.nf -\f[C] -These\ can\ all\ be\ individually\ disabled.\ Note,\ however,\ that -`commonmark`\ and\ `gfm`\ have\ limited\ support\ for\ extensions: -extensions\ other\ than\ those\ listed\ above\ (and\ `smart`\ and -`raw_tex`)\ will\ have\ no\ effect\ on\ `commonmark`\ or\ `gfm`. -And\ `raw_tex`\ only\ affects\ `gfm`\ output,\ not\ input. 
-\f[] -.fi -.SS Extensions with formats other than Markdown -.PP -Some of the extensions discussed above can be used with formats other -than Markdown: -.IP \[bu] 2 -\f[C]auto_identifiers\f[] can be used with \f[C]latex\f[], \f[C]rst\f[], -\f[C]mediawiki\f[], and \f[C]textile\f[] input (and is used by default). -.IP \[bu] 2 -\f[C]tex_math_dollars\f[], \f[C]tex_math_single_backslash\f[], and -\f[C]tex_math_double_backslash\f[] can be used with \f[C]html\f[] input. -(This is handy for reading web pages formatted using MathJax, for -example.) +.RS +.RE .SH PRODUCING SLIDE SHOWS WITH PANDOC .PP You can use pandoc to produce an HTML + JavaScript slide presentation @@ -5299,70 +5486,6 @@ For example: </audio> \f[] .fi -.SH LITERATE HASKELL SUPPORT -.PP -If you append \f[C]+lhs\f[] (or \f[C]+literate_haskell\f[]) to an -appropriate input or output format (\f[C]markdown\f[], -\f[C]markdown_strict\f[], \f[C]rst\f[], or \f[C]latex\f[] for input or -output; \f[C]beamer\f[], \f[C]html4\f[] or \f[C]html5\f[] for output -only), pandoc will treat the document as literate Haskell source. -This means that -.IP \[bu] 2 -In Markdown input, "bird track" sections will be parsed as Haskell code -rather than block quotations. -Text between \f[C]\\begin{code}\f[] and \f[C]\\end{code}\f[] will also -be treated as Haskell code. -For ATX\-style headers the character \[aq]=\[aq] will be used instead of -\[aq]#\[aq]. -.IP \[bu] 2 -In Markdown output, code blocks with classes \f[C]haskell\f[] and -\f[C]literate\f[] will be rendered using bird tracks, and block -quotations will be indented one space, so they will not be treated as -Haskell code. -In addition, headers will be rendered setext\-style (with underlines) -rather than ATX\-style (with \[aq]#\[aq] characters). -(This is because ghc treats \[aq]#\[aq] characters in column 1 as -introducing line numbers.) -.IP \[bu] 2 -In restructured text input, "bird track" sections will be parsed as -Haskell code. 
-.IP \[bu] 2 -In restructured text output, code blocks with class \f[C]haskell\f[] -will be rendered using bird tracks. -.IP \[bu] 2 -In LaTeX input, text in \f[C]code\f[] environments will be parsed as -Haskell code. -.IP \[bu] 2 -In LaTeX output, code blocks with class \f[C]haskell\f[] will be -rendered inside \f[C]code\f[] environments. -.IP \[bu] 2 -In HTML output, code blocks with class \f[C]haskell\f[] will be rendered -with class \f[C]literatehaskell\f[] and bird tracks. -.PP -Examples: -.IP -.nf -\f[C] -pandoc\ \-f\ markdown+lhs\ \-t\ html -\f[] -.fi -.PP -reads literate Haskell source formatted with Markdown conventions and -writes ordinary HTML (without bird tracks). -.IP -.nf -\f[C] -pandoc\ \-f\ markdown+lhs\ \-t\ html+lhs -\f[] -.fi -.PP -writes HTML with the Haskell code in bird tracks, so it can be copied -and pasted as literate Haskell source. -.PP -Note that GHC expects the bird tracks in the first column, so indentend -literate code blocks (e.g. -inside an itemized environment) will not be picked up by the Haskell -compiler. .SH SYNTAX HIGHLIGHTING .PP Pandoc will automatically highlight syntax in fenced code blocks that @@ -5395,26 +5518,26 @@ blocks and text using \f[C]div\f[]s and \f[C]span\f[]s, respectively. If you define a \f[C]div\f[] or \f[C]span\f[] with the attribute \f[C]custom\-style\f[], pandoc will apply your specified style to the contained elements. -So, for example, +So, for example using the \f[C]bracketed_spans\f[] syntax, .IP .nf \f[C] -<span\ custom\-style="Emphatically">Get\ out,</span>\ he\ said. +[Get\ out]{custom\-style="Emphatically"},\ he\ said. \f[] .fi .PP -would produce a docx file with "Get out," styled with character style +would produce a docx file with "Get out" styled with character style \f[C]Emphatically\f[]. 
-Similarly, +Similarly, using the \f[C]fenced_divs\f[] syntax, .IP .nf \f[C] Dickinson\ starts\ the\ poem\ simply: -<div\ custom\-style="Poetry"> +:::\ {custom\-style="Poetry"} |\ A\ Bird\ came\ down\ the\ Walk\-\-\- |\ He\ did\ not\ know\ I\ saw\-\-\- -</div> +::: \f[] .fi .PP diff --git a/pandoc.cabal b/pandoc.cabal index 5edf9cbd7..29442f7f8 100644 --- a/pandoc.cabal +++ b/pandoc.cabal @@ -1,5 +1,5 @@ name: pandoc -version: 2.0.5 +version: 2.0.6 cabal-version: >= 1.10 build-type: Custom license: GPL @@ -244,6 +244,7 @@ extra-source-files: test/tables.txt test/tables.fb2 test/tables.muse + test/tables.custom test/testsuite.txt test/writer.latex test/writer.context @@ -272,6 +273,7 @@ extra-source-files: test/writer.dokuwiki test/writer.zimwiki test/writer.muse + test/writer.custom test/writers-lang-and-dir.latex test/writers-lang-and-dir.context test/dokuwiki_inline_formatting.dokuwiki diff --git a/src/Text/Pandoc/App.hs b/src/Text/Pandoc/App.hs index df4bdc151..50464830b 100644 --- a/src/Text/Pandoc/App.hs +++ b/src/Text/Pandoc/App.hs @@ -58,6 +58,9 @@ import Data.Monoid import qualified Data.Set as Set import Data.Text (Text) import qualified Data.Text as T +import qualified Data.Text.Lazy as TL +import qualified Data.Text.Lazy.Encoding as TE +import qualified Data.Text.Encoding.Error as TE import Data.Yaml (decode) import qualified Data.Yaml as Yaml import GHC.Generics @@ -143,7 +146,7 @@ pdfWriterAndProg :: Maybe String -- ^ user-specified writer name -> IO (String, Maybe String) -- ^ IO (writerName, maybePdfEngineProg) pdfWriterAndProg mWriter mEngine = do let panErr msg = liftIO $ E.throwIO $ PandocAppError msg - case go (baseWriterName <$> mWriter) mEngine of + case go mWriter mEngine of Right (writ, prog) -> return (writ, Just prog) Left err -> panErr err where @@ -151,7 +154,7 @@ pdfWriterAndProg mWriter mEngine = do go (Just writer) Nothing = (writer,) <$> engineForWriter writer go Nothing (Just engine) = (,engine) <$> writerForEngine engine go (Just 
writer) (Just engine) = - case find (== (writer, engine)) engines of + case find (== (baseWriterName writer, engine)) engines of Just _ -> Right (writer, engine) Nothing -> Left $ "pdf-engine " ++ engine ++ " is not compatible with output format " ++ writer @@ -161,7 +164,7 @@ pdfWriterAndProg mWriter mEngine = do [] -> Left $ "pdf-engine " ++ eng ++ " not known" - engineForWriter w = case [e | (f,e) <- engines, f == w] of + engineForWriter w = case [e | (f,e) <- engines, f == baseWriterName w] of eng : _ -> Right eng [] -> Left $ "cannot produce pdf output from " ++ w @@ -513,7 +516,9 @@ convertWithOpts opts = do case res of Right pdf -> writeFnBinary outputFile pdf Left err' -> liftIO $ - E.throwIO $ PandocPDFError (UTF8.toStringLazy err') + E.throwIO $ PandocPDFError $ + TL.unpack (TE.decodeUtf8With TE.lenientDecode err') + Nothing -> do let htmlFormat = format `elem` ["html","html4","html5","s5","slidy", @@ -1584,15 +1589,17 @@ options = "" , Option "" ["list-extensions"] - (NoArg - (\_ -> do + (OptArg + (\arg _ -> do + let exts = getDefaultExtensions (fromMaybe "markdown" arg) let showExt x = drop 4 (show x) ++ - if extensionEnabled x pandocExtensions + if extensionEnabled x exts then " +" else " -" mapM_ (UTF8.hPutStrLn stdout . 
showExt) ([minBound..maxBound] :: [Extension]) - exitSuccess )) + exitSuccess ) + "FORMAT") "" , Option "" ["list-highlight-languages"] diff --git a/src/Text/Pandoc/Extensions.hs b/src/Text/Pandoc/Extensions.hs index bea293891..7fa75cdd9 100644 --- a/src/Text/Pandoc/Extensions.hs +++ b/src/Text/Pandoc/Extensions.hs @@ -321,6 +321,7 @@ getDefaultExtensions "org" = extensionsFromList getDefaultExtensions "html" = extensionsFromList [Ext_auto_identifiers, Ext_native_divs, + Ext_line_blocks, Ext_native_spans] getDefaultExtensions "html4" = getDefaultExtensions "html" getDefaultExtensions "html5" = getDefaultExtensions "html" diff --git a/src/Text/Pandoc/Lua/Module/Utils.hs b/src/Text/Pandoc/Lua/Module/Utils.hs index 3a3727355..35495dae1 100644 --- a/src/Text/Pandoc/Lua/Module/Utils.hs +++ b/src/Text/Pandoc/Lua/Module/Utils.hs @@ -30,10 +30,10 @@ module Text.Pandoc.Lua.Module.Utils ) where import Control.Applicative ((<|>)) -import Foreign.Lua (FromLuaStack, Lua, NumResults) +import Foreign.Lua (FromLuaStack, Lua, LuaInteger, NumResults) import Text.Pandoc.Definition (Pandoc, Meta, Block, Inline) import Text.Pandoc.Lua.StackInstances () -import Text.Pandoc.Lua.Util (addFunction) +import Text.Pandoc.Lua.Util (OrNil (OrNil), addFunction) import qualified Data.Digest.Pure.SHA as SHA import qualified Data.ByteString.Lazy as BSL @@ -44,15 +44,32 @@ import qualified Text.Pandoc.Shared as Shared pushModule :: Lua NumResults pushModule = do Lua.newtable + addFunction "hierarchicalize" hierarchicalize + addFunction "normalize_date" normalizeDate addFunction "sha1" sha1 addFunction "stringify" stringify + addFunction "to_roman_numeral" toRomanNumeral return 1 +-- | Convert list of Pandoc blocks into (hierarchical) list of Elements. +hierarchicalize :: [Block] -> Lua [Shared.Element] +hierarchicalize = return . Shared.hierarchicalize + +-- | Parse a date and convert (if possible) to "YYYY-MM-DD" format. 
We +-- limit years to the range 1601-9999 (ISO 8601 accepts greater than +-- or equal to 1583, but MS Word only accepts dates starting 1601). +-- Returns nil instead of a string if the conversion failed. +normalizeDate :: String -> Lua (OrNil String) +normalizeDate = return . OrNil . Shared.normalizeDate + -- | Calculate the hash of the given contents. sha1 :: BSL.ByteString -> Lua String sha1 = return . SHA.showDigest . SHA.sha1 +-- | Convert pandoc structure to a string with formatting removed. +-- Footnotes are skipped (since we don't want their contents in link +-- labels). stringify :: AstElement -> Lua String stringify el = return $ case el of PandocElement pd -> Shared.stringify pd @@ -77,3 +94,7 @@ instance FromLuaStack AstElement where Right x -> return x Left _ -> Lua.throwLuaError "Expected an AST element, but could not parse value as such." + +-- | Convert a number < 4000 to uppercase roman numeral. +toRomanNumeral :: LuaInteger -> Lua String +toRomanNumeral = return . Shared.toRomanNumeral . fromIntegral diff --git a/src/Text/Pandoc/Lua/StackInstances.hs b/src/Text/Pandoc/Lua/StackInstances.hs index ce6dbdb98..119946b78 100644 --- a/src/Text/Pandoc/Lua/StackInstances.hs +++ b/src/Text/Pandoc/Lua/StackInstances.hs @@ -33,13 +33,15 @@ StackValue instances for pandoc types. 
module Text.Pandoc.Lua.StackInstances () where import Control.Applicative ((<|>)) +import Control.Monad (when) import Foreign.Lua (FromLuaStack (peek), Lua, LuaInteger, LuaNumber, StackIndex, ToLuaStack (push), Type (..), throwLuaError, tryLua) import Text.Pandoc.Definition import Text.Pandoc.Lua.Util (adjustIndexBy, getTable, pushViaConstructor) -import Text.Pandoc.Shared (safeRead) +import Text.Pandoc.Shared (Element (Blk, Sec), safeRead) import qualified Foreign.Lua as Lua +import qualified Text.Pandoc.Lua.Util as LuaUtil instance ToLuaStack Pandoc where push (Pandoc meta blocks) = @@ -306,3 +308,27 @@ instance ToLuaStack LuaAttr where instance FromLuaStack LuaAttr where peek idx = LuaAttr <$> peek idx + +-- +-- Hierarchical elements +-- +instance ToLuaStack Element where + push (Blk blk) = push blk + push (Sec lvl num attr label contents) = do + Lua.newtable + LuaUtil.addValue "level" lvl + LuaUtil.addValue "numbering" num + LuaUtil.addValue "attr" (LuaAttr attr) + LuaUtil.addValue "label" label + LuaUtil.addValue "contents" contents + pushSecMetaTable + Lua.setmetatable (-2) + where + pushSecMetaTable :: Lua () + pushSecMetaTable = do + inexistant <- Lua.newmetatable "PandocElementSec" + when inexistant $ do + LuaUtil.addValue "t" "Sec" + Lua.push "__index" + Lua.pushvalue (-2) + Lua.rawset (-3) diff --git a/src/Text/Pandoc/Lua/Util.hs b/src/Text/Pandoc/Lua/Util.hs index 28d09d339..1f7664fc0 100644 --- a/src/Text/Pandoc/Lua/Util.hs +++ b/src/Text/Pandoc/Lua/Util.hs @@ -125,6 +125,10 @@ instance FromLuaStack a => FromLuaStack (OrNil a) where then return (OrNil Nothing) else OrNil . Just <$> Lua.peek idx +instance ToLuaStack a => ToLuaStack (OrNil a) where + push (OrNil Nothing) = Lua.pushnil + push (OrNil (Just x)) = Lua.push x + -- | Helper class for pushing a single value to the stack via a lua function. -- See @pushViaCall@. 
class PushViaCall a where diff --git a/src/Text/Pandoc/MIME.hs b/src/Text/Pandoc/MIME.hs index fb85910bb..eba8d512f 100644 --- a/src/Text/Pandoc/MIME.hs +++ b/src/Text/Pandoc/MIME.hs @@ -325,6 +325,7 @@ mimeTypesList = -- List borrowed from happstack-server. ,("ogv","video/ogg") ,("ogx","application/ogg") ,("old","application/x-trash") + ,("opus","audio/ogg") ,("otg","application/vnd.oasis.opendocument.graphics-template") ,("oth","application/vnd.oasis.opendocument.text-web") ,("otp","application/vnd.oasis.opendocument.presentation-template") diff --git a/src/Text/Pandoc/Readers/Docx/Parse.hs b/src/Text/Pandoc/Readers/Docx/Parse.hs index 99e6f99e6..48a512be2 100644 --- a/src/Text/Pandoc/Readers/Docx/Parse.hs +++ b/src/Text/Pandoc/Readers/Docx/Parse.hs @@ -73,6 +73,7 @@ import Text.TeXMath (Exp) import Text.TeXMath.Readers.OMML (readOMML) import Text.TeXMath.Unicode.Fonts (Font (..), getUnicode, stringToFont) import Text.XML.Light +import qualified Text.XML.Light.Cursor as XMLC data ReaderEnv = ReaderEnv { envNotes :: Notes , envComments :: Comments @@ -117,6 +118,32 @@ mapD f xs = in concatMapM handler xs +unwrapSDT :: NameSpaces -> Content -> Content +unwrapSDT ns (Elem element) + | isElem ns "w" "sdt" element + , Just sdtContent <- findChildByName ns "w" "sdtContent" element + , child : _ <- elChildren sdtContent + = Elem child +unwrapSDT _ content = content + +walkDocument' :: NameSpaces -> XMLC.Cursor -> XMLC.Cursor +walkDocument' ns cur = + let modifiedCur = XMLC.modifyContent (unwrapSDT ns) cur + in + case XMLC.nextDF modifiedCur of + Just cur' -> walkDocument' ns cur' + Nothing -> XMLC.root modifiedCur + +walkDocument :: NameSpaces -> Element -> Maybe Element +walkDocument ns element = + let cur = XMLC.fromContent (Elem element) + cur' = walkDocument' ns cur + in + case XMLC.toTree cur' of + Elem element' -> Just element' + _ -> Nothing + + data Docx = Docx Document deriving Show @@ -298,7 +325,10 @@ archiveToDocument zf = do docElem <- maybeToD $ 
(parseXMLDoc . UTF8.toStringLazy . fromEntry) entry let namespaces = elemToNameSpaces docElem bodyElem <- maybeToD $ findChildByName namespaces "w" "body" docElem - body <- elemToBody namespaces bodyElem + let bodyElem' = case walkDocument namespaces bodyElem of + Just e -> e + Nothing -> bodyElem + body <- elemToBody namespaces bodyElem' return $ Document namespaces body elemToBody :: NameSpaces -> Element -> D Body diff --git a/src/Text/Pandoc/Readers/HTML.hs b/src/Text/Pandoc/Readers/HTML.hs index 3e59c4bf7..05a80335a 100644 --- a/src/Text/Pandoc/Readers/HTML.hs +++ b/src/Text/Pandoc/Readers/HTML.hs @@ -51,7 +51,7 @@ import Data.Char (isAlphaNum, isDigit, isLetter) import Data.Default (Default (..), def) import Data.Foldable (for_) import Data.List (intercalate, isPrefixOf) -import Data.List.Split (wordsBy) +import Data.List.Split (wordsBy, splitWhen) import qualified Data.Map as M import Data.Maybe (fromMaybe, isJust, isNothing) import Data.Monoid (First (..), (<>)) @@ -66,6 +66,7 @@ import qualified Text.Pandoc.Builder as B import Text.Pandoc.Class (PandocMonad (..)) import Text.Pandoc.CSS (foldOrElse, pickStyleAttrProps) import Text.Pandoc.Definition +import Text.Pandoc.Extensions (Extension(..)) import Text.Pandoc.Error import Text.Pandoc.Logging import Text.Pandoc.Options ( @@ -191,6 +192,7 @@ block = do , pHtml , pHead , pBody + , pLineBlock , pDiv , pPlain , pFigure @@ -377,6 +379,16 @@ pRawTag = do then return mempty else return $ renderTags' [tag] +pLineBlock :: PandocMonad m => TagParser m Blocks +pLineBlock = try $ do + guardEnabled Ext_line_blocks + _ <- pSatisfy $ tagOpen (=="div") (== [("class","line-block")]) + ils <- trimInlines . 
mconcat <$> manyTill inline (pSatisfy (tagClose (=="div"))) + let lns = map B.fromList $ + splitWhen (== LineBreak) $ filter (/= SoftBreak) $ + B.toList ils + return $ B.lineBlock lns + pDiv :: PandocMonad m => TagParser m Blocks pDiv = try $ do guardEnabled Ext_native_divs diff --git a/src/Text/Pandoc/Readers/JATS.hs b/src/Text/Pandoc/Readers/JATS.hs index 851fbec35..9223db68c 100644 --- a/src/Text/Pandoc/Readers/JATS.hs +++ b/src/Text/Pandoc/Readers/JATS.hs @@ -5,6 +5,7 @@ import Data.Char (isDigit, isSpace, toUpper) import Data.Default import Data.Generics import Data.List (intersperse) +import qualified Data.Map as Map import Data.Maybe (maybeToList, fromMaybe) import Data.Text (Text) import qualified Data.Text as T @@ -23,7 +24,6 @@ type JATS m = StateT JATSState m data JATSState = JATSState{ jatsSectionLevel :: Int , jatsQuoteType :: QuoteType , jatsMeta :: Meta - , jatsAcceptsMeta :: Bool , jatsBook :: Bool , jatsFigureTitle :: Inlines , jatsContent :: [Content] @@ -33,7 +33,6 @@ instance Default JATSState where def = JATSState{ jatsSectionLevel = 0 , jatsQuoteType = DoubleQuote , jatsMeta = mempty - , jatsAcceptsMeta = False , jatsBook = False , jatsFigureTitle = mempty , jatsContent = [] } @@ -79,19 +78,6 @@ named s e = qName (elName e) == s -- -acceptingMetadata :: PandocMonad m => JATS m a -> JATS m a -acceptingMetadata p = do - modify (\s -> s { jatsAcceptsMeta = True } ) - res <- p - modify (\s -> s { jatsAcceptsMeta = False }) - return res - -checkInMeta :: (PandocMonad m, Monoid a) => JATS m () -> JATS m a -checkInMeta p = do - accepts <- jatsAcceptsMeta <$> get - when accepts p - return mempty - addMeta :: PandocMonad m => ToMetaValue a => String -> a -> JATS m () addMeta field val = modify (setMeta field val) @@ -179,18 +165,16 @@ parseBlock (Elem e) = <$> listitems "def-list" -> definitionList <$> deflistitems "sec" -> gets jatsSectionLevel >>= sect . 
(+1) - "title" -> return mempty - "title-group" -> checkInMeta getTitle "graphic" -> para <$> getGraphic e - "journal-meta" -> metaBlock - "article-meta" -> metaBlock - "custom-meta" -> metaBlock + "journal-meta" -> parseMetadata e + "article-meta" -> parseMetadata e + "custom-meta" -> parseMetadata e + "title" -> return mempty -- processed by header "table" -> parseTable "fig" -> divWith (attrValue "id" e, ["fig"], []) <$> getBlocks e "table-wrap" -> divWith (attrValue "id" e, ["table-wrap"], []) <$> getBlocks e "caption" -> divWith (attrValue "id" e, ["caption"], []) <$> sect 6 - "ref-list" -> divWith ("refs", [], []) <$> getBlocks e - "ref" -> divWith ("ref-" <> attrValue "id" e, [], []) <$> getBlocks e + "ref-list" -> parseRefList e "?xml" -> return mempty _ -> getBlocks e where parseMixed container conts = do @@ -231,16 +215,6 @@ parseBlock (Elem e) = terms' <- mapM getInlines terms items' <- mapM getBlocks items return (mconcat $ intersperse (str "; ") terms', items') - getTitle = do - tit <- case filterChild (named "article-title") e of - Just s -> getInlines s - Nothing -> return mempty - subtit <- case filterChild (named "subtitle") e of - Just s -> (text ": " <>) <$> - getInlines s - Nothing -> return mempty - addMeta "title" (tit <> subtit) - parseTable = do let isCaption x = named "title" x || named "caption" x caption <- case filterChild isCaption e of @@ -305,13 +279,127 @@ parseBlock (Elem e) = let ident = attrValue "id" e modify $ \st -> st{ jatsSectionLevel = oldN } return $ headerWith (ident,[],[]) n' headerText <> b --- lineItems = mapM getInlines $ filterChildren (named "line") e - metaBlock = acceptingMetadata (getBlocks e) >> return mempty getInlines :: PandocMonad m => Element -> JATS m Inlines getInlines e' = (trimInlines . 
mconcat) <$> mapM parseInline (elContent e') +parseMetadata :: PandocMonad m => Element -> JATS m Blocks +parseMetadata e = do + getTitle e + getAuthors e + getAffiliations e + return mempty + +getTitle :: PandocMonad m => Element -> JATS m () +getTitle e = do + tit <- case filterElement (named "article-title") e of + Just s -> getInlines s + Nothing -> return mempty + subtit <- case filterElement (named "subtitle") e of + Just s -> (text ": " <>) <$> + getInlines s + Nothing -> return mempty + when (tit /= mempty) $ addMeta "title" tit + when (subtit /= mempty) $ addMeta "subtitle" subtit + +getAuthors :: PandocMonad m => Element -> JATS m () +getAuthors e = do + authors <- mapM getContrib $ filterElements + (\x -> named "contrib" x && + attrValue "contrib-type" x == "author") e + authorNotes <- mapM getInlines $ filterElements (named "author-notes") e + let authors' = case (reverse authors, authorNotes) of + ([], _) -> [] + (_, []) -> authors + (a:as, ns) -> reverse as ++ [a <> mconcat ns] + unless (null authors) $ addMeta "author" authors' + +getAffiliations :: PandocMonad m => Element -> JATS m () +getAffiliations x = do + affs <- mapM getInlines $ filterChildren (named "aff") x + unless (null affs) $ addMeta "institute" affs + +getContrib :: PandocMonad m => Element -> JATS m Inlines +getContrib x = do + given <- maybe (return mempty) getInlines + $ filterElement (named "given-names") x + family <- maybe (return mempty) getInlines + $ filterElement (named "surname") x + if given == mempty && family == mempty + then return mempty + else if given == mempty || family == mempty + then return $ given <> family + else return $ given <> space <> family + +parseRefList :: PandocMonad m => Element -> JATS m Blocks +parseRefList e = do + refs <- mapM parseRef $ filterChildren (named "ref") e + addMeta "references" refs + return mempty + +parseRef :: PandocMonad m + => Element -> JATS m (Map.Map String MetaValue) +parseRef e = do + let refId = text $ attrValue "id" e + 
let getInlineText n = maybe (return mempty) getInlines . filterChild (named n) + case filterChild (named "element-citation") e of + Just c -> do + let refType = text $ + case attrValue "publication-type" c of + "journal" -> "article-journal" + x -> x + (refTitle, refContainerTitle) <- do + t <- getInlineText "article-title" c + ct <- getInlineText "source" c + if t == mempty + then return (ct, mempty) + else return (t, ct) + refLabel <- getInlineText "label" c + refYear <- getInlineText "year" c + refVolume <- getInlineText "volume" c + refFirstPage <- getInlineText "fpage" c + refLastPage <- getInlineText "lpage" c + refPublisher <- getInlineText "publisher-name" c + refPublisherPlace <- getInlineText "publisher-loc" c + let refPages = refFirstPage <> (if refLastPage == mempty + then mempty + else text "\x2013" <> refLastPage) + let personGroups' = filterChildren (named "person-group") c + let getName nm = do + given <- maybe (return mempty) getInlines + $ filterChild (named "given-names") nm + family <- maybe (return mempty) getInlines + $ filterChild (named "surname") nm + return $ toMetaValue $ Map.fromList [ + ("given", given) + , ("family", family) + ] + personGroups <- mapM (\pg -> + do names <- mapM getName + (filterChildren (named "name") pg) + return (attrValue "person-group-type" pg, + toMetaValue names)) + personGroups' + return $ Map.fromList $ + [ ("id", toMetaValue refId) + , ("type", toMetaValue refType) + , ("title", toMetaValue refTitle) + , ("container-title", toMetaValue refContainerTitle) + , ("publisher", toMetaValue refPublisher) + , ("publisher-place", toMetaValue refPublisherPlace) + , ("title", toMetaValue refTitle) + , ("issued", toMetaValue + $ Map.fromList [ + ("year", refYear) + ]) + , ("volume", toMetaValue refVolume) + , ("page", toMetaValue refPages) + , ("citation-label", toMetaValue refLabel) + ] ++ personGroups + Nothing -> return $ Map.insert "id" (toMetaValue refId) mempty + -- TODO handle mixed-citation + strContentRecursive 
:: Element -> String strContentRecursive = strContent . (\e' -> e'{ elContent = map elementToStr $ elContent e' }) @@ -354,7 +442,15 @@ parseInline (Elem e) = let rid = attrValue "rid" e let refType = ("ref-type",) <$> maybeAttrValue "ref-type" e let attr = (attrValue "id" e, [], maybeToList refType) - return $ linkWith attr ('#' : rid) "" ils + return $ if refType == Just ("ref-type","bibr") + then cite [Citation{ + citationId = rid + , citationPrefix = [] + , citationSuffix = [] + , citationMode = NormalCitation + , citationNoteNum = 0 + , citationHash = 0}] ils + else linkWith attr ('#' : rid) "" ils "ext-link" -> do ils <- innerInlines let title = fromMaybe "" $ findAttr (QName "title" (Just "http://www.w3.org/1999/xlink") Nothing) e @@ -375,9 +471,6 @@ parseInline (Elem e) = "uri" -> return $ link (strContent e) "" $ str $ strContent e "fn" -> (note . mconcat) <$> mapM parseBlock (elContent e) - -- Note: this isn't a real docbook tag; it's what we convert - -- <?asciidor-br?> to in handleInstructions, above. A kludge to - -- work around xml-light's inability to parse an instruction. _ -> innerInlines where innerInlines = (trimInlines . mconcat) <$> mapM parseInline (elContent e) diff --git a/src/Text/Pandoc/Readers/LaTeX.hs b/src/Text/Pandoc/Readers/LaTeX.hs index f7e45e01a..6c5567ffd 100644 --- a/src/Text/Pandoc/Readers/LaTeX.hs +++ b/src/Text/Pandoc/Readers/LaTeX.hs @@ -1489,8 +1489,17 @@ inlineCommands = M.union inlineLanguageCommands $ M.fromList $ -- biblatex misc , ("RN", romanNumeralUpper) , ("Rn", romanNumeralLower) + -- babel + , ("foreignlanguage", foreignlanguage) ] +foreignlanguage :: PandocMonad m => LP m Inlines +foreignlanguage = do + babelLang <- T.unpack . 
untokenize <$> braced + case babelLangToBCP47 babelLang of + Just lang -> spanWith ("", [], [("lang", renderLang $ lang)]) <$> tok + _ -> tok + inlineLanguageCommands :: PandocMonad m => M.Map Text (LP m Inlines) inlineLanguageCommands = M.fromList $ mk <$> M.toList polyglossiaLangToBCP47 where @@ -2655,3 +2664,24 @@ polyglossiaLangToBCP47 = M.fromList , ("urdu", \_ -> Lang "ur" "" "" []) , ("vietnamese", \_ -> Lang "vi" "" "" []) ] + +babelLangToBCP47 :: String -> Maybe Lang +babelLangToBCP47 s = + case s of + "austrian" -> Just $ Lang "de" "" "AT" ["1901"] + "naustrian" -> Just $ Lang "de" "" "AT" [] + "swissgerman" -> Just $ Lang "de" "" "CH" ["1901"] + "nswissgerman" -> Just $ Lang "de" "" "CH" [] + "german" -> Just $ Lang "de" "" "DE" ["1901"] + "ngerman" -> Just $ Lang "de" "" "DE" [] + "lowersorbian" -> Just $ Lang "dsb" "" "" [] + "uppersorbian" -> Just $ Lang "hsb" "" "" [] + "polutonikogreek" -> Just $ Lang "el" "" "" ["polyton"] + "slovene" -> Just $ Lang "sl" "" "" [] + "australian" -> Just $ Lang "en" "" "AU" [] + "canadian" -> Just $ Lang "en" "" "CA" [] + "british" -> Just $ Lang "en" "" "GB" [] + "newzealand" -> Just $ Lang "en" "" "NZ" [] + "american" -> Just $ Lang "en" "" "US" [] + "classiclatin" -> Just $ Lang "la" "" "" ["x-classic"] + _ -> fmap ($ "") $ M.lookup s polyglossiaLangToBCP47 diff --git a/src/Text/Pandoc/Readers/RST.hs b/src/Text/Pandoc/Readers/RST.hs index 6b5d0a331..9f259d958 100644 --- a/src/Text/Pandoc/Readers/RST.hs +++ b/src/Text/Pandoc/Readers/RST.hs @@ -547,7 +547,7 @@ bulletListStart :: Monad m => ParserT [Char] st m Int bulletListStart = try $ do notFollowedBy' hrule -- because hrules start out just like lists marker <- oneOf bulletListMarkers - white <- many1 spaceChar + white <- many1 spaceChar <|> "" <$ lookAhead (char '\n') return $ length (marker:white) -- parses ordered list start and returns its length (inc following whitespace) @@ -556,7 +556,7 @@ orderedListStart :: Monad m => ListNumberStyle -> RSTParser m Int 
orderedListStart style delim = try $ do (_, markerLen) <- withHorizDisplacement (orderedListMarker style delim) - white <- many1 spaceChar + white <- many1 spaceChar <|> "" <$ lookAhead (char '\n') return $ markerLen + length white -- parse a line of a list item diff --git a/src/Text/Pandoc/Writers/Custom.hs b/src/Text/Pandoc/Writers/Custom.hs index 72f443ed0..a33196cbe 100644 --- a/src/Text/Pandoc/Writers/Custom.hs +++ b/src/Text/Pandoc/Writers/Custom.hs @@ -87,6 +87,15 @@ instance ToLuaStack (Stringify Citation) where addValue "citationNoteNum" $ citationNoteNum cit addValue "citationHash" $ citationHash cit +-- | Key-value pair, pushed as a table with @a@ as the only key and @v@ as the +-- associated value. +newtype KeyValue a b = KeyValue (a, b) + +instance (ToLuaStack a, ToLuaStack b) => ToLuaStack (KeyValue a b) where + push (KeyValue (k, v)) = do + newtable + addValue k v + data PandocLuaException = PandocLuaException String deriving (Show, Typeable) @@ -102,8 +111,7 @@ writeCustom luaFile opts doc@(Pandoc meta _) = do -- to handle this more gracefully): when (stat /= OK) $ tostring 1 >>= throw . PandocLuaException . UTF8.toString - call 0 0 - -- TODO - call hierarchicalize, so we have that info + -- TODO - call hierarchicalize, so we have that info rendered <- docToCustom opts doc context <- metaToJSON opts blockListToCustom @@ -166,7 +174,8 @@ blockToCustom (OrderedList (num,sty,delim) items) = callFunc "OrderedList" (map Stringify items) num (show sty) (show delim) blockToCustom (DefinitionList items) = - callFunc "DefinitionList" (map (Stringify *** map Stringify) items) + callFunc "DefinitionList" + (map (KeyValue . 
(Stringify *** map Stringify)) items) blockToCustom (Div attr items) = callFunc "Div" (Stringify items) (attrToMap attr) diff --git a/src/Text/Pandoc/Writers/HTML.hs b/src/Text/Pandoc/Writers/HTML.hs index f25bbadfb..7ff7284cc 100644 --- a/src/Text/Pandoc/Writers/HTML.hs +++ b/src/Text/Pandoc/Writers/HTML.hs @@ -670,8 +670,7 @@ blockToHtml opts (LineBlock lns) = if writerWrapText opts == WrapNone then blockToHtml opts $ linesToPara lns else do - let lf = preEscapedString "\n" - htmlLines <- mconcat . intersperse lf <$> mapM (inlineListToHtml opts) lns + htmlLines <- inlineListToHtml opts $ intercalate [LineBreak] lns return $ H.div ! A.class_ "line-block" $ htmlLines blockToHtml opts (Div attr@(ident, classes, kvs') bs) = do html5 <- gets stHtml5 diff --git a/src/Text/Pandoc/Writers/LaTeX.hs b/src/Text/Pandoc/Writers/LaTeX.hs index 666aea07c..d6ccc1512 100644 --- a/src/Text/Pandoc/Writers/LaTeX.hs +++ b/src/Text/Pandoc/Writers/LaTeX.hs @@ -398,10 +398,10 @@ elementToBeamer slideLevel (Sec lvl _num (ident,classes,kvs) tit elts) hasCode _ = [] let fragile = "fragile" `elem` classes || not (null $ query hasCodeBlock elts ++ query hasCode elts) - let frameoptions = ["allowdisplaybreaks", "allowframebreaks", + let frameoptions = ["allowdisplaybreaks", "allowframebreaks", "fragile", "b", "c", "t", "environment", "label", "plain", "shrink", "standout"] - let optionslist = ["fragile" | fragile] ++ + let optionslist = ["fragile" | fragile && lookup "fragile" kvs == Nothing] ++ [k | k <- classes, k `elem` frameoptions] ++ [k ++ "=" ++ v | (k,v) <- kvs, k `elem` frameoptions] let options = if null optionslist diff --git a/src/Text/Pandoc/Writers/Markdown.hs b/src/Text/Pandoc/Writers/Markdown.hs index 7a3d204f2..13572c466 100644 --- a/src/Text/Pandoc/Writers/Markdown.hs +++ b/src/Text/Pandoc/Writers/Markdown.hs @@ -305,22 +305,24 @@ escapeString opts (c:cs) = _ -> c : escapeString opts cs -- | Construct table of contents from list of header blocks. 
-tableOfContents :: PandocMonad m => WriterOptions -> [Block] -> m Doc -tableOfContents opts headers = - let contents = BulletList $ map (elementToListItem opts) $ hierarchicalize headers - in evalMD (blockToMarkdown opts contents) def def +tableOfContents :: PandocMonad m => WriterOptions -> [Block] -> MD m Doc +tableOfContents opts headers = do + contents <- BulletList <$> mapM (elementToListItem opts) (hierarchicalize headers) + blockToMarkdown opts contents -- | Converts an Element to a list item for a table of contents, -elementToListItem :: WriterOptions -> Element -> [Block] +elementToListItem :: PandocMonad m => WriterOptions -> Element -> MD m [Block] elementToListItem opts (Sec lev _nums (ident,_,_) headerText subsecs) - = Plain headerLink : - [ BulletList (map (elementToListItem opts) subsecs) | - not (null subsecs) && lev < writerTOCDepth opts ] - where headerLink = if null ident + = do isPlain <- asks envPlain + let headerLink = if null ident || isPlain then walk deNote headerText else [Link nullAttr (walk deNote headerText) ('#':ident, "")] -elementToListItem _ (Blk _) = [] + listContents <- if null subsecs || lev >= writerTOCDepth opts + then return [] + else mapM (elementToListItem opts) subsecs + return [Plain headerLink, BulletList listContents] +elementToListItem _ (Blk _) = return [] attrsToMarkdown :: Attr -> Doc attrsToMarkdown attribs = braces $ hsep [attribId, attribClasses, attribKeys] diff --git a/stack.pkg.yaml b/stack.pkg.yaml index fe6f0622c..450c7c3ae 100644 --- a/stack.pkg.yaml +++ b/stack.pkg.yaml @@ -13,4 +13,4 @@ flags: packages: - '.' 
extra-deps: [] -resolver: lts-10.0 +resolver: lts-10.1 diff --git a/stack.yaml b/stack.yaml index c8e1990d7..e04468cfc 100644 --- a/stack.yaml +++ b/stack.yaml @@ -6,4 +6,4 @@ flags: network-uri: true packages: extra-deps: [] -resolver: lts-10.0 +resolver: lts-10.1 diff --git a/test/Tests/Lua.hs b/test/Tests/Lua.hs index 57e7c5f0c..6f495a3ca 100644 --- a/test/Tests/Lua.hs +++ b/test/Tests/Lua.hs @@ -96,12 +96,15 @@ tests = map (localOption (QuickCheckTests 20)) assertFilterConversion "pandoc.utils doesn't work as expected." "test-pandoc-utils.lua" (doc $ para "doesn't matter") - (doc $ mconcat [ plain (str "sha1: OK") + (doc $ mconcat [ plain (str "hierarchicalize: OK") + , plain (str "normalize_date: OK") , plain (str "pipe: OK") , plain (str "failing pipe: OK") , plain (str "read: OK") , plain (str "failing read: OK") + , plain (str "sha1: OK") , plain (str "stringify: OK") + , plain (str "to_roman_numeral: OK") ]) ] diff --git a/test/Tests/Old.hs b/test/Tests/Old.hs index bbd51ee98..b82251a56 100644 --- a/test/Tests/Old.hs +++ b/test/Tests/Old.hs @@ -162,6 +162,12 @@ tests = [ testGroup "markdown" [ test "reader" ["-r", "creole", "-w", "native", "-s"] "creole-reader.txt" "creole-reader.native" ] + , testGroup "custom writer" + [ test "basic" ["-f", "native", "-t", "../data/sample.lua"] + "testsuite.native" "writer.custom" + , test "tables" ["-f", "native", "-t", "../data/sample.lua"] + "tables.native" "tables.custom" + ] ] -- makes sure file is fully closed after reading diff --git a/test/Tests/Readers/Docx.hs b/test/Tests/Readers/Docx.hs index 6d91c36ae..5710a388f 100644 --- a/test/Tests/Readers/Docx.hs +++ b/test/Tests/Readers/Docx.hs @@ -171,6 +171,10 @@ tests = [ testGroup "inlines" "inline code in subscript and superscript" "docx/verbatim_subsuper.docx" "docx/verbatim_subsuper.native" + , testCompare + "inlines inside of Structured Document Tags" + "docx/sdt_elements.docx" + "docx/sdt_elements.native" ] , testGroup "blocks" [ testCompare diff --git 
a/test/command/4162.md b/test/command/4162.md new file mode 100644 index 000000000..d88e1ec4e --- /dev/null +++ b/test/command/4162.md @@ -0,0 +1,10 @@ +``` +% pandoc -f html -t native +<div class="line-block">hi<br /><br> + there</div> +^D +[LineBlock + [[Str "hi"] + ,[] + ,[Str "\160there"]]] +``` diff --git a/test/command/4193.md b/test/command/4193.md new file mode 100644 index 000000000..44c7d70cc --- /dev/null +++ b/test/command/4193.md @@ -0,0 +1,10 @@ +``` +% pandoc -f rst -t native +- + a +- b +^D +[BulletList + [[Plain [Str "a"]] + ,[Plain [Str "b"]]]] +``` diff --git a/test/command/4199.md b/test/command/4199.md new file mode 100644 index 000000000..49d2bdbcb --- /dev/null +++ b/test/command/4199.md @@ -0,0 +1,6 @@ +``` +% pandoc -f latex -t native +\foreignlanguage{ngerman}{foo} +^D +[Para [Span ("",[],[("lang","de-DE")]) [Str "foo"]]] +``` diff --git a/test/docx/sdt_elements.docx b/test/docx/sdt_elements.docx Binary files differnew file mode 100644 index 000000000..9356a6b40 --- /dev/null +++ b/test/docx/sdt_elements.docx diff --git a/test/docx/sdt_elements.native b/test/docx/sdt_elements.native new file mode 100644 index 000000000..7f7768728 --- /dev/null +++ b/test/docx/sdt_elements.native @@ -0,0 +1,10 @@ +[Table [] [AlignDefault,AlignDefault,AlignDefault] [0.0,0.0,0.0] + [[] + ,[] + ,[]] + [[[Plain [Strong [Str "col1Header"]]] + ,[Plain [Strong [Str "col2Header"]]] + ,[Plain [Strong [Str "col3Header"]]]] + ,[[Plain [Str "col1",Space,Str "content"]] + ,[Plain [Str "Body",Space,Str "copy"]] + ,[Plain [Str "col3",Space,Str "content"]]]]] diff --git a/test/jats-reader.native b/test/jats-reader.native index 2bc8b94ce..a7c349149 100644 --- a/test/jats-reader.native +++ b/test/jats-reader.native @@ -1,4 +1,4 @@ -Pandoc (Meta {unMeta = fromList [("title",MetaInlines [Str "Pandoc",Space,Str "Test",Space,Str "Suite"])]}) +Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Space,Str "MacFarlane"]]),("title",MetaInlines [Str 
"Pandoc",Space,Str "Test",Space,Str "Suite"])]}) [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "set",Space,Str "of",Space,Str "tests",Space,Str "for",Space,Str "pandoc.",Space,Str "Most",Space,Str "of",Space,Str "them",Space,Str "are",Space,Str "adapted",Space,Str "from",Space,Str "John",SoftBreak,Str "Gruber's",Space,Str "markdown",Space,Str "test",Space,Str "suite."] ,Header 1 ("headers",[],[]) [Str "Headers"] ,Header 2 ("level-2-with-an-embedded-link",[],[]) [Str "Level",Space,Str "2",Space,Str "with",Space,Str "an",SoftBreak,Link ("",[],[]) [Str "embedded",SoftBreak,Str "link"] ("/url","")] diff --git a/test/jats-reader.xml b/test/jats-reader.xml index eb06fcc22..f98caa46e 100644 --- a/test/jats-reader.xml +++ b/test/jats-reader.xml @@ -14,6 +14,18 @@ <title-group> <article-title>Pandoc Test Suite</article-title> </title-group> +<contrib-group> + <contrib contrib-type="author"> + <name> + <surname>MacFarlane</surname> + <given-names>John</given-names> + </name> + <contrib contrib-type="author"> + <name> + <surname>Anonymous</surname> + </name> + </contrib> +</contrib-group> </article-meta> </front> <body> diff --git a/test/lhs-test.html b/test/lhs-test.html index 77f75f354..c9777ea7b 100644 --- a/test/lhs-test.html +++ b/test/lhs-test.html @@ -9,7 +9,6 @@ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} </style> <style type="text/css"> diff --git a/test/lhs-test.html+lhs b/test/lhs-test.html+lhs index a17941998..4a121e0d1 100644 --- a/test/lhs-test.html+lhs +++ b/test/lhs-test.html+lhs @@ -9,7 +9,6 @@ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} </style> <style type="text/css"> diff --git 
a/test/lua/test-pandoc-utils.lua b/test/lua/test-pandoc-utils.lua index ce3456d5d..c732d2f85 100644 --- a/test/lua/test-pandoc-utils.lua +++ b/test/lua/test-pandoc-utils.lua @@ -1,5 +1,20 @@ utils = require 'pandoc.utils' +-- hierarchicalize +------------------------------------------------------------------------ +function test_hierarchicalize () + local blks = { + pandoc.Header(1, {pandoc.Str 'First'}), + pandoc.Header(2, {pandoc.Str 'Second'}), + pandoc.Header(2, {pandoc.Str 'Third'}), + } + local hblks = utils.hierarchicalize(blks) + return hblks[1].t == "Sec" + and hblks[1].contents[1].t == "Sec" + and hblks[1].contents[2].numbering[1] == 1 + and hblks[1].contents[2].numbering[2] == 2 +end + -- SHA1 ------------------------------------------------------------------------ function test_sha1 () @@ -64,6 +79,21 @@ function test_stringify () return utils.stringify(inline) == 'Cogito ergo sum.' end +-- to_roman_numeral +------------------------------------------------------------------------ +function test_to_roman_numeral () + return utils.to_roman_numeral(1888) == 'MDCCCLXXXVIII' + -- calling with a string fails + and not pcall(utils.to_roman_numeral, 'not a number') +end + +-- normalize_date +------------------------------------------------------------------------ +function test_normalize_date () + return utils.normalize_date("12/31/2017") == '2017-12-31' + and utils.normalize_date("pandoc") == nil +end + -- Return result ------------------------------------------------------------------------ function run(fn) @@ -72,11 +102,14 @@ end function Para (el) return { - pandoc.Plain{pandoc.Str("sha1: " .. run(test_sha1))}, + pandoc.Plain{pandoc.Str("hierarchicalize: " .. run(test_hierarchicalize))}, + pandoc.Plain{pandoc.Str("normalize_date: " .. run(test_normalize_date))}, pandoc.Plain{pandoc.Str("pipe: " .. run(test_pipe))}, pandoc.Plain{pandoc.Str("failing pipe: " .. run(test_failing_pipe))}, pandoc.Plain{pandoc.Str("read: " .. 
run(test_read))}, pandoc.Plain{pandoc.Str("failing read: " .. run(test_failing_read))}, + pandoc.Plain{pandoc.Str("sha1: " .. run(test_sha1))}, pandoc.Plain{pandoc.Str("stringify: " .. run(test_stringify))}, + pandoc.Plain{pandoc.Str("to_roman_numeral: " .. run(test_to_roman_numeral))}, } end diff --git a/test/s5-basic.html b/test/s5-basic.html index f126ad1f4..b3b950327 100644 --- a/test/s5-basic.html +++ b/test/s5-basic.html @@ -14,7 +14,6 @@ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} </style> <!-- configuration parameters --> diff --git a/test/s5-fancy.html b/test/s5-fancy.html index 1befe4052..9f724af96 100644 --- a/test/s5-fancy.html +++ b/test/s5-fancy.html @@ -14,7 +14,6 @@ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} </style> <!-- configuration parameters --> diff --git a/test/s5-inserts.html b/test/s5-inserts.html index 6779e5b76..efde179d2 100644 --- a/test/s5-inserts.html +++ b/test/s5-inserts.html @@ -12,7 +12,6 @@ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} </style> <link rel="stylesheet" href="main.css" type="text/css" /> diff --git a/test/tables.custom b/test/tables.custom new file mode 100644 index 000000000..410b68d3f --- /dev/null +++ b/test/tables.custom @@ -0,0 +1,201 @@ +<p>Simple table with caption:</p> + +<table> +<caption>Demonstration of simple table syntax.</caption> +<tr class="header"> +<th align="right">Right</th> +<th align="left">Left</th> +<th align="center">Center</th> +<th align="left">Default</th> 
+</tr> +<tr class="odd"> +<td align="right">12</td> +<td align="left">12</td> +<td align="center">12</td> +<td align="left">12</td> +</tr> +<tr class="even"> +<td align="right">123</td> +<td align="left">123</td> +<td align="center">123</td> +<td align="left">123</td> +</tr> +<tr class="odd"> +<td align="right">1</td> +<td align="left">1</td> +<td align="center">1</td> +<td align="left">1</td> +</tr> +</table + +<p>Simple table without caption:</p> + +<table> +<tr class="header"> +<th align="right">Right</th> +<th align="left">Left</th> +<th align="center">Center</th> +<th align="left">Default</th> +</tr> +<tr class="odd"> +<td align="right">12</td> +<td align="left">12</td> +<td align="center">12</td> +<td align="left">12</td> +</tr> +<tr class="even"> +<td align="right">123</td> +<td align="left">123</td> +<td align="center">123</td> +<td align="left">123</td> +</tr> +<tr class="odd"> +<td align="right">1</td> +<td align="left">1</td> +<td align="center">1</td> +<td align="left">1</td> +</tr> +</table + +<p>Simple table indented two spaces:</p> + +<table> +<caption>Demonstration of simple table syntax.</caption> +<tr class="header"> +<th align="right">Right</th> +<th align="left">Left</th> +<th align="center">Center</th> +<th align="left">Default</th> +</tr> +<tr class="odd"> +<td align="right">12</td> +<td align="left">12</td> +<td align="center">12</td> +<td align="left">12</td> +</tr> +<tr class="even"> +<td align="right">123</td> +<td align="left">123</td> +<td align="center">123</td> +<td align="left">123</td> +</tr> +<tr class="odd"> +<td align="right">1</td> +<td align="left">1</td> +<td align="center">1</td> +<td align="left">1</td> +</tr> +</table + +<p>Multiline table with caption:</p> + +<table> +<caption>Here’s the caption. 
+It may span multiple lines.</caption> +<col width="15%" /> +<col width="14%" /> +<col width="16%" /> +<col width="34%" /> +<tr class="header"> +<th align="center">Centered +Header</th> +<th align="left">Left +Aligned</th> +<th align="right">Right +Aligned</th> +<th align="left">Default aligned</th> +</tr> +<tr class="odd"> +<td align="center">First</td> +<td align="left">row</td> +<td align="right">12.0</td> +<td align="left">Example of a row that spans +multiple lines.</td> +</tr> +<tr class="even"> +<td align="center">Second</td> +<td align="left">row</td> +<td align="right">5.0</td> +<td align="left">Here’s another one. Note +the blank line between rows.</td> +</tr> +</table + +<p>Multiline table without caption:</p> + +<table> +<col width="15%" /> +<col width="14%" /> +<col width="16%" /> +<col width="34%" /> +<tr class="header"> +<th align="center">Centered +Header</th> +<th align="left">Left +Aligned</th> +<th align="right">Right +Aligned</th> +<th align="left">Default aligned</th> +</tr> +<tr class="odd"> +<td align="center">First</td> +<td align="left">row</td> +<td align="right">12.0</td> +<td align="left">Example of a row that spans +multiple lines.</td> +</tr> +<tr class="even"> +<td align="center">Second</td> +<td align="left">row</td> +<td align="right">5.0</td> +<td align="left">Here’s another one. 
Note +the blank line between rows.</td> +</tr> +</table + +<p>Table without column headers:</p> + +<table> +<tr class="odd"> +<td align="right">12</td> +<td align="left">12</td> +<td align="center">12</td> +<td align="right">12</td> +</tr> +<tr class="even"> +<td align="right">123</td> +<td align="left">123</td> +<td align="center">123</td> +<td align="right">123</td> +</tr> +<tr class="odd"> +<td align="right">1</td> +<td align="left">1</td> +<td align="center">1</td> +<td align="right">1</td> +</tr> +</table + +<p>Multiline table without column headers:</p> + +<table> +<col width="15%" /> +<col width="14%" /> +<col width="16%" /> +<col width="34%" /> +<tr class="odd"> +<td align="center">First</td> +<td align="left">row</td> +<td align="right">12.0</td> +<td align="left">Example of a row that spans +multiple lines.</td> +</tr> +<tr class="even"> +<td align="center">Second</td> +<td align="left">row</td> +<td align="right">5.0</td> +<td align="left">Here’s another one. Note +the blank line between rows.</td> +</tr> +</table + diff --git a/test/writer.custom b/test/writer.custom new file mode 100644 index 000000000..b32d777de --- /dev/null +++ b/test/writer.custom @@ -0,0 +1,783 @@ +<p>This is a set of tests for pandoc. Most of them are adapted from +John Gruber’s markdown test suite.</p> + +<hr/> + +<h1 id="headers">Headers</h1> + +<h2 id="level-2-with-an-embedded-link">Level 2 with an <a href='/url' title=''>embedded link</a></h2> + +<h3 id="level-3-with-emphasis">Level 3 with <em>emphasis</em></h3> + +<h4 id="level-4">Level 4</h4> + +<h5 id="level-5">Level 5</h5> + +<h1 id="level-1">Level 1</h1> + +<h2 id="level-2-with-emphasis">Level 2 with <em>emphasis</em></h2> + +<h3 id="level-3">Level 3</h3> + +<p>with no blank line</p> + +<h2 id="level-2">Level 2</h2> + +<p>with no blank line</p> + +<hr/> + +<h1 id="paragraphs">Paragraphs</h1> + +<p>Here’s a regular paragraph.</p> + +<p>In Markdown 1.0.0 and earlier. Version +8. This line turns into a list item. 
+Because a hard-wrapped line in the +middle of a paragraph looked like a +list item.</p> + +<p>Here’s one with a bullet. +* criminey.</p> + +<p>There should be a hard line break<br/>here.</p> + +<hr/> + +<h1 id="block-quotes">Block Quotes</h1> + +<p>E-mail style:</p> + +<blockquote> +<p>This is a block quote. +It is pretty short.</p> +</blockquote> + +<blockquote> +<p>Code in a block quote:</p> + +<pre><code>sub status { + print "working"; +}</code></pre> + +<p>A list:</p> + +<ol> +<li>item one</li> +<li>item two</li> +</ol> + +<p>Nested block quotes:</p> + +<blockquote> +<p>nested</p> +</blockquote> + +<blockquote> +<p>nested</p> +</blockquote> +</blockquote> + +<p>This should not be a block quote: 2 +> 1.</p> + +<p>And a following paragraph.</p> + +<hr/> + +<h1 id="code-blocks">Code Blocks</h1> + +<p>Code:</p> + +<pre><code>---- (should be four hyphens) + +sub status { + print "working"; +} + +this code block is indented by one tab</code></pre> + +<p>And:</p> + +<pre><code> this code block is indented by two tabs + +These should not be escaped: \$ \\ \> \[ \{</code></pre> + +<hr/> + +<h1 id="lists">Lists</h1> + +<h2 id="unordered">Unordered</h2> + +<p>Asterisks tight:</p> + +<ul> +<li>asterisk 1</li> +<li>asterisk 2</li> +<li>asterisk 3</li> +</ul> + +<p>Asterisks loose:</p> + +<ul> +<li><p>asterisk 1</p></li> +<li><p>asterisk 2</p></li> +<li><p>asterisk 3</p></li> +</ul> + +<p>Pluses tight:</p> + +<ul> +<li>Plus 1</li> +<li>Plus 2</li> +<li>Plus 3</li> +</ul> + +<p>Pluses loose:</p> + +<ul> +<li><p>Plus 1</p></li> +<li><p>Plus 2</p></li> +<li><p>Plus 3</p></li> +</ul> + +<p>Minuses tight:</p> + +<ul> +<li>Minus 1</li> +<li>Minus 2</li> +<li>Minus 3</li> +</ul> + +<p>Minuses loose:</p> + +<ul> +<li><p>Minus 1</p></li> +<li><p>Minus 2</p></li> +<li><p>Minus 3</p></li> +</ul> + +<h2 id="ordered">Ordered</h2> + +<p>Tight:</p> + +<ol> +<li>First</li> +<li>Second</li> +<li>Third</li> +</ol> + +<p>and:</p> + +<ol> +<li>One</li> +<li>Two</li> +<li>Three</li> +</ol> + 
+<p>Loose using tabs:</p> + +<ol> +<li><p>First</p></li> +<li><p>Second</p></li> +<li><p>Third</p></li> +</ol> + +<p>and using spaces:</p> + +<ol> +<li><p>One</p></li> +<li><p>Two</p></li> +<li><p>Three</p></li> +</ol> + +<p>Multiple paragraphs:</p> + +<ol> +<li><p>Item 1, graf one.</p> + +<p>Item 1. graf two. The quick brown fox jumped over the lazy dog’s +back.</p></li> +<li><p>Item 2.</p></li> +<li><p>Item 3.</p></li> +</ol> + +<h2 id="nested">Nested</h2> + +<ul> +<li>Tab + +<ul> +<li>Tab + +<ul> +<li>Tab</li> +</ul></li> +</ul></li> +</ul> + +<p>Here’s another:</p> + +<ol> +<li>First</li> +<li>Second: + +<ul> +<li>Fee</li> +<li>Fie</li> +<li>Foe</li> +</ul></li> +<li>Third</li> +</ol> + +<p>Same thing but with paragraphs:</p> + +<ol> +<li><p>First</p></li> +<li><p>Second:</p> + +<ul> +<li>Fee</li> +<li>Fie</li> +<li>Foe</li> +</ul></li> +<li><p>Third</p></li> +</ol> + +<h2 id="tabs-and-spaces">Tabs and spaces</h2> + +<ul> +<li><p>this is a list item +indented with tabs</p></li> +<li><p>this is a list item +indented with spaces</p> + +<ul> +<li><p>this is an example list item +indented with tabs</p></li> +<li><p>this is an example list item +indented with spaces</p></li> +</ul></li> +</ul> + +<h2 id="fancy-list-markers">Fancy list markers</h2> + +<ol> +<li>begins with 2</li> +<li><p>and now 3</p> + +<p>with a continuation</p> + +<ol> +<li>sublist with roman numerals, +starting with 4</li> +<li>more items + +<ol> +<li>a subsublist</li> +<li>a subsublist</li> +</ol></li> +</ol></li> +</ol> + +<p>Nesting:</p> + +<ol> +<li>Upper Alpha + +<ol> +<li>Upper Roman. + +<ol> +<li>Decimal start with 6 + +<ol> +<li>Lower alpha with paren</li> +</ol></li> +</ol></li> +</ol></li> +</ol> + +<p>Autonumbering:</p> + +<ol> +<li>Autonumber.</li> +<li>More. + +<ol> +<li>Nested.</li> +</ol></li> +</ol> + +<p>Should not be a list item:</p> + +<p>M.A. 2007</p> + +<p>B. 
Williams</p> + +<hr/> + +<h1 id="definition-lists">Definition Lists</h1> + +<p>Tight using spaces:</p> + +<dl> +<dt>apple</dt> +<dd>red fruit</dd> +<dt>orange</dt> +<dd>orange fruit</dd> +<dt>banana</dt> +<dd>yellow fruit</dd> +</dl> + +<p>Tight using tabs:</p> + +<dl> +<dt>apple</dt> +<dd>red fruit</dd> +<dt>orange</dt> +<dd>orange fruit</dd> +<dt>banana</dt> +<dd>yellow fruit</dd> +</dl> + +<p>Loose:</p> + +<dl> +<dt>apple</dt> +<dd><p>red fruit</p></dd> +<dt>orange</dt> +<dd><p>orange fruit</p></dd> +<dt>banana</dt> +<dd><p>yellow fruit</p></dd> +</dl> + +<p>Multiple blocks with italics:</p> + +<dl> +<dt><em>apple</em></dt> +<dd><p>red fruit</p> + +<p>contains seeds, +crisp, pleasant to taste</p></dd> +<dt><em>orange</em></dt> +<dd><p>orange fruit</p> + +<pre><code>{ orange code block }</code></pre> + +<blockquote> +<p>orange block quote</p> +</blockquote></dd> +</dl> + +<p>Multiple definitions, tight:</p> + +<dl> +<dt>apple</dt> +<dd>red fruit</dd> +<dd>computer</dd> +<dt>orange</dt> +<dd>orange fruit</dd> +<dd>bank</dd> +</dl> + +<p>Multiple definitions, loose:</p> + +<dl> +<dt>apple</dt> +<dd><p>red fruit</p></dd> +<dd><p>computer</p></dd> +<dt>orange</dt> +<dd><p>orange fruit</p></dd> +<dd><p>bank</p></dd> +</dl> + +<p>Blank line after term, indented marker, alternate markers:</p> + +<dl> +<dt>apple</dt> +<dd><p>red fruit</p></dd> +<dd><p>computer</p></dd> +<dt>orange</dt> +<dd><p>orange fruit</p> + +<ol> +<li>sublist</li> +<li>sublist</li> +</ol></dd> +</dl> + +<h1 id="html-blocks">HTML Blocks</h1> + +<p>Simple block on one line:</p> + +<div> +foo</div> + +<p>And nested without indentation:</p> + +<div> +<div> +<div> +<p>foo</p></div></div> + +<div> +bar</div></div> + +<p>Interpreted markdown in a table:</p> + +<table> + +<tr> + +<td> + +This is <em>emphasized</em> + +</td> + +<td> + +And this is <strong>strong</strong> + +</td> + +</tr> + +</table> + +<script type="text/javascript">document.write('This *should not* be interpreted as markdown');</script> + 
+<p>Here’s a simple block:</p> + +<div> +<p>foo</p></div> + +<p>This should be a code block, though:</p> + +<pre><code><div> + foo +</div></code></pre> + +<p>As should this:</p> + +<pre><code><div>foo</div></code></pre> + +<p>Now, nested:</p> + +<div> +<div> +<div> +foo</div></div></div> + +<p>This should just be an HTML comment:</p> + +<!-- Comment --> + +<p>Multiline:</p> + +<!-- +Blah +Blah +--> + +<!-- + This is another comment. +--> + +<p>Code block:</p> + +<pre><code><!-- Comment --></code></pre> + +<p>Just plain comment, with trailing spaces on the line:</p> + +<!-- foo --> + +<p>Code:</p> + +<pre><code><hr /></code></pre> + +<p>Hr’s:</p> + +<hr> + +<hr /> + +<hr /> + +<hr> + +<hr /> + +<hr /> + +<hr class="foo" id="bar" /> + +<hr class="foo" id="bar" /> + +<hr class="foo" id="bar"> + +<hr/> + +<h1 id="inline-markup">Inline Markup</h1> + +<p>This is <em>emphasized</em>, and so <em>is this</em>.</p> + +<p>This is <strong>strong</strong>, and so <strong>is this</strong>.</p> + +<p>An <em><a href='/url' title=''>emphasized link</a></em>.</p> + +<p><strong><em>This is strong and em.</em></strong></p> + +<p>So is <strong><em>this</em></strong> word.</p> + +<p><strong><em>This is strong and em.</em></strong></p> + +<p>So is <strong><em>this</em></strong> word.</p> + +<p>This is code: <code>></code>, <code>$</code>, <code>\</code>, <code>\$</code>, <code><html></code>.</p> + +<p><del>This is <em>strikeout</em>.</del></p> + +<p>Superscripts: a<sup>bc</sup>d a<sup><em>hello</em></sup> a<sup>hello there</sup>.</p> + +<p>Subscripts: H<sub>2</sub>O, H<sub>23</sub>O, H<sub>many of them</sub>O.</p> + +<p>These should not be superscripts or subscripts, +because of the unescaped spaces: a^b c^d, a~b c~d.</p> + +<hr/> + +<h1 id="smart-quotes-ellipses-dashes">Smart quotes, ellipses, dashes</h1> + +<p> said the spider. </p> + +<p>, , and are letters.</p> + +<p> and are names of trees. 
+So is </p> + +<p> Were you alive in the +70’s?</p> + +<p>Here is some quoted and a .</p> + +<p>Some dashes: one—two — three—four — five.</p> + +<p>Dashes between numbers: 5–7, 255–66, 1987–1999.</p> + +<p>Ellipses…and…and….</p> + +<hr/> + +<h1 id="latex">LaTeX</h1> + +<ul> +<li></li> +<li>\(2+2=4\)</li> +<li>\(x \in y\)</li> +<li>\(\alpha \wedge \omega\)</li> +<li>\(223\)</li> +<li>\(p\)-Tree</li> +<li>Here’s some display math: +\[\frac{d}{dx}f(x)=\lim_{h\to 0}\frac{f(x+h)-f(x)}{h}\]</li> +<li>Here’s one that has a line break in it: \(\alpha + \omega \times x^2\).</li> +</ul> + +<p>These shouldn’t be math:</p> + +<ul> +<li>To get the famous equation, write <code>$e = mc^2$</code>.</li> +<li>$22,000 is a <em>lot</em> of money. So is $34,000. +(It worked if is emphasized.)</li> +<li>Shoes ($20) and socks ($5).</li> +<li>Escaped <code>$</code>: $73 <em>this should be emphasized</em> 23$.</li> +</ul> + +<p>Here’s a LaTeX table:</p> + + + +<hr/> + +<h1 id="special-characters">Special Characters</h1> + +<p>Here is some unicode:</p> + +<ul> +<li>I hat: Î</li> +<li>o umlaut: ö</li> +<li>section: §</li> +<li>set membership: ∈</li> +<li>copyright: ©</li> +</ul> + +<p>AT&T has an ampersand in their name.</p> + +<p>AT&T is another way to write it.</p> + +<p>This & that.</p> + +<p>4 < 5.</p> + +<p>6 > 5.</p> + +<p>Backslash: \</p> + +<p>Backtick: `</p> + +<p>Asterisk: *</p> + +<p>Underscore: _</p> + +<p>Left brace: {</p> + +<p>Right brace: }</p> + +<p>Left bracket: [</p> + +<p>Right bracket: ]</p> + +<p>Left paren: (</p> + +<p>Right paren: )</p> + +<p>Greater-than: ></p> + +<p>Hash: #</p> + +<p>Period: .</p> + +<p>Bang: !</p> + +<p>Plus: +</p> + +<p>Minus: -</p> + +<hr/> + +<h1 id="links">Links</h1> + +<h2 id="explicit">Explicit</h2> + +<p>Just a <a href='/url/' title=''>URL</a>.</p> + +<p><a href='/url/' title='title'>URL and title</a>.</p> + +<p><a href='/url/' title='title preceded by two spaces'>URL and title</a>.</p> + +<p><a href='/url/' title='title preceded by a 
tab'>URL and title</a>.</p> + +<p><a href='/url/' title='title with "quotes" in it'>URL and title</a></p> + +<p><a href='/url/' title='title with single quotes'>URL and title</a></p> + +<p><a href='/url/with_underscore' title=''>with_underscore</a></p> + +<p><a href='mailto:nobody@nowhere.net' title=''>Email link</a></p> + +<p><a href='' title=''>Empty</a>.</p> + +<h2 id="reference">Reference</h2> + +<p>Foo <a href='/url/' title=''>bar</a>.</p> + +<p>With <a href='/url/' title=''>embedded [brackets]</a>.</p> + +<p><a href='/url/' title=''>b</a> by itself should be a link.</p> + +<p>Indented <a href='/url' title=''>once</a>.</p> + +<p>Indented <a href='/url' title=''>twice</a>.</p> + +<p>Indented <a href='/url' title=''>thrice</a>.</p> + +<p>This should [not][] be a link.</p> + +<pre><code>[not]: /url</code></pre> + +<p>Foo <a href='/url/' title='Title with "quotes" inside'>bar</a>.</p> + +<p>Foo <a href='/url/' title='Title with "quote" inside'>biz</a>.</p> + +<h2 id="with-ampersands">With ampersands</h2> + +<p>Here’s a <a href='http://example.com/?foo=1&bar=2' title=''>link with an ampersand in the URL</a>.</p> + +<p>Here’s a link with an amersand in the link text: <a href='http://att.com/' title='AT&T'>AT&T</a>.</p> + +<p>Here’s an <a href='/script?foo=1&bar=2' title=''>inline link</a>.</p> + +<p>Here’s an <a href='/script?foo=1&bar=2' title=''>inline link in pointy braces</a>.</p> + +<h2 id="autolinks">Autolinks</h2> + +<p>With an ampersand: <a href='http://example.com/?foo=1&bar=2' title=''>http://example.com/?foo=1&bar=2</a></p> + +<ul> +<li>In a list?</li> +<li><a href='http://example.com/' title=''>http://example.com/</a></li> +<li>It should.</li> +</ul> + +<p>An e-mail address: <a href='mailto:nobody@nowhere.net' title=''>nobody@nowhere.net</a></p> + +<blockquote> +<p>Blockquoted: <a href='http://example.com/' title=''>http://example.com/</a></p> +</blockquote> + +<p>Auto-links should not occur here: <code><http://example.com/></code></p> + +<pre><code>or 
here: <http://example.com/></code></pre> + +<hr/> + +<h1 id="images">Images</h1> + +<p>From by Georges Melies (1902):</p> + +<div class="figure"> +<img src="lalune.jpg" title="fig:Voyage dans la Lune"/> +<p class="caption">lalune</p> +</div> + +<p>Here is a movie <img src='movie.jpg' title=''/> icon.</p> + +<hr/> + +<h1 id="footnotes">Footnotes</h1> + +<p>Here is a footnote reference,<a id="fnref1" href="#fn1"><sup>1</sup></a> and another.<a id="fnref2" href="#fn2"><sup>2</sup></a> +This should <em>not</em> be a footnote reference, because it +contains a space.[^my note] Here is an inline note.<a id="fnref3" href="#fn3"><sup>3</sup></a></p> + +<blockquote> +<p>Notes can go in quotes.<a id="fnref4" href="#fn4"><sup>4</sup></a></p> +</blockquote> + +<ol> +<li>And in list items.<a id="fnref5" href="#fn5"><sup>5</sup></a></li> +</ol> + +<p>This paragraph should not be part of the note, as it is not indented.</p> +<ol class="footnotes"> +<li id="fn1"><p>Here is the footnote. It can go anywhere after the footnote +reference. It need not be placed at the end of the document. <a href="#fnref1">↩</a></p></li> +<li id="fn2"><p>Here’s the long note. This one contains multiple +blocks.</p> + +<p>Subsequent blocks are indented to show that they belong to the +footnote (as with list items).</p> + +<pre><code> { <code> }</code></pre> + +<p>If you want, you can indent every line, but you can also be +lazy and just indent the first line of each block. <a href="#fnref2">↩</a></p></li> +<li id="fn3"><p>This +is <em>easier</em> to type. Inline notes may contain +<a href='http://google.com' title=''>links</a> and <code>]</code> verbatim characters, +as well as [bracketed text]. <a href="#fnref3">↩</a></p></li> +<li id="fn4"><p>In quote. <a href="#fnref4">↩</a></p></li> +<li id="fn5"><p>In list. 
<a href="#fnref5">↩</a></p></li> +</ol> + diff --git a/test/writer.html4 b/test/writer.html4 index ef6d8df74..dc889f07a 100644 --- a/test/writer.html4 +++ b/test/writer.html4 @@ -12,7 +12,6 @@ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} </style> </head> diff --git a/test/writer.html5 b/test/writer.html5 index a2052e9ea..53fcb84e2 100644 --- a/test/writer.html5 +++ b/test/writer.html5 @@ -12,7 +12,6 @@ code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} - div.line-block{white-space: pre-line;} div.column{display: inline-block; vertical-align: top; width: 50%;} </style> <!--[if lt IE 9]> |