pandoc (2.7)

  * Use XDG data directory for user data directory (#3582). Instead of `$HOME/.pandoc`, the default user data directory is now `$XDG_DATA_HOME/pandoc`, where `XDG_DATA_HOME` defaults to `$HOME/.local/share` but can be overridden by setting the environment variable. If this directory is missing, then `$HOME/.pandoc` is searched instead, for backwards compatibility. However, we recommend moving local pandoc data files from `$HOME/.pandoc` to `$HOME/.local/share/pandoc`. On Windows the default user data directory remains the same.
  * Add command line option `--ipynb-output=all|none|best` (#5339).
  * `asciidoctor` is now an output format separate from `asciidoc`, to accommodate some minor implementation-specific differences (currently just in the treatment of display math).
  * Add `latexmk` as an option for `--pdf-engine` (#3195). Note that you can use `--pdf-engine-opt=-outdir=bar` to specify a persistent temp directory.
  * Markdown reader:
    + Improve tight/loose list handling (#5285). Previously the algorithm allowed list items with a mix of Para and Plain, which is never wanted.
    + Add newline when parsing blocks in YAML (#5271). Otherwise last block gets parsed as a Plain rather than a Para. This is a regression in pandoc 2.x. This patch restores pandoc 1.19 behavior.
    + Make `yamlToMeta` respect extensions (#5272, Mauro Bieg). This adds a `ReaderOptions` parameter to `yamlToMeta` [API change].
    + Fix bug parsing fenced code blocks (#5304). Previously parsing would break if the code block contained a string of backticks of sufficient length followed by something other than end of line.
  * LaTeX reader: don't let `\egroup` match `{`. `braced` now actually requires nested braces. Otherwise some legitimate command and environment definitions can break.
  * Docx reader (Jesse Rosenthal):
    + Rename `getDocumentPath` as `getDocumentXmlPath`.
    + Use field notation for setting `ReaderEnv`.
    + Figure out `document.xml` path once at the beginning of parsing, and add it to the environment, so we can avoid repeated lookups.
    + Dynamically determine main document xml path (#5277). The desktop Word program places the main document file in `word/document.xml`, but the online version of Word places it in `word/document2.xml`. This file path is actually stated in the root `_rels/.rels` file, in the `Relationship` element with an `http://../officedocument` type.
    + Fix paths in archive to prevent Windows failure (#5277). Some paths in archives are absolute (have an opening slash) which, for reasons unknown, produces a failure in the test suite on MS Windows. This fixes that by removing the leading slash if it exists.
    + Add comments to aid code readability.
    + Trim space inside the last inline (#5273).
    + Unwrap sdt elements in footnotes and comments (#5302).
  * Muse reader (Alexander Krotov):
    + Test that block level markup does not break ``.
    + Add secondary note support.
  * ipynb reader: handle images referring to attachments. Previously we didn't strip off the `attachment:` prefix, so even though the attachment was available in the mediabag, pandoc couldn't find it.
  * JATS reader:
    + Fix parsing of figures (#5321). This ensures that a figure containing a single image is parsed as a pandoc "implicit figure" (i.e., a Para with a single Image whose title attribute begins with `fig:`). More complex figures will still be parsed as divs.
    + Support `fig-group` block element (#5317).
    + Handle citations with multiple references (#5310). The `rid` attribute can have a space-separated list of ids.
  * AsciiDoc writer: Add `writeAsciiDoctor` [API change, Tarik Graba]. Handle display math appropriately for Asciidoctor.
  * JATS writer: wrap figure caption in `<p>` to fix validation (#5290, Mauro Bieg).
  * HTML writer:
    + Implement WAI-ARIA roles for (end)notes, citations, and bibliography (#4213). Note that `doc-biblioref` is only used when `link-citations` produces links, since it belongs on links.
    + Include content under title slides (#4317, #5237). This facilitates real 2D revealjs slideshows, with content under the top-level slide in each stack. It also enables notes on title slides. Behavior change: content above slide level is no longer ignored; it now gets added to the title slide.
  * ipynb writer:
    + Ensure final newline.
    + Only include metadata under `jupyter` field.
    + Don't create attachments for images with absolute URIs, including `data:` URIs (#5303).
    + Keep plain text fallbacks in output even if a richer format is included (#5293). We don't know what output format will be needed. The fallback can always be weeded out using a filter.
  * Markdown writer: use `markdown="1"` when appropriate for Divs: when `native_divs` and `markdown_in_html_blocks` are disabled but `raw_html` and `markdown_attribute` are enabled.
  * LaTeX writer:
    + Use right fold for `escapeString`. This is more elegant than the explicit recursive code we were using.
    + Avoid `{}` after control sequences when escaping. `\ldots{}.` doesn't behave as well as `\ldots.` with the latex ellipsis package. This patch causes pandoc to avoid emitting the `{}` when it is not necessary. Now `\ldots` and other control sequences used in escaping will be followed by either a `{}`, a space, or nothing, depending on context.
    + For beamer, include contents under headers superordinate to slidelevel (#4317). Currently we keep the fancy title slide, and add a new slide with the same title and whatever content was under the header. This changes behavior of slides, but is consistent with the new behavior of the revealjs and other HTML slide show writers.
  * Powerpoint writer (Jesse Rosenthal): support underlines. Use span with single class "underline" as in docx writer.
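
As an illustration of the Docx reader change for #5277 described above, the following Python sketch looks up the main document part in `_rels/.rels` and strips a leading slash from the target. It is not pandoc's Haskell implementation. The function names, the fallback to `word/document.xml`, and the `example.invalid` URIs in the demo archive are assumptions made for this sketch; the relationship type is matched only by its `/officedocument` suffix, compared case-insensitively.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def main_document_path(archive: zipfile.ZipFile) -> str:
    """Return the archive path of the main document named in _rels/.rels."""
    root = ET.fromstring(archive.read("_rels/.rels"))
    for el in root.iter():
        # Compare local names so the sketch does not depend on a namespace URI.
        if el.tag.rsplit("}", 1)[-1] != "Relationship":
            continue
        if el.get("Type", "").lower().endswith("/officedocument"):
            target = el.get("Target", "")
            # Some targets are absolute; drop one leading slash if present.
            return target[1:] if target.startswith("/") else target
    # Assumed fallback for this sketch: the location used by desktop Word.
    return "word/document.xml"


def demo_archive(target: str) -> zipfile.ZipFile:
    """Build a tiny in-memory archive holding only a hypothetical .rels file."""
    rels = (
        '<Relationships xmlns="http://example.invalid/rels">'
        '<Relationship Id="rId1" '
        'Type="http://example.invalid/officeDocument" '
        f'Target="{target}"/>'
        "</Relationships>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("_rels/.rels", rels)
    buf.seek(0)
    return zipfile.ZipFile(buf)


if __name__ == "__main__":
    print(main_document_path(demo_archive("/word/document2.xml")))
```

Resolving the path once and passing it along, as the changelog entry describes, avoids reading `_rels/.rels` again for every lookup.
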
  * Muse writer: escape secondary notes (Alexander Krotov).
  * FB2 writer: add section identifiers support (#5229, John KetzerX).
  * Make `--fail-if-warnings` work for PDF output (#5343).
  * Lua filters (Albert Krewinkel):
    + Load module `pandoc` before calling `init.lua` (#5287). The file `init.lua` in pandoc's data directory is run as part of pandoc's Lua initialization process. Previously, the `pandoc` module was loaded in `init.lua`, and the structure for marshaling was set up afterwards. This allowed simple patching of element marshaling, but made using `init.lua` more difficult. Now all required modules are loaded before calling `init.lua`. The file can be used entirely for user customization. Patching marshaling functions, while discouraged, is still possible via the `debug` module.
    + Re-export all bundled modules (Albert Krewinkel). All Lua modules bundled with pandoc, i.e., `pandoc.List`, `pandoc.mediabag`, `pandoc.utils`, and `text`, are re-exported from the `pandoc` module. They are assigned to the fields `List`, `mediabag`, `utils`, and `text`, respectively.
  * Text.Pandoc.Lua (Albert Krewinkel):
    + Split `StackInstances` into smaller Marshaling modules.
    + Get `CommonState` from Lua global. This allows more control over the common state from within Lua scripts.
  * LaTeX template:
    + Support the `subject` metadata variable (#5289, Pascal Wagler).
    + Add `\frontmatter`, `\mainmatter`, `\backmatter` for book classes (#5306).
  * epub3 template: Add titlepage class to section (#5269).
  * HTML5 template: Add role with ARIA `doc-toc` for table of contents (#4213).
  * Make `--metadata-file` use pandoc-markdown (#5279, #5272, Mauro Bieg).
  * Text.Pandoc.Shared:
    + Remove `withTempDir` [API change].
    + Add new exported function `defaultUserDataDirs` [API change].
    + Add `filterIpynbOutput` [API change].
    + `compactify`: Avoid lists with a mix of Plain and Para elements (#5285).
  * Text.Pandoc.Translations: reorder alphabetically and remove `Author` (#5334, Mauro Bieg).
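
The user data directory lookup from the first 2.7 entry (#3582), which `defaultUserDataDirs` relates to, can be sketched in Python as below. This is a rough model of the documented behavior on non-Windows systems, not pandoc's code. The function name `user_data_dir`, the `is_dir` parameter, and the choice to return the XDG location when neither directory exists are assumptions of this sketch.

```python
import os
import posixpath
from typing import Callable, Mapping


def user_data_dir(env: Mapping[str, str],
                  is_dir: Callable[[str], bool] = os.path.isdir) -> str:
    """Pick the user data directory following the order in the 2.7 notes."""
    home = env.get("HOME", "")
    # XDG_DATA_HOME defaults to $HOME/.local/share when unset or empty.
    xdg_data_home = env.get("XDG_DATA_HOME") or posixpath.join(home, ".local", "share")
    xdg_dir = posixpath.join(xdg_data_home, "pandoc")
    legacy_dir = posixpath.join(home, ".pandoc")
    if is_dir(xdg_dir):
        return xdg_dir
    # Backwards compatibility: fall back to $HOME/.pandoc if it exists.
    if is_dir(legacy_dir):
        return legacy_dir
    # Assumption of this sketch: with neither present, report the new default.
    return xdg_dir


if __name__ == "__main__":
    print(user_data_dir(dict(os.environ)))
```

Passing `is_dir` as a parameter keeps the lookup order easy to check without touching the real file system.
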
  * Text.Pandoc.Extensions:
    + More carefully groom ipynb default extensions.
    + Add `all_symbols_escapable` to `githubMarkdownExtensions`.
  * Text.Pandoc.PDF:
    + Use system temp directory when possible (#1192). Previously we created temp dirs in the working directory, partly (a) because there were problems using the system temp directory on Windows, when their pathnames included tildes, and partly (b) because programs like `epstopdf.pl` would not be allowed to write to directories outside the working directory in restricted mode. We now (a) use the system temp dir except when the path includes tildes, and (b) set `TEXMFOUTPUT` when creating the PDF, so that subsidiary programs can use the system temp directory. This addresses problems that occurred when pandoc was used in a synced directory.
    + Change types of subsidiary functions to `PandocIO`, to allow warnings to be threaded through (#5343).
  * Text.Pandoc.MIME: add WebP (#5267, Mauro Bieg).
  * Tests: avoid calling `findPandoc` multiple times.
  * Old tests: remove need for temp files by using `pipeProcess`.
  * Added simple ipynb reader/writer tests (#5274).
  * Rearrange `--help` output in a more rational way (#5336).
  * trypandoc: Add JATS and other missing formats (Arfon Smith, #5291).
  * Add missing copyright notices and remove license boilerplate (#4592, Albert Krewinkel).
  * Use latest basement/foundation on 32-bit Windows.
  * Use latest skylighting (#5328).
  * Require texmath 0.11.2.1.
  * Use latest pandoc-citeproc (0.16.1).
  * MANUAL.txt:
    + Clarify variable substitution indentation in templates (#5338, Agustín Martín Barbero).
    + Reorder custom-styles section (#5324, Mauro Bieg).

pandoc (2.6)

  * Support ipynb (Jupyter notebook) as input and output format.
    + Add `ipynb` as input and output format (extension `.ipynb`).
    + Added Text.Pandoc.Readers.Ipynb [API change].
    + Added Text.Pandoc.Writers.Ipynb [API change].
    + Add `PandocIpynbDecodingError` constructor to Text.Pandoc.Error.Error [API change].
    + Depend on ipynb library.
    + Note: there is no template for ipynb.
  * Add DokuWiki reader (#1792, Alexander Krotov). This adds Text.Pandoc.Readers.DokuWiki [API change], and adds `dokuwiki` as an input format.
  * Implement task lists (#3051, Mauro Bieg). Added `task_lists` extension. Task lists are supported from markdown and gfm input. They should work, to some degree, in all output formats, though in most formats you'll get a bullet list with a unicode character for the box. In HTML, you get checkboxes and in LaTeX/PDF output, a box is used as the list marker. API changes:
    + Added constructor `Ext_task_lists` to `Extension`.
    + Added `taskListItemFromAscii` and `taskListItemToAscii` to Text.Pandoc.Shared.
  * Allow some command line options to take URL in addition to FILE: `--include-in-header`, `--include-before-body`, `--include-after-body`.
  * HTML reader:
    + Handle empty `start` attribute (see #5162).
    + Treat `textarea` as a verbatim environment (#5241) and preserve spacing.
  * RST reader:
    + Change treatment of `number-lines` directive (Brian Leung, #5207). Directives of this type without numeric inputs should not have a `startFrom` attribute; with a blank value, the writers can produce extra whitespace.
    + Removed superfluous `sourceCode` class on code blocks (#5047).
    + Handle `sourcecode` directive as synonym for `code` (#5204).
  * Markdown reader:
    + Remove `sourceCode` class for literate Haskell code blocks (#5047). Reverse order of `literate` and `haskell` classes on code blocks when parsing literate Haskell, so `haskell` is first.
    + Treat `