pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2018-11-25	Fix parsing of citations and quotes after parentheses.	John MacFarlane	3	-13/+15
	Starting with pandoc 2.4, citations and quoted inlines were no longer recognized after parentheses. This is because of commit 9b0bd4ec6f5c9125efb3e36232e6d1f6ac08a728, which is reverted here. The point of that commit was to allow relocation of soft line breaks to before an abbreviation, so that a nonbreaking space could be added after the abbreviation. Now we simply leave the soft line break in place, even though this means that we won't get a nonbreaking space after "Mr." at the end of a line (and in LaTeX this may result in a longer intersentential space). Those who care about this issue should take care not to end lines with an abbreviation, or to insert nonbreaking spaces manually. Closes #5099.
2018-11-16	HTML reader: allow tfoot before body rows.	John MacFarlane	1	-0/+16
	Closes #5079.
2018-11-15	HTML reader: parse `<small>` as a Span with class "small".	John MacFarlane	1	-0/+7
	Closes #5080.
2018-11-15	Asciidoc writer: Render Spans using `[#id .class]#contents#`.	John MacFarlane	1	-0/+6
	See #5080.
2018-11-13	Fix test case for #5014.	John MacFarlane	1	-1/+3

2018-11-13	HTML reader: allow thead containing a row with td rather than th.	John MacFarlane	1	-0/+17
	See #5014. Note that this doesn't address the original issue in #5014, only an unrelated side-issue.
2018-11-12	LaTeX writer: don't emit `[<+->]` unless beamer output,	John MacFarlane	1	-0/+14
	even if `writerIncremental` is True. See #5072.
2018-11-11	Exactly match GitHub's identifier generating algorithm.	John MacFarlane	1	-0/+4
	See #5057.
2018-11-11	Text.Pandoc.Shared: add parameter to uniqueIdent, inlineListToIdentifier.	John MacFarlane	1	-4/+4
	The parameter is Extensions. This allows these functions to be sensitive to the settings of `Ext_gfm_auto_identifiers` and `Ext_ascii_identifiers`. This allows us to use `uniqueIdent` in the CommonMark reader, replacing some custom code. It also means that `gfm_auto_identifiers` can now be used in all formats. Semantically, `gfm_auto_identifiers` is now a modifier of `auto_identifiers`; for identifiers to be set, `auto_identifiers` must be turned on, and then the type of identifier produced depends on whether `gfm_auto_identifiers` and `ascii_identifiers` are set. Closes #5057.
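A rough illustration of what a GitHub-style identifier scheme looks like, as a sketch only. It does not claim to reproduce the exact algorithm referred to above, and the names `gfm_identifier` and `unique_identifier` are hypothetical, not pandoc's `uniqueIdent` or `inlineListToIdentifier`. It assumes the usual description of the scheme: lowercase the heading, drop punctuation other than hyphens and underscores, turn spaces into hyphens, and number repeated identifiers.

```python
import unicodedata


def gfm_identifier(heading: str) -> str:
    """Approximate GitHub-style heading identifier (illustrative only)."""
    text = heading.strip().lower()
    # Keep letters, digits, combining marks, underscores, hyphens and spaces.
    kept = [
        c for c in text
        if c in "_- " or unicodedata.category(c)[0] in ("L", "N", "M")
    ]
    # Each remaining space becomes a hyphen.
    return "".join(kept).replace(" ", "-")


def unique_identifier(heading: str, used: dict) -> str:
    """Return an identifier not yet handed out, tracking counts in `used`."""
    base = gfm_identifier(heading)
    count = used.get(base, 0)
    used[base] = count + 1
    # The first occurrence is bare; later ones get -1, -2, ... (assumed scheme).
    return base if count == 0 else f"{base}-{count}"
```

Calling `unique_identifier` repeatedly with the same `used` dictionary mimics processing the headings of one document in order.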
2018-11-06	Add command test for #5050.	John MacFarlane	1	-0/+7

2018-11-05	CommonMark writer: respect --ascii (#5043)	quasicomputational	1	-0/+36

2018-11-04	XML: toHtml5Entities: prefer shorter entities...	John MacFarlane	1	-2/+2
	when there are several choices for a particular character.
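The idea of preferring the shortest entity can be sketched with Python's standard `html.entities.html5` table. This is not the code in pandoc's XML module; `build_shortest_entity_table` and `shortest_entity` are hypothetical names, and the alphabetical tie-break between names of equal length is an assumption made only to keep the result deterministic.

```python
from html.entities import html5


def build_shortest_entity_table() -> dict:
    """Map each character sequence to its shortest HTML5 entity name."""
    table = {}
    for name, chars in html5.items():
        if not name.endswith(";"):
            continue  # skip legacy names that lack the trailing semicolon
        best = table.get(chars)
        # Prefer shorter names; break ties alphabetically (an assumption).
        if best is None or (len(name), name) < (len(best), best):
            table[chars] = name
    return table


SHORTEST = build_shortest_entity_table()


def shortest_entity(ch: str) -> str:
    """Return the shortest entity reference for `ch`, or `ch` if none exists."""
    name = SHORTEST.get(ch)
    return f"&{name}" if name else ch
```

A character with several names, such as the rightwards arrow, comes back with one of its shortest names rather than a longer alias.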
2018-11-02	Roff reader: Improved handling of custom strings as arguments.	John MacFarlane	1	-0/+27
	Added test.
2018-11-01	Implement --ascii for Markdown writer.	John MacFarlane	1	-0/+8

2018-11-01	HTML writer: use character entities references when possible for HTML5.	John MacFarlane	1	-1/+1

2018-10-31	LaTeX writer: add newline if math ends in a comment.	John MacFarlane	1	-0/+7
	This prevents the closing delimiter from being swallowed up in the comment. Closes #4880.
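A minimal sketch of the check described here, under the assumption that a comment starts at the first unescaped `%` on the last line of the math. The function names are hypothetical and the `\(...\)` delimiters are chosen for illustration; the LaTeX writer itself is written in Haskell and may differ in detail.

```python
def ends_in_comment(math: str) -> bool:
    """True if the last line of `math` contains an unescaped '%'."""
    last_line = math.rsplit("\n", 1)[-1]
    i = 0
    while i < len(last_line):
        if last_line[i] == "\\":
            i += 2  # skip the escaped character, e.g. \%
            continue
        if last_line[i] == "%":
            return True
        i += 1
    return False


def inline_math(math: str) -> str:
    """Wrap math in \\( ... \\), adding a newline if a comment would eat '\\)'."""
    closing = "\n\\)" if ends_in_comment(math) else "\\)"
    return "\\(" + math + closing
```

An escaped percent sign such as `50\%` does not trigger the extra newline, since it does not start a comment.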
2018-10-29	LaTeX reader: allow space at end of math after `\`.	John MacFarlane	1	-0/+21
	Closes #5010. Expose trimMath from T.P.Shared.
2018-10-25	Lua: allow access to pandoc state (#5015)	Albert Krewinkel	2	-0/+25
	* Lua: allow access to pandoc state Lua filters and custom writers now have read-only access to most fields of pandoc's internal state via the global variable `PANDOC_STATE`. * Lua: allow iterating through fields of PANDOC_STATE * Lua filters doc: describe CommonState * Lua filters doc: mention global variable PANDOC_STATE * Lua: add access to logs Log messages can currently only be printed, but not decomposed.
2018-10-23	Groff writer character escaping changes.	John MacFarlane	1	-1/+1
	T.P.GroffChar: replaced `essentialEscapes` with `manEscapes`, which includes all the escapes mentioned in the groff_man manual. T.P.Writers.Groff: removed escapeCode; changed parameter on escapeString from Bool to new type `EscapeMode`. Rewrote `escapeString`.
2018-10-22	LaTeX reader: add support for `nolinkurl` command. (#4992)	Brian Leung	1	-0/+10

2018-10-18	Groff escaping changes.	John MacFarlane	2	-3/+3
	- `--ascii` is now turned on automatically for man output, for portability. All man output will be escaped to ASCII. - In T.P.Writers.Groff, `escapeChar`, `escapeString`, and `escapeCode` now take a boolean parameter that selects ascii-only output. This is used by the Ms writer for `--ascii`, instead of doing an extra pass after writing the document. - In ms output without `--ascii`, unicode is used whenever possible (e.g. for double quotes). - A few escapes are changed: e.g. `\[rs]` instead of `\\` for backslash, and `\[ga]` instead of `` \` `` for backtick.
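The escaping scheme can be pictured with a small sketch. It is not the Haskell code in T.P.Writers.Groff: `groff_escape` is a hypothetical name, only the two escapes named in this commit (`\[rs]` for backslash and groff's grave-accent glyph `\[ga]` for backtick) are included, and falling back to groff's `\[uXXXX]` notation for every non-ASCII character is an assumption made for brevity.

```python
def groff_escape(text: str, ascii_only: bool = False) -> str:
    """Escape `text` for groff; optionally force pure-ASCII output."""
    # Named glyph escapes mentioned in the commit message.
    named = {"\\": "\\[rs]", "`": "\\[ga]"}
    out = []
    for ch in text:
        if ch in named:
            out.append(named[ch])
        elif ascii_only and ord(ch) > 127:
            # Assumed fallback: groff's \[uXXXX] Unicode escape.
            out.append("\\[u%04X]" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)
```

Passing `ascii_only=True` corresponds to the boolean parameter described above: the same function serves both modes, so no second pass over the output is needed.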
2018-10-17	Move common groff functions to Text.Pandoc.Writers.Groff	John MacFarlane	2	-2/+2
	(unexported module). These are used in both the man and ms writers. Moved groffEscape out of Text.Pandoc.Writers.Shared [cancels earlier API change from adding it, which was after last release]. This fixes strong/code combination on man (should be `\f[CB]` not `\f[BC]`), mentioned in #4973. Updated tests. Closes #4975.
2018-10-17	Man writer: use \f[R] instead of \f[] to reset font	Alexander Krotov	1	-3/+4
	Fixes #4973
2018-10-15	LaTeX reader: make macroDef polymorphic and allow in inline context.	John MacFarlane	1	-2/+1
	Otherwise we can't parse something like `\lowercase{\def\x{Foo}}`. I have actually seen TeX like this in the wild.
2018-10-15	Added failing test case for macros.	John MacFarlane	1	-0/+18

2018-10-14	Markdown writer: ensure blank between raw block and normal content.	John MacFarlane	1	-0/+4
	Otherwise a raw block can prevent a paragraph from being recognized as such. Closes #4629.
2018-10-14	Markdown reader: Fix awkward soft break movements before abbreviations.	John MacFarlane	1	-0/+31
	Closes #4635.
2018-10-11	HTML reader: fix htmlTag and isInlineTag to accept processing instructions.	John MacFarlane	1	-0/+13
	Fixes regression #3123 (since 2.0). Added regression test.
2018-10-08	LaTeX writer: with `--biblatex`, use `\autocite` when possible.	John MacFarlane	1	-0/+22
	`\autocites{a1}{a2}{a3}` will not collapse the entries. So, if we don't have prefixes and suffixes, we use instead `\autocite{a1,a2,a3}`. Closes #4960.
2018-10-07	RST reader: don't allow single-dash separator in headerless table.	John MacFarlane	1	-0/+10
	Closes #4382.
2018-10-07	LaTeX reader: fix bugs omitting raw tex.	John MacFarlane	4	-8/+21
	The default is `-raw_tex`, so no raw tex should result unless we explicitly say `+raw_tex`. Previously some raw commands did make it through. Closes #4527.
2018-10-07	RST reader: pass through fields in unknown directives as div attributes.	John MacFarlane	1	-0/+16
	This commit also adds support for `class` and `name` attributes to directives in general. Closes #4715.
2018-10-05	Org reader: fix behavior for successive calls of `#+EXCLUDE_TAGS`. (#4951)	Brian Leung	1	-0/+11
	Calling `#+EXCLUDE_TAGS` multiple times should preserve the status of the previously declared tags.
2018-10-05	CommonMark writer: add plain text fallbacks. (#4531)	quasicomputational	1	-0/+156
	Previously, the writer would unconditionally emit HTMLish output for subscripts, superscripts, strikeouts (if the strikeout extension is disabled) and small caps, even with raw_html disabled. Now there are plain-text (and, where possible, fancy Unicode) fallbacks for all of these corresponding (mostly) to the Markdown fallbacks, and the HTMLish output is only used when raw_html is enabled. This commit adds exported functions `toSuperscript` and `toSubscript` to `Text.Pandoc.Writers.Shared`. [API change] Closes #4528.
2018-10-05	Org reader: Add partial support for `#+EXCLUDE_TAGS` option. (#4950)	Brian Leung	1	-0/+29
	Closes #4284. Headers with the corresponding tags should not appear in the output. If one or more of the specified tags contains a non-tag character like `+`, Org-mode will not treat that as a valid tag, but will nonetheless continue scanning for valid tags. That behavior is not replicated in this patch; entering `cat+dog` as one of the entries in `#+EXCLUDE_TAGS` and running the file through Pandoc will cause the parser to fail and result in the only excluded tag being the default, `noexport`.
2018-09-30	Implement `--ascii` (`writerPreferAscii`) in writers, not App.	John MacFarlane	1	-0/+45
	Now the `write*` functions for Docbook, HTML, ICML, JATS, Man, Ms, OPML are sensitive to `writerPreferAscii`. Previously the to-ascii translation was done in Text.Pandoc.App, and thus not available to those using the writer functions directly. In addition, the LaTeX writer is now sensitive to `writerPreferAscii` and to `--ascii`. 100% ASCII output can't be guaranteed, but the writer will use commands like `\"{a}` and `\l` whenever possible, to avoid emitting a non-ASCII character. A new unexported module, Text.Pandoc.Groff, has been added to store functions used in the different groff-based writers.
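One way to picture the LaTeX behaviour is the sketch below, which decomposes accented letters with `unicodedata` and emits accent commands. It is an illustration under stated assumptions, not the writer's actual code: `latex_prefer_ascii`, the small accent table, and the choice to write `\l{}` rather than a bare `\l` are all hypothetical. Characters with no known ASCII form pass through unchanged, which matches the caveat that 100% ASCII output can't be guaranteed.

```python
import unicodedata

# Combining mark -> LaTeX accent command (a deliberately small, assumed table).
ACCENTS = {
    "\u0308": '\\"',  # diaeresis
    "\u0301": "\\'",  # acute
    "\u0300": "\\`",  # grave
    "\u0302": "\\^",  # circumflex
    "\u0303": "\\~",  # tilde
}
# Letters that do not decompose into base letter plus combining mark.
SPECIAL = {"\u0142": "\\l{}", "\u0141": "\\L{}"}


def latex_prefer_ascii(text: str) -> str:
    """Replace non-ASCII characters by LaTeX commands where a mapping is known."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
            continue
        if ch in SPECIAL:
            out.append(SPECIAL[ch])
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        if (len(decomposed) == 2 and ord(decomposed[0]) < 128
                and decomposed[1] in ACCENTS):
            out.append(ACCENTS[decomposed[1]] + "{" + decomposed[0] + "}")
        else:
            out.append(ch)  # no ASCII equivalent known; leave as-is
    return "".join(out)
```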
2018-09-29	LaTeX reader: allow verbatim blocks ending with blank lines.	John MacFarlane	1	-0/+30
	Closes #4624.
2018-09-26	Force inline code blocks to honor export options.	leungbk	1	-0/+8
	`exportsCode` is moved from `Blocks.hs` to `Shared.hs` and exported accordingly.
2018-09-25	Add support for multiprenote and multipostnote arguments in LaTeX. (#4930)	Brian Leung	1	-0/+48
	* Add support for multiprenote and multipostnote arguments. The multiprenotes occur before the first prefix of a multicite, and the multipostnotes follow the last suffix. * Add test for multiprenote and multipostnote.
2018-09-20	RST reader: fix bug with internal link targets.	John MacFarlane	1	-0/+14
	They were gobbling up indented content underneath. Closes #4919.
2018-09-19	Markdown reader: distinguish autolinks in the AST.	John MacFarlane	2	-1/+35
	With this change, autolinks are parsed as Links with the `uri` class. (The same is true for bare links, if the `autolink_bare_uris` extension is enabled.) Email autolinks are parsed as Links with the `email` class. This allows the distinction to be represented in the URI. Formerly the `uri` class was added to autolinks by the HTML writer, but it had to guess what was an autolink and could not distinguish `[http://example.com](http://example.com)` from `<http://example.com>`. It also incorrectly recognized `[pandoc](pandoc)` as an autolink. Now the HTML writer simply passes through the `uri` attribute if it is present, but does not add anything. The Textile writer has been modified so that the `uri` class is not explicitly added for autolinks, even if it is present. Closes #4913.
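The distinction can be sketched as follows. The dictionaries are a simplified stand-in for Link elements, not pandoc's real AST or JSON format, and `parse_autolink` and `parse_explicit_link` are hypothetical helpers with deliberately loose patterns. The `mailto:` prefix on email targets is likewise an assumption of this sketch.

```python
import re

AUTOLINK = re.compile(r"<([^<>\s]+)>")
URI = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:\S+")
EMAIL = re.compile(r"[^@\s]+@[^@\s]+")


def parse_autolink(src: str):
    """Parse `<...>` autolink syntax; return a Link-like dict or None."""
    m = AUTOLINK.fullmatch(src)
    if not m:
        return None
    target = m.group(1)
    if URI.fullmatch(target):
        return {"t": "Link", "classes": ["uri"],
                "text": target, "target": target}
    if EMAIL.fullmatch(target):
        return {"t": "Link", "classes": ["email"],
                "text": target, "target": "mailto:" + target}
    return None


def parse_explicit_link(text: str, target: str):
    """An explicit [text](target) link carries no autolink class."""
    return {"t": "Link", "classes": [], "text": text, "target": target}
```

Because the class is attached at parse time, a writer no longer has to guess: `[http://example.com](http://example.com)` and `<http://example.com>` yield different class lists even though text and target coincide.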
2018-09-16	Markdown reader: example_lists should work without startnum.	John MacFarlane	1	-0/+16
	Closes #4908.
2018-09-15	add test for --metadata-file	mb21	2	-0/+19

2018-09-15	add test yaml-metadata-blocks.md	mb21	1	-0/+48

2018-09-09	LaTeX reader: resolve `\ref` for figure numbers.	John MacFarlane	1	-1/+44

2018-09-07	HTML reader: parse `<script type="math/tex` tags as math.	John MacFarlane	1	-0/+13
	These are used by MathJax. Closes #4877.
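For readers unfamiliar with the markup, MathJax stores TeX source in `<script type="math/tex">` elements, with `; mode=display` appended to the type for display math. The sketch below pulls such scripts out of HTML with Python's standard `html.parser`. It only illustrates the idea and is not how pandoc's Haskell reader is implemented; `MathScriptExtractor` and `extract_math` are hypothetical names.

```python
from html.parser import HTMLParser


class MathScriptExtractor(HTMLParser):
    """Collect (mode, tex) pairs from <script type="math/tex..."> elements."""

    def __init__(self):
        super().__init__()
        self.math = []
        self._mode = None  # "inline" or "display" while inside a math script

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        type_attr = (dict(attrs).get("type") or "").lower()
        if type_attr.startswith("math/tex"):
            is_display = "mode=display" in type_attr.replace(" ", "")
            self._mode = "display" if is_display else "inline"

    def handle_data(self, data):
        if self._mode is not None:
            self.math.append((self._mode, data.strip()))

    def handle_endtag(self, tag):
        if tag == "script":
            self._mode = None


def extract_math(html_text: str):
    """Return the math scripts found in `html_text`, in document order."""
    parser = MathScriptExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.math
```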
2018-08-29	RST reader: don't skip link definitions after comments.	John MacFarlane	1	-0/+9
	Closes #4860.
2018-08-24	HTML reader: allow enabling `raw_tex` extension.	John MacFarlane	1	-0/+29
	This now allows raw LaTeX environments, `\ref`, and `\eqref` to be parsed (which is helpful for translating HTML documents that use MathJax). Closes #1126.
2018-08-22	HTML reader: extract spaces inside links instead of trimming them	Alexander Krotov	1	-0/+6
	Fixes #4845
2018-08-21	LaTeX reader: support blockcquote, foreignblockquote from csquotes.	John MacFarlane	1	-3/+4
	Also foreigncblockquote, hyphenblockquote, hyphencblockquote. Closes #4848. But note: currently foreignquote will be parsed as a regular Quoted inline (not using the quotes appropriate to the foreign language).