pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2021-05-09	Change reader types, allowing better tracking of source positions.	John MacFarlane	43	-615/+1021
	Previously, when multiple file arguments were provided, pandoc simply concatenated them and passed the contents to the readers, which took a Text argument. As a result, the readers had no way of knowing which file was the source of any particular bit of text. This meant that we couldn't report accurate source positions on errors or include accurate source positions as attributes in the AST. More seriously, it meant that we couldn't resolve resource paths relative to the files containing them (see e.g. #5501, #6632, #6384, #3752). Add Text.Pandoc.Sources (exported module), with a `Sources` type and a `ToSources` class. A `Sources` wraps a list of `(SourcePos, Text)` pairs. [API change] A parsec `Stream` instance is provided for `Sources`. The module also exports versions of parsec's `satisfy` and other Char parsers that track source positions accurately from a `Sources` stream (or any instance of the new `UpdateSourcePos` class). Text.Pandoc.Parsing now exports these modified Char parsers instead of the ones parsec provides. Modified parsers to use a `Sources` as stream [API change]. The readers that previously took a `Text` argument have been modified to take any instance of `ToSources`. So, they may still be used with a `Text`, but they can also be used with a `Sources` object. In Text.Pandoc.Error, modified the constructor PandocParsecError to take a `Sources` rather than a `Text` as first argument, so parse error locations can be accurately reported. T.P.Error: showPos, do not print "-" as source name.
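The shape of the new types can be pictured with a small standalone sketch. It is not the code of Text.Pandoc.Sources: it assumes only what the message states (a `Sources` wraps a list of `(SourcePos, Text)` pairs, and readers accept any `ToSources` instance). The particular instances and the `sourceNames` helper are illustrative assumptions.

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

import Data.Text (Text)
import qualified Data.Text as T
import Text.Parsec.Pos (SourcePos, initialPos, sourceName)

-- Simplified stand-in for the type described in the commit message:
-- each chunk of input remembers the position (and file) it starts at.
newtype Sources = Sources [(SourcePos, Text)]

-- Anything that can be turned into a Sources value.
class ToSources a where
  toSources :: a -> Sources

instance ToSources Sources where
  toSources = id

-- Assumption: a bare Text has no known origin, so its source name is empty.
instance ToSources Text where
  toSources t = Sources [(initialPos "", t)]

-- Assumption: file contents paired with the path they were read from.
instance ToSources [(FilePath, Text)] where
  toSources = Sources . map (\(fp, t) -> (initialPos fp, t))

-- Hypothetical helper: the source names a Sources value keeps track of.
sourceNames :: Sources -> [String]
sourceNames (Sources xs) = map (sourceName . fst) xs

main :: IO ()
main = do
  let srcs = toSources [("ch1.md", T.pack "# One\n"), ("ch2.md", T.pack "# Two\n")]
  print (sourceNames srcs)
  print (sourceNames (toSources (T.pack "plain text")))
```

Because every chunk carries its own starting position, an error message or a relative resource path can be tied back to the file it came from, which plain concatenation could not do.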
2021-05-07	ConTeXt writer: support blank lines in line blocks.	Albert Krewinkel	1	-2/+6
	Fixes: #6564. Thanks to @denismaier.
2021-05-05	App: allow tabs expansion even if file-scope is used	Albert Krewinkel	1	-7/+11
	Tabs in plain-text inputs are now handled correctly, even if the `--file-scope` flag is used. Closes: #6709
2021-05-01	Docx writer: support colspans and rowspans in tables	Albert Krewinkel	3	-70/+140
	See: #6315
2021-05-01	Add new internal module Text.Pandoc.Writers.GridTable	Albert Krewinkel	1	-0/+157

2021-04-30	Org writer: inline latex envs need newlines (#7259)	tecosaur	1	-0/+2
	Closes #7252. As specified in https://orgmode.org/manual/LaTeX-fragments.html, an inline \begin{}...\end{} LaTeX block must start on a new line.
2021-04-29	Docx reader: add handling of vml image objects (jgm#4735) (#7257)	mbrackeantidot	1	-2/+9
	They represent images, the same way as other images in vml format.
2021-04-29	Further improvements in smart quotes.	John MacFarlane	1	-2/+2
	Improves heuristic for detection of an "open double quote." Closes #2103.
2021-04-28	Smarter smart quotes.	John MacFarlane	4	-63/+41
	Treat a leading " with no closing " as a left curly quote. This supports the practice, in fiction, of continuing paragraphs quoting the same speaker without an end quote. It also helps with quotes that break over lines in line blocks. Closes #7216.
2021-04-28	JATS writer: use either styled-content or named-content for spans.	Albert Krewinkel	1	-10/+26
	If the element has a content-type attribute, or at least one class, then that value is used as `content-type` and the span is put inside a `<named-content>` element. Otherwise a `<styled-content>` element is used instead. Closes: #7211
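The rule above can be sketched as a small decision function. This is an illustration, not the JATS writer's code: `spanElement` is a hypothetical name, and letting an explicit `content-type` attribute take precedence over the first class is an assumption the message does not settle.

```haskell
module Main where

import Control.Applicative ((<|>))
import Data.Maybe (listToMaybe)

-- Pandoc-style attributes: (identifier, classes, key-value pairs).
type Attr = (String, [String], [(String, String)])

-- Hypothetical helper: choose the JATS element for a span and, for
-- <named-content>, the value of its content-type attribute.
-- Assumption: an explicit content-type attribute wins over the first class.
spanElement :: Attr -> (String, Maybe String)
spanElement (_, classes, kvs) =
  case lookup "content-type" kvs <|> listToMaybe classes of
    Just ct -> ("named-content", Just ct)
    Nothing -> ("styled-content", Nothing)

main :: IO ()
main = do
  print (spanElement ("", ["gene"], []))
  print (spanElement ("", [], [("content-type", "species")]))
  print (spanElement ("", [], [("style", "color: red")]))
```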
2021-04-27	Docx writer: autoset table width if no column has an explicit width.	Albert Krewinkel	1	-7/+11

2021-04-25	Minor code reformatting.	John MacFarlane	1	-1/+2
	Also taking this opportunity to note, for the record, that the commit for #7241 should be marked [API change]. It changes the type of `languagesByExtension` in Highlighting, adding a parameter for a `SyntaxMap`.
2021-04-25	Writers: Recognize custom syntax definitions (#7241)	Jan Tojnar	5	-23/+28
	Languages defined using `--syntax-definition` were not recognized by `languagesByExtension`. This patch corrects that, allowing the writers to see all custom definitions. The LaTeX writer still uses the default syntax map, but that's okay in that context, since `--syntax-definition` won't create new listings styles.
2021-04-25	Markdown writer: Cleaner (code)blocks with single class (#7242)	Jan Tojnar	1	-2/+8
	When a block has only a single class and no other attributes, it is not necessary to wrap the class attribute in curly braces: the class name can be placed after the opening mark as is. This results in slightly cleaner output when pandoc is used as a markdown pretty-printer.
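A minimal sketch of that choice, assuming pandoc-style attribute triples. `fenceInfo` is a hypothetical name, and the rendering of the braced form is simplified compared with what the Markdown writer actually does.

```haskell
module Main where

-- Pandoc-style attributes: (identifier, classes, key-value pairs).
type Attr = (String, [String], [(String, String)])

-- Hypothetical helper: the text placed right after the opening fence.
fenceInfo :: Attr -> String
fenceInfo ("", [cls], []) = cls   -- single class, nothing else: bare name
fenceInfo ("", [], [])    = ""    -- no attributes at all
fenceInfo (ident, classes, kvs) =
  "{" ++ unwords (idPart ++ map ('.' :) classes ++ map kv kvs) ++ "}"
  where
    idPart    = [ '#' : ident | not (null ident) ]
    kv (k, v) = k ++ "=\"" ++ v ++ "\""

main :: IO ()
main = do
  putStrLn (fenceInfo ("", ["haskell"], []))
  putStrLn (fenceInfo ("ex", ["haskell"], []))
  putStrLn (fenceInfo ("", ["haskell", "numberLines"], []))
```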
2021-04-25	Add quotes properly in markdown YAML metadata fields.	John MacFarlane	1	-6/+5
	This fixes a bug, which caused the writer to look at the LAST rather than the FIRST character in determining whether quotes were needed. So we got spurious quotes in some cases and didn't get necessary quotes in others. Closes #7245. Updated a number of test cases accordingly.
2021-04-20	Docx writer: add missing file	Albert Krewinkel	1	-0/+181

2021-04-20	Docx writer: extract Table handling into separate module	Albert Krewinkel	2	-221/+119

2021-04-19	Issue error message when reader or writer format is malformed.	John MacFarlane	2	-6/+6
	Previously we exited with an error status but (due to a bug) no message. Closes #7231.
2021-04-18	Use MetaInlines not MetaBlocks for multimarkdown metadata fields.	John MacFarlane	1	-1/+1
	This gives better results in converting to e.g. pandoc markdown. Ref: <https://groups.google.com/d/msgid/pandoc-discuss/9728d1f4-040e-4392-aa04-148f648a8dfdn%40googlegroups.com>
2021-04-17	Update to released unicode-collation, latest citeproc dev version.	John MacFarlane	13	-13/+13
	Update citeproc test.
2021-04-17	Use document's lang for the lang parameter of citeproc...	John MacFarlane	1	-2/+1
	even if it differs from localeLanguage. (It is designed to be possible to override the locale language, and this is especially useful when one wants to use the unicode extension syntax, e.g. fr-u-kb.)
2021-04-17	Remove Text.Pandoc.BCP47 module.	John MacFarlane	19	-293/+198
	[API change] Use Lang from UnicodeCollation.Lang instead. This is a richer implementation of BCP 47.
2021-04-17	Move getLang from BCP47 -> T.P.Writers.Shared.	John MacFarlane	5	-74/+76
	[API change]
2021-04-16	JATS writer: reduce unnecessary use of <p> elements for wrapping	Albert Krewinkel	3	-16/+47
	The `<p>` element is used for wrapping in cases where the contents would otherwise not be allowed in a certain context. Unnecessary wrapping is avoided, especially around quotes (`<disp-quote>` elements). Closes: #7227
2021-04-10	JATS writer: convert spans to <named-content> elements	Albert Krewinkel	1	-6/+7
	Spans with attributes are converted to `<named-content>` elements instead of being wrapped with `<milestone-start/>` and `<milestone-end>` elements. Milestone elements are not allowed in documents using the articleauthoring tag set, so this change ensures the creation of valid documents. Closes: #7211
2021-04-10	JATS writer: add footnote number as label in backmatter	Albert Krewinkel	1	-0/+1
	Footnotes in the backmatter are given the footnote's number as a label. The articleauthoring output is unaffected by this change, as footnotes are placed inline there. Closes: #7210
2021-04-08	Fix regression in grid tables for wide characters.	John MacFarlane	1	-5/+13
	In the translation from String to Text, a char-width-sensitive splitAt' was dropped. This commit reinstates it. Closes #7214.
2021-04-08	Lua filter: respect Inlines/Blocks filter functions in pandoc.walk_*	Albert Krewinkel	2	-3/+10

2021-04-05	Commonmark writer: Use backslash escapes for `<` and `\|`...	John MacFarlane	1	-1/+11
	instead of entities. Closes #7208.
2021-04-05	SelfContained: remove unneeded imports.	John MacFarlane	1	-2/+0

2021-04-05	JATS writer: escape disallowed chars in identifiers	Albert Krewinkel	4	-19/+47
	XML identifiers must start with an underscore or letter, and can contain only a limited set of punctuation characters. Any IDs not adhering to these rules are rewritten by writing the offending characters as Uxxxx, where `xxxx` is the character's hex code.
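The rewriting scheme can be sketched as follows. This is a simplified illustration rather than the writer's implementation: `escapeId` is a hypothetical name, and the exact set of characters allowed after the first position (`_`, `-`, `.`, letters, digits) is an assumption.

```haskell
module Main where

import Data.Char (isAlphaNum, isLetter, ord, toUpper)
import Numeric (showHex)

-- Hypothetical helper: rewrite characters that are not allowed in an
-- XML identifier as U followed by the four-digit hex code point.
escapeId :: String -> String
escapeId []       = []
escapeId (c : cs) = escapeWith isStart c ++ concatMap (escapeWith isRest) cs
  where
    -- The first character must be a letter or an underscore.
    isStart x = isLetter x || x == '_'
    -- Assumption: later characters may also be digits, '-' or '.'.
    isRest x  = isAlphaNum x || x `elem` "_-."
    escapeWith ok x
      | ok x      = [x]
      | otherwise = 'U' : pad (map toUpper (showHex (ord x) ""))
    -- Left-pad the hex code with zeros to at least four digits.
    pad h = replicate (4 - length h) '0' ++ h

main :: IO ()
main = do
  putStrLn (escapeId "1st item")
  putStrLn (escapeId "_valid-id.1")
```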
2021-04-05	SelfContained: use application/octet-stream for unknown mime types...	John MacFarlane	1	-5/+4
	instead of halting with an error. Closes #7202.
2021-04-02	Fix "phrase" in DocBook: take classes from "role" not "class".	John MacFarlane	1	-1/+1
	Closes #7195. Revises #6438.
2021-04-01	Org writer: Use LaTeX style maths delimiters (#7196)	tecosaur	1	-2/+2
	Org works better with LaTeX-style delimiters.
2021-03-31	Treat tabs as spaces in ODT Reader. (#7185)	niszet	1	-1/+7

2021-03-29	Powerpoint writer: allow monofont to be specified in metadata...	John MacFarlane	2	-6/+19
	...not just using `--variable` on the command line (as in other writers). Closes #7187.
2021-03-24	Fix DocBook reader mathml regression...	John MacFarlane	2	-4/+7
	...caused by the switch in XML libraries. Also fixed a similar issue in JATS. Closes #7173.
2021-03-21	Simplify T.P.Asciify and export toAsciiText [API change].	John MacFarlane	3	-395/+18
	Instead of encoding a giant (and incomplete) map, we now just use unicode-transforms to normalize the text to a canonical decomposition, and manipulate the result. The new `toAsciiText` is equivalent to the old `T.pack . mapMaybe toAsciiChar . T.unpack` but should be faster.
2021-03-20	Support `yaml_metadata_block` extension for commonmark, gfm.	John MacFarlane	3	-1/+34
	This is a bit more limited than with markdown, as documented in the manual:
	- The YAML block must be the first thing in the input.
	- The leaf nodes are parsed in isolation from the rest of the document. So, for example, you can't use reference links if the references are defined later in the document.
	Closes #6537.
2021-03-20	Move yamlMetaBlock from Markdown reader to T.P.Readers.Metadata.	John MacFarlane	2	-22/+22

2021-03-20	Markdown reader: export `yamlMetaBlock`.	John MacFarlane	1	-17/+23
	[API change] This will allow us to parse YAML metadata blocks in other readers, potentially.
2021-03-20	Text.Pandoc.Parsing: remove F type synonym.	John MacFarlane	5	-9/+9
	Muse and Org were defining their own F anyway, with their own state. We therefore move this definition to the Markdown reader.
2021-03-20	T.P.Readers.Metadata: made `yamlBsToMeta`, `yamlBsToRefs` polymorphic...	John MacFarlane	1	-15/+15
	on the parser state, instead of requiring ParserState. [API change]
2021-03-20	RST writer: use NonEmpty for init, last.	John MacFarlane	1	-8/+12

2021-03-20	Include Header.Attr.attributes as XML attributes on section	Erik Rask	1	-2/+45
	Add key-value pairs found in the attributes list of Header.Attr as XML attributes on the corresponding section element. Three kinds of keys are dropped: names not allowed as XML attribute names, keys whose values are invalid where DocBook defines them as enums, and xml:id (DocBook 5) or id (DocBook 4), so as not to interfere with computed identifiers.
2021-03-20	T.P.Shared: remove `backslashEscapes`, `escapeStringUsing`.	John MacFarlane	8	-47/+77
	[API change] These are inefficient association list lookups. Replace with more efficient functions in the writers that used them (with 10-25% performance improvements in haddock, org, rtf, texinfo writers).
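The difference between the two styles can be sketched like this. The names `escapeWithTable`, `escapeDirect`, and `sampleTable` are hypothetical, `String` is used instead of `Text` for brevity, and the set of escaped characters is chosen only for illustration.

```haskell
module Main where

import Data.Maybe (fromMaybe)

-- Old style (simplified): every character triggers a linear scan of an
-- association list to find its replacement.
escapeWithTable :: [(Char, String)] -> String -> String
escapeWithTable table = concatMap (\c -> fromMaybe [c] (lookup c table))

-- Illustrative table: backslash-escape a backslash and curly braces.
sampleTable :: [(Char, String)]
sampleTable = [ (c, ['\\', c]) | c <- "\\{}" ]

-- Replacement style: a direct case analysis on the character, with no
-- list to traverse.
escapeDirect :: String -> String
escapeDirect = concatMap go
  where
    go '\\' = "\\\\"
    go '{'  = "\\{"
    go '}'  = "\\}"
    go c    = [c]

main :: IO ()
main = do
  putStrLn (escapeWithTable sampleTable "a{b}\\c")
  putStrLn (escapeDirect "a{b}\\c")
```

Both functions produce the same output; the second simply avoids walking a list for every character.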
2021-03-19	Fix fallback to default partials on templates.	John MacFarlane	1	-0/+4
	If the directory containing a template does not contain the partial, it should be sought in the default data files. Closes #7164.
2021-03-19	Hlint suggestion.	John MacFarlane	1	-2/+3

2021-03-19	T.P.Shared: Remove ToString, ToText typeclasses [API change].	John MacFarlane	2	-24/+4
	T.P.Parsing: revise type of readWithM so that it takes a Text rather than a polymorphic ToText value. These typeclasses were there to ease the transition from String to Text. They are no longer needed, and they may clash with more useful versions under the same name. This will require a bump to 2.13.
2021-03-19	Protect partial uses of maximum with NonEmpty.	John MacFarlane	21	-86/+108
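A minimal sketch of the technique, with a hypothetical helper name. `maximum` throws an exception on an empty list, whereas going through `nonEmpty` forces the empty case to be handled explicitly.

```haskell
module Main where

import Data.List.NonEmpty (nonEmpty)

-- Hypothetical helper: length of the longest line, or 0 when there are
-- no lines. `nonEmpty` returns Nothing for [], so `maximum` is only
-- ever applied to a list known to have at least one element.
maxLineLength :: [String] -> Int
maxLineLength ls = maybe 0 maximum (nonEmpty (map length ls))

main :: IO ()
main = do
  print (maxLineLength ["ab", "abcd", "a"])
  print (maxLineLength [])
```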