pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2021-05-09	Change reader types, allowing better tracking of source positions.	John MacFarlane	1	-2/+2
	Previously, when multiple file arguments were provided, pandoc simply concatenated them and passed the contents to the readers, which took a Text argument. As a result, the readers had no way of knowing which file was the source of any particular bit of text. This meant that we couldn't report accurate source positions on errors or include accurate source positions as attributes in the AST. More seriously, it meant that we couldn't resolve resource paths relative to the files containing them (see e.g. #5501, #6632, #6384, #3752). Add Text.Pandoc.Sources (exported module), with a `Sources` type and a `ToSources` class. A `Sources` wraps a list of `(SourcePos, Text)` pairs. [API change] A parsec `Stream` instance is provided for `Sources`. The module also exports versions of parsec's `satisfy` and other Char parsers that track source positions accurately from a `Sources` stream (or any instance of the new `UpdateSourcePos` class). Text.Pandoc.Parsing now exports these modified Char parsers instead of the ones parsec provides. Modified parsers to use a `Sources` as stream [API change]. The readers that previously took a `Text` argument have been modified to take any instance of `ToSources`. So, they may still be used with a `Text`, but they can also be used with a `Sources` object. In Text.Pandoc.Error, modified the constructor PandocParsecError to take a `Sources` rather than a `Text` as first argument, so parse error locations can be accurately reported. T.P.Error: showPos, do not print "-" as source name.
2021-05-05	App: allow tabs expansion even if file-scope is used	Albert Krewinkel	1	-0/+11
	Tabs in plain-text inputs are now handled correctly, even if the `--file-scope` flag is used. Closes: #6709
2021-05-01	Docx writer: support colspans and rowspans in tables	Albert Krewinkel	3	-0/+0
	See: #6315
2021-04-29	Docx reader: add handling of vml image objects (jgm#4735) (#7257)	mbrackeantidot	3	-0/+6
	They represent images, the same way as other images in vml format.
2021-04-29	Further improvements in smart quotes.	John MacFarlane	1	-0/+8
	Improves heuristic for detection of an "open double quote." Closes #2103.
2021-04-28	Smarter smart quotes.	John MacFarlane	3	-7/+26
	Treat a leading " with no closing " as a left curly quote. This supports the practice, in fiction, of continuing paragraphs quoting the same speaker without an end quote. It also helps with quotes that break over lines in line blocks. Closes #7216.
2021-04-28	JATS writer: use either styled-content or named-content for spans.	Albert Krewinkel	1	-5/+9
	If the element has a content-type attribute, or at least one class, then that value is used as `content-type` and the span is put inside a `<named-content>` element. Otherwise a `<styled-content>` element is used instead. Closes: #7211
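The element choice described above can be sketched roughly as follows. This is an illustration in Python, not the writer's Haskell code. The function name `span_element`, the precedence of an explicit `content-type` attribute over classes, and the use of the first class are assumptions, not taken from the commit.

```python
def span_element(classes, attributes):
    """Pick the JATS element for a span.

    `classes` is a list of class names, `attributes` a dict of key-value
    attributes. Returns (element_name, content_type_or_None).
    Assumption: an explicit content-type attribute wins over classes,
    and when only classes are present the first one is used.
    """
    content_type = attributes.get("content-type")
    if content_type is None and classes:
        content_type = classes[0]
    if content_type is not None:
        return ("named-content", content_type)
    # Neither a content-type attribute nor any class: fall back.
    return ("styled-content", None)
```

For example, `span_element(["gene"], {})` selects `named-content` with content type `gene`, while a span with no class and no `content-type` attribute falls back to `styled-content`.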
2021-04-27	Docx writer: autoset table width if no column has an explicit width.	Albert Krewinkel	3	-0/+0

2021-04-25	Markdown writer: Cleaner (code)blocks with single class (#7242)	Jan Tojnar	4	-6/+6
	When a block has only a single class and no other attributes, it is not necessary to wrap the class attribute in curly braces; the class name can be placed after the opening mark as is. This results in slightly cleaner output when pandoc is used as a markdown pretty-printer.
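A minimal sketch of the rule, written in Python for illustration rather than taken from the writer itself. The helper name `fence_attributes` and the exact spacing inside the braces are assumptions.

```python
def fence_attributes(identifier, classes, keyvals):
    """Render the attribute part that follows an opening code fence.

    A lone class with no identifier and no key-value pairs is emitted
    bare; anything else keeps the curly-brace attribute syntax.
    """
    if not identifier and len(classes) == 1 and not keyvals:
        return classes[0]
    parts = []
    if identifier:
        parts.append("#" + identifier)
    parts.extend("." + c for c in classes)
    parts.extend('{}="{}"'.format(k, v) for k, v in keyvals)
    # No attributes at all: nothing follows the fence.
    return "{" + " ".join(parts) + "}" if parts else ""
```

With this sketch a block whose only attribute is the class `haskell` gets the bare word `haskell` after the fence, while a block with two classes still gets `{.haskell .numberLines}`.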
2021-04-25	Add quotes properly in markdown YAML metadata fields.	John MacFarlane	8	-11/+11
	This fixes a bug, which caused the writer to look at the LAST rather than the FIRST character in determining whether quotes were needed. So we got spurious quotes in some cases and didn't get necessary quotes in others. Closes #7245. Updated a number of test cases accordingly.
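The first-character check can be illustrated with a small Python sketch. It is deliberately simplified and is not the writer's actual rule: the set of characters is the YAML indicator set, and the names `needs_quotes` and `yaml_scalar` are hypothetical. Real YAML quoting must also consider other cases (for example `: ` inside a value), which this sketch ignores.

```python
# YAML indicator characters; a plain scalar starting with one of these
# is treated here as needing quotes (a simplification).
YAML_INDICATORS = set("-?:,[]{}#&*!|>'\"%@`")

def needs_quotes(value):
    """Decide from the FIRST character whether to quote the value."""
    if value == "":
        return True
    first = value[0]
    return first in YAML_INDICATORS or first.isspace()

def yaml_scalar(value):
    """Return the value as a double-quoted or plain YAML scalar."""
    if needs_quotes(value):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    return value
```

Under the bug described above, a value such as `ends with *` would have been quoted and `*starred` would not; checking the first character reverses both outcomes.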
2021-04-25	Remove biblatex-nussbaum.md test.	John MacFarlane	1	-63/+0
	It is basically the same as biblatex-quotes.md.
2021-04-18	Use MetaInlines not MetaBlocks for multimarkdown metadata fields.	John MacFarlane	1	-0/+20
	This gives better results in converting to e.g. pandoc markdown. Ref: <https://groups.google.com/d/msgid/pandoc-discuss/9728d1f4-040e-4392-aa04-148f648a8dfdn%40googlegroups.com>
2021-04-17	Update to released unicode-collation, latest citeproc dev version.	John MacFarlane	1	-4/+4
	Update citeproc test.
2021-04-17	Use BCP47 language codes in citeproc tests.	John MacFarlane	3	-4/+4

2021-04-17	Use new citeproc + unicode-collation.	John MacFarlane	1	-0/+130
	Add command test for unicode-collation.
2021-04-16	JATS writer: reduce unnecessary use of <p> elements for wrapping	Albert Krewinkel	4	-128/+119
	The `<p>` element is used for wrapping in cases where the contents would otherwise not be allowed in a certain context. Unnecessary wrapping is avoided, especially around quotes (`<disp-quote>` elements). Closes: #7227
2021-04-10	JATS writer: convert spans to <named-content> elements	Albert Krewinkel	1	-0/+15
	Spans with attributes are converted to `<named-content>` elements instead of being wrapped with `<milestone-start/>` and `<milestone-end>` elements. Milestone elements are not allowed in documents using the articleauthoring tag set, so this change ensures the creation of valid documents. Closes: #7211
2021-04-10	JATS writer: add footnote number as label in backmatter	Albert Krewinkel	2	-12/+16
	Footnotes in the backmatter are given the footnote's number as a label. The articleauthoring output is unaffected by this change, as footnotes are placed inline there. Closes: #7210
2021-04-08	Fix regression in grid tables for wide characters.	John MacFarlane	1	-0/+28
	In the translation from String to Text, a char-width-sensitive splitAt' was dropped. This commit reinstates it. Closes #7214.
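To illustrate why a width-sensitive split matters for grid tables, here is a Python sketch of splitting by display columns rather than by character count. It is not the `splitAt'` from the commit. The name `split_at_width` and the rule that East Asian wide and fullwidth characters occupy two columns are assumptions made for this example.

```python
import unicodedata

def char_width(c):
    """Display width of a character: 2 for East Asian wide/fullwidth, else 1."""
    return 2 if unicodedata.east_asian_width(c) in ("W", "F") else 1

def split_at_width(n, text):
    """Split text so the first part occupies at most n display columns."""
    used = 0
    for i, c in enumerate(text):
        w = char_width(c)
        if used + w > n:
            return text[:i], text[i:]
        used += w
    return text, ""
```

Splitting a cell at four columns with a plain character count would take four wide characters (eight columns) and misalign the table; the width-aware version takes only two.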
2021-04-05	Commonmark writer: Use backslash escapes for `<` and `|`...	John MacFarlane	1	-0/+6
	instead of entities. Closes #7208.
2021-04-05	JATS writer: escape disallowed chars in identifiers	Albert Krewinkel	1	-90/+115
	XML identifiers must start with an underscore or letter, and can contain only a limited set of punctuation characters. Any IDs not adhering to these rules are rewritten by writing the offending characters as Uxxxx, where `xxxx` is the character's hex code.
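The rewriting rule can be sketched as below. This is a Python illustration, not the writer's code. The allowed punctuation set (`_`, `-`, `.`), the zero-padding to four digits, and the uppercase hex digits are assumptions; the commit only states that offending characters become `Uxxxx`.

```python
ALLOWED_PUNCTUATION = "_-."  # assumption: the commit does not list the set

def escape_char(c):
    """Write a character as U followed by its hex code (assumed 4+ digits)."""
    return "U{:04X}".format(ord(c))

def escape_id(ident):
    """Rewrite characters that are not allowed at their position in an ID."""
    out = []
    for i, c in enumerate(ident):
        if i == 0:
            ok = c == "_" or c.isalpha()          # must start with _ or letter
        else:
            ok = c.isalnum() or c in ALLOWED_PUNCTUATION
        out.append(c if ok else escape_char(c))
    return "".join(out)
```

Under these assumptions an identifier beginning with a digit, such as `1st`, has only its first character rewritten, and identifiers that already satisfy the rules pass through unchanged.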
2021-04-01	Org writer: Use LaTeX-style math delimiters (#7196)	tecosaur	1	-7/+7
	Org works better with LaTeX-style delimiters.
2021-03-31	Treat tabs as spaces in ODT Reader. (#7185)	niszet	3	-0/+2

2021-03-24	Fix DocBook reader mathml regression...	John MacFarlane	2	-1/+140
	...caused by the switch in XML libraries. Also fixed a similar issue in JATS. Closes #7173.
2021-03-20	Include Header.Attr.attributes as XML attributes on section	Erik Rask	1	-0/+37
	Add key-value pairs found in the attributes list of Header.Attr as XML attributes on the corresponding section element. Any key name not allowed as an XML attribute name is dropped, as are keys with invalid values where they are defined as enums in DocBook, and xml:id (for DocBook 5)/id (for DocBook 4), so as not to interfere with computed identifiers.
2021-03-19	Tests: Use getExecutablePath from base...	John MacFarlane	3	-4/+3
	avoiding the need to depend on the executable-path package.
2021-03-19	Tests: factor out setupEnvironment in Test.Helpers.	John MacFarlane	3	-25/+26
	This avoids code duplication between Command and Old.
2021-03-19	Fix finding of data files from test programs.	John MacFarlane	2	-2/+5
	Apparently Cabal sets a `pandoc_datadir` environment variable so that the data files will be sought in the source directory rather than in the final destination (where they aren't yet installed). So we no longer need to set `--data-dir` in the tests. We just need to make sure `pandoc_datadir` is set in the environment when we call the program in the test suite. This will fix the issue with loading of pandoc.lua when pandoc is built with `-embed_data_files`, reported in #7163. Closes #7163.
2021-03-17	Docx writer: make nsid in abstractNum deterministic.	John MacFarlane	33	-0/+0
	Previously we assigned a random number (though in a deterministic way). But changes in the random package mean we get different results now on different architectures, even with the same random seed. We don't need random values; so now we just assign a value based on the list number id, which is guaranteed to be unique to the list marker.
2021-03-17	Add test for #7155.	John MacFarlane	1	-0/+15

2021-03-15	Update tests for new texmath.	John MacFarlane	5	-6/+6

2021-03-13	MediaWiki reader: Allow block-level content in notes (ref).	John MacFarlane	1	-0/+12
	Closes #7145.
2021-03-13	Use integral values for w:tblW in docx.	John MacFarlane	3	-0/+0
	Closes #7141.
2021-03-13	Use jira-wiki-markup 1.3.4	Albert Krewinkel	2	-3/+2
	Jira reader: * Fixed parsing of autolinks (i.e., of bare URLs in the text). Previously an autolink would take up the rest of a line, as spaces were allowed characters in these items. * Emoji character sequences no longer cause parsing failures. This was due to missing backtracking when emoji parsing fails. Jira writer: * Block quotes are only rendered as `bq.` if they do not contain a linebreak.
2021-03-13	Jira reader: mark divs created from panels with class "panel".	Albert Krewinkel	1	-0/+6
	Closes: tarleb/jira-wiki-markup#2
2021-03-13	Jira writer: improve div/panel handling	Albert Krewinkel	1	-0/+30
	Include div attributes in panels, always render divs with class `panel` as panels, and avoid nesting of panels.
2021-03-10	HTML writer: Add warnings on duplicate attribute values.	John MacFarlane	1	-0/+7
	This prevents emitting invalid HTML. Ultimately it would be good to prevent this in the types themselves, but this is better for now. T.P.Logging: Add DuplicateAttribute constructor to LogMessage. [API change]
2021-03-09	RST reader: fix logic for ending comments.	John MacFarlane	1	-0/+16
	Previously comments sometimes got extended too far. Closes #7134.
2021-03-09	Org writer: prevent unintended creation of ordered list items	Albert Krewinkel	1	-0/+10
	Adjust line wrapping if default wrapping would cause a line to be read as an ordered list item. Fixes #7132
2021-03-08	Jira writer: use noformat instead of code for unknown languages.	Albert Krewinkel	2	-33/+32
	Code blocks that are not marked as a language supported by Jira are rendered as preformatted text with `{noformat}` blocks. Fixes: tarleb/jira-wiki-markup#4
2021-03-07	LaTeX reader: handle table cells containing `&` in `\verb`.	John MacFarlane	1	-0/+27
	Closes #7129.
2021-03-01	Jira writer: use Span identifiers as anchors	Albert Krewinkel	1	-1/+8
	Closes: tarleb/jira-wiki-markup#3.
2021-02-28	Remove superfluous imports.	John MacFarlane	1	-2/+0

2021-02-28	T.P.Readers.LaTeX: Don't export tokenize, untokenize.	John MacFarlane	1	-16/+1
	[API change] These were only exported for testing, which seems the wrong thing to do. They don't belong in the public API and are not really usable as they are, without access to the Tok type which is not exported. Removed the tokenize/untokenize roundtrip test. We put a quickcheck property in the comments which may be used when this code is touched (if it is).
2021-02-26	Update tests for changes to https URLs.	John MacFarlane	6	-6/+6

2021-02-26	Fix/update URLs and use HTTPS where possible (#7122)	Salim B	1	-1/+1

2021-02-22	T.P.CSV: fix parsing of unquoted values.	John MacFarlane	1	-0/+15
	Previously we didn't allow unescaped quotes in unquoted values, but they are allowed. Closes #7112.
2021-02-22	tests: print accurate location if a test fails	Albert Krewinkel	1	-1/+1
	Ensures that tasty-hunit reports the location of the failing test instead of the location of the helper `test` function.
2021-02-22	Text.Pandoc.UTF8: change IO functions to return Text, not String.	John MacFarlane	3	-4/+5
	[API change] This affects `readFile`, `getContents`, `writeFileWith`, `writeFile`, `putStrWith`, `putStr`, `putStrLnWith`, `putStrLn`, `hPutStrWith`, `hPutStr`, `hPutStrLnWith`, `hPutStrLn`, `hGetContents`. This avoids the need to uselessly create a linked list of characters when emitting output.
2021-02-18	Revert "LaTeX template: disable `` ?` `` and `` !` `` ligatures."	John MacFarlane	4	-4/+0
	This reverts commit 24d7cd539ba70aa94480976a7957420c020cb19a.