pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2021-04-17	Use new citeproc + unicode-collation.	John MacFarlane	1	-0/+130
	Add command test for unicode-collation.
2021-04-16	JATS writer: reduce unnecessary use of <p> elements for wrapping	Albert Krewinkel	4	-128/+119
	The `<p>` element is used for wrapping in cases where the contents would otherwise not be allowed in a certain context. Unnecessary wrapping is avoided, especially around quotes (`<disp-quote>` elements). Closes: #7227
2021-04-10	JATS writer: convert spans to <named-content> elements	Albert Krewinkel	1	-0/+15
	Spans with attributes are converted to `<named-content>` elements instead of being wrapped with `<milestone-start/>` and `<milestone-end/>` elements. Milestone elements are not allowed in documents using the articleauthoring tag set, so this change ensures the creation of valid documents. Closes: #7211
2021-04-10	JATS writer: add footnote number as label in backmatter	Albert Krewinkel	2	-12/+16
	Footnotes in the backmatter are given the footnote's number as a label. The articleauthoring output is unaffected by this change, as footnotes are placed inline there. Closes: #7210
2021-04-08	Fix regression in grid tables for wide characters.	John MacFarlane	1	-0/+28
	In the translation from String to Text, a char-width-sensitive splitAt' was dropped. This commit reinstates it. Closes #7214.
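A minimal Python sketch of what a char-width-sensitive split does, for illustration only. It is not pandoc's Haskell `splitAt'`; the names `char_width` and `split_at_width` are hypothetical, and the width rule (two columns for East Asian wide or fullwidth characters, zero for combining marks, one otherwise) is a simplifying assumption.

```python
import unicodedata
from typing import Tuple


def char_width(ch: str) -> int:
    """Approximate display columns for one character (assumed rule)."""
    if unicodedata.combining(ch):
        return 0  # combining marks occupy no column of their own
    # "W" (wide) and "F" (fullwidth) characters take two columns.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def split_at_width(text: str, columns: int) -> Tuple[str, str]:
    """Split text so the first part fits in `columns` display columns."""
    used = 0
    for i, ch in enumerate(text):
        w = char_width(ch)
        if used + w > columns:
            return text[:i], text[i:]
        used += w
    return text, ""
```

Splitting by character count instead would cut a grid table cell containing wide characters at the wrong column, which is the kind of misalignment the commit describes.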
2021-04-05	Commonmark writer: Use backslash escapes for `<` and `|`...	John MacFarlane	1	-0/+6
	instead of entities. Closes #7208.
2021-04-05	JATS writer: escape disallowed chars in identifiers	Albert Krewinkel	1	-90/+115
	XML identifiers must start with an underscore or letter, and can contain only a limited set of punctuation characters. Any IDs not adhering to these rules are rewritten by writing the offending characters as Uxxxx, where `xxxx` is the character's hex code.
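The rewriting rule described above can be sketched in Python as follows. This is a hedged illustration, not the JATS writer's code: the function name `escape_xml_id` is hypothetical, the allowed character set is restricted to ASCII letters, digits, `_`, `-`, and `.` (stricter than what XML permits), and the use of uppercase, four-digit hex is an assumption.

```python
import string

# Assumed ASCII-only approximation of the allowed character classes.
START_CHARS = set(string.ascii_letters + "_")
NAME_CHARS = START_CHARS | set(string.digits + "-.")


def escape_xml_id(ident: str) -> str:
    """Rewrite each offending character as Uxxxx (its hex code point)."""
    out = []
    for i, ch in enumerate(ident):
        allowed = START_CHARS if i == 0 else NAME_CHARS
        out.append(ch if ch in allowed else "U%04X" % ord(ch))
    return "".join(out)
```

Under these assumptions `escape_xml_id("a:b")` yields `aU003Ab`, and an identifier with a leading digit gets that digit escaped, so the result starts with the letter `U`.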
2021-04-01	Org writer: Use LaTeX-style math delimiters (#7196)	tecosaur	1	-7/+7
	Org works better with LaTeX-style delimiters.
2021-03-31	Treat tabs as spaces in ODT Reader. (#7185)	niszet	3	-0/+2

2021-03-24	Fix DocBook reader mathml regression...	John MacFarlane	2	-1/+140
	...caused by the switch in XML libraries. Also fixed a similar issue in JATS. Closes #7173.
2021-03-20	Include Header.Attr.attributes as XML attributes on section	Erik Rask	1	-0/+37
	Add key-value pairs found in the attributes list of Header.Attr as XML attributes on the corresponding section element. Any key name not allowed as an XML attribute name is dropped, as are keys with invalid values where they are defined as enums in DocBook, and xml:id (for DocBook 5)/id (for DocBook 4), so as not to interfere with computed identifiers.
2021-03-19	Tests: Use getExecutablePath from base...	John MacFarlane	3	-4/+3
	avoiding the need to depend on the executable-path package.
2021-03-19	Tests: factor out setupEnvironment in Test.Helpers.	John MacFarlane	3	-25/+26
	This avoids code duplication between Command and Old.
2021-03-19	Fix finding of data files from test programs.	John MacFarlane	2	-2/+5
	Apparently Cabal sets a `pandoc_datadir` environment variable so that the data files will be sought in the source directory rather than in the final destination (where they aren't yet installed). So we no longer need to set `--data-dir` in the tests. We just need to make sure `pandoc_datadir` is set in the environment when we call the program in the test suite. This will fix the issue with loading of pandoc.lua when pandoc is built with `-embed_data_files`, reported in #7163. Closes #7163.
2021-03-17	Docx writer: make nsid in abstractNum deterministic.	John MacFarlane	33	-0/+0
	Previously we assigned a random number (though in a deterministic way). But changes in the random package mean we get different results now on different architectures, even with the same random seed. We don't need random values; so now we just assign a value based on the list number id, which is guaranteed to be unique to the list marker.
2021-03-17	Add test for #7155.	John MacFarlane	1	-0/+15

2021-03-15	Update tests for new texmath.	John MacFarlane	5	-6/+6

2021-03-13	MediaWiki reader: Allow block-level content in notes (ref).	John MacFarlane	1	-0/+12
	Closes #7145.
2021-03-13	Use integral values for w:tblW in docx.	John MacFarlane	3	-0/+0
	Closes #7141.
2021-03-13	Use jira-wiki-markup 1.3.4	Albert Krewinkel	2	-3/+2
	Jira reader:
	* Fixed parsing of autolinks (i.e., of bare URLs in the text). Previously an autolink would take up the rest of a line, as spaces were allowed characters in these items.
	* Emoji character sequences no longer cause parsing failures. This was due to missing backtracking when emoji parsing fails.
	Jira writer:
	* Block quotes are only rendered as `bq.` if they do not contain a linebreak.
2021-03-13	Jira reader: mark divs created from panels with class "panel".	Albert Krewinkel	1	-0/+6
	Closes: tarleb/jira-wiki-markup#2
2021-03-13	Jira writer: improve div/panel handling	Albert Krewinkel	1	-0/+30
	Include div attributes in panels, always render divs with class `panel` as panels, and avoid nesting of panels.
2021-03-10	HTML writer: Add warnings on duplicate attribute values.	John MacFarlane	1	-0/+7
	This prevents emitting invalid HTML. Ultimately it would be good to prevent this in the types themselves, but this is better for now. T.P.Logging: Add DuplicateAttribute constructor to LogMessage. [API change]
2021-03-09	RST reader: fix logic for ending comments.	John MacFarlane	1	-0/+16
	Previously comments sometimes got extended too far. Closes #7134.
2021-03-09	Org writer: prevent unintended creation of ordered list items	Albert Krewinkel	1	-0/+10
	Adjust line wrapping if default wrapping would cause a line to be read as an ordered list item. Fixes #7132
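One way to picture the wrapping adjustment is the Python sketch below. It is an assumption-laden illustration rather than the Org writer's actual logic: `wrap_org` is a hypothetical greedy wrapper, and a token is treated as an ordered list marker when it is digits followed by `.` or `)`.

```python
import re

# Assumed shape of an Org ordered list marker: "1." or "1)".
MARKER = re.compile(r"^\d+[.)]$")


def wrap_org(text: str, width: int) -> list:
    """Greedy wrap that never starts a continuation line with a marker."""
    lines, current = [], ""
    for word in text.split():
        if not current:
            current = word
        elif len(current) + 1 + len(word) <= width or MARKER.match(word):
            # Either the word fits, or breaking here would put a
            # list-marker-like token at the start of the next line.
            current += " " + word
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines
```

With this sketch a line may run past `width` by the length of the marker token, which trades exact width for output that reads back as a paragraph rather than a list.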
2021-03-08	Jira writer: use noformat instead of code for unknown languages.	Albert Krewinkel	2	-33/+32
	Code blocks that are not marked as a language supported by Jira are rendered as preformatted text with `{noformat}` blocks. Fixes: tarleb/jira-wiki-markup#4
2021-03-07	LaTeX reader: handle table cells containing `&` in `\verb`.	John MacFarlane	1	-0/+27
	Closes #7129.
2021-03-01	Jira writer: use Span identifiers as anchors	Albert Krewinkel	1	-1/+8
	Closes: tarleb/jira-wiki-markup#3.
2021-02-28	Remove superfluous imports.	John MacFarlane	1	-2/+0

2021-02-28	T.P.Readers.LaTeX: Don't export tokenize, untokenize.	John MacFarlane	1	-16/+1
	[API change] These were only exported for testing, which seems the wrong thing to do. They don't belong in the public API and are not really usable as they are, without access to the Tok type which is not exported. Removed the tokenize/untokenize roundtrip test. We put a quickcheck property in the comments which may be used when this code is touched (if it is).
2021-02-26	Update tests for changes to https URLs.	John MacFarlane	6	-6/+6

2021-02-26	Fix/update URLs and use HTTPS where possible (#7122)	Salim B	1	-1/+1

2021-02-22	T.P.CSV: fix parsing of unquoted values.	John MacFarlane	1	-0/+15
	Previously we didn't allow unescaped quotes in unquoted values, but they are allowed. Closes #7112.
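The distinction between quoted and unquoted values can be sketched in Python as below. This is a simplified single-line parser written for illustration, not T.P.CSV itself; `parse_csv_line` is a hypothetical name, and rejecting stray text after a closing quote is an assumption of the sketch.

```python
def parse_csv_line(line: str, delim: str = ",") -> list:
    """Parse one CSV line; quotes are special only at the start of a field."""
    fields, i, n = [], 0, len(line)
    while True:
        if i < n and line[i] == '"':
            # Quoted field: "" stands for a literal quote character.
            i += 1
            buf = []
            while i < n:
                if line[i] == '"':
                    if i + 1 < n and line[i + 1] == '"':
                        buf.append('"')
                        i += 2
                    else:
                        i += 1  # closing quote
                        break
                else:
                    buf.append(line[i])
                    i += 1
            fields.append("".join(buf))
        else:
            # Unquoted field: read up to the delimiter, quotes kept as-is.
            j = line.find(delim, i)
            if j == -1:
                j = n
            fields.append(line[i:j])
            i = j
        if i == n:
            return fields
        if line[i] != delim:
            raise ValueError("unexpected text after closing quote")
        i += 1
```

For example, the line `a,b"c,d` parses to the three fields `a`, `b"c`, and `d` under this sketch.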
2021-02-22	tests: print accurate location if a test fails	Albert Krewinkel	1	-1/+1
	Ensures that tasty-hunit reports the location of the failing test instead of the location of the helper `test` function.
2021-02-22	Text.Pandoc.UTF8: change IO functions to return Text, not String.	John MacFarlane	3	-4/+5
	[API change] This affects `readFile`, `getContents`, `writeFileWith`, `writeFile`, `putStrWith`, `putStr`, `putStrLnWith`, `putStrLn`, `hPutStrWith`, `hPutStr`, `hPutStrLnWith`, `hPutStrLn`, `hGetContents`. This avoids the need to uselessly create a linked list of characters when emitting output.
2021-02-18	Revert "LaTeX template: disable `` ?` `` and `` !` `` ligatures."	John MacFarlane	4	-4/+0
	This reverts commit 24d7cd539ba70aa94480976a7957420c020cb19a.
2021-02-18	LaTeX template: disable `` ?` `` and `` !` `` ligatures.	John MacFarlane	4	-0/+4
	These are often triggered by accident in languages that use ` `` ` for end quote (e.g. German). See jgm/citeproc#54.
2021-02-18	Org reader: fix bug in org-ref citation parsing.	Albert Krewinkel	1	-0/+40
	The org-ref syntax allows listing multiple citations separated by commas. This fixes a bug that accepted commas as part of the citation id, so all citation lists were parsed as one single citation. Fixes: #7101
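To illustrate the intended behavior, here is a small Python sketch that splits an org-ref style citation into separate keys. It is not the Org reader's parser: `parse_org_ref` is a hypothetical helper, only the `cite:` prefix is handled, and the set of characters allowed in a key is an assumption.

```python
import re

# Assumed key alphabet; the comma is deliberately excluded so it can
# only act as a separator between citation keys.
KEY = r"[A-Za-z0-9_:-]+"
ORG_REF = re.compile(r"cite:(" + KEY + r"(?:," + KEY + r")*)")


def parse_org_ref(text: str) -> list:
    """Return the citation keys of the first cite: link found in text."""
    match = ORG_REF.search(text)
    return match.group(1).split(",") if match else []
```

With commas excluded from the key alphabet, `cite:doe2020,roe2019` gives two keys instead of the single key `doe2020,roe2019`.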
2021-02-16	Rename Text.Pandoc.XMLParser -> Text.Pandoc.XML.Light...	John MacFarlane	76	-1/+2
	..and add new definitions isomorphic to xml-light's, but with Text instead of String. This allows us to keep most of the code in existing readers that use xml-light, but avoid lots of unnecessary allocation.
	We also add versions of the functions from xml-light's Text.XML.Light.Output and Text.XML.Light.Proc that operate on our modified XML types, and functions that convert xml-light types to our types (since some of our dependencies, like texmath, use xml-light).
	Update golden tests for docx and pptx.
	OOXML test: Use `showContent` instead of `ppContent` in `displayDiff`.
	Docx: Do a manual traversal to unwrap sdt and smartTag. This is faster, and needed to pass the tests.
	Benchmarks:
	A = prior to 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
	B = as of 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
	C = this commit
	| Reader  | A     | B     | C     |
	| ------- | ----- | ----- | ----- |
	| docbook | 18 ms | 12 ms | 10 ms |
	| opml    | 65 ms | 62 ms | 35 ms |
	| jats    | 15 ms | 11 ms | 9 ms  |
	| docx    | 72 ms | 69 ms | 44 ms |
	| odt     | 78 ms | 41 ms | 28 ms |
	| epub    | 64 ms | 61 ms | 56 ms |
	| fb2     | 14 ms | 5 ms  | 4 ms  |
2021-02-15	JATS writer: add date-type to pub-date elements	Albert Krewinkel	2	-2/+2

2021-02-15	JATS writer: replace attribute "pub-type" with "publication-format".	Albert Krewinkel	2	-2/+2
	The former attribute is deprecated.
2021-02-13	HTML reader: fix bad handling of empty src attribute in iframe.	John MacFarlane	1	-2/+12
	- If src is empty, we simply skip the iframe.
	- If src is invalid or cannot be fetched, we issue a warning and skip instead of failing with an error.
	- Closes #7099.
2021-02-13	T.P.Error: export `renderError`.	John MacFarlane	1	-0/+8
	Refactor `handleError` to use `renderError`. This allows us to render error messages without exiting.
2021-02-13	Org: support task_lists extension	Albert Krewinkel	2	-11/+59
	The task lists extension is now supported by the org reader and writer; the extension is turned on by default. Closes: #6336
2021-02-12	Fix command test 5686	John MacFarlane	1	-1/+1

2021-02-12	Add command test for #7092	John MacFarlane	1	-0/+8

2021-02-12	Jira: require jira-wiki-markup 1.3.3	Albert Krewinkel	1	-0/+7
	* Modified the Doc parser to skip leading blank lines. This fixes parsing of documents which start with multiple blank lines. (#7095)
	* Prevent URLs within link aliases from being treated as autolinks. (#6944)
	Fixes: #7095
	Fixes: #6944
2021-02-10	Add new unexported module T.P.XMLParser.	John MacFarlane	80	-4/+11
	This exports functions that use xml-conduit's parser to produce an xml-light Element or [Content]. This allows existing pandoc code to use a better parser without much modification.
	The new parser is used in all places where xml-light's parser was previously used. Benchmarks show a significant performance improvement in parsing XML-based formats (especially ODT and FB2).
	Note that the xml-light types use String, so the conversion from xml-conduit types involves a lot of extra allocation. It would be desirable to avoid that in the future by gradually switching to using xml-conduit directly. This can be done module by module.
	The new parser also reports errors, which we report when possible. A new constructor PandocXMLError has been added to PandocError in T.P.Error [API change].
	Closes #7091, which was the main stimulus.
	These changes revealed the need for some changes in the tests. The docbook-reader.docbook test lacked definitions for the entities it used; these have been added. And the docx golden tests have been updated, because the new parser does not preserve the order of attributes.
	Add entity defs to docbook-reader.docbook.
	Update golden tests for docx.
2021-02-07	Avoid unnecessary use of NoImplicitPrelude pragma (#7089)	Albert Krewinkel	53	-102/+0

2021-02-06	Markdown reader: improved handling of mmd link attributes in references.	John MacFarlane	1	-0/+8
	Previously they only worked for links that had titles. Closes #7080.