pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2021-03-13	Use integral values for w:tblW in docx.	John MacFarlane	3	-0/+0
	Closes #7141.
2021-03-13	Use jira-wiki-markup 1.3.4	Albert Krewinkel	2	-3/+2
	Jira reader:
	* Fixed parsing of autolinks (i.e., of bare URLs in the text). Previously an autolink would take up the rest of a line, as spaces were allowed characters in these items.
	* Emoji character sequences no longer cause parsing failures. This was due to missing backtracking when emoji parsing fails.
	Jira writer:
	* Block quotes are only rendered as `bq.` if they do not contain a linebreak.
2021-03-13	Jira reader: mark divs created from panels with class "panel".	Albert Krewinkel	1	-0/+6
	Closes: tarleb/jira-wiki-markup#2
2021-03-13	Jira writer: improve div/panel handling	Albert Krewinkel	1	-0/+30
	Include div attributes in panels, always render divs with class `panel` as panels, and avoid nesting of panels.
2021-03-10	HTML writer: Add warnings on duplicate attribute values.	John MacFarlane	1	-0/+7
	This prevents emitting invalid HTML. Ultimately it would be good to prevent this in the types themselves, but this is better for now. T.P.Logging: Add DuplicateAttribute constructor to LogMessage. [API change]
2021-03-09	RST reader: fix logic for ending comments.	John MacFarlane	1	-0/+16
	Previously comments sometimes got extended too far. Closes #7134.
2021-03-09	Org writer: prevent unintended creation of ordered list items	Albert Krewinkel	1	-0/+10
	Adjust line wrapping if default wrapping would cause a line to be read as an ordered list item. Fixes #7132
2021-03-08	Jira writer: use noformat instead of code for unknown languages.	Albert Krewinkel	2	-33/+32
	Code blocks that are not marked as a language supported by Jira are rendered as preformatted text with `{noformat}` blocks. Fixes: tarleb/jira-wiki-markup#4
2021-03-07	LaTeX reader: handle table cells containing `&` in `\verb`.	John MacFarlane	1	-0/+27
	Closes #7129.
2021-03-01	Jira writer: use Span identifiers as anchors	Albert Krewinkel	1	-1/+8
	Closes: tarleb/jira-wiki-markup#3.
2021-02-28	Remove superfluous imports.	John MacFarlane	1	-2/+0

2021-02-28	T.P.Readers.LaTeX: Don't export tokenize, untokenize.	John MacFarlane	1	-16/+1
	[API change] These were only exported for testing, which seems the wrong thing to do. They don't belong in the public API and are not really usable as they are, without access to the Tok type which is not exported. Removed the tokenize/untokenize roundtrip test. We put a quickcheck property in the comments which may be used when this code is touched (if it is).
2021-02-26	Update tests for changes to https URLs.	John MacFarlane	6	-6/+6

2021-02-26	Fix/update URLs and use HTTPS where possible (#7122)	Salim B	1	-1/+1

2021-02-22	T.P.CSV: fix parsing of unquoted values.	John MacFarlane	1	-0/+15
	Previously we didn't allow unescaped quotes in unquoted values, but they are allowed. Closes #7112.
2021-02-22	tests: print accurate location if a test fails	Albert Krewinkel	1	-1/+1
	Ensures that tasty-hunit reports the location of the failing test instead of the location of the helper `test` function.
2021-02-22	Text.Pandoc.UTF8: change IO functions to return Text, not String.	John MacFarlane	3	-4/+5
	[API change] This affects `readFile`, `getContents`, `writeFileWith`, `writeFile`, `putStrWith`, `putStr`, `putStrLnWith`, `putStrLn`, `hPutStrWith`, `hPutStr`, `hPutStrLnWith`, `hPutStrLn`, `hGetContents`. This avoids the need to uselessly create a linked list of characters when emitting output.
2021-02-18	Revert "LaTeX template: disable `` ?` `` and `` !` `` ligatures."	John MacFarlane	4	-4/+0
	This reverts commit 24d7cd539ba70aa94480976a7957420c020cb19a.
2021-02-18	LaTeX template: disable `` ?` `` and `` !` `` ligatures.	John MacFarlane	4	-0/+4
	These are often triggered by accident in languages that use ` `` ` for end quote (e.g. German). See jgm/citeproc#54.
2021-02-18	Org reader: fix bug in org-ref citation parsing.	Albert Krewinkel	1	-0/+40
	The org-ref syntax allows listing multiple citations separated by commas. This fixes a bug that accepted commas as part of the citation id, so all citation lists were parsed as one single citation. Fixes: #7101
2021-02-16	Rename Text.Pandoc.XMLParser -> Text.Pandoc.XML.Light...	John MacFarlane	76	-1/+2
	..and add new definitions isomorphic to xml-light's, but with Text instead of String. This allows us to keep most of the code in existing readers that use xml-light, but avoid lots of unnecessary allocation. We also add versions of the functions from xml-light's Text.XML.Light.Output and Text.XML.Light.Proc that operate on our modified XML types, and functions that convert xml-light types to our types (since some of our dependencies, like texmath, use xml-light).
	Update golden tests for docx and pptx.
	OOXML test: Use `showContent` instead of `ppContent` in `displayDiff`.
	Docx: Do a manual traversal to unwrap sdt and smartTag. This is faster, and needed to pass the tests.
	Benchmarks:
	A = prior to 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
	B = as of 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
	C = this commit
	| Reader  | A     | B     | C     |
	| ------- | ----- | ----- | ----- |
	| docbook | 18 ms | 12 ms | 10 ms |
	| opml    | 65 ms | 62 ms | 35 ms |
	| jats    | 15 ms | 11 ms | 9 ms  |
	| docx    | 72 ms | 69 ms | 44 ms |
	| odt     | 78 ms | 41 ms | 28 ms |
	| epub    | 64 ms | 61 ms | 56 ms |
	| fb2     | 14 ms | 5 ms  | 4 ms  |
2021-02-15	JATS writer: add date-type to pub-date elements	Albert Krewinkel	2	-2/+2

2021-02-15	JATS writer: replace attribute "pub-type" with "publication-format".	Albert Krewinkel	2	-2/+2
	The former attribute is deprecated.
2021-02-13	HTML reader: fix bad handling of empty src attribute in iframe.	John MacFarlane	1	-2/+12
	- If src is empty, we simply skip the iframe.
	- If src is invalid or cannot be fetched, we issue a warning and skip instead of failing with an error.
	- Closes #7099.
2021-02-13	T.P.Error: export `renderError`.	John MacFarlane	1	-0/+8
	Refactor `handleError` to use `renderError`. This allows us to render error messages without exiting.
2021-02-13	Org: support task_lists extension	Albert Krewinkel	2	-11/+59
	The task lists extension is now supported by the org reader and writer; the extension is turned on by default. Closes: #6336
2021-02-12	Fix command test 5686	John MacFarlane	1	-1/+1

2021-02-12	Add command test for #7092	John MacFarlane	1	-0/+8

2021-02-12	Jira: require jira-wiki-markup 1.3.3	Albert Krewinkel	1	-0/+7
	* Modified the Doc parser to skip leading blank lines. This fixes parsing of documents which start with multiple blank lines. (#7095)
	* Prevent URLs within link aliases from being treated as autolinks. (#6944)
	Fixes: #7095
	Fixes: #6944
2021-02-10	Add new unexported module T.P.XMLParser.	John MacFarlane	80	-4/+11
	This exports functions that use xml-conduit's parser to produce an xml-light Element or [Content]. This allows existing pandoc code to use a better parser without much modification. The new parser is used in all places where xml-light's parser was previously used. Benchmarks show a significant performance improvement in parsing XML-based formats (especially ODT and FB2). Note that the xml-light types use String, so the conversion from xml-conduit types involves a lot of extra allocation. It would be desirable to avoid that in the future by gradually switching to using xml-conduit directly. This can be done module by module. The new parser also reports errors, which we report when possible. A new constructor PandocXMLError has been added to PandocError in T.P.Error [API change]. Closes #7091, which was the main stimulus. These changes revealed the need for some changes in the tests. The docbook-reader.docbook test lacked definitions for the entities it used; these have been added. And the docx golden tests have been updated, because the new parser does not preserve the order of attributes. Add entity defs to docbook-reader.docbook. Update golden tests for docx.
2021-02-07	Avoid unnecessary use of NoImplicitPrelude pragma (#7089)	Albert Krewinkel	53	-102/+0

2021-02-06	Markdown reader: improved handling of mmd link attributes in references.	John MacFarlane	1	-0/+8
	Previously they only worked for links that had titles. Closes #7080.
2021-02-03	LaTeX template: Update to iftex package (#7073)	Andrew Dunning	4	-15/+15
	Load the iftex package directly rather than via the ifxetex and ifluatex compatibility wrappers, which have been merged into a single package that is part of the LaTeX core. The capitalization of the commands has been changed for compatibility with older versions of TeX Live that have the version of iftex by the Persian TeX Group. This had been removed in <https://github.com/jgm/pandoc/commit/2845794c0c31b2ef1f3e6a73bb5b109da4c74f37> for compatibility with BasicTeX, but that is no longer an issue.
2021-02-02	Fixed some compiler warnings in tests.	John MacFarlane	3	-14/+3

2021-02-02	Add tests for search_path_separator	Albert Krewinkel	1	-0/+8

2021-02-02	Check that all documented functions are present.	Albert Krewinkel	1	-0/+19
	Rely on tests in the module package to check the correctness of each function.
2021-02-02	Lua: add module "pandoc.path"	Albert Krewinkel	2	-0/+19
	The module allows working with file paths in a convenient and platform-independent manner. Closes: #6001 Closes: #6565
2021-02-02	Test suite: a more robust way of testing the executable.	John MacFarlane	4	-72/+64
	Many of our tests require running the pandoc executable. This is problematic for a few different reasons. First, cabal-install will sometimes run the test suite after building the library but before building the executable, which means the executable isn't in place for the tests. One can work around that by first building, then building and running the tests, but that's fragile. Second, we have to find the executable. So far, we've done that using a function findPandoc that attempts to locate it relative to the test executable (which can be located using findExecutablePath). But the logic here is delicate and doesn't work with every combination of options.
	To solve both problems, we add an `--emulate` option to the `test-pandoc` executable. When `--emulate` occurs as the first argument passed to `test-pandoc`, the program simply emulates the regular pandoc executable, using the rest of the arguments (after `--emulate`). Thus,
	    test-pandoc --emulate -f markdown -t latex
	is just like
	    pandoc -f markdown -t latex
	Since all the work is done by library functions, implementing this emulation just takes a couple lines of code and should be entirely reliable.
	With this change, we can test the pandoc executable by running the test program itself (locatable using findExecutablePath) with the `--emulate` option. This removes the need for the fragile `findPandoc` step, and it means we can run our integration tests even when we're just building the library, not the executable.
	Part of this change involved simplifying some complex handling to set environment variables for dynamic library paths. I have tested a build with `--enable-dynamic-executable`, and it works, but further testing may be needed.
2021-02-01	BibTeX writer: use doclayout and doctemplate.	John MacFarlane	1	-3/+3
	This change allows bibtex/biblatex output to wrap as other formats do, depending on the settings of `--wrap` and `--columns`. It also introduces default templates for bibtex and biblatex, which allow for using the variables `header-include`, `include-before` or `include-after` (or alternatively the command line options `--include-in-header`, `--include-before-body`, `--include-after-body`) to insert content into the generated bibtex/biblatex. This change requires a change in the return type of the unexported `T.P.Citeproc.writeBibTeXString` from `Text` to `Doc Text`. Closes #7068.
2021-02-01	BibTeX writer fixes. Closes #7067.	John MacFarlane	1	-0/+90
	+ Require citeproc 0.3.0.7, which correctly titlecases when titles contain non-ASCII characters.
	+ Correctly handle 'pages' (= 'page' in CSL).
	+ Correctly handle BibLaTeX 'langid' (= 'language' in CSL).
	+ In BibTeX output, protect foreign titles since there's no language field.
2021-01-31	RST reader: fix handling of header in CSV tables.	John MacFarlane	1	-0/+32
	The interpretation of this line is not affected by the delim option. Closes #7064.
2021-01-29	Markdown writer: handle math right before digit.	John MacFarlane	1	-0/+6
	We insert an HTML comment to avoid a `$` right before a digit, which pandoc will not recognize as a math delimiter.
2021-01-26	Clean up BibTeX parsing.	John MacFarlane	2	-5/+6
	Previously there was a messy code path that gave strange results in some cases, not passing through raw tex but trying to extract a string content. This was an artefact of trying to handle some special bibtex-specific commands in the BibTeX reader. Now we just handle these in the LaTeX reader and simplify parsing in the BibTeX reader. This does mean that more raw tex will be passed through (and currently this is not sensitive to the `raw_tex` extension; this should be fixed). Closes #7049.
2021-01-22	ImageSize: use viewBox for svg if no length, width.	John MacFarlane	1	-16/+14
	This change allows pandoc to extract size information from more SVGs. Closes #7045.
2021-01-22	JATS writer: allow to use element-citation	Albert Krewinkel	1	-0/+146

2021-01-19	JATS writer: Ensure that disp-quote is always wrapped in p.	John MacFarlane	3	-96/+128
	Closes #7041.
2021-01-16	Revert "Markdown reader: support GitHub wiki's internal links (#2923) (#6458)"	John MacFarlane	1	-30/+0
	This reverts commit 6efd3460a776620fdb93812daa4f6831e6c332ce. Since this extension is designed to be used with GitHub markdown (gfm), we need to implement the parser as a commonmark extension (commonmark-extensions), rather than in pandoc's markdown reader. When that is done, we can add it here.
2021-01-16	Markdown reader: support GitHub wiki's internal links (#2923) (#6458)	Gautier DI FOLCO	1	-0/+30
	Changes overview:
	* Add a `Ext_markdown_github_wikilink` constructor to `Extension` [API change].
	* Add the parser `githubWikiLink` in `Text.Pandoc.Readers.Markdown`.
	* Add tests.
2021-01-15	Use dev version of citeproc.	John MacFarlane	1	-11/+9
	Change a citation test which had wrong disambiguation (see jgm/citeproc#44).
2021-01-12	Docx writer: handle table header using styles.	John MacFarlane	32	-0/+0
	Instead of hard-coding the border and header cell vertical alignment, we now let this be determined by the Table style, making use of Word's "conditional formatting" for the table's first row. For headerless tables, we use the tblLook element to tell Word not to apply conditional first-row formatting. Closes #7008.