pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2021-02-18	DocBook, JATS, OPML readers: performance optimization.	John MacFarlane	3	-64/+8
	With the new XML parser, we can avoid the expensive tree normalization step we used to do. This gives a significant speed boost in docbook and JATS parsing (e.g. 9.7 to 6 ms).
2021-02-18	T.P.XML: improve fromEntities.	John MacFarlane	1	-17/+13

2021-02-18	T.P.PDF: disable `smart` when building PDF via LaTeX.	John MacFarlane	1	-1/+5
	This prevents the accidental creation of ligatures like `` ?` `` and `` !` `` (especially in languages such as German that use ` `` ` for end quote), and similar ligature issues. See jgm/citeproc#54.
2021-02-18	Revert "LaTeX template: disable `` ?` `` and `` !` `` ligatures."	John MacFarlane	5	-5/+0
	This reverts commit 24d7cd539ba70aa94480976a7957420c020cb19a.
2021-02-18	LaTeX template: disable `` ?` `` and `` !` `` ligatures.	John MacFarlane	5	-0/+5
	These are often triggered by accident in languages that use ` `` ` for end quote (e.g. German). See jgm/citeproc#54.
2021-02-18	LaTeX writer: adjust hypertargets to beginnings of paragraphs.	John MacFarlane	1	-2/+3
	Use `\vadjust pre` so that the hypertarget takes you to the beginning of the paragraph rather than one line down. Closes #7078. This makes a particular difference for links to citations using `--citeproc` and `link-citations: true`.
2021-02-18	T.P.Shared: cleanup.	John MacFarlane	1	-11/+26
	Cleaned up some functions and added deprecation pragmas to functions no longer used in the code base.
2021-02-18	Org reader: fix bug in org-ref citation parsing.	Albert Krewinkel	2	-1/+41
	The org-ref syntax allows multiple citations to be listed, separated by commas. This fixes a bug that accepted commas as part of the citation id, so that every citation list was parsed as a single citation. Fixes: #7101
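The intended behavior can be illustrated with a minimal Python sketch. It is not pandoc's Haskell parser: the function name `parse_org_ref_citation` is hypothetical, and the `cite:key1,key2` input shape is an assumption about org-ref syntax. The point is only that the comma separates ids rather than belonging to one.

```python
def parse_org_ref_citation(text):
    """Split an org-ref style citation into a list of citation ids.

    Hypothetical helper for illustration; assumes input shaped like
    'cite:key1,key2'.
    """
    prefix, sep, keys = text.partition(":")
    if not sep:
        # No "prefix:" part, so this is not treated as a citation.
        return []
    # Commas separate ids; they are never part of an id.
    return [key.strip() for key in keys.split(",") if key.strip()]
```

Under the bug described above, `cite:doe2020,roe2021` would have yielded the single id `doe2020,roe2021`; the sketch returns two ids instead.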
2021-02-18	Allow base64-bytestring-1.2.*	Dmitrii Kovanikov	1	-2/+2

2021-02-17	Docx reader: use Map instead of list for Namespaces.	John MacFarlane	2	-20/+20
	This gives a speedup of about 5-10%. The reader is now approximately twice as fast as in the last release.
2021-02-16	Revert "Add T.P.XML.Light.Cursor."	John MacFarlane	2	-347/+0
	This reverts commit d8fc4971868104274881570ce9bc3d9edf0d2506.
2021-02-16	Add T.P.XML.Light.Cursor.	John MacFarlane	2	-0/+347

2021-02-16	Add orig copyright/license info for code derived from xml-light.	John MacFarlane	3	-3/+12

2021-02-16	Split up T.P.XML.Light into submodules.	John MacFarlane	5	-504/+568

2021-02-16	Rename Text.Pandoc.XMLParser -> Text.Pandoc.XML.Light...	John MacFarlane	102	-930/+1388
	..and add new definitions isomorphic to xml-light's, but with Text instead of String. This allows us to keep most of the code in existing readers that use xml-light, but avoid lots of unnecessary allocation.
	We also add versions of the functions from xml-light's Text.XML.Light.Output and Text.XML.Light.Proc that operate on our modified XML types, and functions that convert xml-light types to our types (since some of our dependencies, like texmath, use xml-light).
	Update golden tests for docx and pptx.
	OOXML test: Use `showContent` instead of `ppContent` in `displayDiff`.
	Docx: Do a manual traversal to unwrap sdt and smartTag. This is faster, and needed to pass the tests.
	Benchmarks:
	A = prior to 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
	B = as of 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
	C = this commit
	| Reader  | A     | B     | C     |
	| ------- | ----- | ----- | ----- |
	| docbook | 18 ms | 12 ms | 10 ms |
	| opml    | 65 ms | 62 ms | 35 ms |
	| jats    | 15 ms | 11 ms | 9 ms  |
	| docx    | 72 ms | 69 ms | 44 ms |
	| odt     | 78 ms | 41 ms | 28 ms |
	| epub    | 64 ms | 61 ms | 56 ms |
	| fb2     | 14 ms | 5 ms  | 4 ms  |
2021-02-15	JATS writer: add date-type to pub-date elements	Albert Krewinkel	4	-6/+13

2021-02-15	JATS writer: replace attribute "pub-type" with "publication-format".	Albert Krewinkel	4	-9/+12
	The former attribute is deprecated.
2021-02-14	T.P.Error: remove unused variables	Albert Krewinkel	1	-2/+2

2021-02-14	Allow tasty 1.4.*	Albert Krewinkel	1	-1/+1

2021-02-13	HTML reader: fix bad handling of empty src attribute in iframe.	John MacFarlane	2	-8/+24
	- If src is empty, we simply skip the iframe.
	- If src is invalid or cannot be fetched, we issue a warning and skip instead of failing with an error.
	- Closes #7099.
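The skip-or-warn policy described in this commit can be sketched in Python as follows. This is an illustration under stated assumptions, not the HTML reader's code: `resolve_iframe`, the `fetch` callback, and the `warnings` list are all hypothetical names.

```python
def resolve_iframe(src, fetch, warnings):
    """Return fetched iframe content, or None if the iframe is skipped.

    Hypothetical sketch: `fetch` is any callable that returns content
    for a source string or raises on failure.
    """
    if not src:
        # Empty src: skip the iframe silently.
        return None
    try:
        return fetch(src)
    except Exception as err:
        # Invalid or unfetchable src: warn and skip rather than fail.
        warnings.append(f"Could not fetch {src}: {err}")
        return None
```

Collecting warnings in a list keeps the failure non-fatal, which matches the change from an error to a warning.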
2021-02-13	T.P.Error: export `renderError`.	John MacFarlane	2	-33/+80
	Refactor `handleError` to use `renderError`. This allows us to render error messages without exiting.
2021-02-13	Org: support task_lists extension	Albert Krewinkel	5	-16/+113
	The task lists extension is now supported by the org reader and writer; the extension is turned on by default. Closes: #6336
2021-02-13	T.P.Shared: export `handleTaskListItem`. [API change]	Albert Krewinkel	1	-0/+1

2021-02-13	LaTeX reader: remove unnecessary line	John MacFarlane	1	-1/+0

2021-02-13	Remove Ext_fenced_code_attributes from allowed commonmark attributes.	John MacFarlane	1	-2/+0
	This attribute was listed as allowed, but it didn't actually do anything. Use `attributes` for code attributes and more. Closes #7097.
2021-02-13	Clean up benchmark code.	John MacFarlane	2	-75/+41
	Now we can select benchmarks by pattern using `-p blah`.
2021-02-12	Avoid an unnecessary withRaw.	John MacFarlane	1	-1/+4

2021-02-12	LaTeX reader improvements.	John MacFarlane	2	-22/+68
	* Rewrote `withRaw` so it doesn't rely on fragile assumptions about token positions (which break when macros are expanded). This requires the addition of `sEnableWithRaw` and `sRawTokens` in `LaTeXState`, and a new combinator `disablingWithRaw` to disable collecting of raw tokens in certain contexts.
	* Add `parseFromToks` to T.P.Readers.LaTeX.Parsing.
	* Fix parsing of single character tokens so it doesn't mess up the new raw token collecting.
	* These changes slightly increase allocations and have a small performance impact, but it's minor.
	Closes #7092.
2021-02-12	Fix command test 5686	John MacFarlane	1	-1/+1

2021-02-12	Add command test for #7092	John MacFarlane	1	-0/+8

2021-02-12	Jira: require jira-wiki-markup 1.3.3	Albert Krewinkel	3	-2/+9
	* Modified the Doc parser to skip leading blank lines. This fixes parsing of documents which start with multiple blank lines. (#7095)
	* Prevent URLs within link aliases from being treated as autolinks. (#6944)
	Fixes: #7095
	Fixes: #6944
2021-02-11	Add MANUAL section on reproducible builds.	John MacFarlane	1	-0/+15

2021-02-11	Use getTimestamp instead of getCurrentTime in writers.	John MacFarlane	5	-7/+7
	Setting SOURCE_DATE_EPOCH will allow reproducible builds. Partially addresses #7093. This does not suffice to fully enable reproducible builds in EPUB, since a unique id is being generated for each build.
2021-02-11	T.P.Class: Add getTimestamp [API change].	John MacFarlane	1	-2/+19
	This attempts to read the SOURCE_DATE_EPOCH environment variable and parse a UTC time from it (treating it as a unix date stamp, see https://reproducible-builds.org/specs/source-date-epoch/). If the variable is not set or can't be parsed as a unix date stamp, then the function returns the current date.
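The fallback logic described here can be sketched in Python. This is an illustration only, not the Haskell `getTimestamp` from T.P.Class: the function name `get_timestamp` and its optional `environ` parameter are hypothetical, added so the behavior can be exercised without touching the real environment.

```python
import os
from datetime import datetime, timezone


def get_timestamp(environ=None):
    """Return a UTC time from SOURCE_DATE_EPOCH, else the current time.

    Hypothetical sketch; `environ` defaults to the process environment.
    """
    env = os.environ if environ is None else environ
    raw = env.get("SOURCE_DATE_EPOCH")
    if raw is not None:
        try:
            # Treat the value as seconds since the Unix epoch.
            return datetime.fromtimestamp(int(raw), tz=timezone.utc)
        except (ValueError, OverflowError, OSError):
            # Unparseable or out-of-range value: fall through.
            pass
    return datetime.now(timezone.utc)
```

For example, `get_timestamp({"SOURCE_DATE_EPOCH": "0"})` gives the start of the Unix epoch, while an unset or malformed variable gives the current time.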
2021-02-11	Correctly parse "raw" date value in markdown references metadata.	John MacFarlane	1	-3/+5
	See jgm/citeproc#53.
2021-02-10	Add new unexported module T.P.XMLParser.	John MacFarlane	98	-91/+238
	This exports functions that use xml-conduit's parser to produce an xml-light Element or [Content]. This allows existing pandoc code to use a better parser without much modification.
	The new parser is used in all places where xml-light's parser was previously used. Benchmarks show a significant performance improvement in parsing XML-based formats (especially ODT and FB2).
	Note that the xml-light types use String, so the conversion from xml-conduit types involves a lot of extra allocation. It would be desirable to avoid that in the future by gradually switching to using xml-conduit directly. This can be done module by module.
	The new parser also reports errors, which we report when possible.
	A new constructor PandocXMLError has been added to PandocError in T.P.Error [API change].
	Closes #7091, which was the main stimulus.
	These changes revealed the need for some changes in the tests. The docbook-reader.docbook test lacked definitions for the entities it used; these have been added. And the docx golden tests have been updated, because the new parser does not preserve the order of attributes.
	Add entity defs to docbook-reader.docbook.
	Update golden tests for docx.
2021-02-08	Use lts-17.2 resolver (with ghc 8.10.3).	John MacFarlane	1	-10/+1

2021-02-08	ODT reader: finer-grained errors on parse failure.	John MacFarlane	1	-21/+18
	See #7091.
2021-02-08	ODT reader: give more information if zip can't be unpacked.	John MacFarlane	1	-1/+4

2021-02-08	DocBook reader: Support informalfigure (#7079)	Nils Carlson	1	-1/+3
	Add support for informalfigure.
2021-02-07	Avoid unnecessary use of NoImplicitPrelude pragma (#7089)	Albert Krewinkel	59	-112/+1

2021-02-07	pandoc.cabal: use common stanza to reduce duplication (#7086)	Albert Krewinkel	1	-124/+45

2021-02-07	Document no template fallback for absolute path (#7088)	Nixon Enraght-Moony	1	-1/+2
	See jgm/pandoc#7077
2021-02-06	Markdown reader: improved handling of mmd link attributes in references.	John MacFarlane	2	-0/+10
	Previously they only worked for links that had titles. Closes #7080.
2021-02-06	stack.yaml - use commonmark-0.1.1.4 for GHC 9	John MacFarlane	1	-1/+1

2021-02-06	CI: use haskell/actions/setup.	John MacFarlane	1	-3/+3
	actions/haskell-setup is no longer maintained.
2021-02-06	CI: use cabal 2.2 when building with GHC 8.0.2. (#7085)	Albert Krewinkel	1	-5/+8

2021-02-04	Lua filters: use same function names in Haskell and Lua	Albert Krewinkel	3	-28/+31

2021-02-04	doc/lua-filters.md: improve docs for `pandoc.mediabag.insert`	Albert Krewinkel	1	-2/+3

2021-02-04	doc/lua-filters.md: fix, improve docs for `pandoc.mediabag.fetch`	Albert Krewinkel	1	-2/+12