pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2020-04-28	Support new Underline element in readers and writers (#6277)	Vaibhav Sagar	4	-10/+6
	Deprecate `underlineSpan` in Shared in favor of `Text.Pandoc.Builder.underline`.
2020-04-19	More fixes for round-trip tests of HTML reader.	John MacFarlane	1	-6/+10
	We exclude tables that have default widths but non-simple content, as these can't really round-trip.
2020-04-18	Fixed round-trip HTML tests.	John MacFarlane	1	-0/+5
	Exclude tables with cells with line breaks because they don't currently round-trip. (Table goes from being simple to having explicit widths.)
2020-04-17	API change: use PandocError for exceptions in Lua subsystem	Albert Krewinkel	1	-7/+8
	The PandocError type is now used throughout the Lua subsystem: all Lua functions throw an exception of this type if an error occurs. The `LuaException` type is removed and no longer exported from `Text.Pandoc.Lua`. In its place, a new constructor `PandocLuaError` is added to PandocError.
2020-04-15	Modify toLegacyTable to cut up cells, add tests	despresc	1	-0/+61
	Now a cell with dimension (h, w) will be cut up into h*w cells of dimension (1,1), all in the same grid position, with the upper-left holding the original cell contents and the rest being empty.
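A minimal sketch of the cutting step described above, using a hypothetical, simplified model rather than pandoc's actual Table types. It assumes the pieces occupy the grid slots that the original cell spanned, and the name `cutUp` and the tuple representation are invented for illustration.

```haskell
-- Hypothetical model: a cell is identified by its upper-left grid
-- position, its dimension (height, width), and its contents.
-- This is NOT the toLegacyTable implementation, only an illustration.

-- | Cut a cell of dimension (h, w) into h*w pieces of dimension (1, 1).
-- The upper-left piece keeps the contents; every other piece is empty.
cutUp :: (Int, Int) -> (Int, Int) -> String -> [((Int, Int), String)]
cutUp (r0, c0) (h, w) body =
  [ ((r, c), if (r, c) == (r0, c0) then body else "")
  | r <- [r0 .. r0 + h - 1]
  , c <- [c0 .. c0 + w - 1]
  ]

main :: IO ()
main = mapM_ print (cutUp (0, 0) (2, 3) "A")
```

A cell spanning 2 rows and 3 columns yields six pieces, of which only the one at the original upper-left position carries the contents.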
2020-04-15	Use the new builders, modify readers to preserve empty headers	despresc	8	-144/+163
	The Builder.simpleTable now only adds a row to the TableHead when the given header row is not null. This uncovered an inconsistency in the readers: some would unconditionally emit a header filled with empty cells, even if the header was not present. Now every reader has the conditional behaviour. Only the XWiki writer depended on the header row being always present; it now pads its head as necessary.
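The conditional header behaviour can be sketched as follows. This is an illustration under simplifying assumptions: `Cell`, `Row`, `headRows`, and `padHead` are hypothetical names, not the definitions in Text.Pandoc.Builder or the XWiki writer.

```haskell
-- Hypothetical, simplified types; pandoc's real Table type is richer.
type Cell = String
type Row  = [Cell]

-- | Emit a head row only when a header row was actually given.
headRows :: Row -> [Row]
headRows hdr = [hdr | not (null hdr)]

-- | What a writer that always needs a header might do: take the first
-- head row if there is one, and pad it with empty cells to n columns.
padHead :: Int -> [Row] -> Row
padHead n []      = replicate n ""
padHead n (r : _) = take n (r ++ repeat "")

main :: IO ()
main = do
  print (headRows [])               -- no header given: no head rows
  print (headRows ["a", "b"])       -- header given: one head row
  print (padHead 2 (headRows []))   -- writer-side padding
```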
2020-04-15	Adapt to the newest Table type, fix some previous adaptation issues	despresc	8	-50/+80
	- Writers.Native is now adapted to the new Table type.
	- Inline captions should now be conditionally wrapped in a Plain, not a Para block.
	- The toLegacyTable function now lives in Writers.Shared.
2020-04-15	Implement the new Table type	despresc	8	-47/+69

2020-04-15	Markdown Reader: Fix inline code in lists (#6284)	Nikolay Yakimov	1	-0/+47
	Closes #6284. Previously inline code containing list markers was sometimes parsed incorrectly.
2020-04-04	Jira: support citations, attachment links, and user links	Albert Krewinkel	2	-3/+67
	Closes: #6231 Closes: #6238 Closes: #6239
2020-04-03	Jira reader: resolve parsing issues of blockquote, color	Albert Krewinkel	2	-2/+21
	Parsing problems occurring with block quotes and colored text have been resolved. Fixes: #6233 Fixes: #6235
2020-03-31	Jira reader: use span with class `underline` for inserted text	Albert Krewinkel	1	-0/+4
	Jira text which is marked as `+inserted+` is converted into pandoc's default representation for underlined text: a span with class `underline`. Previously, the span was marked with the non-standard class `inserted`. Closes: #6237
2020-03-31	Jira writer: convert spans with class `underline` to inserted text	Albert Krewinkel	1	-0/+27
	Spans with class `underline` are converted into Jira text marked as `+inserted+`, i.e. surrounded by plus signs.
2020-03-30	Jira reader: retain image attributes	Albert Krewinkel	1	-0/+9
	Jira image attributes, as in `!image.jpg|align=right!`, are retained as key-value pairs. Thumbnail images, such as `!example.gif|thumbnail!`, are marked by a `thumbnail` class in their attributes. Related to #6234.
2020-03-28	More cleanup (#6209)	Joseph C. Sible	1	-3/+2
	* Simplify by collapsing a do block into a single <$>
	* Remove an unnecessary variable: `all` takes any Foldable, so only blocksToInlines needs toList.
2020-03-19	Jira reader: fix parsing of tables without preceding blankline	Albert Krewinkel	1	-0/+5
	A bug was fixed which caused faulty parsing if a table was not preceded by a newline and the first table cell had no space after the initial `|` characters. Fixes: #6198
2020-03-18	Jira reader: fix parsing of strikeout, emphasis	Albert Krewinkel	1	-0/+4
	A bug was fixed which caused non-emphasized text containing digits and/or non-special symbols (like dots) to sometimes be parsed incorrectly. Fixes: #6196
2020-03-13	Update copyright year (#6186)	Albert Krewinkel	34	-35/+35
	* Update copyright year
	* Copyright: add notes for Lua and Jira modules
2020-03-13	Jira reader: support colored inline text, indented lists	Albert Krewinkel	1	-0/+4
	* Support for colored inlines has been added.
	* Lists are now allowed to be indented; i.e., lists are still recognized if list markers are preceded by spaces.
	Closes: #6183, #6184
2020-03-05	Fix man reader test for previous change.	John MacFarlane	1	-1/+1

2020-02-12	Introduce new format variants for JATS (#6067)	Albert Krewinkel	2	-2/+23
	New formats:
	- `jats_archiving` for the "Archiving and Interchange Tag Set",
	- `jats_publishing` for the "Journal Publishing Tag Set", and
	- `jats_articleauthoring` for the "Article Authoring Tag Set."
	The "jats" output format is now an alias for "jats_archiving".
	Closes: #6014
2020-02-08	Use <$> instead of >>= and return (#6128)	Joseph C. Sible	1	-1/+1

2020-02-07	Apply linter suggestions. Add fix_spacing to lint target in Makefile.	John MacFarlane	9	-51/+50

2020-02-07	Various minor cleanups and refactoring (#6117)	Joseph C. Sible	1	-3/+3
	* Use concatMap instead of reimplementing it
	* Replace an unnecessary multi-way if with a regular if
	* Use sortOn instead of sortBy and comparing
	* Use guards instead of lots of indents for if and else
	* Remove redundant do blocks
	* Extract common functions from both branches of maybe. Whenever both the Nothing and the Just branch of maybe do the same function, do that function on the result of maybe instead.
	* Use fmap instead of reimplementing it from maybe
	* Use negative forms instead of negating the positive forms
	* Use mapMaybe instead of mapping and then using catMaybes
	* Use zipWith instead of mapping over the result of zip
	* Use unwords instead of reimplementing it
	* Use <$ instead of <$> and const
	* Replace case of Bool with if and else
	* Use find instead of listToMaybe and filter
	* Use zipWithM instead of mapM and zip
	* Inline lambda wrappers into the real functions
	* We get zipWithM from Text.Pandoc.Writers.Shared
	* Use maybe instead of fromMaybe and fmap. I'm not sure how this one slipped past me.
	* Increase a bit of indentation
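Several of the rewrites listed above are pure equivalences that can be checked with `base` alone. The sketch below pairs a few "before" and "after" forms on sample data. The helper `halve`, the sample list, and the `checks` table are invented for illustration and are not taken from the pandoc source.

```haskell
import Data.List (find, sortBy, sortOn)
import Data.Maybe (catMaybes, fromMaybe, listToMaybe, mapMaybe)
import Data.Ord (comparing)

-- Hypothetical helper used only to exercise mapMaybe.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

xs :: [Int]
xs = [3, 1, 4, 1, 5, 9, 2, 6]

-- Each entry compares the longer form with its shorter replacement.
checks :: [(String, Bool)]
checks =
  [ ("concatMap", concat (map (replicate 2) xs) == concatMap (replicate 2) xs)
  , ("sortOn",    sortBy (comparing negate) xs == sortOn negate xs)
  , ("mapMaybe",  catMaybes (map halve xs) == mapMaybe halve xs)
  , ("zipWith",   map (uncurry (+)) (zip xs (drop 1 xs))
                    == zipWith (+) xs (drop 1 xs))
  , ("find",      listToMaybe (filter (> 4) xs) == find (> 4) xs)
  , ("<$",        (const 'x' <$> Just (1 :: Int)) == ('x' <$ Just (1 :: Int)))
  , ("maybe",     fromMaybe 0 (fmap (+ 1) (halve 4)) == maybe 0 (+ 1) (halve 4))
  ]

main :: IO ()
main = mapM_ print checks
```

Agreement on one sample list is only a sanity check, not a proof, but each pair is a standard identity of the corresponding library functions.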
2020-01-15	Lua filters: allow filtering of element lists (#6040)	Albert Krewinkel	1	-1/+24
	Lists of Inline and Block elements can now be filtered via `Inlines` and `Blocks` functions, respectively. This is helpful if a filter conversion depends on the order of elements rather than a single element.
	For example, the following filter can be used to remove all spaces before a citation:
	    function isSpaceBeforeCite (spc, cite)
	      return spc and spc.t == 'Space'
	        and cite and cite.t == 'Cite'
	    end
	    function Inlines (inlines)
	      for i = #inlines-1,1,-1 do
	        if isSpaceBeforeCite(inlines[i], inlines[i+1]) then
	          inlines:remove(i)
	        end
	      end
	      return inlines
	    end
	Closes: #6038
2020-01-11	Add tests for pandoc.List module	Albert Krewinkel	1	-0/+2

2020-01-01	LaTeX writer: properly handle unnumbered headings level 4+.	John MacFarlane	1	-1/+1
	Closes #6018. Previously the `\paragraph` command was used instead of `\paragraph*` for unnumbered level 4 headings.
2019-12-21	HTML reader tests: modify round-trip tests...	John MacFarlane	1	-0/+4
	to avoid a special failure case involving makeSections.
2019-12-19	Org reader: fix parsing problem for colons in headline	Albert Krewinkel	1	-0/+10
	Fixed a problem where words surrounded by colons could cause parse failures in some cases when they occurred in headers. Fixes: #5993
2019-12-18	Org reader: wrap named table in div, using name as id	Albert Krewinkel	1	-0/+7
	Closes: #5984
2019-12-17	Add jira reader (#5913)	Albert Krewinkel	2	-0/+116
	Closes #5556
2019-11-18	DokuWiki reader: parse markup inside monospace ('') (#5917)	Alexander Krotov	1	-0/+3
	Fixes #5916
2019-11-14	RST writer: fix backslash escaping after strings	Albert Krewinkel	1	-0/+3
	The check for whether a complex inline element following a string must be escaped now depends on the last character of the string instead of the first. Fixes: #5906
2019-11-12	Switch to new pandoc-types and use Text instead of String [API change].	despresc	14	-32/+49
	PR #5884.
	+ Use pandoc-types 1.20 and texmath 0.12.
	+ Text is now used instead of String, with a few exceptions.
	+ In the MediaBag module, some of the types using Strings were switched to use FilePath instead (not Text).
	+ In the Parsing module, new parsers `manyChar`, `many1Char`, `manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`, `manyUntilChar` have been added: these are like their unsuffixed counterparts but pack some or all of their output.
	+ `glob` in Text.Pandoc.Class still takes String since it seems to be intended as an interface to Glob, which uses strings. It seems to be used only once in the package, in the EPUB writer, so that is not hard to change.
2019-11-04	HTML Reader/Writer - Add support for <var> and <samp> (#5861)	Amogh Rathore	2	-0/+28
	Closes #5799
2019-11-03	Docx reader: fix list number resumption for sublists. Closes #4324.	John MacFarlane	1	-0/+4
	The first list item of a sublist should not resume numbering from the number of the last sublist item of the same level, if that sublist was a sublist of a different list item. That is, we should not get:
	```
	1. one
	    1. sub one
	    2. sub two
	2. two
	    3. sub one
	```
2019-10-30	docbook reader: fix nesting of chapters and sections (#5864)	Florian Klink	1	-0/+2
	* Set dbBook to true when traversing a chapter too. Currently, a `<title/>` in a chapter and in a `<section/>` below that chapter have the same level if they're not inside a `<book/>`. This can happen in a multi-file book project. Also see the example at https://tdg.docbook.org/tdg/4.5/chapter.html
	  Co-authored-by: Félix Baylac-Jacqué <felix@alternativebit.fr>
	* Add docbook-chapter test. This tests nested `<section/>` and makes sure `<title/>` in the first `<section/>` below `<chapter/>` is one level deeper than the `<chapter/>`'s `<title/>`, also when not inside a `<book/>`.
	  Co-authored-by: Félix Baylac-Jacqué <felix@alternativebit.fr>
2019-10-29	Changes to build with new doctemplates/doclayout.	John MacFarlane	1	-1/+1
	The new version of doctemplates adds many features to pandoc's templating system, while remaining backwards-compatible. New features include partials and filters. Using template filters, one can lay out data in enumerated lists and tables. Templates are now layout-sensitive: so, for example, if a text with soft line breaks is interpolated near the end of a line, the text will break and wrap naturally. This makes the templating system much more suitable for programmatically generating markdown or other plain-text files from metadata.
2019-10-27	Org reader: fix parsing of empty comment lines	Albert Krewinkel	1	-1/+11
	Comment lines in Org-mode can be completely empty; both of these lines should produce no output:
	    # a comment
	    #
	The reader used to produce a wrong result for the latter, but ignores that line as well now. Fixes: #5856
2019-10-24	HTML reader/writer: Better handling of <q> with cite attribute (#5837)	Ole Martin Ruud	1	-0/+17
	* HTML reader: Handle cite attribute for quotes. If a `<q>` tag has a `cite` attribute, we interpret it as a Quoted element with an inner Span. Closes #5798
	* Refactor url canonicalization into a helper function
	* Modify HTML writer to handle quote with cite.
	[0]: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/q
2019-10-23	Add Reader support for HTML <samp> element (#5843)	Amogh Rathore	1	-0/+6
	The `<samp>` element is parsed as a Span with class `sample`. Closes #5792.
2019-10-15	Muse reader: do not allow closing asterisks to be followed by "*"	Alexander Krotov	1	-3/+23

2019-10-15	Muse reader: do not split series of asterisks into symbols and emphasis	Alexander Krotov	1	-0/+8
	Fixes #5821
2019-10-15	Muse reader: do not terminate emphasis on "*" not followed by space	Alexander Krotov	1	-0/+4

2019-10-09	Options.WriterOptions: Change type of writerVariables to Context Text.	John MacFarlane	1	-1/+5
	This will allow structured values. [API change]
2019-10-04	hlint Muse reader tests	Alexander Krotov	1	-1/+1

2019-10-03	Minor ghc 8.8 fixups.	John MacFarlane	1	-1/+0

2019-09-23	ConTeXt unit tests - tweak code property.	John MacFarlane	1	-1/+1
	Inline code will never have two consecutive newlines. We get a counterexample in this case https://pipelines.actions.githubusercontent.com/bMXCpShstkkHbFPgw9hBRMWw2w9plyzdVM8r7CRPFBHFvidaAG/5cf52d2d-3804-412d-ae65-4f8c059b0fb7/_apis/pipelines/1/runs/116/signedlogcontent/39?urlExpires=2019-09-23T17%3A38%3A05.8358735Z&urlSigningMethod=HMACV1&urlSignature=Qtd6vnzqgSwXpAkIyp9DJY4Kn7GJzYMR8UDkLR%2FsMQY%3D so for simplicity we just weed out code with newlines.
2019-09-22	Make `plain` output plainer.	John MacFarlane	1	-1/+3
	Previously we used the following Project Gutenberg conventions for plain output:
	- extra space before and after level 1 and 2 headings
	- all-caps for strong emphasis `LIKE THIS`
	- underscores surrounding regular emphasis `_like this_`
	This commit makes `plain` output plainer. Strong and Emph inlines are rendered without special formatting. Headings are also rendered without special formatting, and with only one blank line following.
	To restore the former behavior, use `-t plain+gutenberg`.
	API change: Add `Ext_gutenberg` constructor to `Extension`.
	See #5741.
2019-09-21	[Docx Reader] Use style names, not ids, for assigning semantic meaning	Nikolay Yakimov	1	-0/+9
	Motivating issues: #5523, #5052, #5074
	- Style name comparisons are case-insensitive, since style names are case-insensitive in Word.
	- w:styleId will be used as the style name if w:name is missing (this should only happen for malformed docx and is kept as a fallback to avoid failing altogether on malformed documents).
	- Block quote detection code moved from Docx.Parser to Readers.Docx.
	- Code styles, i.e. "Source Code" and "Verbatim Char", now honor style inheritance.
	- Docx Reader now honors the "Compact" style (used in Pandoc-generated docx). The side effect is that the "Compact" style no longer shows up in docx+styles output. Styles inherited from "Compact" will still show up.
	- Removed obsolete list-item style from divsToKeep. It had not done anything for a while.
	- Add newtypes to differentiate between style names, ids, and different style types (that is, paragraph and character styles).
	- Since docx style names can have spaces in them, and pandoc-markdown classes can't, wherever a style name is used as a class name, spaces are replaced with ASCII dashes `-`.
	- Get rid of extraneous intermediate types carrying styleId information. Instead, styleId is saved with other style data.
	- Use RunStyle for inline style definitions only (lacking styleId and styleName); for character styles use the CharStyle type (which is basically RunStyle with styleId and styleName bolted onto it).
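Two of the rules above, case-insensitive style name comparison and replacing spaces with dashes in class names, can be sketched with `base` alone. The function names `sameStyle` and `styleToClass` are hypothetical, plain `String` stands in for the newtypes the commit introduces, and `toLower` is a simplification of whatever case folding the reader actually performs.

```haskell
import Data.Char (toLower)

-- | Compare two style names without regard to case.
-- Assumption: simple per-character lowercasing is good enough here.
sameStyle :: String -> String -> Bool
sameStyle a b = map toLower a == map toLower b

-- | Derive a class name from a style name by replacing each space
-- with an ASCII dash. Case is left untouched in this sketch.
styleToClass :: String -> String
styleToClass = map (\c -> if c == ' ' then '-' else c)

main :: IO ()
main = do
  print (sameStyle "Block Text" "block text")
  putStrLn (styleToClass "Source Code")
  putStrLn (styleToClass "Verbatim Char")
```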