pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2021-05-28	Support `rebase_relative_paths` for commonmark based formats.	John MacFarlane	1	-0/+16
	(Including `gfm`.)
2021-05-28	Docx reader: Support new table features.	Emily Bourke	12	-19/+324
	* Column spans
	* Row spans
	  - The spec says that if the `val` attribute is omitted, its value should be assumed to be `continue`, and that its values are restricted to {`restart`, `continue`}. If the attribute has any other value, I think it seems reasonable to default it to `continue`. It might cause problems if the spec is extended in the future by adding a third possible value, in which case this would probably give incorrect behaviour, and wouldn't error.
	* Allow multiple header rows
	* Include table description in simple caption
	  - The table description element is like alt text for a table (along with the table caption element). It seems like we should include this somewhere, but I’m not 100% sure how – I’m pairing it with the simple caption for the moment. (Should it maybe go in the block caption instead?)
	* Detect table captions
	  - Check for caption paragraph style /and/ either the simple or complex table field. This means the caption detection fails for captions which don’t contain a field, as in an example doc I added as a test. However, I think it’s better to be too conservative: a missed table caption will still show up as a paragraph next to the table, whereas if I incorrectly classify something else as a table caption it could cause havoc by pairing it up with a table it’s not at all related to, or dropping it entirely.
	* Update tests and add new ones
	Partially fixes: #6316
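The row-span defaulting rule described above can be sketched as follows. This is a minimal illustration in Python, not the Haskell code from the commit; the function name `normalize_merge_val` is hypothetical.

```python
def normalize_merge_val(val=None):
    """Return the effective row-merge state for a table cell.

    `val` is the raw attribute value, or None when the attribute is omitted.
    Only "restart" starts a new span; an omitted or unrecognized value is
    treated as "continue", as the commit message describes.
    """
    return "restart" if val == "restart" else "continue"


print(normalize_merge_val())           # attribute omitted
print(normalize_merge_val("restart"))  # explicit restart
print(normalize_merge_val("other"))    # unknown value falls back silently
```

Note that the fallback is silent: an unknown value neither raises nor warns, which is the trade-off the commit message points out.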
2021-05-28	Docx reader: Read table column widths.	Emily Bourke	12	-27/+124

2021-05-27	Two citeproc locator/suffix improvements:	John MacFarlane	2	-0/+54
	- Recognize locators spelled with a capital letter. Closes #7323.
	- Add a comma and a space in front of the suffix if it doesn't start with space or punctuation. Closes #7324.
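The suffix rule can be illustrated with a small Python sketch. It is not the citeproc implementation; `join_suffix` is a hypothetical helper, and "punctuation" is assumed here to mean any character in a Unicode `P*` category.

```python
import unicodedata


def join_suffix(citation, suffix):
    """Append a suffix to a rendered citation.

    If the suffix does not start with whitespace or punctuation,
    a comma and a space are inserted in front of it.
    """
    if not suffix:
        return citation
    first = suffix[0]
    if first.isspace() or unicodedata.category(first).startswith("P"):
        return citation + suffix
    return citation + ", " + suffix


print(join_suffix("Doe 2020", "emphasis added"))
print(join_suffix("Doe 2020", ", p. 3"))
```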
2021-05-27	rebase_relative_paths: leave empty paths unchanged.	John MacFarlane	2	-0/+5

2021-05-27	rebase_relative_paths extension: don't change fragment paths.	John MacFarlane	2	-0/+5
	We don't want a pure fragment path to be rewritten, since these are used for cross-referencing.
2021-05-27	Modify `rebase_relative_paths` treatment of reference links/images.	John MacFarlane	3	-1/+8
	The directory is based on the file containing the link reference, not the file containing the link, if these differ.
2021-05-27	Citeproc: Don't detect math elements as locators.	John MacFarlane	1	-0/+24
	Closes #7321.
2021-05-27	Add `rebase_relative_paths` extension.	John MacFarlane	5	-0/+49
	- Add manual entry for (non-default) extension `rebase_relative_paths`.
	- Add constructor `Ext_rebase_relative_paths` to `Extensions` in Text.Pandoc.Extensions [API change]. When enabled, this extension rewrites relative image and link paths by prepending the (relative) directory of the containing file.
	- Make Markdown reader sensitive to the new extension.
	- Add tests for #3752.
	Closes #3752.
	NB. currently the extension applies to markdown and associated readers but not commonmark/gfm.
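A rough Python sketch of the rewriting rule, also covering the later refinements in this log (empty paths and pure fragment paths are left unchanged). The function name is hypothetical, and leaving URLs and absolute paths alone is an assumption of this sketch rather than something stated in the commit message.

```python
import posixpath
from urllib.parse import urlsplit


def rebase_relative_path(target, containing_file):
    """Prepend the directory of `containing_file` to a relative link target."""
    # Empty paths and pure fragments (used for cross-references) stay as they are.
    if target == "" or target.startswith("#"):
        return target
    parts = urlsplit(target)
    # Assumption: anything with a scheme or host, or an absolute path, is not relative.
    if parts.scheme or parts.netloc or target.startswith("/"):
        return target
    directory = posixpath.dirname(containing_file)
    if not directory:
        return target
    return posixpath.join(directory, target)


print(rebase_relative_path("fig.png", "chap1/text.md"))
print(rebase_relative_path("#intro", "chap1/text.md"))
```

For a reference link, `containing_file` would be the file holding the link reference definition, not the file holding the link itself.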
2021-05-27	LaTeX reader: improve `\def` and implement `\newif`.	John MacFarlane	1	-0/+55
	- Improve parsing of `\def` macros. We previously set "verbatim mode" even for parsing the initial `\def`; this caused problems for things like
	  ```
	  \def\foo{\def\bar{BAR}}
	  \foo
	  \bar
	  ```
	- Implement `\newif`.
	- Add tests.
2021-05-26	Command tests: fail if a file contains no tests.	John MacFarlane	2	-4/+9
	And fix a test that failed in that way!
2021-05-25	Fix a command test so it writes to stdout not stderr.	John MacFarlane	1	-1/+1
	The error message to stderr was appearing in test output and confusing some users, who thought it indicated a failing test rather than expected output.
2021-05-25	Logging: remove single quotes around paths in messages.	John MacFarlane	2	-2/+2
	We weren't doing it consistently and it seems unnecessary.
2021-05-25	Jira: add support for "smart" links	Albert Krewinkel	2	-0/+16
	Support has been added for the new `[alias|https://example.com|smart-card]` syntax.
2021-05-24	MediaBag improvements.	John MacFarlane	1	-5/+5
	In the current dev version, we will sometimes add a version of an image with a hashed name, keeping the original version with the original name, which would lead to undesirable duplication. This change separates the media's filename from the media's canonical name (which is the path of the link in the document itself). Filenames are based on SHA1 hashes and assigned automatically.
	In Text.Pandoc.MediaBag:
	- Export MediaItem type [API change].
	- Change MediaBag type to a map from Text to MediaItem [API change].
	- `lookupMedia` now returns a `MediaItem` [API change].
	- Change `insertMedia` so it sets the `mediaPath` to a filename based on the SHA1 hash of the contents. This will be used when contents are extracted.
	In Text.Pandoc.Class.PandocMonad:
	- Remove `fetchMediaResource` [API change].
	Lua MediaBag module has been changed minimally. In the future it would be better, probably, to give Lua access to the full MediaItem type.
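The separation of canonical name and filename might look like this in a Python sketch. It only mirrors the idea; the field names other than the media path, the helper name, and the choice to keep the original file extension are assumptions, not the Haskell API.

```python
import hashlib
import posixpath
from dataclasses import dataclass


@dataclass
class MediaItem:
    media_path: str   # filename derived from the SHA1 hash of the contents
    mime_type: str    # hypothetical field for this sketch
    contents: bytes


def insert_media(bag, canonical_name, mime_type, contents):
    """Store media under its canonical name (the link path in the document)."""
    extension = posixpath.splitext(canonical_name)[1]
    digest = hashlib.sha1(contents).hexdigest()
    bag[canonical_name] = MediaItem(digest + extension, mime_type, contents)
    return bag


bag = insert_media({}, "img/logo.png", "image/png", b"fake image bytes")
print(bag["img/logo.png"].media_path)
```

Because the filename depends only on the contents (plus extension), two links to identical bytes map to the same filename when media are extracted.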
2021-05-24	Jira writer: use `{color}` when span has a color attribute	Albert Krewinkel	1	-0/+4
	Closes: tarleb/jira-wiki-markup#10
2021-05-22	Handle relative lengths (e.g. `2*`) in HTML column widths.	John MacFarlane	1	-0/+29
	See <https://www.w3.org/TR/html4/types.html#h-6.6>. "A relative length has the form "i*", where "i" is an integer. When allotting space among elements competing for that space, user agents allot pixel and percentage lengths first, then divide up remaining available space among relative lengths. Each relative length receives a portion of the available space that is proportional to the integer preceding the "*". The value "*" is equivalent to "1*". Thus, if 60 pixels of space are available after the user agent allots pixel and percentage space, and the competing relative lengths are 1*, 2*, and 3*, the 1* will be alloted 10 pixels, the 2* will be alloted 20 pixels, and the 3* will be alloted 30 pixels." Closes #4063.
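The allotment rule quoted from the HTML 4 specification can be worked through in a few lines of Python. This is only an illustration of the arithmetic, not the reader's code; both function names are hypothetical.

```python
def parse_relative_length(spec):
    """Parse a relative length such as "2*"; a bare "*" counts as "1*"."""
    spec = spec.strip()
    if not spec.endswith("*"):
        raise ValueError("not a relative length: " + repr(spec))
    digits = spec[:-1]
    return int(digits) if digits else 1


def allot(remaining, specs):
    """Divide `remaining` space among relative lengths, proportionally."""
    weights = [parse_relative_length(s) for s in specs]
    total = sum(weights)
    if total == 0:
        return [0.0 for _ in weights]
    return [remaining * w / total for w in weights]


print(allot(60, ["1*", "2*", "3*"]))  # [10.0, 20.0, 30.0]
```

The printed result matches the example in the specification: 60 available pixels split 10, 20 and 30.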
2021-05-20	DocBook reader: ensure that first and last names are separated.	John MacFarlane	1	-0/+27
	Closes #6541.
2021-05-20	Ms writer: handle tables with multiple paragraphs.	John MacFarlane	2	-0/+70
	Previously they overflowed the table cell width. We now set line lengths per-cell and restore them after the table has been written. Closes #7288.
2021-05-20	LaTeX reader: More siunitx improvements. Closes #6658.	John MacFarlane	1	-6/+66
	There's still one slight divergence from the siunitx behavior: we get 'kg m/A/s' instead of 'kg m/(A s)'. At the moment I'm not going to worry about that.
2021-05-20	LaTeX/siunitx: fix parsing of `\cubic` etc. See #6658.	John MacFarlane	1	-0/+3

2021-05-20	LaTeX reader siunitx: fix + sign on ang.	John MacFarlane	1	-0/+3

2021-05-20	LaTeX reader siunitx: add leading 0 to numbers starting with .	John MacFarlane	2	-3/+9

2021-05-20	ConTeXt writer: improve ordered lists (#7304)	Denis Maier	2	-38/+46
	Closes #5016
	- change ordered list from itemize to enumerate
	- add new itemgroup for ordered lists
	- add fontfeature for table figures
	- remove width from itemize in context writer
2021-05-20	LaTeX reader: Fix parsing of `+-` in siunitx numbers.	John MacFarlane	1	-1/+4
	See #6658.
2021-05-20	LaTeX reader: support `\pm` in `\SI{..}`.	John MacFarlane	1	-0/+3
	Closes #6620.
2021-05-20	ZimWiki writer: allow links and emphasis in headers	Albert Krewinkel	1	-3/+3
	The latest version of ZimWiki supports this. Closes: #6605
2021-05-19	LaTeX reader: better support for `\xspace`.	John MacFarlane	2	-1/+24
	Previously we only supported it in inline contexts; now we support it in all contexts, including math. Partially addresses #7299.
2021-05-18	LaTeX writer: separate successive quote chars with thin space	Albert Krewinkel	1	-0/+10
	Successive quote characters are separated with a thin space to improve readability and to prevent unwanted ligatures. Detection of these quotes sometimes failed if the second quote was nested in a span element. Closes: #6958
2021-05-17	HTML reader: keep attributes from code nested below pre tag.	Albert Krewinkel	1	-0/+11
	If a code block is defined with `<pre><code class="language-x">…</code></pre>`, where the `<pre>` element has no attributes, then the attributes from the `<code>` element are used instead. Any leading `language-` prefix in the code's class attribute is dropped to improve syntax highlighting. Closes: #7221
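A minimal Python sketch of the class handling described above. It covers only the class attribute, whereas the commit concerns all attributes; both function names are hypothetical.

```python
PREFIX = "language-"


def strip_language_prefix(cls):
    """Drop a leading "language-" prefix from a class name."""
    return cls[len(PREFIX):] if cls.startswith(PREFIX) else cls


def code_block_classes(pre_classes, code_classes):
    """Choose the classes for a code block built from <pre><code>.

    Classes on <pre> win; only when <pre> has none are the classes of the
    nested <code> element used, with any "language-" prefix removed.
    """
    if pre_classes:
        return list(pre_classes)
    return [strip_language_prefix(c) for c in code_classes]


print(code_block_classes([], ["language-haskell"]))
```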
2021-05-17	HTML writer: ensure headings only have valid attribs in HTML4	Albert Krewinkel	1	-52/+57
	Fixes: #5944
2021-05-17	ConTeXt writer: use span identifiers as reference anchors.	Albert Krewinkel	1	-0/+3
	Closes: #7246
2021-05-17	ConTeXt writer tests: keep code lines below 80 chars.	Albert Krewinkel	1	-113/+119

2021-05-16	LaTeX template: move title, author, date up to top of preamble.	John MacFarlane	4	-13/+13
	This allows header-includes to use them, and puts them in a position where you can see them immediately. Closes #7295.
2021-05-16	Markdown writer: fewer unneeded escapes for `#`.	John MacFarlane	5	-5/+5
	See #6259.
2021-05-15	Docx writer: copy over more settings from reference.docx.	John MacFarlane	33	-0/+0
	From settings.xml in the reference-doc, we now include: `zoom`, `embedSystemFonts`, `doNotTrackMoves`, `defaultTabStop`, `drawingGridHorizontalSpacing`, `drawingGridVerticalSpacing`, `displayHorizontalDrawingGridEvery`, `displayVerticalDrawingGridEvery`, `characterSpacingControl`, `savePreviewPicture`, `mathPr`, `themeFontLang`, `decimalSymbol`, `listSeparator`, `autoHyphenation`, `compat`. Closes #7240.
2021-05-15	docx writer: Remove rsids from settings.docx.	John MacFarlane	33	-0/+0
	Word will add these when revisions are made. But it's pointless to start out with a set of them.
2021-05-15	HTML reader: parse `<header>` as a Div	Albert Krewinkel	1	-5/+9
	HTML5 `<header>` elements are treated like `<div>` elements.
2021-05-14	HTML reader: keep h1 tags as normal headers (#7274)	Albert Krewinkel	1	-1/+2
	The tags `<title>` and `<h1 class="title">` often contain the same information, so the latter was dropped from the document. However, as this can lead to loss of information, the heading is now always retained. Use `--shift-heading-level-by=-1` to turn the `<h1>` into the document title, or a filter to restore the previous behavior. Closes: #2293
2021-05-14	Beamer writer: support exampleblock and alertblock.	John MacFarlane	1	-0/+38
	A block will be rendered as an exampleblock if the heading has class `example` and alertblock if it has class `alert`. Closes #7278.
2021-05-14	Docx writer: allow multirow table headers	Albert Krewinkel	2	-0/+0

2021-05-14	HTML reader: don't fail on unmatched closing "script" tag.	Albert Krewinkel	1	-0/+7
	Prevent the reader from crashing if the HTML input contains an unmatched closing `</script>` tag. Fixes: #7282
2021-05-13	Implement curly-brace syntax for Markdown citation keys.	John MacFarlane	1	-0/+19
	The change provides a way to use citation keys that contain special characters not usable with the standard citation key syntax. Example: `@{foo_bar{x}'}` for the key `foo_bar{x}'`. Closes #6026.
	The change requires adding a new parameter to the `citeKey` parser from Text.Pandoc.Parsing [API change].
	- Markdown reader: recognize @{..} syntax for citations.
	- Markdown writer: use @{..} syntax for citations when needed.
	- Update manual with curly-brace syntax for citations.
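How a writer might decide when braces are needed can be sketched in Python. The pattern for "bare" keys below is a simplified, ASCII-only assumption made for this example; it is not pandoc's actual `citeKey` grammar, and `render_citation_key` is a hypothetical name.

```python
import re

# Assumed approximation: a bare key starts with a letter, digit or underscore,
# and may contain internal punctuation only when followed by another such character.
BARE_KEY = re.compile(
    r"[A-Za-z0-9_](?:[A-Za-z0-9_]|[:.#$%&+?<>~/-](?=[A-Za-z0-9_]))*"
)


def render_citation_key(key):
    """Render a citation key, using the curly-brace form only when needed."""
    if BARE_KEY.fullmatch(key):
        return "@" + key
    return "@{" + key + "}"


print(render_citation_key("doe99"))
print(render_citation_key("foo_bar{x}'"))
```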
2021-05-12	Handle 'annote' field in bibtex/biblatex writer.	John MacFarlane	1	-0/+10
	Closes #7266.
2021-05-11	Improve integration of settings from reference.docx.	John MacFarlane	33	-0/+0
	The settings we can carry over from a reference.docx are autoHyphenation, consecutiveHyphenLimit, hyphenationZone, doNotHyphenateCap, evenAndOddHeaders, and proofState. Previously this was implemented in a buggy way, so that the reference doc's values AND the new values were included. This change allows users to create a reference.docx that sets w:proofState for spelling or grammar to "dirty," so that spell/grammar checking will be triggered on the generated docx. Closes #1209.
2021-05-11	LaTeX writer: better handling of line breaks in simple tables.	John MacFarlane	1	-0/+24
	Now we also handle the case where they're embedded in other elements, e.g. spans. Closes #7272.
2021-05-09	Change reader types, allowing better tracking of source positions.	John MacFarlane	1	-2/+2
	Previously, when multiple file arguments were provided, pandoc simply concatenated them and passed the contents to the readers, which took a Text argument. As a result, the readers had no way of knowing which file was the source of any particular bit of text. This meant that we couldn't report accurate source positions on errors or include accurate source positions as attributes in the AST. More seriously, it meant that we couldn't resolve resource paths relative to the files containing them (see e.g. #5501, #6632, #6384, #3752).
	Add Text.Pandoc.Sources (exported module), with a `Sources` type and a `ToSources` class. A `Sources` wraps a list of `(SourcePos, Text)` pairs. [API change] A parsec `Stream` instance is provided for `Sources`. The module also exports versions of parsec's `satisfy` and other Char parsers that track source positions accurately from a `Sources` stream (or any instance of the new `UpdateSourcePos` class).
	Text.Pandoc.Parsing now exports these modified Char parsers instead of the ones parsec provides. Modified parsers to use a `Sources` as stream [API change].
	The readers that previously took a `Text` argument have been modified to take any instance of `ToSources`. So, they may still be used with a `Text`, but they can also be used with a `Sources` object.
	In Text.Pandoc.Error, modified the constructor PandocParsecError to take a `Sources` rather than a `Text` as first argument, so parse error locations can be accurately reported.
	T.P.Error: showPos, do not print "-" as source name.
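The benefit of keeping inputs separate can be illustrated with a small Python sketch. It simplifies the `(SourcePos, Text)` pairs to `(name, text)` pairs and uses a hypothetical `locate` function; it does not reflect the Haskell module's interface.

```python
from bisect import bisect_right


def locate(sources, offset):
    """Map an offset into the concatenated input back to (file, line, column).

    `sources` is a list of (name, text) pairs. Line and column are 1-based.
    """
    starts = []
    total = 0
    for _, text in sources:
        starts.append(total)
        total += len(text)
    if not 0 <= offset < total:
        raise IndexError("offset outside the input")
    index = bisect_right(starts, offset) - 1
    name, text = sources[index]
    local = offset - starts[index]
    line = text.count("\n", 0, local) + 1
    column = local - text.rfind("\n", 0, local)
    return name, line, column


sources = [("a.md", "one\ntwo\n"), ("b.md", "three\n")]
print(locate(sources, 4))   # ('a.md', 2, 1)
print(locate(sources, 10))  # ('b.md', 1, 3)
```

With a single concatenated string, only the global offset would be available; the per-file list is what makes the file name and the file-local position recoverable.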
2021-05-05	App: allow tabs expansion even if file-scope is used	Albert Krewinkel	1	-0/+11
	Tabs in plain-text inputs are now handled correctly, even if the `--file-scope` flag is used. Closes: #6709
2021-05-01	Docx writer: support colspans and rowspans in tables	Albert Krewinkel	3	-0/+0
	See: #6315
2021-04-29	Docx reader: add handling of vml image objects (jgm#4735) (#7257)	mbrackeantidot	3	-0/+6
	They represent images, the same way as other images in vml format.