path: root/src/Text/Pandoc
Age | Commit message | Author | Files | Lines
2021-06-10 | Fix MediaBag regressions. | John MacFarlane | 3 | -42/+41
With the 2.14 release `--extract-media` stopped working as before; there could be mismatches between the paths in the rendered document and the extracted media.
This patch makes several changes (while keeping the same API).
The `mediaPath` in 2.14 was always constructed from the SHA1 hash of the media contents. Now, we preserve the original path unless it's an absolute path or contains `..` segments (in that case we use a path based on the SHA1 hash of the contents).
When constructing a path from the SHA1 hash, we always use the original extension, if there is one. Otherwise we look up an appropriate extension for the mime type.
`mediaDirectory` and `mediaItems` now use the `mediaPath`, rather than the mediabag key, for the first component of the tuple. This makes more sense, I think, and fits with the documentation of these functions; eventually, though, we should rework the API so that `mediaItems` returns both the keys and the MediaItems.
Rewriting of source paths in `extractMedia` has been fixed.
`fillMediaBag` has been modified so that it doesn't modify image paths (that was part of the problem in #7345).
We now do path normalization (e.g. `\` separators on Windows) only in writing the media; the paths are left unchanged in the image links (sensibly, since they might be URLs and not file paths).
These changes should restore the original behavior from before 2.14.
Closes #7345.
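The path-selection rule described in this commit can be sketched as follows. This is a minimal illustration, not pandoc's actual code: `chooseMediaPath` is a hypothetical name, `digest` stands in for a hex SHA1 digest computed elsewhere, and `mimeExt` stands in for the result of a MIME-type extension lookup. The sketch uses POSIX path rules so that it behaves the same on every platform.

```haskell
import Data.Maybe (fromMaybe)
import System.FilePath.Posix (isAbsolute, splitDirectories, takeExtension, (<.>))

-- | Choose the path under which a media item is written.
-- Hypothetical helper: 'digest' is assumed to be a hex SHA1 digest of the
-- contents, 'mimeExt' an extension looked up from the MIME type.
chooseMediaPath :: String -> Maybe String -> FilePath -> FilePath
chooseMediaPath digest mimeExt original
  | isSafe original = original        -- keep relative paths without ".."
  | otherwise       = digest <.> ext  -- fall back to a hash-based name
  where
    isSafe p = not (isAbsolute p) && ".." `notElem` splitDirectories p
    -- Prefer the original extension; otherwise use the MIME-based one.
    ext = case takeExtension original of
            "" -> fromMaybe "" mimeExt
            e  -> e
```

For example, `chooseMediaPath "da39a3" (Just "png") "img/a.png"` keeps `img/a.png`, while an absolute input such as `/tmp/a.jpg` becomes `da39a3.jpg`.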
2021-06-10 | T.P.MIME, extensionFromMimeType: add a few special cases. | John MacFarlane | 1 | -0/+10
When we do a reverse lookup in the MIME table, we just get the last match, so when the same mime type is associated with several different extensions, we sometimes got weird results, e.g. `.vs` for `text/plain`. These special cases help us get the most standard extensions for mime types like `text/plain`.
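A small sketch of why the special cases help. The table entries and the names `mimeTable`, `specialCases`, `reverseLookup` and `extensionFor` are illustrative assumptions, not pandoc's actual MIME table or API.

```haskell
import Data.Maybe (listToMaybe)

-- | A tiny stand-in for a MIME table: (extension, MIME type) pairs.
-- The entries are illustrative only.
mimeTable :: [(String, String)]
mimeTable =
  [ ("txt", "text/plain")
  , ("log", "text/plain")
  , ("vs",  "text/plain")
  , ("png", "image/png")
  ]

-- | Preferred extensions, consulted before the generic reverse lookup.
specialCases :: [(String, String)]
specialCases = [("text/plain", "txt")]

-- | Naive reverse lookup: returns the extension of the last matching entry.
reverseLookup :: String -> Maybe String
reverseLookup mime =
  listToMaybe (reverse [ext | (ext, m) <- mimeTable, m == mime])

-- | Reverse lookup that checks the special cases first.
extensionFor :: String -> Maybe String
extensionFor mime =
  case lookup mime specialCases of
    Just ext -> Just ext
    Nothing  -> reverseLookup mime
```

With this toy table, `reverseLookup "text/plain"` yields `Just "vs"`, whereas `extensionFor "text/plain"` yields `Just "txt"`.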
2021-06-10 | Docx writer: fix handling of empty table headers | Albert Krewinkel | 1 | -2/+2
A table header which does not contain any cells is now treated as an empty header. Fixes: #7369
2021-06-10 | Lua utils: fix handling of table headers in `from_simple_table` | Albert Krewinkel | 1 | -1/+1
Passing an empty list of header cells now results in an empty table header. Fixes: #7369
2021-06-08 | Citeproc: avoid duplicate classes and attributes on refs div. | John MacFarlane | 1 | -2/+2
2021-06-05 | LaTeX writer: Fix regression in table header position. | John MacFarlane | 1 | -3/+10
In recent versions the table headers were no longer bottom-aligned (if more than one line). This patch fixes that by using minipages for table headers in non-simple tables. Closes #7347.
2021-06-05 | CommonMark writer: do not use simple class for fenced-divs | Jan Tojnar | 1 | -3/+6
In https://github.com/jgm/pandoc/pull/7242, we introduced a simple attribute style for code blocks and fenced divs with a single class, but it turns out the CommonMark extension does not support it for fenced divs. See https://github.com/jgm/commonmark-hs/blob/master/commonmark-extensions/test/fenced_divs.md
2021-06-05 | CommonMark writer: do not throw away attributes when Ext_attributes is enabled | Jan Tojnar | 2 | -13/+17
Ext_attributes covers at least the following:
- Ext_fenced_code_attributes
- Ext_header_attributes
- Ext_inline_code_attributes
- Ext_link_attributes
2021-06-05 | Markdown writer: re-use functions from Inline | Jan Tojnar | 2 | -29/+4
Instead of duplicating linkAttributes and attrsToMarkdown, let’s just use those from the Inline module.
2021-06-05 | DocBook reader: Add support for danger element | Jan Tojnar | 1 | -1/+2
Added in DocBook 5.2:
- https://github.com/docbook/docbook/pull/64
- https://tdg.docbook.org/tdg/5.2/danger.html
2021-06-05 | DocBook writer: Remove non-existent admonitions | Jan Tojnar | 1 | -2/+1
attention, error and hint are actually just reStructuredText specific. danger was too until introduced in DocBook 5.2: https://github.com/docbook/docbook/issues/55
2021-06-03 | T.P.Class.IO: normalise path in writeMedia. | John MacFarlane | 1 | -3/+2
This ensures that we get `\` separators on Windows.
2021-06-02 | Text.Pandoc.PDF: only print relevant part of environment on `--verbose`. | John MacFarlane | 1 | -2/+14
2021-06-02 | Fix regression in 2.14 for generation of PDFs with SVGs. | John MacFarlane | 1 | -1/+2
Closes #7344.
2021-06-01 | HTML writer: Don't omit width attribute on div. | John MacFarlane | 1 | -3/+4
Closes #7342.
2021-06-01 | Markdown reader: fix pipe table regression in 2.11.4. | John MacFarlane | 1 | -1/+1
Previously pipe tables with empty headers (that is, a header line with all empty cells) would be rendered as headerless tables. This broke in 2.11.4. The fix here is to produce an AST with an empty table head when a pipe table has all empty header cells. Closes #7343.
2021-06-01 | LaTeX reader: don't allow optional * on symbol control sequences. | John MacFarlane | 1 | -2/+4
Generally we allow optional starred variants of LaTeX commands (since many allow them, and if we don't accept these explicitly, ignoring the star usually gives acceptable results). But we don't want to do this for `\(*\)` and similar cases. Closes #7340.
2021-05-31 | Fix regression with commonmark/gfm yaml metadata block parsing. | John MacFarlane | 1 | -5/+5
A regression in 2.14 led to the document body being omitted after YAML metadata in some cases. This is now fixed. Closes #7339.
2021-05-30 | HTML reader: fix column width regression. | John MacFarlane | 1 | -1/+1
Column widths specified with a style attribute were off by a factor of 100 in 2.14. Closes #7334.
2021-05-30 | Have LoadedResource use relative paths. | John MacFarlane | 1 | -2/+2
The immediate reason for this is to allow the test output of #3752 to work on both Windows and Linux.
2021-05-30 | Docx writer: fix regression on captions. | John MacFarlane | 1 | -1/+3
The "Table Caption" style was no longer getting applied. (It was overwritten by "Compact.") Closes #7328.
2021-05-29 | Markdown reader: in rebasePaths, check for both Windows and Posix absolute paths. | John MacFarlane | 1 | -4/+5
Previously Windows pandoc was treating `/foo/bar.jpg` as non-absolute.
2021-05-29 | In rebasePath, check for absolute paths two ways. | John MacFarlane | 1 | -1/+4
isAbsolute from FilePath doesn't return True on Windows for paths beginning with `/`, so we check that separately.
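One way to make the check independent of the host platform is to consult both flavours of `isAbsolute` from the `filepath` package. This is a sketch of the idea rather than the code in the commit, and `isAbsolutePath` is a hypothetical name.

```haskell
import qualified System.FilePath.Posix as Posix
import qualified System.FilePath.Windows as Windows

-- | Treat a path as absolute if either the POSIX or the Windows rules say so.
-- Windows.isAbsolute alone rejects "/foo/bar.jpg" because it has no drive.
isAbsolutePath :: FilePath -> Bool
isAbsolutePath fp = Posix.isAbsolute fp || Windows.isAbsolute fp
```

Here `isAbsolutePath "/foo/bar.jpg"` and `isAbsolutePath "C:\\foo\\bar.jpg"` are both `True`, while `isAbsolutePath "foo/bar.jpg"` is `False`.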
2021-05-28 | Support `rebase_relative_paths` for commonmark based formats. | John MacFarlane | 2 | -1/+4
(Including `gfm`.)
2021-05-28 | Docx reader: Support new table features. | Emily Bourke | 3 | -49/+163
* Column spans
* Row spans
  - The spec says that if the `val` attribute is omitted, its value should be assumed to be `continue`, and that its values are restricted to {`restart`, `continue`}. If the attribute has any other value, I think it seems reasonable to default it to `continue`. It might cause problems if the spec is extended in the future by adding a third possible value, in which case this would probably give incorrect behaviour, and wouldn't error.
* Allow multiple header rows
* Include table description in simple caption
  - The table description element is like alt text for a table (along with the table caption element). It seems like we should include this somewhere, but I’m not 100% sure how – I’m pairing it with the simple caption for the moment. (Should it maybe go in the block caption instead?)
* Detect table captions
  - Check for caption paragraph style /and/ either the simple or complex table field. This means the caption detection fails for captions which don’t contain a field, as in an example doc I added as a test. However, I think it’s better to be too conservative: a missed table caption will still show up as a paragraph next to the table, whereas if I incorrectly classify something else as a table caption it could cause havoc by pairing it up with a table it’s not at all related to, or dropping it entirely.
* Update tests and add new ones
Partially fixes: #6316
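The defaulting rule for the row-span `val` attribute can be sketched like this. The `VMerge` type and `parseVMerge` function are illustrative names chosen for this example; the Docx reader's actual definitions may differ.

```haskell
-- | Whether a cell starts a new vertical merge or continues the one above.
data VMerge = Restart | Continue
  deriving (Eq, Show)

-- | Interpret the optional attribute value.
-- A missing attribute means "continue" per the spec; any unrecognised
-- value is also mapped to 'Continue' instead of raising an error.
parseVMerge :: Maybe String -> VMerge
parseVMerge (Just "restart") = Restart
parseVMerge _                = Continue
```

The catch-all case is what makes a hypothetical future third value silently behave like `continue`, which is the trade-off the commit message points out.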
2021-05-28 | Docx reader: Read table column widths. | Emily Bourke | 2 | -3/+4
2021-05-27 | Two citeproc locator/suffix improvements: | John MacFarlane | 1 | -3/+11
- Recognize locators spelled with a capital letter. Closes #7323.
- Add a comma and a space in front of the suffix if it doesn't start with space or punctuation. Closes #7324.
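The suffix rule in the second item can be sketched as a small string function. `normalizeSuffix` is a hypothetical name, and the real code operates on pandoc inlines rather than plain strings.

```haskell
import Data.Char (isPunctuation, isSpace)

-- | Prepend ", " to a citation suffix unless it is empty or already
-- starts with whitespace or punctuation.
normalizeSuffix :: String -> String
normalizeSuffix "" = ""
normalizeSuffix s@(c:_)
  | isSpace c || isPunctuation c = s
  | otherwise                    = ", " ++ s
```

So `normalizeSuffix "and passim"` gives `", and passim"`, while `" and passim"` and `", emphasis added"` are returned unchanged.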
2021-05-27 | rebase_relative_paths: leave empty paths unchanged. | John MacFarlane | 1 | -1/+1
2021-05-27 | rebase_relative_paths extension: don't change fragment paths. | John MacFarlane | 1 | -1/+2
We don't want a pure fragment path to be rewritten, since these are used for cross-referencing.
2021-05-27 | Modify rebase_reference_links treatment of reference links/images. | John MacFarlane | 1 | -5/+4
The directory is based on the file containing the link reference, not the file containing the link, if these differ.
2021-05-27 | Citeproc: Don't detect math elements as locators. | John MacFarlane | 1 | -0/+7
Closes #7321.
2021-05-27 | Add `rebase_relative_paths` extension. | John MacFarlane | 2 | -7/+32
- Add manual entry for (non-default) extension `rebase_relative_paths`.
- Add constructor `Ext_rebase_relative_paths` to `Extensions` in Text.Pandoc.Extensions [API change]. When enabled, this extension rewrites relative image and link paths by prepending the (relative) directory of the containing file.
- Make Markdown reader sensitive to the new extension.
- Add tests for #3752.
Closes #3752.
NB. currently the extension applies to markdown and associated readers but not commonmark/gfm.
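A rough sketch of the rewriting rule, combining this commit with the refinements listed in the other `rebase_relative_paths` entries of this log (empty paths, pure fragments and absolute paths are left alone). `rebasePath` here is a simplified stand-in for the reader's function of the same name, and the `"://"` test for URLs is a crude assumption made for the example.

```haskell
import Data.List (isInfixOf)
import qualified System.FilePath.Posix as Posix
import qualified System.FilePath.Windows as Windows

-- | Prepend the containing file's directory to a relative link target.
rebasePath :: FilePath -> String -> String
rebasePath dir path
  | null path              = path  -- empty paths stay unchanged
  | take 1 path == "#"     = path  -- pure fragments are cross-references
  | "://" `isInfixOf` path = path  -- crude URL check (assumption)
  | Posix.isAbsolute path || Windows.isAbsolute path = path
  | dir `elem` ["", "."]   = path  -- nothing to prepend
  | otherwise              = dir ++ "/" ++ path
```

For a file in `chap1`, `rebasePath "chap1" "img/a.png"` gives `"chap1/img/a.png"`, while `"#sec"` and `"/abs/a.png"` come back unchanged.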
2021-05-27 | LaTeX reader: improve `\def` and implement `\newif`. | John MacFarlane | 2 | -15/+63
- Improve parsing of `\def` macros. We previously set "verbatim mode" even for parsing the initial `\def`; this caused problems for things like
```
\def\foo{\def\bar{BAR}}
\foo
\bar
```
- Implement `\newif`.
- Add tests.
2021-05-25 | Logging: remove single quotes around paths in messages. | John MacFarlane | 1 | -6/+6
We weren't doing it consistently and it seems unnecessary.
2021-05-25 | Allow compilation with base 4.15 | Albert Krewinkel | 4 | -77/+72
2021-05-25 | Use haddock-library-1.10.0 | Albert Krewinkel | 1 | -1/+2
2021-05-25 | PandocMonad: add info message in `downloadOrRead` indicating what path local resources have been loaded from. | John MacFarlane | 1 | -5/+8
2021-05-25 | Logging: add LoadedResource constructor to LogMessage. | John MacFarlane | 1 | -0/+7
[API change] This is for INFO-level messages telling where image data has been loaded from. (This can vary because of the resource path.)
2021-05-25 | Jira: add support for "smart" links | Albert Krewinkel | 2 | -0/+4
Support has been added for the new `[alias|https://example.com|smart-card]` syntax.
2021-05-24 | MediaBag improvements. | John MacFarlane | 4 | -43/+50
In the current dev version, we will sometimes add a version of an image with a hashed name, keeping the original version with the original name, which would lead to undesirable duplication.
This change separates the media's filename from the media's canonical name (which is the path of the link in the document itself). Filenames are based on SHA1 hashes and assigned automatically.
In Text.Pandoc.MediaBag:
- Export MediaItem type [API change].
- Change MediaBag type to a map from Text to MediaItem [API change].
- `lookupMedia` now returns a `MediaItem` [API change].
- Change `insertMedia` so it sets the `mediaPath` to a filename based on the SHA1 hash of the contents. This will be used when contents are extracted.
In Text.Pandoc.Class.PandocMonad:
- Remove `fetchMediaResource` [API change].
Lua MediaBag module has been changed minimally. In the future it would be better, probably, to give Lua access to the full MediaItem type.
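The separation between a canonical name (the map key) and a hash-based filename can be pictured with a simplified model. This is only a sketch: it uses `String` instead of `Text` and `ByteString`, the record fields other than `mediaPath` are assumptions, the `digestOf` argument stands in for a SHA1 function, and extension handling is omitted.

```haskell
import qualified Data.Map as M

-- | Simplified model of a media item (field types are assumptions).
data MediaItem = MediaItem
  { mediaMimeType :: String
  , mediaPath     :: FilePath  -- filename used when contents are extracted
  , mediaContents :: String
  } deriving (Eq, Show)

-- | The bag maps a canonical name (the link path in the document)
-- to the stored item.
type MediaBag = M.Map String MediaItem

-- | Insert an item under its canonical name; the filename is derived
-- from the contents via the supplied digest function.
insertMedia :: (String -> String) -> String -> String -> String
            -> MediaBag -> MediaBag
insertMedia digestOf key mime contents =
  M.insert key (MediaItem mime (digestOf contents) contents)
```

Because the key and the `mediaPath` are independent, two links with different names but identical contents map to the same filename instead of producing duplicate files.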
2021-05-24 | Jira writer: use `{color}` when span has a color attribute | Albert Krewinkel | 1 | -3/+7
Closes: tarleb/jira-wiki-markup#10
2021-05-22 | Handle relative lengths (e.g. `2*`) in HTML column widths. | John MacFarlane | 1 | -14/+33
See <https://www.w3.org/TR/html4/types.html#h-6.6>. "A relative length has the form "i*", where "i" is an integer. When allotting space among elements competing for that space, user agents allot pixel and percentage lengths first, then divide up remaining available space among relative lengths. Each relative length receives a portion of the available space that is proportional to the integer preceding the "*". The value "*" is equivalent to "1*". Thus, if 60 pixels of space are available after the user agent allots pixel and percentage space, and the competing relative lengths are 1*, 2*, and 3*, the 1* will be alloted 10 pixels, the 2* will be alloted 20 pixels, and the 3* will be alloted 30 pixels." Closes #4063.
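The allotment rule quoted from the HTML 4 specification can be checked with a short sketch. `parseRelative` and `allot` are hypothetical helpers that illustrate the spec's arithmetic; they are not the HTML reader's implementation.

```haskell
import Data.Char (isDigit)

-- | Parse an HTML 4 relative length such as "2*"; a bare "*" means "1*".
parseRelative :: String -> Maybe Int
parseRelative "*" = Just 1
parseRelative s =
  case span isDigit s of
    (ds@(_:_), "*") -> Just (read ds)
    _               -> Nothing

-- | Divide the space left after pixel and percentage widths among the
-- relative lengths, in proportion to their integer weights.
allot :: Double -> [Int] -> [Double]
allot available weights
  | total <= 0 = map (const 0) weights
  | otherwise  = [ available * fromIntegral w / fromIntegral total
                 | w <- weights ]
  where
    total = sum weights
```

With 60 pixels remaining and weights `1*`, `2*` and `3*`, `allot 60 [1, 2, 3]` gives `[10.0, 20.0, 30.0]`, matching the example in the quoted passage.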
2021-05-22 | Revert "HTML reader: simplify col width parsing" | John MacFarlane | 1 | -9/+13
This reverts commit f76fe2ab56606528d4710cc6c40bceb5788c3906.
2021-05-22 | HTML reader: simplify col width parsing | Albert Krewinkel | 1 | -13/+9
2021-05-20 | DocBook reader: ensure that first and last names are separated. | John MacFarlane | 1 | -6/+14
Closes #6541.
2021-05-20 | Ms writer: handle tables with multiple paragraphs. | John MacFarlane | 1 | -6/+22
Previously they overflowed the table cell width. We now set line lengths per-cell and restore them after the table has been written. Closes #7288.
2021-05-20 | LaTeX reader: More siunitx improvements. Closes #6658. | John MacFarlane | 2 | -46/+95
There's still one slight divergence from the siunitx behavior: we get 'kg m/A/s' instead of 'kg m/(A s)'. At the moment I'm not going to worry about that.
2021-05-20 | LaTeX/siunitx: fix parsing of `\cubic` etc. See #6658. | John MacFarlane | 1 | -35/+50
2021-05-20 | LaTeX reader siunitx: fix + sign on ang. | John MacFarlane | 1 | -3/+6
2021-05-20 | LaTeX reader siunitx: add leading 0 to numbers starting with . | John MacFarlane | 1 | -2/+5