path: root/src/Text/Pandoc/Class/IO.hs
Age | Commit message | Author | Files | Lines
2021-06-10 | Fix MediaBag regressions. | John MacFarlane | 1 | -20/+21
With the 2.14 release `--extract-media` stopped working as before; there could be mismatches between the paths in the rendered document and the extracted media.

This patch makes several changes (while keeping the same API).

The `mediaPath` in 2.14 was always constructed from the SHA1 hash of the media contents. Now, we preserve the original path unless it's an absolute path or contains `..` segments (in that case we use a path based on the SHA1 hash of the contents).

When constructing a path from the SHA1 hash, we always use the original extension, if there is one. Otherwise we look up an appropriate extension for the mime type.

`mediaDirectory` and `mediaItems` now use the `mediaPath`, rather than the mediabag key, for the first component of the tuple. This makes more sense, I think, and fits with the documentation of these functions; eventually, though, we should rework the API so that `mediaItems` returns both the keys and the MediaItems.

Rewriting of source paths in `extractMedia` has been fixed.

`fillMediaBag` has been modified so that it doesn't modify image paths (that was part of the problem in #7345).

We now do path normalization (e.g. `\` separators on Windows) only in writing the media; the paths are left unchanged in the image links (sensibly, since they might be URLs and not file paths).

These changes should restore the original behavior from before 2.14.

Closes #7345.
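The path-selection rule described in this commit message can be sketched roughly as follows. This is a minimal illustration, not pandoc's actual code: `chooseMediaPath` is a hypothetical name, and the SHA1 digest and the MIME-derived extension are passed in as plain arguments (assumptions) so that the sketch needs only `base` and `filepath`.

```haskell
import Data.Maybe (fromMaybe)
import System.FilePath (isAbsolute, splitDirectories, takeExtension, (<.>))

-- | Hypothetical helper (not part of pandoc's API): pick the path under
-- which a media item would be stored.
chooseMediaPath
  :: String        -- ^ hex SHA1 digest of the contents, computed elsewhere (assumption)
  -> Maybe String  -- ^ extension looked up from the MIME type, if any (assumption)
  -> FilePath      -- ^ original path as it appears in the document
  -> FilePath
chooseMediaPath digest mimeExt orig
  | keepOriginal = orig            -- relative path with no ".." segments: preserve it
  | otherwise    = digest <.> ext  -- otherwise fall back to a hash-based name
  where
    keepOriginal = not (isAbsolute orig)
                && ".." `notElem` splitDirectories orig
    -- Prefer the original extension; only consult the MIME type if there is none.
    ext = case takeExtension orig of
            "" -> fromMaybe "" mimeExt
            e  -> e
```

Under these assumptions, a relative path such as `media/image1.png` is returned unchanged, while an absolute path or one containing `..` is replaced by the digest plus an extension.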
2021-06-03 | T.P.Class.IO: normalise path in writeMedia. | John MacFarlane | 1 | -3/+2
This ensures that we get `\` separators on Windows.
2021-05-24 | MediaBag improvements. | John MacFarlane | 1 | -5/+4
In the current dev version, we will sometimes add a version of an image with a hashed name, keeping the original version with the original name, which would lead to undesirable duplication.

This change separates the media's filename from the media's canonical name (which is the path of the link in the document itself). Filenames are based on SHA1 hashes and assigned automatically.

In Text.Pandoc.MediaBag:

- Export MediaItem type [API change].
- Change MediaBag type to a map from Text to MediaItem [API change].
- `lookupMedia` now returns a `MediaItem` [API change].
- Change `insertMedia` so it sets the `mediaPath` to a filename based on the SHA1 hash of the contents. This will be used when contents are extracted.

In Text.Pandoc.Class.PandocMonad:

- Remove `fetchMediaResource` [API change].

Lua MediaBag module has been changed minimally. In the future it would be better, probably, to give Lua access to the full MediaItem type.
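The separation between canonical name (the map key) and filename (`mediaPath`) can be pictured with a simplified model like the one below. It is a sketch under assumptions: only `MediaItem`, `MediaBag`, and `mediaPath` are named in the commit message; the other field names and the helpers `insertItem` and `lookupItem` are hypothetical stand-ins, not pandoc's exported API.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Map as M
import qualified Data.Text as T

-- | Simplified stand-in for a media item. Only `mediaPath` is named in
-- the commit message; the other fields are assumptions for illustration.
data MediaItem = MediaItem
  { mediaMimeType :: T.Text
  , mediaPath     :: FilePath       -- filename used when contents are extracted
  , mediaContents :: BL.ByteString
  } deriving (Eq, Show)

-- | A map from the canonical name (the link path in the document) to the item.
newtype MediaBag = MediaBag (M.Map T.Text MediaItem)

emptyBag :: MediaBag
emptyBag = MediaBag M.empty

-- | Hypothetical insert: store an item under its canonical name.
insertItem :: T.Text -> MediaItem -> MediaBag -> MediaBag
insertItem key item (MediaBag m) = MediaBag (M.insert key item m)

-- | Hypothetical lookup: returns the whole item, including its filename.
lookupItem :: T.Text -> MediaBag -> Maybe MediaItem
lookupItem key (MediaBag m) = M.lookup key m
```

Because the key and the `mediaPath` are distinct, a single entry can be looked up by the link path while being written out under a different filename, which is what avoids storing the same image twice.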
2021-02-22 | Text.Pandoc.UTF8: change IO functions to return Text, not String. | John MacFarlane | 1 | -3/+3
[API change] This affects `readFile`, `getContents`, `writeFileWith`, `writeFile`, `putStrWith`, `putStr`, `putStrLnWith`, `putStrLn`, `hPutStrWith`, `hPutStr`, `hPutStrLnWith`, `hPutStrLn`, `hGetContents`. This avoids the need to uselessly create a linked list of characters when emitting output.
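To illustrate the general idea of Text-based IO (as opposed to going through `String`, which is a linked list of `Char`), here is a minimal sketch using the `text` and `bytestring` packages. The names `readFileUtf8` and `writeFileUtf8` are hypothetical; this is not the code of `Text.Pandoc.UTF8`, which may handle details such as byte-order marks and line endings differently.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Hypothetical helper: read a UTF-8 file straight into Text,
-- without building an intermediate String.
readFileUtf8 :: FilePath -> IO T.Text
readFileUtf8 fp = TE.decodeUtf8 <$> B.readFile fp

-- | Hypothetical helper: encode Text as UTF-8 and write the bytes out.
writeFileUtf8 :: FilePath -> T.Text -> IO ()
writeFileUtf8 fp = B.writeFile fp . TE.encodeUtf8
```

Note that `decodeUtf8` throws an exception on invalid input; a more forgiving variant could use `decodeUtf8With` with a lenient error handler.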
2020-10-19 | Normalize rewritten image paths with --extract-media. | John MacFarlane | 1 | -1/+2
This change will avoid mixed paths like this one when `--extract-media` is used with a Word file: `![](C:\Git\TIJ4\Markdown/media/image30.wmf)`. Instead we'll get `![](C:\Git\TIJ4\Markdown\media\image30.wmf)`. Closes #6761.
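The effect of separator normalization on the path from this example can be reproduced on any platform with `System.FilePath.Windows`, which applies Windows path rules regardless of the host OS. This is only a sketch of the technique; `normaliseWindowsPath` is a hypothetical name, and the actual patch may call a different function or apply it at a different point.

```haskell
import qualified System.FilePath.Windows as Win

-- | Hypothetical helper: normalise a path using Windows rules, so that
-- mixed `/` and `\` separators become uniform `\` separators.
normaliseWindowsPath :: FilePath -> FilePath
normaliseWindowsPath = Win.normalise
```

Applied to `C:\Git\TIJ4\Markdown/media/image30.wmf`, this yields `C:\Git\TIJ4\Markdown\media\image30.wmf`. The platform-dependent `System.FilePath.normalise` behaves the same way when run on Windows and leaves backslashes alone on POSIX systems.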
2020-04-17 | Class: generalize PandocIO functions to MonadIO | Albert Krewinkel | 1 | -0/+231