author: John MacFarlane <jgm@berkeley.edu> 2021-06-10 16:47:02 -0700
committer: John MacFarlane <jgm@berkeley.edu> 2021-06-10 16:47:02 -0700
commit: 3776e828a83048697e5c64d9fb4bedc0145197dc
tree: cad6f9754a013ea5c86d4559b1ceeb3187d0e301 /MANUAL.txt
parent: aa79b3035c3343adf1bb41b37266049a65ab5da7
Fix MediaBag regressions.

With the 2.14 release `--extract-media` stopped working as before; there could be mismatches between the paths in the rendered document and the extracted media.

This patch makes several changes (while keeping the same API).

The `mediaPath` in 2.14 was always constructed from the SHA1 hash of the media contents. Now, we preserve the original path unless it's an absolute path or contains `..` segments (in that case we use a path based on the SHA1 hash of the contents).

When constructing a path from the SHA1 hash, we always use the original extension, if there is one. Otherwise we look up an appropriate extension for the mime type.

`mediaDirectory` and `mediaItems` now use the `mediaPath`, rather than the mediabag key, for the first component of the tuple. This makes more sense, I think, and fits with the documentation of these functions; eventually, though, we should rework the API so that `mediaItems` returns both the keys and the MediaItems.

Rewriting of source paths in `extractMedia` has been fixed.

`fillMediaBag` has been modified so that it doesn't modify image paths (that was part of the problem in #7345).

We now do path normalization (e.g. `\` separators on Windows) only in writing the media; the paths are left unchanged in the image links (sensibly, since they might be URLs and not file paths).

These changes should restore the original behavior from before 2.14.

Closes #7345.
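The path-selection rule described above can be sketched roughly as follows. This is an illustrative Python sketch, not pandoc's Haskell implementation: the function name `media_path` and its parameters are hypothetical, paths are treated as POSIX-style, and URL handling and Windows separators are ignored.

```python
import hashlib
import mimetypes
from pathlib import PurePosixPath
from typing import Optional


def media_path(original: str, contents: bytes,
               mime_type: Optional[str] = None) -> str:
    """Pick an output path for a media item (hypothetical helper).

    Keep the original path when it is relative and has no `..`
    segment; otherwise fall back to a name built from the SHA1
    hash of the contents.
    """
    p = PurePosixPath(original)
    if not p.is_absolute() and ".." not in p.parts:
        return original

    digest = hashlib.sha1(contents).hexdigest()
    # Prefer the original extension; only consult the mime type
    # when the original path has none.
    ext = p.suffix
    if not ext and mime_type:
        ext = mimetypes.guess_extension(mime_type) or ""
    return digest + ext


if __name__ == "__main__":
    data = b"abc"
    print(media_path("images/cat.png", data))         # kept as-is
    print(media_path("/tmp/cat.png", data))           # hash + ".png"
    print(media_path("../cat.png", data))             # hash + ".png"
    print(media_path("/tmp/blob", data, "image/png")) # hash + guessed extension
```

The sketch covers only how the name is chosen; writing the file and normalizing separators would happen in a separate step, as the commit message notes.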
Diffstat (limited to 'MANUAL.txt')
-rw-r--r-- MANUAL.txt 12
1 files changed, 6 insertions, 6 deletions
diff --git a/MANUAL.txt b/MANUAL.txt
index b3a1f95e2..ef569433a 100644
--- a/MANUAL.txt
+++ b/MANUAL.txt
@@ -675,12 +675,12 @@ header when requesting a document from a URL:
: Extract images and other media contained in or linked from
the source document to the path *DIR*, creating it if
necessary, and adjust the images references in the document
- so they point to the extracted files. If the source format is
- a binary container (docx, epub, or odt), the media is
- extracted from the container and the original
- filenames are used. Otherwise the media is read from the
- file system or downloaded, and new filenames are constructed
- based on SHA1 hashes of the contents.
+ so they point to the extracted files. Media are downloaded,
+ read from the file system, or extracted from a binary
+ container (e.g. docx), as needed. The original file paths
+ are used if they are relative paths not containing `..`.
+ Otherwise filenames are constructed from the SHA1 hash of
+ the contents.
`--abbreviations=`*FILE*