diff options
author | John MacFarlane <jgm@berkeley.edu> | 2021-05-23 22:57:02 -0700 |
---|---|---|
committer | John MacFarlane <jgm@berkeley.edu> | 2021-05-24 09:20:44 -0700 |
commit | 8511f6fdf6c9fbc2cc926538bca4ae9f554b4ed9 (patch) | |
tree | dfa3c06ba0756109cc5581f22d7d96d8f7ff3fed /src/Text/Pandoc/MediaBag.hs | |
parent | 58fbf56548bf985b40e4338befaf5b11a0665cbe (diff) | |
download | pandoc-8511f6fdf6c9fbc2cc926538bca4ae9f554b4ed9.tar.gz |
MediaBag improvements.
In the current dev version, we will sometimes add
a version of an image with a hashed name, keeping
the original version with the original name, which
would leave to undesirable duplication.
This change separates the media's filename from the
media's canonical name (which is the path of the link
in the document itself). Filenames are based on SHA1
hashes and assigned automatically.
In Text.Pandoc.MediaBag:
- Export MediaItem type [API change].
- Change MediaBag type to a map from Text to MediaItem [API change].
- `lookupMedia` now returns a `MediaItem` [API change].
- Change `insertMedia` so it sets the `mediaPath` to
a filename based on the SHA1 hash of the contents.
This will be used when contents are extracted.
In Text.Pandoc.Class.PandocMonad:
- Remove `fetchMediaResource` [API change].
Lua MediaBag module has been changed minimally. In the future
it would be better, probably, to give Lua access to the full
MediaItem type.
Diffstat (limited to 'src/Text/Pandoc/MediaBag.hs')
-rw-r--r-- | src/Text/Pandoc/MediaBag.hs | 35 |
1 files changed, 26 insertions, 9 deletions
diff --git a/src/Text/Pandoc/MediaBag.hs b/src/Text/Pandoc/MediaBag.hs index 4a9b4efa1..a65f315fc 100644 --- a/src/Text/Pandoc/MediaBag.hs +++ b/src/Text/Pandoc/MediaBag.hs @@ -15,6 +15,7 @@ Definition of a MediaBag object to hold binary resources, and an interface for interacting with it. -} module Text.Pandoc.MediaBag ( + MediaItem(..), MediaBag, deleteMedia, lookupMedia, @@ -28,15 +29,23 @@ import qualified Data.Map as M import Data.Maybe (fromMaybe) import Data.Typeable (Typeable) import System.FilePath -import Text.Pandoc.MIME (MimeType, getMimeTypeDef) +import Text.Pandoc.MIME (MimeType, getMimeTypeDef, extensionFromMimeType) import Data.Text (Text) import qualified Data.Text as T +import Data.Digest.Pure.SHA (sha1, showDigest) + +data MediaItem = + MediaItem + { mediaMimeType :: MimeType + , mediaPath :: FilePath + , mediaContents :: BL.ByteString + } deriving (Eq, Ord, Show, Data, Typeable) -- | A container for a collection of binary resources, with names and -- mime types. Note that a 'MediaBag' is a Monoid, so 'mempty' -- can be used for an empty 'MediaBag', and '<>' can be used to append -- two 'MediaBag's. -newtype MediaBag = MediaBag (M.Map Text (MimeType, BL.ByteString)) +newtype MediaBag = MediaBag (M.Map Text MediaItem) deriving (Semigroup, Monoid, Data, Typeable) instance Show MediaBag where @@ -62,26 +71,34 @@ insertMedia :: FilePath -- ^ relative path and canonical name of resource -> MediaBag -> MediaBag insertMedia fp mbMime contents (MediaBag mediamap) = - MediaBag (M.insert (canonicalize fp) (mime, contents) mediamap) - where mime = fromMaybe fallback mbMime + MediaBag (M.insert (canonicalize fp) mediaItem mediamap) + where mediaItem = MediaItem{ mediaPath = showDigest (sha1 contents) <> + "." <> ext + , mediaContents = contents + , mediaMimeType = mt } fallback = case takeExtension fp of ".gz" -> getMimeTypeDef $ dropExtension fp _ -> getMimeTypeDef fp + mt = fromMaybe fallback mbMime + ext = maybe (takeExtension fp) T.unpack $ extensionFromMimeType mt + -- | Lookup a media item in a 'MediaBag', returning mime type and contents. lookupMedia :: FilePath -> MediaBag - -> Maybe (MimeType, BL.ByteString) + -> Maybe MediaItem lookupMedia fp (MediaBag mediamap) = M.lookup (canonicalize fp) mediamap -- | Get a list of the file paths stored in a 'MediaBag', with -- their corresponding mime types and the lengths in bytes of the contents. mediaDirectory :: MediaBag -> [(FilePath, MimeType, Int)] mediaDirectory (MediaBag mediamap) = - M.foldrWithKey (\fp (mime,contents) -> - ((T.unpack fp, mime, fromIntegral (BL.length contents)):)) [] mediamap + M.foldrWithKey (\fp item -> + ((T.unpack fp, mediaMimeType item, + fromIntegral (BL.length (mediaContents item))):)) [] mediamap mediaItems :: MediaBag -> [(FilePath, MimeType, BL.ByteString)] mediaItems (MediaBag mediamap) = - M.foldrWithKey (\fp (mime,contents) -> - ((T.unpack fp, mime, contents):)) [] mediamap + M.foldrWithKey (\fp item -> + ((T.unpack fp, mediaMimeType item, mediaContents item):)) + [] mediamap |