path: root/src/Text/Pandoc/Class
Age | Commit message | Author | Files | Lines
2021-11-02 | Docx writer: use getTimestamp for modification times in reference.docx. | John MacFarlane | 1 | -1/+1
This ensures that when `SOURCE_DATE_EPOCH` is set, the modification times of files taken from the reference.docx will be set deterministically, allowing for reproducible builds. Closes #7654.
2021-08-28 | Add `--sandbox` option. | John MacFarlane | 2 | -79/+50
+ Add sandbox feature for readers. When this option is used, readers and writers only have access to input files (and other files specified directly on command line). This restriction is enforced in the type system.
+ Filters, PDF production, custom writers are unaffected. This feature only insulates the actual readers and writers, not the pipeline around them in Text.Pandoc.App.
+ Note that when `--sandbox` is specified, readers won't have access to the resource path, nor will anything have access to the user data directory.
+ Add module Text.Pandoc.Class.Sandbox, defining `sandbox`. Exported via Text.Pandoc.Class. [API change]
Closes #5045.
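One way a restriction like this can be enforced by types is sketched below. This is only an illustration under stated assumptions, not pandoc's implementation: `MonadFiles`, `Sandboxed`, `includeOrEmpty`, and `sandboxSketch` are hypothetical names, and the class is far smaller than `PandocMonad`. The idea is that the listed files are read up front in IO, and the reader then runs in a monad that has no IO at all.

```haskell
import qualified Data.Map as M
import Data.Maybe (fromMaybe)

-- | Hypothetical, minimal file-access class (an assumption for this sketch).
class Monad m => MonadFiles m where
  readFileM :: FilePath -> m (Maybe String)

-- | Pure interpreter: the only "files" are the entries of the map.
newtype Sandboxed a = Sandboxed { runSandboxed :: M.Map FilePath String -> a }

instance Functor Sandboxed where
  fmap f (Sandboxed g) = Sandboxed (f . g)

instance Applicative Sandboxed where
  pure = Sandboxed . const
  Sandboxed f <*> Sandboxed x = Sandboxed (\fs -> f fs (x fs))

instance Monad Sandboxed where
  Sandboxed x >>= k = Sandboxed (\fs -> runSandboxed (k (x fs)) fs)

instance MonadFiles Sandboxed where
  readFileM fp = Sandboxed (M.lookup fp)

-- | A "reader" written against the class; it cannot perform arbitrary IO.
includeOrEmpty :: MonadFiles m => FilePath -> m String
includeOrEmpty fp = fromMaybe "" <$> readFileM fp

-- | Read exactly the listed files in IO, then run the action purely.
sandboxSketch :: [FilePath] -> Sandboxed a -> IO a
sandboxSketch files action = do
  contents <- mapM readFile files
  pure (runSandboxed action (M.fromList (zip files contents)))
```

Because `Sandboxed` has no `MonadIO` instance, an action of type `Sandboxed a` cannot reach any file that was not passed to `sandboxSketch`; the compiler rejects the attempt.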
2021-08-24 | Text.Pandoc.Class: add readStdinStrict method to PandocMonad. | John MacFarlane | 4 | -0/+16
[API change]
2021-08-24 | Class: Generalize type of extractMedia. | John MacFarlane | 1 | -1/+1
It was uselessly restricted to PandocIO, instead of any instance of PandocMonad and MonadIO. [API change]
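The effect of this kind of generalization can be illustrated with a small, self-contained sketch. `AppIO` is a hypothetical stand-in for a concrete monad such as PandocIO, and `announce` is an invented function; the real `extractMedia` additionally requires a `PandocMonad` constraint, which is omitted here.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)

-- | Hypothetical concrete monad standing in for PandocIO.
newtype AppIO a = AppIO { runAppIO :: IO a }

instance Functor AppIO where
  fmap f (AppIO io) = AppIO (fmap f io)

instance Applicative AppIO where
  pure = AppIO . pure
  AppIO f <*> AppIO x = AppIO (f <*> x)

instance Monad AppIO where
  AppIO x >>= k = AppIO (x >>= runAppIO . k)

instance MonadIO AppIO where
  liftIO = AppIO

-- | Restricted signature: callable only from AppIO.
announceRestricted :: String -> AppIO Int
announceRestricted msg = AppIO (putStrLn msg >> pure (length msg))

-- | Generalized signature: callable from any MonadIO instance.
announce :: MonadIO m => String -> m Int
announce msg = liftIO (putStrLn msg >> pure (length msg))

main :: IO ()
main = do
  n1 <- announce "called directly in IO"
  n2 <- runAppIO (announce "called inside AppIO")
  print (n1, n2)
```

The body of the function is unchanged; only the signature is widened, so existing callers in the concrete monad keep working while new callers in other monad stacks become possible.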
2021-08-24 | Text.Pandoc.Filter: Generalize type of applyFilters... | John MacFarlane | 1 | -0/+79
from PandocIO to any instance of MonadIO and PandocMonad. [API change]
2021-08-22 | PandocIO: derive MonadCatch, MonadThrow, MonadMask. | John MacFarlane | 1 | -0/+4
This will allow us to use withTempDir.
2021-07-09 | Always use / when adding directory to image path with extractMedia. | John MacFarlane | 1 | -1/+1
Even on Windows. May help with #7431.
2021-06-10 | Fix MediaBag regressions. | John MacFarlane | 2 | -31/+27
With the 2.14 release `--extract-media` stopped working as before; there could be mismatches between the paths in the rendered document and the extracted media.
This patch makes several changes (while keeping the same API).
The `mediaPath` in 2.14 was always constructed from the SHA1 hash of the media contents. Now, we preserve the original path unless it's an absolute path or contains `..` segments (in that case we use a path based on the SHA1 hash of the contents).
When constructing a path from the SHA1 hash, we always use the original extension, if there is one. Otherwise we look up an appropriate extension for the mime type.
`mediaDirectory` and `mediaItems` now use the `mediaPath`, rather than the mediabag key, for the first component of the tuple. This makes more sense, I think, and fits with the documentation of these functions; eventually, though, we should rework the API so that `mediaItems` returns both the keys and the MediaItems.
Rewriting of source paths in `extractMedia` has been fixed.
`fillMediaBag` has been modified so that it doesn't modify image paths (that was part of the problem in #7345).
We now do path normalization (e.g. `\` separators on Windows) only in writing the media; the paths are left unchanged in the image links (sensibly, since they might be URLs and not file paths).
These changes should restore the original behavior from before 2.14.
Closes #7345.
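The path-selection rule described in this message can be sketched as follows. This is a minimal illustration, not the code from the patch: `chooseMediaPath` is a hypothetical name, the SHA1 digest and the MIME-derived extension are passed in as plain strings instead of being computed, and POSIX path rules are used so that the behavior is the same on every platform.

```haskell
import System.FilePath.Posix
  (isAbsolute, splitDirectories, takeExtension, (<.>))

-- | Hypothetical helper (an assumption for this sketch).
--   hashHex: hex SHA1 digest of the media contents, computed elsewhere.
--   mimeExt: extension looked up from the MIME type, e.g. "png".
--   orig:    the path as it appears in the document.
chooseMediaPath :: String -> String -> FilePath -> FilePath
chooseMediaPath hashHex mimeExt orig
  | unsafe    = hashHex <.> ext   -- fall back to a hash-based name
  | otherwise = orig              -- keep the original path
  where
    unsafe = isAbsolute orig || ".." `elem` splitDirectories orig
    -- Prefer the original extension; otherwise use the MIME-derived one.
    ext = case takeExtension orig of
            "" -> mimeExt
            e  -> e

main :: IO ()
main = mapM_ (putStrLn . chooseMediaPath "da39a3ee" "png")
  [ "images/cat.jpg"   -- relative and safe: kept as is
  , "/tmp/cat.jpg"     -- absolute: hashed, original extension kept
  , "../secret"        -- contains "..": hashed, MIME extension used
  ]
```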
2021-06-03 | T.P.Class.IO: normalise path in writeMedia. | John MacFarlane | 1 | -3/+2
This ensures that we get `\` separators on Windows.
2021-05-30 | Have LoadedResource use relative paths. | John MacFarlane | 1 | -2/+2
The immediate reason for this is to allow the test output of #3752 to work on both Windows and Linux.
2021-05-25 | PandocMonad: add info message in `downloadOrRead`... | John MacFarlane | 1 | -5/+8
indicating what path local resources have been loaded from.
2021-05-24 | MediaBag improvements. | John MacFarlane | 2 | -31/+21
In the current dev version, we will sometimes add a version of an image with a hashed name, keeping the original version with the original name, which would lead to undesirable duplication.
This change separates the media's filename from the media's canonical name (which is the path of the link in the document itself). Filenames are based on SHA1 hashes and assigned automatically.
In Text.Pandoc.MediaBag:
- Export MediaItem type [API change].
- Change MediaBag type to a map from Text to MediaItem [API change].
- `lookupMedia` now returns a `MediaItem` [API change].
- Change `insertMedia` so it sets the `mediaPath` to a filename based on the SHA1 hash of the contents. This will be used when contents are extracted.
In Text.Pandoc.Class.PandocMonad:
- Remove `fetchMediaResource` [API change].
Lua MediaBag module has been changed minimally. In the future it would be better, probably, to give Lua access to the full MediaItem type.
2021-05-19 | Remove unused pragma. | John MacFarlane | 1 | -1/+0
2021-05-18 | Use fetchItem instead of downloadOrRead in fetchMediaResource. | John MacFarlane | 1 | -1/+1
2021-05-18 | Text.Pandoc.MediaBag: change type to use a Text key... | John MacFarlane | 1 | -0/+1
instead of `[FilePath]`. We normalize the path and use `/` separators for consistency.
2021-04-17 | Update to released unicode-collation, latest citeproc dev version. | John MacFarlane | 2 | -2/+2
Update citeproc test.
2021-04-17 | Remove Text.Pandoc.BCP47 module. | John MacFarlane | 2 | -3/+3
[API change] Use Lang from UnicodeCollation.Lang instead. This is a richer implementation of BCP 47.
2021-03-15 | Use foldl' instead of foldl everywhere. | John MacFarlane | 1 | -1/+2
2021-02-22 | Text.Pandoc.UTF8: change IO functions to return Text, not String. | John MacFarlane | 1 | -3/+3
[API change] This affects `readFile`, `getContents`, `writeFileWith`, `writeFile`, `putStrWith`, `putStr`, `putStrLnWith`, `putStrLn`, `hPutStrWith`, `hPutStr`, `hPutStrLnWith`, `hPutStrLn`, `hGetContents`. This avoids the need to uselessly create a linked list of characters when emitting output.
2021-02-11 | T.P.Class: Add getTimestamp [API change]. | John MacFarlane | 1 | -2/+19
This attempts to read the SOURCE_DATE_EPOCH environment variable and parse a UTC time from it (treating it as a unix date stamp, see https://reproducible-builds.org/specs/source-date-epoch/). If the variable is not set or can't be parsed as a unix date stamp, then the function returns the current date.
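The lookup-and-fallback behavior described here can be sketched in a few lines. This is an illustration of the described logic, not the code from the commit; `parseEpoch` and `timestampSketch` are hypothetical names, and the value is assumed to be whole seconds since the Unix epoch.

```haskell
import Data.Time.Clock (UTCTime, getCurrentTime)
import Data.Time.Clock.POSIX (posixSecondsToUTCTime)
import System.Environment (lookupEnv)
import Text.Read (readMaybe)

-- | Parse a decimal count of seconds since the Unix epoch.
parseEpoch :: String -> Maybe UTCTime
parseEpoch s = posixSecondsToUTCTime . fromInteger <$> readMaybe s

-- | Hypothetical stand-in for getTimestamp: prefer SOURCE_DATE_EPOCH and
--   fall back to the current time if it is unset or cannot be parsed.
timestampSketch :: IO UTCTime
timestampSketch = do
  mval <- lookupEnv "SOURCE_DATE_EPOCH"
  case mval >>= parseEpoch of
    Just t  -> pure t
    Nothing -> getCurrentTime

main :: IO ()
main = timestampSketch >>= print
```

Running the program with `SOURCE_DATE_EPOCH` set to a fixed value yields the same timestamp on every run, which is the property reproducible builds rely on.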
2020-10-19 | Normalize rewritten image paths with --extract-media. | John MacFarlane | 1 | -1/+2
This change will avoid mixed paths like this one when `--extract-media` is used with a Word file: `![](C:\Git\TIJ4\Markdown/media/image30.wmf)` Instead we'll get `![](C:\Git\TIJ4\Markdown\media\image30.wmf)`. Closes #6761.
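The kind of normalization involved can be tried on any platform with the Windows flavor of the `filepath` library. The sketch below only demonstrates `normalise` on the path from this message; whether the commit uses exactly this call is an assumption.

```haskell
import qualified System.FilePath.Windows as W

-- | The mixed-separator path quoted in the commit message.
mixedPath :: FilePath
mixedPath = "C:\\Git\\TIJ4\\Markdown/media/image30.wmf"

main :: IO ()
main = putStrLn (W.normalise mixedPath)
```

Importing `System.FilePath.Windows` explicitly applies Windows path rules even when the program runs on Linux or macOS, which makes the behavior easy to check in tests.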
2020-10-15 | Fix some small typos in the API documentation (#6751) | Michael Hoffmann | 1 | -1/+1
While reading the docs I found a couple of small typos.
2020-10-14 | Fix typos in comments, doc strings, error messages, and tests | Albert Krewinkel | 1 | -1/+1
Typos reported by https://fossies.org/linux/test/pandoc-master.tar.gz/codespell.html See: #6738
2020-09-21 | Add built-in citation support using new citeproc library. | John MacFarlane | 1 | -2/+2
This deprecates the use of the external pandoc-citeproc filter; citation processing is now built in to pandoc.
* Add dependency on citeproc library.
* Add Text.Pandoc.Citeproc module (and some associated unexported modules under Text.Pandoc.Citeproc). Exports `processCitations`. [API change]
* Add data files needed for Text.Pandoc.Citeproc: default.csl in the data directory, and a citeproc directory that is just used at compile-time. Note that we've added file-embed as a mandatory rather than a conditional dependency, because of the biblatex localization files. We might eventually want to use readDataFile for this, but it would take some code reorganization.
* Text.Pandoc.Logging: Add `CiteprocWarning` to `LogMessage` and use it in `processCitations`. [API change]
* Add tests from the pandoc-citeproc package as command tests (including some tests pandoc-citeproc did not pass).
* Remove instructions for building pandoc-citeproc from CI and release binary build instructions. We will no longer distribute pandoc-citeproc.
* Markdown reader: tweak abbreviation support. Don't insert a nonbreaking space after a potential abbreviation if it comes right before a note or citation. Doing so messes up several things, including citeproc's moving of note citations.
* Add `csljson` as an input and output format. This allows pandoc to convert between `csljson` and other bibliography formats, and to generate formatted versions of CSL JSON bibliographies.
* Add module Text.Pandoc.Writers.CslJson, exporting `writeCslJson`. [API change]
* Add module Text.Pandoc.Readers.CslJson, exporting `readCslJson`. [API change]
* Added `bibtex`, `biblatex` as input formats. This allows pandoc to convert between BibLaTeX and BibTeX and other bibliography formats, and to generate formatted versions of BibTeX/BibLaTeX bibliographies.
* Add module Text.Pandoc.Readers.BibTeX, exporting `readBibTeX` and `readBibLaTeX`. [API change]
* Make "standalone" implicit if output format is a bibliography format. This is needed because pandoc readers for bibliography formats put the bibliographic information in the `references` field of metadata; and unless standalone is specified, metadata gets ignored. (TODO: This needs improvement. We should trigger standalone for the reader when the input format is bibliographic, and for the writer when the output format is markdown.)
* Carry over `citationNoteNum` to `citationNoteNumber`. This was just ignored in pandoc-citeproc.
* Text.Pandoc.Filter: Add `CiteprocFilter` constructor to Filter. [API change] This runs the processCitations transformation. We need to treat it like a filter so it can be placed in the sequence of filter runs (after some, before others). In FromYAML, this is parsed from `citeproc` or `{type: citeproc}`, so this special filter may be specified either way in a defaults file (or by `citeproc: true`, though this gives no control of positioning relative to other filters). TODO: we need to add something to the manual section on defaults files for this.
* Add deprecation warning if `pandoc-citeproc` filter is used.
* Add `--citeproc/-C` option to trigger citation processing. This behaves like a filter and will be positioned relative to filters as they appear on the command line.
* Rewrote the manual on citations, adding a dedicated Citations section which also includes some information formerly found in the pandoc-citeproc man page.
* Look for CSL styles in the `csl` subdirectory of the pandoc user data directory. This changes the old pandoc-citeproc behavior, which looked in `~/.csl`. Users can simply symlink `~/.csl` to the `csl` subdirectory of their pandoc user data directory if they want the old behavior.
* Add support for CSL bibliography entry formatting to LaTeX, HTML, Ms writers. Added CSL-related CSS to styles.html.
2020-09-13 | Fix hlint suggestions, update hlint.yaml (#6680) | Christian Despres | 1 | -5/+5
* Fix hlint suggestions, update hlint.yaml
  Most suggestions were redundant brackets. Some required LambdaCase. The .hlint.yaml file had a small typo, and didn't ignore camelCase suggestions in certain modules.
2020-04-17 | Class: generalize PandocIO functions to MonadIO | Albert Krewinkel | 2 | -167/+252
2020-04-13 | Add an option to disable certificate validation (#6156) | Cédric Couralet | 3 | -2/+12
This commit adds the option `--no-check-certificate`, which disables certificate checking when resources are fetched by HTTP.
Co-authored-by: Cécile Chemin <cecile.chemin@insee.fr>
Co-authored-by: Juliette Fourcot <juliette.fourcot@insee.fr>
2020-03-29 | Split the RNG so they don't end up equal again after 1 call to next (#6227) | Joseph C. Sible | 1 | -5/+5
2020-03-22 | Finer grained imports of Text.Pandoc.Class submodules (#6203) | Albert Krewinkel | 1 | -1/+1
This should speed up recompilation after changes in `Text.Pandoc.Class`, as the number of modules affected by a change will be smaller in general. It also offers faster insights into the parts of `T.P.Class` used within a module.
2020-03-22 | Text.Pandoc.Class: extract submodules PandocIO, PandocPure | Albert Krewinkel | 2 | -0/+436
2020-03-15 | Use implicit Prelude (#6187) | Albert Krewinkel | 1 | -2/+0
* Use implicit Prelude
  The previous behavior was introduced as a fix for #4464. It seems that this change alone did not fix the issue, and `stack ghci` and `cabal repl` only work with GHC 8.4.1 or newer, as no custom Prelude is loaded for these versions. Given this, it seems cleaner to revert to the implicit Prelude.
* PandocMonad: remove outdated check for base version
  Only base versions 4.9 and later are supported; the check for `MIN_VERSION_base(4,8,0)` is therefore unnecessary.
* Always use custom prelude
  Previously, the custom prelude was used only with older GHC versions, as a workaround for problems with ghci. The ghci problems are resolved by replacing package `base` with `base-noprelude`, allowing for consistent use of the custom prelude across all GHC versions.
2020-03-15 | PandocMonad: remove outdated check for base version | Albert Krewinkel | 1 | -6/+1
Only base versions 4.9 and later are supported; the check for `MIN_VERSION_base(4,8,0)` is therefore unnecessary.
2020-03-14 | Subdivide Text.Pandoc.Class into small modules (#6106) | Albert Krewinkel | 2 | -0/+786
* Extract CommonState into submodule
* Extract PandocMonad into submodule
* PandocMonad: ensure all functions have Haddock documentation