path: root/src/Text/Pandoc/Readers/Ipynb.hs
Age | Commit message | Author | Files | Lines
2021-05-09 | Change reader types, allowing better tracking of source positions. | John MacFarlane | 1 | -3/+5
Previously, when multiple file arguments were provided, pandoc simply concatenated them and passed the contents to the readers, which took a Text argument. As a result, the readers had no way of knowing which file was the source of any particular bit of text. This meant that we couldn't report accurate source positions on errors or include accurate source positions as attributes in the AST. More seriously, it meant that we couldn't resolve resource paths relative to the files containing them (see e.g. #5501, #6632, #6384, #3752).
Add Text.Pandoc.Sources (exported module), with a `Sources` type and a `ToSources` class. A `Sources` wraps a list of `(SourcePos, Text)` pairs. [API change]
A parsec `Stream` instance is provided for `Sources`. The module also exports versions of parsec's `satisfy` and other Char parsers that track source positions accurately from a `Sources` stream (or any instance of the new `UpdateSourcePos` class).
Text.Pandoc.Parsing now exports these modified Char parsers instead of the ones parsec provides. Modified parsers to use a `Sources` as stream [API change].
The readers that previously took a `Text` argument have been modified to take any instance of `ToSources`. So, they may still be used with a `Text`, but they can also be used with a `Sources` object.
In Text.Pandoc.Error, modified the constructor PandocParsecError to take a `Sources` rather than a `Text` as first argument, so parse error locations can be accurately reported.
T.P.Error: showPos, do not print "-" as source name.
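The `Sources` design described in this commit can be pictured with a small sketch. It is not the code in Text.Pandoc.Sources: `Pos` is a hypothetical stand-in for parsec's `SourcePos`, `String` replaces `Text` so that only `base` is needed, and `sourceNames` is a helper invented for illustration.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Hypothetical stand-in for parsec's SourcePos: file name, line, column.
data Pos = Pos
  { posName   :: FilePath
  , posLine   :: Int
  , posColumn :: Int
  } deriving (Eq, Show)

-- Each chunk of input is paired with the position at which it starts,
-- so the file a piece of text came from is never lost.
newtype Sources = Sources { unSources :: [(Pos, String)] }
  deriving (Eq, Show)

-- Anything convertible to Sources can be handed to a reader.
class ToSources a where
  toSources :: a -> Sources

instance ToSources Sources where
  toSources = id

-- A bare string carries no file name (this instance needs FlexibleInstances).
instance ToSources String where
  toSources s = Sources [(Pos "" 1 1, s)]

-- One entry per input file, each starting at line 1, column 1.
instance ToSources [(FilePath, String)] where
  toSources = Sources . map (\(fp, s) -> (Pos fp 1 1, s))

-- Illustrative helper: the file each chunk came from.
sourceNames :: Sources -> [FilePath]
sourceNames = map (posName . fst) . unSources
```

Because a plain string is itself an instance, callers that have only text keep working, while callers that know the file names can pass them along.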
2021-01-08 | Update copyright notices for 2021 (#7012) | Albert Krewinkel | 1 | -1/+1
2020-07-23 | Add `raw_markdown` extension affecting `ipynb` reader. | John MacFarlane | 1 | -1/+5
Specifying `-f ipynb+raw_markdown` will cause Markdown cells to be represented as raw Markdown blocks, instead of being parsed. This is not what you want when going from `ipynb` to other formats, but it may be useful when going from `ipynb` to Markdown or to `ipynb`, to avoid semantically insignificant changes in the contents of the Markdown cells that might otherwise be introduced. Closes #5408.
2020-07-02 | Revert "Ipynb: allow lossless round-tripping of markdown cell content." | John MacFarlane | 1 | -2/+1
This reverts commit efbc2050315b60c8a753dee6255465f1083019ab.
2020-07-02 | Revert "Ipynb reader: fix duplication of 'source' attribute." | John MacFarlane | 1 | -1/+1
This reverts commit 2d009366cef2358ec2c99612ae2c73068841306c.
2020-07-02 | Ipynb reader: fix duplication of 'source' attribute. | John MacFarlane | 1 | -1/+1
See #5408.
2020-06-30 | Ipynb: allow lossless round-tripping of markdown cell content. | John MacFarlane | 1 | -1/+2
The reader now parses the contents of the markdown cell to a Pandoc structure, but *also* stores the raw markdown in a `source` attribute on the cell Div. When we convert back to markdown, this attribute is stripped off and the original source is used. When we convert to other formats, the attribute is usually ignored (though it will come through in HTML as a `data-source` attribute, not unhelpfully).
I'll note some potential drawbacks of this approach:
- It makes it impossible to use pandoc to clean up or change the contents of markdown cells, e.g. going from `+smart` to `-smart`.
- There may be formats where the addition of the `source` attribute is problematic. I can't think of any, though.
Closes #5408.
2020-06-09 | Ipynb reader: handle application/pdf output as image. | John MacFarlane | 1 | -1/+1
Closes #6430.
2020-06-09 | Ipynb reader: properly handle image/svg+xml as an image. | John MacFarlane | 1 | -3/+5
Partially addresses #6430.
2020-04-15 | Implement the new Table type | despresc | 1 | -1/+1
2020-03-22 | Finer grained imports of Text.Pandoc.Class submodules (#6203) | Albert Krewinkel | 1 | -1/+1
This should speed up recompilation after changes in `Text.Pandoc.Class`, as the number of modules affected by a change will be smaller in general. It also offers faster insights into the parts of `T.P.Class` used within a module.
2020-03-15 | Use implicit Prelude (#6187) | Albert Krewinkel | 1 | -2/+0
* Use implicit Prelude
  The previous behavior was introduced as a fix for #4464. It seems that this change alone did not fix the issue, and `stack ghci` and `cabal repl` only work with GHC 8.4.1 or newer, as no custom Prelude is loaded for these versions. Given this, it seems cleaner to revert to the implicit Prelude.
* PandocMonad: remove outdated check for base version
  Only base versions 4.9 and later are supported, the check for `MIN_VERSION_base(4,8,0)` is therefore unnecessary.
* Always use custom prelude
  Previously, the custom prelude was used only with older GHC versions, as a workaround for problems with ghci. The ghci problems are resolved by replacing package `base` with `base-noprelude`, allowing for consistent use of the custom prelude across all GHC versions.
2020-03-13 | Update copyright year (#6186) | Albert Krewinkel | 1 | -1/+1
* Update copyright year
* Copyright: add notes for Lua and Jira modules
2020-02-07 | Resolve HLint warnings | Albert Krewinkel | 1 | -12/+10
All warnings are either fixed or, if more appropriate, HLint is configured to ignore them. HLint suggestions remain.
* Ignore "Use camelCase" warnings in Lua and legacy code
* Fix or ignore remaining HLint warnings
* Remove redundant brackets
* Remove redundant `return`s
* Remove redundant as-pattern
* Fuse mapM_/map
* Use `.` to shorten code
* Remove redundant `fmap`
* Remove unused LANGUAGE pragmas
* Hoist `not` in Text.Pandoc.App
* Use fewer imports for `Text.DocTemplates`
* Remove redundant `do`s
* Remove redundant `$`s
* Jira reader: remove unnecessary parentheses
2019-11-12 | Switch to new pandoc-types and use Text instead of String [API change]. | despresc | 1 | -40/+39
PR #5884.
+ Use pandoc-types 1.20 and texmath 0.12.
+ Text is now used instead of String, with a few exceptions.
+ In the MediaBag module, some of the types using Strings were switched to use FilePath instead (not Text).
+ In the Parsing module, new parsers `manyChar`, `many1Char`, `manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`, `manyUntilChar` have been added: these are like their unsuffixed counterparts but pack some or all of their output.
+ `glob` in Text.Pandoc.Class still takes String since it seems to be intended as an interface to Glob, which uses strings. It seems to be used only once in the package, in the EPUB writer, so that is not hard to change.
2019-07-02 | Fix redundant constraint warnings. (#5625) | Pete Ryland | 1 | -1/+1
2019-03-30 | ipynb reader/writer: use format 'ipynb' for raw cell where no format given. | John MacFarlane | 1 | -2/+3
According to nbformat docs, this is supposed to render in every format. We don't do that, but we at least preserve it as a raw block in markdown, so you can round-trip.
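The fallback described here can be sketched as a small lookup. This is a sketch under assumptions, not the reader's code: the function name `rawCellFormat` is hypothetical, the MIME table is an illustrative subset, and passing unknown types through unchanged is an assumption.

```haskell
-- Choose a pandoc raw-block format for a raw notebook cell, given the
-- MIME type recorded on the cell (if any).
rawCellFormat :: Maybe String -> String
rawCellFormat Nothing                = "ipynb"    -- no format given: keep the cell as raw `ipynb`
rawCellFormat (Just "text/html")     = "html"
rawCellFormat (Just "text/latex")    = "latex"
rawCellFormat (Just "text/markdown") = "markdown"
rawCellFormat (Just other)           = other      -- assumption: unknown types pass through as-is
```

With this shape, a raw cell that names no format survives a round trip as a raw `ipynb` block instead of being coerced to some other markup.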
2019-03-28 | Ipynb reader: use `html` for a raw cell with no format. | John MacFarlane | 1 | -1/+1
The nbformat spec says that when no format is specified, the raw cell will be rendered in every markup format. Pandoc doesn't have a construct that works this way, so we just fall back to `html`.
2019-03-27 | ipynb reader: avoid introducing spurious `.0` on integers in metadata. | John MacFarlane | 1 | -1/+4
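A decoder that treats every JSON number as a floating-point value will print `4` as `4.0`. One way to avoid that is sketched below. It uses only `base` and a plain `Double`; the name `showMetaNumber` is hypothetical and this is not necessarily how the reader does it.

```haskell
-- Render a numeric metadata value: whole numbers print without a
-- fractional part, everything else falls back to `show`.
showMetaNumber :: Double -> String
showMetaNumber d
  | isNaN d || isInfinite d = show d   -- not representable as an integer
  | d == fromIntegral n     = show n   -- whole number: drop the ".0"
  | otherwise               = show d
  where
    n = round d :: Integer
```

For example, `showMetaNumber 4` yields `"4"`, while `showMetaNumber 2.5` yields `"2.5"`.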
2019-03-10 | ipynb reader: removed vestigial ReaderOptions param. | John MacFarlane | 1 | -18/+16
2019-03-09 | ipynb reader: remove sensitivity to `raw_html`, `raw_tex` extensions. | John MacFarlane | 1 | -6/+2
We now include every output format. Pruning is handled by `--ipynb-output=`.
2019-03-09 | Ipynb reader/writer: better handling of cell metadata. | John MacFarlane | 1 | -7/+10
We now handle even complex cell metadata in the Div's attributes. Simple metadata fields are rendered as a plain string, and complex ones as JSON.
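The split between simple and complex metadata can be illustrated with a toy model. Everything here is hypothetical: the `Value` type and the minimal `encode` stand in for a real JSON library, and treating only strings as "simple" is an assumption about where the line is drawn.

```haskell
import Data.List (intercalate)

-- Toy JSON value type (a real reader would use a JSON library).
data Value
  = VString String
  | VInt Integer
  | VBool Bool
  | VNull
  | VArray [Value]
  | VObject [(String, Value)]

-- Quote a string, escaping only `"` and `\` (control characters are
-- not handled in this sketch).
quote :: String -> String
quote s = "\"" ++ concatMap esc s ++ "\""
  where
    esc '"'  = "\\\""
    esc '\\' = "\\\\"
    esc c    = [c]

-- Minimal JSON serialisation of the toy value type.
encode :: Value -> String
encode (VString s)  = quote s
encode (VInt n)     = show n
encode (VBool b)    = if b then "true" else "false"
encode VNull        = "null"
encode (VArray vs)  = "[" ++ intercalate "," (map encode vs) ++ "]"
encode (VObject kv) =
  "{" ++ intercalate "," [quote k ++ ":" ++ encode v | (k, v) <- kv] ++ "}"

-- Attribute value for a metadata field: strings are stored as-is,
-- anything else is stored as its JSON text.
attrValue :: Value -> String
attrValue (VString s) = s
attrValue v           = encode v
```

Storing complex values as JSON text keeps them recoverable on the way back out, since a writer can parse the attribute value again.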
2019-03-01 | Remove license boilerplate. | John MacFarlane | 1 | -18/+0
The haddock module header contains essentially the same information, so the boilerplate is redundant and just one more thing to get out of sync.
2019-02-11 | Remove redundant import. | John MacFarlane | 1 | -1/+0
2019-02-10 | ipynb writer: keep plain text fallbacks in output... | John MacFarlane | 1 | -26/+14
even if a richer format is included. We don't know what output format will be needed. The fallback can always be weeded out using a filter. Closes #5293.
2019-02-02 | ipynb reader: handle images referring to attachments. | John MacFarlane | 1 | -1/+9
Previously we didn't strip off the attachment: prefix, so even though the attachment is available in the mediabag, pandoc couldn't find it.
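The fix amounts to dropping the `attachment:` prefix before looking the image up. A minimal sketch using only `base` follows; the function name `resolveAttachment` is hypothetical and `String` is used in place of `Text`.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- Turn an image source such as "attachment:plot.png" into the name
-- under which the attachment was stored; other sources are unchanged.
resolveAttachment :: String -> String
resolveAttachment src = fromMaybe src (stripPrefix "attachment:" src)
```

Here `resolveAttachment "attachment:plot.png"` gives `"plot.png"`, and an ordinary path such as `"img/plot.png"` is returned untouched.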
2019-01-24 | Ipynb: Put all jupyter metadata under 'jupyter' key. | John MacFarlane | 1 | -1/+1
2019-01-24 | Revert "Prepend `jupyter_` to jupyter metadata keys." | John MacFarlane | 1 | -6/+0
This reverts commit 5eaff399d5d6dc30b0d453eff42c4101674d75ab.
2019-01-24 | Prepend `jupyter_` to jupyter metadata keys. | John MacFarlane | 1 | -0/+6
This avoids conflicts with things like 'toc'.
2019-01-22 | Support ipynb (Jupyter notebook) as input and output format. | John MacFarlane | 1 | -0/+249
[API change]
* Depend on ipynb library.
* Add `ipynb` as input and output format.
* Added Text.Pandoc.Readers.Ipynb (supports both nbformat v3 and v4).
* Added Text.Pandoc.Writers.Ipynb (supports nbformat v4).
* Added ipynb readers and writers to T.P.Readers, T.P.Writers, and T.P.Extensions. Register the file extension .ipynb for this format.
* Add `PandocIpynbDecodingError` constructor to Text.Pandoc.Error.Error.
* Note: there is no template for ipynb.