aboutsummaryrefslogtreecommitdiff
path: root/src/Text/Pandoc/Readers/CommonMark.hs
AgeCommit message (Collapse)AuthorFilesLines
2021-05-09Change reader types, allowing better tracking of source positions.John MacFarlane1-30/+40
Previously, when multiple file arguments were provided, pandoc simply concatenated them and passed the contents to the readers, which took a Text argument. As a result, the readers had no way of knowing which file was the source of any particular bit of text. This meant that we couldn't report accurate source positions on errors or include accurate source positions as attributes in the AST. More seriously, it meant that we couldn't resolve resource paths relative to the files containing them (see e.g. #5501, #6632, #6384, #3752). Add Text.Pandoc.Sources (exported module), with a `Sources` type and a `ToSources` class. A `Sources` wraps a list of `(SourcePos, Text)` pairs. [API change] A parsec `Stream` instance is provided for `Sources`. The module also exports versions of parsec's `satisfy` and other Char parsers that track source positions accurately from a `Sources` stream (or any instance of the new `UpdateSourcePos` class). Text.Pandoc.Parsing now exports these modified Char parsers instead of the ones parsec provides. Modified parsers to use a `Sources` as stream [API change]. The readers that previously took a `Text` argument have been modified to take any instance of `ToSources`. So, they may still be used with a `Text`, but they can also be used with a `Sources` object. In Text.Pandoc.Error, modified the constructor PandocParsecError to take a `Sources` rather than a `Text` as first argument, so parse error locations can be accurately reported. T.P.Error: showPos, do not print "-" as source name.
2021-03-20Support `yaml_metadata_block` extension form commonmark, gfm.John MacFarlane1-0/+30
This is a bit more limited than with markdown, as documented in the manual: - The YAML block must be the first thing in the input. - The leaf notes are parsed in isolation from the rest of the document. So, for example, you can't use reference links if the references are defined later in the document. Closes #6537.
2021-01-08Update copyright notices for 2021 (#7012)Albert Krewinkel1-1/+1
2020-12-10Add sourcepos extension for commonmarkeJohn MacFarlane1-5/+9
* Add `Ext_sourcepos` constructor for `Extension`. * Add `sourcepos` extension (only for commonmark). * Bump to 2.11.3 With the `sourcepos` extension set set, `data-pos` attributes are added to the AST by the commonmark reader. No other readers are affected. The `data-pos` attributes are put on elements that accept attributes; for other elements, an enlosing Div or Span is added to hold the attributes. Closes #4565.
2020-12-10Commonmark reader: refactor specFor, set input name to "".John MacFarlane1-2/+8
2020-10-12Commonmark reader: add pipe_table extension after defaults.John MacFarlane1-21/+22
Otherwise we get bad results for non-table, non-paragraph lines containing pipe characters. Closes #6739. See also jgm/commonmark-hs#52.
2020-09-13Fix hlint suggestions, update hlint.yaml (#6680)Christian Despres1-1/+0
* Fix hlint suggestions, update hlint.yaml Most suggestions were redundant brackets. Some required LambdaCase. The .hlint.yaml file had a small typo, and didn't ignore camelCase suggestions in certain modules.
2020-07-19Add generic `attributes` extension.John MacFarlane1-4/+1
This allows attributes to be added to any block or inline element, in principle. (Though in many cases this will be done by adding a Div or Span container, since pandoc's AST doesn't have a slot for attributes for most elements.) Currently this is only possible with the commonmark and gfm readers. Add `Ext_attributes` constructor for `Extension` [API change].
2020-07-19Use commonmark-hs to parse commonmark/gfm...John MacFarlane1-198/+43
...instead of cmark-gfm (a wrapper around a C library). We can now support many more pandoc extensions for commonmark and gfm. Add fenced_code_attributes to gfm/commonmark extensions.
2020-05-08Implement implicit_figures extension for commonmark reader.John MacFarlane1-1/+6
Closes #6350.
2020-04-15Adapt to the newest Table type, fix some previous adaptation issuesdespresc1-3/+9
- Writers.Native is now adapted to the new Table type. - Inline captions should now be conditionally wrapped in a Plain, not a Para block. - The toLegacyTable function now lives in Writers.Shared.
2020-04-15Implement the new Table typedespresc1-8/+10
2020-03-22Finer grained imports of Text.Pandoc.Class submodules (#6203)Albert Krewinkel1-1/+1
This should speed-up recompilation after changes in `Text.Pandoc.Class`, as the number of modules affected by a change will be smaller in general. It also offers faster insights into the parts of `T.P.Class` used within a module.
2020-03-15Use implicit Prelude (#6187)Albert Krewinkel1-2/+0
* Use implicit Prelude The previous behavior was introduced as a fix for #4464. It seems that this change alone did not fix the issue, and `stack ghci` and `cabal repl` only work with GHC 8.4.1 or newer, as no custom Prelude is loaded for these versions. Given this, it seems cleaner to revert to the implicit Prelude. * PandocMonad: remove outdated check for base version Only base versions 4.9 and later are supported, the check for `MIN_VERSION_base(4,8,0)` is therefore unnecessary. * Always use custom prelude Previously, the custom prelude was used only with older GHC versions, as a workaround for problems with ghci. The ghci problems are resolved by replacing package `base` with `base-noprelude`, allowing for consistent use of the custom prelude across all GHC versions.
2020-03-13Update copyright year (#6186)Albert Krewinkel1-1/+1
* Update copyright year * Copyright: add notes for Lua and Jira modules
2020-02-07Various minor cleanups and refactoring (#6117)Joseph C. Sible1-1/+1
* Use concatMap instead of reimplementing it * Replace an unnecessary multi-way if with a regular if * Use sortOn instead of sortBy and comparing * Use guards instead of lots of indents for if and else * Remove redundant do blocks * Extract common functions from both branches of maybe Whenever both the Nothing and the Just branch of maybe do the same function, do that function on the result of maybe instead. * Use fmap instead of reimplementing it from maybe * Use negative forms instead of negating the positive forms * Use mapMaybe instead of mapping and then using catMaybes * Use zipWith instead of mapping over the result of zip * Use unwords instead of reimplementing it * Use <$ instead of <$> and const * Replace case of Bool with if and else * Use find instead of listToMaybe and filter * Use zipWithM instead of mapM and zip * Inline lambda wrappers into the real functions * We get zipWithM from Text.Pandoc.Writers.Shared * Use maybe instead of fromMaybe and fmap I'm not sure how this one slipped past me. * Increase a bit of indentation
2019-11-12Switch to new pandoc-types and use Text instead of String [API change].despresc1-22/+23
PR #5884. + Use pandoc-types 1.20 and texmath 0.12. + Text is now used instead of String, with a few exceptions. + In the MediaBag module, some of the types using Strings were switched to use FilePath instead (not Text). + In the Parsing module, new parsers `manyChar`, `many1Char`, `manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`, `mantyUntilChar` have been added: these are like their unsuffixed counterparts but pack some or all of their output. + `glob` in Text.Pandoc.Class still takes String since it seems to be intended as an interface to Glob, which uses strings. It seems to be used only once in the package, in the EPUB writer, so that is not hard to change.
2019-03-01Remove license boilerplate.John MacFarlane1-18/+0
The haddock module header contains essentially the same information, so the boilerplate is redundant and just one more thing to get out of sync.
2019-02-04Add missing copyright notices and remove license boilerplate (#5112)Albert Krewinkel1-2/+2
Quite a few modules were missing copyright notices. This commit adds copyright notices everywhere via haddock module headers. The old license boilerplate comment is redundant with this and has been removed. Update copyright years to 2019. Closes #4592.
2019-01-02Implement task lists (#5139)Mauro Bieg1-2/+5
Closes #3051
2018-11-11Text.Pandoc.Shared: add parameter to uniqueIdent, inlineListToIdentifier.John MacFarlane1-27/+8
The parameter is Extensions. This allows these functions to be sensitive to the settings of `Ext_gfm_auto_identifiers` and `Ext_ascii_identifiers`. This allows us to use `uniqueIdent` in the CommonMark reader, replacing some custom code. It also means that `gfm_auto_identifiers` can now be used in all formats. Semantically, `gfm_auto_identifiers` is now a modifier of `auto_identifiers`; for identifiers to be set, `auto_identifiers` must be turned on, and then the type of identifier produced depends on `gfm_auto_identifiers` and `ascii_identifiers` are set. Closes #5057.
2018-11-11Clean up toIdent in CommonMark reader.John MacFarlane1-8/+9
This partially addresses #5057, fixing a bad interaction between the `ascii_identifiers` extension and the `gfm_auto_identifiers` extension, and creating identifiers that match the ones GitHub produces. This code still needs to be put somewhere common, so the `gfm_auto_identifiers` extension will work with other formats.
2018-07-15Wrap emojis in span nodes (#4759)Anders Waldenborg1-14/+17
Text.Pandoc.Emoji now exports `emojiToInline`, which returns a Span inline containing the emoji character and some attributes with metadata (class `emoji`, attribute `data-emoji` with emoji name). Previously, emojis (as supported in Markdown and CommonMark readers, e.g ":smile:") were simply translated into the corresponding unicode code point. By wrapping them in Span nodes, we make it possible to do special handling such as giving them a special font in HTML output. We also open up the possibility of treating them differently when the `--ascii` option is selected (though that is not part of this commit). Closes #4743.
2018-06-29CommonMark reader: Handle ascii_identifiers extension (#4733)Anders Waldenborg1-13/+18
Non-ascii characters were not stripped from identifiers even if the `ascii_identifiers` extension was enabled (which is is by default for gfm). Closes #4742
2018-03-18Use NoImplicitPrelude and explicitly import Prelude.John MacFarlane1-0/+2
This seems to be necessary if we are to use our custom Prelude with ghci. Closes #4464.
2018-03-02Revert "Commonmark reader: parse HTML as plain text if `-raw_html`."John MacFarlane1-2/+2
This reverts commit 6dd21250288b51f10056b15a83130f76c788d904.
2018-03-02Commonmark reader: parse HTML as plain text if `-raw_html`.John MacFarlane1-2/+2
2018-01-05Update copyright notices to include 2018Albert Krewinkel1-2/+2
2017-11-16Introduce `HasSyntaxExtensions` typeclass (#4074)Alexander1-13/+9
+ Added new `HasSyntaxExtensions` typeclass for `ReaderOptions` and `WriterOptions`. + Reimplemented `isEnabled` function from `Options.hs` to accept both `ReaderOptions` and `WriterOptions`. + Replaced `enabled` from `CommonMark.hs` with new `isEnabled`.
2017-11-01hlintAlexander Krotov1-1/+1
2017-10-27hlint changes.John MacFarlane1-2/+2
2017-10-27Automatic reformating by stylish-haskell.John MacFarlane1-27/+27
2017-10-26update years in copyrightKolen Cheung1-2/+2
2017-08-08Thread options through CommonMark reader.John MacFarlane1-81/+77
This is more efficient than doing AST traversals for emojis and hard breaks. Also make behavior sensitive to `raw_html` extension.
2017-08-08Support `hard_line_breaks` in CommonMark reader.John MacFarlane1-0/+7
2017-08-08CommonMark reader: support `emoji` extension.John MacFarlane1-1/+19
2017-08-08CommonMark reader: support `gfm_auto_identifiers`.John MacFarlane1-0/+31
Added `Ext_gfm_auto_identifiers`: new constructor for `Extension` in `Text.Pandoc.Extensions` [API change]. Use this in githubExtensions. Closes #2821.
2017-08-07CommonMark reader: make exts depend on extensions.John MacFarlane1-2/+4
2017-08-07Remove GFM modules; use CMarkGFM for both gfm and commonmark.John MacFarlane1-6/+63
We no longer have a separate readGFM and writeGFM; instead, we'll use readCommonMark and writeCommonMark with githubExtensions. It remains to implement these extensions conditionally. Closes #3841.
2017-08-07Added gfm (GitHub-flavored CommonMark) as an input and output format.John MacFarlane1-2/+2
This uses bindings to GitHub's fork of cmark, so it should parse gfm exactly as GitHub does (excepting certain postprocessing steps, involving notifications, emojis, etc.). * Added Text.Pandoc.Readers.GFM (exporting readGFM) * Added Text.Pandoc.Writers.GFM (exporting writeGFM) * Added `gfm` as input and output forma Note that tables are currently always rendered as HTML in the writer; this can be improved when CMarkGFM supports tables in output.
2017-06-10Changed all readers to take Text instead of String.John MacFarlane1-3/+3
Readers: Renamed StringReader -> TextReader. Updated tests. API change.
2017-03-04Stylish-haskell automatic formatting changes.John MacFarlane1-6/+6
2017-01-25Removed readerSmart and the --smart option; added Ext_smart extension.John MacFarlane1-1/+1
Now you will need to do -f markdown+smart instead of -f markdown --smart This change opens the way for writers, in addition to readers, to be sensitive to +smart, but this change hasn't yet been made. API change. Command-line option change. Updated manual.
2017-01-25Working on readers.Jesse Rosenthal1-3/+4
2015-12-29Use cmark 0.5.John MacFarlane1-4/+12
Closes #2605.
2015-12-12Modified readers to emit SoftBreak when appropriate.John MacFarlane1-1/+1
2015-08-07Updated readers, writers and README for link attributemb211-1/+1
2015-08-07Updated readers and writers for new image attribute parameter.John MacFarlane1-1/+1
(mb21)
2015-03-28Merge branch 'errortype' of https://github.com/mpickering/pandoc into ↵John MacFarlane1-2/+3
mpickering-errortype Conflicts: benchmark/benchmark-pandoc.hs src/Text/Pandoc/Readers/Markdown.hs src/Text/Pandoc/Readers/Org.hs src/Text/Pandoc/Readers/RST.hs tests/Tests/Readers/LaTeX.hs
2015-03-17Fixed a compiler warning.John MacFarlane1-1/+1