path: root/src/Text/Pandoc/Parsing.hs
Age  Commit message  Author  Files  Lines
2019-11-12Switch to new pandoc-types and use Text instead of String [API change].despresc1-150/+226
PR #5884.
+ Use pandoc-types 1.20 and texmath 0.12.
+ Text is now used instead of String, with a few exceptions.
+ In the MediaBag module, some of the types using Strings were switched to use FilePath instead (not Text).
+ In the Parsing module, new parsers `manyChar`, `many1Char`, `manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`, `manyUntilChar` have been added: these are like their unsuffixed counterparts but pack some or all of their output.
+ `glob` in Text.Pandoc.Class still takes String since it seems to be intended as an interface to Glob, which uses strings. It seems to be used only once in the package, in the EPUB writer, so that is not hard to change.
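To illustrate what "pack some or all of their output" means, here is a minimal sketch built on plain Parsec. The primed names (`manyChar'`, `many1Char'`, `manyTillChar'`) and the `run` helper are hypothetical stand-ins written for this example, not the definitions in Text.Pandoc.Parsing, and the sketch assumes the `parsec` and `text` packages are available.

```haskell
import qualified Data.Text as T
import Text.Parsec (anyChar, char, letter, many, many1, manyTill, parse)
import Text.Parsec.String (Parser)

-- Hypothetical stand-ins for the packing parsers: each runs the ordinary
-- Parsec combinator and packs the resulting String into Text.
manyChar' :: Parser Char -> Parser T.Text
manyChar' p = T.pack <$> many p

many1Char' :: Parser Char -> Parser T.Text
many1Char' p = T.pack <$> many1 p

manyTillChar' :: Parser Char -> Parser end -> Parser T.Text
manyTillChar' p end = T.pack <$> manyTill p end

-- Run a parser on a String, discarding the error message on failure.
run :: Parser a -> String -> Maybe a
run p = either (const Nothing) Just . parse p "example"

main :: IO ()
main = do
  print (run (manyChar' letter) "abc1")                     -- Just "abc"
  print (run (many1Char' letter) "1abc")                    -- Nothing
  print (run (manyTillChar' anyChar (char ';')) "key;rest") -- Just "key"
```

Packing at the combinator level keeps call sites free of explicit `T.pack` calls, which is presumably the convenience the new parsers are meant to provide.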
2019-11-11Fix typos (#5896)Brian Wignall1-1/+1
2019-09-28Use Prelude.fail to avoid ambiguity with fail from GHC.Base.John MacFarlane1-6/+6
2019-08-27Add stateAllowLineBreaks to ParserState. [API change]John MacFarlane1-0/+2
2019-08-26parseFromString': reset stateLastStrPos to Nothing before parse.John MacFarlane1-0/+1
2019-08-26Fix inline parsing in grid table cells.John MacFarlane1-14/+16
* T.P.Parsing: Change type of `setLastStrPos` so it takes a `Maybe SourcePos` rather than a `SourcePos`. [API change]
* T.P.Parsing: Make `parseFromString'` and `gridTableWith` and `gridTableWith'` polymorphic in the parser state, constraining it with `HasLastStrPosition`. [API change]
Closes #5708.
2019-07-02Fix redundant constraint warnings. (#5625)Pete Ryland1-8/+6
2019-03-01Remove license boilerplate.John MacFarlane1-18/+0
The haddock module header contains essentially the same information, so the boilerplate is redundant and just one more thing to get out of sync.
2019-02-04Add missing copyright notices and remove license boilerplate (#5112)Albert Krewinkel1-2/+2
Quite a few modules were missing copyright notices. This commit adds copyright notices everywhere via haddock module headers. The old license boilerplate comment is redundant with this and has been removed. Update copyright years to 2019. Closes #4592.
2018-12-31Remove unused HasHeaderMap (#5175)Alexander1-16/+1
It is updated by some readers, but never actually used.
2018-12-17Parsing: use safeRead instead of read.John MacFarlane1-1/+1
2018-11-11Text.Pandoc.Shared: add parameter to uniqueIdent, inlineListToIdentifier.John MacFarlane1-1/+1
The parameter is Extensions. This allows these functions to be sensitive to the settings of `Ext_gfm_auto_identifiers` and `Ext_ascii_identifiers`. This allows us to use `uniqueIdent` in the CommonMark reader, replacing some custom code. It also means that `gfm_auto_identifiers` can now be used in all formats. Semantically, `gfm_auto_identifiers` is now a modifier of `auto_identifiers`; for identifiers to be set, `auto_identifiers` must be turned on, and then the type of identifier produced depends on whether `gfm_auto_identifiers` and `ascii_identifiers` are set. Closes #5057.
2018-11-08Remove Functor and Applicative constraints where Monad already existsAlexander Krotov1-14/+7
2018-11-03Make readWithM accept Text input as well as String (API change)Alexander Krotov1-12/+6
2018-11-02Fix readWithM with Stream.John MacFarlane1-4/+2
2018-11-02T.P.Parsing: Generalize readWithM to any Char Stream.John MacFarlane1-5/+12
[API change]
2018-11-01Remove Monad constraint implied by StreamAlexander Krotov1-6/+6
2018-11-01hlint Parsing.hsAlexander Krotov1-11/+9
2018-11-01Make `uri` accept any stream with Char tokensAlexander Krotov1-1/+1
2018-11-01Rewrite "uri" without "withRaw"Alexander Krotov1-17/+16
2018-10-31Generalize gridTableWith to any streams with Char tokensAlexander Krotov1-16/+18
2018-10-31Generalize parseFromString'Alexander Krotov1-3/+3
2018-10-31Generalize parseFromString to any streams with Char tokenAlexander Krotov1-4/+5
2018-10-29LaTeX reader: allow space at end of math after `\`.John MacFarlane1-1/+1
Closes #5010. Expose trimMath from T.P.Shared.
2018-10-10Pandoc.Parsing: rewrite nonspaceChar using noneOfAlexander Krotov1-1/+1
2018-08-10Avoid incomplete pattern match.John MacFarlane1-5/+8
2018-08-10Avoid non-exhaustive pattern match.John MacFarlane1-11/+5
2018-07-02Spellcheck commentsAlexander Krotov1-1/+1
2018-05-09Parsing: Lookahead for non-whitespace after single/double quote start.John MacFarlane1-2/+4
Closes #4637.
2018-04-19Parsing.uri: don't treat `*` characters at end as part of URI.John MacFarlane1-1/+1
This fixes #4561, a bug parsing emphasized bare links in RST.
2018-04-09Fix a commentAlexander Krotov1-1/+1
2018-03-21Parsing: Fix romanNumeral parser.John MacFarlane1-3/+3
We previously accepted 'DDC' as 1100. Closes #4480.
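The kind of strictness this fix is about can be sketched as follows. This is not pandoc's `romanNumeral` parser (which is a Parsec parser); `countPrefix`, `place`, and `romanToInt` are hypothetical helpers over plain upper-case Strings, written only to show why allowing at most one `D` rejects input such as `DDC`.

```haskell
import Data.List (stripPrefix)

-- Strip up to `limit` copies of a prefix; return how many were stripped.
countPrefix :: Int -> String -> String -> (Int, String)
countPrefix limit pre = go 0
  where
    go n s
      | n < limit, Just rest <- stripPrefix pre s = go (n + 1) rest
      | otherwise = (n, s)

-- Read one decimal place: the "nine" form, the "four" form, or an
-- optional "five" symbol followed by at most three "one" symbols.
place :: String -> String -> String -> Int -> String -> (Int, String)
place one five ten unit s
  | Just r <- stripPrefix (one ++ ten) s = (9 * unit, r)
  | Just r <- stripPrefix (one ++ five) s = (4 * unit, r)
  | otherwise =
      let (f, r1) = countPrefix 1 five s
          (o, r2) = countPrefix 3 one r1
      in ((5 * f + o) * unit, r2)

-- Parse a strict upper-case Roman numeral; Nothing on malformed input.
romanToInt :: String -> Maybe Int
romanToInt s0 =
  let (m, s1) = countPrefix maxBound "M" s0 -- thousands may repeat
      (h, s2) = place "C" "D" "M" 100 s1
      (t, s3) = place "X" "L" "C" 10 s2
      (o, s4) = place "I" "V" "X" 1 s3
      total = 1000 * m + h + t + o
  in if null s4 && total > 0 then Just total else Nothing

main :: IO ()
main = do
  print (romanToInt "MC")      -- Just 1100
  print (romanToInt "DDC")     -- Nothing: a second D is never consumed
  print (romanToInt "MCMXCIV") -- Just 1994
```

A parser that counts repeated `D` symbols would read `DDC` as 500 + 500 + 100; limiting each "five" symbol to one occurrence leaves the second `D` unconsumed, so the whole input is rejected.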
2018-03-18Use NoImplicitPrelude and explicitly import Prelude.John MacFarlane1-0/+2
This seems to be necessary if we are to use our custom Prelude with ghci. Closes #4464.
2018-03-16Monoid/Semigroup cleanup relying on custom Prelude.John MacFarlane1-9/+0
2018-03-15Remove redundant import.John MacFarlane1-2/+0
2018-03-13Require pandoc-types 1.17.4.John MacFarlane1-2/+14
And a few tweaks related to the Semigroups/Monoid change. Closes #4448.
2018-02-23Export improved sepBy1 from Text.Pandoc.ParsingAlexander Krotov1-5/+11
2018-02-19Move manyUntil to Text.Pandoc.Parsing and use it in Txt2Tags readerAlexander Krotov1-0/+15
2018-01-31Export list marker parsers from Text.Pandoc.ParsingAlexander Krotov1-0/+5
2018-01-19hlint code improvements.John MacFarlane1-14/+10
2018-01-14Markdown reader: Improved inlinesInBalancedBrackets.John MacFarlane1-0/+1
The change both improves performance and fixes a regression whereby normal citations inside inline notes were not parsed correctly. Closes jgm/pandoc-citeproc#315.
2018-01-05Update copyright notices to include 2018Albert Krewinkel1-2/+2
2017-11-19Allow spaces after `\(` and before `\)` with `tex_math_single_backslash`.John MacFarlane1-2/+2
Previously `\( \frac{1}{a} < \frac{1}{b} \)` was not parsed as math in `markdown` or `html` `+tex_math_single_backslash`.
2017-11-14Text.Pandoc.Parsing.uri: allow `&` and `=` as word characters.John MacFarlane1-1/+1
This fixes a bug where pandoc would stop parsing a URI with an empty attribute: for example, `&a=&b=` would stop at `a`. (The uri parser tries to guess which punctuation characters are part of the URI and which might be punctuation after it.) Closes #4068.
2017-11-01hlintAlexander Krotov1-18/+18
2017-10-29Source code reformatting.John MacFarlane1-65/+64
2017-10-23Implemented fenced Divs.John MacFarlane1-0/+2
+ Added Ext_fenced_divs to Extensions (default for pandoc Markdown).
+ Document fenced_divs extension in manual.
+ Implemented fenced code divs in Markdown reader.
+ Added test.
Closes #168.
2017-08-28RST reader: handle blank lines correctly in line blocks (#3881)Alexander1-1/+1
Previously pandoc would sometimes combine two line blocks separated by blanks, and ignore trailing blank lines within the line block. The test was checked to be consistent with http://rst.ninjs.org/
2017-08-19Markdown reader: use CommonMark rules for list item nesting.John MacFarlane1-8/+28
Closes #3511.

Previously pandoc used the four-space rule: continuation paragraphs, sublists, and other block-level content had to be indented 4 spaces. Now the indentation required is determined by the first line of the list item: to be included in the list item, blocks must be indented to the level of the first non-space content after the list marker. Exception: if there are 5 or more spaces after the list marker, then the content is interpreted as an indented code block, and continuation paragraphs must be indented two spaces beyond the end of the list marker. See the CommonMark spec for more details and examples.

Documents that adhere to the four-space rule should, in most cases, be parsed the same way by the new rules. Here are some examples of texts that will be parsed differently:

    - a
      - b

will be parsed as a list item with a sublist; under the four-space rule, it would be a list with two items.

    - a

          code

Here we have an indented code block under the list item, even though it is only indented six spaces from the margin, because it is four spaces past the point where a continuation paragraph could begin. With the four-space rule, this would be a regular paragraph rather than a code block.

    - a

            code

Here the code block will start with two spaces, whereas under the four-space rule, it would start with `code`. With the four-space rule, indented code under a list item always must be indented eight spaces from the margin, while the new rules require only that it be indented four spaces from the beginning of the first non-space text after the list marker (here, `a`).

This change was motivated by a slew of bug reports from people who expected lists to work differently (#3125, #2367, #2575, #2210, #1990, #1137, #744, #172, #137, #128) and by the growing prevalence of CommonMark (now used by GitHub, for example). Users who want to use the old rules can select the `four_space_rule` extension.

* Added `four_space_rule` extension.
* Added `Ext_four_space_rule` to `Extensions`.
* `Parsing` now exports `gobbleAtMostSpaces`, and the type of `gobbleSpaces` has been changed so that a `ReaderOptions` parameter is not needed.
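The main nesting rule described above (indent to the first non-space content after the list marker) can be sketched as a small function. `continuationIndent` is a hypothetical helper for this illustration, not code from the Markdown reader; it handles only single-character bullet markers and deliberately returns `Nothing` for the 5-or-more-spaces exception rather than modeling it.

```haskell
-- Column (0-based) to which continuation blocks must be indented for a
-- bullet item with the given first line, under the new rule. Nothing
-- means "not a bullet item" or "the 5+ spaces exception, not modeled".
continuationIndent :: String -> Maybe Int
continuationIndent line =
  case span (== ' ') line of
    (lead, marker : rest)
      | marker `elem` "-*+"
      , let gap = length (takeWhile (== ' ') rest)
      , gap >= 1
      , gap < 5 ->
          -- leading spaces + one marker character + spaces after it
          Just (length lead + 1 + gap)
    _ -> Nothing

main :: IO ()
main = do
  -- "- a": content starts at column 2, so "  - b" nests under it,
  -- whereas the four-space rule would demand an indent of 4.
  print (continuationIndent "- a")         -- Just 2
  print (continuationIndent "  * item")    -- Just 4
  print (continuationIndent "-      code") -- Nothing (exception case)
```

With a content column of 2, an indented code block inside the item needs 2 + 4 = 6 spaces from the margin, which matches the six-space example in the commit message.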
2017-08-08Parsing: added gobbleSpaces.John MacFarlane1-0/+12
This is a utility function to use in list parsing.
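As a rough picture of what such a utility does, the sketch below works on plain Strings. The names `gobbleAtMost` and `gobbleExactly` are hypothetical, and the behavior (consume up to, or exactly, a given number of leading spaces) is an assumption inferred from the names `gobbleAtMostSpaces` and `gobbleSpaces`; the real functions are parsers, and this sketch does not handle tabs.

```haskell
-- Consume up to n leading spaces; return how many were consumed and
-- the remaining input.
gobbleAtMost :: Int -> String -> (Int, String)
gobbleAtMost n s =
  let taken = length (takeWhile (== ' ') (take n s))
  in (taken, drop taken s)

-- Consume exactly n leading spaces, failing if fewer are available.
gobbleExactly :: Int -> String -> Maybe String
gobbleExactly n s =
  case gobbleAtMost n s of
    (k, rest) | k == n -> Just rest
    _ -> Nothing

main :: IO ()
main = do
  print (gobbleAtMost 4 "  text")     -- (2,"text")
  print (gobbleExactly 2 "    code")  -- Just "  code"
  print (gobbleExactly 4 "  text")    -- Nothing
```

In list parsing, stripping exactly the item's content indentation from each continuation line leaves any extra indentation in place, which is how an indented code block inside a list item keeps its leading spaces.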