path: root/src/Text/Pandoc/Parsing.hs
Age | Commit message | Author | Files | Lines
2017-05-14 | Parsing: replace partial with total function | Albert Krewinkel | 1 | -1/+1
Calling `tail` on an empty list raises an exception, while calling the otherwise equivalent `drop 1` will return the empty list again.
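As a minimal sketch of the difference described above (the helper name `dropFirst` is hypothetical and not taken from the commit), the total variant can be written as:

```haskell
-- | Total replacement for `tail`: an empty list stays empty instead of
-- raising an exception. `dropFirst` is a hypothetical name for illustration.
dropFirst :: [a] -> [a]
dropFirst = drop 1
```

For non-empty input the result matches what `tail` would return; for empty input it is simply the empty list.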
2017-05-13 | Update dates in copyright notices | Albert Krewinkel | 1 | -2/+2
This follows the suggestions given by the FSF for GPL licensed software. <https://www.gnu.org/prep/maintain/html_node/Copyright-Notices.html>
2017-05-11 | Combine grid table parsers | Albert Krewinkel | 1 | -18/+51
The grid table parsers for markdown and rst were combined into a single parser, slightly changing the parsing behavior of both:
- The markdown parser now compactifies block content cell-wise: pure text blocks in cells are now treated as paragraphs only if the cell contains multiple paragraphs, and as plain blocks otherwise. Before, this was true only for single-column tables.
- The rst parser now accepts newlines and multiple blocks in header cells.
Closes: #3638
2017-05-02 | Generalize tableWith, gridTableWith | Albert Krewinkel | 1 | -23/+26
The parsing functions `tableWith` and `gridTableWith` are generalized to work with more parsers. The parser state only has to be an instance of the `HasOptions` class instead of requiring a concrete type. Block parsers are required to return blocks wrapped into a monad, as this makes it possible to use parsers returning results wrapped in `Future`s.
2017-04-30 | Provide shared F monad functions for Markdown and Org readers | Albert Krewinkel | 1 | -10/+25
The `F` monads used for delayed evaluation of certain values in the Markdown and Org readers are based on a shared data type capturing the common pattern of both `F` types.
2017-04-15 | Avoid parsing "Notes:**" as a bare URI. | John MacFarlane | 1 | -0/+2
This avoids parsing bare URIs that start with a scheme + colon + `*`, `_`, or `]`. Closes #3570.
2017-03-13 | Better handling of \part in LaTeX. | John MacFarlane | 1 | -2/+0
Closes #1905. Removed stateChapters from ParserState. Now we parse chapters as level 0 headers, and parts as level -1 headers. After parsing, we check for the lowest header level, and if it's less than 1 we bump everything up so that 1 is the lowest header level. So `\part` will always produce a header; no command-line options are needed.
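The level adjustment described above can be sketched as follows. This is only an illustration over a bare list of header levels; the function name `normalizeLevels` is hypothetical and the actual reader works on the parsed document, not on a list of integers.

```haskell
-- | Shift all header levels up so that the lowest level becomes 1,
-- but only when some level is below 1 (e.g. chapters at 0, parts at -1).
-- Hypothetical helper; not pandoc's own code.
normalizeLevels :: [Int] -> [Int]
normalizeLevels [] = []
normalizeLevels levels
  | lowest < 1 = map (+ (1 - lowest)) levels
  | otherwise  = levels
  where
    lowest = minimum levels
```

With a part at level -1 and a chapter at level 0, every level is shifted by 2, so the part ends up at level 1; a document whose lowest level is already 1 or higher is left unchanged.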
2017-03-12 | Issue warning for duplicate header identifiers. | John MacFarlane | 1 | -2/+8
As noted in the previous commit, an autogenerated identifier may still coincide with an explicit identifier that is given for a header later in the document, or with an identifier on a div, span, link, or image. This commit adds a warning in this case, so users can supply an explicit identifier.
* Added `DuplicateIdentifier` to LogMessage.
* Modified HTML, Org, MediaWiki readers so their custom state type is an instance of HasLogMessages. This is necessary for `registerHeader` to issue warnings.
See #1745.
2017-03-12 | Improved behavior of `auto_identifiers` when there are explicit ids. | John MacFarlane | 1 | -1/+2
Previously only autogenerated ids were added to the list of header identifiers in state, so explicit ids weren't taken into account when generating unique identifiers. Duplicated identifiers could result. This simple fix ensures that explicitly given identifiers are also taken into account. Fixes #1745. Note some limitations, however. An autogenerated identifier may still coincide with an explicit identifier that is given for a header later in the document, or with an identifier on a div, span, link, or image. Fixing this would be much more difficult, because we need to run `registerHeader` before we have the complete parse tree (so we can't get a complete list of identifiers from the document by walking the tree). However, it might be worth issuing warnings for duplicate header identifiers; I think we can do that. It is not common for headers to have the same text, and the issue can always be worked around by adding explicit identifiers, if the user is aware of it.
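The idea of the fix can be sketched as below. The names `uniqueIdentSketch` and `registerHeaderSketch`, their signatures, and the `-1`, `-2` suffix scheme are assumptions made for illustration; they are simplified stand-ins, not pandoc's `uniqueIdent` or `registerHeader`.

```haskell
import qualified Data.Set as Set

-- | Pick an identifier that is not yet in use by appending -1, -2, ...
-- (assumed suffix scheme, for illustration only).
uniqueIdentSketch :: String -> Set.Set String -> String
uniqueIdentSketch base used
  | base `Set.notMember` used = base
  | otherwise = go (1 :: Int)
  where
    go n
      | candidate `Set.notMember` used = candidate
      | otherwise = go (n + 1)
      where
        candidate = base ++ "-" ++ show n

-- | Register a header. An explicit id is kept verbatim, a missing id is
-- generated; in both cases the id is recorded so later headers see it.
registerHeaderSketch
  :: Maybe String        -- ^ explicit identifier, if any
  -> String              -- ^ base for the autogenerated identifier
  -> Set.Set String      -- ^ identifiers seen so far
  -> (String, Set.Set String)
registerHeaderSketch explicitId autoBase used =
  (ident, Set.insert ident used)
  where
    ident = case explicitId of
      Just i  -> i
      Nothing -> uniqueIdentSketch autoBase used
```

Because the explicit id is inserted into the set as well, a later header with the same text gets a fresh identifier. As the message notes, this still cannot see explicit ids that appear later in the document.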
2017-03-10 | Use pMacroDefinition in macro (for more direct parsing). | John MacFarlane | 1 | -13/+8
This is newly exported in texmath 0.9.3. Note that this means that `macro` will now parse one macro at a time, rather than parsing a whole group together.
2017-03-03 | RST reader: support RST-style citations. | John MacFarlane | 1 | -0/+2
The citations appear at the end of the document as a definition list in a special div with id `citations`. Citations link to the definitions. Added stateCitations to ParserState. Closes #853.
2017-02-20 | Revert "Refined constraint for HasQuoteContext instance." | John MacFarlane | 1 | -1/+1
This reverts commit 3c427fc17d53a564305aadde015dd2f048d9ff71.
2017-02-20 | Refined constraint for HasQuoteContext instance. | John MacFarlane | 1 | -1/+1
in hopes that this will help the ghc 7.8.4 build...
2017-02-20 | Removed redundant constraint. | John MacFarlane | 1 | -2/+1
2017-02-17 | Parsing: Added HasLogMessages, logMessage, reportLogMessages. | John MacFarlane | 1 | -0/+25
We need to do logging by updating parser state, or we'll get inappropriate and repeated log messages when there is parser backtracking. See #3447.
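Why keeping the log in parser state helps can be seen in a toy model. The sketch below uses a hand-rolled parser type rather than Parsec, and all names (`St`, `logMsg`, `andThen`, `orElse`, `char'`) are hypothetical; it only illustrates the principle that a message stored in state disappears when the branch that produced it is abandoned.

```haskell
-- | Parser state: remaining input plus accumulated log messages.
data St = St { stInput :: String, stLog :: [String] }

-- | A toy parser: takes a state, fails with Nothing or returns a result
-- together with the updated state.
type Parser a = St -> Maybe (a, St)

-- | Consume one expected character.
char' :: Char -> Parser Char
char' c st = case stInput st of
  (x:xs) | x == c -> Just (c, st { stInput = xs })
  _               -> Nothing

-- | Record a message in the state (hypothetical analogue of `logMessage`).
logMsg :: String -> Parser ()
logMsg m st = Just ((), st { stLog = m : stLog st })

-- | Run two parsers in sequence, keeping the second result.
andThen :: Parser a -> Parser b -> Parser b
andThen p q st = case p st of
  Nothing       -> Nothing
  Just (_, st') -> q st'

-- | Backtracking choice: if the first branch fails, retry from the
-- original state, which also discards the messages that branch logged.
orElse :: Parser a -> Parser a -> Parser a
orElse p q st = case p st of
  Nothing -> q st
  result  -> result

-- | The first branch logs and then fails on input "ac"; only the message
-- from the successful second branch survives.
example :: Parser Char
example =
  (logMsg "trying a then b" `andThen` char' 'a' `andThen` char' 'b')
    `orElse`
  (logMsg "fell back to a then c" `andThen` char' 'a' `andThen` char' 'c')
```

Running `example` on the input `"ac"` leaves exactly one message in `stLog`. If logging were done as an immediate side effect instead, the message from the abandoned first branch would have been emitted too.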
2017-02-11 | Use new warnings throughout the code base. | John MacFarlane | 1 | -2/+8
2017-02-07 | Refactored some files formerly in LaTeX reader. | John MacFarlane | 1 | -0/+23
* Export readFileFromDirs from Class.
* Export insertIncludedFile from Parsing.
Simplified code in LaTeX/RST readers.
2017-02-07 | Moved readFileFromDirs to Text.Pandoc.Class. | John MacFarlane | 1 | -3/+3
This can be used in several different modules, not just LaTeX reader.
2017-01-27 | Shared: rename compactify', compactify'DL -> compactify, compactifyDL. | John MacFarlane | 1 | -1/+1
2017-01-27 | Removed Shared.compactify. | John MacFarlane | 1 | -12/+12
Changed signatures on Parsing.tableWith and Parsing.gridTableWith.
2017-01-25 | Removed readerOldDashes and --old-dashes option, added old_dashes extension. | John MacFarlane | 1 | -1/+1
API change. CLI option change.
2017-01-25 | Removed readerSmart and the --smart option; added Ext_smart extension. | John MacFarlane | 1 | -5/+1
Now you will need to do `-f markdown+smart` instead of `-f markdown --smart`. This change opens the way for writers, in addition to readers, to be sensitive to `+smart`, but this change hasn't yet been made. API change. Command-line option change. Updated manual.
2017-01-25 | Make Extensions a custom type instead of a Set Extension. | John MacFarlane | 1 | -4/+4
The type is implemented in terms of an underlying bitset which should be more efficient. API change: from Text.Pandoc.Extensions export Extensions, emptyExtensions, extensionsFromList, enableExtension, disableExtension, extensionEnabled.
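A bitset-backed type with the exported names listed above could look roughly like this. The function names come from the commit message; the representation as an `Integer`, the three-constructor `Extension` type, and the argument order are assumptions for the sketch and may differ from the real module.

```haskell
import Data.Bits (clearBit, setBit, testBit)

-- | A small stand-in for the extension enumeration (the real type has
-- many more constructors).
data Extension = Ext_smart | Ext_old_dashes | Ext_auto_identifiers
  deriving (Show, Eq, Enum, Bounded)

-- | One bit per extension, indexed by `fromEnum` (assumed representation).
newtype Extensions = Extensions Integer
  deriving (Show, Eq)

emptyExtensions :: Extensions
emptyExtensions = Extensions 0

enableExtension :: Extension -> Extensions -> Extensions
enableExtension x (Extensions bits) = Extensions (setBit bits (fromEnum x))

disableExtension :: Extension -> Extensions -> Extensions
disableExtension x (Extensions bits) = Extensions (clearBit bits (fromEnum x))

extensionEnabled :: Extension -> Extensions -> Bool
extensionEnabled x (Extensions bits) = testBit bits (fromEnum x)

extensionsFromList :: [Extension] -> Extensions
extensionsFromList = foldr enableExtension emptyExtensions
```

Membership tests and updates become single bit operations on one number, which is where the expected efficiency gain over a `Set Extension` would come from.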
2017-01-25 | LaTeX reader: Proper include file processing. | John MacFarlane | 1 | -0/+2
* Removed handleIncludes from LaTeX reader [API change].
* Now the ordinary LaTeX reader handles includes in a way that is appropriate to the monad it is run in.
2017-01-25 | Parsing: Removed obsolete warnings stuff. | John MacFarlane | 1 | -21/+3
Removed stateWarnings, addWarning, and readWithWarnings.
2017-01-25 | Remove OverlappingInstances pragma. | Jesse Rosenthal | 1 | -1/+0
It doesn't help to solve the problem in 7.8.
2017-01-25 | Try adding OverlappingInstances pragma to parsing. | Jesse Rosenthal | 1 | -0/+1
It's having trouble figuring out HasQuoteContext.
2017-01-25 | Unify Errors. | Jesse Rosenthal | 1 | -1/+1
2017-01-25 | Add IncoherentInstances pragma for HasQuotedContext. | Jesse Rosenthal | 1 | -1/+3
We can remove this if we can figure out a better way to do this.
2016-10-23 | Tighten up parsing of raw email addresses. | John MacFarlane | 1 | -4/+13
Technically `**@user` is a valid email address, but if we allow things like this, we get bad results in markdown flavors that autolink raw email addresses. (See #2940.) So we exclude a few valid email addresses in order to avoid these more common bad cases. Closes #2940.
2016-10-13 | Allow empty lines when parsing line blocks | Albert Krewinkel | 1 | -2/+5
Line blocks are allowed to contain empty lines and should be parsed as a single block in that case. Previously an empty (line block) line would have terminated parsing of the line block element.
2016-09-02 | Remove TagSoup compat | Jesse Rosenthal | 1 | -3/+3
We already lower-bound tagsoup at 0.13.7, which means we were always running the compatibility layer (it was conditional on min value 0.13). Better to just use `lookupEntity` from the library directly, and convert a string to a char if need be.
2016-09-02 | Remove Compat.Monoid | Jesse Rosenthal | 1 | -1/+1
This was only necessary for GHC versions with base below 4.5 (i.e., ghc < 7.4).
2016-07-15 | Use liftM since otherwise a Functor type constraint is needed in ghc 7.8. | John MacFarlane | 1 | -1/+1
2016-07-14 | Fixed compiler warnings. | John MacFarlane | 1 | -3/+3
2016-03-22 | Updated copyright dates to include 2016. | John MacFarlane | 1 | -2/+2
2016-01-22 | Changed type of Shared.uniqueIdent argument from [String] to Set String. | John MacFarlane | 1 | -6/+6
This avoids performance problems in documents with many identically named headers. Closes #2671.
2016-01-08 | Work around tagsoup bug - not allowing uppercase x in hex entities. | John MacFarlane | 1 | -0/+1
Issue submitted at tagsoup.
2016-01-08 | Entity handling fixes: | John MacFarlane | 1 | -1/+4
- Text.Pandoc.XML.fromEntities: handle entities without a semicolon. Always look up character references with the trailing ';', even if it wasn't present. And never add it when looking up numerical entities. (This is what tagsoup seems to require.)
- Text.Pandoc.Parsing.characterReference: Always look up character references with the trailing ';', and leave off the ';' when looking up numerical entities.
This fixes a regression for e.g. `&lang;`.
2015-12-12 | Fixed cite key parsing regression. | John MacFarlane | 1 | -1/+1
We were capturing final colons as in [@foo: bar]; the citation id was being parsed as "@foo:". Closes jgm/pandoc-citeproc#201.
2015-11-19 | Merge branch 'new-image-attributes' of https://github.com/mb21/pandoc into mb21-new-image-attributes | John MacFarlane | 1 | -2/+14
* Bumped version to 1.16.
* Added Attr field to Link and Image.
* Added `common_link_attributes` extension.
* Updated readers for link attributes.
* Updated writers for link attributes.
* Updated tests.
* Updated stack.yaml to build against unreleased versions of pandoc-types and texmath.
* Fixed various compiler warnings.
Closes #261.
TODO:
* Relative (percentage) image widths in docx writer.
* ODT/OpenDocument writer (untested, same issue about percentage widths).
* Update pandoc-citeproc.
2015-11-13 | Allow `://` in citation keys. | John MacFarlane | 1 | -1/+2
Closes jgm/pandoc-citeproc#166.
2015-11-09 | Restored Text.Pandoc.Compat.Monoid. | John MacFarlane | 1 | -0/+1
Don't use custom prelude for latest ghc. This is a better approach to making 'stack ghci' and 'cabal repl' work. Instead of using NoImplicitPrelude, we only use the custom prelude for older ghc versions. The custom prelude presents a uniform API that matches the current base version's prelude. So, when developing (presumably with latest ghc), we don't use a custom prelude at all and hence have no trouble with ghci. The custom prelude no longer exports (<>): we now want to match the base 4.8 prelude behavior.
2015-11-09 | Revert "Use -XNoImplicitPrelude and 'import Prelude' explicitly." | John MacFarlane | 1 | -1/+0
This reverts commit c423dbb5a34c2d1195020e0f0ca3aae883d0749b.
2015-11-08 | Use -XNoImplicitPrelude and 'import Prelude' explicitly. | John MacFarlane | 1 | -0/+1
This is needed for ghci to work with pandoc, given that we now use a custom prelude. Closes #2503.
2015-10-14 | Use custom Prelude to avoid compiler warnings. | John MacFarlane | 1 | -2/+0
- The (non-exported) prelude is in prelude/Prelude.hs.
- It exports Monoid and Applicative, like base 4.8 prelude, but works with older base versions.
- It exports (<>) for mappend.
- It hides 'catch' on older base versions.
This allows us to remove many imports of Data.Monoid and Control.Applicative, and remove Text.Pandoc.Compat.Monoid. It should allow us to use -Wall again for ghc 7.10.
2015-08-05 | Parsing: Add `extractIdClass`, modified type of `KeyTable`. | John MacFarlane | 1 | -2/+14
(mb21)
2015-07-23 | Parsing: toKey: strip off outer brackets. | John MacFarlane | 1 | -2/+4
This makes keys with extra space at the beginning and end work: e.g., given the reference definition `[foo]: bar`, the reference `[ foo ]` will now be a link to `bar` (it wasn't before).
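A key normalization of this kind might be sketched as follows. `toKeySketch` is a hypothetical name, and the exact steps (strip one pair of outer brackets, collapse whitespace, lowercase) are an assumption about what such a function does, not a copy of pandoc's `toKey`.

```haskell
import Data.Char (toLower)

-- | Normalize a reference label so that labels differing only in outer
-- brackets, surrounding or repeated whitespace, or case compare equal.
-- Hypothetical sketch; the real `toKey` returns a dedicated key type.
toKeySketch :: String -> String
toKeySketch = map toLower . unwords . words . unbracket
  where
    -- Strip a single pair of outer brackets, if present.
    unbracket ('[':xs)
      | not (null xs), last xs == ']' = init xs
    unbracket xs = xs
```

Stripping the brackets before `words` is what lets `[ foo ]` and `[foo]` map to the same key: the padding spaces end up at the edges of the string and are dropped.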
2015-07-14 | Improved bare autolink detection. | John MacFarlane | 1 | -3/+2
Previously we disallowed `-` at the end of an autolink, and disallowed the combination `=-`. This commit liberalizes the rules for allowing punctuation in a bare URI. Added test cases. One potential drawback is that you can no longer put a bare URI in em dashes like this: `this uri---http://example.com---is an example`. But in this respect we now match github's treatment of bare URIs. Closes #2299.
2015-05-13 | Markdown reader: Made implicit header references case-insensitive. | John MacFarlane | 1 | -1/+3
Added `stateHeaderKeys` to `ParserState`; this is a `KeyTable` like `stateKeys`, but it only gets consulted if we don't find a match in `stateKeys`, and if `Ext_implicit_header_references` is enabled. Closes #1606.
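The two-stage lookup described above can be sketched like this. The simplified `KeyTable` (a map from lowercased key to target string), the function name `lookupKeySketch`, and the Boolean flag standing in for the `Ext_implicit_header_references` check are all assumptions for illustration.

```haskell
import Control.Applicative ((<|>))
import Data.Char (toLower)
import qualified Data.Map as Map

-- | Simplified key table: lowercased key -> link target.
type KeyTable = Map.Map String String

-- | Look in the explicit keys first; consult the header keys only when
-- that fails and implicit header references are enabled.
lookupKeySketch
  :: Bool      -- ^ stand-in for "Ext_implicit_header_references enabled"
  -> KeyTable  -- ^ explicit reference keys (cf. `stateKeys`)
  -> KeyTable  -- ^ header keys (cf. `stateHeaderKeys`)
  -> String    -- ^ label as written in the document
  -> Maybe String
lookupKeySketch implicitHeaderRefs keys headerKeys label =
  Map.lookup key keys <|> fallback
  where
    key = map toLower label  -- lowercasing makes the match case-insensitive
    fallback
      | implicitHeaderRefs = Map.lookup key headerKeys
      | otherwise          = Nothing
```

Keeping header keys in a separate table means an explicit reference definition always wins over a header with the same text.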