aboutsummaryrefslogtreecommitdiff
path: root/src/Text/Pandoc/Readers/Txt2Tags.hs
AgeCommit message (Collapse)AuthorFilesLines
2019-11-12Switch to new pandoc-types and use Text instead of String [API change].despresc1-66/+66
PR #5884. + Use pandoc-types 1.20 and texmath 0.12. + Text is now used instead of String, with a few exceptions. + In the MediaBag module, some of the types using Strings were switched to use FilePath instead (not Text). + In the Parsing module, new parsers `manyChar`, `many1Char`, `manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`, `mantyUntilChar` have been added: these are like their unsuffixed counterparts but pack some or all of their output. + `glob` in Text.Pandoc.Class still takes String since it seems to be intended as an interface to Glob, which uses strings. It seems to be used only once in the package, in the EPUB writer, so that is not hard to change.
2019-03-01Remove license boilerplate.John MacFarlane1-18/+0
The haddock module header contains essentially the same information, so the boilerplate is redundant and just one more thing to get out of sync.
2018-08-10Avoid non-exhaustive pattern match.John MacFarlane1-2/+4
2018-07-02Spellcheck commentsAlexander Krotov1-1/+1
2018-03-18Removed old-locale flag and Text.Pandoc.Compat.Time.John MacFarlane1-1/+1
This is no longer necessary since we no longer support ghc 7.8.
2018-03-18Use NoImplicitPrelude and explicitly import Prelude.John MacFarlane1-0/+2
This seems to be necessary if we are to use our custom Prelude with ghci. Closes #4464.
2018-03-16Monoid/Semiground cleanup relying on custom Prelude.John MacFarlane1-2/+1
2018-02-19Move manyUntil to Text.Pandoc.Parsing and use it in Txt2Tags readerAlexander Krotov1-2/+1
2018-01-19hlint code improvements.John MacFarlane1-2/+2
2017-11-10Txt2Tags reader: hlintAlexander Krotov1-27/+25
2017-11-02hlintAlexander Krotov1-1/+1
2017-10-27Automatic reformating by stylish-haskell.John MacFarlane1-10/+11
2017-10-27Consistent underline for Readers (#2270)hftf1-2/+2
* Added underlineSpan builder function. This can be easily updated if needed. The purpose is for Readers to transform underlines consistently. * Docx Reader: Use underlineSpan and update test * Org Reader: Use underlineSpan and add test * Textile Reader: Use underlineSpan and add test case * Txt2Tags Reader: Use underlineSpan and update test * HTML Reader: Use underlineSpan and add test case
2017-09-30Removed writerSourceURL, add source URL to common state.John MacFarlane1-8/+2
Removed `writerSourceURL` from `WriterOptions` (API change). Added `stSourceURL` to `CommonState`. It is set automatically by `setInputFiles`. Text.Pandoc.Class now exports `setInputFiles`, `setOutputFile`. The type of `getInputFiles` has changed; it now returns `[FilePath]` instead of `Maybe [FilePath]`. Functions in Class that formerly took the source URL as a parameter now have one fewer parameter (`fetchItem`, `downloadOrRead`, `setMediaResource`, `fillMediaBag`). Removed `WriterOptions` parameter from `makeSelfContained` in `SelfContained`.
2017-07-07Rewrote LaTeX reader with proper tokenization.John MacFarlane1-1/+1
This rewrite is primarily motivated by the need to get macros working properly. A side benefit is that the reader is significantly faster (27s -> 19s in one benchmark, and there is a lot of room for further optimization). We now tokenize the input text, then parse the token stream. Macros modify the token stream, so they should now be effective in any context, including math. Thus, we no longer need the clunky macro processing capacities of texmath. A custom state LaTeXState is used instead of ParserState. This, plus the tokenization, will require some rewriting of the exported functions rawLaTeXInline, inlineCommand, rawLaTeXBlock. * Added Text.Pandoc.Readers.LaTeX.Types (new exported module). Exports Macro, Tok, TokType, Line, Column. [API change] * Text.Pandoc.Parsing: adjusted type of `insertIncludedFile` so it can be used with token parser. * Removed old texmath macro stuff from Parsing. Use Macro from Text.Pandoc.Readers.LaTeX.Types instead. * Removed texmath macro material from Markdown reader. * Changed types for Text.Pandoc.Readers.LaTeX's rawLaTeXInline and rawLaTeXBlock. (Both now return a String, and they are polymorphic in state.) * Added orgMacros field to OrgState. [API change] * Removed readerApplyMacros from ReaderOptions. Now we just check the `latex_macros` reader extension. * Allow `\newcommand\foo{blah}` without braces. Fixes #1390. Fixes #2118. Fixes #3236. Fixes #3779. Fixes #934. Fixes #982.
2017-06-20Move CR filtering from tabFilter to the readers.John MacFarlane1-2/+4
The readers previously assumed that CRs had been filtered from the input. Now we strip the CRs in the readers themselves, before parsing. (The point of this is just to simplify the parsers.) Shared now exports a new function `crFilter`. [API change] And `tabFilter` no longer filters CRs.
2017-06-10Changed all readers to take Text instead of String.John MacFarlane1-3/+4
Readers: Renamed StringReader -> TextReader. Updated tests. API change.
2017-05-24Parsing: Provide parseFromString'.John MacFarlane1-3/+3
This is a verison of parseFromString specialied to ParserState, which resets stateLastStrPos at the end. This is almost always what we want. This fixes a bug where `_hi_` wasn't treated as emphasis in the following, because pandoc got confused about the position of the last word: - [o] _hi_ Closes #3690.
2017-05-23Shared: Provide custom isURI that rejects unknown schemes [isURI]Albert Krewinkel1-1/+0
We also export the set of known `schemes`. The new function replaces the function of the same name from `Network.URI`, as the latter did not check whether a scheme is well-known. E.g. MediaWiki wikis frequently feature pages with names like `User:John`. These links were interpreted as URIs, thus turning internal links into global links. This is prevented by also checking whether the scheme of a URI is frequently used (i.e. is IANA registered or an otherwise well-known scheme). Fixes: #2713 Update set of well-known URIs from IANA list All official IANA schemes (as of 2017-05-22) are included in the set of known schemes. The four non-official schemes doi, isbn, javascript, and pmid are kept.
2017-05-22Move indentWith to Text.Pandoc.Parsing (#3687)Alexander Krotov1-3/+0
2017-05-17Merge pull request #3676 from labdsf/space-charJohn MacFarlane1-1/+1
Txt2Tags parser: newline is not indentation
2017-05-17Move anyLineNewline to Parsing.hsAlexander Krotov1-3/+0
2017-05-17Txt2Tags parser: newline is not indentationAlexander Krotov1-1/+1
space parses '\n', while spaceChar parses only ' ' and '\t'
2017-03-04Stylish-haskell automatic formatting changes.John MacFarlane1-16/+16
2017-01-27Shared: rename compactify', compactify'DL -> compactify, compactifyDL.John MacFarlane1-4/+4
2017-01-25Removed `--normalize` option and normalization functions from Shared.John MacFarlane1-2/+6
* Removed normalize, normalizeInlines, normalizeBlocks from Text.Pandoc.Shared. These shouldn't now be necessary, since normalization is handled automatically by the Builder monoid instance. * Remove `--normalize` command-line option. * Don't use normalize in tests. * A few revisions to readers so they work well without normalize.
2017-01-25Readers: pass errors straight up to PandocMonad.Jesse Rosenthal1-2/+1
Since we've unified error types, we can just throw the same error at the toplevel.
2017-01-25Unify Errors.Jesse Rosenthal1-1/+2
2017-01-25Add Text2Tags to Text.PandocJesse Rosenthal1-3/+3
2017-01-25Working on readers.Jesse Rosenthal1-15/+30
2016-09-02Remove directory compatJesse Rosenthal1-1/+1
directory 1.1 depends on base 4.5 (ghc 7.4) which we are no longer supporting. So we don't have to use a compatibility layer for it.
2016-09-02Remove Compat.MonoidJesse Rosenthal1-1/+1
This was only necessary for GHC versions with base below 4.5 (i.e., ghc < 7.4).
2016-07-14Fixed compiler warnings.John MacFarlane1-1/+1
2015-12-12Modified readers to emit SoftBreak when appropriate.John MacFarlane1-1/+1
2015-11-09Restored Text.Pandoc.Compat.Monoid.John MacFarlane1-0/+1
Don't use custom prelude for latest ghc. This is a better approach to making 'stack ghci' and 'cabal repl' work. Instead of using NoImplicitPrelude, we only use the custom prelude for older ghc versions. The custom prelude presents a uniform API that matches the current base version's prelude. So, when developing (presumably with latest ghc), we don't use a custom prelude at all and hence have no trouble with ghci. The custom prelude no longer exports (<>): we now want to match the base 4.8 prelude behavior.
2015-11-09Revert "Use -XNoImplicitPrelude and 'import Prelude' explicitly."John MacFarlane1-1/+0
This reverts commit c423dbb5a34c2d1195020e0f0ca3aae883d0749b.
2015-11-08Use -XNoImplicitPrelude and 'import Prelude' explicitly.John MacFarlane1-0/+1
This is needed for ghci to work with pandoc, given that we now use a custom prelude. Closes #2503.
2015-10-14More changes to avoid compiler warnings on ghc 7.10.John MacFarlane1-1/+1
* CPP around deprecated `parseTime`. * Text.Pandoc.Compat.Locale -> Text.Pandoc.Compat.Time, now exports Data.Time.
2015-10-14Use custom Prelude to avoid compiler warnings.John MacFarlane1-4/+1
- The (non-exported) prelude is in prelude/Prelude.hs. - It exports Monoid and Applicative, like base 4.8 prelude, but works with older base versions. - It exports (<>) for mappend. - It hides 'catch' on older base versions. This allows us to remove many imports of Data.Monoid and Control.Applicative, and remove Text.Pandoc.Compat.Monoid. It should allow us to use -Wall again for ghc 7.10.
2015-02-18Change return type of Txt2Tags readerMatthew Pickering1-2/+3
2014-12-19Added Text.Pandoc.Compat.Locale to assist with transition to time 1.5.John MacFarlane1-2/+1
2014-08-21Txt2Tags Reader: Fixed crash when reading from stdinmpickering1-3/+5
2014-08-21Txt2Tags Reader: Corrected formatting of %%mtime macrompickering1-1/+1
2014-08-21Txt2Tags Reader: Parse Meta informationmpickering1-10/+36
The header is now parsed as meta information. The first line is the `title`, the second is the `author` and third line is the `date`.
2014-08-20Txt2Tags reader: Header is now parsed only if standalone flag is setmpickering1-1/+4
2014-07-30getT2TMeta: Take list of source files instead of single.John MacFarlane1-7/+8
Get latest modification time.
2014-07-27Added compatability layer to support directory-1.1Matthew Pickering1-1/+1
2014-07-27Txt2Tags Reader: Added copyright informationMatthew Pickering1-0/+26
2014-07-27Txt2Tags Reader: Added recognition of macrosMatthew Pickering1-4/+18
2014-07-27Added txt2tags readerMatthew Pickering1-0/+507
http://txt2tags.org/ There are two points which currently do not match the official implementation. 1. In the official implementation lists can not be nested like the following but the reader would interpret this as a bullet list with the first item being a numbered list. ``` - + This is not a list ``` 2. The specification describes how URIs automatically becomes links. Unfortunately as is often the case, their definitiong of URI is not clear. I tried three solutions but was unsure about which to adopt. * Using isURI from Network.URI, this matches far too many strings and is therefore unsuitable * Using uri from Text.Pandoc.Shared, this doesn't match all strings that the reference implementation matches * Try to simulate the regex which is used in the native code I went with the third approach but it is not perfect, for example trailing punctuation is captured in Urls.