path: root/src/Text
Age | Commit message | Author | Files | Lines
2017-07-07 | Parsing: added takeP, takeWhileP for efficient parsing of [Char]. | John MacFarlane | 1 | -2/+33
2017-07-07 | Rewrote LaTeX reader with proper tokenization. | John MacFarlane | 14 | -1143/+1746
This rewrite is primarily motivated by the need to get macros working properly. A side benefit is that the reader is significantly faster (27s -> 19s in one benchmark, and there is a lot of room for further optimization).

We now tokenize the input text, then parse the token stream.

Macros modify the token stream, so they should now be effective in any context, including math. Thus, we no longer need the clunky macro processing capacities of texmath.

A custom state LaTeXState is used instead of ParserState. This, plus the tokenization, will require some rewriting of the exported functions rawLaTeXInline, inlineCommand, rawLaTeXBlock.

* Added Text.Pandoc.Readers.LaTeX.Types (new exported module). Exports Macro, Tok, TokType, Line, Column. [API change]
* Text.Pandoc.Parsing: adjusted type of `insertIncludedFile` so it can be used with token parser.
* Removed old texmath macro stuff from Parsing. Use Macro from Text.Pandoc.Readers.LaTeX.Types instead.
* Removed texmath macro material from Markdown reader.
* Changed types for Text.Pandoc.Readers.LaTeX's rawLaTeXInline and rawLaTeXBlock. (Both now return a String, and they are polymorphic in state.)
* Added orgMacros field to OrgState. [API change]
* Removed readerApplyMacros from ReaderOptions. Now we just check the `latex_macros` reader extension.
* Allow `\newcommand\foo{blah}` without braces.

Fixes #1390. Fixes #2118. Fixes #3236. Fixes #3779. Fixes #934. Fixes #982.
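The tokenize-then-expand idea described in this commit can be illustrated with a small sketch. This is Python, not the reader's actual Haskell code, and it only handles argument-free macros; the names `tokenize`, `read_group`, and `expand` are hypothetical and exist only for this illustration.

```python
import re

# Toy token pattern (an assumption): a control word such as \foo,
# a control symbol such as \%, or any other single character.
TOKEN_RE = re.compile(r"\\[A-Za-z]+|\\.|.", re.DOTALL)


def tokenize(text):
    """Split LaTeX-like input into a flat list of token strings."""
    return TOKEN_RE.findall(text)


def read_group(tokens, i):
    """Read the brace group that opens at tokens[i] == '{'.

    Returns (tokens inside the group, index just past the closing brace).
    """
    depth = 1
    j = i + 1
    while j < len(tokens):
        if tokens[j] == "{":
            depth += 1
        elif tokens[j] == "}":
            depth -= 1
            if depth == 0:
                return tokens[i + 1:j], j + 1
        j += 1
    raise ValueError("unbalanced braces")


def expand(tokens):
    """Expand argument-free macros by rewriting the token stream in place."""
    macros = {}
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "\\newcommand":
            i += 1
            if tokens[i] == "{":
                # Braced form: \newcommand{\foo}{blah}
                name_tokens, i = read_group(tokens, i)
                name = name_tokens[0]
            else:
                # Bare form: \newcommand\foo{blah}
                name = tokens[i]
                i += 1
            body, i = read_group(tokens, i)
            macros[name] = body
        elif tok in macros:
            # Splice the body back into the stream so that macros used
            # inside the body are expanded on later iterations.
            tokens = tokens[:i] + macros[tok] + tokens[i + 1:]
        else:
            out.append(tok)
            i += 1
    return "".join(out)
```

Because expansion happens on tokens before any parsing, `expand(tokenize(r"\newcommand\foo{blah}$\foo$"))` rewrites the macro inside math just as it would anywhere else. The sketch does not guard against self-referential macros, which would loop forever.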
2017-07-06 | Logging: added MacroAlreadyDefined. | John MacFarlane | 1 | -0/+9
2017-06-30 | Allow ibooks-specific metadata in epubs. Closes #2693. | John MacFarlane | 1 | -5/+20
You can now have the following fields in your YAML metadata, and it will be treated appropriately in the generated EPUB.

```
ibooks:
  version: 1.3.4
  specified-fonts: false
  ipad-orientation-lock: portrait-only
  iphone-orientation-lock: landscape-only
  binding: true
  scroll-axis: vertical
```

This commit also fixes a regression in stylesheet paths.
2017-06-30 | Removed `hard_line_breaks` extension from `markdown_github`. | John MacFarlane | 1 | -1/+0
GitHub has two Markdown modes, one for long-form documents like READMEs and one for short things like issue comments. In issue comments, a line break is treated as a hard line break. In READMEs, wikis, etc., it is treated as a space as in regular Markdown. Since pandoc is more likely to be used to convert long-form documents from GitHub Markdown, `-hard_line_breaks` is a better default. Closes #3594.
2017-06-30 | Make `east_asian_line_breaks` affect all readers/writers. | John MacFarlane | 2 | -6/+14
Closes #3703.
2017-06-30 | Markdown writer: Ensure that `+` and `-` are escaped properly... | John MacFarlane | 1 | -0/+3
so they don't cause spurious lists. Previously they were escaped only if followed by a space, not if they were at the end of a line. Closes #3773.
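A minimal sketch of the escaping rule, written in Python rather than the writer's Haskell. The function name `escape_list_marker` and the regular expression are assumptions made for illustration; the writer's real logic may differ.

```python
import re

# A leading '+' or '-' can start a bullet list when it is followed by
# whitespace OR sits alone at the end of the line.
BULLET_RE = re.compile(r"^([+-])(?=\s|$)")


def escape_list_marker(line):
    """Backslash-escape a leading '+' or '-' that could start a list."""
    return BULLET_RE.sub(r"\\\1", line)
```

The `$` alternative in the lookahead covers the end-of-line case that this commit adds; a marker followed by other text, as in `-5 degrees`, is left alone.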
2017-06-29 | Added parameter for user data directory to runLuaFilter. | John MacFarlane | 3 | -9/+9
in Text.Pandoc.Lua. Also to pushPandocModule. This change allows users to override pandoc.lua with a file in their local data directory, adding custom functions, etc. @tarleb, if you think this is a bad idea, you can revert this. But in general our data files are all overridable.
2017-06-29 | Text.Pandoc.Lua: more code simplification. | John MacFarlane | 1 | -30/+26
Also, now we check before running walkM that the function table actually does contain something relevant. E.g. if your filter just defines Str, there's no need to run walkM for blocks, meta, or the whole document. This should help performance a bit (and it does, in my tests).
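The check described here amounts to looking at which names the filter defines before deciding which traversals to run. A rough Python sketch follows; the type-name sets and the `Meta` and `Doc` keys are hypothetical stand-ins, not the module's actual tables.

```python
# Hypothetical subsets of element names, for illustration only.
INLINE_TYPES = {"Str", "Space", "Emph"}
BLOCK_TYPES = {"Para", "Header"}


def passes_needed(filter_table):
    """Return the traversals worth running for a given filter table."""
    names = set(filter_table)
    passes = []
    if names & INLINE_TYPES:
        passes.append("inlines")
    if names & BLOCK_TYPES:
        passes.append("blocks")
    if "Meta" in names:
        passes.append("meta")
    if "Doc" in names:
        passes.append("document")
    return passes
```

A filter that defines only `Str` yields `["inlines"]`, so the block, meta, and whole-document walks can be skipped.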
2017-06-29 | Lua filters: Remove special treatment of Quoted, Math. | John MacFarlane | 1 | -24/+8
No more SingleQuoted, DoubleQuoted, InlineMath, DisplayMath. This makes everything uniform and predictable, though it does open up a difference between Lua filters and custom writers.
2017-06-29 | Text.Pandoc.Lua: refactored to remove duplicated code. | John MacFarlane | 1 | -34/+25
2017-06-29 | Text.Pandoc.Lua: use generics to reduce boilerplate. | John MacFarlane | 1 | -32/+3
I tested this with the str.lua filter on MANUAL.txt, and I could see no significant performance degradation. Doing things this way will ease maintenance, as we won't have to manually modify this module when types change. @tarleb, do we really need special cases for things like DoubleQuoted and InlineMath?
2017-06-28 | Make `papersize: a4` work regardless of the case of `a4`. | John MacFarlane | 2 | -0/+9
It is converted to `a4` in LaTeX and `A4` in ConTeXt.
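As an illustration only, the normalization could look like the Python sketch below. It assumes that every ISO "A" size (a letter `a` followed by digits) is treated the same way as `a4`; the commit message itself mentions only `a4`, and `normalize_papersize` is a hypothetical name.

```python
import re

# Assumption: only ISO "A" sizes such as a4 or A5 are case-normalized.
ISO_A_SIZE = re.compile(r"[aA]\d+")


def normalize_papersize(value, writer):
    """Return the papersize spelling expected by the given writer."""
    if not ISO_A_SIZE.fullmatch(value):
        return value
    if writer == "latex":
        return value.lower()
    if writer == "context":
        return value.upper()
    return value
```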
2017-06-28 | Muse reader: parse indented blockquotes (#3769) | Alexander Krotov | 1 | -1/+22
2017-06-28 | LaTeX writer: fixed detection of otherlangs. | John MacFarlane | 1 | -3/+3
We weren't recursing into inline contexts. Closes #3770.
2017-06-27 | Text.Pandoc.Lua: catch lua errors in filter functions | Albert Krewinkel | 1 | -11/+20
Replace lua errors with `LuaException`s.
2017-06-27 | Text.Pandoc.Lua: keep element unchanged if filter returns nil | Albert Krewinkel | 1 | -8/+13
This was suggested by jgm and is consistent with the behavior of other filtering libraries.
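The "nil means unchanged" convention can be sketched in a few lines of Python, with `None` standing in for Lua's `nil`. The helper `apply_filter` and the dictionary-based element representation are assumptions made for this illustration.

```python
def apply_filter(fn, element):
    """Run a filter function; a None result keeps the original element."""
    result = fn(element)
    return element if result is None else result


def upper_str(element):
    """Example filter: upper-case Str elements, ignore everything else."""
    if element["t"] == "Str":
        return {"t": "Str", "c": element["c"].upper()}
    # Falling off the end returns None, i.e. "leave this element alone".
```

With this convention a filter only has to mention the elements it wants to change.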
2017-06-27 | Text.Pandoc.Lua: simplify filter function runner | Albert Krewinkel | 1 | -25/+11
The code still allowed passing an arbitrary number of arguments to the filter function, because element properties were passed as function arguments at some point. Now we pass only the element as the single argument, so the code to handle multiple arguments is no longer necessary.
2017-06-27 | Require nonempty alt text for `implicit_figures`. | John MacFarlane | 1 | -1/+2
A figure with an empty caption doesn't make sense. Closes #2844.
2017-06-27 | RST reader: support anchors. | John MacFarlane | 1 | -1/+23
E.g.

    `hello`

    .. _hello:

    paragraph

This is supported by putting "paragraph" in a Div with id `hello`. Closes #262.
2017-06-27 | RST reader: Handle chained link definitions. | John MacFarlane | 1 | -7/+20
For example,

    .. _hello:
    .. _goodbye: example.com

Here both `hello` and `goodbye` should link to `example.com`. Fixes the first part of #262.
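One way to picture chained definitions is to hold target names as "pending" until a definition with a URL appears. The Python sketch below is illustrative only and is not the RST reader's code; `collect_targets` and the simplified target pattern are assumptions.

```python
import re

# Simplified pattern for an RST hyperlink target line: ".. _name: url"
TARGET_RE = re.compile(r"^\.\. _([^:]+):\s*(.*)$")


def collect_targets(lines):
    """Map each target name to its URL, resolving chained definitions."""
    targets = {}
    pending = []  # names seen so far that have no URL yet
    for line in lines:
        match = TARGET_RE.match(line)
        if match is None:
            pending = []  # any other line breaks the chain
            continue
        name = match.group(1)
        url = match.group(2).strip()
        pending.append(name)
        if url:
            for pending_name in pending:
                targets[pending_name] = url
            pending = []
    return targets
```

For the example above, `collect_targets([".. _hello:", ".. _goodbye: example.com"])` maps both names to `example.com`.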
2017-06-27 | Docx writer: Allow 9 list levels. | John MacFarlane | 1 | -3/+9
Closes #3519.
2017-06-27 | HTML reader: Use the lang value of <html> to set the lang meta value. (#3765) | bucklereed | 1 | -0/+9
* HTML reader: Use the lang value of <html> to set the lang meta value.
* Fix for pre-AMP environments.
2017-06-26 | OpenDocument/ODT writer: Added support for table of contents. | John MacFarlane | 1 | -0/+1
Closes #2836. Thanks to @anayrat.
2017-06-26 | Use `table-of-contents` for contents of toc, make `toc` a boolean. | John MacFarlane | 3 | -6/+18
Changed markdown, rtf, and HTML-based templates accordingly. This allows you to set `toc: true` in the metadata; this previously produced strange results in some output formats. Closes #2872. For backwards compatibility, `toc` is still set to the toc contents. But it is recommended that you update templates to use `table-of-contents` for the toc contents and `toc` for a boolean flag.
2017-06-26 | Muse writer: fix hlint errors (#3764) | Alexander Krotov | 1 | -17/+13
2017-06-26 | LaTeX writer: use BCP47 parser. | John MacFarlane | 1 | -89/+105
2017-06-26 | parseBCP47: Parse extensions and private-use as variants. | John MacFarlane | 1 | -4/+20
Even though officially they aren't. This suffices for our purposes.
2017-06-26 | minor updates to vimwiki reader. (#3759) | Yuchen Pei | 1 | -7/+6
- updated comments in Vimwiki.hs to reflect current status of implementation
- added vimwiki to trypandoc
2017-06-26 | Muse reader: fix horizontal rule parsing (#3762) | Alexander Krotov | 1 | -2/+4
Do not parse 3 dashes as horizontal rule and allow whitespace after rule
2017-06-26 | Muse reader: simplify para implementation (#3761) | Alexander Krotov | 1 | -3/+1
2017-06-25 | BCP47: split toLang from getLang, rearranged types. | John MacFarlane | 4 | -48/+55
2017-06-25 | Refactored ConTeXt writer to use BCP47. | John MacFarlane | 2 | -39/+39
BCP47 - consistent case for BCP47 fields (e.g. uppercase for region).
2017-06-25 | Moved BCP47 specific functions from Writers.Shared to new module. | John MacFarlane | 5 | -87/+125
Text.Pandoc.BCP47 (unexported, internal module). `getLang`, `Lang(..)`, `parseBCP47`.
2017-06-25 | Writers.Shared: improve type of Lang and bcp47 parser. | John MacFarlane | 3 | -41/+79
Use a real parsec parser for BCP47, include variants.
2017-06-25 | Fixed log message for InvalidLang. | John MacFarlane | 1 | -1/+1
2017-06-25 | Writers.Shared: refactored getLang, splitLang... | John MacFarlane | 4 | -36/+55
into `Lang(..)`, `getLang`, `parseBCP47`.
2017-06-25 | Fixed support for `lang` attribute in OpenDocument and ODT writers. | John MacFarlane | 1 | -20/+15
This improves on the last commit, which didn't work in some important ways. See #1667.
2017-06-25 | Support `lang` attribute in OpenDocument and ODT writers. | John MacFarlane | 3 | -18/+72
This adds the required attributes to the temporary styles, and also replaces existing language attributes in styles.xml. Support for lang attributes on Div and Span has also been added. Closes #1667.
2017-06-25 | Added InvalidLang to LogMessage. | John MacFarlane | 1 | -0/+7
2017-06-25 | Text.Pandoc.Writers.Shared: export splitLang. | John MacFarlane | 1 | -0/+19
2017-06-25 | Text.Pandoc.Writers.Shared: added getLang. | John MacFarlane | 1 | -2/+13
2017-06-25 | Muse reader: Require space before and after '=' for code (#3758) | Alexander Krotov | 1 | -3/+10
2017-06-24 | Readers.getReader, Writers.getWriter API change. | John MacFarlane | 4 | -22/+20
Now these functions return a pair of a reader/writer and an Extensions, instead of building the extensions into the reader/writer. The calling code must explicitly set readerExtensions or writerExtensions using the Extensions returned. The point of the change is to make it possible for the calling code to determine what extensions are being used. See #3659.
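The shape of the new API, a lookup that hands back both the reader and the extension set so the caller can inspect and apply it, can be sketched in Python. Everything here is a toy: `READERS`, `DEFAULT_EXTENSIONS`, `toy_markdown_reader`, and the default extension names are assumptions for illustration, not pandoc's real tables or signatures.

```python
import re

SPEC_RE = re.compile(r"^([a-z0-9_]+)((?:[+-][a-z0-9_]+)*)$")
MODIFIER_RE = re.compile(r"([+-])([a-z0-9_]+)")


def toy_markdown_reader(text, extensions):
    """Stand-in reader: records the input and the extensions it was given."""
    return {"source": text, "extensions": sorted(extensions)}


# Hypothetical registry and defaults, for illustration only.
READERS = {"markdown": toy_markdown_reader}
DEFAULT_EXTENSIONS = {"markdown": {"smart", "raw_attribute"}}


def get_reader(spec):
    """Parse a spec like 'markdown-smart+foo'; return (reader, extensions)."""
    match = SPEC_RE.match(spec)
    if match is None or match.group(1) not in READERS:
        raise ValueError("unknown reader: " + spec)
    name, modifiers = match.groups()
    extensions = set(DEFAULT_EXTENSIONS.get(name, ()))
    for sign, extension in MODIFIER_RE.findall(modifiers):
        if sign == "+":
            extensions.add(extension)
        else:
            extensions.discard(extension)
    return READERS[name], frozenset(extensions)


# The caller now sees the extensions and passes them on explicitly.
reader, extensions = get_reader("markdown-smart+hard_line_breaks")
document = reader("Hello", extensions)
```

Returning the pair, rather than a reader with the extensions baked in, is what lets the calling code find out which extensions are in effect.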
2017-06-24 | Extensions: Monoid instance for Extensions. | John MacFarlane | 1 | -1/+5
[API change]
2017-06-23 | Added comment in source. | John MacFarlane | 1 | -0/+3
2017-06-23 | Markdown reader: interpret YAML metadata as Inlines when possible. | John MacFarlane | 1 | -12/+13
If the metadata field is all on one line, we try to interpret it as Inlines, and only try parsing as Blocks if that fails. If it extends over more than one line (including possibly the `|` or `>` character signaling an indented block), then we parse as Blocks. This was motivated by some German users finding that `date: '22. Juin 2017'` got parsed as an ordered list. Closes #3755.
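The decision rule can be sketched with toy parsers in Python. The functions `parse_inlines`, `parse_blocks`, and `parse_meta_value` are hypothetical stand-ins; the real reader uses pandoc's Markdown parsers, which are far more involved.

```python
import re

ORDERED_LIST_RE = re.compile(r"^\d+\.\s+")


def parse_blocks(text):
    """Toy block parser: '<digits>. ' at the start means an ordered list."""
    if ORDERED_LIST_RE.match(text):
        return ("OrderedList", ORDERED_LIST_RE.sub("", text, count=1))
    return ("Para", text)


def parse_inlines(text):
    """Toy inline parser: succeeds on any non-empty text, else None."""
    text = text.strip()
    return ("Inlines", text) if text else None


def parse_meta_value(raw):
    """Single-line values try Inlines first; everything else uses Blocks."""
    if len(raw.strip("\n").splitlines()) <= 1:
        result = parse_inlines(raw)
        if result is not None:
            return result
    return parse_blocks(raw)
```

Here `parse_blocks("22. Juin 2017")` yields an ordered list, which is the behavior users complained about, while `parse_meta_value("22. Juin 2017")` keeps the date as inline text.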
2017-06-23 | Markdown writer: make sure `plain`, `markdown_github`, etc. work for raw. | John MacFarlane | 1 | -5/+9
Previously only `markdown` worked. Note: currently a raw block labeled `markdown_github` will be printed for any `markdown` format.
2017-06-23 | HTML writer: make sure html4, html5 formats work for raw blocks/inlines. | John MacFarlane | 1 | -14/+26
2017-06-23 | Text.Pandoc.Extensions: Added `Ext_raw_attribute`. | John MacFarlane | 2 | -9/+37
Documented in MANUAL.txt. This is enabled by default in pandoc markdown and multimarkdown.