aboutsummaryrefslogtreecommitdiff
path: root/src/Text/Pandoc/Readers
AgeCommit message (Collapse)AuthorFilesLines
2017-06-01LaTeX reader: Handle block structure inside table cells.John MacFarlane1-18/+18
minipage is no longer required. Closes #3709.
2017-06-01Merge pull request #3714 from tarleb/odt-reader-cleanupJohn MacFarlane7-902/+4
Odt reader: remove dead code
2017-06-01Add \colorbox supportMarc Schreiber1-10/+12
2017-05-31Org reader: respect export option for tagsAlbert Krewinkel3-2/+8
Tags are appended to headlines by default, but will be omitted when the `tags` export option is set to nil. Closes: #3713
2017-05-31Org reader: include tags in headlinesAlbert Krewinkel1-6/+16
The Emacs default is to include tags in the headline when exporting. Instead of just empty spans, which contain the tag name as attribute, tags are rendered as small caps and wrapped in those spans. Non-breaking spaces serve as separators for multiple tags.
2017-05-31Org reader: recognize babel result blocks with attributesAlbert Krewinkel2-22/+22
Babel result blocks can have block attributes like captions and names. Result blocks with attributes were not recognized and were parsed as normal blocks without attributes. Fixes: #3706
2017-05-31Org reader: fix module names in haddock commentsAlbert Krewinkel7-9/+8
Copy-pasting had lead to haddock module descriptions containing the wrong module names.
2017-05-31Odt reader: remove dead codeAlbert Krewinkel7-902/+4
The ODT reader contained a lot of general code useful for working with arrows. However, many of these utils weren't used and are hence removed.
2017-05-30Added eastAsianLineBreakFilter to Shared.John MacFarlane1-11/+1
This used to live in the Markdown reader.
2017-05-29LaTeX reader: handle escaped & inside table cell.John MacFarlane1-3/+5
Closes #3708.
2017-05-29LaTeX reader: don't crash on empty enumerate environment.John MacFarlane1-1/+1
Closes #3707.
2017-05-29Merge pull request #3704 from labdsf/anylinenewlineJohn MacFarlane1-4/+3
Markdown reader: use anyLineNewline
2017-05-28Markdown reader: use anyLineNewlineAlexander Krotov1-4/+3
2017-05-28Org reader: Fix cite parsing behaviourHerwig Stuetz1-2/+10
Until now, org-ref cite keys included special characters also at the end. This caused problems when citations occur right before colons or at the end of a sentence. With this change, all non alphanumeric characters at the end of a cite key are ignored. This also adds `,` to the list of special characters that are legal in cite keys to better mirror the behaviour of org-export.
2017-05-28Parsing: `many1Till`: Check for the end condition before parsingHerwig Stuetz3-5/+5
By not checking for the end condition before the first parse, the parser was applied too often, consuming too much of the input. This fixes the behaviour of `testStringWith (many1Till (oneOf "ab") (string "aa")) "aaa"` which before incorrectly returned `Right "a"`. With this change, it instead correctly fails with `Left (PandocParsecError ...)` because it is not able to parse at least one occurence of `oneOf "ab"` that is not `"aa"`. Note that this only affects `many1Till p end` where `p` matches on a prefix of `end`.
2017-05-28RST reader: use anyLineNewline in rawListItem (#3702)Alexander Krotov1-2/+2
2017-05-27Markdown writer: changes to `--reference-links`.John MacFarlane1-2/+6
With `--reference-location` of `section` or `block`, pandoc will now repeat references that have been used in earlier sections. The Markdown reader has also been modified, so that *exactly* repeated references do not generate a warning, only references with the same label but different targets. The idea is that, with references after every block, one might want to repeat references sometimes. Closes #3701.
2017-05-27Org reader: subject full doc tree to headline transformationsAlbert Krewinkel2-6/+37
Emacs parses org documents into a tree structure, which is then post-processed during exporting. The reader is changed to do the same, turning the document into a single tree of headlines starting at level 0. Fixes: #3695
2017-05-25Added `spaced_reference_links` extension.John MacFarlane1-3/+5
This is now the default for pandoc's Markdown. It allows whitespace between the two parts of a reference link: e.g. [a] [b] [b]: url This is now forbidden by default. Closes #2602.
2017-05-25Markdown reader: warn for notes defined but not used.John MacFarlane1-6/+14
Closes #1718. Parsing.ParserState: Make stateNotes' a Map, add stateNoteRefs.
2017-05-25MediaWiki reader: don't do curly quotes inside `<tt>` contexts.John MacFarlane1-1/+10
Even if `+smart`. See #3585.
2017-05-25MediaWiki reader: Make smart double quotes depend on `smart` extension.John MacFarlane1-1/+3
Closes #3585.
2017-05-24Markdown reader: fixed smart quotes after emphasis.John MacFarlane1-5/+6
E.g. in *foo*'s 'foo' Closes #2228.
2017-05-24LaTeX reader: Fixed failures on \ref{}, \label{} with `+raw_tex`.John MacFarlane1-6/+9
Now these commands are parsed as raw if `+raw_tex`; otherwise, their argument is parsed as a bracketed string.
2017-05-24Parsing: Provide parseFromString'.John MacFarlane6-61/+63
This is a verison of parseFromString specialied to ParserState, which resets stateLastStrPos at the end. This is almost always what we want. This fixes a bug where `_hi_` wasn't treated as emphasis in the following, because pandoc got confused about the position of the last word: - [o] _hi_ Closes #3690.
2017-05-24LaTeX reader: parse tikzpicture as raw verbatim environment...John MacFarlane1-0/+14
if `raw_tex` extension is selected. Otherwise skip with a warning. This is better than trying to parse it as text! Closes #3692.
2017-05-24HTML reader: Add `details` tag to list of block tags.John MacFarlane1-1/+2
Closes #3694.
2017-05-23Add suggestions of @jgm: parse bracketed stuff as inlinesMarc Schreiber1-3/+9
2017-05-23RST reader: reformatting (code line length).John MacFarlane1-23/+47
2017-05-23RST Reader: parse list table directive (#3688)keiichiro shikano1-1/+28
Closes #3432.
2017-05-23Shared: Provide custom isURI that rejects unknown schemes [isURI]Albert Krewinkel1-1/+0
We also export the set of known `schemes`. The new function replaces the function of the same name from `Network.URI`, as the latter did not check whether a scheme is well-known. E.g. MediaWiki wikis frequently feature pages with names like `User:John`. These links were interpreted as URIs, thus turning internal links into global links. This is prevented by also checking whether the scheme of a URI is frequently used (i.e. is IANA registered or an otherwise well-known scheme). Fixes: #2713 Update set of well-known URIs from IANA list All official IANA schemes (as of 2017-05-22) are included in the set of known schemes. The four non-official schemes doi, isbn, javascript, and pmid are kept.
2017-05-22Move indentWith to Text.Pandoc.Parsing (#3687)Alexander Krotov4-21/+1
2017-05-21Finished implemtation of `--resource-path`.John MacFarlane1-2/+2
* Default is just working directory. * Working directory must be explicitly specifide if `--resource-path` option is used.
2017-05-20RST reader: make use of anyLineNewline (#3686)Alexander Krotov1-2/+1
2017-05-18Org reader: fix smart parsing behaviorAlbert Krewinkel2-10/+15
Parsing of smart quotes and special characters can either be enabled via the `smart` language extension or the `'` and `-` export options. Smart parsing is active if either the extension or export option is enabled. Only smart parsing of special characters (like ellipses and en and em dashes) is enabled by default, while smart quotes are disabled. This means that all smart parsing features will be enabled by adding the `smart` language extension. Fine-grained control is possible by leaving the language extension disabled. In that case, smart parsing is controlled via the aforementioned export OPTIONS only. Previously, all smart parsing was disabled unless the language extension was enabled.
2017-05-18Markdown: allow attributes in reference links to start on next line.John MacFarlane1-1/+3
This addresses a subsidiary issue in #3674.
2017-05-17Merge pull request #3676 from labdsf/space-charJohn MacFarlane1-1/+1
Txt2Tags parser: newline is not indentation
2017-05-17Merge pull request #3677 from labdsf/anylinenewlineJohn MacFarlane4-8/+2
Move anyLineNewline to Parsing.hs
2017-05-17Move anyLineNewline to Parsing.hsAlexander Krotov4-8/+2
2017-05-17Txt2Tags parser: newline is not indentationAlexander Krotov1-1/+1
space parses '\n', while spaceChar parses only ' ' and '\t'
2017-05-16Org reader: replace `sequence . map` with `mapM`Albert Krewinkel2-3/+3
2017-05-16Org reader: put tree parsing code into dedicated moduleAlbert Krewinkel2-210/+262
2017-05-16Merge pull request #3671 from WUUUGI/horizont-spacingJohn MacFarlane1-1/+1
Added support for horizontal spacing in LaTeX
2017-05-15Textile reader: fix bug for certain links in table cells.John MacFarlane1-2/+5
Closes #3667.
2017-05-15Added support for horizontal spacing in LaTeX: parse \, to \8198 (six-per-em ↵Henri Werth1-1/+1
space)
2017-05-14Org reader: add basic file inclusion mechanismAlbert Krewinkel3-5/+43
Support for the `#+INCLUDE:` file inclusion mechanism was added. Recognized include types are *example*, *export*, *src*, and normal org file inclusion. Advanced features like line numbers and level selection are not implemented yet. Closes: #3510
2017-05-13Update dates in copyright noticesAlbert Krewinkel17-34/+35
This follows the suggestions given by the FSF for GPL licensed software. <https://www.gnu.org/prep/maintain/html_node/Copyright-Notices.html>
2017-05-12Replace `repeat' and `take' with `replicate' once moreAlexander Krotov1-1/+1
2017-05-11Combine grid table parsersAlbert Krewinkel1-83/+1
The grid table parsers for markdown and rst was combined into one single parser, slightly changing parsing behavior of both parsers: - The markdown parser now compactifies block content cell-wise: pure text blocks in cells are now treated as paragraphs only if the cell contains multiple paragraphs, and as plain blocks otherwise. Before, this was true only for single-column tables. - The rst parser now accepts newlines and multiple blocks in header cells. Closes: #3638
2017-05-06Markdown reader: improved parsing of indented raw HTML blocks.John MacFarlane1-1/+7
Previously we inadvertently interpreted indented HTML as code blocks. This was a regression. We now seek to determine the indentation level of the contents of an HTML block, and (optionally) skip that much indentation. As a side effect, indentation may be stripped off of raw HTML blocks, if `markdown_in_html_blocks` is used. This is better than having things interpreted as indented code blocks. Closes #1841.