path: root/src/Text/Pandoc/Readers
Age | Commit message | Author | Files | Lines
2017-08-22 | Muse reader: avoid crashes on multiparagraph inline tags (#3866) | Alexander | 1 | -2/+2
Test checks that behavior is consistent with Amusewiki
2017-08-22 | Muse reader: do not allow closing tags with EOF (#3863) | Alexander | 1 | -3/+2
This behavior is compatible with Amusewiki
2017-08-21 | Muse reader: add definition list support (#3860) | Alexander | 1 | -1/+28
2017-08-20 | LaTeX reader: Set identifiers on Spans used for \label. | John MacFarlane | 1 | -1/+2
2017-08-20 | LaTeX reader: allow `]` inside group in option brackets. | John MacFarlane | 1 | -3/+2
Closes #3857.
2017-08-19 | Markdown reader: use CommonMark rules for list item nesting. | John MacFarlane | 1 | -68/+61
Closes #3511.

Previously pandoc used the four-space rule: continuation paragraphs, sublists, and other block-level content had to be indented 4 spaces. Now the indentation required is determined by the first line of the list item: to be included in the list item, blocks must be indented to the level of the first non-space content after the list marker. Exception: if there are 5 or more spaces after the list marker, then the content is interpreted as an indented code block, and continuation paragraphs must be indented two spaces beyond the end of the list marker. See the CommonMark spec for more details and examples.

Documents that adhere to the four-space rule should, in most cases, be parsed the same way by the new rules. Here are some examples of texts that will be parsed differently:

    - a
      - b

will be parsed as a list item with a sublist; under the four-space rule, it would be a list with two items.

    - a

          code

Here we have an indented code block under the list item, even though it is only indented six spaces from the margin, because it is four spaces past the point where a continuation paragraph could begin. With the four-space rule, this would be a regular paragraph rather than a code block.

    - a

            code

Here the code block will start with two spaces, whereas under the four-space rule, it would start with `code`. With the four-space rule, indented code under a list item always must be indented eight spaces from the margin, while the new rules require only that it be indented four spaces from the beginning of the first non-space text after the list marker (here, `a`).

This change was motivated by a slew of bug reports from people who expected lists to work differently (#3125, #2367, #2575, #2210, #1990, #1137, #744, #172, #137, #128) and by the growing prevalence of CommonMark (now used by GitHub, for example). Users who want to use the old rules can select the `four_space_rule` extension.

* Added `four_space_rule` extension.
* Added `Ext_four_space_rule` to `Extensions`.
* `Parsing` now exports `gobbleAtMostSpaces`, and the type of `gobbleSpaces` has been changed so that a `ReaderOptions` parameter is not needed.
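The indentation rule in this commit message can be sketched in a few lines. This is a simplified Python illustration, not pandoc's Haskell implementation: `content_indent` and `classify` are hypothetical names, tabs and lazy continuation lines are ignored, the four-space rule is modelled as a fixed 4 columns from the margin, and the case of 5 or more spaces after the marker follows the CommonMark spec's "marker width plus one" wording.

```python
import re

# A bullet or ordered list marker, preceded by up to three spaces.
MARKER = re.compile(r'^( {0,3})([-+*]|[0-9]{1,9}[.)])( *)')

def content_indent(first_line):
    """Column at which the content of a list item starts (new rule)."""
    m = MARKER.match(first_line)
    if m is None:
        raise ValueError('not a list item: %r' % first_line)
    lead, marker, gap = m.groups()
    rest = first_line[m.end():]
    if rest and not gap:
        raise ValueError('marker must be followed by a space: %r' % first_line)
    if not rest or len(gap) >= 5:
        # Empty item, or item starting with indented code:
        # content starts one space past the marker.
        return len(lead) + len(marker) + 1
    return len(lead) + len(marker) + len(gap)

def classify(first_line, line, four_space_rule=False):
    """Say how a later line relates to the list item opened by first_line."""
    need = 4 if four_space_rule else content_indent(first_line)
    indent = len(line) - len(line.lstrip(' '))
    if indent < need:
        return 'outside item'
    if indent - need >= 4:
        # Indented code: strip the item indent plus four spaces.
        return 'code block: ' + repr(line[need + 4:])
    return 'inside item'

for later in ['  - b', '      code', '        code']:
    print(repr(later), '| new:', classify('- a', later),
          '| four-space:', classify('- a', later, four_space_rule=True))
```

Running the loop on the three examples from the message shows where the two rule sets disagree: the sublist, the six-space code block, and the two extra spaces kept in the eight-space code block.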
2017-08-18 | Markdown reader: fixed parsing of fenced code after list... | John MacFarlane | 1 | -1/+4
...when there is no intervening blank line. Closes #3733.
2017-08-18 | Markdown reader: parse `-@roe` as suppress-author citation. | John MacFarlane | 1 | -2/+4
Previously only `[-@roe]` (with brackets) was recognized as suppress-author, and `-@roe` was treated the same as `@roe`. Closes jgm/pandoc-citeproc#237.
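A minimal sketch of the distinction, in Python rather than pandoc's Haskell parser. The function name `bare_citations` is hypothetical, and the key syntax is assumed to be a plain run of letters, digits and underscores, which is narrower than what pandoc accepts.

```python
import re

# Assumption: citation keys are runs of letters, digits and underscores.
# The lookbehind keeps e-mail-like text such as "a@b" from matching.
BARE_CITATION = re.compile(r'(?<!\w)(-?)@(\w+)')

def bare_citations(text):
    """Return (key, suppress_author) pairs for unbracketed citations."""
    return [(m.group(2), m.group(1) == '-')
            for m in BARE_CITATION.finditer(text)]

print(bare_citations('-@roe argues this, and @doe agrees.'))
```

With this change the leading `-` sets the suppress-author flag for `roe`, while `@doe` remains an ordinary author-in-text citation.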
2017-08-18 | LaTeX reader: implement \newtoggle, \iftoggle, \toggletrue|false | John MacFarlane | 1 | -5/+47
from etoolbox. Closes #3853.
2017-08-17 | RST reader/writer: support unknown interpreted text roles... | John MacFarlane | 1 | -4/+2
...by parsing them as Span with "role" attributes. This way they can be manipulated in the AST. Closes #3407.
2017-08-17 | HTML reader: support column alignments. | John MacFarlane | 1 | -13/+30
These can be set either with an `align` attribute or with `text-align` in a `style` attribute. Closes #1881.
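As a rough illustration of how a cell's alignment could be derived from its attributes, here is a Python sketch. It assumes the attribute in question is `align` and the style property is `text-align`, and that the attribute is consulted before the style; the helper names are hypothetical, and the returned strings simply mirror the names of pandoc's `Alignment` constructors.

```python
ALIGNMENTS = {
    'left': 'AlignLeft',
    'right': 'AlignRight',
    'center': 'AlignCenter',
}

def style_property(style, name):
    """Look up one property in an inline CSS declaration list."""
    for declaration in style.split(';'):
        key, _, value = declaration.partition(':')
        if key.strip().lower() == name:
            return value.strip()
    return None

def cell_alignment(attrs):
    """Map a cell's attributes (a dict) to an alignment name."""
    value = attrs.get('align')
    if value is None:
        # Fall back to text-align inside the style attribute.
        value = style_property(attrs.get('style', ''), 'text-align')
    return ALIGNMENTS.get((value or '').lower(), 'AlignDefault')

print(cell_alignment({'align': 'right'}))
print(cell_alignment({'style': 'color: red; text-align: center'}))
print(cell_alignment({}))
```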
2017-08-17 | LaTeX reader: support \lq, \rq. | John MacFarlane | 1 | -0/+2
2017-08-17 | LaTeX reader: support \textquoteleft|right, \textquotedblleft|right. | John MacFarlane | 1 | -0/+4
Closes #3849.
2017-08-16 | LaTeX reader: rudimentary support for `\hyperlink`. | John MacFarlane | 1 | -0/+4
2017-08-16 | LaTeX reader: use Link instead of Span for `\ref`. | John MacFarlane | 1 | -5/+6
This makes more sense semantically and avoids unnecessary Span [Link] nestings when references are resolved.
2017-08-16 | LaTeX reader: add Support for `glossaries` and `acronym` package (#3589) | schrieveslaach | 1 | -0/+39
Acronyms are not resolved by the reader, but acronym and glossary information is put into attributes on Spans so that they can be processed in filters.
2017-08-13 | Better handle complex \def macros as raw latex. | John MacFarlane | 1 | -9/+11
2017-08-13 | LaTeX reader: Allow @ as a letter in control sequences. | John MacFarlane | 1 | -2/+8
@ is commonly used in macros using `\makeatletter`. Ideally we'd make the tokenizer sensitive to `\makeatletter` and `\makeatother`, but until then this seems a good change.
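The effect on tokenization can be sketched with a small regular expression. This is an illustrative Python stand-in, not pandoc's tokenizer; the function name `control_sequences` and its `at_is_letter` flag are hypothetical.

```python
import re

def control_sequences(source, at_is_letter=True):
    """List the control sequences in source.

    A control sequence is a backslash followed either by a run of
    letters or by a single other character.
    """
    letters = 'A-Za-z@' if at_is_letter else 'A-Za-z'
    pattern = re.compile(r'\\(?:[%s]+|.)' % letters, re.DOTALL)
    return pattern.findall(source)

sample = r'\makeatletter\@ifnextchar[\foo'
print(control_sequences(sample))
print(control_sequences(sample, at_is_letter=False))
```

With `@` treated as a letter, `\@ifnextchar` is read as one control sequence; without it, the tokenizer stops at `\@` and leaves `ifnextchar` as ordinary text.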
2017-08-13 | LaTeX reader: fix `\let\a=0` case, with single character token. | John MacFarlane | 1 | -13/+18
2017-08-13 | Resolve references to section numbers in LaTeX reader. | John MacFarlane | 1 | -3/+17
2017-08-13 | LaTeX reader: track header numbers and correlate with labels. | John MacFarlane | 1 | -22/+49
2017-08-13 | Put content of \ref, \label commands into span… (#3639) | schrieveslaach | 1 | -3/+17
* Put content of `\ref` and `\label` commands into Span elements so they can be used in filters.
* Add support for `\eqref`
2017-08-12 | LaTeX reader: Fixed space after \figurename etc. | John MacFarlane | 1 | -4/+1
2017-08-12 | LaTeX reader: support \chaptername, \partname, \abstractname, etc. | John MacFarlane | 1 | -0/+20
See #3559. Obsoletes #3560.
2017-08-12 | LaTeX reader: have `\setmainlanguage` set `lang` in metadata. | John MacFarlane | 1 | -4/+6
2017-08-11 | Added support for translations (localization) (see #3559). | John MacFarlane | 1 | -2/+133
* readDataFile, readDefaultDataFile, getReferenceDocx, getReferenceODT have been removed from Shared and moved into Class. They are now defined in terms of PandocMonad primitives, rather than being primitive methods of the class.
* toLang has been moved from BCP47 to Class.
* NoTranslation and CouldNotLoadTranslations have been added to LogMessage.
* New module, Text.Pandoc.Translations, exporting Term, Translations, readTranslations.
* New functions in Class: translateTerm, setTranslations. Note that nothing is loaded from data files until translateTerm is used; setTranslations just sets the language to be used.
* Added two translation data files in data/translations.
* LaTeX reader: Support `\setmainlanguage` or `\setdefaultlanguage` (polyglossia) and `\figurename`.
2017-08-10 | RST reader: implement csv-table directive. | John MacFarlane | 1 | -45/+56
Most attributes are supported, including `:file:` and `:url:`. A (probably insufficient) test case has been added. Closes #3533.
2017-08-10 | RST reader: Basic support for csv-table directive. | John MacFarlane | 1 | -0/+52
* Added Text.Pandoc.CSV, simple CSV parser.
* Options still not supported, and we need tests. See #3533.
2017-08-09 | RST reader: reorganize block parsers for ~20% faster parsing. | John MacFarlane | 1 | -3/+4
2017-08-09 | Removed spurious comments. | John MacFarlane | 1 | -4/+0
2017-08-09 | Org reader: use org-language attribute rather than data-org-language. | John MacFarlane | 1 | -1/+1
2017-08-09 | Org reader: use tag-name attribute instead of data-tag-name. | John MacFarlane | 1 | -1/+1
2017-08-09 | LaTeX reader: Use `label` instead of `data-label` for label in caption. | John MacFarlane | 1 | -1/+1
See d441e656db576f266c4866e65ff9e4705d376381, #3639.
2017-08-09 | HTML reader: parse <main> like <div role=main>. (#3791) | bucklereed | 1 | -7/+11
* HTML reader: parse <main> like <div role=main>.
* <main> closes <p> and behaves like a block element generally
2017-08-09 | Muse reader: simplify tableCell implementation (#3846) | Alexander | 1 | -3/+1
2017-08-08 | RST reader: support :widths: attribute for table directive. | John MacFarlane | 1 | -3/+13
2017-08-08 | Thread options through CommonMark reader. | John MacFarlane | 1 | -81/+77
This is more efficient than doing AST traversals for emojis and hard breaks. Also make behavior sensitive to `raw_html` extension.
2017-08-08 | Support `hard_line_breaks` in CommonMark reader. | John MacFarlane | 1 | -0/+7
2017-08-08 | CommonMark reader: support `emoji` extension. | John MacFarlane | 1 | -1/+19
2017-08-08 | CommonMark reader: support `gfm_auto_identifiers`. | John MacFarlane | 1 | -0/+31
Added `Ext_gfm_auto_identifiers`: new constructor for `Extension` in `Text.Pandoc.Extensions` [API change]. Use this in githubExtensions. Closes #2821.
2017-08-07 | CommonMark reader: make exts depend on extensions. | John MacFarlane | 1 | -2/+4
2017-08-07 | Remove GFM modules; use CMarkGFM for both gfm and commonmark. | John MacFarlane | 2 | -191/+63
We no longer have a separate readGFM and writeGFM; instead, we'll use readCommonMark and writeCommonMark with githubExtensions. It remains to implement these extensions conditionally. Closes #3841.
2017-08-07 | Markdown reader: fixed spurious parsing of citation as reference def. | John MacFarlane | 1 | -2/+4
We now disallow reference keys starting with `@` if the `citations` extension is enabled. Closes #3840.
2017-08-07 | Added gfm (GitHub-flavored CommonMark) as an input and output format. | John MacFarlane | 2 | -2/+187
This uses bindings to GitHub's fork of cmark, so it should parse gfm exactly as GitHub does (excepting certain postprocessing steps, involving notifications, emojis, etc.).
* Added Text.Pandoc.Readers.GFM (exporting readGFM)
* Added Text.Pandoc.Writers.GFM (exporting writeGFM)
* Added `gfm` as input and output format.
Note that tables are currently always rendered as HTML in the writer; this can be improved when CMarkGFM supports tables in output.
2017-08-07 | Small tweak to previous commit. | John MacFarlane | 1 | -1/+1
2017-08-07 | LaTeX reader: Support simple `\def` macros. | John MacFarlane | 1 | -2/+21
Note that we still don't support macros with fancy parameter delimiters, like \def\foo#1..#2{...}
2017-08-07 | LaTeX reader: Support `\let`. | John MacFarlane | 2 | -14/+33
Also, fix regular macros so they're expanded at the point of use, and NOT also the point of definition. `\let` macros, by contrast, are expanded at the point of definition. Added an `ExpansionPoint` field to `Macro` to track this difference.
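The difference between the two expansion points can be sketched with a toy macro table. This Python illustration follows the behavior as described in the commit message, not pandoc's actual `Macro` type: the class `MacroTable` and its methods are hypothetical, macros take no arguments, and tokens are plain strings.

```python
class MacroTable:
    """Toy table of argument-free macros; tokens are plain strings."""

    def __init__(self):
        self.macros = {}

    def define(self, name, body):
        # \def-style: keep the body as written; expand only at point of use.
        self.macros[name] = list(body)

    def let(self, name, target):
        # \let-style: expand the target now, at the point of definition.
        self.macros[name] = self.expand([target])

    def expand(self, tokens):
        # Recursively replace macro names by their stored bodies.
        # (No protection against self-referential macros in this sketch.)
        out = []
        for token in tokens:
            if token in self.macros:
                out.extend(self.expand(self.macros[token]))
            else:
                out.append(token)
        return out

table = MacroTable()
table.define(r'\a', ['x'])
table.define(r'\b', [r'\a'])   # body stays unexpanded
table.let(r'\c', r'\a')        # expanded now, so \c holds 'x'
table.define(r'\a', ['y'])     # later redefinition of \a

print(table.expand([r'\b']))   # sees the new \a
print(table.expand([r'\c']))   # still the old meaning
```

After `\a` is redefined, `\b` picks up the new meaning because its body is only expanded when used, whereas `\c` keeps what `\a` meant when the `\let` was processed.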
2017-08-06 | Muse reader: debug indented paragraph support (#3839) | Alexander | 1 | -21/+5
Take only first line indentation into account and do not start new paragraph on indentation change.
2017-08-06 | Docx reader: Avoid 0-level headers. | Jesse Rosenthal | 1 | -6/+5
We used to parse paragraphs styled with "HeadingN" as "nth-level header." But if a document has a custom style named "Heading0", this will produce a 0-level header, which shouldn't exist. We only parse this style if N>0. Otherwise we treat it as a normal style name, and follow its dependencies, if any. Closes #3830.
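The guard described here amounts to a small check on the style name. A minimal Python sketch, assuming style names match `HeadingN` exactly (the real reader may normalize case or follow style inheritance first); `heading_level` is a hypothetical name.

```python
import re

def heading_level(style_name):
    """Return N for a "HeadingN" style with N > 0, else None."""
    m = re.fullmatch(r'Heading([0-9]+)', style_name)
    if m and int(m.group(1)) > 0:
        return int(m.group(1))
    # "Heading0" and unrelated names are treated as ordinary styles.
    return None

for name in ['Heading2', 'Heading0', 'BodyText']:
    print(name, heading_level(name))
```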
2017-08-06 | Muse reader: debug list and list item separation rules (#3837) | Alexander | 1 | -5/+4