path: root/src/Text/Pandoc/Readers
Age | Commit message | Author | Files | Lines
2019-09-22Use HsYAML-0.2.0.0John MacFarlane1-11/+12
Most of this is due to @vijayphoenix (#5704), but it needed some revisions to integrate with current master, and to use the released HsYAML. Closes #5704.
2019-09-21[Docx Reader] Use style names, not ids, for assigning semantic meaningNikolay Yakimov3-183/+287
Motivating issues: #5523, #5052, #5074

* Style name comparisons are case-insensitive, since those are case-insensitive in Word.
* w:styleId will be used as style name if w:name is missing (this should only happen for malformed docx and is kept as a fallback to avoid failing altogether on malformed documents).
* Block quote detection code moved from Docx.Parser to Readers.Docx.
* Code styles, i.e. "Source Code" and "Verbatim Char", now honor style inheritance.
* Docx Reader now honors "Compact" style (used in Pandoc-generated docx). The side-effect is that "Compact" style no longer shows up in docx+styles output. Styles inherited from "Compact" will still show up.
* Removed obsolete list-item style from divsToKeep. That didn't really do anything for a while now.
* Add newtypes to differentiate between style names, ids, and different style types (that is, paragraph and character styles).
* Since docx style names can have spaces in them, and pandoc-markdown classes can't, wherever a style name is used as a class name, spaces are replaced with ASCII dashes `-`.
* Get rid of extraneous intermediate types carrying styleId information. Instead, styleId is saved with other style data.
* Use RunStyle for inline style definitions only (lacking styleId and styleName); for Character Styles use CharStyle type (which is basically RunStyle with styleId and StyleName bolted onto it).
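The three naming rules in this commit (case-insensitive comparison, fallback to w:styleId, and spaces replaced with dashes) can be illustrated with a minimal Python sketch. This is not the Haskell code from the commit; the function names `style_key`, `effective_style_name`, and `style_class` are hypothetical and exist only for illustration.

```python
from typing import Optional


def style_key(name: str) -> str:
    """Comparison key: style names are compared case-insensitively."""
    return name.casefold()


def effective_style_name(w_name: Optional[str], w_style_id: str) -> str:
    """Prefer w:name; fall back to w:styleId when w:name is missing."""
    return w_name if w_name is not None else w_style_id


def style_class(name: str) -> str:
    """Class name for pandoc-markdown: spaces become ASCII dashes."""
    return name.replace(" ", "-")


# Hypothetical usage
print(style_key("Source Code") == style_key("source code"))
print(effective_style_name(None, "Heading1"))
print(style_class("Verbatim Char"))
```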
2019-09-21[Docx Reader] Code clean-upNikolay Yakimov2-63/+39
Reduce code duplication, remove redundant brackets, use newtype instead of data where appropriate
2019-09-19MediaWiki: skip optional {{table}} template.John MacFarlane1-0/+1
See https://en.wikipedia.org/wiki/Template:Table Closes #5757.
2019-09-09LaTeX reader: Fix parsing of optional arguments that contain braced text.John MacFarlane1-4/+3
Closes #5740.
2019-09-08Org reader: modify handling of example blocks. (#5717)Brian Leung2-14/+43
* Org reader: allow the `-i` switch to ignore leading spaces.
* Org reader: handle awkwardly-aligned code blocks within lists. Code blocks in Org lists must have their #+BEGIN_ aligned in a reasonable way, but their other components can be positioned otherwise.
2019-09-05Roff reader: Better support for 'while'.John MacFarlane1-0/+3
2019-09-05Roff reader: improve handling of groups.John MacFarlane1-4/+2
2019-09-04Roff reader: Fix problem parsing comments before macro.John MacFarlane1-2/+0
2019-09-04Roff reader: more improvements in parsing conditionals.John MacFarlane1-3/+4
2019-09-04Roff readers: better parsing of groups.John MacFarlane1-9/+5
We now allow groups where the closing `\\}` isn't at the beginning of a line. Closes #5410.
2019-09-02LaTeX reader: don't try to parse includes if raw_tex is set.John MacFarlane1-5/+13
When the `raw_tex` extension is set, we just carry through `\usepackage`, `\input`, etc. verbatim as raw LaTeX. Closes #5673.
2019-09-02LaTeX reader: properly handle optional arguments for macros.John MacFarlane2-2/+2
Closes #5682.
2019-08-27LaTeX reader: fix `\\` in `\parbox` inside a table cell.John MacFarlane1-3/+18
Closes #5711.
2019-08-27Markdown reader: Headers: don't parse content over newline boundary.John MacFarlane1-4/+15
Closes #5714.
2019-08-26Use parseFromString' in Muse reader.John MacFarlane1-1/+1
Now that it is polymorphic, this is possible, and it's a better choice because it resets last string pos.
2019-08-26Fix inline parsing in grid table cells.John MacFarlane2-2/+2
* T.P.Parsing: Change type of `setLastStrPos` so it takes a `Maybe SourcePos` rather than a `SourcePos`. [API change]
* T.P.Parsing: Make `parseFromString'`, `gridTableWith`, and `gridTableWith'` polymorphic in the parser state, constraining it with `HasLastStrPosition`. [API change]

Closes #5708.
2019-08-23RST reader: use title, not admonition-title, for admonition title.John MacFarlane1-1/+1
This aligns the RST reader with the DocBook reader.
2019-08-23docbook: richer parse for admonitions (#5593)Michael Peyton Jones1-16/+27
Fixes #1234.

This parses admonitions not as a blockquote, but rather as a div with an appropriate class. We also handle titles for admonitions as a nested div with the "title" class. (I followed the behaviour of other docbook-to-html converters in this - there are clearly other ways you could encode it.)

In general, the handling of elements with nested title elements is very inconsistent. I think we should make it consistent, but I'm leaving that for later to make this a small change.

Example:

```docbook
<warning xml:id="someId">
  <title>My title</title>
  <simpara>An admonition block</simpara>
</warning>
```

goes to

```html
<div id="someId" class="warning">
  <div class="title">My title</div>
  <p>An admonition block</p>
</div>
```
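The mapping in the example above can be reproduced with a small Python sketch. It only illustrates the described output shape and is not the reader's Haskell implementation. The function name `admonition_to_div` is hypothetical, the set of admonition tags is the standard DocBook five (whether the reader handles exactly these is an assumption), and treating every non-title child as a paragraph is a simplification.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
# Standard DocBook admonition elements (assumed set for this sketch).
ADMONITIONS = {"caution", "important", "note", "tip", "warning"}


def admonition_to_div(source: str) -> str:
    """Render a DocBook admonition as a div with a nested title div."""
    elem = ET.fromstring(source)
    if elem.tag not in ADMONITIONS:
        raise ValueError(f"not an admonition: {elem.tag}")
    attrs = {}
    if XML_ID in elem.attrib:
        attrs["id"] = elem.attrib[XML_ID]
    attrs["class"] = elem.tag
    div = ET.Element("div", attrs)
    for child in elem:
        if child.tag == "title":
            ET.SubElement(div, "div", {"class": "title"}).text = child.text
        else:
            # Simplification: every other child becomes a paragraph.
            ET.SubElement(div, "p").text = child.text
    return ET.tostring(div, encoding="unicode")


print(admonition_to_div(
    '<warning xml:id="someId"><title>My title</title>'
    "<simpara>An admonition block</simpara></warning>"
))
```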
2019-08-14LaTeX reader: improve withRaw so it can handle cases where...John MacFarlane1-2/+3
the token string is modified by a parser (e.g. accent when it only takes part of a Word token). Closes #5686. Still not ideal, because we get the whole `\t0BAR` and not just `\t0` as a raw latex inline command. But I'm willing to let this be an edge case, since you can easily work around this by inserting a space, braces, or raw attribute. The important thing is that we no longer drop the rest of the document after a raw latex inline command that gobbles only part of a Word token!
2019-08-14Removed some needless lookaheads in Markdown reader.John MacFarlane1-2/+0
2019-08-05Treat `ly` as verbatim too (#5671)Urs Liska1-0/+1
According to https://github.com/jgm/pandoc/issues/4725#issuecomment-399772217 not only the `lilypond` environment but also `ly` should be included in the verbatim list. @jperon https://github.com/jperon/lyluatex/issues/203
2019-07-24LaTeX reader: handle `\passthrough` macro used by latex writer.John MacFarlane1-0/+2
Closes #5659.
2019-07-22LaTeX reader: support tex `\tt` command.John MacFarlane1-0/+1
Closes #5654.
2019-07-22Org reader: accept ATTR_LATEX in block attributesAlbert Krewinkel1-3/+11
Attributes for LaTeX output are accepted as valid block attributes; however, their values are ignored. Fixes: #5648
2019-07-20LaTeX reader: search for image with list of extensions...John MacFarlane1-6/+16
like latex does, if an extension is not provided. Closes #4933.
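A minimal Python sketch of this kind of lookup follows. The extension list and its order are assumptions made for illustration, not the list used by the reader, and `resolve_image` is a hypothetical name. The `exists` callback stands in for a file-system check so the sketch stays self-contained.

```python
from pathlib import Path
from typing import Callable, Sequence

# Assumed candidate extensions; the reader's real list may differ.
DEFAULT_EXTENSIONS = (".pdf", ".png", ".jpg", ".jpeg", ".eps")


def resolve_image(path: str,
                  exists: Callable[[str], bool],
                  extensions: Sequence[str] = DEFAULT_EXTENSIONS) -> str:
    """Return the first existing candidate if no extension was given."""
    if Path(path).suffix:
        return path  # an explicit extension is left untouched
    for ext in extensions:
        candidate = path + ext
        if exists(candidate):
            return candidate
    return path  # nothing found: keep the original name


# Hypothetical usage with an in-memory set of file names
files = {"fig.png", "plot.pdf"}
print(resolve_image("fig", files.__contains__))
```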
2019-07-19Markdown: Ensure that expanded latex macros end with space if original did.John MacFarlane1-1/+10
Closes #4442.
2019-07-16LaTeX reader: handle \looseness command values better.John MacFarlane1-5/+4
Closes #4439.
2019-07-14Muse: add RTL supportAlexander Krotov1-0/+12
Closes #5551
2019-07-13Fix #4499: add mbox and hbox handling to LaTeX reader (#5586)Vasily Alferov1-1/+11
When `+raw_tex` is enabled, these are passed through literally. Otherwise, they are handled in a way that emulates LaTeX's behavior.
2019-07-13Merge pull request #5589 from blmage/fix-3992John MacFarlane1-8/+15
Add support for EPUB2 covers (fix #3992)
2019-07-13Merge pull request #5606 from blmage/odt-framesJohn MacFarlane4-75/+127
Improve the parsing of frames in ODT documents
2019-07-13LaTeX reader: Properly handle \providecommand and environment...John MacFarlane1-21/+30
They are now ignored if the corresponding command or environment is already defined. Closes #5635.
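The "define only if not already defined" rule can be sketched in Python with a plain dictionary standing in for the macro table. The names `provide_command` and `macros` are hypothetical; the reader's actual macro table is a Haskell data structure.

```python
from typing import Dict


def provide_command(macros: Dict[str, str], name: str, body: str) -> None:
    """Define a macro only if no definition exists yet."""
    if name not in macros:
        macros[name] = body


macros = {"\\foo": "existing"}
provide_command(macros, "\\foo", "ignored")  # already defined: no effect
provide_command(macros, "\\bar", "new")      # not defined yet: added
print(macros)
```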
2019-07-10RST reader: keep `name` property in `imgAttr`. (#5637)Brian Leung1-1/+1
Closes #5619.
2019-07-06Markdown reader: handle inline code more eagerly within lists. (#5628)Brian Leung1-5/+7
Closes #5627.
2019-07-02Fix redundant constraint warnings. (#5625)Pete Ryland6-9/+8
2019-06-21Support epigraph command in LaTeX Reader.oquechy1-0/+8
Closes #3523.
2019-06-20Improve the parsing of frames in ODT documentsblmage4-75/+127
2019-06-18Handle the case where the "cover" meta does not link to the manifestblmage1-2/+2
2019-06-18Add support for EPUB2 covers (fix #3992)blmage1-7/+14
2019-06-09DocBook reader: Issue IgnoredElement warnings.John MacFarlane1-28/+37
2019-06-09FB2 reader: skip unknown elements rather than throwing errors.John MacFarlane1-20/+39
Sometimes custom elements are used, and the reader should not abort but skip them with a warning. (For example, id element in author.) Closes #5560.
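The skip-with-warning approach might look like the following Python sketch. It is a simplified illustration, not the reader's code: the set of known fields is an illustrative subset, the FB2 XML namespace is omitted, and `parse_author` is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Illustrative subset of recognised children (assumption for this sketch).
KNOWN_AUTHOR_FIELDS = {"first-name", "middle-name", "last-name", "nickname"}


def parse_author(source: str) -> Tuple[Dict[str, str], List[str]]:
    """Collect known fields; record a warning for anything else."""
    fields: Dict[str, str] = {}
    warnings: List[str] = []
    for child in ET.fromstring(source):
        if child.tag in KNOWN_AUTHOR_FIELDS:
            fields[child.tag] = child.text or ""
        else:
            # Skip rather than abort, but tell the caller about it.
            warnings.append(f"Ignoring unknown element {child.tag}")
    return fields, warnings


print(parse_author("<author><first-name>Ann</first-name><id>42</id></author>"))
```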
2019-06-08LaTeX reader: pass through unknown listings language as class.John MacFarlane1-7/+13
Previously, if the language was not in the list of listings-supported languages, it would not be added as a class, so custom syntax highlighting could not be used. Closes #5540.
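The changed behavior can be sketched in Python as a lookup with a pass-through default. The mapping below is a tiny illustrative subset, and passing the unknown name through unchanged (without lowercasing) is an assumption; `code_classes` is a hypothetical name.

```python
from typing import List, Optional

# Illustrative subset of listings language names and their class names.
LISTINGS_LANGUAGES = {"Python": "python", "Haskell": "haskell", "C": "c"}


def code_classes(language: Optional[str]) -> List[str]:
    """Map a listings language to classes, passing unknown names through."""
    if language is None:
        return []
    # Before the change, an unknown language would have produced no class.
    return [LISTINGS_LANGUAGES.get(language, language)]


print(code_classes("Haskell"))
print(code_classes("MyDSL"))
```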
2019-06-04Include trailing {}s in raw latex commands.John MacFarlane1-2/+7
Change is in rawLaTeXInline in LaTeX reader, but it affects the markdown reader and other readers that allow raw LaTeX. Previously, trailing `{}` would be included for unknown commands, but not for known commands. However, they are sometimes used to avoid a trailing space after the command. The chances that a `{}` after a LaTeX command is not part of the command are very small. Closes #5439.
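As a rough illustration of the rule, the Python sketch below matches a control word plus an optional trailing `{}`. It is a deliberately simplified regular expression, not the token-based parser in `rawLaTeXInline`, and `raw_command_prefix` is a hypothetical name.

```python
import re
from typing import Optional

# A control word, optionally followed by an empty group.
RAW_COMMAND = re.compile(r"\\[A-Za-z]+(?:\{\})?")


def raw_command_prefix(text: str) -> Optional[str]:
    """Return the leading raw command, including a trailing {} if present."""
    match = RAW_COMMAND.match(text)
    return match.group(0) if match else None


print(raw_command_prefix(r"\LaTeX{} is great"))
print(raw_command_prefix(r"\foo bar"))
```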
2019-06-04Docx reader: Add support for w:rtl (ltr annotation).John MacFarlane2-4/+19
Closes #5545.
2019-06-04Markdown reader: don't create implicit reference for empty header.John MacFarlane1-4/+7
Closes #5549.
2019-05-29HTML reader: misc. epub related fixes.John MacFarlane1-30/+41
- With epub extensions, check for epub:type in addition to type.
- Fix problem with noteref parsing which caused block-level content to be eaten with the noteref.
- Rename pAnyTag to pAny.
- Refactor note resolution.
2019-05-27consolidate simple-table detection (#5524)Mauro Bieg1-7/+2
Add `onlySimpleTableCells` to `Text.Pandoc.Shared` [API change]. This fixes an inconsistency in the HTML reader, which did not treat tables with `<p>` inside cells as simple.
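One plausible reading of "simple" is sketched below in Python: a cell is simple when it is empty or holds a single plain or paragraph block. That criterion is an assumption for illustration and may not match the Haskell function in every detail. Blocks are modelled as hypothetical `(kind, text)` tuples.

```python
from typing import List, Tuple

Block = Tuple[str, str]  # (kind, text), a stand-in for pandoc blocks
Cell = List[Block]


def is_simple_cell(cell: Cell) -> bool:
    """Empty, or exactly one Plain/Para block (assumed criterion)."""
    if not cell:
        return True
    return len(cell) == 1 and cell[0][0] in ("Plain", "Para")


def only_simple_table_cells(rows: List[List[Cell]]) -> bool:
    """True when every cell in every row is simple."""
    return all(is_simple_cell(cell) for row in rows for cell in row)


# A cell holding one paragraph (as from <p> in HTML) counts as simple.
print(only_simple_table_cells([[[("Para", "a")], []]]))
```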
2019-05-25Muse reader: allow images inside link descriptionsAlexander Krotov1-5/+4
2019-05-25HTML reader: trim definition list termsAlexander Krotov1-1/+1