path: root/test/Tests/Readers
Age | Commit message | Author | Files | Lines
2020-06-30 | Org reader: respect export setting which disables entities | Albert Krewinkel | 1 | -0/+6
MathML-like entities, e.g., `\alpha`, can be disabled with the `#+OPTIONS: e:nil` export setting.
2020-06-29 | Org reader: keep unknown keyword lines as raw org | Albert Krewinkel | 1 | -2/+5
The lines of unknown keywords, like `#+SOMEWORD: value`, are no longer read as metadata, but kept as raw `org` blocks. This ensures that more information is retained when round-tripping org-mode files; additionally, this change makes it possible to support non-standard org extensions via filters.
2020-06-29 | Org reader: unify keyword handling | Albert Krewinkel | 1 | -48/+56
Handling of export settings and other keywords (like `#+LINK`) has been combined and unified.
2020-06-29 | Org reader: support LATEX_HEADER_EXTRA and HTML_HEAD_EXTRA settings | Albert Krewinkel | 1 | -29/+49
These export settings are treated like their non-extra counterparts, i.e., the values are added to the `header-includes` metadata list.
2020-06-29 | Org reader: allow multiple #+SUBTITLE export settings | Albert Krewinkel | 1 | -0/+7
The values of all lines are read as inlines and collected in the `subtitle` metadata field.
2020-06-28 | JATS reader: parse abstract element into metadata field of same name (#6482) | Albert Krewinkel | 1 | -0/+17
Closes: #6480
2020-06-28 | Org reader: read `#+INSTITUTE` values as text with markup | Albert Krewinkel | 1 | -0/+4
The value is stored in the `institute` metadata field and used in the default beamer presentation template.
2020-06-28 | Org tests: group export settings test for Org reader | Albert Krewinkel | 1 | -74/+79
2020-06-28 | Org reader: update behavior of author, keywords export settings | Albert Krewinkel | 1 | -20/+35
The behavior of the `#+AUTHOR` and `#+KEYWORDS` export settings has changed: Org now allows multiple such lines and adds a space between the contents of each line. Pandoc now always parses these settings as meta inlines; setting values are no longer treated as comma-separated lists. Note that a Lua filter can be used to restore the previous behavior.
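The joining rule described above can be illustrated on plain strings. This is only a sketch, not pandoc's parser: `keywordValue` and `collectKeyword` are hypothetical helpers, and case-insensitive keyword matching is an assumption of the example.

```haskell
import Data.Char (isSpace, toUpper)
import Data.List (dropWhileEnd)
import Data.Maybe (mapMaybe)

-- | Return the trimmed value of a line such as "#+AUTHOR: Jane Doe" if the
-- line uses the given keyword (hypothetical helper; keywords are compared
-- case-insensitively, which is an assumption of this sketch).
keywordValue :: String -> String -> Maybe String
keywordValue key line =
  let prefix = "#+" ++ map toUpper key ++ ":"
      (candidate, rest) = splitAt (length prefix) line
  in if map toUpper candidate == prefix
       then Just (trim rest)
       else Nothing
  where
    trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Collect the values of all lines with the given keyword and join them
-- with single spaces. Commas inside a value are left untouched, so the
-- values are not split into a list.
collectKeyword :: String -> [String] -> String
collectKeyword key = unwords . mapMaybe (keywordValue key)
```

For example, `collectKeyword "author"` applied to two `#+AUTHOR` lines returns both values as one space-separated string.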
2020-06-27 | Org reader: read description lines as inlines | Albert Krewinkel | 1 | -5/+17
`#+DESCRIPTION` lines are treated as text with markup. If multiple such lines are given, then all lines are read and separated by soft linebreaks. Closes: #6485
2020-06-25 | Org reader: honor tex export option | Albert Krewinkel | 1 | -0/+73
The `tex` export option can be set with `#+OPTIONS: tex:nil` and allows three settings:

- `t` causes LaTeX fragments to be parsed as TeX or added as raw TeX,
- `nil` removes all LaTeX fragments from the document, and
- `verbatim` treats LaTeX as text.

The default is `t`. Closes: #4070
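The three settings and the default could be modelled roughly as follows. This is a sketch under stated assumptions: the type and function names are illustrative, the input is the value part of an options line, and falling back to the default for unrecognized values is a choice made for this example only.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe, mapMaybe)

-- | Illustrative type for the three settings of the `tex` export option.
data TeXExport
  = TeXExport    -- ^ `t`: parse LaTeX fragments as TeX or keep them as raw TeX
  | TeXIgnore    -- ^ `nil`: remove LaTeX fragments from the document
  | TeXVerbatim  -- ^ `verbatim`: treat LaTeX fragments as text
  deriving (Eq, Show)

-- | Map a setting value to its constructor.
parseTeXSetting :: String -> Maybe TeXExport
parseTeXSetting "t"        = Just TeXExport
parseTeXSetting "nil"      = Just TeXIgnore
parseTeXSetting "verbatim" = Just TeXVerbatim
parseTeXSetting _          = Nothing

-- | Find the first `tex:VALUE` item in the value of an options line, e.g.
-- "toc:nil tex:verbatim". A missing or unrecognized item yields the
-- default `t` (the fallback for unrecognized values is an assumption).
texSettingFromOptions :: String -> TeXExport
texSettingFromOptions opts =
  case mapMaybe (stripPrefix "tex:") (words opts) of
    (v : _) -> fromMaybe TeXExport (parseTeXSetting v)
    []      -> TeXExport
```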
2020-06-22 | Use native Underline instead of Span in Jira | John MacFarlane | 1 | -1/+1
2020-06-20 | Recognize images with uppercase extensions | Albert Krewinkel | 1 | -0/+4
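A case-insensitive extension check of the kind this commit title suggests might look like the sketch below. The extension list and both function names are assumptions made for illustration; they are not taken from the pandoc sources.

```haskell
import Data.Char (toLower)

-- | Illustrative list of image extensions (an assumption of this sketch).
imageExtensions :: [String]
imageExtensions = ["jpeg", "jpg", "png", "gif", "svg"]

-- | Text after the last dot of a path, or "" if the path has no dot.
extension :: String -> String
extension path =
  case break (== '.') (reverse path) of
    (revExt, '.' : _) -> reverse revExt
    _                 -> ""

-- | Compare the extension in lowercase, so "photo.JPG" and "photo.jpg"
-- are treated alike.
isImageFilename :: String -> Bool
isImageFilename path = map toLower (extension path) `elem` imageExtensions
```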
Fixes: #6472
2020-04-28 | Support new Underline element in readers and writers (#6277) | Vaibhav Sagar | 4 | -10/+6
Deprecate `underlineSpan` in Shared in favor of `Text.Pandoc.Builder.underline`.
2020-04-19 | More fixes for round-trip tests of HTML reader. | John MacFarlane | 1 | -6/+10
We exclude tables that have default widths but non-simple content, as these can't really round-trip.
2020-04-18 | Fixed round-trip HTML tests. | John MacFarlane | 1 | -0/+5
Exclude tables with cells with line breaks because they don't currently round-trip. (Table goes from being simple to having explicit widths.)
2020-04-15 | Use the new builders, modify readers to preserve empty headers | despresc | 6 | -135/+141
The Builder.simpleTable now only adds a row to the TableHead when the given header row is not null. This uncovered an inconsistency in the readers: some would unconditionally emit a header filled with empty cells, even if the header was not present. Now every reader has the conditional behaviour. Only the XWiki writer depended on the header row being always present; it now pads its head as necessary.
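The conditional header behaviour can be sketched with simplified stand-in types. `Cell`, `Row`, `TableHead`, `headFromCells`, and `padHead` below are hypothetical and much simpler than the real pandoc-types definitions; they only illustrate the idea of omitting an empty header row and padding it later where a writer needs one.

```haskell
-- Simplified stand-ins for the table types (assumptions of this sketch).
type Cell = String

newtype Row = Row [Cell]
  deriving (Eq, Show)

newtype TableHead = TableHead [Row]
  deriving (Eq, Show)

-- | Build a table head that contains a row only if header cells were given.
headFromCells :: [Cell] -> TableHead
headFromCells cells = TableHead [Row cells | not (null cells)]

-- | What a writer that always needs a header row could do: use the first
-- head row, padded or truncated to the column count, or an empty row.
padHead :: Int -> TableHead -> Row
padHead ncols (TableHead (Row cells : _)) = Row (take ncols (cells ++ repeat ""))
padHead ncols (TableHead [])              = Row (replicate ncols "")
```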
2020-04-15 | Adapt to the newest Table type, fix some previous adaptation issues | despresc | 6 | -47/+77

- Writers.Native is now adapted to the new Table type.
- Inline captions should now be conditionally wrapped in a Plain, not a Para block.
- The toLegacyTable function now lives in Writers.Shared.

2020-04-15 | Implement the new Table type | despresc | 6 | -41/+63
2020-04-15 | Markdown Reader: Fix inline code in lists (#6284) | Nikolay Yakimov | 1 | -0/+47
Closes #6284. Previously inline code containing list markers was sometimes parsed incorrectly.
2020-04-04 | Jira: support citations, attachment links, and user links | Albert Krewinkel | 1 | -3/+33
Closes: #6231 Closes: #6238 Closes: #6239
2020-04-03 | Jira reader: resolve parsing issues of blockquote, color | Albert Krewinkel | 1 | -2/+16
Parsing problems occurring with block quotes and colored text have been resolved. Fixes: #6233 Fixes: #6235
2020-03-31 | Jira reader: use span with class `underline` for inserted text | Albert Krewinkel | 1 | -0/+4
Jira text which is marked as `+inserted+` is converted into pandoc's default representation for underlined text: a span with class `underline`. Previously, the span was marked with the non-standard class `inserted`. Closes: #6237
2020-03-30 | Jira reader: retain image attributes | Albert Krewinkel | 1 | -0/+9
Jira image attributes, as in `!image.jpg|align=right!`, are retained as key-value pairs. Thumbnail images, such as `!example.gif|thumbnail!`, are marked by a `thumbnail` class in their attributes. Related to #6234.
2020-03-28 | More cleanup (#6209) | Joseph C. Sible | 1 | -3/+2

* Simplify by collapsing a do block into a single <$>
* Remove an unnecessary variable: `all` takes any Foldable, so only blocksToInlines needs toList.
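The point about `all` can be shown in isolation. The sketch below is not the changed pandoc code; `allShort` and `allShortViaList` are made-up names used to show that converting a container to a list first is redundant when the only consumer is `all`.

```haskell
import Data.Foldable (toList)
import Data.List.NonEmpty (NonEmpty (..))

-- | `all` from the Prelude works on any Foldable container directly.
allShort :: Foldable t => t String -> Bool
allShort = all ((<= 3) . length)

-- | Equivalent, but with a redundant conversion to a list first.
allShortViaList :: Foldable t => t String -> Bool
allShortViaList = all ((<= 3) . length) . toList

-- | A non-list container to try the functions on.
sample :: NonEmpty String
sample = "a" :| ["bcd"]
```

Both functions give the same answer for `sample`, for a `Maybe String`, or for an ordinary list.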
2020-03-19 | Jira reader: fix parsing of tables without preceding blankline | Albert Krewinkel | 1 | -0/+5
A bug was fixed which caused faulty parsing if a table was not preceded by a newline and the first table cell had no space after the initial `|` characters. Fixes: #6198
2020-03-18 | Jira reader: fix parsing of strikeout, emphasis | Albert Krewinkel | 1 | -0/+4
A bug was fixed which caused non-emphasized text containing digits and/or non-special symbols (like dots) to sometimes be parsed incorrectly. Fixes: #6196
2020-03-13 | Update copyright year (#6186) | Albert Krewinkel | 28 | -29/+29

* Update copyright year
* Copyright: add notes for Lua and Jira modules

2020-03-13 | Jira reader: support colored inline text, indented lists | Albert Krewinkel | 1 | -0/+4

* Support for colored inlines has been added.
* Lists are now allowed to be indented; i.e., lists are still recognized if list markers are preceded by spaces.

Closes: #6183, #6184
2020-03-05 | Fix man reader test for previous change. | John MacFarlane | 1 | -1/+1
2020-02-08 | Use <$> instead of >>= and return (#6128) | Joseph C. Sible | 1 | -1/+1
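The rewrite named in this commit title follows from a standard identity: for any lawful monad, `m >>= return . f` equals `f <$> m`. The functions below are illustrative only and are not taken from the changed file.

```haskell
import Data.Char (toUpper)

-- | Before: bind the result, transform it, and wrap it again with return.
shoutBind :: Monad m => m String -> m String
shoutBind action = action >>= return . map toUpper

-- | After: the same computation, expressed with <$> and needing only Functor.
shoutFmap :: Functor f => f String -> f String
shoutFmap action = map toUpper <$> action
```

Besides being shorter, the `<$>` form needs only a `Functor` constraint.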
2020-02-07 | Apply linter suggestions. Add fix_spacing to lint target in Makefile. | John MacFarlane | 5 | -24/+24
2019-12-21 | HTML reader tests: modify round-trip tests... | John MacFarlane | 1 | -0/+4
to avoid a special failure case involving makeSections.
2019-12-19 | Org reader: fix parsing problem for colons in headline | Albert Krewinkel | 1 | -0/+10
Fixed a problem where words surrounded by colons could cause parse failures in some cases when they occurred in headers. Fixes: #5993
2019-12-18 | Org reader: wrap named table in div, using name as id | Albert Krewinkel | 1 | -0/+7
Closes: #5984
2019-12-17 | Add jira reader (#5913) | Albert Krewinkel | 1 | -0/+114
Closes #5556
2019-11-18 | DokuWiki reader: parse markup inside monospace ('') (#5917) | Alexander Krotov | 1 | -0/+3
Fixes #5916
2019-11-12 | Switch to new pandoc-types and use Text instead of String [API change]. | despresc | 8 | -16/+23
PR #5884.

+ Use pandoc-types 1.20 and texmath 0.12.
+ Text is now used instead of String, with a few exceptions.
+ In the MediaBag module, some of the types using Strings were switched to use FilePath instead (not Text).
+ In the Parsing module, new parsers `manyChar`, `many1Char`, `manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`, `manyUntilChar` have been added: these are like their unsuffixed counterparts but pack some or all of their output.
+ `glob` in Text.Pandoc.Class still takes String since it seems to be intended as an interface to Glob, which uses strings. It seems to be used only once in the package, in the EPUB writer, so that is not hard to change.

2019-11-04 | HTML Reader/Writer - Add support for <var> and <samp> (#5861) | Amogh Rathore | 1 | -0/+6
Closes #5799
2019-11-03 | Docx reader: fix list number resumption for sublists. Closes #4324. | John MacFarlane | 1 | -0/+4
The first list item of a sublist should not resume numbering from the number of the last sublist item of the same level, if that sublist was a sublist of a different list item. That is, we should not get:

```
1. one
    1. sub one
    2. sub two
2. two
    3. sub one
```
2019-10-27 | Org reader: fix parsing of empty comment lines | Albert Krewinkel | 1 | -1/+11
Comment lines in Org-mode can be completely empty; both of these lines should produce no output:

    # a comment
    #

The reader used to produce a wrong result for the latter, but ignores that line as well now. Fixes: #5856
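The rule that a bare `#` counts as a comment can be sketched as a line classifier. This is an illustration, not the Org reader's parser: `isCommentLine` and `dropComments` are hypothetical names, and allowing leading spaces before `#` is an assumption of the example.

```haskell
-- | Decide whether a line is a comment line: optional leading spaces
-- (an assumption), then `#`, then either the end of the line or a space.
-- Keyword lines such as "#+TITLE: x" are therefore not comments.
isCommentLine :: String -> Bool
isCommentLine line =
  case dropWhile (== ' ') line of
    "#"         -> True
    '#' : c : _ -> c == ' '
    _           -> False

-- | Remove all comment lines, including completely empty ones.
dropComments :: [String] -> [String]
dropComments = filter (not . isCommentLine)
```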
2019-10-23 | Add Reader support for HTML <samp> element (#5843) | Amogh Rathore | 1 | -0/+6
The `<samp>` element is parsed as a Span with class `sample`. Closes #5792.
2019-10-15 | Muse reader: do not allow closing asterisks to be followed by "*" | Alexander Krotov | 1 | -3/+23
2019-10-15 | Muse reader: do not split series of asterisks into symbols and emphasis | Alexander Krotov | 1 | -0/+8
Fixes #5821
2019-10-15 | Muse reader: do not terminate emphasis on "*" not followed by space | Alexander Krotov | 1 | -0/+4
2019-10-04 | hlint Muse reader tests | Alexander Krotov | 1 | -1/+1
2019-09-21 | [Docx Reader] Use style names, not ids, for assigning semantic meaning | Nikolay Yakimov | 1 | -0/+9
Motivating issues: #5523, #5052, #5074

- Style name comparisons are case-insensitive, since those are case-insensitive in Word.
- w:styleId will be used as style name if w:name is missing (this should only happen for malformed docx and is kept as a fallback to avoid failing altogether on malformed documents).
- Block quote detection code moved from Docx.Parser to Readers.Docx.
- Code styles, i.e. "Source Code" and "Verbatim Char", now honor style inheritance.
- Docx Reader now honours "Compact" style (used in Pandoc-generated docx). The side-effect is that "Compact" style no longer shows up in docx+styles output. Styles inherited from "Compact" will still show up.
- Removed obsolete list-item style from divsToKeep. That didn't really do anything for a while now.
- Add newtypes to differentiate between style names, ids, and different style types (that is, paragraph and character styles).
- Since docx style names can have spaces in them, and pandoc-markdown classes can't, anywhere a style name is used as a class name, spaces are replaced with ASCII dashes `-`.
- Get rid of extraneous intermediate types carrying styleId information. Instead, styleId is saved with other style data.
- Use RunStyle for inline style definitions only (lacking styleId and styleName); for Character Styles use the CharStyle type (which is basically RunStyle with styleId and styleName bolted onto it).
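Three of the points above (case-insensitive name comparison, the styleId fallback, and replacing spaces with dashes in class names) can be sketched with small newtypes. The types and functions here are hypothetical simplifications for illustration; they are not the definitions used in the Docx reader.

```haskell
import Data.Char (toLower)

-- Hypothetical newtypes that keep style names and style ids apart.
newtype StyleName = StyleName String
  deriving (Show)

newtype StyleId = StyleId String
  deriving (Eq, Show)

-- | Style names compare case-insensitively.
instance Eq StyleName where
  StyleName a == StyleName b = map toLower a == map toLower b

-- | Use the name when present, otherwise fall back to the id.
styleNameOrId :: Maybe String -> StyleId -> StyleName
styleNameOrId (Just name) _             = StyleName name
styleNameOrId Nothing     (StyleId sid) = StyleName sid

-- | Turn a style name into a class name by replacing spaces with dashes.
styleClass :: StyleName -> String
styleClass (StyleName name) = map (\c -> if c == ' ' then '-' else c) name
```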
2019-09-15 | Revert "FB2 reader test: better diagnostics on failure." | John MacFarlane | 1 | -28/+1
This reverts commit c65af7d1a2f35cbfd1235df2960f7156d38e8f92.
2019-09-15 | FB2 reader test: better diagnostics on failure. | John MacFarlane | 1 | -1/+28
2019-09-14 | FB2 reader test: Another attempt to fix test failure on GitHub CI. | John MacFarlane | 1 | -4/+5