path: root/src/Text/Pandoc/Readers
Age | Commit message | Author | Files | Lines
2020-06-29 | Unify defaults and markdown metadata parsers | Nikolay Yakimov | 2 | -15/+20
2020-06-28 | Remove obsolete RelaxedPolyRec extension (#6487) | Nikolay Yakimov | 5 | -7/+0
2020-06-28 | JATS reader: parse abstract element into metadata field of same name (#6482) | Albert Krewinkel | 1 | -0/+9
Closes: #6480
2020-06-28 | Org reader: read `#+INSTITUTE` values as text with markup | Albert Krewinkel | 1 | -7/+13
The value is stored in the `institute` metadata field and used in the default beamer presentation template.
2020-06-28 | Org reader: update behavior of author, keywords export settings | Albert Krewinkel | 1 | -19/+9
The behavior of the `#+AUTHOR` and `#+KEYWORD` export settings has changed: Org now allows multiple such lines and adds a space between the contents of each line. Pandoc now always parses these settings as meta inlines; setting values are no longer treated as comma-separated lists. Note that a Lua filter can be used to restore the previous behavior.
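The new joining rule can be illustrated with a minimal Python sketch. It is not the reader's actual implementation; the function name `collect_export_setting` is hypothetical, and the sketch returns plain text where pandoc produces meta inlines.

```python
def collect_export_setting(lines, key):
    """Join the values of every `#+KEY:` line with a single space.

    Values are deliberately not split on commas.
    """
    prefix = f"#+{key.upper()}:"
    values = [
        line[len(prefix):].strip()
        for line in lines
        if line.upper().startswith(prefix)  # keyword match is case-insensitive
    ]
    return " ".join(values)
```

With two `#+AUTHOR` lines, the result is one string with the line contents separated by a space, and any commas are kept as ordinary text.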
2020-06-28 | Org reader: refactor export setting handling | Albert Krewinkel | 1 | -79/+67
2020-06-27 | Org reader: read description lines as inlines | Albert Krewinkel | 1 | -10/+46
`#+DESCRIPTION` lines are treated as text with markup. If multiple such lines are given, then all lines are read and separated by soft linebreaks. Closes: #6485
2020-06-25 | Org reader: honor tex export option | Albert Krewinkel | 4 | -30/+75
The `tex` export option can be set with `#+OPTIONS: tex:nil` and allows three settings:
- `t` causes LaTeX fragments to be parsed as TeX or added as raw TeX,
- `nil` removes all LaTeX fragments from the document, and
- `verbatim` treats LaTeX as text.
The default is `t`. Closes: #4070
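The three settings can be sketched in Python as a dispatch over LaTeX fragments. This is only an illustration under stated assumptions: the name `apply_tex_option` is hypothetical, and the pattern recognizes only inline `$...$` math, which is far narrower than what Org treats as a LaTeX fragment.

```python
import re

# Assumption: only inline `$...$` math counts as a LaTeX fragment here.
LATEX_FRAGMENT = re.compile(r"\$[^$\n]+\$")


def apply_tex_option(text, setting="t"):
    """Split `text` into (kind, content) pieces according to the `tex` setting."""
    if setting not in ("t", "nil", "verbatim"):
        raise ValueError(f"unknown tex setting: {setting!r}")
    pieces, pos = [], 0
    for match in LATEX_FRAGMENT.finditer(text):
        if match.start() > pos:
            pieces.append(("text", text[pos:match.start()]))
        if setting == "t":
            pieces.append(("tex", match.group()))   # keep as TeX
        elif setting == "verbatim":
            pieces.append(("text", match.group()))  # keep as plain text
        # setting == "nil": the fragment is dropped entirely
        pos = match.end()
    if pos < len(text):
        pieces.append(("text", text[pos:]))
    return pieces
```

Calling `apply_tex_option("a $x$ b", "nil")` keeps only the surrounding text pieces, while `"verbatim"` keeps `$x$` as ordinary text.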
2020-06-23 | LaTeX reader: Retain the Div around tables with attributes. | John MacFarlane | 1 | -1/+8
We'll need this to store table attributes until all writers are adjusted to react to attributes on the Table element.
2020-06-22 | Use native Underline instead of Span in Jira | John MacFarlane | 1 | -1/+1
2020-06-20 | Recognize images with uppercase extensions | Albert Krewinkel | 1 | -1/+2
Fixes: #6472
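A case-insensitive extension check is one way to get this behavior. The Python sketch below is illustrative only; the function name and the extension list are assumptions, not the set the reader actually uses.

```python
import os

# Assumed list of image extensions, for illustration only.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".svg"}


def looks_like_image(path):
    """Return True if the file extension names an image, ignoring case."""
    _, ext = os.path.splitext(path)
    return ext.lower() in IMAGE_EXTENSIONS
```

Under this check, `photo.JPG` and `photo.jpg` are treated the same way.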
2020-06-17 | RST reader: pass arbitrary attributes through in code blocks. | John MacFarlane | 1 | -12/+12
Exceptions: name (which becomes the id), class (which becomes the classes), and number-lines (which is treated specially to fit with pandoc highlighting). Closes #6465.
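The mapping can be sketched in Python as follows. The function name `code_block_attr` is hypothetical, and the handling of `number-lines` (a `numberLines` class plus an optional `startFrom` value) is an assumption about how the special case fits pandoc highlighting, not something this commit message states.

```python
def code_block_attr(options):
    """Map directive options to an (identifier, classes, key-values) triple."""
    identifier, classes, keyvals = "", [], []
    for key, value in options.items():
        if key == "name":
            identifier = value                      # name becomes the id
        elif key == "class":
            classes.extend(value.split())           # class becomes the classes
        elif key == "number-lines":
            classes.append("numberLines")           # assumed representation
            if value:
                keyvals.append(("startFrom", value))
        else:
            keyvals.append((key, value))            # everything else passes through
    return identifier, classes, keyvals
```

An option such as `caption` that has no special meaning ends up unchanged in the key-value list.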
2020-06-14 | Docbook reader: implement <procedure> (#6442) | Mathieu Boespflug | 1 | -4/+6
A `<procedure>` contains a sequence of `<step>`'s, or `<substeps>` that themselves contain `<step>`'s.
2020-06-14 | Docbook reader: implement <phrase> (#6438) | Mathieu Boespflug | 1 | -1/+7
A `<phrase>` has no semantic meaning. It is only useful to hang an `id` or other attributes around a piece of text.
2020-06-14 | Docbook reader: treat envar and systemitem like code (#6435) | Mathieu Boespflug | 1 | -2/+4
2020-06-14 | Docbook: implement <replaceable> (#6437) | Mathieu Boespflug | 1 | -1/+3
A `<replaceable>` is a placeholder that a user is instructed to replace with a value of their own, like `<replaceable>prefix</replaceable>/bin/foo`. In the standard Docbook toolchain, this typically appears emphasized, with no other adornment. But a `<replaceable>` is nearly always in a code element, where emphasis won't work. So we do the same thing as for `<optional>`: decorate the content with brackets.
2020-06-14 | Docbook: map <simplesect> to unnumbered section (#6436) | Mathieu Boespflug | 1 | -15/+19
A <simplesect> is a section like any other, except that it never contains a subsection and is typically rendered unnumbered.
2020-06-13 | Textile reader: support "pre." for code blocks. | John MacFarlane | 1 | -8/+8
Closes #6454.
2020-06-09 | Ipynb reader: handle application/pdf output as image. | John MacFarlane | 1 | -1/+1
Closes #6430.
2020-06-09 | Ipynb reader: properly handle image/svg+xml as an image. | John MacFarlane | 1 | -3/+5
Partially addresses #6430.
2020-05-20 | Add "summary" to list of block-level HTML tags. | John MacFarlane | 1 | -1/+1
Closes #6385. (The summary element needs to be the first child of details and should not be enclosed by p tags.) NOTE: you need to include a blank line before the closing `</details>`, if you want the last part of the content to be parsed as a paragraph.
2020-05-19 | LaTeX reader: don't parse beyond `\end{document}`. | John MacFarlane | 1 | -13/+25
This required some internal changes to `\subfile` handling. Closes #6380.
2020-05-14 | DocBook writer: add id of figure to enclosed image. | John MacFarlane | 1 | -4/+12
2020-05-08 | Implement implicit_figures extension for commonmark reader. | John MacFarlane | 1 | -1/+6
Closes #6350.
2020-05-05 | Avoid unnecessary guard (#6340) | Joseph C. Sible | 1 | -1/+1
2020-05-04 | Fix mediawiki reader with gfm_auto_identifiers. | John MacFarlane | 1 | -1/+4
Previously the `-` was being replaced by `_`. Closes #6335.
2020-04-28 | Support new Underline element in readers and writers (#6277) | Vaibhav Sagar | 11 | -23/+32
Deprecate `underlineSpan` in Shared in favor of `Text.Pandoc.Builder.underline`.
2020-04-18 | HTML reader: parse attributes into table attributes. | John MacFarlane | 1 | -14/+18
2020-04-17 | LaTeX reader: don't put surrounding Div around Table. | John MacFarlane | 1 | -2/+5
This reverts a change in the last release; the Div is no longer needed, because we can now put the id right in the Table's attributes. However, writers may still need to be modified to do something with the id in a Table (e.g. create an anchor), so in the short term we may lose the ability to link to tables in some writers.
2020-04-15 | Markdown reader: Remove unnecessary qualification | despresc | 1 | -8/+8
2020-04-15 | Use the new builders, modify readers to preserve empty headers | despresc | 18 | -60/+154
The Builder.simpleTable now only adds a row to the TableHead when the given header row is not null. This uncovered an inconsistency in the readers: some would unconditionally emit a header filled with empty cells, even if the header was not present. Now every reader has the conditional behaviour. Only the XWiki writer depended on the header row being always present; it now pads its head as necessary.
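The conditional header behavior, and the padding a writer can apply when it needs a header row, can be sketched in Python. Both function names are hypothetical and the data model (rows as lists of strings) is a simplification of pandoc's Table type.

```python
def table_head_rows(header_cells):
    """Return the head rows: one row if a header was given, none otherwise."""
    return [list(header_cells)] if header_cells else []


def pad_head(head_rows, column_count):
    """Give a writer that requires a header an empty one when none is present."""
    return head_rows if head_rows else [[""] * column_count]
```

A reader that finds no header passes an empty list and gets an empty table head; a writer in the position of XWiki can then call something like `pad_head` to restore a row of empty cells.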
2020-04-15 | Adapt to the removal of the RowSpan, ColSpan, RowHeadColumns accessors | despresc | 1 | -1/+1
2020-04-15 | Adapt to the newest Table type, fix some previous adaptation issues | despresc | 19 | -72/+74
- Writers.Native is now adapted to the new Table type.
- Inline captions should now be conditionally wrapped in a Plain, not a Para block.
- The toLegacyTable function now lives in Writers.Shared.
2020-04-15 | Remove the onlySimpleCellBodies function from Shared | despresc | 1 | -2/+2
2020-04-15 | Implement the new Table type | despresc | 20 | -126/+150
2020-04-15 | Markdown Reader: Fix inline code in lists (#6284) | Nikolay Yakimov | 1 | -6/+11
Closes #6284. Previously inline code containing list markers was sometimes parsed incorrectly.
2020-04-15 | JATS reader: handle "label" element in section title. | John MacFarlane | 1 | -1/+7
Closes #6288.
2020-04-12 | RST reader: handle "date::" directive. | John MacFarlane | 1 | -1/+10
Closes #6276.
2020-04-11 | HTML reader: support <bdo> (#6271) | Tristan de Cacqueray | 1 | -0/+13
See https://developer.mozilla.org/en-US/docs/Web/HTML/Element/bdo
Closes #5794
2020-04-09 | Jira reader: improve icon conversion | Albert Krewinkel | 1 | -12/+12
Icons are now converted as follows: `(/)` to ✔, `(x)` to ❌, `(!)` to ❗, `(+)` to ➕, `(-)` to ➖, `(off)` to 🌙, and `(*)` to ☆. The new icons render well in most fonts. Furthermore, the UTF-8 characters all fit into 4 bytes. Closes: #6264
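The icon table above can be written down as a lookup in Python. This is a sketch of the mapping only, not the reader's parser: the names `JIRA_ICONS` and `convert_icons` are hypothetical, and plain string replacement ignores context such as code spans.

```python
# Icon markup and replacement characters as listed in the commit message.
JIRA_ICONS = {
    "(/)": "\u2714",        # heavy check mark
    "(x)": "\u274C",        # cross mark
    "(!)": "\u2757",        # heavy exclamation mark
    "(+)": "\u2795",        # heavy plus sign
    "(-)": "\u2796",        # heavy minus sign
    "(off)": "\U0001F319",  # crescent moon
    "(*)": "\u2606",        # white star
}


def convert_icons(text):
    """Replace every Jira icon markup in `text` with its character."""
    for markup, char in JIRA_ICONS.items():
        text = text.replace(markup, char)
    return text
```

Encoding each value with `char.encode("utf-8")` gives at most four bytes, which matches the size claim in the message.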
2020-04-07 | LaTeX reader: better handling of `\lettrine`. | John MacFarlane | 1 | -1/+8
- SmallCaps instead of Span for the part after the initial capital.
- Ensure that both arguments are parsed, so that in Markdown both are treated as raw LaTeX. (Closes #6258.)
2020-04-06 | Vimwiki reader: Add nested syntax highlighting (#6257) | Vlad Hanciuta | 1 | -1/+5
Nested syntaxes are specified like this:

    {{{sql
    SELECT * FROM table
    }}}

The preformatted code block parser has been extended to check if the first attribute of the block is not a `key=value` pair, and in that case it will be considered as a class. Closes #6256.
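The first-attribute rule can be sketched in Python. The function name `parse_pre_attributes` is hypothetical, and the sketch assumes attributes are separated by whitespace and that quoted values contain no spaces, which is simpler than the real parser.

```python
def parse_pre_attributes(header):
    """Split the text after `{{{` into (classes, key-values).

    A first token that is not a `key=value` pair is taken as a class.
    """
    tokens = header.split()
    classes, keyvals = [], []
    if tokens and "=" not in tokens[0]:
        classes.append(tokens.pop(0))
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            keyvals.append((key, value.strip('"')))
    return classes, keyvals
```

For the `{{{sql` example this yields the class `sql` and no key-value pairs, so the block can be highlighted as SQL.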
2020-04-04 | Jira: support citations, attachment links, and user links | Albert Krewinkel | 1 | -1/+15
Closes: #6231
Closes: #6238
Closes: #6239
2020-04-02 | HTML reader: fix parsing unclosed th elements in a table. | John MacFarlane | 1 | -0/+1
Closes #6247.
2020-03-31 | Jira reader: use span with class `underline` for inserted text | Albert Krewinkel | 1 | -1/+1
Jira text which is marked as `+inserted+` is converted into pandoc's default representation for underlined text: a span with class `underline`. Previously, the span was marked with the non-standard class `inserted`. Closes: #6237
2020-03-30 | Jira reader: retain image attributes | Albert Krewinkel | 1 | -1/+13
Jira image attributes, as in `!image.jpg|align=right!`, are retained as key-value pairs. Thumbnail images, such as `!example.gif|thumbnail!`, are marked by a `thumbnail` class in their attributes. Related to #6234.
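A rough Python sketch of that attribute handling follows. The function name `parse_jira_image` is hypothetical, and the sketch assumes parameters after the `|` are comma-separated; it is not the reader's parser.

```python
def parse_jira_image(markup):
    """Parse `!src|params!` into (source, classes, key-values)."""
    inner = markup.strip("!")
    source, _, params = inner.partition("|")
    classes, keyvals = [], []
    for param in (p.strip() for p in params.split(",")):
        if not param:
            continue
        if param == "thumbnail":
            classes.append("thumbnail")        # thumbnails become a class
        elif "=" in param:
            key, value = param.split("=", 1)
            keyvals.append((key.strip(), value.strip()))
    return source, classes, keyvals
```

The two examples from the message give `align=right` as a key-value pair and `thumbnail` as a class, respectively.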
2020-03-30 | Jira reader: read `(?)` icon as "small questionmark" character | Albert Krewinkel | 1 | -1/+1
Closes: #6236
2020-03-29 | Clean up and simplify Text.Pandoc.Readers.Docx (#6225) | Joseph C. Sible | 1 | -61/+43
* Simplify resolveDependentRunStyle
* Simplify runToInlines
* Simplify isAnchorSpan
* Simplify parStyleToTransform
* Only call getStyleName once
* Simplify ils''
* Use case matching to simplify bodyPartToBlocks
* Simplify key expiration
2020-03-29 | Clean up some fmaps (#6226) | Joseph C. Sible | 4 | -16/+16
* Avoid fmapping when we're just binding right after anyway
* Clean up unnecessary fmaps in the LaTeX reader
2020-03-29 | Docx reader: better error messages. | John MacFarlane | 1 | -8/+12
Distinguish between docx parsing and docx container unpacking errors.