path: root/src/Text/Pandoc/Readers
Age | Commit message | Author | Files | Lines
2021-05-17 | HTML reader: keep attributes from code nested below pre tag. | Albert Krewinkel | 1 | -1/+12
If a code block is defined with `<pre><code class="language-x">…</code></pre>`, where the `<pre>` element has no attributes, then the attributes from the `<code>` element are used instead. Any leading `language-` prefix in the code's *class* attribute is dropped to improve syntax highlighting. Closes: #7221
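The rule described in this message can be sketched roughly as follows. This is only an illustration, not the reader's actual code: `Attr` here is a simplified stand-in for pandoc's attribute triple, and `emptyAttr`, `dropLanguagePrefix`, and `codeBlockAttr` are hypothetical names.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- Simplified stand-in for an attribute triple:
-- (identifier, classes, key-value pairs).
type Attr = (String, [String], [(String, String)])

emptyAttr :: Attr
emptyAttr = ("", [], [])

-- Drop a leading "language-" from a class name, if present.
dropLanguagePrefix :: String -> String
dropLanguagePrefix cls = fromMaybe cls (stripPrefix "language-" cls)

-- Choose the attributes of the resulting code block: the <pre> attributes
-- are kept unless they are empty, in which case the attributes of the
-- nested <code> element are used, with "language-" prefixes removed.
codeBlockAttr :: Attr -> Attr -> Attr
codeBlockAttr preAttr codeAttr
  | preAttr /= emptyAttr = preAttr
  | otherwise            = (ident, map dropLanguagePrefix classes, kvs)
  where
    (ident, classes, kvs) = codeAttr
```

With an attribute-free `<pre>`, a nested `<code class="language-haskell">` would yield the class `haskell` under this sketch.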
2021-05-15 | HTML reader: parse `<header>` as a Div | Albert Krewinkel | 1 | -0/+2
HTML5 `<header>` elements are treated like `<div>` elements.
2021-05-14 | HTML reader: keep h1 tags as normal headers (#7274) | Albert Krewinkel | 1 | -5/+1
The tags `<title>` and `<h1 class="title">` often contain the same information, so the latter was dropped from the document. However, as this can lead to loss of information, the heading is now always retained. Use `--shift-heading-level-by=-1` to turn the `<h1>` into the document title, or a filter to restore the previous behavior. Closes: #2293
2021-05-14 | HTML reader: don't fail on unmatched closing "script" tag. | Albert Krewinkel | 1 | -7/+9
Prevent the reader from crashing if the HTML input contains an unmatched closing `</script>` tag. Fixes: #7282
2021-05-13 | Implement curly-brace syntax for Markdown citation keys. | John MacFarlane | 2 | -7/+7
The change provides a way to use citation keys that contain special characters not usable with the standard citation key syntax. Example: `@{foo_bar{x}'}` for the key `foo_bar{x}'`. Closes #6026. The change requires adding a new parameter to the `citeKey` parser from Text.Pandoc.Parsing [API change]. Markdown reader: recognize @{..} syntax for citations. Markdown writer: use @{..} syntax for citations when needed. Update manual with curly-brace syntax for citations.
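As an illustration of the curly-brace form, the following sketch pulls the key out of text beginning with `@{`. It assumes that braces nested inside the key must be balanced, which is an assumption made for this example rather than something the commit message states. `bracedKey` is a hypothetical name and is unrelated to the real `citeKey` parser.

```haskell
-- Extract the key from text that starts with "@{".
-- Assumption (not taken from the commit): nested braces inside the key
-- are balanced, and the key ends at the brace that closes the opening one.
bracedKey :: String -> Maybe String
bracedKey ('@' : '{' : rest) = go (1 :: Int) rest ""
  where
    -- Input ran out before the closing brace was found.
    go _ [] _ = Nothing
    go depth (c : cs) acc
      | c == '}' && depth == 1 = Just (reverse acc)
      | c == '}'               = go (depth - 1) cs (c : acc)
      | c == '{'               = go (depth + 1) cs (c : acc)
      | otherwise              = go depth cs (c : acc)
bracedKey _ = Nothing
```

Applied to the example from the message, `bracedKey "@{foo_bar{x}'}"` gives `Just "foo_bar{x}'"`.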
2021-05-12 | Fix source position reporting for YAML bibliographies. | John MacFarlane | 2 | -4/+6
Closes #7273.
2021-05-09 | RST reader: seek include files in the directory... | John MacFarlane | 1 | -1/+3
...of the file containing the include directive, as RST requires. Closes #6632.
2021-05-09 | Org reader: Resolve org includes relative to ... | John MacFarlane | 2 | -2/+5
...the directory containing the file containing the INCLUDE directive. Closes #5501.
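The path resolution described in the RST and Org commits above can be sketched like this, assuming the reader knows the path of the file that contains the include directive. `resolveInclude` is a hypothetical helper, not a function from pandoc.

```haskell
import System.FilePath (isAbsolute, takeDirectory, (</>))

-- Resolve the target of an include directive relative to the directory
-- of the file that contains the directive. Absolute targets are kept.
resolveInclude
  :: FilePath  -- ^ path of the file containing the include directive
  -> FilePath  -- ^ target named in the directive
  -> FilePath
resolveInclude includingFile target
  | isAbsolute target = target
  | otherwise         = takeDirectory includingFile </> target
```

For example, a directive in `docs/chapter/intro.rst` that names `parts/a.rst` would resolve under `docs/chapter`, not under the working directory.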
2021-05-09 | RST reader: use `insertIncludedFile` from T.P.Parsing... | John MacFarlane | 1 | -58/+36
instead of reproducing much of its code.
2021-05-09 | T.P.Parsing: improve include file functions. | John MacFarlane | 2 | -3/+3
Remove old `insertIncludedFileF`. [API change] Give `insertIncludedFile` a more general type, allowing it to be used where `insertIncludedFileF` was.
2021-05-09 | Change reader types, allowing better tracking of source positions. | John MacFarlane | 34 | -355/+443
Previously, when multiple file arguments were provided, pandoc simply concatenated them and passed the contents to the readers, which took a Text argument. As a result, the readers had no way of knowing which file was the source of any particular bit of text. This meant that we couldn't report accurate source positions on errors or include accurate source positions as attributes in the AST. More seriously, it meant that we couldn't resolve resource paths relative to the files containing them (see e.g. #5501, #6632, #6384, #3752). Add Text.Pandoc.Sources (exported module), with a `Sources` type and a `ToSources` class. A `Sources` wraps a list of `(SourcePos, Text)` pairs. [API change] A parsec `Stream` instance is provided for `Sources`. The module also exports versions of parsec's `satisfy` and other Char parsers that track source positions accurately from a `Sources` stream (or any instance of the new `UpdateSourcePos` class). Text.Pandoc.Parsing now exports these modified Char parsers instead of the ones parsec provides. Modified parsers to use a `Sources` as stream [API change]. The readers that previously took a `Text` argument have been modified to take any instance of `ToSources`. So, they may still be used with a `Text`, but they can also be used with a `Sources` object. In Text.Pandoc.Error, modified the constructor PandocParsecError to take a `Sources` rather than a `Text` as first argument, so parse error locations can be accurately reported. T.P.Error: showPos, do not print "-" as source name.
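To make the idea concrete, here is a deliberately simplified model of a `Sources`-like stream. It uses `String` and a hand-rolled `Pos` record where the commit describes `Text` and parsec's `SourcePos`. `Pos`, `fromFiles`, `advance`, and `nextChar` are names invented for this sketch and do not reflect the real Text.Pandoc.Sources API.

```haskell
-- Simplified position: source name, line, column.
-- (The commit describes parsec's SourcePos; this is a stand-in.)
data Pos = Pos
  { posName :: FilePath
  , posLine :: Int
  , posCol  :: Int
  } deriving (Eq, Show)

-- A list of chunks, each tagged with the position where it starts.
-- (The commit describes (SourcePos, Text) pairs; String is used here.)
newtype Sources = Sources [(Pos, String)]
  deriving (Eq, Show)

-- Build a stream from several input files without concatenating them.
fromFiles :: [(FilePath, String)] -> Sources
fromFiles files = Sources [(Pos name 1 1, contents) | (name, contents) <- files]

-- Advance a position past one character.
advance :: Char -> Pos -> Pos
advance '\n' (Pos name line _)   = Pos name (line + 1) 1
advance _    (Pos name line col) = Pos name line (col + 1)

-- Take the next character together with the position it came from,
-- skipping over chunks that have been used up.
nextChar :: Sources -> Maybe ((Pos, Char), Sources)
nextChar (Sources []) = Nothing
nextChar (Sources ((_, []) : rest)) = nextChar (Sources rest)
nextChar (Sources ((pos, c : cs) : rest)) =
  Just ((pos, c), Sources ((advance c pos, cs) : rest))
```

Because every chunk carries its own starting position, a character read from the second file is reported against that file's name and line, which is what makes per-file error positions and relative resource paths possible.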
2021-04-29 | Docx reader: add handling of vml image objects (jgm#4735) (#7257) | mbrackeantidot | 1 | -2/+9
They represent images, the same way as other images in vml format.
2021-04-28 | Smarter smart quotes. | John MacFarlane | 3 | -47/+12
Treat a leading " with no closing " as a left curly quote. This supports the practice, in fiction, of continuing paragraphs quoting the same speaker without an end quote. It also helps with quotes that break over lines in line blocks. Closes #7216.
2021-04-25 | Minor code reformatting. | John MacFarlane | 1 | -1/+2
Also taking this opportunity to note, for the record, that the commit for #7241 should be marked [API change]. It changes the type of `languagesByExtension` in Highlighting, adding a parameter for a `SyntaxMap`.
2021-04-25 | Writers: Recognize custom syntax definitions (#7241) | Jan Tojnar | 1 | -1/+2
Languages defined using `--syntax-definition` were not recognized by `languagesByExtension`. This patch corrects that, allowing the writers to see all custom definitions. The LaTeX writer still uses the default syntax map, but that's okay in that context, since `--syntax-definition` won't create new listings styles.
2021-04-18 | Use MetaInlines not MetaBlocks for multimarkdown metadata fields. | John MacFarlane | 1 | -1/+1
This gives better results in converting to e.g. pandoc markdown. Ref: <https://groups.google.com/d/msgid/pandoc-discuss/9728d1f4-040e-4392-aa04-148f648a8dfdn%40googlegroups.com>
2021-04-17 | Update to released unicode-collation, latest citeproc dev version. | John MacFarlane | 2 | -2/+2
Update citeproc test.
2021-04-17 | Remove Text.Pandoc.BCP47 module. | John MacFarlane | 3 | -123/+129
[API change] Use Lang from UnicodeCollation.Lang instead. This is a richer implementation of BCP 47.
2021-04-02 | Fix "phrase" in DocBook: take classes from "role" not "class". | John MacFarlane | 1 | -1/+1
Closes #7195. Revises #6438.
2021-03-31 | Treat tabs as spaces in ODT Reader. (#7185) | niszet | 1 | -1/+7
2021-03-24 | Fix DocBook reader mathml regression... | John MacFarlane | 2 | -4/+7
...caused by the switch in XML libraries. Also fixed a similar issue in JATS. Closes #7173.
2021-03-20 | Support `yaml_metadata_block` extension for commonmark, gfm. | John MacFarlane | 1 | -0/+30
This is a bit more limited than with markdown, as documented in the manual:
- The YAML block must be the first thing in the input.
- The leaf nodes are parsed in isolation from the rest of the document. So, for example, you can't use reference links if the references are defined later in the document.
Closes #6537.
2021-03-20 | Move yamlMetaBlock from Markdown reader to T.P.Readers.Metadata. | John MacFarlane | 2 | -22/+22
2021-03-20 | Markdown reader: export `yamlMetaBlock`. | John MacFarlane | 1 | -17/+23
[API change] This will allow us to parse YAML metadata blocks in other readers, potentially.
2021-03-20 | Text.Pandoc.Parsing: remove F type synonym. | John MacFarlane | 3 | -3/+5
Muse and Org were defining their own F anyway, with their own state. We therefore move this definition to the Markdown reader.
2021-03-20 | T.P.Readers.Metadata: made `yamlBsToMeta`, `yamlBsToRefs` polymorphic... | John MacFarlane | 1 | -15/+15
on the parser state, instead of requiring ParserState. [API change]
2021-03-19 | Hlint suggestion. | John MacFarlane | 1 | -2/+3
2021-03-19 | Protect partial uses of maximum with NonEmpty. | John MacFarlane | 6 | -17/+24
2021-03-19 | Use NonEmpty instead of minimumDef. | John MacFarlane | 3 | -6/+6
2021-03-19 | Docx reader: Don't reimplement NonEmpty. | John MacFarlane | 1 | -5/+1
2021-03-18 | Use minimumDef instead of minimum (partial function). | John MacFarlane | 3 | -4/+6
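The pattern behind the `NonEmpty` commits above can be illustrated with a small sketch. `maximumOr` is a hypothetical helper, not code from the commits: `maximum` is partial on plain lists but total on `NonEmpty`, so the empty case has to be handled explicitly once, at the conversion.

```haskell
import qualified Data.List.NonEmpty as NE

-- Total replacement for `maximum` on lists: the caller supplies the value
-- to return for the empty list. NE.nonEmpty yields Nothing for [], and
-- `maximum` on a NonEmpty value cannot fail.
maximumOr :: Ord a => a -> [a] -> a
maximumOr def xs = maybe def maximum (NE.nonEmpty xs)
```

Compared with a defaulting helper such as `minimumDef`, going through `NonEmpty` lets the type record that emptiness has already been ruled out.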
2021-03-18 | Require safe >= 0.3.18 and remove cpp. | John MacFarlane | 1 | -5/+0
2021-03-18 | Rewrite a foldl1 as a foldl'. | John MacFarlane | 1 | -1/+5
2021-03-18 | Remove another foldr1 partial function use. | John MacFarlane | 1 | -5/+6
2021-03-18 | T.P.Readers.Odt.StyleReader: rewrite foldr1 use as foldr. | John MacFarlane | 1 | -5/+6
This avoids a partial function.
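A generic illustration of this kind of rewrite follows. It is not the code from StyleReader, and `concatAll` and `largest` are made-up names. `foldr1` and `foldl1` raise an exception on an empty list, while `foldr` and `foldl'` take an explicit starting value.

```haskell
import Data.List (foldl')

-- Partial version would be `foldr1 (++)`, which fails on [].
-- Total version: give foldr an explicit result for the empty case.
concatAll :: [[a]] -> [a]
concatAll = foldr (++) []

-- Partial version would be `foldl1 max`, which also fails on [].
-- Total version: match on the list, then seed the strict foldl' with
-- the first element.
largest :: Ord a => [a] -> Maybe a
largest []       = Nothing
largest (x : xs) = Just (foldl' max x xs)
```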
2021-03-17 | Fix regression with `tex_math_backslash` in Markdown reader. | John MacFarlane | 1 | -1/+1
Added regression test. Closes #7155.
2021-03-15 | Remove an unneeded import | John MacFarlane | 1 | -1/+0
2021-03-15 | Use foldl' instead of foldl everywhere. | John MacFarlane | 7 | -13/+14
2021-03-13 | MediaWiki reader: Allow block-level content in notes (ref). | John MacFarlane | 1 | -1/+9
Closes #7145.
2021-03-13 | Jira reader: mark divs created from panels with class "panel". | Albert Krewinkel | 1 | -2/+2
Closes: tarleb/jira-wiki-markup#2
2021-03-09 | RST reader: fix logic for ending comments. | John MacFarlane | 1 | -1/+2
Previously comments sometimes got extended too far. Closes #7134.
2021-03-07 | LaTeX reader: handle table cells containing `&` in `\verb`. | John MacFarlane | 1 | -1/+6
Closes #7129.
2021-03-07 | LaTeX reader: support hyperref command. | John MacFarlane | 1 | -4/+13
Closes #7127.
2021-03-04 | Revert "Revert "Relax `--abbreviations` rules so that a period isn't required."" | John MacFarlane | 1 | -1/+1
This reverts commit 916ce4d51121e0529b938fda71f37e947882abe5. I was confused in thinking it wouldn't work.
2021-03-04 | Revert "Relax `--abbreviations` rules so that a period isn't required." | John MacFarlane | 1 | -1/+1
This reverts commit e461b7dd45f717f3317216c7d3207a1d24bf1c85. Ill-advised change. This doesn't work because we parse strings in chunks.
2021-03-04 | Relax `--abbreviations` rules so that a period isn't required. | John MacFarlane | 1 | -1/+1
Partially addresses #7124.
2021-03-03 | Revert "Add T.P.Readers.LaTeX.Include." | John MacFarlane | 3 | -86/+52
This reverts commit b569b0226d4bd5e0699077089d54fb03d4394b7d. Memory usage improvement in compilation wasn't very significant.
2021-03-03 | Add T.P.Readers.LaTeX.Include. | John MacFarlane | 3 | -52/+86
2021-03-03 | Remove T.P.Readers.LaTeX.Accent. | John MacFarlane | 3 | -82/+69
Incorporate accentCommands into T.P.Readers.LaTeX.Inline.
2021-03-03 | Move enquote commands to T.P.LaTeX.Lang. | John MacFarlane | 3 | -24/+34