Age | Commit message | Author | Files | Lines |
|
This exports functions that use xml-conduit's parser to
produce an xml-light Element or [Content]. This allows
existing pandoc code to use a better parser without
much modification.
The new parser is used in all places where xml-light's
parser was previously used. Benchmarks show a significant
performance improvement in parsing XML-based formats
(especially ODT and FB2).
Note that the xml-light types use String, so the
conversion from xml-conduit types involves a lot
of extra allocation. It would be desirable to
avoid that in the future by gradually switching
to using xml-conduit directly. This can be done
module by module.
The new parser also reports errors, which we report
when possible.
A new constructor PandocXMLError has been added to
PandocError in T.P.Error [API change].
Closes #7091, which was the main stimulus.
These changes revealed the need for some changes
in the tests. The docbook-reader.docbook test
lacked definitions for the entities it used; these
have been added. And the docx golden tests have been
updated, because the new parser does not preserve
the order of attributes.
Add entity defs to docbook-reader.docbook.
Update golden tests for docx.
|
|
See #7091.
|
|
|
|
Add support for informalfigure.
|
|
Previously they only worked for links that had titles. Closes #7080.
|
|
The interpretation of this line is not affected
by the delim option. Closes #7064.
|
|
Previously there was a messy code path that gave strange
results in some cases, not passing through raw tex but
trying to extract a string content. This was an artefact
of trying to handle some special bibtex-specific commands
in the BibTeX reader. Now we just handle these in the
LaTeX reader and simplify parsing in the BibTeX reader.
This does mean that more raw tex will be passed through
(and currently this is not sensitive to the `raw_tex`
extension; this should be fixed).
Closes #7049.
|
|
This reverts commit 6efd3460a776620fdb93812daa4f6831e6c332ce.
Since this extension is designed to be used with
GitHub markdown (gfm), we need to implement the parser
as a commonmark extension (commonmark-extensions),
rather than in pandoc's markdown reader. When that is
done, we can add it here.
|
|
Changes overview:
* Add a `Ext_markdown_github_wikilink` constructor to `Extension` [API change].
* Add the parser `githubWikiLink` in `Text.Pandoc.Readers.Markdown`
* Add tests.
|
|
Additional pipe chars, used to separate "action" state from "no further
action" states, are ignored. E.g., for the following sequence, both
`DONE` and `FINISHED` are states with no further action required.
#+TODO: UNFINISHED | DONE | FINISHED
Previously, parsing of the todo sequence failed if multiple pipe chars
were included.
Closes: #7014
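The rule described above can be sketched as follows. This is an
illustration in Python, not pandoc's Haskell parser;
`parse_todo_sequence` is a hypothetical name, and the sketch assumes
that everything before the first pipe is an "action" state and
everything after it needs no further action. Lines without any pipe
are not modelled.

```python
def parse_todo_sequence(line):
    """Split an Org `#+TODO:` line into (action_states, done_states)."""
    prefix = "#+TODO:"
    if not line.upper().startswith(prefix):
        raise ValueError("not a #+TODO line")
    words = line[len(prefix):].split()
    if "|" in words:
        first_pipe = words.index("|")
        action = words[:first_pipe]
        done = words[first_pipe + 1:]
    else:
        # No separator at all: this sketch treats every word as an
        # action state and does not model Org's default for this case.
        action, done = words, []
    # Additional pipes only sit between "done" states and are ignored.
    done = [w for w in done if w != "|"]
    return action, done
```

For the example line, `UNFINISHED` ends up as the only action state,
and `DONE` and `FINISHED` are both returned as done states.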
|
|
|
|
Closes #7003.
|
|
* Change org-mode’s verbatim from `code` to `codeWith`.
This adds the `"verbatim"` class so that exporters can apply a specific
style on it. For instance, it will be possible for HTML to add a CSS
rule for code + verbatim class.
* Alter test for org-mode’s verbatim change.
See previous commit for further detail on the new implementation.
|
|
when `raw_tex` is not enabled. (When `raw_tex` is enabled,
the whole environment is parsed as a raw block.)
The class name is the name of the environment.
Previously, we just included the contents without the
surrounding Div, but having a record of the environment's
boundaries and name can be useful.
Closes #6997.
|
|
The Div wrapper of code blocks with captions now has the class
"captioned-content". The caption itself is added as a Plain block
inside a Div of class "caption". This makes it easier to write filters
which match on captioned code blocks. Existing filters will need to be
updated.
Closes: #6977
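A filter could match the new structure roughly as follows. This is a
sketch operating on pandoc's JSON AST loaded into Python dicts;
`captioned_code` and `has_class` are hypothetical helpers, and the
order of the caption Div and the code block inside the wrapper is not
assumed.

```python
def has_class(block, tag, cls):
    """True if `block` is a JSON AST node of type `tag` carrying class `cls`."""
    return (isinstance(block, dict)
            and block.get("t") == tag
            and cls in block["c"][0][1])


def captioned_code(block):
    """Return (caption_inlines, code_text) for a captioned code Div, else None."""
    if not has_class(block, "Div", "captioned-content"):
        return None
    caption, code = None, None
    for child in block["c"][1]:
        if has_class(child, "Div", "caption"):
            # The caption Div is expected to contain a Plain block.
            plains = [b for b in child["c"][1] if b.get("t") == "Plain"]
            if plains:
                caption = plains[0]["c"]
        elif child.get("t") == "CodeBlock":
            code = child["c"][1]
    if caption is None or code is None:
        return None
    return caption, code
```

Matching on the class names rather than on position keeps such a
filter independent of where the caption is placed within the wrapper.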
|
|
Closes #6993.
|
|
|
|
The `renderTags'` function was duplicated when the reader used `Text` as
its string type. The duplication is no longer necessary.
A side effect of this change is that empty `<col>` elements are written
as self-closing tags in raw HTML blocks.
|
|
These (as well as lang attributes on html) should update
lang in metadata. See #6938.
|
|
Previously we stripped attribute prefixes, reading
`xml:lang` as `lang` for example. This resulted in
two duplicate `lang` attributes when `xml:lang` and
`lang` were both used. This commit causes the prefixes
to be retained, and also avoids invalid duplicate
attributes.
Closes #6938.
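The difference can be illustrated with a small Python sketch. It is
not the reader's actual code: `strip_prefix` mimics the old behaviour
as described, and `keep_unique` shows one possible way of avoiding
duplicates (keeping the first occurrence of each name), which is an
assumption made for illustration.

```python
def strip_prefix(name):
    """Old behaviour as described: `xml:lang` is read as `lang`."""
    return name.rsplit(":", 1)[-1]


def keep_unique(attrs):
    """Keep qualified names; drop later repeats of a name (assumed policy)."""
    seen = set()
    result = []
    for name, value in attrs:
        if name not in seen:
            seen.add(name)
            result.append((name, value))
    return result


attrs = [("xml:lang", "en"), ("lang", "en-GB")]
# Stripping prefixes yields two attributes that are both named `lang`.
stripped = [(strip_prefix(name), value) for name, value in attrs]
# Retaining prefixes keeps the two attributes distinct.
retained = keep_unique(attrs)
```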
|
|
* Add `Ext_sourcepos` constructor for `Extension`.
* Add `sourcepos` extension (only for commonmark).
* Bump to 2.11.3
With the `sourcepos` extension set, `data-pos` attributes are added
to the AST by the commonmark reader. No other readers are affected. The
`data-pos` attributes are put on elements that accept attributes; for
other elements, an enclosing Div or Span is added to hold the attributes.
Closes #4565.
|
|
|
|
DokuWiki lets users define their own Interwiki links.
Previously pandoc reacted to these by emitting a
Google search link, which is not helpful. Instead,
we now just emit the full URL including the
wikilink prefix, e.g. `faquk>FAQ-mathml`.
This at least gives users the ability to
modify the links using filters.
Closes #6932.
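A filter could then expand such targets itself. The following Python
sketch assumes a user-supplied mapping from interwiki prefixes to base
URLs; `resolve_interwiki` and the `example.org` URL are hypothetical
and only illustrate the idea.

```python
def resolve_interwiki(target, prefixes):
    """Expand `prefix>name` using a user-supplied prefix table.

    Targets without a known prefix are returned unchanged.
    """
    prefix, sep, name = target.partition(">")
    if not sep or prefix not in prefixes:
        return target
    return prefixes[prefix] + name


# Hypothetical table; real wikis define their own prefixes and URLs.
table = {"faquk": "https://example.org/faq/"}
expanded = resolve_interwiki("faquk>FAQ-mathml", table)
```

In a real filter this function would be applied to the URL of each
link in the document.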
|
|
Links with (internal) targets that the reader doesn't know about are
converted into emphasized text. Information on the link target is now
preserved by wrapping the text in a Span of class `spurious-link`, with
an attribute `target` set to the link's original target. This allows
broken or unknown links to be recovered and fixed with filters.
See: #6916
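As a sketch of such a filter, the following Python function walks
pandoc's JSON AST (loaded into dicts and lists) and turns
`spurious-link` Spans back into Links. `restore_links` and the
`resolve` callback are hypothetical; `resolve` is assumed to map an
original target to a URL, or to return `None` if the target is still
unknown.

```python
def restore_links(node, resolve):
    """Recursively replace resolvable `spurious-link` Spans with Links."""
    if isinstance(node, list):
        return [restore_links(child, resolve) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "Span":
            (ident, classes, keyvals), inlines = node["c"]
            target = dict(keyvals).get("target")
            if "spurious-link" in classes and target is not None:
                url = resolve(target)
                if url is not None:
                    # Drop the marker class and the `target` attribute.
                    kept = [c for c in classes if c != "spurious-link"]
                    kept_kv = [kv for kv in keyvals if kv[0] != "target"]
                    return {"t": "Link",
                            "c": [[ident, kept, kept_kv],
                                  restore_links(inlines, resolve),
                                  [url, ""]]}
        return {k: restore_links(v, resolve) for k, v in node.items()}
    return node
```

Spans whose target cannot be resolved are left untouched, so the
emphasized text produced by the reader is preserved in that case.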
|
|
If we put an image in italics, then when rendering to Markdown
we no longer get an implicit figure.
Closes #6925.
|
|
|
|
Header comment in the CSV reader module says "RST" instead of "CSV".
|
|
|
|
Closes: #6312
|
|
The contents of the `center` environment are put in a `Div`
with class `center`.
|
|
- `<tfoot>` elements are no longer added to the table body but used as
table footer.
- Separate `<tbody>` elements are no longer combined into one.
- Attributes on `<thead>`, `<tbody>`, `<th>`/`<td>`, and `<tfoot>`
elements are preserved.
|
|
|
|
This removes the `foldOrElse` function from the internal Text.Pandoc.CSS
module.
|
|
|
|
|
|
|
|
Reducing module size should reduce memory use during compilation.
This is preparatory work to tackle support for more table features.
|
|
Fixes: #6845
|
|
Improves on 9a40976. Closes #6873.
|
|
Table width relative to text width is not natively supported
by DocBook, but the DocBook FO stylesheets support it through an XML
processing instruction, `<?dbfo table-width="50%"?>`.
Implement support for this instruction in the DocBook reader.
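Reading the width out of such an instruction could look like the
sketch below. It is written in Python for illustration and is not the
reader's implementation; `dbfo_table_width` is a hypothetical helper
that handles only double-quoted percentage values and returns the
width as a fraction of the text width, which is an assumption about
how the value would be used.

```python
import re


def dbfo_table_width(pi_text):
    """Return the table width as a fraction (0.5 for "50%"), or None."""
    match = re.fullmatch(r'\s*<\?dbfo\s+(.*?)\s*\?>\s*', pi_text, re.S)
    if not match:
        return None
    # Collect the pseudo-attributes of the processing instruction.
    attrs = dict(re.findall(r'([\w-]+)\s*=\s*"([^"]*)"', match.group(1)))
    width = re.fullmatch(r'(\d+(?:\.\d+)?)%', attrs.get("table-width", ""))
    if not width:
        return None
    return float(width.group(1)) / 100
```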
|
|
in cases where we run into trouble parsing inlines until the
closing `]`, e.g. quotes, we return a plain string with the
option contents. Previously we mistakenly included the brackets
in this string.
Closes #6869.
|
|
...and put it in a div with class `formalpara-title`, so that
people can reformat with filters.
Closes #6562.
Thanks to rdmuller.
|
|
We now better handle `.IP` when it is used with non-bullet,
non-numbered lists, creating a definition list.
We also skip blank lines like groff itself.
Closes #6858.
|
|
As of ~2 years ago, lower case keywords became the standard (though they
are handled case-insensitively, as always):
https://code.orgmode.org/bzg/org-mode/commit/13424336a6f30c50952d291e7a82906c1210daf0
Upper case keywords are exclusive to the manual:
- https://orgmode.org/list/871s50zn6p.fsf@nicolasgoaziou.fr/
- https://orgmode.org/list/87tuuw3n15.fsf@nicolasgoaziou.fr/
|
|
This reproduces earlier pandoc-citeproc behavior.
Closes jgm/citeproc#26.
|
|
This affects example list references followed by dashes.
Introduced by commit b8d17f7.
Closes #6855.
|
|
|
|
- use real minus sign
- use tests contributed by Igor Pashev.
|
|
The commit a157e1a broke negative numbers, e.g.
`\SI{-33}{\celsius}` or `\num{-3}`. This fixes the regression.
|
|
Previously, if we had `@foo [p. 33; @bar]`, the `p. 33` would be
incorrectly parsed as a prefix of `@bar` rather than a suffix
of `@foo`.
|