Age | Commit message (Collapse) | Author | Files | Lines |
|
Do a lookahead to find the right parser to use.
Benchmarks from 34ms to 23ms, with less allocation.
Also speeds up the epub reader.
|
|
With the new XML parser, we can avoid the expensive tree
normalization step we used to do.
This gives a significant speed boost in docbook and JATS
parsing (e.g. 9.7 to 6 ms).
|
|
|
|
This is to prevent accidental creation of ligatures like
`` ?` `` and `` !` `` (especially in languages with quotations
like German), and similar ligature issues.
See jgm/citeproc#54.
|
|
Use `\vadjust pre` so that the hypertarget takes you to the
beginning of the paragraph rather than one line down.
Closes #7078.
This makes a particular difference for links to citations
using `--citeproc` and `link-citations: true`.
|
|
Cleanup up some functions and added deprecation pragmas
to funtions no longer used in the code base.
|
|
The org-ref syntax allows to list multiple citations separated by comma.
This fixes a bug that accepted commas as part of the citation id, so all
citation lists were parsed as one single citation.
Fixes: #7101
|
|
This gives a speedup of about 5-10%.
The reader is now approximately twice as fast as in the last
release.
|
|
This reverts commit d8fc4971868104274881570ce9bc3d9edf0d2506.
|
|
|
|
|
|
|
|
..and add new definitions isomorphic to xml-light's, but with
Text instead of String. This allows us to keep most of the code in
existing readers that use xml-light, but avoid lots of unnecessary
allocation.
We also add versions of the functions from xml-light's
Text.XML.Light.Output and Text.XML.Light.Proc that operate
on our modified XML types, and functions that convert
xml-light types to our types (since some of our dependencies,
like texmath, use xml-light).
Update golden tests for docx and pptx.
OOXML test: Use `showContent` instead of `ppContent` in `displayDiff`.
Docx: Do a manual traversal to unwrap sdt and smartTag.
This is faster, and needed to pass the tests.
Benchmarks:
A = prior to 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
B = as of 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
C = this commit
| Reader | A | B | C |
| ------- | ----- | ------ | ----- |
| docbook | 18 ms | 12 ms | 10 ms |
| opml | 65 ms | 62 ms | 35 ms |
| jats | 15 ms | 11 ms | 9 ms |
| docx | 72 ms | 69 ms | 44 ms |
| odt | 78 ms | 41 ms | 28 ms |
| epub | 64 ms | 61 ms | 56 ms |
| fb2 | 14 ms | 5 ms | 4 ms |
|
|
|
|
- If src is empty, we simply skip the iframe.
- If src is invalid or cannot be fetched, we issue a warning
and skip instead of failing with an error.
- Closes #7099.
|
|
Refactor `handleError` to use `renderError`. This allows us
render error messages without exiting.
|
|
The tasks lists extension is now supported by the org reader and writer;
the extension is turned on by default.
Closes: #6336
|
|
|
|
|
|
This attribute was listed as allowed, but it didn't actually
do anything. Use `attributes` for code attributes and more.
Closes #7097.
|
|
|
|
* Rewrote `withRaw` so it doesn't rely on fragile assumptions
about token positions (which break when macros are expanded).
This requires the addition of `sEnableWithRaw` and `sRawTokens`
in `LaTeXState`, and a new combinator `disablingWithRaw` to
disable collecting of raw tokens in certain contexts.
* Add `parseFromToks` to T.P.Readers.LaTeX.Parsing.
* Fix parsing of single character tokens so it doesn't mess
up the new raw token collecting.
* These changes slightly increase allocations and have a small
performance impact, but it's minor.
Closes #7092.
|
|
Setting SOURCE_DATE_EPOCH will allow reproducible builds.
Partially addresses #7093. This does not suffice to fully enable
reproducible in EPUB, since a unique id is being generated for each
build.
|
|
This attempts to read the SOURCE_DATE_EPOCH environment variable
and parse a UTC time from it (treating it as a unix date stamp,
see https://reproducible-builds.org/specs/source-date-epoch/).
If the variable is not set or can't be parsed as a unix date
stamp, then the function returns the current date.
|
|
See jgm/citeproc#53.
|
|
This exports functions that uses xml-conduit's parser to
produce an xml-light Element or [Content]. This allows
existing pandoc code to use a better parser without
much modification.
The new parser is used in all places where xml-light's
parser was previously used. Benchmarks show a significant
performance improvement in parsing XML-based formats
(especially ODT and FB2).
Note that the xml-light types use String, so the
conversion from xml-conduit types involves a lot
of extra allocation. It would be desirable to
avoid that in the future by gradually switching
to using xml-conduit directly. This can be done
module by module.
The new parser also reports errors, which we report
when possible.
A new constructor PandocXMLError has been added to
PandocError in T.P.Error [API change].
Closes #7091, which was the main stimulus.
These changes revealed the need for some changes
in the tests. The docbook-reader.docbook test
lacked definitions for the entities it used; these
have been added. And the docx golden tests have been
updated, because the new parser does not preserve
the order of attributes.
Add entity defs to docbook-reader.docbook.
Update golden tests for docx.
|
|
See #7091.
|
|
|
|
Add support for informalfigure.
|
|
|
|
Previously they only worked for links that had titles. Closes #7080.
|
|
|
|
|
|
The module allows to work with file paths in a convenient and
platform-independent manner.
Closes: #6001
Closes: #6565
|
|
Exported by Text.Pandoc.App.
|
|
This change allows bibtex/biblatex output to wrap as other
formats do, depending on the settings of `--wrap` and `--columns`.
It also introduces default templates for bibtex and biblatex,
which allow for using the variables `header-include`, `include-before`
or `include-after` (or alternatively the command line options
`--include-in-header`, `--include-before-body`, `--include-after-body`)
to insert content into the generated bibtex/biblatex.
This change requires a change in the return type of the unexported
`T.P.Citeproc.writeBibTeXString` from `Text` to `Doc Text`.
Closes #7068.
|
|
+ Require citeproc 0.3.0.7, which correctly titlecases when titles
contain non-ASCII characters.
+ Correctly handle 'pages' (= 'page' in CSL).
+ Correctly handle BibLaTeX 'langid' (= 'language' in CSL).
+ In BibTeX output, protect foreign titles since there's no language
field.
|
|
The interpretation of this line is not affected
by the delim option. Closes #7064.
|
|
|
|
instead of raising a PandocAppError as before.
|
|
We insert an HTML comment to avoid a `$` right before
a digit, which pandoc will not recognize as a math delimiter.
|
|
Prevents the generation of invalid markup if a citation element contains
an ampersand or another character with a special meaning in XML.
|
|
Previously there was a messy code path that gave strange
results in some cases, not passing through raw tex but
trying to extract a string content. This was an artefact
of trying to handle some special bibtex-specific commands
in the BibTeX reader. Now we just handle these in the
LaTeX reader and simplify parsing in the BibTeX reader.
This does mean that more raw tex will be passed through
(and currently this is not sensitive to the `raw_tex`
extension; this should be fixed).
Closes #7049.
|
|
fixes #7047
|
|
The Lua modules `pandoc` and `pandoc.List` are now always loaded from the
system's default data directory. Loading from a different directory by
overriding the default path, e.g. via `--data-dir`, is no longer supported to
avoid unexpected behavior and to address security concerns.
|
|
This change allows pandoc to extract size information
from more SVGs. Closes #7045.
|
|
JATS writer: use element citations
|
|
|
|
* `biblatex` and `bibtex` are now supported as output
as well as input formats.
* New module Text.Pandoc.Writers.BibTeX, exporting
writeBibTeX and writeBibLaTeX. [API change]
* New unexported function `writeBibtexString` in
Text.Pandoc.Citeproc.BibTeX.
|
|
This allows to import the module in writers without causing a circular
dependency.
|