Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
These are handled anyway by regularSymbol.
|
|
|
|
|
|
To help reduce memory demands compiling the main LaTeX reader.
|
|
Fixes: #6674
|
|
|
|
Previously we didn't allow unescaped quotes in unquoted values,
but they are allowed. Closes #7112.
|
|
...when handling URL argument served with no charset in the mime type.
The assumption is that most pages that don't specify a charset
in the mime type are either UTF-8 or latin1. I think that's a good
assumption, though I'm not sure.
|
|
the character encoding. We can properly handle UTF-8 and
latin1 (ISO-8859-1); for others we raise an error.
See #5600.
|
|
...for PandocError. [API change]
|
|
[API change]
|
|
[API change] This affects `readFile`, `getContents`, `writeFileWith`,
`writeFile`, `putStrWith`, `putStr`, `putStrLnWith`, `putStrLn`.
`hPutStrWith`, `hPutStr`, `hPutStrLnWith`, `hPutStrLn`, `hGetContents`.
This avoids the need to uselessly create a linked list of characters
when emiting output.
|
|
Benchmarks show 2/3 of the run time and 2/3 of the allocation
of the Feb. 10 benchmarks.
|
|
This isn't actually needed and checking it doesn't change
anything.
Also remove an unnecessary `doMacros` before `satisfyTok`,
which does it anyway.
|
|
Avoid unnecessary 'doMacros'.
|
|
|
|
Removed:
- `splitByIndices`
- `splitStringByIndicies`
- `substitute`
- `underlineSpan`
None of these are used elsewhere in the code base.
|
|
Also, remove exported class NamedTag(..) [API change].
This was just intended to smooth over the transition from String to Text
and is no longer needed.
The functions isInlineTag and isBlockTag are no longer
polymorphic.
|
|
|
|
|
|
Remove the parameter, have it parse the opening brace,
and make it more efficient.
|
|
Do a lookahead to find the right parser to use.
Benchmarks from 34ms to 23ms, with less allocation.
Also speeds up the epub reader.
|
|
With the new XML parser, we can avoid the expensive tree
normalization step we used to do.
This gives a significant speed boost in docbook and JATS
parsing (e.g. 9.7 to 6 ms).
|
|
|
|
This is to prevent accidental creation of ligatures like
`` ?` `` and `` !` `` (especially in languages with quotations
like German), and similar ligature issues.
See jgm/citeproc#54.
|
|
Use `\vadjust pre` so that the hypertarget takes you to the
beginning of the paragraph rather than one line down.
Closes #7078.
This makes a particular difference for links to citations
using `--citeproc` and `link-citations: true`.
|
|
Cleanup up some functions and added deprecation pragmas
to funtions no longer used in the code base.
|
|
The org-ref syntax allows to list multiple citations separated by comma.
This fixes a bug that accepted commas as part of the citation id, so all
citation lists were parsed as one single citation.
Fixes: #7101
|
|
This gives a speedup of about 5-10%.
The reader is now approximately twice as fast as in the last
release.
|
|
This reverts commit d8fc4971868104274881570ce9bc3d9edf0d2506.
|
|
|
|
|
|
|
|
..and add new definitions isomorphic to xml-light's, but with
Text instead of String. This allows us to keep most of the code in
existing readers that use xml-light, but avoid lots of unnecessary
allocation.
We also add versions of the functions from xml-light's
Text.XML.Light.Output and Text.XML.Light.Proc that operate
on our modified XML types, and functions that convert
xml-light types to our types (since some of our dependencies,
like texmath, use xml-light).
Update golden tests for docx and pptx.
OOXML test: Use `showContent` instead of `ppContent` in `displayDiff`.
Docx: Do a manual traversal to unwrap sdt and smartTag.
This is faster, and needed to pass the tests.
Benchmarks:
A = prior to 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
B = as of 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
C = this commit
| Reader | A | B | C |
| ------- | ----- | ------ | ----- |
| docbook | 18 ms | 12 ms | 10 ms |
| opml | 65 ms | 62 ms | 35 ms |
| jats | 15 ms | 11 ms | 9 ms |
| docx | 72 ms | 69 ms | 44 ms |
| odt | 78 ms | 41 ms | 28 ms |
| epub | 64 ms | 61 ms | 56 ms |
| fb2 | 14 ms | 5 ms | 4 ms |
|
|
|
|
- If src is empty, we simply skip the iframe.
- If src is invalid or cannot be fetched, we issue a warning
and skip instead of failing with an error.
- Closes #7099.
|
|
Refactor `handleError` to use `renderError`. This allows us
render error messages without exiting.
|
|
The tasks lists extension is now supported by the org reader and writer;
the extension is turned on by default.
Closes: #6336
|
|
|
|
|
|
This attribute was listed as allowed, but it didn't actually
do anything. Use `attributes` for code attributes and more.
Closes #7097.
|
|
|
|
* Rewrote `withRaw` so it doesn't rely on fragile assumptions
about token positions (which break when macros are expanded).
This requires the addition of `sEnableWithRaw` and `sRawTokens`
in `LaTeXState`, and a new combinator `disablingWithRaw` to
disable collecting of raw tokens in certain contexts.
* Add `parseFromToks` to T.P.Readers.LaTeX.Parsing.
* Fix parsing of single character tokens so it doesn't mess
up the new raw token collecting.
* These changes slightly increase allocations and have a small
performance impact, but it's minor.
Closes #7092.
|
|
Setting SOURCE_DATE_EPOCH will allow reproducible builds.
Partially addresses #7093. This does not suffice to fully enable
reproducible in EPUB, since a unique id is being generated for each
build.
|
|
This attempts to read the SOURCE_DATE_EPOCH environment variable
and parse a UTC time from it (treating it as a unix date stamp,
see https://reproducible-builds.org/specs/source-date-epoch/).
If the variable is not set or can't be parsed as a unix date
stamp, then the function returns the current date.
|
|
See jgm/citeproc#53.
|
|
This exports functions that uses xml-conduit's parser to
produce an xml-light Element or [Content]. This allows
existing pandoc code to use a better parser without
much modification.
The new parser is used in all places where xml-light's
parser was previously used. Benchmarks show a significant
performance improvement in parsing XML-based formats
(especially ODT and FB2).
Note that the xml-light types use String, so the
conversion from xml-conduit types involves a lot
of extra allocation. It would be desirable to
avoid that in the future by gradually switching
to using xml-conduit directly. This can be done
module by module.
The new parser also reports errors, which we report
when possible.
A new constructor PandocXMLError has been added to
PandocError in T.P.Error [API change].
Closes #7091, which was the main stimulus.
These changes revealed the need for some changes
in the tests. The docbook-reader.docbook test
lacked definitions for the entities it used; these
have been added. And the docx golden tests have been
updated, because the new parser does not preserve
the order of attributes.
Add entity defs to docbook-reader.docbook.
Update golden tests for docx.
|
|
See #7091.
|
|
|