Age | Commit message | Author | Files | Lines |
|
The rewrite is much more direct and avoids parseFromString.
It also performs significantly better; unfortunately, parsing
time still increases exponentially.
See #1735.
|
|
There isn't any reason to have numerous anchors in the same place,
since we can't preserve docx's overlapping, non-nested anchors. So we
reduce them to a single anchor, and make all links that point to one of
the overlapping anchors point to that one. This changes the behavior from
commit e90c714c7 slightly (we use the first anchor instead of the last),
so we change the expected test result.
Note that because this produces a state that has to be set after every
invocation of `parPartToInlines`, we make the main function into a
primed subfunction `parPartToInlines'`, and make `parPartToInlines` a
wrapper around that.
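The anchor reduction described above can be sketched roughly as follows. This is a minimal illustration in Python, not the reader's Haskell code; `collapse_anchors`, `anchor_groups`, and `link_targets` are hypothetical names, and the sketch assumes the overlapping anchors have already been grouped by location.

```python
def collapse_anchors(anchor_groups, link_targets):
    """Keep the first anchor of each overlapping group and retarget links.

    anchor_groups: list of lists of anchor ids found at the same place.
    link_targets:  list of anchor ids that links point to.
    """
    canonical = {}
    kept = []
    for group in anchor_groups:
        if not group:
            continue
        first = group[0]          # first anchor wins, as in the commit message
        kept.append(first)
        for name in group:
            canonical[name] = first
    # Links to unknown anchors are left untouched.
    rewritten = [canonical.get(target, target) for target in link_targets]
    return kept, rewritten


kept, links = collapse_anchors([["a", "b", "c"], ["d"]], ["c", "d", "x"])
```

Here `kept` holds one anchor per location and `links` holds the retargeted link destinations.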
|
|
This seems to help with the performance problem, #4216.
|
|
Docx produces a lot of anchors with nothing pointing to them -- we now
remove these to produce cleaner output. Note that this has to occur at
the end of the process because it has to follow link/anchor rewriting.
Closes #3679.
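A minimal sketch of the cleanup step, assuming link and anchor rewriting has already happened so that the link targets are final. The function and parameter names are hypothetical and the code is Python for illustration only.

```python
def drop_unused_anchors(anchors, link_targets):
    """Return only the anchors that at least one link points to."""
    referenced = set(link_targets)
    return [anchor for anchor in anchors if anchor in referenced]


remaining = drop_unused_anchors(["intro", "_GoBack", "sec1"], ["sec1", "intro"])
```

Running this before link rewriting would drop anchors that only become link targets afterwards, which is why the step has to come last.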
|
|
Amusewiki uses the #cover directive to specify the cover image.
|
|
Previously we read only the first child of an sdtContents tag. Now
we replace sdt with all children of the sdtContents tag.
This changes the expected test result of our nested_anchors test,
since now we read docx's generated TOCs.
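The unwrapping can be illustrated with a small sketch using Python's standard `xml.etree.ElementTree`. It is not the reader's implementation; the tag names are passed as parameters and follow the wording of this message rather than the namespaced names in a real docx file.

```python
import xml.etree.ElementTree as ET


def unwrap_sdt(parent, sdt_tag="sdt", contents_tag="sdtContents"):
    """Replace every sdt element under parent with all children of its contents tag."""
    new_children = []
    for child in list(parent):
        # Unwrap nested sdt elements first, so inner wrappers disappear too.
        unwrap_sdt(child, sdt_tag, contents_tag)
        if child.tag == sdt_tag:
            contents = child.find(contents_tag)
            if contents is not None:
                new_children.extend(list(contents))  # all children, not just the first
        else:
            new_children.append(child)
    parent[:] = new_children
    return parent


doc = ET.fromstring(
    "<body><sdt><sdtPr/><sdtContents><p>a</p><p>b</p></sdtContents></sdt><p>c</p></body>"
)
unwrap_sdt(doc)
texts = [child.text for child in doc]
```

After the call, the two paragraphs inside the wrapper sit directly under `body`, ahead of the third paragraph.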
|
|
This allows us to parse unknown tabular environments
as raw LaTeX. Closes #4208.
|
|
The level of headers in included files can be shifted to a higher level
by specifying a minimum header level via the `:minlevel` parameter. E.g.
`#+include: "tour.org" :minlevel 1` will shift the headers in tour.org
such that the topmost headers become level 1 headers.
Fixes: #4154
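As a rough illustration of the shift, assuming header levels are plain integers and that the whole included file moves by one constant offset. `shift_headers` is a hypothetical name and the code is Python, not the reader's Haskell.

```python
def shift_headers(levels, minlevel):
    """Shift all header levels so that the topmost header lands on minlevel."""
    if not levels:
        return []
    offset = minlevel - min(levels)   # same offset for every header
    return [level + offset for level in levels]


shifted = shift_headers([2, 3, 3, 2], 1)
```

Because the offset is constant, the relative nesting of the included headers is preserved.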
|
|
HTML Reader: be more forgiving about figcaption
|
|
See #4162.
|
|
We walk through the document (using the zipper in
Text.XML.Light.Cursor) to unwrap the sdt tags before doing the rest of
the parsing of the document. Note that the function is generically
named `walkDocument` in case we need to do any further preprocessing
in the future.
Closes #4190
|
|
Closes #4193.
|
|
We now convert a ref-list element into a list of
citations in metadata, suitable for use with pandoc-citeproc.
We also convert references to pandoc citation elements.
Thus a JATS article with embedded bibliographic information
can be processed with pandoc and pandoc-citeproc to produce
a formatted bibliography.
|
|
fixes #4183
|
|
API change: export blocksToInlines' from Text.Pandoc.Shared
|
|
Don't pass through macro definitions themselves when `latex_macros`
is set. The macros have already been applied.
If `latex_macros` is enabled, then `rawLaTeXBlock` in
Text.Pandoc.Readers.LaTeX will succeed in parsing a macro definition,
and will update pandoc's internal macro map accordingly, but the
empty string will be returned.
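A toy model of this behavior, for illustration only. The regular expression handles just argument-free `\newcommand` definitions, and `raw_latex_block` here is a hypothetical Python stand-in, not the `rawLaTeXBlock` parser in Text.Pandoc.Readers.LaTeX.

```python
import re

# Toy pattern: \newcommand{\name}{body} with no arguments and no nested braces.
MACRO_DEF = re.compile(r"\\newcommand\{\\([A-Za-z]+)\}\{([^{}]*)\}")


def raw_latex_block(text, macros, latex_macros=True):
    """Record a macro definition and return "" when latex_macros is on."""
    match = MACRO_DEF.fullmatch(text.strip())
    if match and latex_macros:
        macros[match.group(1)] = match.group(2)  # update the macro map
        return ""                                # definition is not passed through
    return text                                  # otherwise keep the raw LaTeX


macros = {}
output = raw_latex_block(r"\newcommand{\foo}{bar}", macros)
```

With `latex_macros=False` the same call returns the definition unchanged and leaves the macro map alone.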
Together with earlier changes, this closes #4179.
|
|
+ Preserve original whitespace between blocks.
+ Recognize `\placeformula` as context.
|
|
Add Basic JATS reader based on DocBook reader
|
|
Material following `^^` was dropped if it wasn't a character
escape. This only affected invalid LaTeX, so we didn't see it
in the wild, but it appeared in a QuickCheck test failure:
https://travis-ci.org/jgm/pandoc/jobs/319812224
|
|
A parsing error was fixed which caused the org reader to fail when
parsing a paragraph starting with two or more asterisks.
Fixes: #4180
|
|
This fixes a regression in 2.0.
Note that extensions can now be individually disabled, e.g.
`-f opml-smart-raw_html`.
Closes #4164.
|
|
Mainly so they can be tested.
|
|
This mainly affects the Markdown reader when parsing
raw LaTeX with escaped spaces. Closes #4159.
|
|
Previously we erroneously included the enclosing
backticks in a reference ID (closes #4156).
This change also disables interpretation of
syntax inside references, as in docutils.
So, there is no emphasis in
`my *link*`_
|
|
A caption starts with a `:` which can't be followed
by punctuation. Otherwise we can falsely interpret
the start of a fenced div, or even a table header line
like `:--:|:--:`, as a caption.
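The rule can be sketched as a simple predicate. This Python version covers only the `:` rule stated above, treats ASCII punctuation from `string.punctuation` as "punctuation", and accepts a bare `:`; all of these are assumptions of the sketch, not a description of the Markdown reader's parser.

```python
import string


def looks_like_caption(line):
    """True if the line starts with ':' that is not followed by punctuation."""
    stripped = line.lstrip()
    if not stripped.startswith(":"):
        return False
    rest = stripped[1:]
    # Reject ':::' (fenced div) and ':--:|:--:' (table header line).
    return not (rest and rest[0] in string.punctuation)


checks = [looks_like_caption(s) for s in (": My table", "::: warning", ":--:|:--:")]
```

Only the first of the three sample lines is accepted as a caption start.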
|
|
Docx expects that lists will continue where they left off after an
interruption, and it introduces a new id if a list is starting again. So
we keep track of the state of lists and use it to define a "start"
attribute, if necessary.
Closes #4025
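A minimal sketch of the bookkeeping, assuming each run of list items arrives as a pair of list id and item count. `assign_starts` and the data layout are hypothetical, and the code is Python for illustration.

```python
def assign_starts(chunks):
    """For each (list_id, n_items) chunk, compute the number its first item gets."""
    items_seen = {}   # list id -> number of items emitted so far
    result = []
    for list_id, n_items in chunks:
        start = items_seen.get(list_id, 0) + 1   # continue where the list left off
        result.append((list_id, start))
        items_seen[list_id] = items_seen.get(list_id, 0) + n_items
    return result


starts = assign_starts([("1", 3), ("2", 2), ("1", 2)])
```

A chunk whose id was seen before resumes after the earlier items, while a new id starts again at 1.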
|
|
It would be awkward to indent example list contents to the
first non-space character after the label, since example
list labels are often long.
Thanks to Bernhard Fisseni for the suggestion.
|
|
Previously we computed the column sizes based on the ratio
between the header lines and the text width (as set by `--columns`).
This meant that tables with very short header lines would be
very narrow. With this change, pipe tables with wrapping cells will
always take up the whole text width. The relative column widths
will still be determined by the ratio of header lines, but they
will be normalized to add up to 1.0.
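The normalization amounts to dividing each header line length by the total. A short Python sketch, assuming the lengths of the header lines for each column are already known; `relative_widths` is a hypothetical name.

```python
def relative_widths(header_line_lengths):
    """Normalize per-column header line lengths so the widths sum to 1.0."""
    total = sum(header_line_lengths)
    if total == 0:
        return [0.0 for _ in header_line_lengths]
    return [length / total for length in header_line_lengths]


widths = relative_widths([2, 6])
```

The result no longer depends on `--columns`, so very short header lines still produce a table that spans the full text width.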
|
|
This should be a nonbreaking space, as long as it's not
followed by a blank line. This has been fixed at the tokenizer
level.
Closes #4134.
|
|
Closes #4125.
|
|