Age | Commit message | Author | Files | Lines |
|
Try fixing a parsing error on windows by insisting that the parser use
a Posix filepath library for splitting doc paths in a zipfile. (It
might default on Windows to using a backslash as a separator, while
it's always a forward-slash in zip archives.)
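
To illustrate the separator mismatch, here is a minimal sketch using the platform-specific modules of the `filepath` package. The helper name `entryInDir` is hypothetical and is not taken from pandoc's source.

```haskell
import qualified System.FilePath.Posix as Posix
import qualified System.FilePath.Windows as Windows

-- Hypothetical helper: build the name of a zip entry inside a directory.
-- Zip entry names always use '/', so we join with the Posix combinator
-- regardless of the host platform.
entryInDir :: FilePath -> FilePath -> FilePath
entryInDir dir name = dir Posix.</> name

main :: IO ()
main = do
  -- Same result on every platform: word/document.xml
  putStrLn (entryInDir "word" "document.xml")
  -- What the Windows flavour of (</>) produces: word\document.xml
  putStrLn ("word" Windows.</> "document.xml")
```

Importing `System.FilePath.Posix` explicitly removes the dependence on the host operating system that the unqualified `System.FilePath` module has.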
|
|
* Clarify function name. We had previously used `getDocumentPath`,
but `Document` is an overdetermined term here. Use
`getDocumentXmlPath` to make clear what we're doing.
* Use field notation for setting ReaderEnv. As we've added (and
continue to add) fields, the assignment by position has gotten
harder to read.
* Figure out document.xml path once at the beginning of parsing, and
add it to the environment, so we can avoid repeated lookups.
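
The readability argument for field notation can be sketched as follows. The `Env` record and its field names are hypothetical stand-ins, since the real `ReaderEnv` fields are not listed here.

```haskell
-- Hypothetical, cut-down environment; the real ReaderEnv differs.
data Env = Env
  { envDocXmlPath :: FilePath  -- path of the main document part
  , envInHeader   :: Bool
  , envListDepth  :: Int
  } deriving (Show, Eq)

-- Positional construction: the reader must remember the field order.
byPosition :: Env
byPosition = Env "word/document.xml" False 0

-- Field notation: each value is labelled, and order no longer matters.
byField :: Env
byField = Env
  { envInHeader   = False
  , envListDepth  = 0
  , envDocXmlPath = "word/document.xml"
  }

main :: IO ()
main = print (byPosition == byField)
```

Both definitions build the same value; only the second stays legible as fields are added.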
|
|
Getting the location used to depend on a hard-coded .rels file based
on "word/document.xml". We now dynamically detect that file based on
the document.xml file specified in "_rels/.rels".
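
A minimal sketch of deriving the .rels location from the main document path, assuming the usual package layout in which the relationships for a part `dir/name` live at `dir/_rels/name.rels`. The function name `relsPathFor` is hypothetical.

```haskell
import qualified System.FilePath.Posix as Posix

-- Hypothetical helper: given the path of a part inside the archive,
-- compute the path of its relationships file. Posix functions are
-- used because zip entry names always use '/'.
relsPathFor :: FilePath -> FilePath
relsPathFor docPath =
  let (dir, name) = Posix.splitFileName docPath
  in dir Posix.</> "_rels" Posix.</> (name ++ ".rels")

main :: IO ()
main = do
  putStrLn (relsPathFor "word/document.xml")
  putStrLn (relsPathFor "word/document2.xml")
```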
|
|
The desktop Word program places the main document file in
"word/document.xml", but the online version of Word places it in
"word/document2.xml". This file path is actually stated in the root
"_rels/.rels" file, in the "Relationship" element with an
"http://../officedocument" type.
Closes #5277
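
The lookup described above might be sketched like this. The `Relationship` record is a hypothetical model of a parsed "Relationship" element, and the type URIs in the sample data are placeholders rather than the real schema URLs.

```haskell
import Data.Char (toLower)
import Data.List (find, isSuffixOf)

-- Hypothetical model of one Relationship element from "_rels/.rels".
data Relationship = Relationship
  { relType   :: String
  , relTarget :: FilePath
  } deriving (Show, Eq)

-- Return the target of the first relationship whose type ends in
-- "officedocument" (compared case-insensitively), if there is one.
mainDocumentPath :: [Relationship] -> Maybe FilePath
mainDocumentPath rels = relTarget <$> find isMainDoc rels
  where
    isMainDoc r = "officedocument" `isSuffixOf` map toLower (relType r)

main :: IO ()
main = do
  -- Placeholder type URIs, used only for illustration.
  let rels =
        [ Relationship "http://example.invalid/extended-properties"
                       "docProps/app.xml"
        , Relationship "http://example.invalid/officeDocument"
                       "word/document2.xml"
        ]
  print (mainDocumentPath rels)
```

Returning `Maybe` leaves it to the caller to decide what to do when no such relationship exists, for instance falling back to "word/document.xml".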
|
|
For some reason, Word in Office 365 Online uses `document2.xml`
for the content, instead of `document.xml`. As a result, pandoc
cannot parse these docx files.
This quick fix has the parser check for both `document.xml`
and `document2.xml`.
Addresses #5277, but a more robust solution would be to
get the name of the main document dynamically (who knows
whether it might change again?).
|
|
Quite a few modules were missing copyright notices.
This commit adds copyright notices everywhere via haddock module
headers. The old license boilerplate comment is redundant with this and has
been removed.
Update copyright years to 2019.
Closes #4592.
|
|
Otherwise the last block gets parsed as a Plain rather than
a Para.
This is a regression in pandoc 2.x. This patch restores
pandoc 1.19 behavior.
Closes #5271.
|
|
Previously we didn't strip off the attachment: prefix,
so even though the attachment is available in the mediabag,
pandoc couldn't find it.
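
A minimal sketch of the prefix handling, assuming the media bag is keyed by the bare file name. The function name `attachmentKey` is hypothetical.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- Drop a leading "attachment:" so the remainder can be used as a
-- lookup key; sources without the prefix are returned unchanged.
attachmentKey :: String -> String
attachmentKey src = fromMaybe src (stripPrefix "attachment:" src)

main :: IO ()
main = do
  putStrLn (attachmentKey "attachment:plot.png")
  putStrLn (attachmentKey "images/plot.png")
```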
|
|
`braced` now actually requires nested braces.
Otherwise some legitimate command and environment
definitions can break (see test/command/tex-group.md).
|
|
|
|
|
|
Partially addresses #4731.
We may still not be matching mediawiki's algorithm
for identifiers exactly.
|
|
|
|
This reverts commit 5eaff399d5d6dc30b0d453eff42c4101674d75ab.
|
|
This avoids conflicts with things like 'toc'.
|
|
[API change]
* Depend on ipynb library.
* Add `ipynb` as input and output format.
* Add Text.Pandoc.Readers.Ipynb (supports both nbformat v3 and v4).
* Add Text.Pandoc.Writers.Ipynb (supports nbformat v4).
* Add ipynb readers and writers to T.P.Readers,
T.P.Writers, and T.P.Extensions. Register the
file extension .ipynb for this format.
* Add `PandocIpynbDecodingError` constructor to Text.Pandoc.Error.Error.
* Note: there is no template for ipynb.
|
|
|
|
|
|
We don't want to parse its contents as Markdown or HTML.
Closes #5241.
|
|
Previously the `.0` was interpreted as a file extension,
leading pandoc not to add `.tex` (and thus not to find the
file).
The new behavior matches TeX more closely.
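
The problem can be reproduced with `takeExtension`, which reports `.0` for a name such as `chapter1.0`. The sketch below contrasts a naive rule with one possible alternative. Both function names are hypothetical, and the alternative rule is an assumption for illustration, not necessarily the rule pandoc adopted.

```haskell
import System.FilePath (takeExtension, (<.>))

-- Naive rule: only add ".tex" when the name has no extension at all.
-- A name like "chapter1.0" is left alone because ".0" looks like one.
naiveTexName :: FilePath -> FilePath
naiveTexName f
  | null (takeExtension f) = f <.> "tex"
  | otherwise              = f

-- Assumed alternative: add ".tex" unless the name already ends in it.
texName :: FilePath -> FilePath
texName f
  | takeExtension f == ".tex" = f
  | otherwise                 = f <.> "tex"

main :: IO ()
main = do
  putStrLn (naiveTexName "chapter1.0")
  putStrLn (texName "chapter1.0")
  putStrLn (texName "intro.tex")
```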
|
|
|
|
Directives of this type without numeric inputs should not have a
`startFrom` attribute; with a blank value, the writers can produce
extra whitespace.
|
|
* These were added by the RST reader and, for literate Haskell,
by the Markdown and LaTeX readers. There is no point to
this class, and it is not applied consistently by all readers.
See #5047.
* Reverse order of `literate` and `haskell` classes on code blocks
when parsing literate Haskell. Better if `haskell` comes first.
|
|
Closes #5204.
|
|
See #5190.
|
|
When `minlevel` exceeds the original minimum level observed in the
file to be included, every heading should be shifted rightward.
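
The shift can be sketched on a plain list of heading levels. The function name `shiftToMinLevel` is hypothetical, and leaving the levels untouched when `minlevel` does not exceed the observed minimum is an assumption; only the other case is described above.

```haskell
-- Shift heading levels so that the shallowest heading lands on
-- `minlevel`, but only when `minlevel` exceeds the observed minimum.
shiftToMinLevel :: Int -> [Int] -> [Int]
shiftToMinLevel _ [] = []
shiftToMinLevel minlevel levels
  | offset > 0 = map (+ offset) levels
  | otherwise  = levels
  where
    offset = minlevel - minimum levels

main :: IO ()
main = do
  print (shiftToMinLevel 3 [1, 2, 2, 1])
  print (shiftToMinLevel 1 [2, 3])
```

Applying one offset to every heading preserves the relative nesting of the included file.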
|
|
Underscore emphasis can't cross table cell boundaries,
but the parser wasn't respecting this, leading to exponential
behavior in documents with table cells containing underscores.
This fixes the original sample; it's possible that there
are other performance issues involving underscores.
Closes #3921.
|
|
Closes #1792
|
|
Closes #3051
|
|
Fixes a regression introduced by the previous commit.
|
|
Links with descriptions that point to images are no longer read
as inline images, but as proper links.
Fixes: #5191
|
|
|
|
It is updated by some readers, but never actually used.
|
|
Closes #5180.
|
|
Closes #5149.
|
|
|
|
See #5162.
|
|
There can be overrides for the definitions of certain levels in
numbering definitions. This implements that behavior.
Closes: #5134
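
One way to picture level overrides is as a left-biased map union, where an override replaces the base definition for the same level and all other levels fall through. This is a simplified sketch: `LevelDef` and `effectiveLevels` are hypothetical names, and the real numbering data carries more than a format string.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical, simplified level definition: just a number format.
type LevelDef = String

-- Map.union prefers entries from its first argument, so overrides
-- win for the levels they define and base definitions fill the rest.
effectiveLevels :: Map.Map Int LevelDef -> Map.Map Int LevelDef
                -> Map.Map Int LevelDef
effectiveLevels overrides base = Map.union overrides base

main :: IO ()
main = do
  let base      = Map.fromList [(0, "decimal"), (1, "lowerLetter")]
      overrides = Map.fromList [(1, "upperRoman")]
  print (Map.toList (effectiveLevels overrides base))
```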
|
|
|
|
It had previously been an alias for a tuple.
|
|
|
|
|
|
Starting with pandoc 2.4, citations and quoted inlines
were no longer recognized after parentheses. This is
because of commit 9b0bd4ec6f5c9125efb3e36232e6d1f6ac08a728,
which is reverted here.
The point of that commit was to allow relocation of
soft line breaks to before an abbreviation, so that
a nonbreaking space could be added after the
abbreviation. Now we simply leave the soft line
break in place, even though this means that
we won't get a nonbreaking space after "Mr."
at the end of a line (and in LaTeX this may
result in a longer intersentential space).
Those who care about this issue should take care
not to end lines with an abbreviation, or to
insert nonbreaking spaces manually.
Closes #5099.
|
|
|
|
|
|
Allow decimal points, preceding space.
Also require text 1.1+.
|
|
|
|
|
|
|
|
Parse as raw, but know that these font changing commands
take no arguments.
|