|
This allows users to turn off the default pandoc behavior of
parsing contents of div and span tags in markdown and HTML
as native pandoc Div blocks and Span inlines.
Setting of default epub extensions has been moved from the EPUB
reader to Text.Pandoc.
|
|
OMath parser: Change signature of exported function.
|
|
This changes the signature of the exported `readOMML` to `String ->
Either String [Exp]`, so it can now, in theory, be slotted into
TeXMath. It doesn't have any real error reporting yet, but that might
make more sense once I put it in a branch, and understand how it works
in the other readers.
It also now reads strings that parse to either oMath or oMathPara
elements. Note that the distinction is lost in the output. It's up to
the caller to remember the display type.
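
A minimal sketch of how a caller might use the new signature and keep track of the display type itself. Everything here is an assumption for illustration: `Exp` is a one-constructor placeholder for TeXMath's real expression type, `readOMMLStub` is a stand-in that only mimics the `String -> Either String [Exp]` shape, and `DisplayType` and `parseWithDisplay` are hypothetical names.

```haskell
-- Placeholder for TeXMath's expression type; the real `Exp` has many more
-- constructors. This one exists only so the sketch compiles on its own.
data Exp = ESymbolStub String
  deriving (Eq, Show)

-- Hypothetical tag kept by the caller, since the parse result does not
-- record whether the source was an oMath or an oMathPara element.
data DisplayType = DisplayBlock | DisplayInline
  deriving (Eq, Show)

-- Stand-in with the signature described above. It is NOT the real parser:
-- it rejects the empty string and wraps anything else in one placeholder node.
readOMMLStub :: String -> Either String [Exp]
readOMMLStub "" = Left "empty input"
readOMMLStub s  = Right [ESymbolStub s]

-- The caller pairs the parse result with the display type it already knows.
parseWithDisplay :: DisplayType -> String -> Either String (DisplayType, [Exp])
parseWithDisplay dt src = fmap (\es -> (dt, es)) (readOMMLStub src)
```

Returning `Either` lets the caller decide what to do with a failed parse, even while the error messages themselves are still uninformative.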
|
|
|
|
This matches behavior of RedCarpet, avoids some ugly bugs, and improves
performance.
|
|
This gets rid of commented-out functions, cleans up whitespace errors,
and exports and imports the correct functions.
|
|
We still need to test against prefixes, but this is only going to look
at oMath fragments, so we're not going to be worried about looking up
the real namespace.
|
|
This is the first step in removing the intermediate OMath type, which we
no longer need since we're writing straight to TeXMath Exp.
|
|
|
|
|
|
This actually does what d71b013841f3c9c8c595591e312a31df16a728cb
said it did.
Revised epub tests to remove the repeated DOCTYPE and xml tags.
|
|
EPUB Reader: Improved image extraction
|
|
|
|
Math module
|
|
Previously, drawings that were under some other toplevel run (e.g., a
hyperlink) weren't handled properly. This should fix that.
|
|
|
|
Could use some cleanup, but this is the first step for getting
an OMML reader into TeXMath.
|
|
Signed-off-by: Jesse Rosenthal <jrosenthal@jhu.edu>
|
|
Since changing the Docx type, this is no longer necessary. Thanks to
Matthew Pickering for picking up on this.
|
|
Docx Reader: Use TeXMath for writing equations.
|
|
This also introduces a `defaultDState` value.
|
|
TeXMath does the work now.
|
|
|
|
The new version of TeXMath can translate from its type system into
LaTeX. So instead of writing the LaTeX ourselves, we write to the TeXMath
`Exp` type, and let TeXMath do the rest.
|
|
|
|
Very minor cleanup and readability changes
|
|
Docx Parser: Produce endnotes.
|
|
Previously they were parsed as raw.
|
|
|
|
|
|
|
|
|
|
mpickering-epubend
Conflicts:
pandoc.cabal
|
|
The parser had been changing footnotes and endnotes into footnotes. This
isn't a problem, because pandoc collapses them, but the parser should
maintain as much of the docx structure as possible, and let the
toplevel reader worry about how to translate it into Pandoc. (This would
be an issue when, as is planned, the docx parser spins off into its
own module.)
The output is the same, so no test change is required.
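
The split of responsibilities described here can be sketched as follows. The types are hypothetical simplifications, not the actual docx parser or pandoc definitions: `DocxNote` stands for the parser-level type that keeps the distinction, and `Inline` for the reader-level type that has a single note constructor.

```haskell
-- Hypothetical parser-level note type that preserves the docx distinction.
data DocxNote = Footnote String | Endnote String
  deriving (Eq, Show)

-- Hypothetical reader-level inline; one constructor covers both kinds of note.
data Inline = Note String
  deriving (Eq, Show)

-- The collapse happens here, in the reader, rather than in the parser.
noteToInline :: DocxNote -> Inline
noteToInline (Footnote body) = Note body
noteToInline (Endnote body)  = Note body
```

Because both constructors map to the same `Note`, the final output does not change, which is consistent with no test changes being needed.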
|
|
All other underlines are ignored.
|
|
|
|
|
|
|
|
|
|
|
|
Moved `MediaBag` definition and functions from Shared:
`lookupMedia`, `mediaDirectory`, `insertMedia`, `extractMediaBag`.
Removed `emptyMediaBag`; use `mempty` instead, since `MediaBag`
is a Monoid.
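
A simplified sketch of why `mempty` can replace `emptyMediaBag`. This is not pandoc's actual definition: the `MediaBag` below wraps a plain map from paths to `String` contents, and the `insertMedia` and `lookupMedia` signatures are reduced stand-ins chosen for illustration.

```haskell
import qualified Data.Map as M

-- Simplified stand-in: a map from file path to contents.
newtype MediaBag = MediaBag (M.Map FilePath String)
  deriving (Eq, Show)

-- Combining two bags takes the union; on a path clash the left bag wins,
-- because Data.Map's union is left-biased.
instance Semigroup MediaBag where
  MediaBag a <> MediaBag b = MediaBag (M.union a b)

-- The empty bag is the Monoid identity, so no separate constructor is needed.
instance Monoid MediaBag where
  mempty = MediaBag M.empty

-- Reduced signature for illustration only.
insertMedia :: FilePath -> String -> MediaBag -> MediaBag
insertMedia path contents (MediaBag m) = MediaBag (M.insert path contents m)

-- Reduced signature for illustration only.
lookupMedia :: FilePath -> MediaBag -> Maybe String
lookupMedia path (MediaBag m) = M.lookup path m
```

With these definitions, a call site that used to start from `emptyMediaBag` can start from `mempty` instead, for example `insertMedia "a.png" "data" mempty`.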
|
|
Shared now exports functions for interacting with a MediaBag:
- `emptyMediaBag`
- `lookupMedia`
- `insertMedia`
- `mediaDirectory`
- `extractMediaBag`
|
|
Get latest modification time.
|
|
Image data will now be put in a media bag map, which will be output
along with the pandoc output.
|
|
This will make paragraphs styled with `Author`, `Title`, `Subtitle`,
`Date`, and `Abstract` into pandoc metavalues, rather than text. The
implementation only takes those elements from the beginning of the
document (ignoring empty paragraphs).
Multiple paragraphs in the `Author` style will be made into a metaList,
one paragraph per item. Hard linebreaks (shift-return) in the paragraph
will be maintained, and can be used for institution, email, etc.
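
The behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the reader's actual code: `Para`, `MetaVal`, `splitMeta`, and `buildMeta` are hypothetical names, paragraphs are reduced to a style name plus plain text, and hard line breaks are not modeled.

```haskell
import Data.Char (isSpace)
import qualified Data.Map as M

-- Hypothetical simplified paragraph: a style name and its text.
data Para = Para { paraStyle :: String, paraText :: String }
  deriving (Eq, Show)

-- Simplified stand-in for pandoc's metadata values.
data MetaVal = MetaText String | MetaItems [String]
  deriving (Eq, Show)

metaStyles :: [String]
metaStyles = ["Author", "Title", "Subtitle", "Date", "Abstract"]

isEmptyPara :: Para -> Bool
isEmptyPara = all isSpace . paraText

-- Split a document into its leading metadata paragraphs and the body.
-- Empty paragraphs inside the leading run are dropped; the first non-empty
-- paragraph in any other style ends the metadata section.
splitMeta :: [Para] -> ([Para], [Para])
splitMeta (p : ps)
  | isEmptyPara p = splitMeta ps
  | paraStyle p `elem` metaStyles =
      let (ms, body) = splitMeta ps in (p : ms, body)
splitMeta ps = ([], ps)

-- Build the metadata map. Author paragraphs accumulate into a list in
-- document order; for other styles the first paragraph wins.
buildMeta :: [Para] -> M.Map String MetaVal
buildMeta = foldr add M.empty
  where
    add (Para "Author" t) = M.insertWith merge "Author" (MetaItems [t])
    add (Para style t)    = M.insert style (MetaText t)
    merge (MetaItems xs) (MetaItems ys) = MetaItems (xs ++ ys)
    merge new _                         = new
```

A `Date` paragraph that appears after ordinary body text stays in the body, since only the leading run is inspected.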
|
|
Generalised more in Parsing.hs to enable the use of custom state
|
|
|
|
|
|
|
|
http://txt2tags.org/
There are two points which currently do not match the official
implementation.
1. In the official implementation, lists cannot be nested as in the
following example, but this reader interprets it as a bullet list whose
first item is a numbered list.
```
- + This is not a list
```
2. The specification describes how URIs automatically become links.
Unfortunately, as is often the case, their definition of URI is not
clear. I tried three solutions but was unsure about which to adopt.
* Using isURI from Network.URI, this matches far too many strings and is
therefore unsuitable
* Using uri from Text.Pandoc.Shared, this doesn't match all strings that
the reference implementation matches
* Try to simulate the regex which is used in the native code
I went with the third approach, but it is not perfect: for example,
trailing punctuation is captured in URLs.
|