|
The parser had been converting both footnotes and endnotes into footnotes.
This isn't a problem for the output, because pandoc collapses them anyway,
but the parser should maintain as much of the docx structure as possible,
and let the top-level reader worry about how to translate it into Pandoc.
(This would be an issue when, as is planned, the docx parser spins off into
its own module.)
The output is the same, so no test change is required.
|
|
All other underlines are ignored.
|
|
Moved `MediaBag` definition and functions from Shared:
`lookupMedia`, `mediaDirectory`, `insertMedia`, `extractMediaBag`.
Removed `emptyMediaBag`; use `mempty` instead, since `MediaBag`
is a Monoid.
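As a rough illustration of why `mempty` can replace `emptyMediaBag`, here is a minimal sketch. It assumes a much simplified `MediaBag` (an association list from path to a `String`), which is not the real definition; the simplified signatures of `insertMedia` and `lookupMedia` are likewise assumptions made for the example.

```haskell
-- Simplified stand-in, NOT pandoc's real type: each path maps to a String.
newtype MediaBag = MediaBag [(FilePath, String)]
  deriving (Eq, Show)

-- Left-biased union: entries in the first bag win on duplicate paths.
instance Semigroup MediaBag where
  MediaBag a <> MediaBag b =
    MediaBag (a ++ filter (\(path, _) -> path `notElem` map fst a) b)

-- The Monoid instance supplies the empty bag as `mempty`.
instance Monoid MediaBag where
  mempty = MediaBag []

-- Insert or replace the entry for a path (simplified signature).
insertMedia :: FilePath -> String -> MediaBag -> MediaBag
insertMedia path contents (MediaBag entries) =
  MediaBag ((path, contents) : filter ((/= path) . fst) entries)

-- Look up the contents stored for a path (simplified signature).
lookupMedia :: FilePath -> MediaBag -> Maybe String
lookupMedia path (MediaBag entries) = lookup path entries

-- Where older code started from `emptyMediaBag`, `mempty` now does the job.
exampleBag :: MediaBag
exampleBag = insertMedia "media/image1.png" "<png bytes>" mempty
```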
|
|
Shared now exports functions for interacting with a MediaBag:
- `emptyMediaBag`
- `lookupMedia`
- `insertMedia`
- `mediaDirectory`
- `extractMediaBag`
|
|
Get latest modification time.
|
|
Image data will now be put in a media bag map, which will be output
along with the pandoc output.
|
|
This will make paragraphs styled with `Author`, `Title`, `Subtitle`,
`Date`, and `Abstract` into pandoc metavalues, rather than text. The
implementation only takes those elements from the beginning of the
document (ignoring empty paragraphs).
Multiple paragraphs in the `Author` style will be made into a metaList,
one paragraph per item. Hard linebreaks (shift-return) in the paragraph
will be maintained, and can be used for institution, email, etc.
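The rule described above might be sketched roughly as follows. The `Para` type and the `splitMeta` and `addValue` helpers are hypothetical names made up for this illustration, and paragraph content is reduced to a plain `String`; this is not the reader's actual code.

```haskell
import Data.Char (isSpace)

-- Hypothetical, simplified paragraph: an optional style name plus its text.
data Para = Para { paraStyle :: Maybe String, paraText :: String }
  deriving (Eq, Show)

-- The paragraph styles that are turned into metadata.
metaStyles :: [String]
metaStyles = ["Title", "Subtitle", "Author", "Date", "Abstract"]

-- Consume metadata paragraphs from the start of the document only,
-- skipping empty paragraphs; stop at the first ordinary paragraph.
splitMeta :: [Para] -> ([(String, [String])], [Para])
splitMeta = go []
  where
    go acc (p : ps)
      | all isSpace (paraText p) = go acc ps
      | Just style <- paraStyle p
      , style `elem` metaStyles = go (addValue style (paraText p) acc) ps
    go acc rest = (acc, rest)

-- Append a value under its key, so repeated `Author` paragraphs form a list.
addValue :: String -> String -> [(String, [String])] -> [(String, [String])]
addValue key value [] = [(key, [value])]
addValue key value ((k, vs) : rest)
  | key == k = (k, vs ++ [value]) : rest
  | otherwise = (k, vs) : addValue key value rest
```

Note that a `Date` paragraph appearing after the body has started is left in the body, matching the "beginning of the document" restriction.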
|
|
Generalised more in Parsing.hs to enable the use of custom state
|
|
http://txt2tags.org/
There are two points which currently do not match the official
implementation.
1. In the official implementation, lists cannot be nested as in the
following example, but the reader interprets it as a bullet list whose
first item is a numbered list.
```
- + This is not a list
```
2. The specification describes how URIs automatically become links.
Unfortunately, as is often the case, its definition of URI is not
clear. I tried three solutions but was unsure about which to adopt.
* Using isURI from Network.URI: this matches far too many strings and is
therefore unsuitable.
* Using uri from Text.Pandoc.Shared: this doesn't match all strings that
the reference implementation matches.
* Trying to simulate the regex which is used in the native code.
I went with the third approach, but it is not perfect; for example,
trailing punctuation is captured in URLs.
|
|
Of course, we can't include structure in the code block, but
this way we at least preserve the text. Closes #1449.
|
|
Closes #1434.
|
|
It now works as in PHP markdown extra. Setting `markdown="1"` on
an outer tag affects all contained tags until it is reversed with
`markdown="0"`. Closes #1378.
Added `stateMarkdownAttribute` to `ParserState`.
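The inheritance rule can be pictured with a small sketch. The `Element` type and `markdownActive` function below are hypothetical and only model the attribute propagation, not the parser itself; the `Bool` argument plays the role of the inherited state.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical element: tag name, optional markdown attribute
-- (Just True for markdown="1", Just False for markdown="0"), and children.
data Element = Element String (Maybe Bool) [Element]

-- Pair every tag name with whether Markdown parsing is active inside it.
-- An explicit attribute overrides the inherited setting for the whole subtree.
markdownActive :: Bool -> Element -> [(String, Bool)]
markdownActive inherited (Element name attr children) =
  (name, active) : concatMap (markdownActive active) children
  where
    active = fromMaybe inherited attr
```

For example, a `div` with `markdown="1"` containing a plain `p` and an `aside` with `markdown="0"` leaves Markdown active in the `p` and inactive in the `aside` and its children.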
|
|
Test case:
<article markdown="1">
*hi*
</article>
Previously gave:
<article markdown="1">
<p><em>hi</em> </article></p>
|
|
* This change brings pandoc's definition list syntax into alignment
with that used in PHP markdown extra and multimarkdown (with the
exception that pandoc is more flexible about the definition markers,
allowing tildes as well as colons).
* Lazily wrapped definitions are now allowed; blank space is required
between list items; and the space before definition is used to
determine whether it is a paragraph or a "plain" element.
* For backwards compatibility, a new extension,
`compact_definition_lists`, has been added that restores the behavior
of pandoc 1.12.x, allowing tight definition lists with no blank space
between items, and disallowing lazy wrapping.
|
|
This gives better results for tight lists. Closes #1437.
An alternative solution would be to use Para everywhere, and
never Plain. I am not sufficiently familiar with org to know
which is best. Thoughts, @tarleb?
|
|
Also removed deprecated readTeXMath.
|
|
Adds support to the org reader for conditionally exporting either the code block,
results block immediately following, both, or neither, depending on the value
of the `:exports` header argument. If no such argument is supplied, the default
org behavior (for most languages) of exporting code is used.
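A minimal sketch of the decision described above, assuming the four values Org accepts for `:exports` (`code`, `results`, `both`, `none`). The `Exports` record and `exportsFor` function are hypothetical names for illustration, not the reader's actual implementation.

```haskell
-- Hypothetical record: which parts of a source block get exported.
data Exports = Exports { exportCode :: Bool, exportResults :: Bool }
  deriving (Eq, Show)

-- Map the optional `:exports` header argument to an export decision.
-- A missing (or, in this sketch, unrecognised) value falls back to code only.
exportsFor :: Maybe String -> Exports
exportsFor (Just "results") = Exports False True
exportsFor (Just "both")    = Exports True True
exportsFor (Just "none")    = Exports False False
exportsFor _                = Exports True False
```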
|
|
Thanks @dubiousjim. Closes #1431.
|
|
If header anchors (bookmarks in a header paragraph) already have an
auto-id, which will happen if they're generated by pandoc, we don't want
to rename it twice, and thus end up with an unnecessary number at the
end. So we add a state value to check if we're in a header. If we are,
we don't rename the bookmark -- wait until we rename it in our header
handling.
|
|
We don't need `updateDState` -- the built-in `modify` works just
fine. And we redefine `withDState` to use modify.
|
|
This properly handles tags that should be self-closing.
Previously `<hr/>` would appear in EPUB output as `<hr></hr>`.
Closes #1420.
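To illustrate the distinction, here is a small sketch of rendering an empty element. The `renderEmptyTag` helper and the exact output spacing are assumptions for the example; `voidElements` lists the HTML elements that never take a closing tag.

```haskell
-- HTML elements that have no content and no closing tag.
voidElements :: [String]
voidElements =
  [ "area", "base", "br", "col", "embed", "hr", "img"
  , "input", "link", "meta", "source", "track", "wbr" ]

-- Hypothetical helper: render an element that has no children.
-- Void elements are written self-closing; others get an explicit end tag.
renderEmptyTag :: String -> String
renderEmptyTag name
  | name `elem` voidElements = "<" ++ name ++ " />"
  | otherwise                = "<" ++ name ++ "></" ++ name ++ ">"
```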
|
|
Improvements to Parsing.hs
|
|
Nicer Docx type
|
|
mtl switched from ErrorT to ExceptT, but we're not sure which mtl we'll
be dealing with. This should make errors work with both.
The main difference (besides the name of the module and the monad
transformer) is that Except doesn't require an instance of an Error
typeclass. So we define that for compatibility. When we switch to a
later mtl, using Control.Monad.Except, we can just erase the instance
declaration, and all should work fine.
|
|
This modifies the Docx type in the parser to avoid all the extra files
(Notes, numbering, etc). A reader monad keeps track of these, and applies
them at the end. The reader monad is stacked with ErrorT to enable better
error-handling than the old Maybes. (Note that the better error handling
isn't really there yet, but it is now possible.)
One long-term goal of these changes is to make it easier to write the Docx
type. This should make it easier to develop a standalone docx package in the
future.
|
|
This function is equivalent to the more general (<*) which is defined in
Control.Applicative. This change makes pandoc code easier to understand for
those not familiar with the codebase.
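The equivalence can be checked with a small sketch. Since the replaced function is not named here, `keepFirst` below is a hypothetical stand-in with the same behaviour (run both actions, keep the first result), and `ReadP` from base is used in place of pandoc's parser type.

```haskell
import Data.Char (isAlpha)
import Text.ParserCombinators.ReadP (ReadP, char, munch1, readP_to_S)

-- Hypothetical stand-in for a hand-rolled combinator:
-- run both actions in order and keep the result of the first.
keepFirst :: Monad m => m a -> m b -> m a
keepFirst first second = first >>= \x -> second >> return x

-- A word followed by a semicolon, written with the custom combinator.
wordCustom :: ReadP String
wordCustom = munch1 isAlpha `keepFirst` char ';'

-- The same parser written with the standard (<*).
wordApplicative :: ReadP String
wordApplicative = munch1 isAlpha <* char ';'

-- Run a parser, returning every (result, remaining input) pair.
parseWith :: ReadP String -> String -> [(String, String)]
parseWith = readP_to_S
```

Both parsers consume the semicolon and return only the word, so `parseWith wordCustom` and `parseWith wordApplicative` agree on any input.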
|
|
This sets `stateInHtmlBlock` to `Just "div"` when we're parsing
an HTML div.
Without this fix, a closing `</div>` tag could be parsed as part
of a list item rather than after the list.
|
|
Closes #1121.
|
|
Semantics should be the same.
|
|
- We no longer include trailing spaces and newlines in the
raw blocks.
- We look for closing tags for elements (but without backtracking).
- Each block-level tag is its own RawBlock; we no longer try to
consolidate them (though `--normalize` will do so).
Closes #1330.
|
|
- Added `audio` and `source` in `eitherBlockOrInline`.
- Moved `video`, `svg`, `progress`, `script`, and `noscript` from
`blockTags` to `eitherBlockOrInline`.
- `map` and `object` were mistakenly in both lists; they have been removed
from `blockTags`.
|
|
This is a first stab at writing out equations in LaTeX based on
OMML equations in Word. There are some glitches: Unicode chars not known to
LaTeX are silently skipped, and functions (such as `\oiiint`) not in the
standard LaTeX packages are inserted, which can lead to PDF compilation
errors (depending, of course, on your preamble).
Adding, for example, `\usepackage[charter]{mathdesign}` to the preamble will
allow you to use most of the more esoteric functions.
|
|
This will allow us to deal with Unicode characters from Word equations. This
part of the process will need to continue to be improved.
|
|
This gets rid of `divAttrToContainers`: an internal convenience function
which had become pretty inconvenient. Rather than converting classes and
indentations to string lists and back, we deal with the `pPr` attribute
directly.
|
|
Fix hanging indent behavior
|
|
Here, when hanging indents are greater than or equal to left indents, we
don't set it to block quote. Such indents are frequently used in
academic bibliographies. (Thanks to Caleb McDaniel.)
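A minimal sketch of this rule follows. The `ParIndent` record and `isBlockQuote` function are hypothetical names, and the handling of paragraphs with no hanging indent (block quote whenever the left indent is positive) is an assumption added for completeness.

```haskell
-- Hypothetical record holding the indentation read from a paragraph.
data ParIndent = ParIndent
  { leftIndent    :: Maybe Integer
  , hangingIndent :: Maybe Integer
  }

-- A paragraph counts as a block quote only if it is indented on the left
-- and its hanging indent is smaller than that left indent.
isBlockQuote :: ParIndent -> Bool
isBlockQuote (ParIndent (Just left) (Just hanging)) = left > 0 && hanging < left
isBlockQuote (ParIndent (Just left) Nothing)        = left > 0  -- assumption
isBlockQuote _                                      = False
```

A bibliography entry with a left indent of 720 and a hanging indent of 720 is therefore left as an ordinary paragraph.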
|
|
This lets us keep more information about the indentation, and act
accordingly in the reader.
|
|
Previously, a fresh state was created for the purpose of updating. In
the future, when there is more than one field in the state, this
obviously won't work.
|
|
Previously, only those with an anchor got an auto id. Now, all do, which
puts it in line with pandoc's markdown extension.
|
|
Record relationship between original id and auto id, so we can fix links
after.
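The later link fix-up might look roughly like this. `AnchorMap` and `fixLink` are hypothetical names, and the map is reduced to an association list from original id to auto id; the sample ids in the usage note are made up.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical record of renamed anchors: (original id, auto id).
type AnchorMap = [(String, String)]

-- Rewrite an internal link target to the auto id recorded for it.
-- External URLs and unknown anchors are returned unchanged.
fixLink :: AnchorMap -> String -> String
fixLink anchors ('#' : ident) = '#' : fromMaybe ident (lookup ident anchors)
fixLink _ url = url
```

With a map containing `("_Toc123", "introduction")`, the target `#_Toc123` would be rewritten to `#introduction`.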
|
|
In preparation for auto ids.
|
|
Using pattern guard, in preparation for doing some more complicated
stuff with it (recording header anchors, so we can change them to auto
ids.)
|