Age | Commit message | Author | Files | Lines |
|
|
|
Improvements to Parsing.hs
|
|
Nicer Docx type
|
|
mtl switched from ErrorT to ExceptT, but we're not sure which mtl we'll
be dealing with. This should make errors work with both.
The main difference (besides the name of the module and the monad
transformer) is that ExceptT doesn't require an instance of the Error
typeclass. So we define that instance for compatibility. When we switch to a
later mtl, using Control.Monad.Except, we can just erase the instance
declaration, and all should work fine.
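
As a rough illustration of the point about instances, the sketch below throws and recovers an error with `ExceptT` using a plain error type that has no `Error` instance. It assumes mtl 2.2.1 or later; `DocxError`, `D`, `lookupPart`, and `runD` are hypothetical names chosen for this example, not necessarily the definitions used in the reader.

```haskell
import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Data.Functor.Identity (Identity, runIdentity)

-- Hypothetical error type; note that no Error instance is declared.
data DocxError = MissingPart String
  deriving (Show, Eq)

-- Hypothetical parser monad: ExceptT layered over Identity.
type D = ExceptT DocxError Identity

-- Look up a named part, failing with a descriptive error if it is absent.
lookupPart :: String -> [(String, String)] -> D String
lookupPart name parts =
  maybe (throwError (MissingPart name)) return (lookup name parts)

-- Unwrap the stack into a plain Either.
runD :: D a -> Either DocxError a
runD = runIdentity . runExceptT
```

With the older `ErrorT`, the same code would additionally need an `instance Error DocxError` declaration, which is the part that can be erased later.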
|
|
This modifies the Docx type in the parser to avoid all the extra files
(Notes, numbering, etc). A reader monad keeps track of these, and applies
them at the end. The reader monad is stacked with ErrorT to enable better
error-handling than the old Maybes. (Note that the better error handling
isn't really there yet, but it is now possible.)
One long-term goal of these changes is to make it easier to write the Docx
type. This should make it easier to develop a standalone docx package in the
future.
|
|
This function is equivalent to the more general (<*), which is defined in
Control.Applicative. This change makes pandoc code easier to understand for
those not familiar with the codebase.
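
A minimal sketch of the equivalence, using `ReadP` from base as a stand-in for pandoc's parser type (an assumption made to keep the example self-contained). The helper `followedBy` is a hypothetical name for a hand-written "run both, keep the first result" combinator.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, char, munch1, readP_to_S)

-- Hypothetical hand-written helper: run p, then q, and keep p's result.
followedBy :: ReadP a -> ReadP b -> ReadP a
followedBy p q = do
  x <- p
  _ <- q
  return x

-- The same parser written with the helper and with (<*).
numberThenSemi, numberThenSemi' :: ReadP String
numberThenSemi  = munch1 isDigit `followedBy` char ';'
numberThenSemi' = munch1 isDigit <* char ';'
```

Both parsers consume the digits and the semicolon but return only the digits, so `readP_to_S` gives the same result for either one.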
|
|
This sets `stateInHtmlBlock` to `Just "div"` when we're parsing
an HTML div.
Without this fix, a closing `</div>` tag could be parsed as part
of a list item rather than after the list.
|
|
Closes #1121.
|
|
Semantics should be the same.
|
|
- We no longer include trailing spaces and newlines in the
raw blocks.
- We look for closing tags for elements (but without backtracking).
- Each block-level tag is its own RawBlock; we no longer try to
consolidate them (though `--normalize` will do so).
Closes #1330.
|
|
- Added `audio` and `source` in `eitherBlockOrInline`.
- Moved `video`, `svg`, `progress`, `script`, `noscript` from
`blockTags` to `eitherBlockOrInline`.
- `map` and `object` were mistakenly in both lists; they have been removed
from `blockTags`.
|
|
This is a first stab at writing out equations in LaTeX based on
OMML equations in Word. There are some glitches: Unicode characters not known to
LaTeX are silently skipped, and functions (such as `\oiiint`) not in the
standard LaTeX packages are inserted, which can lead to PDF compilation
errors (depending, of course, on your preamble).
Adding, for example, `\usepackage[charter]{mathdesign}` to the preamble will
allow you to use most of the more esoteric functions.
|
|
This will allow us to deal with Unicode characters from Word equations. This
part of the process still needs improvement.
|
|
|
|
This gets rid of `divAttrToContainers`: an internal convenience function
which had become pretty inconvenient. Rather than converting classes and
indentations to string lists and back, we deal with the `pPr` attribute
directly.
|
|
Fix hanging indent behavior
|
|
When the hanging indent is greater than or equal to the left indent, we
don't treat the paragraph as a block quote. Such indents are frequently used in
academic bibliographies. (Thanks to Caleb McDaniel.)
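
The rule can be sketched roughly as follows. `ParIndentation` and `isBlockQuote` are hypothetical names, and the requirement that the left indent be positive is an assumption added for the example.

```haskell
-- Hypothetical record of a paragraph's indentation, in arbitrary units.
data ParIndentation = ParIndentation
  { leftIndent    :: Maybe Integer
  , hangingIndent :: Maybe Integer
  }

-- A paragraph counts as a block quote only if it is indented on the left
-- and any hanging indent is strictly smaller than the left indent.
isBlockQuote :: ParIndentation -> Bool
isBlockQuote (ParIndentation (Just left) (Just hang)) = left > 0 && hang < left
isBlockQuote (ParIndentation (Just left) Nothing)     = left > 0
isBlockQuote _                                        = False
```

A bibliography entry with equal left and hanging indents, for instance `ParIndentation (Just 720) (Just 720)`, is therefore left as an ordinary paragraph.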
|
|
This lets us keep more information about the indentation, and act
accordingly in the reader.
|
|
Previously, a fresh state was created for the purpose of updating. In
the future, when there is more than one field in the state, this
obviously won't work.
|
|
Previously, only those with an anchor got an auto id. Now, all do, which
puts it in line with pandoc's markdown extension.
|
|
|
|
Record relationship between original id and auto id, so we can fix links
after.
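
One way such a record could be used afterwards, as a sketch only: a map from original anchors to auto ids, consulted when rewriting internal link targets. `AnchorMap` and `rewriteTarget` are hypothetical names, and the example assumes the containers package is available.

```haskell
import qualified Data.Map as M

-- Hypothetical map from original docx anchors to generated auto ids.
type AnchorMap = M.Map String String

-- Rewrite an internal link target if its anchor was replaced by an auto id;
-- external targets and unknown anchors are returned unchanged.
rewriteTarget :: AnchorMap -> String -> String
rewriteTarget anchors ('#' : old)
  | Just new <- M.lookup old anchors = '#' : new
rewriteTarget _ target = target
```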
|
|
In preparation for auto ids.
|
|
Use a pattern guard, in preparation for doing some more complicated
work with it (recording header anchors, so we can change them to auto
ids).
|
|
Use PatternGuards to get rid of the need for `isJust` and `fromJust`
altogether.
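
The general shape of this refactoring, sketched on a hypothetical lookup function (`headerIdOld` and `headerIdNew` are illustrative names, not code from the reader):

```haskell
import Data.Maybe (fromJust, isJust)

-- Before: test with isJust, then unwrap with the partial fromJust.
headerIdOld :: String -> [(String, String)] -> String
headerIdOld anchor table =
  if isJust (lookup anchor table)
    then fromJust (lookup anchor table)
    else anchor

-- After: a pattern guard tests and binds in one step.
headerIdNew :: String -> [(String, String)] -> String
headerIdNew anchor table
  | Just autoId <- lookup anchor table = autoId
  | otherwise                          = anchor
```

The guard version performs the lookup once and never calls a partial function.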
|
|
It only applies to headers, so we can just apply it when we make a
header.
|
|
This is a ReaderT State stack, which keeps track of some environment info, such
as the options and the docx doc. The state will come in handy in the future,
for a couple of planned features (rewriting the section anchors as auto_idents,
and hopefully smart-quoting).
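
A minimal sketch of what a ReaderT-over-State stack looks like, assuming mtl is available. The record fields, `recordAnchor`, and `runDocxContext` are hypothetical and much smaller than anything a real reader would need.

```haskell
import Control.Monad.Reader (ReaderT, asks, runReaderT)
import Control.Monad.State (State, modify, runState)

-- Hypothetical read-only environment (standing in for options and the docx doc).
data DEnv = DEnv { envIdPrefix :: String }

-- Hypothetical mutable state, e.g. anchors seen so far.
data DState = DState { stSeenAnchors :: [String] }
  deriving (Show, Eq)

type DocxContext = ReaderT DEnv (State DState)

-- Read from the environment and update one field of the existing state.
recordAnchor :: String -> DocxContext String
recordAnchor anchor = do
  prefix <- asks envIdPrefix
  let ident = prefix ++ anchor
  modify (\st -> st { stSeenAnchors = ident : stSeenAnchors st })
  return ident

-- Run the stack with an environment and an initial state.
runDocxContext :: DocxContext a -> DEnv -> DState -> (a, DState)
runDocxContext ctx env = runState (runReaderT ctx env)
```

Updating with record syntax inside `modify` keeps any other fields of the state intact, which matters once the state has more than one field.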
|
|
See #1346.
|
|
Track changes with options
|
|
Remove some redundant ways of dealing with Maybe.
|
|
|
|
mapMaybe does the filtering for us.
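
For illustration, here is the kind of simplification `mapMaybe` allows. `parseNum`, `numsOld`, and `numsNew` are hypothetical names used only in this sketch.

```haskell
import Data.Char (isDigit)
import Data.Maybe (fromJust, isJust, mapMaybe)

-- Hypothetical partial conversion: only non-empty digit strings succeed.
parseNum :: String -> Maybe Int
parseNum s
  | not (null s), all isDigit s = Just (read s)
  | otherwise                   = Nothing

-- Before: map, filter out the failures, then unwrap.
numsOld :: [String] -> [Int]
numsOld = map fromJust . filter isJust . map parseNum

-- After: mapMaybe maps and drops the Nothings in one pass.
numsNew :: [String] -> [Int]
numsNew = mapMaybe parseNum
```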
|
|
This will only read the insertions, and ignore the deletions.
|
|
This is just for the Parse module, reading it into the Docx format. It
still has to be translated into pandoc.
|
|
Insertion and deletion. Dates are just strings for now.
|
|
If a block has an indentation less than or equal to zero, it should not be
treated as a block quote.
|
|
This marks the removal of the final tree-walk in the code. (Though there
is still one in the Lists module.)
|
|
This commit also fixes a problem with the previous code pushes, which
wouldn't allow code blocks to share a div.
|
|
|
|
|
|
Docx rewrite and cleanup (in terms of Reducible typeclass)
|
|
This cleans up the implementation, and cuts down on tree-walking.
Anecdotally, I've seen about a 3-fold speedup.
|
|
This will allow us to get rid of more general functions we no longer need in
the main reader.
|
|
This defines a typeclass `Reducible` which allows us to "reduce" pandoc
Inlines and Blocks, like so:
[Emph [Strong [Str "foo", Space]]] <++> [Strong [Emph [Str "bar"]], Str "baz"] =
[Strong [Emph [Str "foo", Space, Str "bar"], Space, Str "baz"]]
So adjacent formattings and strings are appropriately grouped.
Another set of operators for `(Reducible a) => (Many a)` is also
included.
|
|
This helps when you have two minipages which can't have
blank lines between them.
See #690, #1196.
|
|
The normalizing tests revealed a problem with unformatted spaces, brought about
by `spanTrim`. This fixes it by not trimming the spaces out of spans until they
are in their final form.
|
|
There were some problems with the old str normalization. This fixes those
problems. Also, since it drills down on its own, it only needs to be
mapped over the blocks, not walked over the tree.
|
|
`<span style="font-variant:small-caps;">foo</span>` will be
parsed as a `SmallCaps` inline, and will work in all output
formats that support small caps.
Closes #1360.
|
|
The opening "{{" must be followed by an alphanumeric or ':'.
This prevents the exponential slowdown in #1033.
Closes #1033.
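
The check on the opening delimiter can be sketched as a simple predicate on the input. `startsTemplate` is a hypothetical name, and the real parser presumably expresses this with parser combinators rather than string matching.

```haskell
import Data.Char (isAlphaNum)

-- True only if the input opens with "{{" immediately followed by an
-- alphanumeric character or ':'. Rejecting everything else early means
-- long runs of braces fail fast instead of being retried repeatedly.
startsTemplate :: String -> Bool
startsTemplate ('{' : '{' : c : _) = isAlphaNum c || c == ':'
startsTemplate _                   = False
```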
|
|
|