Age | Commit message | Author | Files | Lines |
|
Previously, if a URL had an anchor, such as
http://johnmacfarlane.net/pandoc/README.html#synopsis
the reader would incorrectly identify it as an internal link
and return "#synopsis" for the link in output.
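A minimal sketch of the distinction, written in Python rather than pandoc's Haskell. `is_internal_link` is a hypothetical helper, not the reader's actual function; it assumes a link counts as internal only when the reference is nothing but a fragment:

```python
from urllib.parse import urlsplit


def is_internal_link(href: str) -> bool:
    """Return True only for fragment-only references such as '#synopsis'."""
    parts = urlsplit(href)
    # An internal link has a fragment and nothing else:
    # no scheme, no host, no path, no query.
    return bool(parts.fragment) and not (
        parts.scheme or parts.netloc or parts.path or parts.query
    )
```

Under this rule the README URL above keeps its full target, while a bare `#synopsis` is still treated as internal.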
|
|
Fix 'Ext_lists_without_preceding_blankline' bug.
|
|
While empty links are not allowed in Emacs org-mode, Pandoc org-mode
should support them: gitit relies on empty links as they are used to
create wiki links.
Fixes jgm/gitit#471
|
|
The org reader was too restrictive when parsing links: some relative
links and links to files given as absolute paths were not recognized
correctly. The org reader's link parsing function was amended to handle
such cases properly.
This fixes #1741
|
|
|
|
This patch builds a tree of paragraph styles, then checks whether a
paragraph's style.styleId or style/name.val matches predetermined patterns.
Works with "Heading#" (name.val="heading #") for headings and
"Quote"|"BlockQuote"|"BlockQuotation" (name.val="Quote"|"Block Text")
for block quotes.
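The matching step can be sketched as follows, in Python rather than the reader's Haskell. `classify_style` and the constants are hypothetical names; the patterns are the ones listed above, exact case-sensitive matching is an assumption, and the walk up the style tree is left out:

```python
import re
from typing import Optional, Tuple

# Patterns taken from the message above; matching them exactly
# (case-sensitive, whole string) is an assumption of this sketch.
HEADING_ID = re.compile(r"^Heading(\d+)$")
HEADING_NAME = re.compile(r"^heading (\d+)$")
QUOTE_IDS = {"Quote", "BlockQuote", "BlockQuotation"}
QUOTE_NAMES = {"Quote", "Block Text"}


def classify_style(style_id: str, name_val: str) -> Tuple[str, Optional[int]]:
    """Classify one style by its styleId or name.val (no inheritance here)."""
    m = HEADING_ID.match(style_id) or HEADING_NAME.match(name_val)
    if m:
        return ("heading", int(m.group(1)))
    if style_id in QUOTE_IDS or name_val in QUOTE_NAMES:
        return ("blockquote", None)
    return ("paragraph", None)
```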
|
|
Org supports special symbols which can be included using LaTeX syntax,
but are actually MathML entities. Examples include
`\nbsp` (non-breaking space), `\Aacute` (the letter A with accent acute)
or `\copy` (the copyright sign ©).
This fixes #1657.
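A rough illustration of the substitution, in Python. It uses the HTML entity table from the standard library as a stand-in for Org's own entity list (an assumption; the two lists are not identical), and `expand_entities` is a hypothetical name:

```python
import re
from html.entities import name2codepoint

# A backslash followed by letters, e.g. \copy or \Aacute.
ENTITY = re.compile(r"\\([A-Za-z]+)")


def expand_entities(text: str) -> str:
    """Replace known \\name entities with their character; keep the rest."""
    def repl(match):
        name = match.group(1)
        if name in name2codepoint:
            return chr(name2codepoint[name])
        return match.group(0)  # unknown names stay as written

    return ENTITY.sub(repl, text)
```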
|
|
Formerly `pandoc -f markdown-fancy_lists+startnum` did not work
properly.
|
|
Respect indent when parsing Org bullet lists
|
|
Org reader: fix rules for emphasis recognition
|
|
Document trees under a header starting with the word `COMMENT` are
comment trees and should not be exported. Those trees are dropped
silently.
This closes #1678.
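The dropping rule can be sketched on raw lines as follows. This is a simplified Python illustration, not the reader's code: `drop_comment_trees` is a hypothetical name, and TODO keywords or tags in the header are ignored:

```python
from typing import List


def drop_comment_trees(lines: List[str]) -> List[str]:
    """Remove every tree whose header starts with the word COMMENT."""
    kept = []
    skip_level = None  # level of the COMMENT header currently being dropped
    for line in lines:
        stars = len(line) - len(line.lstrip("*"))
        is_header = stars > 0 and line[stars:stars + 1] == " "
        if is_header:
            # A header at the same or a shallower level ends the comment tree.
            if skip_level is not None and stars <= skip_level:
                skip_level = None
            if skip_level is None:
                title = line[stars:].split()
                if title and title[0] == "COMMENT":
                    skip_level = stars
        if skip_level is None:
            kept.append(line)
    return kept
```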
|
|
Things like `/hello,/` or `/hi'/` were falsely recognized as emphasized
strings. This is wrong, as `,` and `'` are forbidden border chars and
may not occur on the inner border of emphasized text. This patch
makes the reader match the reference implementation in that it
reads the above strings as plain text.
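The border rule can be sketched like this, in Python. `is_emphasis` is a hypothetical helper that only looks at a single already-delimited token; including whitespace in the forbidden set is an assumption, while `,` and `'` come from the message above:

```python
# `,` and `'` are named in the commit message; whitespace is an assumption.
FORBIDDEN_BORDER = frozenset(" \t\n,'")


def is_emphasis(token: str, marker: str = "/") -> bool:
    """True if token is marker + text + marker with valid inner borders."""
    if len(token) < 3:
        return False
    if not (token.startswith(marker) and token.endswith(marker)):
        return False
    inner = token[1:-1]
    # Neither the first nor the last inner character may be forbidden.
    return inner[0] not in FORBIDDEN_BORDER and inner[-1] not in FORBIDDEN_BORDER
```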
|
|
Tidy up fix for #1650, #1698 as per comments in #1680.
Fix same issue for definition lists with the same method.
|
|
Fixes issue with top-level bullet list parsing.
Previously we would use `many1 spaceChars` rather than respecting
the list's indent level. We also permitted `*` bullets on unindented
lists, which should unambiguously parse as `header 1`.
Combined, this meant headers at a different indent level were
being unwittingly slurped into preceding bullet lists, as per
Issue #1650.
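The two rules can be sketched on single lines as follows. This is a Python illustration with hypothetical names (`classify`, `continues_list`), not the parser combinators used in the reader:

```python
import re
from typing import Optional, Tuple

HEADER = re.compile(r"^(\*+) +\S")         # stars at column 0 are a header
BULLET = re.compile(r"^( *)([-+*]) +\S")   # otherwise a bullet at some indent


def classify(line: str) -> Tuple[str, Optional[int]]:
    """Return ('header', level), ('bullet', indent) or ('text', None)."""
    m = HEADER.match(line)
    if m:
        return ("header", len(m.group(1)))
    m = BULLET.match(line)
    if m:
        return ("bullet", len(m.group(1)))
    return ("text", None)


def continues_list(line: str, list_indent: int) -> bool:
    """A line extends a list only if it is a bullet at the list's indent."""
    kind, indent = classify(line)
    return kind == "bullet" and indent == list_indent
```

With this check an unindented `* Header` line is never pulled into a preceding list.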
|
|
Adding inlineCommands
|
|
Fix path-slashes inside archive for windows
|
|
Closes #1649
|
|
* Fixes #1636.
* Adds a test.
|
|
Closes #1620
|
|
When we encounter one of the polyglot header styles, we want to remove
that from the par styles after we convert to a header. To do that, we
have to keep track of the style name, and remove it appropriately.
|
|
We're just keeping a list of header formats that different languages use
as their default styles. At the moment, we have English, German, Danish,
and French. We can continue to add to this.
This is simpler than parsing the styles file, and perhaps less
error-prone, since there seems to be some variation, even within a
language, in how a style file will define headers.
|
|
|
|
This allows us to emphasize at the beginning of a new paragraph (or, in
general, after blank lines).
|
|
There could be new top-level headers after making lists, so we have to
rewrite links after that.
|
|
When users number their headers, Word understands that as a single-item
enumerated list. We make the assumption that such a list is, in fact, a header.
|
|
Don't use os-sensitive "combine", since we always want the paths in our
zip-archive to use forward-slashes.
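The same idea in Python terms, as an illustration only: `posixpath.join` always uses forward slashes, while `os.path.join` follows the host OS. `archive_path`, `build_archive` and the file names are hypothetical:

```python
import io
import posixpath
import zipfile


def archive_path(*parts: str) -> str:
    """Join path components with '/', regardless of the host OS."""
    return posixpath.join(*parts)


def build_archive(files: dict) -> bytes:
    """Write {(component, ...): bytes} entries into an in-memory zip."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for parts, data in files.items():
            zf.writestr(archive_path(*parts), data)
    return buf.getvalue()
```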
|
|
Previously text that ended a div would be parsed as Plain
unless there was a blank line before the closing div tag.
Test case:
<div class="first">
This is a paragraph.
This is another paragraph.
</div>
Closes #1591.
|
|
Conflicts:
src/Text/Pandoc/Writers/Docx.hs
|
|
This makes the docx reader's native output fit with the way the markdown
reader understands its markdown output. I.e., as far as table cells go:
docx -> native == docx -> native -> markdown -> native
(This identity isn't true for other things outside of table cells, of
course).
|
|
the start of the line.
|
|
|
|
|
|
The header is now parsed as meta information. The first line is the
`title`, the second is the `author` and third line is the `date`.
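The mapping can be sketched as follows, in Python. `parse_header` is a hypothetical name, and skipping empty header lines is an assumption of this sketch:

```python
from typing import Dict, List


def parse_header(lines: List[str]) -> Dict[str, str]:
    """Map the first three lines to title, author and date metadata."""
    keys = ("title", "author", "date")
    # zip stops after three lines; empty lines leave the field unset.
    return {key: line.strip() for key, line in zip(keys, lines) if line.strip()}
```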
|
|
|
|
Docx reader: parsing styles
|
|
Previously a section like this would be enclosed in a paragraph,
with RawInline for the video tags (since video is a tag that can
be either block or inline):
<video controls="controls">
<source src="../videos/test.mp4" type="video/mp4" />
<source src="../videos/test.webm" type="video/webm" />
<p>
The videos can not be played back on your system.<br/>
Try viewing on Youtube (requires Internet connection):
<a href="http://youtu.be/etE5urBps_w">Relative Velocity on
Youtube</a>.
</p>
</video>
This change will cause the video and source tags to be parsed
as RawBlock instead, giving better output.
The general change is this: when we're parsing a "plain" sequence
of inlines, we don't parse anything that COULD be a block-level tag.
|
|
|
|
We no longer need the explicit lists since we're deriving them from the
ground up.
|
|
This is the only one so far. We'll add others as they show up.
|
|
We now no longer check against explicit styles.
|
|
We always favor an explicit positive or negative in a style in a
descendant, and only turn to the ancestor if nothing is set.
We also introduce an (empty) list of styles that are black-listed. We
won't check them. (Think underlines in hyperlinks).
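A small sketch of that lookup, in Python. `Style`, `BLACKLIST` and `resolve` are hypothetical names; treating a black-listed style as skipped while the search continues in its ancestors is an assumption:

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional


@dataclass
class Style:
    name: str
    # Only explicit settings are stored: True = on, False = off.
    props: Dict[str, bool] = field(default_factory=dict)
    parent: Optional["Style"] = None


BLACKLIST: FrozenSet[str] = frozenset()  # empty for now, as described above


def resolve(style: Optional[Style], prop: str) -> Optional[bool]:
    """Return the nearest explicit value of prop, or None if never set."""
    while style is not None:
        if style.name not in BLACKLIST and prop in style.props:
            return style.props[prop]  # explicit on or off wins
        style = style.parent          # nothing set here: ask the ancestor
    return None
```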
|
|
Two points here: (1) We're going bottom-up, from styles not based on
anything, to avoid circular dependencies or any other sort of
maliciousness/incompetence. And (2) each style points to its
parent. That way, we don't need the whole tree to pass a style over to
Docx.hs
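The bottom-up construction might look like this in Python. `build_style_map` is a hypothetical name, and the input is assumed to be a plain mapping from style id to the id it is based on:

```python
from typing import Dict, Optional


def build_style_map(based_on: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Map each reachable style id to its parent id, starting from roots."""
    # Start with styles that are not based on anything.
    resolved = {sid: None for sid, parent in based_on.items() if parent is None}
    frontier = set(resolved)
    while frontier:
        next_frontier = set()
        for sid, parent in based_on.items():
            # Attach a style only once its parent is already resolved.
            if sid not in resolved and parent in frontier:
                resolved[sid] = parent
                next_frontier.add(sid)
        frontier = next_frontier
    return resolved
```

Styles that sit in a cycle, or that are based on a missing style, are never reached from a root and so never enter the map.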
|
|
|
|
|
|
This will make it easier to build the style map from the bottom up (to
avoid any infinite references).
|
|
Just discards info at the moment, so at least it works the same.
|
|
We want to be able to read user-defined styles. Eventually we'll be able
to figure out styles in terms of inheritance as well. The actual
cascading will happen in the docx reader.
|
|
In docx, super- and subscript are attributes of Vertalign. It makes more
sense to follow this, and have different possible values of Vertalign in
runStyle. This is mainly a preparatory step for real style parsing,
since it can distinguish between vertical align being explicitly turned
off and it not being set.
In addition, it makes parsing a bit clearer, and makes sure we don't do
docx-impossible things like being simultaneously super and sub.
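The shape of the idea, sketched in Python with hypothetical names (`VertAlign`, `RunStyle`, `effective_vert_align`): one optional three-valued field replaces two independent flags, so "unset" and "explicitly baseline" stay distinct and super plus sub cannot both be set:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class VertAlign(Enum):
    BASELINE = "baseline"
    SUPERSCRIPT = "superscript"
    SUBSCRIPT = "subscript"


@dataclass
class RunStyle:
    # None means "not set"; BASELINE means "explicitly turned off".
    vert_align: Optional[VertAlign] = None


def effective_vert_align(own: RunStyle, inherited: RunStyle) -> VertAlign:
    """Prefer the run's own explicit value, then the inherited one."""
    if own.vert_align is not None:
        return own.vert_align
    if inherited.vert_align is not None:
        return inherited.vert_align
    return VertAlign.BASELINE
```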
|
|
|
|
Functions like runElemsToInlines and parPartsToInlines are just defined
in terms of concatenating and mapping their singular
versions (e.g. `runElemToInlines`). Having two functions with almost
identical names makes it easier to introduce errors. It's easy enough to
just concat and map inline, and it makes it clearer what is going on in
the code.
|