Age | Commit message | Author | Files | Lines |
|
[odt] images parser
|
|
|
|
@tarleb this is an interesting one, see the build log in
https://travis-ci.org/jgm/pandoc/jobs/168612017
It only failed on ghc 7.8; I think this must have to do with
the change making Applicative a superclass of Monad, hence
this change.
|
|
When creating an anchor element we were adding its representation
as well as the original content, leading to text duplication.
|
|
|
|
A frame can contain other frames with text boxes.
This had not been considered before, which meant that
the whole construction of images was broken in those
cases. Also, the captions were fixed/ignored.
|
|
RST requires a space before a footnote marker. We discard those spaces
so that footnotes will be adjacent to the text that comes before
them. This is in line with what rst2latex does. rst2html does not discard
the space, but its html output is different from pandoc's, so this seems
the most semantically correct approach.
Closes #3163
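
The space handling described above can be sketched as follows. This is only an illustration: the `Inline` type is a simplified, hypothetical stand-in for pandoc's real type, and `dropSpaceBeforeNote` is an invented name, not the function used in the reader.

```haskell
-- Hypothetical, simplified stand-in for pandoc's Inline type.
data Inline = Str String | Space | Note String
  deriving (Eq, Show)

-- Drop every Space that directly precedes a footnote marker, so the
-- note ends up adjacent to the preceding text. Other spaces are kept.
dropSpaceBeforeNote :: [Inline] -> [Inline]
dropSpaceBeforeNote = foldr step []
  where
    step Space acc@(Note _ : _) = acc      -- discard the space before a note
    step x     acc              = x : acc  -- keep everything else
```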
|
|
A `#+CAPTION` attribute before an image is enough to turn an image into a
figure. This wasn't the case because the `parseFromString` function, which
processes the caption value, would fail on empty values. Adding a newline
character to the caption value fixes this.
Fixes: #3161
|
|
[ODT Parser] Include list's starting value
|
|
Review revealed that we didn't handle the case
when the starting point is an empty string. While
this is not a valid .odt file, we simply added
a special case to deal with it.
Also added tests for the new feature.
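
A minimal sketch of the fallback, assuming the start value arrives as an optional attribute string. The name `listStart` and the `Maybe String` input are assumptions made for illustration, not the parser's actual code.

```haskell
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- Hypothetical helper: interpret the raw start-value attribute of a
-- list style. A missing, empty, or non-numeric value falls back to 1.
listStart :: Maybe String -> Int
listStart raw = fromMaybe 1 (raw >>= readMaybe)
```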
|
|
Markup features that treat lines as a distinctive part of the markup are read
into `LineBlock` elements. This currently means line blocks in reStructuredText
and Markdown (the latter only if the `line_block` extension is enabled), the
`linegroup`/`line` combination from the DocBook 5.1 working draft, and Org-mode
`VERSE` blocks.
|
|
Previously the starting value of list items was
hardcoded to 1. In reality, ODT's list style definition can
provide a new starting value in one of its attributes.
Writers already handle the modified start value, so there is
no need to change anything in that area.
|
|
Highly influenced by the docx support, refactored
some code to stay DRY (avoid repetition).
|
|
An empty verse line should not result in `Str ""` but in `mempty`.
|
|
`foo bar.jpg` becomes `foo_bar.jpg`. This was already done
for internal links, but it also needs to happen for images.
Closes #3052.
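
The renaming itself is a plain character substitution. A sketch, with `underscoreSpaces` as a hypothetical name rather than the writer's actual helper:

```haskell
-- Hypothetical helper: replace each space in a file name with an
-- underscore, leaving all other characters untouched.
underscoreSpaces :: String -> String
underscoreSpaces = map (\c -> if c == ' ' then '_' else c)
```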
|
|
See #168.
Text.Pandoc.Options.Extension has a new constructor `Ext_bracketed_spans`,
which is enabled by default in pandoc's Markdown.
|
|
We already lower-bound tagsoup at 0.13.7, which means we were always
running the compatibility layer (it was conditional on min value
0.13). Better to just use `lookupEntity` from the library directly, and
convert a string to a char if need be.
|
|
directory 1.1 depends on base 4.5 (ghc 7.4) which we are no longer
supporting. So we don't have to use a compatibility layer for it.
|
|
|
|
Some source files keep imports in tidy groups. Changing
`Text.Pandoc.Compat.Monoid` to `Data.Monoid` could upset that. This
restores tidiness.
|
|
This was only necessary for GHC versions with base below 4.5
(i.e., ghc < 7.4).
|
|
Sections with the `unnumbered` property should, as the name implies, be
excluded from the automatic numbering of sections provided by some output
formats. The Pandoc convention for this is to add an "unnumbered" class
to the header. The reader treats properties as key-value pairs by
default, so a special case is added to translate the above property to a
class instead.
Closes #3095.
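
One way to picture the special case is shown below. The name `splitUnnumbered`, the `Properties` alias, and the rule that any value other than `nil` counts as set are all assumptions for this sketch. The reader's real logic may differ.

```haskell
import Data.Char (toLower)

-- Hypothetical representation of a header's property drawer.
type Properties = [(String, String)]

-- Split properties into header classes and remaining key-value pairs.
-- Assumption: the property counts as set unless its value is "nil".
splitUnnumbered :: Properties -> ([String], Properties)
splitUnnumbered props = (classes, keyValues)
  where
    isKey (k, _)   = map toLower k == "unnumbered"
    isSet p@(_, v) = isKey p && map toLower v /= "nil"
    classes        = [ "unnumbered" | any isSet props ]
    keyValues      = filter (not . isKey) props
```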
|
|
The last attempt to make 7.8 happy made 7.10 unhappy. So we need some
conditional logic to appease all versions.
|
|
The GHC 7.8 build was erroring without it.
|
|
|
|
|
|
The `creator` option controls whether the creator meta-field should be
included in the final markup. Setting `#+OPTIONS: creator:nil` will
drop the creator field from the final meta-data output.
Org-mode recognizes the special value `comment` for this field, causing
the creator to be included in a comment. This is difficult to translate
to Pandoc internals and is hence interpreted the same as other truish
values (i.e. the meta field is kept if it's present).
|
|
The `email` option controls whether the email meta-field should be
included in the final markup. Setting `#+OPTIONS: email:nil` will drop
the email field from the final meta-data output.
|
|
The `author` option controls whether the author should be included in
the final markup. Setting `#+OPTIONS: author:nil` will drop the author
from the final meta-data output.
|
|
HTML-specific head content can be defined in `#+HTML_head` lines. They
are parsed as format-specific inlines to ensure that they will only show
up in HTML output.
|
|
|
|
|
|
LaTeX-specific header commands can be defined in `#+LaTeX_header` lines.
They are parsed as format-specific inlines to ensure that they will only
show up in LaTeX output.
|
|
The last meta-line of any given type is the significant line.
Previously the value of the first line was kept, even if more lines of
the same type were encountered.
|
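The "last line wins" rule above can be sketched over a plain association list. `lastWins` is a hypothetical name, and the real reader stores meta values differently.

```haskell
-- Keep one value per key. When a key occurs again, the later value
-- replaces the earlier one and takes the later position in the result.
lastWins :: Eq k => [(k, v)] -> [(k, v)]
lastWins = foldl put []
  where
    put acc (k, v) = filter ((/= k) . fst) acc ++ [(k, v)]
```
|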
|
Multiple authors can be specified in the `#+AUTHOR` meta line if they
are given as a comma-separated list.
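
A sketch of the splitting step, working on the raw string for simplicity. `splitAuthors` is a hypothetical helper. The reader itself operates on parsed inlines rather than plain strings.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Split an author line on commas, trim surrounding whitespace, and
-- drop empty entries.
splitAuthors :: String -> [String]
splitAuthors = filter (not . null) . map trim . splitOn ','
  where
    trim = dropWhileEnd isSpace . dropWhile isSpace
    splitOn sep s = case break (== sep) s of
      (chunk, [])       -> [chunk]
      (chunk, _ : rest) -> chunk : splitOn sep rest
```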
|
|
Most meta-keys should be read as normal string values; only a few are
interpreted as marked-up text.
|
|
Parsing of meta-data is well separable from other block parsing tasks.
Moved it into a new module to get small files and clearly arranged code.
|
|
|
|
Previously we only used the first anchor span to affect header ids. This
allows us to use all the anchor spans in a header, whether they're
nested or not.
Along with 62882f97, this closes #3088.
|
|
Previously we always generated an id for headers (since they wouldn't
bring one from Docx). Now we let it use an existing one if
possible. This should allow us to recurse through anchor spans.
|
|
Previously, we would only be able to figure out internal links to a
header in a docx if the anchor span was empty. We change that to read
the inlines out of the first anchor span in a header.
This still leaves another problem: what to do if there are multiple
anchor spans in a header. That will be addressed in a future commit.
|
|
We're going to want `getMap` in the Docx Writer.
|
|
The functions `isElem` and `elemName` (defined in Docx/Util.hs) make the
code a lot cleaner than the original XML.Light functions, but they had
been used inconsistently. This puts them in wherever applicable.
|
|
LaTeX reader: drop duplicate `*` in bibtexKeyChars
|
|
Org reader: preserve indentation of verse lines
|
|
Image sources, such as those in plain images, image links, or figures,
must be proper URIs or relative file paths to be recognized as images.
This restriction is now enforced for all image sources.
This also fixes the reader's usage of uncleaned image sources, which led
to `file:` prefixes not being deleted from figure images
(e.g. `[[file:image.jpg]]` leading to a broken image
`<img src="file:image.jpg"/>`).
Thanks to @bsag for noticing this bug.
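
The prefix cleanup can be illustrated like this. `cleanImageSource` is a hypothetical name, and the sketch covers only the `file:` case mentioned above, not the URI and path validation.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- Remove a leading "file:" from an image source, if present.
-- Sources without the prefix are returned unchanged.
cleanImageSource :: String -> String
cleanImageSource src = fromMaybe src (stripPrefix "file:" src)
```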
|
|
Leading spaces in verse lines are converted to non-breaking spaces, so
indentation is preserved.
This fixes #3064.
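
As a rough sketch on plain strings (the reader works on inlines, and `preserveIndent` is a hypothetical name), the conversion amounts to:

```haskell
-- Replace each leading space of a verse line with a non-breaking
-- space (U+00A0). Spaces later in the line are left as they are.
preserveIndent :: String -> String
preserveIndent line = map (const '\160') indent ++ rest
  where
    (indent, rest) = span (== ' ') line
```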
|
|
They are meant to be interpreted as literal text in textile.
Closes #3042.
|
|
Previously these yielded strings of alternating Code and Space
elements; we now incorporate the spaces into the Code. Emphasis
etc. is still possible inside these.
Closes #3055.
|
|
Previously an unquoted attribute value in a table row
could cause parsing problems.
Fixes #3053 (well, proper rowspans and colspans aren't
created, but that's a bigger limitation with the current
Pandoc document model for tables).
|