pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2016-10-17	RST reader: skip whitespace before note.	Jesse Rosenthal	1	-2/+3
	RST requires a space before a footnote marker. We discard those spaces so that footnotes will be adjacent to the text that comes before them. This is in line with what rst2latex does. rst2html does not discard the space, but its HTML output differs from pandoc's, so this seems the most semantically correct approach. Closes #3163.
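The whitespace handling described above can be illustrated with a small Python sketch. It is not pandoc's Haskell parser: the helper name `attach_footnote_refs` and the regular expression are assumptions made for illustration, and only numbered, auto-numbered (`[#]_`, `[#label]_`) and symbol (`[*]_`) footnote references are covered.

```python
import re

# Assumed pattern: spaces or tabs followed by an RST footnote reference.
FOOTNOTE_REF = re.compile(r"[ \t]+(\[(?:\d+|#[\w-]*|\*)\]_)")


def attach_footnote_refs(text):
    """Drop the whitespace before each footnote reference (hypothetical helper)."""
    return FOOTNOTE_REF.sub(r"\1", text)


print(attach_footnote_refs("See the docs [1]_ for details."))
```

The reference stays in place; only the separating whitespace is removed, so the note ends up adjacent to the preceding word.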
2016-10-14	Org reader: allow figure with empty caption	Albert Krewinkel	1	-3/+1
	A `#+CAPTION` attribute before an image is enough to turn an image into a figure. This wasn't the case because the `parseFromString` function, which processes the caption value, would fail on empty values. Adding a newline character to the caption value fixes this. Fixes: #3161
2016-10-14	Merge pull request #3146 from hubertp-lshift/feature/odt-list-start-value	John MacFarlane	2	-13/+21
	[ODT Parser] Include list's starting value
2016-10-14	Added tests and a corner case for starting number	Hubert Plociniczak	1	-0/+1
	Review revealed that we didn't handle the case when the starting point is an empty string. While this is not a valid .odt file, we simply added a special case to deal with it. Also added tests for the new feature.
2016-10-13	Parse line-oriented markup as LineBlock	Albert Krewinkel	4	-9/+9
	Markup features that treat lines as a distinctive part of the markup are read into `LineBlock` elements. This currently means line blocks in reStructuredText and Markdown (the latter only if the `line_blocks` extension is enabled), the `linegroup`/`line` combination from the DocBook 5.1 working draft, and Org-mode `VERSE` blocks.
2016-10-12	[ODT Parser] Include list's starting value	Hubert Plociniczak	2	-13/+20
	Previously the starting value of list items was hardcoded to 1. In fact, an ODT list style definition can provide a different starting value in one of its attributes. Writers already handle the modified start value, so nothing needs to change in that area.
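A minimal Python sketch of the start-value lookup, under stated assumptions: the function name `list_start` is hypothetical, the attribute value is passed in as an optional string, and falling back to 1 for a missing or empty value is an assumption about how the corner case is handled, not a description of the actual ODT parser code.

```python
def list_start(attr_value):
    """Return the list's starting number from a raw attribute value.

    Assumption: a missing (None) or empty value falls back to the old
    hardcoded default of 1.
    """
    if attr_value is None or attr_value.strip() == "":
        return 1
    return int(attr_value)


print(list_start("5"), list_start(""), list_start(None))
```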
2016-10-12	Basic support for images in ODT documents	Hubert Plociniczak	3	-38/+115
	Heavily influenced by the docx support; refactored some code to avoid repetition (DRY).
2016-10-10	Org reader: trim verse lines properly	Albert Krewinkel	1	-2/+4
	An empty verse line should not result in `Str ""` but in `mempty`.
2016-10-02	MediaWiki writer: transform filename with underscores in images.	John MacFarlane	1	-1/+1
	`foo bar.jpg` becomes `foo_bar.jpg`. This was already done for internal links, but it also needs to happen for images. Closes #3052.
2016-09-28	Markdown reader: added bracket syntax for native spans.	John MacFarlane	1	-0/+8
	See #168. Text.Pandoc.Options.Extension has a new constructor `Ext_bracketed_spans`, which is enabled by default in pandoc's Markdown.
2016-09-02	Remove TagSoup compat	Jesse Rosenthal	2	-5/+5
	We already lower-bound tagsoup at 0.13.7, which means we were always running the compatibility layer (it was conditional on min value 0.13). Better to just use `lookupEntity` from the library directly, and convert a string to a char if need be.
2016-09-02	Remove directory compat	Jesse Rosenthal	1	-1/+1
	directory 1.1 depends on base 4.5 (ghc 7.4) which we are no longer supporting. So we don't have to use a compatibility layer for it.
2016-09-02	Remove Text.Pandoc.Compat.Except	Jesse Rosenthal	5	-8/+5

2016-09-02	Fix grouping of imports.	Jesse Rosenthal	7	-7/+8
	Some source files keep imports in tidy groups. Changing `Text.Pandoc.Compat.Monoid` to `Data.Monoid` could upset that. This restores tidiness.
2016-09-02	Remove Compat.Monoid	Jesse Rosenthal	14	-14/+14
	This was only necessary for GHC versions with base below 4.5 (i.e., ghc < 7.4).
2016-08-30	Org reader: respect unnumbered header property	Albert Krewinkel	1	-2/+10
	Sections with the `unnumbered` property should, as the name implies, be excluded from the automatic numbering of sections provided by some output formats. The Pandoc convention for this is to add an "unnumbered" class to the header. The reader treats properties as key-value pairs by default, so a special case is added to translate the above property to a class instead. Closes #3095.
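The special case can be sketched in Python as follows. This is an illustration only: `header_attr` is a hypothetical name, properties are modelled as a list of `(key, value)` string pairs, and treating every value except `nil` as "true" is an assumption.

```python
def header_attr(properties):
    """Split Org properties into header classes and key-value pairs."""
    classes = []
    key_values = []
    for key, value in properties:
        if key.lower() == "unnumbered":
            # Special case: becomes a class instead of a key-value pair.
            if value.strip().lower() != "nil":
                classes.append("unnumbered")
        else:
            key_values.append((key, value))
    return classes, key_values


print(header_attr([("unnumbered", "t"), ("custom_id", "intro")]))
```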
2016-08-29	Docx reader: make all compilers happy with traversable.	Jesse Rosenthal	1	-1/+3
	The last attempt to make 7.8 happy made 7.10 unhappy. So we need some conditional logic to appease all versions.
2016-08-29	Docx reader: Import traverse for ghc 7.8	Jesse Rosenthal	1	-0/+1
	The GHC 7.8 build was erroring without it.
2016-08-29	Docx reader: clean up function with `traverse`	Jesse Rosenthal	1	-6/+1

2016-08-29	Merge branch 'org-meta-handling'	Albert Krewinkel	4	-69/+195

2016-08-29	Org reader: respect `creator` export option	Albert Krewinkel	3	-5/+8
	The `creator` option controls whether the creator meta-field should be included in the final markup. Setting `#+OPTIONS: creator:nil` will drop the creator field from the final meta-data output. Org-mode recognizes the special value `comment` for this field, causing the creator to be included in a comment. This is difficult to translate to Pandoc internals and is hence interpreted the same as other truthy values (i.e. the meta field is kept if it's present).
2016-08-29	Org reader: respect `email` export option	Albert Krewinkel	3	-5/+7
	The `email` option controls whether the email meta-field should be included in the final markup. Setting `#+OPTIONS: email:nil` will drop the email field from the final meta-data output.
2016-08-29	Org reader: respect `author` export option	Albert Krewinkel	4	-4/+23
	The `author` option controls whether the author should be included in the final markup. Setting `#+OPTIONS: author:nil` will drop the author from the final meta-data output.
2016-08-29	Org reader: read HTML_head as header-includes	Albert Krewinkel	1	-0/+2
	HTML-specific head content can be defined in `#+HTML_head` lines. They are parsed as format-specific inlines to ensure that they will only show up in HTML output.
2016-08-29	Org reader: set classoption meta from LaTeX_class_options	Albert Krewinkel	1	-1/+8

2016-08-29	Org reader: set documentclass meta from LaTeX_class	Albert Krewinkel	1	-0/+1

2016-08-29	Org reader: read LaTeX_header as header-includes	Albert Krewinkel	1	-9/+31
	LaTeX-specific header commands can be defined in `#+LaTeX_header` lines. They are parsed as format-specific inlines to ensure that they will only show up in LaTeX output.
2016-08-29	Org reader: give precedence to later meta lines	Albert Krewinkel	1	-1/+1
	The last meta-line of any given type is the significant line. Previously the value of the first line was kept, even if more lines of the same type were encountered.
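The precedence rule amounts to "last write wins". A short Python sketch, assuming a hypothetical `collect_meta` helper that reads `#+KEY: value` lines and lower-cases keys for comparison:

```python
def collect_meta(lines):
    """Collect `#+KEY: value` lines; a later line overrides an earlier one."""
    meta = {}
    for line in lines:
        if line.startswith("#+") and ":" in line:
            key, value = line[2:].split(":", 1)
            # Assigning unconditionally keeps the value of the last line.
            meta[key.strip().lower()] = value.strip()
    return meta


print(collect_meta(["#+TITLE: First", "#+TITLE: Second"]))
```

Keeping the first value instead would require an explicit "already present" check before assigning.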
2016-08-29	Org reader: allow multiple, comma-separated authors	Albert Krewinkel	1	-1/+9
	Multiple authors can be specified in the `#+AUTHOR` meta line if they are given as a comma-separated list.
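As an illustration, splitting such a line might look like the Python sketch below. The name `split_authors` is hypothetical, and dropping empty entries is an assumption rather than documented reader behaviour.

```python
def split_authors(value):
    """Split the value of an `#+AUTHOR` line into individual author names."""
    return [name.strip() for name in value.split(",") if name.strip()]


print(split_authors("Albert Krewinkel, John MacFarlane"))
```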
2016-08-29	Org reader: read markup only for special meta keys	Albert Krewinkel	1	-5/+20
	Most meta-keys should be read as normal string values; only a few are interpreted as marked-up text.
2016-08-29	Org reader: extract meta parsing code to module	Albert Krewinkel	2	-64/+111
	Parsing of meta-data is well separable from other block parsing tasks. Moving it into a new module gives smaller files and more clearly arranged code.
2016-08-28	Docx reader: update copyright.	Jesse Rosenthal	3	-6/+6

2016-08-28	Docx reader: use all anchor spans for header ids.	Jesse Rosenthal	1	-1/+1
	Previously we only used the first anchor span to affect header ids. This allows us to use all the anchor spans in a header, whether they're nested or not. Along with 62882f97, this closes #3088.
2016-08-28	Docx reader: Let headers use existing id.	Jesse Rosenthal	1	-6/+10
	Previously we always generated an id for headers (since they wouldn't bring one from Docx). Now we let it use an existing one if possible. This should allow us to recurse through anchor spans.
2016-08-28	Docx reader: Handle anchor spans with content in headers.	Jesse Rosenthal	1	-7/+8
	Previously, we would only be able to figure out internal links to a header in a docx if the anchor span was empty. We change that to read the inlines out of the first anchor span in a header. This still leaves another problem: what to do if there are multiple anchor spans in a header. That will be addressed in a future commit.
2016-08-15	StyleMap: export functions on StyleMap instances	Jesse Rosenthal	1	-0/+2
	We're going to want `getMap` in the Docx Writer.
2016-08-13	Docx parser: Use xml convenience functions	Jesse Rosenthal	1	-38/+27
	The functions `isElem` and `elemName` (defined in Docx/Util.hs) make the code a lot cleaner than the original XML.Light functions, but they had been used inconsistently. This puts them in wherever applicable.
2016-08-11	Merge pull request #3048 from tarleb/latex-mini-fix	John MacFarlane	1	-1/+1
	LaTeX reader: drop duplicate `*` in bibtexKeyChars
2016-08-09	Merge pull request #3065 from tarleb/org-verse-indent	John MacFarlane	1	-1/+10
	Org reader: preserve indentation of verse lines
2016-08-09	Org reader: ensure image sources are proper links	Albert Krewinkel	3	-39/+53
	Image sources, such as those in plain images, image links, or figures, must be proper URIs or relative file paths to be recognized as images. This restriction is now enforced for all image sources. This also fixes the reader's usage of uncleaned image sources, which led to `file:` prefixes not being deleted from figure images (e.g. `[[file:image.jpg]]` leading to a broken image `<img src="file:image.jpg"/>`). Thanks to @bsag for noticing this bug.
2016-08-08	Org reader: preserve indentation of verse lines	Albert Krewinkel	1	-1/+10
	Leading spaces in verse lines are converted to non-breaking spaces, so indentation is preserved. This fixes #3064.
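A minimal Python sketch of this conversion, assuming a hypothetical helper `preserve_indent` that works on one raw verse line and only considers leading ASCII spaces:

```python
NBSP = "\u00a0"  # non-breaking space


def preserve_indent(line):
    """Replace each leading space of a verse line with a non-breaking space."""
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    return NBSP * indent + stripped


print(repr(preserve_indent("  indented verse line")))
```

Spaces inside the line are left untouched; only the leading run is converted.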
2016-08-06	MediaWiki reader: properly interpret XML tags in pre environments.	John MacFarlane	1	-3/+2
	They are meant to be interpreted as literal text in MediaWiki. Closes #3042.
2016-08-06	Improved mediawiki reader's treatment of verbatim constructions.	John MacFarlane	1	-7/+13
	Previously these yielded strings of alternating Code and Space elements; we now incorporate the spaces into the Code. Emphasis etc. is still possible inside these. Closes #3055.
2016-08-06	Fix for unquoted attribute values in mediawiki tables.	John MacFarlane	1	-1/+1
	Previously an unquoted attribute value in a table row could cause parsing problems. Fixes #3053 (well, proper rowspans and colspans aren't created, but that's a bigger limitation with the current Pandoc document model for tables).
2016-07-29	LaTeX reader: drop duplicate `*` in bibtexKeyChars	Albert Krewinkel	1	-1/+1

2016-07-22	Textile reader: disallow empty URL in explicit link.	John MacFarlane	1	-1/+1
	Closes #3036.
2016-07-22	Textile reader: support `bc..` extended code blocks.	John MacFarlane	1	-5/+25
	Also, remove the trailing newline in code blocks (consistent with the Markdown reader).
2016-07-20	LaTeX reader: be more forgiving of non-standard characters.	John MacFarlane	1	-1/+1
	E.g. `^` outside of math. Some custom environments give these a meaning, so we should try not to fall over when we encounter them.
2016-07-20	LaTeX reader: more robust parsing of unknown environments.	John MacFarlane	1	-2/+9
	We no longer fail on things like `^` inside options for tikz. Closes #3026.
2016-07-20	RST reader: use Div for admonitions.	John MacFarlane	1	-8/+6
	Previously blockquotes were used. Now a Div is used with class `admonition` and (if relevant) one of the following: `attention`, `caution`, `danger`, `error`, `hint`, `important`, `note`, `tip`, `warning`. `sidebar` is also put into a Div. Note: This will change rendering of RST documents! It should provide much more flexibility. Closes #3031.
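The class assignment described above can be sketched in Python. This is an illustration under assumptions: `admonition_classes` is a hypothetical name, the generic `admonition` directive is assumed to get only the `admonition` class, and `sidebar` is left out because the message does not say which class it receives.

```python
# The specific admonition kinds listed in the commit message.
ADMONITION_KINDS = {
    "attention", "caution", "danger", "error", "hint",
    "important", "note", "tip", "warning",
}


def admonition_classes(directive):
    """Return the Div classes for an RST admonition directive, or None."""
    name = directive.lower()
    if name in ADMONITION_KINDS:
        return ["admonition", name]
    if name == "admonition":
        return ["admonition"]
    return None  # not an admonition directive


print(admonition_classes("warning"))
```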