path: root/src/Text
Age | Commit message | Author | Files | Lines
2016-08-13 | Docx parser: Use xml convenience functions | Jesse Rosenthal | 1 | -38/+27
The functions `isElem` and `elemName` (defined in Docx/Util.hs) make the code a lot cleaner than the original XML.Light functions, but they had been used inconsistently. This puts them in wherever applicable.
2016-08-11 | Merge pull request #3048 from tarleb/latex-mini-fix | John MacFarlane | 1 | -1/+1
LaTeX reader: drop duplicate `*` in bibtexKeyChars
2016-08-09 | Merge pull request #3065 from tarleb/org-verse-indent | John MacFarlane | 1 | -1/+10
Org reader: preserve indentation of verse lines
2016-08-09 | Org reader: ensure image sources are proper links | Albert Krewinkel | 3 | -39/+53
Image sources, such as those in plain images, image links, or figures, must be proper URIs or relative file paths to be recognized as images. This restriction is now enforced for all image sources. This also fixes the reader's use of uncleaned image sources, which left `file:` prefixes in figure images (e.g. `[[file:image.jpg]]` produced the broken image `<img src="file:image.jpg"/>`). Thanks to @bsag for noticing this bug.
2016-08-08 | Org reader: preserve indentation of verse lines | Albert Krewinkel | 1 | -1/+10
Leading spaces in verse lines are converted to non-breaking spaces, so indentation is preserved. This fixes #3064.
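The conversion described above can be sketched as follows. This is an illustrative Python sketch, not the Haskell code of the Org reader; the function name `preserve_verse_indent` is hypothetical.

```python
NBSP = "\u00a0"  # non-breaking space


def preserve_verse_indent(line: str) -> str:
    """Replace each leading space of a verse line with a non-breaking space."""
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    # Only the leading run of spaces is converted; inner spaces stay as-is.
    return NBSP * indent + stripped
```

Because non-breaking spaces are not collapsed by most output formats, the indentation survives rendering.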
2016-08-06 | MediaWiki reader: properly interpret XML tags in pre environments. | John MacFarlane | 1 | -3/+2
They are meant to be interpreted as literal text in textile. Closes #3042.
2016-08-06 | Improved mediawiki reader's treatment of verbatim constructions. | John MacFarlane | 1 | -7/+13
Previously these yielded strings of alternating Code and Space elements; we now incorporate the spaces into the Code. Emphasis etc. is still possible inside these. Closes #3055.
2016-08-06 | Fix for unquoted attribute values in mediawiki tables. | John MacFarlane | 1 | -1/+1
Previously an unquoted attribute value in a table row could cause parsing problems. Fixes #3053 (well, proper rowspans and colspans aren't created, but that's a bigger limitation with the current Pandoc document model for tables).
2016-08-06 | Fix out of index error in handleError | Matthew Pickering | 1 | -4/+8
In the LaTeX parser, when includes are processed, the text of the included file is inserted directly into the parse stream. This caused problems when there was an error in the included file (and the included file was longer than the original file), because the error would be reported at its position in the included file. The error handling tries to display the line and position where the error occurred. It works by keeping a copy of the input and looking up the error position in that copy. In the scenario described above, the input would be the original source file, but the error position would refer to the included file. The fix is to not try to show the exact line when doing so would cause an out-of-bounds error.
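A minimal Python sketch of the guarded lookup, assuming 1-based line and column numbers. The name `error_context` and the caret display are illustrative assumptions, not pandoc's actual `handleError`.

```python
from typing import Optional


def error_context(source: str, line: int, column: int) -> Optional[str]:
    """Return the offending line with a caret under the column, or None
    when the reported line does not exist in `source`."""
    lines = source.splitlines()
    # The reported position may refer to an included file, so it can point
    # past the end of the source we kept a copy of.
    if not 1 <= line <= len(lines):
        return None
    text = lines[line - 1]
    return f"{text}\n{' ' * max(column - 1, 0)}^"
```

Returning `None` lets the caller fall back to reporting the error message without a source excerpt.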
2016-08-06 | LaTeX writer: don't use * for unnumbered paragraph, subparagraph. | John MacFarlane | 1 | -2/+2
The starred variants don't exist. This helps with part of #3058...it gets rid of the spurious *s. But we still have numbers on the 4th and 5th level headers.
2016-07-29 | LaTeX reader: drop duplicate `*` in bibtexKeyChars | Albert Krewinkel | 1 | -1/+1
2016-07-22 | Merge pull request #3033 from tarleb/github-readme | John MacFarlane | 2 | -3/+3
PoC: GitHub-optimized README
2016-07-22 | Textile reader: disallow empty URL in explicit link. | John MacFarlane | 1 | -1/+1
Closes #3036.
2016-07-22 | Textile reader: support `bc..` extended code blocks. | John MacFarlane | 1 | -5/+25
Also, remove trailing newline in code blocks (consistently with Markdown reader).
2016-07-20 | Rename README to MANUAL.txt | Albert Krewinkel | 2 | -3/+3
2016-07-20 | LaTeX reader: be more forgiving of non-standard characters. | John MacFarlane | 1 | -1/+1
E.g. `^` outside of math. Some custom environments give these a meaning, so we should try not to fall over when we encounter them.
2016-07-20 | LaTeX reader: more robust parsing of unknown environments. | John MacFarlane | 1 | -2/+9
We no longer fail on things like `^` inside options for tikz. Closes #3026.
2016-07-20 | RST reader: use Div for admonitions. | John MacFarlane | 1 | -8/+6
Previously blockquotes were used. Now a Div is used with class `admonition` and (if relevant) one of the following: `attention`, `caution`, `danger`, `error`, `hint`, `important`, `note`, `tip`, `warning`. `sidebar` is also put into a Div. Note: This will change rendering of RST documents! It should provide much more flexibility. Closes #3031.
2016-07-19 | Textile reader: improve definition list parsing. | John MacFarlane | 1 | -6/+13
- Allow multiple terms (which we concatenate with linebreaks).
- Fix exponential parsing bug (closes #3020 for real this time).
2016-07-18 | Textile reader: improved table parsing. | John MacFarlane | 1 | -22/+62
We now handle cell and row attributes, mostly by skipping them. However, alignments are now handled properly. Since in pandoc alignment is per-column, not per-cell, we try to divine column alignments from cell alignments. Table captions are also now parsed, and textile indicators for thead and tfoot no longer cause parse failure. (However, a row designated as tfoot will just be a regular row in pandoc.)
2016-07-15 | Don't require haddock-library 1.4. | John MacFarlane | 1 | -0/+4
Instead use CPP to work around version differences.
2016-07-15 | Use liftM since otherwise Functor type constraint needed in ghc 7.8. | John MacFarlane | 1 | -1/+1
2016-07-14 | Fixed compiler warnings. | John MacFarlane | 6 | -14/+11
2016-07-14 | Haddock reader - support math. | John MacFarlane | 1 | -0/+4
The Haddock document model added elements for math in 1.4.
2016-07-14 | Docx Writer: Use actual creation time as doc prop | Jesse Rosenthal | 1 | -4/+3
Previously, we had used the user-supplied date, if available, for Word's document creation metadata. This could lead to weird results, as in cases where the user post-dates a document (so the modification might be prior to the creation). Here we use the actual computer time to set the document creation.
2016-07-14 | Shared: improve year sanity check in normalizeDate | Jesse Rosenthal | 1 | -6/+6
Previously we parsed a list of dates, took the first one, and then tested its year range. That meant that if the first one failed, we returned nothing, regardless of what the others did. Now we test for sanity before running `msum` over the list of Maybe values. Anything failing the test will be Nothing, so will not be a candidate.
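The difference between the two orderings can be sketched in Python, with `None` standing in for Haskell's `Nothing` and a generator-based first-match standing in for `msum`. The function names are hypothetical; the 1601-9999 range is the one stated in the `normalizeDate` commit listed below.

```python
from datetime import date
from typing import List, Optional


def sane_year(d: date) -> bool:
    # Year range taken from the normalizeDate commit message.
    return 1601 <= d.year <= 9999


def first_then_check(candidates: List[Optional[date]]) -> Optional[date]:
    """Old behavior: pick the first successful parse, then test its year."""
    first = next((d for d in candidates if d is not None), None)
    return first if first is not None and sane_year(first) else None


def check_then_first(candidates: List[Optional[date]]) -> Optional[date]:
    """New behavior: discard insane results first, then pick the first one left."""
    return next((d for d in candidates if d is not None and sane_year(d)), None)
```

With candidates `[None, date(1200, 1, 1), date(2016, 7, 14)]`, the old ordering gives up on the year 1200 result, while the new ordering moves on to the 2016 date.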
2016-07-14 | Shared: normalizeDate should reject illegal years. | Jesse Rosenthal | 1 | -5/+10
We only allow years between 1601 and 9999, inclusive. ISO 8601 actually says that years are supposed to start with 1583, but MS Word only allows 1601-9999. This should prevent corrupted Word files when the date is out of that range or is parsed incorrectly.
2016-07-14 | Shared: Add further formats for `normalizeDate` | Jesse Rosenthal | 1 | -1/+2
We want to avoid illegal dates -- in particular years with greater than four digits. We attempt to parse series of digits first as `%Y%m%d`, then `%Y%m`, and finally `%Y`.
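A rough Python sketch of trying the three digit formats in order. The explicit width guard is an assumption added here because Python's `strptime` is more lenient than a strict parser; `parse_digit_date` is a hypothetical name and this is not pandoc's Haskell implementation.

```python
from datetime import date, datetime
from typing import Optional

# Formats are tried in this order; the width is the exact digit count expected.
DIGIT_FORMATS = [("%Y%m%d", 8), ("%Y%m", 6), ("%Y", 4)]


def parse_digit_date(s: str) -> Optional[date]:
    """Parse a pure digit string as YYYYMMDD, then YYYYMM, then YYYY."""
    if not (s.isascii() and s.isdigit()):
        return None
    for fmt, width in DIGIT_FORMATS:
        if len(s) != width:
            continue  # rejects years with more than four digits
        try:
            return datetime.strptime(s, fmt).date()
        except ValueError:
            continue
    return None
```

Missing month or day fields default to 1 in this sketch, so `"2016"` is read as 1 January 2016.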
2016-07-14 | Removed some redundant class constraints. | John MacFarlane | 1 | -3/+3
2016-07-14 | Merge pull request #3019 from tarleb/org-verbatim-fix | John MacFarlane | 1 | -2/+4
Org reader: fix parsing of verbatim inlines
2016-07-14 | Fixed exponential parsing bug in textile reader. | John MacFarlane | 1 | -0/+1
Closes #3020.
2016-07-14 | Org reader: fix parsing of verbatim inlines | Albert Krewinkel | 1 | -2/+4
Org rules for allowed characters before or after markup chars were not checked for verbatim text. This resulted in wrong parsing outcomes if the verbatim text contained, e.g., space-enclosed markup characters as part of the text (`=is_substr = True=`). Forcing the parser to update the positions of allowed/forbidden markup border characters fixes this. This fixes #3016.
2016-07-05 | Merge pull request #3014 from tarleb/org-writer-div | John MacFarlane | 1 | -7/+41
Org writer: improve Div handling
2016-07-05 | Org writer: improve Div handling | Albert Krewinkel | 1 | -7/+41
Div blocks handling is changed to make the output look more like idiomatic org mode:
- Div-wrapped content is output as-is if the div's attribute is the null attribute.
- Div containers with an id but neither classes nor key-value pairs are unwrapped and the id is added as an anchor.
- Divs with classes associated with greater block elements are wrapped in a `#+BEGIN`...`#+END` block.
- The old behavior for Divs with more complex attributes is kept.
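The four cases above can be read as a decision on the Div's attribute triple (id, classes, key-value pairs). The Python sketch below only classifies the cases; the set of greater-block class names, the order in which the rules are checked, and the function name `div_strategy` are assumptions made for illustration.

```python
from typing import Dict, List

# Assumed examples of classes that map to Org greater blocks.
GREATER_BLOCK_CLASSES = {"center", "quote", "verse"}


def div_strategy(ident: str, classes: List[str], keyvals: Dict[str, str]) -> str:
    """Name the output strategy for a Div with the given attributes."""
    if not ident and not classes and not keyvals:
        return "as-is"            # null attribute: emit the content unchanged
    if ident and not classes and not keyvals:
        return "anchor"           # unwrap and emit the id as an anchor
    if any(c in GREATER_BLOCK_CLASSES for c in classes):
        return "begin-end-block"  # wrap in #+BEGIN ... #+END
    return "legacy"               # more complex attributes: old behavior
```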
2016-07-04 | Org reader: replace ugly code with view pattern | Albert Krewinkel | 1 | -5/+4
Some less-than-smart code required a pragma switching off overlapping pattern warnings in order to compile seamlessly. Using view patterns makes the code easier to read and also doesn't require overlapping pattern checks to be disabled.
2016-07-03 | Merge pull request #3010 from tarleb/org-header-tree | John MacFarlane | 4 | -247/+386
Org reader: support archived trees, headline levels export setting
2016-07-03 | Odt reader: Removed redundant Monoid constraints. | John MacFarlane | 1 | -7/+7
2016-07-03 | Fix warning for parseUrl import. | John MacFarlane | 1 | -2/+3
2016-07-03 | CPP workaround for deprecation of parseUrl in http-client. | John MacFarlane | 1 | -6/+14
2016-07-03 | Org reader: support headline levels export setting | Albert Krewinkel | 3 | -8/+40
The depths of headlines can be modified using the `H` option. Deeper headlines will be converted to lists.
2016-07-03 | Allow 'standout' as a beamer frame option. | John MacFarlane | 1 | -1/+1
Example: `## Slide title {.standout}`. Closes #3007.
2016-07-02 | Org reader: put export setting parser into module | Albert Krewinkel | 3 | -191/+191
Export option parsing is distinct enough from general block parsing to justify putting it into a separate module.
2016-07-01 | LaTeX reader: strip off double quotes around image source if present. | John MacFarlane | 1 | -1/+8
Avoids interpreting these as part of the literal filename. See #2825.
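The idea amounts to removing one pair of enclosing double quotes and leaving everything else alone. A small Python sketch, with the hypothetical name `strip_double_quotes` (the LaTeX reader itself is written in Haskell):

```python
def strip_double_quotes(src: str) -> str:
    """Drop one pair of enclosing double quotes from an image source, if present."""
    if len(src) >= 2 and src.startswith('"') and src.endswith('"'):
        return src[1:-1]
    return src
```

A source such as `"my image.png"` (quotes included) is then looked up as `my image.png`.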
2016-07-01 | LaTeX writer: don't URI-escape image source. | John MacFarlane | 1 | -1/+1
Usually this is a local file, and replacing spaces with `%20` ruins things. Closes #2825.
2016-07-01 | Org reader: support archived trees export options | Albert Krewinkel | 2 | -8/+62
Handling of archived trees can be modified using the `arch` option. Archived trees are either dropped, exported completely, or collapsed to include just the header when the `arch` option is nil, non-nil, or `headline`, respectively.
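The three-way behavior can be sketched in Python as below. The string encoding of the option value and the `(header, body)` pair used to represent a tree are assumptions for illustration, not the Org reader's data types.

```python
from typing import List, Optional, Tuple

Tree = Tuple[str, List[str]]  # (header, body blocks); an assumed representation


def export_archived(option: str, header: str, body: List[str]) -> Optional[Tree]:
    """Apply the `arch` export option to an archived tree."""
    if option == "nil":
        return None              # drop the tree entirely
    if option == "headline":
        return (header, [])      # keep only the header
    return (header, body)        # any other non-nil value: export completely
```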
2016-07-01 | Org reader: refactor comment tree handling | Albert Krewinkel | 2 | -39/+21
Comment trees were handled after parsing, as pattern matching on lists is easier than matching on sequences. The new method of reading documents as trees allows for more elegant subtree removal.
2016-07-01 | Org reader: parse as headlines, convert to blocks | Albert Krewinkel | 1 | -47/+86
Emacs org-mode is based on outline-mode, which treats documents as trees with headlines as nodes. The reader is refactored to parse into a similar tree structure. This simplifies transformations acting on document (sub-)trees.
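To illustrate the tree view of a document, here is a Python sketch that nests a flat list of `(level, title)` headlines by level. The `Headline` class and `build_tree` function are hypothetical and much simpler than the reader's actual Haskell types, which also carry content, tags, and properties.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Headline:
    level: int
    title: str
    children: List["Headline"] = field(default_factory=list)


def build_tree(flat: List[Tuple[int, str]]) -> List[Headline]:
    """Nest headlines so that each one becomes a child of the nearest
    preceding headline with a smaller level."""
    roots: List[Headline] = []
    stack: List[Headline] = []  # chain of currently open ancestors
    for level, title in flat:
        node = Headline(level, title)
        # Close every open headline that is not a proper ancestor.
        while stack and stack[-1].level >= level:
            stack.pop()
        (stack[-1].children if stack else roots).append(node)
        stack.append(node)
    return roots
```

Once the document is in this shape, removing or collapsing a subtree is a matter of dropping or trimming a single node.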
2016-07-01 | Org reader: improve tag and properties type safety | Albert Krewinkel | 1 | -25/+57
Specific newtype definitions are used to replace stringly typing of tags and properties. Type safety is increased while readability is improved.
2016-07-01 | ZimWiki writer: removed commented out code that confused Haddock. | John MacFarlane | 1 | -8/+8
See https://travis-ci.org/jgm/pandoc/jobs/141542247
2016-06-30 | Added Zim Wiki writer, template and tests. | Alex Ivkin | 2 | -0/+364