path: root/src
Age | Commit message | Author | Files | Lines
2014-06-23 | Move some of the clean-up logic into List module. | Jesse Rosenthal | 1 | -3/+22
This will allow us to get rid of more general functions we no longer need in the main reader.
2014-06-23 | Add new typeclass, Reducible | Jesse Rosenthal | 1 | -0/+150
This defines a typeclass `Reducible` which allows us to "reduce" pandoc Inlines and Blocks, like so: `Emph [Strong [Str "foo", Space]] <++> Strong [Emph [Str "bar"]], Str "baz"] = [Strong [Emph [Str "foo", Space, Str "bar"], Space, Str "baz"]]`. Adjacent formattings and strings are thus grouped appropriately. Another set of operators for `(Reducible a) => (Many a)` is also included.
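To illustrate the kind of grouping described in this commit, here is a minimal sketch. It is not the `Reducible` typeclass itself: the `Inline` type is a cut-down stand-in for pandoc's, and `mergeAdjacent` is a hypothetical helper that only handles the simplest case, where neighbouring inlines share the same outer constructor.

```haskell
-- Simplified stand-in for pandoc's Inline type (assumption: the real
-- type has many more constructors).
data Inline
  = Str String
  | Space
  | Emph [Inline]
  | Strong [Inline]
  deriving (Eq, Show)

-- | Hypothetical helper: merge neighbours that share the same outer
-- constructor, then recurse into the merged contents.
mergeAdjacent :: [Inline] -> [Inline]
mergeAdjacent (Emph xs : Emph ys : rest)     = mergeAdjacent (Emph (xs ++ ys) : rest)
mergeAdjacent (Strong xs : Strong ys : rest) = mergeAdjacent (Strong (xs ++ ys) : rest)
mergeAdjacent (Str a : Str b : rest)         = mergeAdjacent (Str (a ++ b) : rest)
mergeAdjacent (Emph xs : rest)               = Emph (mergeAdjacent xs) : mergeAdjacent rest
mergeAdjacent (Strong xs : rest)             = Strong (mergeAdjacent xs) : mergeAdjacent rest
mergeAdjacent (x : rest)                     = x : mergeAdjacent rest
mergeAdjacent []                             = []

main :: IO ()
main = print (mergeAdjacent [Emph [Str "foo", Space], Emph [Str "bar"]])
```

The sketch does not reorder nested formatting (for example, it will not combine `Emph [Strong ...]` with `Strong [Emph ...]`), which the commit message suggests the real typeclass does.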
2014-06-22 | Docx reader: Fix spacing in formatting. | Jesse Rosenthal | 1 | -1/+1
The normalizing tests revealed a problem with unformatted spaces, brought about by `spanTrim`. This fixes it by not trimming the spaces out of spans until they are in their final form.
2014-06-22 | Implement new normalization. | Jesse Rosenthal | 1 | -11/+57
There were some problems with the old str normalization. This fixes those problems. Also, since it drills down on its own, it only needs to be mapped over the blocks, not walked over the tree.
2014-06-20 | Markdown reader: Support smallcaps through span. | John MacFarlane | 1 | -1/+6
`<span style="font-variant:small-caps;">foo</span>` will be parsed as a `SmallCaps` inline, and will work in all output formats that support small caps. Closes #1360.
2014-06-20 | MediaWiki reader: Tightened up template parsing. | John MacFarlane | 1 | -0/+1
The opening "{{" must be followed by an alphanumeric or ':'. This prevents the exponential slowdown in #1033. Closes #1033.
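The rule stated in this commit can be sketched as a plain predicate on the remaining input. This is an illustration only: `looksLikeTemplateStart` is a hypothetical name, and the actual reader is a parser combinator rather than a function on `String`.

```haskell
import Data.Char (isAlphaNum)

-- | Hypothetical check: does the input begin with "{{" immediately
-- followed by an alphanumeric character or ':'?
looksLikeTemplateStart :: String -> Bool
looksLikeTemplateStart ('{' : '{' : c : _) = isAlphaNum c || c == ':'
looksLikeTemplateStart _                   = False

main :: IO ()
main = mapM_ (print . looksLikeTemplateStart)
  [ "{{foo}}"        -- template-like opener
  , "{{ {{ {{"       -- brace run followed by a space: rejected early
  , "{{:Main Page}}" -- ':' is allowed after the braces
  ]
```

Rejecting an opener after inspecting a single character means the parser never commits to a long, doomed template parse on input made of repeated braces, which is presumably how the slowdown is avoided.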
2014-06-20 | MediaWiki reader: Support --trace. | John MacFarlane | 1 | -1/+10
2014-06-20 | LaTeX writer: Correctly handle figures in notes. | John MacFarlane | 1 | -5/+7
Notes can't contain figures in LaTeX, so we fake it to avoid an error. Closes #1053.
2014-06-20 | Markdown reader: Prevent spurious line breaks after list items. | John MacFarlane | 1 | -1/+2
When the `hard_line_breaks` option was specified, pandoc would produce a spurious line break after a tight list item. This patch solves the problem. Closes #1137.
2014-06-20 | ImageSize: Use default instead of failing if image size not found in exif header. | John MacFarlane | 1 | -1/+6
Closes #1358.
2014-06-20 | HTML reader: Fix performance issue with malformed HTML tables. | John MacFarlane | 1 | -0/+2
We let a `</table>` tag close an open `<tr>` or `<td>`. Closes #1167.
2014-06-20 | Support --trace in HTML reader. | John MacFarlane | 1 | -1/+10
2014-06-20 | LaTeX writer: Fixed strikeout + highlighted code. Closes #1294. | John MacFarlane | 1 | -1/+10
Previously strikeout highlighted code caused an error.
2014-06-20 | Make strNormalize go bottomUp. | Jesse Rosenthal | 1 | -5/+5
This was how it used to be before it was folded into blockNormalize.
2014-06-20 | Docx reader: Add a comment explaining strNormalize | Jesse Rosenthal | 1 | -0/+4
`normalize` from Text.Pandoc.Shared is more general. In tests, though, it more than doubles the run time. `strNormalize` does less, but it does what we need. This comment is added for future maintainability.
2014-06-20 | Docx Reader: Normalize DefinitionLists | Jesse Rosenthal | 1 | -0/+2
Previously DefinitionList had been left out of `blockNormalize`. Now it is included.
2014-06-20 | Docx reader: simplify blockNormalize | Jesse Rosenthal | 1 | -10/+8
Use a function `stripSpaces` instead of recursion. This makes it a bit easier to read and maintain, and simplifies normalizing DefinitionList, which was left out the first time.
2014-06-20 | Docx reader: Fix hdr handling in block norm | Jesse Rosenthal | 1 | -0/+2
`blockNormalize` previously forgot to account for the case in which a Header's inlines did not start with a space.
2014-06-19 | Docx writer: Use Compact style for empty table cells. | John MacFarlane | 1 | -1/+3
Otherwise we get overly tall lines when there are empty table cells and the other cells are compact. Closes #1353.
2014-06-19 | HTML reader: Allow space between `<col>` and `</col>`. | John MacFarlane | 1 | -0/+1
Test case:
```
<table border="1">
<colgroup>
<col> </col>
<col></col>
</colgroup>
<tbody>
<tr>
<td>X</td>
<td>Y</td>
</tr>
<tr>
<td>1</td>
<td>2</td>
</tr>
</tbody>
</table>
```
2014-06-19 | Merge pull request #1354 from jkr/literalTab | John MacFarlane | 2 | -2/+20
Parse literal tabs in docx
2014-06-19 | Introduce blockNormalize | Jesse Rosenthal | 1 | -1/+14
This will help take care of spaces introduced at the beginning of strings.
2014-06-19 | Have Docx reader properly interpret tabs. | Jesse Rosenthal | 1 | -0/+2
2014-06-19 | Add literal tabs to parser. | Jesse Rosenthal | 1 | -1/+4
2014-06-19 | ImageSize: ignore unknown exif header tag rather than crashing. | John MacFarlane | 1 | -1/+2
Some images seem to have tag type of 256, which was causing a runtime error.
2014-06-19 | Haddock writer: Use _____ for hrule. | John MacFarlane | 1 | -2/+2
Avoids interpretation as list.
2014-06-18 | Haddock writer: Only use Decimal list style. | John MacFarlane | 1 | -2/+2
2014-06-18 | Small fix to haddock "tables". | John MacFarlane | 1 | -2/+2
2014-06-18 | More polish on Haddock reader/writer. | John MacFarlane | 2 | -22/+47
2014-06-18 | Finished first draft of Haddock writer. | John MacFarlane | 3 | -2/+371
2014-06-18 | Rewrote haddock reader to use haddock-library. | John MacFarlane | 1 | -22/+102
This brings pandoc's rendering of haddock markup in line with the new haddock. Note that we preserve line breaks in `@` code blocks, unlike the earlier version. Modified tests pass. More tests would be good.
2014-06-18 | Removed old haddock reader code. Add dependency on haddock-library. | John MacFarlane | 3 | -360/+21
This also removes the dependency on alex and happy.
2014-06-17 | Highlighting: Let .numberLines work even if no language given. | John MacFarlane | 1 | -1/+6
Closes #1287, jgm/highlighting-kate#40.
2014-06-17 | DocBook reader: Support <?asciidoc-br?>. | John MacFarlane | 1 | -2/+17
Closes #1236. Note, this is a bit of a kludge, to work around the fact that xml-light doesn't parse `<?asciidoc-br?>` correctly. We preprocess the input, replacing that instruction with `<br/>`, and then parse that as a line break. Other XML instructions are simply removed from the input stream.
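The preprocessing step described in this commit might look roughly like the following. This is a sketch of the idea only, working on a plain `String`; `preprocessPIs` and `dropPI` are hypothetical names, and the commit's own implementation may differ in detail.

```haskell
import Data.List (isPrefixOf)

-- | Hypothetical preprocessor: replace <?asciidoc-br?> with <br/> and
-- drop every other processing instruction (<? ... ?>) from the input.
preprocessPIs :: String -> String
preprocessPIs [] = []
preprocessPIs s@(c : cs)
  | brPI `isPrefixOf` s = "<br/>" ++ preprocessPIs (drop (length brPI) s)
  | "<?" `isPrefixOf` s = preprocessPIs (dropPI s)
  | otherwise           = c : preprocessPIs cs
  where
    brPI = "<?asciidoc-br?>"

-- | Skip past the closing "?>" of an instruction. An unterminated
-- instruction swallows the rest of the input (a choice made for this sketch).
dropPI :: String -> String
dropPI = go . drop 2
  where
    go ('?' : '>' : rest) = rest
    go (_ : rest)         = go rest
    go []                 = []

main :: IO ()
main = putStrLn (preprocessPIs "<para>one<?asciidoc-br?>two<?foo bar?></para>")
```

Note that a filter this blunt also removes a leading `<?xml ...?>` declaration, which matches the description "simply removed" but may not be what every caller wants.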
2014-06-17 | LaTeX reader: Correctly handle table rows with too few cells. | John MacFarlane | 1 | -3/+7
LaTeX seems to treat them as if they have empty cells at the end. Closes #241.
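The behaviour described here amounts to padding short rows with empty cells. A minimal sketch, assuming cells are modelled as plain strings (pandoc's cells are lists of blocks) and using the hypothetical name `padRow`:

```haskell
-- | Hypothetical helper: pad a row with empty cells until it has
-- at least n cells. Rows that are already long enough are unchanged.
padRow :: Int -> [String] -> [String]
padRow n cells = cells ++ replicate (n - length cells) ""

main :: IO ()
main = mapM_ (print . padRow 3) [["a", "b", "c"], ["d"]]
```

`replicate` returns an empty list for a non-positive count, so no separate case is needed for full or overlong rows.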
2014-06-16 | Fixed compiler warning. | John MacFarlane | 1 | -1/+3
2014-06-16 | Naming: Use Docx instead of DocX. | John MacFarlane | 4 | -47/+47
For consistency with the existing writer.
2014-06-16 | Merge branch 'docx' of https://github.com/jkr/pandoc into jkr-docx | John MacFarlane | 4 | -20/+1327
2014-06-16 | Org reader: make tildes create inline code. | John MacFarlane | 1 | -4/+4
Closes #1345. Also relabeled 'code' and 'verbatim' parsers to accord with the org-mode manual. I'm not sure what the distinction between code and verbatim is supposed to be, but I'm pretty sure both should be represented as Code inlines in pandoc. The previous behavior resulted in the text not appearing in any output format.
2014-06-16 | Small improvement to fix to #1333. | John MacFarlane | 1 | -4/+1
This allows blank lines at end of multiline headers.
2014-06-16 | Markdown reader: fixed #1333 (table parsing bug). | John MacFarlane | 1 | -5/+6
2014-06-16 | LaTeX reader: handle leading/trailing spaces in emph better. | John MacFarlane | 1 | -17/+17
`\emph{ hi }` gets parsed as `[Space, Emph [Str "hi"], Space]` so that we don't get things like `* hi *` in markdown output. Also applies to textbf and some other constructions. Closes #1146. (`--normalize` isn't touched by this, but normalization should not generally be necessary with the changes to the readers.)
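The transformation described in this commit can be sketched on a simplified inline type. The names here are assumptions: `Inline` is a cut-down stand-in for pandoc's type, and `hoistSpaces` is a hypothetical list-based helper, not the function the readers actually call.

```haskell
-- Simplified stand-in for pandoc's Inline type (assumption).
data Inline = Str String | Space | Emph [Inline]
  deriving (Eq, Show)

-- | Hypothetical helper: move leading and trailing Space elements
-- outside the wrapper, and omit the wrapper if nothing else remains.
hoistSpaces :: ([Inline] -> Inline) -> [Inline] -> [Inline]
hoistSpaces wrap ils =
  let (leading, rest)     = span (== Space) ils
      (trailing, coreRev) = span (== Space) (reverse rest)
      core                = reverse coreRev
  in leading ++ [wrap core | not (null core)] ++ trailing

main :: IO ()
main = print (hoistSpaces Emph [Space, Str "hi", Space])
```

Passing the constructor as an argument lets the same helper serve `Emph`, a bold wrapper, and similar constructions, in line with the commit's remark that the fix also applies to textbf.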
2014-06-16 | LaTeX reader: don't assume preamble doesn't contain environments. | John MacFarlane | 1 | -1/+1
Closes #1338.
2014-06-16 | HTML reader: Fixed major parsing problem with HTML tables. | John MacFarlane | 1 | -15/+11
Table cells were being combined into one cell. Closes #1341.
2014-06-16 | Merge pull request #1344 from mpickering/master | John MacFarlane | 2 | -13/+20
Moved extractSpaces to Shared.hs
2014-06-16 | Org reader: fixed #1342. | John MacFarlane | 1 | -9/+5
This change rewrites `inlineLaTeXCommand` so that parsec will know when input is being consumed. Previously a run-time error would be produced with some input involving raw latex. (I believe this does not affect the last release, as the inline latex reading was added recently.)
2014-06-16 | Moved extractSpaces to Shared.hs | mpickering | 2 | -13/+20
Generalised and moved the extractSpaces function from `HTML.hs` to `Shared.hs` so that the docx reader can also use it.
2014-06-16 | Integrated the docx reader into the main pandoc program. | mpickering | 1 | -20/+36
Changes also include generalising the types of reader allowed. The mechanism now mimics the more general output mechanism.
2014-06-16 | Add DocX files to tree. | Jesse Rosenthal | 3 | -0/+1291
This introduces Text.Pandoc.DocX, and its exported `readDocX` function.
2014-06-12 | allow (and discard) optional argument for \caption | James Aspnes | 1 | -1/+1