path: root/src/Text/Pandoc/Readers
Age  Commit message  Author  Files  Lines
2012-02-04Revert "LaTeX reader: Use kpsewhich to find paths for handleIncludes."John MacFarlane1-11/+1
This reverts commit 1f90c6d7e0800621367ff72601a4f66159688ca9.
2012-02-04LaTeX reader: Use kpsewhich to find paths for handleIncludes.John MacFarlane1-1/+11
Fall back without an error if kpsewhich is not available.
2012-02-04Have handleIncludes look for local .sty files from \usepackage.John MacFarlane1-3/+7
2012-02-04LaTeX reader: small bug fixes.John MacFarlane1-8/+10
2012-02-04Minor formatting changeJohn MacFarlane1-1/+2
2012-02-04LaTeX reader: Factored out rawEnvJohn MacFarlane1-7/+11
2012-02-04Small improvements in latex table parser.John MacFarlane1-3/+2
2012-02-04Complete rewrite of LaTeX reader.John MacFarlane2-981/+737
* The new reader is more robust, accurate, and extensible. It is still quite incomplete, but it should be easier now to add features.
* Text.Pandoc.Parsing: Added withRaw combinator.
* Markdown reader: do escapedChar before raw latex inline. Otherwise we capture commands like \{.
* Fixed latex citation tests for new citeproc.
* Handle \include{} commands in latex. This is done in pandoc.hs, not the (pure) latex reader. But the reader exports the needed function, handleIncludes.
* Moved err and warn from pandoc.hs to Shared.
* Fixed tests - raw tex should sometimes have trailing space.
* Updated lhs-test for highlighting-kate changes.
2012-01-29LaTeX reader: Require non-letter after certain commands.John MacFarlane1-6/+11
Previously "\opening" was rendered as "\248pening". The "\o" should not be parsed as a control sequence. Partially addresses #393.
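The rule can be illustrated with a negative lookahead. This is only a Python sketch, not the reader's own parser code: the name `expand_o` is hypothetical, it handles just `\o`, and it does not model TeX's skipping of spaces after a control sequence.

```python
import re

# Hypothetical helper: treat "\o" as a control sequence only when the next
# character is not a letter, so "\opening" is left alone.
_O_COMMAND = re.compile(r"\\o(?![A-Za-z])")


def expand_o(tex):
    """Replace a standalone \\o with U+00F8; longer command names are untouched."""
    return _O_COMMAND.sub("\u00f8", tex)
```

With this check, `expand_o("\\opening")` returns its input unchanged, while `\o` followed by a non-letter is replaced.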
2012-01-28Removed an unnecessary `many spaceChar`.John MacFarlane1-1/+1
2012-01-28Markdown reader: Fixed bug in code block attribute parser.John MacFarlane1-3/+4
Previously the ID attribute got lost if it didn't come first. Now attributes can come in any order.
2012-01-28Support github syntax for fenced code blocks.John MacFarlane1-10/+14
You can now write

    ```ruby
    x = 2
    ```

instead of

    ~~~ {.ruby}
    x = 2
    ~~~~
2012-01-27Fixed table parsing with wide or combining characters.John MacFarlane1-4/+4
Closes #348. Closes #108.
2012-01-26LaTeX reader: Handle \@.John MacFarlane1-1/+4
2012-01-19Added Docx writer.John MacFarlane1-4/+7
* New module `Text.Pandoc.Docx`.
* New output format `docx`.
* Added reference.docx.
* New option `--reference-docx`.

The writer includes support for highlighted code blocks and math (which is converted from TeX to OMML using texmath's new OMML module).
2012-01-12Added "title" to list of docbook block-level tags.John MacFarlane1-1/+1
2012-01-10Markdown reader: fixed bug in table/hrule parsing.John MacFarlane1-1/+1
Top line of table must not be followed by a blank line. This bug caused slowdown on some files with hrules and tables, because pandoc tried to interpret the hrules as the tops of multiline tables.
2012-01-08Markdown reader: Allow links in image captions.John MacFarlane1-13/+10
This change also means that

    [link with [link](/url)](/url)

will turn into

    <p><a href="/url">link with link</a></p>

instead of

    <p><a href="/url">link with [link](/url)</a></p>
2012-01-02Markdown reader: Fix parsing of consecutive lists.John MacFarlane1-10/+12
Pandoc previously behaved like Markdown.pl for consecutive lists of different styles. Thus, the following would be parsed as a single ordered list, rather than an ordered list followed by an unordered list:

    1. one
    2. two

    - one
    - two

This patch makes pandoc behave more sensibly, parsing this as two lists. Any change in list type (ordered/unordered) or in list number style will trigger a new list. Thus, the following will also be parsed as two lists:

    1. one
    2. two

    a. one
    b. two

Since we regard this as a bug in Markdown.pl, and not something anyone would ever rely on, we do not preserve the old behavior even when `--strict` is selected.
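The splitting rule can be sketched as grouping consecutive items by marker style. This Python fragment is illustrative only and is not the reader's implementation: `marker_style` and `split_lists` are hypothetical names, only decimal and single-letter markers followed by a period are recognized, and treating `-`, `*`, and `+` as one bullet style is an assumption.

```python
import re
from itertools import groupby


def marker_style(line):
    """Classify a list item by its marker (hypothetical, simplified)."""
    marker = line.split()[0]
    if marker in ("-", "*", "+"):
        return "bullet"
    m = re.fullmatch(r"(\d+|[a-z]|[A-Z])\.", marker)
    if m is None:
        raise ValueError("not a list item: %r" % line)
    body = m.group(1)
    if body.isdigit():
        return "decimal"
    return "lower-alpha" if body.islower() else "upper-alpha"


def split_lists(items):
    """Start a new list whenever the marker style changes between items."""
    return [list(group) for _, group in groupby(items, key=marker_style)]
```

Both examples in the commit message come out as two lists under this grouping.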
2012-01-01New treatment of dashes in --smart mode.John MacFarlane1-1/+2
* `---` is always em-dash, `--` is always en-dash.
* pandoc no longer tries to guess when `-` should be en-dash.
* A new option, `--old-dashes`, is provided for legacy documents.

Rationale: The rules for en-dash are too complex and language-dependent for a guesser to work reliably. This change gives users greater control. The alternative of using unicode isn't very good, since unicode em- and en- dashes are barely distinguishable in a monospace font.
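The fixed mapping is simple enough to sketch as two string replacements. The function name `smart_dashes` is hypothetical, and this Python fragment works on a plain string, whereas the readers apply the rule while parsing.

```python
def smart_dashes(text):
    """Map '---' to an em-dash (U+2014) and '--' to an en-dash (U+2013)."""
    # Replace the longer run first so "---" is not consumed as "--" plus "-".
    return text.replace("---", "\u2014").replace("--", "\u2013")
```

A single `-` is never touched, which matches the decision to stop guessing en-dashes.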
2011-12-31Support for math in RST reader and writer.John MacFarlane1-4/+5
Inline math uses the :math:`...` construct. Display math uses

    .. math:: ...

or, if multiline,

    .. math::

       ...

These seem to be supported now by rst2latex.py.
2011-12-30Support Sphinx style math in RST reader.John MacFarlane1-4/+35
Inline:

    :math:`E=mc^2`

Block:

    .. math:: E = mc^2

    .. math::

       E = mc^2

       a = b^2

(This latter will turn into a paragraph with two display math elements.) Closes #117.
2011-12-29Better smart quote parsing.John MacFarlane4-3/+15
* Added stateLastStrPos to ParserState. This lets us keep track of whether we're parsing the position immediately after a 'str'. If we encounter a ' in such a location, it must be an apostrophe, and can't be a single quote start.
* Set this in the markdown, textile, html, and rst str parsers.
* Closes #360.
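The check can be sketched on a plain string. This is a Python approximation, not the parser-state mechanism itself: `can_open_single_quote` is a hypothetical name, and "immediately after a str" is approximated as "immediately after a letter or digit".

```python
def can_open_single_quote(text, i):
    """Return False when the ' at index i directly follows a letter or digit."""
    if text[i] != "'":
        raise ValueError("index does not point at a single quote")
    # At the very start of the text, or after punctuation or space,
    # the character may still begin a single-quoted span.
    return i == 0 or not text[i - 1].isalnum()
```

For `D'oh l'*aide*` both quote characters are ruled out as quote starts, while the first one in `*'this is quoted'*` is still allowed to open a quote.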
2011-12-27Replaced Apostrophe, Ellipses, EmDash, EnDash w/ unicode strings.John MacFarlane2-11/+9
2011-12-27LaTeX reader: Return Str instead of Apostrophe.John MacFarlane1-1/+1
2011-12-27Markdown reader: Improved previous patch to allow unicode apostrophe.John MacFarlane1-1/+2
2011-12-26Modified str parser to capture apostrophes in smart mode.John MacFarlane1-2/+9
This solves a problem stemming from the fact that a parser doesn't know what came *before* in the input stream. Previously pandoc would parse

    D'oh l'*aide*

as containing a single quoted "oh l", when both `'`s should be apostrophes. (Issue #360.)

There are two issues here.

(a) It is obvious that the first `'` is not an open quote, because of the preceding `D`. This patch solves the problem.

(b) It is obvious to us that the second `'` is not an open quote, because we see that *aide* is some text. But getting a good algorithm that has good performance is a bit tricky. You can't assume that `'` followed by `*` is always an apostrophe:

    *'this is quoted'*

This patch does not fix (b).
2011-12-05Markdown reader: Fixed backslash escapes in reference links.John MacFarlane1-4/+3
Closes #312.
2011-12-05Markdown: Better handling of escapes in link URLs and titles.John MacFarlane1-10/+8
2011-12-05Changes to fit new charsInBalanced.John MacFarlane2-8/+13
2011-12-05Markdown reader: internal changes.John MacFarlane1-5/+9
Refactored escapedChar into escapedChar', escapedChar.
2011-12-05Parsing: Changed type of escaped to return CharJohn MacFarlane2-2/+3
2011-11-12LaTeX reader: Don't crash on commands like `\itemsep`.John MacFarlane1-1/+2
Closes #314.
2011-11-12LaTeX reader: Ignore empty groups {}, { }.John MacFarlane1-0/+8
Closes #322.
2011-11-09Markdown citations: don't strip off initial space in locator.John MacFarlane1-1/+5
Previously `[@item1 and nowhere else]` yielded the locator ", and nowhere else", or, with the new citeproc-hs, "and nowhere else". Now it yields " and nowhere else".
2011-11-08TeXMath writer: Use unicode thin spaces for thin spaces.John MacFarlane1-1/+7
Partially resolves issue #333.
2011-11-06Markdown reader: allow punctuation only internally in cite keys.John MacFarlane1-1/+2
The characters '.',':',';','$','<','>','~','#','-','_' can be used only between two letters or digits in a citation key. This means that '@item1.' will be parsed as a citation, 'item1', followed by a period, instead of a citation 'item1.', as was the case previously. Thanks to David Sanson for alerting us to the problem.
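The "internal punctuation only" rule can be expressed as a regular expression. The following Python sketch is not the reader's parser: `split_cite_key` is a hypothetical name, it expects the text that follows the `@`, and it assumes a key must start with a letter or digit.

```python
import re

# One or more letters/digits, then any number of groups consisting of a
# single punctuation character followed by more letters/digits.
_CITE_KEY = re.compile(r"[^\W_]+(?:[.:;$<>~#\-_][^\W_]+)*")


def split_cite_key(text):
    """Return (key, rest) for the text after '@', or None if no key starts here."""
    m = _CITE_KEY.match(text)
    if m is None:
        return None
    return m.group(0), text[m.end():]
```

Because every punctuation character must be followed by a letter or digit, a trailing period is left outside the key, as in the `@item1.` case described above.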
2011-10-25HTML reader now recognizes DocBook block and inline tags.John MacFarlane1-5/+24
It was always possible to include raw DocBook tags in a markdown document, but now pandoc will be able to distinguish block from inline tags and behave accordingly. Thus, for example, <sidebar> hello </sidebar> will not be wrapped in `<para>` tags.
2011-08-23allow footnotes followed by newline without space charstakahashim1-2/+2
2011-08-01HTML reader: Fixed bug parsing tables w both thead and tbody.John MacFarlane1-0/+1
See bug #274, which was not completely fixed by the last patch.
2011-07-30Added PRAGMA needed for ghc 6.12.John MacFarlane1-0/+1
2011-07-30Removed applicative stuff in Markdown reader.John MacFarlane1-16/+16
It requires parsec 3, and currently pandoc can build with parsec 2.
2011-07-30Markdown reader: Improved emph/strong parsing.John MacFarlane1-13/+34
Ported code from pandoc2. Now all tests pass.
2011-07-23RST reader: Partial support for labeled footnotes.John MacFarlane1-7/+20
Also made simpleReferenceName parser more accurate, which affects several other parsers.
2011-07-23Properly handle characters in the 128..159 range.John MacFarlane1-2/+41
These aren't valid in HTML, but many HTML files produced by Windows tools contain them. We substitute correct unicode characters.
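One way to picture the substitution is to reinterpret each code point in that range as a Windows-1252 byte. That the intended mapping is Windows-1252 is an assumption here (the message only says the files come from Windows tools), and `fix_c1` is a hypothetical name for this Python sketch.

```python
def fix_c1(text):
    """Replace code points 128..159 with their Windows-1252 meaning, if any."""
    out = []
    for ch in text:
        if 128 <= ord(ch) <= 159:
            try:
                ch = bytes([ord(ch)]).decode("cp1252")
            except UnicodeDecodeError:
                pass  # byte has no Windows-1252 assignment; leave it unchanged
        out.append(ch)
    return "".join(out)
```

Under this assumption, code points 147 and 148 become left and right double quotation marks, and characters outside the range pass through untouched.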
2011-07-21LaTeX reader: Handle \subtitle command.John MacFarlane1-1/+10
If there's a subtitle, it is added to the title, separated by a colon and linebreak. Closes #280.
2011-07-21LaTeX reader & writer: Use \and to separate authors.John MacFarlane1-2/+4
Closes #279.
2011-07-16HTML reader: treat Plain as Para when needed.John MacFarlane1-9/+12
For example, in Just a few glitches remaining. <ul><li> In this situation, one loses the list. </ul> And in this, the preformatting. <pre>Preformatted text not starting with its own blank line. </pre> Thanks to Dirk Laurie for noticing the issue.
2011-07-15HTML reader: Handle tbody, thead in simple tables.John MacFarlane1-7/+17
Closes #274.
2011-07-11Merge pull request #273 from qerub/masterJohn MacFarlane1-1/+1
Textile reader: Make it possible to have colons after links.