path: root/src/Text/Pandoc/Readers/Markdown.hs
Age | Commit message | Author | Files | Lines
2012-08-01 | Improved implementation of pipe tables. | John MacFarlane | 1 | -25/+14
2012-08-01 | Parsing: removed duplication of Key and Key'. | John MacFarlane | 1 | -4/+4
Now we just use the former Key' (string contents), renamed Key. lookupKeySrc and fromKey are no longer exported. Key', toKey' and KeyTable' have become Key, toKey, and KeyTable.
2012-08-01 | Major rewrite of markdown reader. | John MacFarlane | 1 | -390/+553
* Use Builder's Inlines/Blocks instead of lists.
* Return values in the reader monad, which are then run (at the end of parsing) against the final parser state. This allows links, notes, and example numbers to be resolved without a second parser pass.
* An effect of using Builder is that everything is normalized automatically.
* New exports from Text.Pandoc.Parsing: widthsFromIndices, NoteTable', KeyTable', Key', toKey', withQuoteContext, singleQuoteStart, singleQuoteEnd, doubleQuoteStart, doubleQuoteEnd, ellipses, apostrophe, dash
* Updated opendocument tests.
* Don't derive Show for ParserState.
* Benchmarks: markdown reader takes 82% of the time it took before. Markdown writer takes 92% of the time (here the speedup is probably due to the fact that everything is normalized by default).
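The deferred-resolution idea in the second bullet can be illustrated with a minimal sketch. It is not the reader's actual code: `Inline`, `FinalState`, `F`, `str`, `refLink`, and `resolve` are simplified, hypothetical stand-ins, and the plain function type from base plays the role of the reader monad.

```haskell
-- Hypothetical, simplified stand-ins for the real document and state types.
data Inline = Str String | Link String String   -- Link label url
  deriving (Eq, Show)

-- The "final parser state": here only a table of reference-link definitions.
newtype FinalState = FinalState { keyTable :: [(String, String)] }

-- A deferred value is a function awaiting the final state.
-- Functions from a fixed argument form a reader monad.
type F a = FinalState -> a

-- Plain text needs nothing from the state.
str :: String -> F Inline
str s = const (Str s)

-- A reference link looks up its label only once the state is supplied,
-- so the definition may appear later in the input than the use.
refLink :: String -> F Inline
refLink label st =
  case lookup label (keyTable st) of
    Just url -> Link label url
    Nothing  -> Str ("[" ++ label ++ "]")

-- Run every deferred inline against the final state, once, at the end.
resolve :: FinalState -> [F Inline] -> [Inline]
resolve st = map ($ st)
```

With this shape a single pass can emit `refLink "foo"` before the definition of `foo` has been seen; the lookup happens only when `resolve` is applied to the completed state.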
2012-07-27 | Markdown reader: Added sensitivity to Ext_example_lists. | John MacFarlane | 1 | -1/+3
2012-07-27 | Markdown reader: Check fancy_lists and startnum extensions. | John MacFarlane | 1 | -2/+6
2012-07-27 | Replaced writerStrict with writerExtensions in WriterOptions. | John MacFarlane | 1 | -5/+7
Still have not implemented individual tests for all the extensions in the markdown writer.
2012-07-26 | Fixed whitespace errors. | John MacFarlane | 1 | -31/+31
2012-07-26 | Use readerExtensions instead of readerStrict in readers. | John MacFarlane | 1 | -84/+72
Test individually for the extensions.
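As a rough illustration of testing extensions individually instead of consulting one strict flag, here is a small sketch. The types are simplified assumptions (a plain list where a set would be more natural), and `enabledBlockParsers` with its parser names is hypothetical.

```haskell
-- Simplified, assumed model of per-extension reader options.
data Extension
  = Ext_pipe_tables
  | Ext_fancy_lists
  | Ext_example_lists
  deriving (Eq, Show)

newtype ReaderOptions = ReaderOptions { readerExtensions :: [Extension] }

-- Test a single extension rather than one global "strict" switch.
isEnabled :: Extension -> ReaderOptions -> Bool
isEnabled ext opts = ext `elem` readerExtensions opts

-- What used to be "strict" mode becomes "no extensions enabled".
strictOptions :: ReaderOptions
strictOptions = ReaderOptions []

-- Hypothetical: choose which block parsers to try, one test per extension.
enabledBlockParsers :: ReaderOptions -> [String]
enabledBlockParsers opts =
  [ "pipeTable"   | isEnabled Ext_pipe_tables opts ] ++
  [ "exampleList" | isEnabled Ext_example_lists opts ] ++
  [ "paragraph" ]
```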
2012-07-25 | Changed reader parameters from ParserState to ReaderOptions. | John MacFarlane | 1 | -3/+4
2012-07-25 | Moved stateApplyMacros, stateIndentedCodeClasses to ReaderOptions. | John MacFarlane | 1 | -2/+2
2012-07-25 | stateCitations -> readerCitations. | John MacFarlane | 1 | -2/+2
2012-07-25 | Moved stateTabStop to readerTabStop in ReaderOptions. | John MacFarlane | 1 | -8/+5
2012-07-25 | Moved ParseRaw from ParserState to ReaderOptions. | John MacFarlane | 1 | -1/+3
2012-07-25 | Options -> ReaderOptions. | John MacFarlane | 1 | -10/+10
Better to keep reader and writer options separate.
2012-07-25 | Put smart, strict in separate options field in state. | John MacFarlane | 1 | -10/+11
This is the beginning of a larger transition that will make Options, not ParserState, the parameter of the read functions. (Options will also be used in writers, in place of WriterOptions.) Next step is to remove strict, replacing it with granular tests for different extensions.
2012-07-24 | Don't require strict HTML blocks to begin at left margin. | John MacFarlane | 1 | -3/+1
Technically this is required, according to the markdown syntax document, but Markdown.pl and other markdown processors are more liberal.
2012-07-24 | Small fix to fix: Allow blank lines btw table and caption. | John MacFarlane | 1 | -0/+1
2012-07-24 | Fixed performance improvement to tables. | John MacFarlane | 1 | -1/+1
2012-07-24 | More performance improvements on pipe tables. | John MacFarlane | 1 | -2/+1
2012-07-24 | Refactored table parsers, captions now not part of core tableWith. | John MacFarlane | 1 | -9/+15
2012-07-24 | Slight improvement to performance for pipe tables. | John MacFarlane | 1 | -11/+16
Still, pipe tables are a huge performance drag. One benchmark: with pipe tables, 1.25 sec (including this fix); without pipe tables, 1.05 sec.
2012-07-22 | Revised code for pipe tables. | John MacFarlane | 1 | -5/+51
* All tables now require at least one body row.
* Renamed from 'extra' to 'pipe' tables.
* Moved functions from Parsing to Readers.Markdown.
* Cleaned up code; revised to parse in one pass rather than parsing a raw string, splitting it, and parsing the components.
* Allow pipe tables without pipes on the ends (as PHP Markdown Extra does).
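To illustrate the last bullet, the sketch below splits one pipe-table row into cells whether or not the outer pipes are present. It works on a raw string for brevity, so it does not reflect the one-pass parser described above; `pipeRowCells` and its helpers are hypothetical names, and escaped pipes are deliberately not handled.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- Split a string at every '|' (no escape handling in this sketch).
splitOnPipe :: String -> [String]
splitOnPipe s = case break (== '|') s of
  (cell, [])       -> [cell]
  (cell, _ : rest) -> cell : splitOnPipe rest

-- Cells of one row; the pipes at either end are optional.
pipeRowCells :: String -> [String]
pipeRowCells line = map trim (splitOnPipe (stripEnds (trim line)))
  where
    stripEnds = dropTrailing . dropLeading
    dropLeading ('|' : xs) = xs
    dropLeading xs         = xs
    dropTrailing xs
      | not (null xs) && last xs == '|' = init xs
      | otherwise                       = xs
```

Both `pipeRowCells "| a | b |"` and `pipeRowCells "a | b"` yield the same two cells.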
2012-07-22 | Merge pull request #510 from mytskine/markdown-extra | John MacFarlane | 1 | -0/+5
Markdown extra tables [part of the multi-markdown syntax for tables]
2012-07-20 | Use Parser as type synonym for Parsec. | John MacFarlane | 1 | -134/+134
2012-07-20 | Text.Pandoc.Parsing: Export all Parsec functions used in pandoc code. | John MacFarlane | 1 | -1/+0
No other module directly imports Parsec. This will make it easier to change the parsing backend in the future, if we want to.
2012-07-20 | Use Text.Parsec instead of Text.ParserCombinators.Parsec. | John MacFarlane | 1 | -137/+137
2012-06-04 | Markdown reader: Added cf. and cp. to list of likely abbreviations. | John MacFarlane | 1 | -1/+1
2012-05-08 | Treat four or more `~` or `^` in an inline context as regular text. | John MacFarlane | 1 | -3/+3
This avoids exponential parsing blowups with long strings of these characters. Closes #507.
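A minimal sketch of the rule, assuming a run of markers is classified by its length before any delimiter matching is attempted. `Marker` and `leadingRun` are hypothetical names, not the reader's own.

```haskell
-- How a leading run of '~' or '^' is treated.
data Marker
  = Delimiter Char Int   -- short run: may open or close inline markup
  | PlainText String     -- four or more: ordinary text, no backtracking
  deriving (Eq, Show)

-- Classify the run at the start of the input and return the remainder.
-- Nothing means the input does not start with '~' or '^'.
leadingRun :: String -> Maybe (Marker, String)
leadingRun [] = Nothing
leadingRun input@(c : _)
  | c `elem` "~^" =
      let (run, rest) = span (== c) input
          n           = length run
      in Just (if n >= 4 then PlainText run else Delimiter c n, rest)
  | otherwise = Nothing
```

Consuming the whole run as text in one step means a long string of these characters is never offered to the delimiter parsers, which is where repeated lookahead would otherwise occur.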
2012-04-13 | Markdown reader: Allow lists as list items. | John MacFarlane | 1 | -6/+8
So, for example:

    1. * x
       * y
    2. * z
       * w
2012-04-12 | Markdown: don't recognize references inside delimited code blocks. | John MacFarlane | 1 | -0/+1
Previously pandoc would produce incorrect results on this:

    ~~~
    [not a link]: /url
    ~~~

    [not a link]

because it would recognize "not a link" as a reference link definition on the first pass. This fix causes the first pass to skip delimited code blocks.
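The behavior can be sketched as a first pass over lines that collects reference definitions while ignoring anything between `~~~` fences. This is an illustration under simplifying assumptions (any line starting with `~~~` toggles the fence, and titles are not parsed); `referenceKeys` and `parseRef` are hypothetical names.

```haskell
import Data.List (isPrefixOf)

-- First pass: collect "[label]: url" definitions, skipping fenced code.
referenceKeys :: [String] -> [(String, String)]
referenceKeys = go False
  where
    go _ [] = []
    go inFence (l : ls)
      | "~~~" `isPrefixOf` l    = go (not inFence) ls  -- fence opens or closes
      | inFence                 = go True ls           -- inside code: ignore
      | Just kv <- parseRef l   = kv : go False ls
      | otherwise               = go False ls

-- Recognize a bare reference definition of the form "[label]: url".
parseRef :: String -> Maybe (String, String)
parseRef ('[' : rest) =
  case break (== ']') rest of
    (label, ']' : ':' : url)
      | not (null label) -> Just (label, dropWhile (== ' ') url)
    _ -> Nothing
parseRef _ = Nothing
```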
2012-02-21 | Added support for markdown-extra tables in the markdown parser | François Gannaz | 1 | -0/+5
Only tables whose lines begin with a "|" are supported. There are 2 warnings about unused variables when compiling.
2012-02-08 | Improvements to markdown attributes syntax (on code blocks). | John MacFarlane | 1 | -4/+5
(1) Attributes can contain line breaks. (2) Values in key-value attributes can be surrounded by either double or single quotes, or left unquoted if they contain no spaces.
2012-02-07 | Limit nesting of strong/emph. | John MacFarlane | 1 | -2/+14
This avoids exponential lookahead in parasitic cases, like a**a*a**a*a**a*a**a*a**a*a**a*a**a*a**. Added stateMaxNestingLevel to ParserState. We set this to 6, so you can still have Emph inside Emph, just not indefinitely.
2012-02-05 | Removed module Text.Pandoc.CharacterReferences. | John MacFarlane | 1 | -3/+3
Moved characterReference parser to Text.Pandoc.Parsing. decodeCharacterReferences is now replaced by fromEntities in Text.Pandoc.XML.
2012-02-04 | Complete rewrite of LaTeX reader. | John MacFarlane | 1 | -12/+9
* The new reader is more robust, accurate, and extensible. It is still quite incomplete, but it should be easier now to add features.
* Text.Pandoc.Parsing: Added withRaw combinator.
* Markdown reader: do escapedChar before raw latex inline. Otherwise we capture commands like \{.
* Fixed latex citation tests for new citeproc.
* Handle \include{} commands in latex. This is done in pandoc.hs, not the (pure) latex reader. But the reader exports the needed function, handleIncludes.
* Moved err and warn from pandoc.hs to Shared.
* Fixed tests - raw tex should sometimes have trailing space.
* Updated lhs-test for highlighting-kate changes.
2012-01-28 | Removed an unnecessary `many spaceChar`. | John MacFarlane | 1 | -1/+1
2012-01-28 | Markdown reader: Fixed bug in code block attribute parser. | John MacFarlane | 1 | -3/+4
Previously the ID attribute got lost if it didn't come first. Now attributes can come in any order.
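One way to make attribute order irrelevant is to fold each token into an (identifier, classes, key-value pairs) triple independently of its position. The sketch below assumes whitespace-separated tokens and unquoted values; `addToken` and `parseAttr` are hypothetical names.

```haskell
-- (identifier, classes, key-value pairs)
type Attr = (String, [String], [(String, String)])

-- Fold one token into the triple, wherever it appears in the list.
addToken :: Attr -> String -> Attr
addToken (ident, classes, kvs) tok =
  case tok of
    '#' : i -> (i, classes, kvs)
    '.' : c -> (ident, classes ++ [c], kvs)
    _ ->
      case break (== '=') tok of
        (k, '=' : v) | not (null k) -> (ident, classes, kvs ++ [(k, v)])
        _                           -> (ident, classes, kvs)  -- unrecognized

-- Parse the text between the braces of an attribute block.
parseAttr :: String -> Attr
parseAttr = foldl addToken ("", [], []) . words
```

Because each token is handled on its own, `parseAttr ".haskell #mycode"` and `parseAttr "#mycode .haskell"` give the same result.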
2012-01-28 | Support github syntax for fenced code blocks. | John MacFarlane | 1 | -10/+14
You can now write

    ```ruby
    x = 2
    ```

instead of

    ~~~ {.ruby}
    x = 2
    ~~~~
2012-01-27 | Fixed table parsing with wide or combining characters. | John MacFarlane | 1 | -4/+4
Closes #348. Closes #108.
2012-01-10 | Markdown reader: fixed bug in table/hrule parsing. | John MacFarlane | 1 | -1/+1
Top line of table must not be followed by a blank line. This bug caused slowdown on some files with hrules and tables, because pandoc tried to interpret the hrules as the tops of multiline tables.
2012-01-08 | Markdown reader: Allow links in image captions. | John MacFarlane | 1 | -13/+10
This change also means that

    [link with [link](/url)](/url)

will turn into

    <p><a href="/url">link with link</a></p>

instead of

    <p><a href="/url">link with [link](/url)</a></p>
2012-01-02 | Markdown reader: Fix parsing of consecutive lists. | John MacFarlane | 1 | -10/+12
Pandoc previously behaved like Markdown.pl for consecutive lists of different styles. Thus, the following would be parsed as a single ordered list, rather than an ordered list followed by an unordered list:

    1. one
    2. two

    - one
    - two

This patch makes pandoc behave more sensibly, parsing this as two lists. Any change in list type (ordered/unordered) or in list number style will trigger a new list. Thus, the following will also be parsed as two lists:

    1. one
    2. two

    a. one
    b. two

Since we regard this as a bug in Markdown.pl, and not something anyone would ever rely on, we do not preserve the old behavior even when `--strict` is selected.
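The splitting rule can be sketched by grouping items on their marker kind, so that any change of kind starts a new list. The `ListKind` and `Item` types are hypothetical simplifications; the real parser works on the input stream, not on a pre-built item list.

```haskell
import Data.Function (on)
import Data.List (groupBy)

-- Assumed, simplified marker kinds: one bullet style, two number styles.
data ListKind = Bullet | Decimal | LowerAlpha
  deriving (Eq, Show)

data Item = Item { itemKind :: ListKind, itemText :: String }
  deriving (Eq, Show)

-- Consecutive items stay in one list only while their kind is unchanged.
splitLists :: [Item] -> [[Item]]
splitLists = groupBy ((==) `on` itemKind)
```

Applied to two `Decimal` items followed by two `Bullet` items, `splitLists` returns two groups; the same holds for `Decimal` followed by `LowerAlpha`.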
2011-12-29 | Better smart quote parsing. | John MacFarlane | 1 | -0/+2
* Added stateLastStrPos to ParserState. This lets us keep track of whether we're parsing the position immediately after a 'str'. If we encounter a ' in such a location, it must be an apostrophe, and can't be a single quote start.
* Set this in the markdown, textile, html, and rst str parsers.
* Closes #360.
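A rough sketch of the rule in the first bullet, assuming "immediately after a str" can be approximated by "the previous character is alphanumeric". The real reader records a source position in the parser state; `quoteStartCandidates` is a hypothetical name.

```haskell
import Data.Char (isAlphaNum)

-- For each ' in the input, report whether it may start a quoted span.
-- A ' directly after word text can only be an apostrophe (or a closer).
quoteStartCandidates :: String -> [Bool]
quoteStartCandidates s =
  [ not (afterWord prev) | (prev, '\'') <- zip (Nothing : map Just s) s ]
  where
    afterWord (Just c) = isAlphaNum c
    afterWord Nothing  = False
```

For `D'oh l'*aide*` neither `'` is a candidate quote start, while in `*'this is quoted'*` only the first one is.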
2011-12-27 | Replaced Apostrophe, Ellipses, EmDash, EnDash w/ unicode strings. | John MacFarlane | 1 | -8/+6
2011-12-27 | Markdown reader: Improved previous patch to allow unicode apostrophe. | John MacFarlane | 1 | -1/+2
2011-12-26 | Modified str parser to capture apostrophes in smart mode. | John MacFarlane | 1 | -2/+9
This solves a problem stemming from the fact that a parser doesn't know what came *before* in the input stream. Previously pandoc would parse

    D'oh l'*aide*

as containing a single quoted "oh l", when both `'`s should be apostrophes. (Issue #360.)

There are two issues here.

(a) It is obvious that the first `'` is not an open quote, because of the preceding `D`. This patch solves the problem.

(b) It is obvious to us that the second `'` is not an open quote, because we see that *aide* is some text. But getting a good algorithm that has good performance is a bit tricky. You can't assume that `'` followed by `*` is always an apostrophe:

    *'this is quoted'*

This patch does not fix (b).
2011-12-05 | Markdown reader: Fixed backslash escapes in reference links. | John MacFarlane | 1 | -4/+3
Closes #312.
2011-12-05 | Markdown: Better handling of escapes in link URLs and titles. | John MacFarlane | 1 | -10/+8
2011-12-05 | Changes to fit new charsInBalanced. | John MacFarlane | 1 | -6/+11
2011-12-05 | Markdown reader: internal changes. | John MacFarlane | 1 | -5/+9
Refactored escapedChar into escapedChar', escapedChar.