| author | John MacFarlane <jgm@berkeley.edu> | 2010-11-28 20:19:32 -0800 | 
|---|---|---|
| committer | John MacFarlane <jgm@berkeley.edu> | 2010-11-28 20:19:32 -0800 | 
| commit | 3ffd7246173b8bdc0d25167355f55e3193aaa989 (patch) | |
| tree | 5e4e071fc0ab13a99a69e95a475ca05e4b3fb23a | |
| parent | e9cfbd5adc7a2f19be24904b6fb4d200ddbaa9ce (diff) | |
Markdown parser performance improvement.
Do a quick lookahead to make sure what follows looks like a setext
header before parsing any Inlines.  This gives a 15% performance
boost in one benchmark.  Many thanks to knieriem for finding
the problem (in peg-markdown):
https://github.com/jgm/peg-markdown/issues/issue/3
| -rw-r--r-- | src/Text/Pandoc/Readers/Markdown.hs | 3 | 
1 files changed, 3 insertions, 0 deletions
diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs
index b655ea1a9..9fe5c9f06 100644
--- a/src/Text/Pandoc/Readers/Markdown.hs
+++ b/src/Text/Pandoc/Readers/Markdown.hs
@@ -320,6 +320,9 @@ atxClosing = try $ skipMany (char '#') >> blanklines
 
 setextHeader :: GenParser Char ParserState Block
 setextHeader = try $ do
+  -- This lookahead prevents us from wasting time parsing Inlines
+  -- unless necessary -- it gives a significant performance boost.
+  lookAhead $ anyLine >> many1 (oneOf setextHChars) >> blankline
   text <- many1Till inline newline
   underlineChar <- oneOf setextHChars
   many (char underlineChar)
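
The idea behind the patch can be tried outside pandoc with a small standalone parser. The following is only a sketch, not pandoc's code: it assumes the modern `Text.Parsec` API from the `parsec` package, and the names `underlineChars`, `restOfLine`, `looksLikeSetext`, `Header`, and `parseHeader` are hypothetical. A plain `many1 (noneOf "\n")` stands in for the expensive `inline` parser, and the end-of-line check is simpler than pandoc's `blankline`.

```haskell
import Control.Monad (void)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical result type: header level and raw title text.
data Header = Header Int String deriving (Show, Eq)

-- Assumed stand-in for pandoc's setextHChars.
underlineChars :: String
underlineChars = "=-"

-- Consume one raw line, including its terminating newline.
restOfLine :: Parser String
restOfLine = manyTill anyChar newline

-- Cheap shape check: one line of text followed by an underline.
-- lookAhead restores the input on success; try restores it on failure.
looksLikeSetext :: Parser ()
looksLikeSetext = lookAhead $ try $
  restOfLine >> many1 (oneOf underlineChars) >> (void newline <|> eof)

setextHeader :: Parser Header
setextHeader = try $ do
  looksLikeSetext                 -- bail out early on ordinary text
  title <- many1 (noneOf "\n")    -- stand-in for the costly inline parser
  _ <- newline
  c <- oneOf underlineChars
  _ <- many (char c)
  void newline <|> eof
  return (Header (if c == '=' then 1 else 2) title)

-- Convenience wrapper that discards the parse error.
parseHeader :: String -> Maybe Header
parseHeader = either (const Nothing) Just . parse setextHeader ""

main :: IO ()
main = do
  print (parseHeader "Title\n=====\n")                  -- Just (Header 1 "Title")
  print (parseHeader "Just a paragraph\nmore text\n")   -- Nothing
```

On the second input the lookahead fails as soon as it sees that the next line does not start with an underline character, so the title parser never runs. That is the saving the commit message describes: most blocks in a document are not setext headers, and without the check each of them would be parsed as inlines once before the header parser gives up.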
