author | John MacFarlane <jgm@berkeley.edu> | 2010-11-28 20:19:32 -0800 |
---|---|---|
committer | John MacFarlane <jgm@berkeley.edu> | 2010-11-28 20:19:32 -0800 |
commit | 3ffd7246173b8bdc0d25167355f55e3193aaa989 (patch) | |
tree | 5e4e071fc0ab13a99a69e95a475ca05e4b3fb23a | |
parent | e9cfbd5adc7a2f19be24904b6fb4d200ddbaa9ce (diff) | |
Markdown parser performance improvement.
Do a quick lookahead to make sure what follows looks like a setext
header before parsing any Inlines. This gives a 15% performance
boost in one benchmark. Many thanks to knieriem for finding
the problem (in peg-markdown):
https://github.com/jgm/peg-markdown/issues/issue/3
-rw-r--r-- | src/Text/Pandoc/Readers/Markdown.hs | 3 |
1 files changed, 3 insertions, 0 deletions
diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs
index b655ea1a9..9fe5c9f06 100644
--- a/src/Text/Pandoc/Readers/Markdown.hs
+++ b/src/Text/Pandoc/Readers/Markdown.hs
@@ -320,6 +320,9 @@ atxClosing = try $ skipMany (char '#') >> blanklines
 
 setextHeader :: GenParser Char ParserState Block
 setextHeader = try $ do
+  -- This lookahead prevents us from wasting time parsing Inlines
+  -- unless necessary -- it gives a significant performance boost.
+  lookAhead $ anyLine >> many1 (oneOf setextHChars) >> blankline
   text <- many1Till inline newline
   underlineChar <- oneOf setextHChars
   many (char underlineChar)
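For readers who want to try the lookahead guard outside pandoc, the following is a minimal standalone sketch using the parsec package. It is not pandoc's code: `anyLine`, `blankline`, and `setextHChars` are simplified stand-ins assumed to behave roughly like pandoc's helpers of the same names, `looksLikeSetext` and `parseHeader` are hypothetical names, and the title is read as plain characters rather than pandoc's `inline` parser.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Underline characters for setext headers (assumed to match pandoc's setextHChars).
setextHChars :: String
setextHChars = "=-"

-- Consume the rest of the current line, including its newline.
anyLine :: Parser String
anyLine = manyTill anyChar newline

-- Optional spaces or tabs followed by a newline.
blankline :: Parser Char
blankline = try (skipMany (oneOf " \t") >> newline)

-- Cheap guard: one line of anything, then a line made only of underline
-- characters. lookAhead consumes no input on success; try makes sure a
-- failure consumes none either.
looksLikeSetext :: Parser Char
looksLikeSetext =
  lookAhead $ try $ anyLine >> many1 (oneOf setextHChars) >> blankline

-- Returns (level, title). In pandoc the title would be parsed with the
-- comparatively expensive inline parser; plain characters stand in here.
setextHeader :: Parser (Int, String)
setextHeader = try $ do
  _ <- looksLikeSetext                  -- bail out early if no underline follows
  title <- many1 (noneOf "\n")
  _ <- newline
  underlineChar <- oneOf setextHChars
  _ <- many (char underlineChar)
  _ <- blankline
  let level = if underlineChar == '=' then 1 else 2
  return (level, title)

-- Hypothetical convenience wrapper: Nothing when the input is not a header.
parseHeader :: String -> Maybe (Int, String)
parseHeader = either (const Nothing) Just . parse setextHeader ""
```

The guard is deliberately looser than the real parse (it accepts a mixed underline such as `=-=`), so the full parser still decides whether the header is valid. The guard only rejects ordinary paragraph lines before the title is parsed.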