author: John MacFarlane <jgm@berkeley.edu> 2010-11-28 20:19:32 -0800
committer: John MacFarlane <jgm@berkeley.edu> 2010-11-28 20:19:32 -0800
commit: 3ffd7246173b8bdc0d25167355f55e3193aaa989 (patch)
tree: 5e4e071fc0ab13a99a69e95a475ca05e4b3fb23a
parent: e9cfbd5adc7a2f19be24904b6fb4d200ddbaa9ce (diff)
Markdown parser performance improvement.
Do a quick lookahead to make sure what follows looks like a setext header before parsing any Inlines. This gives a 15% performance boost in one benchmark. Many thanks to knieriem for finding the problem (in peg-markdown): https://github.com/jgm/peg-markdown/issues/issue/3
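
The lookahead idea described above can be sketched in isolation as follows. This is a minimal, self-contained illustration using Parsec, not pandoc's actual code: `anyLine`, `blankline`, and `setextHChars` are simplified stand-ins written for this example, and the header text is read as a plain string, where the real parser runs the much more expensive `inline` parser.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Assumed stand-in: the characters that may underline a setext header.
setextHChars :: String
setextHChars = "=-"

-- Assumed stand-in: consume the rest of the current line and its newline.
anyLine :: Parser String
anyLine = manyTill anyChar newline

-- Assumed stand-in: optional spaces or tabs followed by a newline.
blankline :: Parser Char
blankline = try (skipMany (oneOf " \t") >> newline)

-- Returns (header level, header text).
setextHeader :: Parser (Int, String)
setextHeader = try $ do
  -- Cheap shape check: one line, a run of underline characters, a blank line.
  -- lookAhead consumes no input on success, so the real parse starts afresh.
  _ <- lookAhead (anyLine >> many1 (oneOf setextHChars) >> blankline)
  -- Only now do the (potentially expensive) work of parsing the header text.
  text <- many1 (noneOf "\n") <* newline
  underlineChar <- oneOf setextHChars
  _ <- many (char underlineChar)
  _ <- blankline
  let level = if underlineChar == '=' then 1 else 2
  return (level, text)

main :: IO ()
main = do
  -- A header: the lookahead succeeds and the text is parsed.
  print (parse setextHeader "" "Title\n=====\n\n")
  -- Not a header: the lookahead fails before any header text is parsed.
  print (parse setextHeader "" "Just a paragraph\nmore text\n")
```

The check is deliberately looser than the full parse (it accepts a mixed run such as `=-=`), since it only needs to reject non-headers cheaply; the statements that follow it still enforce the exact rules.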
-rw-r--r-- src/Text/Pandoc/Readers/Markdown.hs | 3
1 files changed, 3 insertions, 0 deletions
diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs
index b655ea1a9..9fe5c9f06 100644
--- a/src/Text/Pandoc/Readers/Markdown.hs
+++ b/src/Text/Pandoc/Readers/Markdown.hs
@@ -320,6 +320,9 @@ atxClosing = try $ skipMany (char '#') >> blanklines
 setextHeader :: GenParser Char ParserState Block
 setextHeader = try $ do
+  -- This lookahead prevents us from wasting time parsing Inlines
+  -- unless necessary -- it gives a significant performance boost.
+  lookAhead $ anyLine >> many1 (oneOf setextHChars) >> blankline
   text <- many1Till inline newline
   underlineChar <- oneOf setextHChars
   many (char underlineChar)