From 3ffd7246173b8bdc0d25167355f55e3193aaa989 Mon Sep 17 00:00:00 2001
From: John MacFarlane
Date: Sun, 28 Nov 2010 20:19:32 -0800
Subject: Markdown parser performance improvement.

Do a quick lookahead to make sure what follows looks like a setext
header before parsing any Inlines. This gives a 15% performance boost
in one benchmark. Many thanks to knieriem for finding the problem
(in peg-markdown): https://github.com/jgm/peg-markdown/issues/issue/3
---
 src/Text/Pandoc/Readers/Markdown.hs | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs
index b655ea1a9..9fe5c9f06 100644
--- a/src/Text/Pandoc/Readers/Markdown.hs
+++ b/src/Text/Pandoc/Readers/Markdown.hs
@@ -320,6 +320,9 @@ atxClosing = try $ skipMany (char '#') >> blanklines
 
 setextHeader :: GenParser Char ParserState Block
 setextHeader = try $ do
+  -- This lookahead prevents us from wasting time parsing Inlines
+  -- unless necessary -- it gives a significant performance boost.
+  lookAhead $ anyLine >> many1 (oneOf setextHChars) >> blankline
   text <- many1Till inline newline
   underlineChar <- oneOf setextHChars
   many (char underlineChar)
-- 
cgit v1.2.3
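
The lookahead guard described in the commit message can be tried in isolation with the sketch below. It is a simplified stand-in, not the Pandoc code: it assumes the `parsec` package, it defines its own `anyLine`, `blankline`, and `setextHChars` (assumed to behave like the helpers of the same names in the patch), it parses the title as a plain string rather than as Inlines, and `looksLikeSetext` and the `(level, title)` result type are hypothetical names introduced only for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Characters that may underline a setext header
-- (assumed to match Pandoc's setextHChars).
setextHChars :: String
setextHChars = "=-"

-- Consume one line, including its terminating newline.
anyLine :: Parser String
anyLine = manyTill anyChar newline

-- A blank line: optional spaces or tabs, then a newline.
blankline :: Parser Char
blankline = try (skipMany (oneOf " \t") >> newline)

-- Cheap check (hypothetical helper): one line of anything, followed by
-- a line made only of underline characters. Consumes no input on success.
looksLikeSetext :: Parser ()
looksLikeSetext =
  lookAhead (try (anyLine >> many1 (oneOf setextHChars) >> blankline))
    >> return ()

-- Simplified header parser returning (level, title). The title parse
-- stands in for the expensive `many1Till inline newline` in the patch.
setextHeader :: Parser (Int, String)
setextHeader = try $ do
  looksLikeSetext                 -- fail fast before the costly parse
  title <- many1 (noneOf "\n")
  _ <- newline
  c <- oneOf setextHChars
  _ <- many (char c)
  _ <- blankline
  return (if c == '=' then 1 else 2, title)

main :: IO ()
main = do
  -- A header candidate: the guard passes and the full parse runs.
  print (parse setextHeader "" "Title\n=====\n")
  -- An ordinary paragraph: the guard fails before the title is parsed.
  print (parse setextHeader "" "Just a paragraph\nmore text\n")
```

The guard only scans characters, so rejecting a non-header line costs far less than running the full inline parser over it and then backtracking.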