author | John MacFarlane <jgm@berkeley.edu> | 2021-07-06 10:22:07 -0700 |
---|---|---|
committer | John MacFarlane <jgm@berkeley.edu> | 2021-07-06 10:22:07 -0700 |
commit | f88ebf3ebf49e00ffa12778caf6817cc34459e6a (patch) | |
tree | 91faf19b43bc129797c36a18430c7b16c90575d5 | |
parent | 3ed37f00771d20a1b7516f2a37b7b424b3b2f1d8 (diff) | |
download | pandoc-f88ebf3ebf49e00ffa12778caf6817cc34459e6a.tar.gz |
Markdown reader: don't try to read contents in self-closing HTML tag.
Previously we had problems parsing raw HTML with self-closing
tags like `<col/>`. The problem was that pandoc would look
for a closing tag to close the markdown contents, but the
closing tag had, in effect, already been parsed by `htmlTag`.
This fixes the issue described in
<https://groups.google.com/d/msgid/pandoc-discuss/297bc662-7841-4423-bcbb-534e99bbba09n%40googlegroups.com>.
-rw-r--r-- | src/Text/Pandoc/Readers/Markdown.hs | 5 |
1 files changed, 4 insertions, 1 deletions
    diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs
    index 1e9867d07..2dc7ddf52 100644
    --- a/src/Text/Pandoc/Readers/Markdown.hs
    +++ b/src/Text/Pandoc/Readers/Markdown.hs
    @@ -1121,6 +1121,7 @@ rawTeXBlock = do
     rawHtmlBlocks :: PandocMonad m => MarkdownParser m (F Blocks)
     rawHtmlBlocks = do
       (TagOpen tagtype _, raw) <- htmlTag isBlockTag
    +  let selfClosing = "/>" `T.isSuffixOf` raw
       -- we don't want '<td> text' to be a code block:
       skipMany spaceChar
       indentlevel <- (blankline >> length <$> many (char ' ')) <|> return 0
    @@ -1134,7 +1135,9 @@ rawHtmlBlocks = do
                      gobbleAtMostSpaces indentlevel
                      notFollowedBy' closer
                      block
    -  contents <- mconcat <$> many block'
    +  contents <- if selfClosing
    +                 then return mempty
    +                 else mconcat <$> many block'
       result <-
         try
         (do gobbleAtMostSpaces indentlevel
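
The check this commit adds can be tried in isolation. The following is a minimal sketch, not pandoc's code: `isSelfClosing`, `ContentMode`, and `contentMode` are hypothetical names introduced here for illustration, and only the ``"/>" `T.isSuffixOf` raw`` test is taken from the diff. It assumes the `text` package is available, and that `raw` holds the opening tag's source text exactly as written.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Text (Text)
import qualified Data.Text as T

-- | Hypothetical helper wrapping the test from the diff:
-- the raw text of the opening tag ends in "/>".
isSelfClosing :: Text -> Bool
isSelfClosing raw = "/>" `T.isSuffixOf` raw

-- | Hypothetical stand-in for the branch in rawHtmlBlocks:
-- either skip the contents entirely or parse blocks until a closer.
data ContentMode = SkipContents | ParseUntilCloser
  deriving (Eq, Show)

-- | Choose the mode from the raw opening tag alone.
contentMode :: Text -> ContentMode
contentMode raw
  | isSelfClosing raw = SkipContents
  | otherwise         = ParseUntilCloser

main :: IO ()
main = mapM_ report ["<col/>", "<col />", "<div>", "<td class=\"x\">"]
  where
    -- Print each sample tag next to the mode chosen for it.
    report raw = putStrLn (T.unpack raw ++ " -> " ++ show (contentMode raw))
```

Because the test is a plain suffix match, a tag written as `<col />` also counts as self-closing, while an ordinary opening tag such as `<div>` still takes the path that parses contents up to the closing tag.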