author fiddlosopher <fiddlosopher@788f1e2b-df1e-0410-8736-df70ead52e1b> 2007-12-24 04:22:20 +0000
committer fiddlosopher <fiddlosopher@788f1e2b-df1e-0410-8736-df70ead52e1b> 2007-12-24 04:22:20 +0000
commit 97992e6f7b3953297b036c3cf68eb175e1aa6806
tree 6e0d0be4a63ca5581cf01b15689c6437ea83a225 /Text/Pandoc
parent dad8e163301f7f5894f7b7e491577427b7e05d7a
Improved handling of raw HTML in Markdown reader. (Resolves Issue #36.)
Tags that can be either block or inline (e.g. <ins>) should be treated as block when appropriate and as inline when appropriate. Thus, for example, <ins>hi</ins> should be treated as a paragraph with inline <ins> tags, while <ins> hi </ins> should be treated as a paragraph within <ins> tags.

+ Moved htmlBlock after para in list of block parsers. This ensures that tags that can be either block or inline get parsed as inline when appropriate.
+ Modified rawHtmlInline' so that block elements aren't treated as inline.
+ Modified para parser so that paragraphs containing only HTML tags and blank space are not allowed. Treat these as raw HTML blocks instead.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1154 788f1e2b-df1e-0410-8736-df70ead52e1b
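The paragraph guard described in the commit message can be tried in isolation. The sketch below is not the Pandoc source: the `Inline` type is a cut-down stand-in with only the constructors the guard inspects, and `acceptAsPara` is a hypothetical helper that mirrors the `all isHtmlOrBlank` test added to `para`.

```haskell
module Main where

-- Simplified stand-in for Pandoc's Inline type (assumption: only the
-- constructors relevant to the guard are modelled here).
data Inline
  = Str String
  | Space
  | LineBreak
  | HtmlInline String
  deriving (Eq, Show)

-- Same shape as the predicate introduced by this commit.
isHtmlOrBlank :: Inline -> Bool
isHtmlOrBlank (HtmlInline _) = True
isHtmlOrBlank Space          = True
isHtmlOrBlank LineBreak      = True
isHtmlOrBlank _              = False

-- Hypothetical helper: True when the inlines contain something other
-- than raw HTML and blank space, i.e. when the para guard would pass.
acceptAsPara :: [Inline] -> Bool
acceptAsPara inlines = not (all isHtmlOrBlank inlines)

main :: IO ()
main = do
  -- Tags around real text: still a paragraph.
  print (acceptAsPara [HtmlInline "<ins>", Str "hi", HtmlInline "</ins>"])
  -- Only tags and blank space: rejected, left for the raw HTML block parser.
  print (acceptAsPara [HtmlInline "<ins>", Space, HtmlInline "</ins>"])
```

Run as written, this prints `True` and then `False`. In the real parser the rejection is expressed with `fail` inside `try`, so the input consumed by `many1 inline` is restored before the next block parser runs.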
Diffstat (limited to 'Text/Pandoc')
-rw-r--r-- Text/Pandoc/Readers/Markdown.hs 16
1 files changed, 12 insertions, 4 deletions
diff --git a/Text/Pandoc/Readers/Markdown.hs b/Text/Pandoc/Readers/Markdown.hs
index 2c0bf8db8..365218b3d 100644
--- a/Text/Pandoc/Readers/Markdown.hs
+++ b/Text/Pandoc/Readers/Markdown.hs
@@ -235,9 +235,9 @@ block = choice [ header
, hrule
, list
, blockQuote
- , htmlBlock
, rawLaTeXEnvironment'
, para
+ , htmlBlock
, plain
, nullBlock ] <?> "block"
@@ -448,8 +448,16 @@ definitionList = do
-- paragraph block
--
+isHtmlOrBlank (HtmlInline _) = True
+isHtmlOrBlank (Space) = True
+isHtmlOrBlank (LineBreak) = True
+isHtmlOrBlank _ = False
+
para = try $ do
result <- many1 inline
+ if all isHtmlOrBlank result
+ then fail "treat as raw HTML"
+ else return ()
newline
blanklines <|> do st <- getState
if stateStrict st
@@ -886,8 +894,8 @@ rawLaTeXInline' = failIfStrict >> rawLaTeXInline
rawHtmlInline' = do
st <- getState
- result <- choice $ if stateStrict st
- then [htmlBlockElement, anyHtmlTag, anyHtmlEndTag]
- else [htmlBlockElement, anyHtmlInlineTag]
+ result <- if stateStrict st
+ then choice [htmlBlockElement, anyHtmlTag, anyHtmlEndTag]
+ else anyHtmlInlineTag
return $ HtmlInline result
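Why the position of htmlBlock in the `choice` list matters can be seen with a toy model of ordered choice. This is only a sketch under loud assumptions: the "parsers" are plain functions returning `Maybe`, `paraP`, `htmlBlockP`, `firstMatch`, `hasText` and `stripTags` are hypothetical names, and the toy HTML block parser accepts any input that starts with `<`, which is far cruder than the real one.

```haskell
module Main where

import Data.Foldable (asum)

data Block = Para String | RawHtml String
  deriving (Eq, Show)

-- Toy parser: a function that may fail (assumption, not Parsec).
type BlockParser = String -> Maybe Block

-- Remove anything between '<' and '>' (toy tag stripper).
stripTags :: String -> String
stripTags []         = []
stripTags ('<':rest) = stripTags (drop 1 (dropWhile (/= '>') rest))
stripTags (c:rest)   = c : stripTags rest

-- True when something other than tags and blank space remains.
hasText :: String -> Bool
hasText = any (`notElem` " \n") . stripTags

-- Paragraph parser with the guard: fails on tags-and-blanks-only input.
paraP :: BlockParser
paraP s
  | hasText s = Just (Para s)
  | otherwise = Nothing

-- Crude raw HTML block parser: accepts anything starting with '<'.
htmlBlockP :: BlockParser
htmlBlockP s
  | take 1 s == "<" = Just (RawHtml s)
  | otherwise       = Nothing

-- Ordered choice: the first alternative that succeeds wins.
firstMatch :: [BlockParser] -> BlockParser
firstMatch parsers s = asum (map ($ s) parsers)

main :: IO ()
main = do
  -- Old order: the HTML block parser claims the line first.
  print (firstMatch [htmlBlockP, paraP] "<ins>hi</ins>")
  -- New order: the paragraph parser wins when there is real text.
  print (firstMatch [paraP, htmlBlockP] "<ins>hi</ins>")
  -- New order: tags only, so para fails and htmlBlock takes over.
  print (firstMatch [paraP, htmlBlockP] "<div></div>")
```

With the toy definitions above, the three lines printed are `Just (RawHtml "<ins>hi</ins>")`, `Just (Para "<ins>hi</ins>")` and `Just (RawHtml "<div></div>")`. The third case is the reason the reordering needs the guard in `para`: without it, the paragraph parser would also swallow input that consists of nothing but tags.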