aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--Makefile7
-rw-r--r--README49
-rw-r--r--pandoc.cabal12
-rw-r--r--src/Text/Pandoc.hs3
-rw-r--r--src/Text/Pandoc/Parsing.hs22
-rw-r--r--src/Text/Pandoc/Readers/HTML.hs16
-rw-r--r--src/Text/Pandoc/Readers/LaTeX.hs36
-rw-r--r--src/Text/Pandoc/Readers/Markdown.hs14
-rw-r--r--src/Text/Pandoc/Readers/MediaWiki.hs594
-rw-r--r--src/Text/Pandoc/Readers/RST.hs2
-rw-r--r--src/Text/Pandoc/Readers/Textile.hs16
-rw-r--r--src/Text/Pandoc/Writers/Docbook.hs2
-rw-r--r--src/Text/Pandoc/Writers/Docx.hs39
-rw-r--r--src/Text/Pandoc/Writers/EPUB.hs17
-rw-r--r--src/Text/Pandoc/Writers/HTML.hs7
-rw-r--r--src/Text/Pandoc/Writers/LaTeX.hs40
-rw-r--r--src/Text/Pandoc/Writers/Markdown.hs12
-rw-r--r--src/pandoc.hs3
m---------templates16
-rw-r--r--tests/Tests/Old.hs7
-rw-r--r--tests/latex-reader.latex5
-rw-r--r--tests/latex-reader.native3
-rw-r--r--tests/mediawiki-reader.native241
-rw-r--r--tests/mediawiki-reader.wiki369
-rw-r--r--tests/tables.latex108
-rw-r--r--tests/textile-reader.native2
-rw-r--r--tests/writer.context2
-rw-r--r--tests/writer.html15
-rw-r--r--tests/writer.latex2
-rw-r--r--windows/pandoc-setup.iss2
-rw-r--r--windows/pcre-license.txt68
-rw-r--r--windows/pcre3.dllbin140288 -> 0 bytes
32 files changed, 1454 insertions, 277 deletions
diff --git a/Makefile b/Makefile
index 73923dc77..e5b4f3378 100644
--- a/Makefile
+++ b/Makefile
@@ -6,8 +6,11 @@
all:
cabal-dev configure --enable-tests --enable-benchmarks && cabal-dev build
+prof:
+ cabal-dev configure --enable-tests --enable-library-profiling --enable-executable-profiling && cabal-dev build
+
prep: pandoc-types citeproc-hs
- cabal-dev install-deps --enable-tests --enable-benchmarks
+ cabal-dev install-deps --enable-library-profiling --enable-tests --enable-benchmarks
quick:
cabal-dev configure --enable-tests --disable-optimization && cabal-dev build
@@ -27,4 +30,4 @@ citeproc-hs: pandoc-types
cabal-dev add-source citeproc-hs
install:
- cabal-dev install
+ cabal-dev install --enable-tests --enable-benchmarks
diff --git a/README b/README
index 2fa8cbce5..7a0773924 100644
--- a/README
+++ b/README
@@ -13,13 +13,14 @@ Description
Pandoc is a [Haskell] library for converting from one markup format to
another, and a command-line tool that uses this library. It can read
[markdown] and (subsets of) [Textile], [reStructuredText], [HTML],
-[LaTeX], and [DocBook XML]; and it can write plain text, [markdown],
-[reStructuredText], [XHTML], [HTML 5], [LaTeX] (including [beamer]
-slide shows), [ConTeXt], [RTF], [DocBook XML], [OpenDocument XML],
-[ODT], [Word docx], [GNU Texinfo], [MediaWiki markup], [EPUB], [FictionBook2],
-[Textile], [groff man] pages, [Emacs Org-Mode], [AsciiDoc], and [Slidy],
-[Slideous], [DZSlides], or [S5] HTML slide shows. It can also produce
-[PDF] output on systems where LaTeX is installed.
+[LaTeX], [MediaWiki markup], and [DocBook XML]; and it can write plain
+text, [markdown], [reStructuredText], [XHTML], [HTML 5], [LaTeX]
+(including [beamer] slide shows), [ConTeXt], [RTF], [DocBook XML],
+[OpenDocument XML], [ODT], [Word docx], [GNU Texinfo], [MediaWiki
+markup], [EPUB], [FictionBook2], [Textile], [groff man] pages, [Emacs
+Org-Mode], [AsciiDoc], and [Slidy], [Slideous], [DZSlides], or [S5] HTML
+slide shows. It can also produce [PDF] output on systems where LaTeX is
+installed.
Pandoc's enhanced version of markdown includes syntax for footnotes,
tables, flexible ordered lists, definition lists, fenced code blocks,
@@ -107,7 +108,7 @@ to PDF:
Production of a PDF requires that a LaTeX engine be installed (see
`--latex-engine`, below), and assumes that the following LaTeX packages are
available: `amssymb`, `amsmath`, `ifxetex`, `ifluatex`, `listings` (if the
-`--listings` option is used), `fancyvrb`, `enumerate`, `ctable`, `url`,
+`--listings` option is used), `fancyvrb`, `enumerate`, `longtable`, `url`,
`graphicx`, `hyperref`, `ulem`, `babel` (if the `lang` variable is set),
`fontspec` (if `xelatex` or `lualatex` is used as the LaTeX engine), `xltxtra`
and `xunicode` (if `xelatex` is used).
@@ -140,18 +141,18 @@ General options
`json` (JSON version of native AST), `markdown` (pandoc's
extended markdown), `markdown_strict` (original unextended markdown),
`textile` (Textile), `rst` (reStructuredText), `html` (HTML),
- `docbook` (DocBook XML), or `latex` (LaTeX). If `+lhs` is
- appended to `markdown`, `rst`, `latex`, the input will be
- treated as literate Haskell source: see [Literate Haskell
- support](#literate-haskell-support), below. Markdown syntax
- extensions can be individually enabled or disabled by appending
- `+EXTENSION` or `-EXTENSION` to the format name. So, for example,
- `markdown_strict+footnotes+definition_lists` is strict markdown
- with footnotes and definition lists enabled, and
- `markdown-pipe_tables+hard_line_breaks` is pandoc's markdown
- without pipe tables and with hard line breaks. See
- [Pandoc's markdown](#pandocs-markdown), below, for a list of
- extensions and their names.
+ `docbook` (DocBook XML), `mediawiki` (MediaWiki markup),
+ or `latex` (LaTeX). If `+lhs` is appended to `markdown`, `rst`,
+ `latex`, the input will be treated as literate Haskell source:
+ see [Literate Haskell support](#literate-haskell-support), below.
+ Markdown syntax extensions can be individually enabled or disabled
+ by appending `+EXTENSION` or `-EXTENSION` to the format name.
+ So, for example, `markdown_strict+footnotes+definition_lists`
+ is strict markdown with footnotes and definition lists enabled,
+ and `markdown-pipe_tables+hard_line_breaks` is pandoc's markdown
+ without pipe tables and with hard line breaks. See [Pandoc's
+ markdown](#pandocs-markdown), below, for a list of extensions and
+ their names.
`-t` *FORMAT*, `-w` *FORMAT*, `--to=`*FORMAT*, `--write=`*FORMAT*
: Specify output format. *FORMAT* can be `native` (native Haskell),
@@ -415,8 +416,8 @@ Options affecting specific writers
`--id-prefix`=*STRING*
: Specify a prefix to be added to all automatically generated identifiers
- in HTML output. This is useful for preventing duplicate identifiers
- when generating fragments to be included in other pages.
+ in HTML and DocBook output. This is useful for preventing duplicate
+ identifiers when generating fragments to be included in other pages.
`-T` *STRING*, `--title-prefix=`*STRING*
: Specify *STRING* as a prefix at the beginning of the title
@@ -1813,11 +1814,11 @@ HTML, Slidy, DZSlides, S5, EPUB
styled differently from the surrounding text if needed.
2. If the `--latexmathml` option is used, TeX math will be displayed
- between $ or $$ characters and put in `<span>` tags with class `LaTeX`.
+ between `$` or `$$` characters and put in `<span>` tags with class `LaTeX`.
The [LaTeXMathML] script will be used to render it as formulas.
(This trick does not work in all browsers, but it works in Firefox.
In browsers that do not support LaTeXMathML, TeX math will appear
- verbatim between $ characters.)
+ verbatim between `$` characters.)
3. If the `--jsmath` option is used, TeX math will be put inside
`<span>` tags (for inline math) or `<div>` tags (for display math)
diff --git a/pandoc.cabal b/pandoc.cabal
index e999f1b80..0b234c52f 100644
--- a/pandoc.cabal
+++ b/pandoc.cabal
@@ -16,10 +16,11 @@ Synopsis: Conversion between markup formats
Description: Pandoc is a Haskell library for converting from one markup
format to another, and a command-line tool that uses
this library. It can read markdown and (subsets of) HTML,
- reStructuredText, LaTeX, DocBook, and Textile, and it can write
- markdown, reStructuredText, HTML, LaTeX, ConTeXt, Docbook,
- OpenDocument, ODT, Word docx, RTF, MediaWiki, Textile,
- groff man pages, plain text, Emacs Org-Mode, AsciiDoc, EPUB,
+ reStructuredText, LaTeX, DocBook, MediaWiki markup,
+ and Textile, and it can write markdown, reStructuredText,
+ HTML, LaTeX, ConTeXt, Docbook, OpenDocument, ODT,
+ Word docx, RTF, MediaWiki, Textile, groff man pages,
+ plain text, Emacs Org-Mode, AsciiDoc, EPUB,
FictionBook2, and S5, Slidy and Slideous HTML slide shows.
.
Pandoc extends standard markdown syntax with footnotes,
@@ -120,6 +121,8 @@ Extra-Source-Files:
tests/markdown-citations.mhra.txt,
tests/markdown-citations.ieee.txt,
tests/textile-reader.textile,
+ tests/mediawiki-reader.wiki,
+ tests/mediawiki-reader.native,
tests/rst-reader.native,
tests/rst-reader.rst,
tests/s5.basic.html,
@@ -262,6 +265,7 @@ Library
Text.Pandoc.Readers.HTML,
Text.Pandoc.Readers.LaTeX,
Text.Pandoc.Readers.Markdown,
+ Text.Pandoc.Readers.MediaWiki,
Text.Pandoc.Readers.RST,
Text.Pandoc.Readers.DocBook,
Text.Pandoc.Readers.TeXMath,
diff --git a/src/Text/Pandoc.hs b/src/Text/Pandoc.hs
index 33706816e..1e6b1d010 100644
--- a/src/Text/Pandoc.hs
+++ b/src/Text/Pandoc.hs
@@ -66,6 +66,7 @@ module Text.Pandoc
, writers
-- * Readers: converting /to/ Pandoc format
, readMarkdown
+ , readMediaWiki
, readRST
, readLaTeX
, readHtml
@@ -110,6 +111,7 @@ module Text.Pandoc
import Text.Pandoc.Definition
import Text.Pandoc.Generic
import Text.Pandoc.Readers.Markdown
+import Text.Pandoc.Readers.MediaWiki
import Text.Pandoc.Readers.RST
import Text.Pandoc.Readers.DocBook
import Text.Pandoc.Readers.LaTeX
@@ -179,6 +181,7 @@ readers = [("native" , \_ -> readNative)
,("markdown_strict" , readMarkdown)
,("markdown" , readMarkdown)
,("rst" , readRST)
+ ,("mediawiki" , readMediaWiki)
,("docbook" , readDocBook)
,("textile" , readTextile) -- TODO : textile+lhs
,("html" , readHtml)
diff --git a/src/Text/Pandoc/Parsing.hs b/src/Text/Pandoc/Parsing.hs
index 50691f409..bee96be82 100644
--- a/src/Text/Pandoc/Parsing.hs
+++ b/src/Text/Pandoc/Parsing.hs
@@ -82,6 +82,7 @@ module Text.Pandoc.Parsing ( (>>~),
ellipses,
apostrophe,
dash,
+ nested,
macro,
applyMacros',
Parser,
@@ -379,10 +380,11 @@ uri = try $ do
char ')'
return $ '(' : res ++ ")"
str <- liftM concat $ many1 $ inParens <|> count 1 (innerPunct <|> uriChar)
+ str' <- option str $ char '/' >> return (str ++ "/")
-- now see if they amount to an absolute URI
- case parseURI (escapeURI str) of
+ case parseURI (escapeURI str') of
Just uri' -> if uriScheme uri' `elem` protocols
- then return (str, show uri')
+ then return (str', show uri')
else fail "not a URI"
Nothing -> fail "not a URI"
@@ -811,8 +813,8 @@ quoted :: Parser [Char] ParserState Inline
quoted inlineParser = doubleQuoted inlineParser <|> singleQuoted inlineParser
withQuoteContext :: QuoteContext
- -> Parser [Char] ParserState a
- -> Parser [Char] ParserState a
+ -> Parser [tok] ParserState a
+ -> Parser [tok] ParserState a
withQuoteContext context parser = do
oldState <- getState
let oldQuoteContext = stateQuoteContext oldState
@@ -924,6 +926,18 @@ emDashOld = do
try (charOrRef "\8212\151") <|> (try $ string "--" >> optional (char '-') >> return '-')
return (Str "\8212")
+-- This is used to prevent exponential blowups for things like:
+-- a**a*a**a*a**a*a**a*a**a*a**a*a**
+nested :: Parser s ParserState a
+ -> Parser s ParserState a
+nested p = do
+ nestlevel <- stateMaxNestingLevel `fmap` getState
+ guard $ nestlevel > 0
+ updateState $ \st -> st{ stateMaxNestingLevel = stateMaxNestingLevel st - 1 }
+ res <- p
+ updateState $ \st -> st{ stateMaxNestingLevel = nestlevel }
+ return res
+
--
-- Macros
--
diff --git a/src/Text/Pandoc/Readers/HTML.hs b/src/Text/Pandoc/Readers/HTML.hs
index e5c310ffc..424d9bdec 100644
--- a/src/Text/Pandoc/Readers/HTML.hs
+++ b/src/Text/Pandoc/Readers/HTML.hs
@@ -271,6 +271,7 @@ pCodeBlock = try $ do
inline :: TagParser [Inline]
inline = choice
[ pTagText
+ , pQ
, pEmph
, pStrong
, pSuperscript
@@ -306,6 +307,17 @@ pSelfClosing f g = do
optional $ pSatisfy (tagClose f)
return open
+pQ :: TagParser [Inline]
+pQ = do
+ quoteContext <- stateQuoteContext `fmap` getState
+ let quoteType = case quoteContext of
+ InDoubleQuote -> SingleQuote
+ _ -> DoubleQuote
+ let innerQuoteContext = if quoteType == SingleQuote
+ then InSingleQuote
+ else InDoubleQuote
+ withQuoteContext innerQuoteContext $ pInlinesInTags "q" (Quoted quoteType)
+
pEmph :: TagParser [Inline]
pEmph = pInlinesInTags "em" Emph <|> pInlinesInTags "i" Emph
@@ -585,9 +597,9 @@ htmlInBalanced f = try $ do
return $ tag ++ concat contents ++ endtag
-- | Matches a tag meeting a certain condition.
-htmlTag :: (Tag String -> Bool) -> Parser [Char] ParserState (Tag String, String)
+htmlTag :: (Tag String -> Bool) -> Parser [Char] st (Tag String, String)
htmlTag f = try $ do
- lookAhead (char '<')
+ lookAhead $ char '<' >> (oneOf "/!?" <|> letter)
(next : _) <- getInput >>= return . canonicalizeTags . parseTags
guard $ f next
-- advance the parser
diff --git a/src/Text/Pandoc/Readers/LaTeX.hs b/src/Text/Pandoc/Readers/LaTeX.hs
index 4a5a14d6a..86ae400de 100644
--- a/src/Text/Pandoc/Readers/LaTeX.hs
+++ b/src/Text/Pandoc/Readers/LaTeX.hs
@@ -166,10 +166,8 @@ double_quote = (doubleQuoted . mconcat) <$>
(try $ string "``" *> manyTill inline (try $ string "''"))
single_quote :: LP Inlines
-single_quote = char '`' *>
- ( try ((singleQuoted . mconcat) <$>
- manyTill inline (try $ char '\'' >> notFollowedBy letter))
- <|> lit "`")
+single_quote = (singleQuoted . mconcat) <$>
+ (try $ char '`' *> manyTill inline (try $ char '\'' >> notFollowedBy letter))
inline :: LP Inlines
inline = (mempty <$ comment)
@@ -181,6 +179,9 @@ inline = (mempty <$ comment)
((char '-') *> option (str "–") (str "—" <$ char '-')))
<|> double_quote
<|> single_quote
+ <|> (str "“" <$ try (string "``")) -- nb. {``} won't be caught by double_quote
+ <|> (str "”" <$ try (string "''"))
+ <|> (str "‘" <$ char '`') -- nb. {`} won't be caught by single_quote
<|> (str "’" <$ char '\'')
<|> (str "\160" <$ char '~')
<|> (mathDisplay $ string "$$" *> mathChars <* string "$$")
@@ -188,10 +189,9 @@ inline = (mempty <$ comment)
<|> (superscript <$> (char '^' *> tok))
<|> (subscript <$> (char '_' *> tok))
<|> (guardEnabled Ext_literate_haskell *> char '|' *> doLHSverb)
- <|> (str <$> count 1 tildeEscape)
- <|> (str <$> string "]")
- <|> (str <$> string "#") -- TODO print warning?
- <|> (str <$> string "&") -- TODO print warning?
+ <|> (str . (:[]) <$> tildeEscape)
+ <|> (str . (:[]) <$> oneOf "[]")
+ <|> (str . (:[]) <$> oneOf "#&") -- TODO print warning?
-- <|> (str <$> count 1 (satisfy (\c -> c /= '\\' && c /='\n' && c /='}' && c /='{'))) -- eat random leftover characters
inlines :: LP Inlines
@@ -203,8 +203,8 @@ block = (mempty <$ comment)
<|> environment
<|> mempty <$ macro -- TODO improve macros, make them work everywhere
<|> blockCommand
- <|> grouped block
<|> paragraph
+ <|> grouped block
<|> (mempty <$ char '&') -- loose & in table environment
@@ -214,6 +214,7 @@ blocks = mconcat <$> many block
blockCommand :: LP Blocks
blockCommand = try $ do
name <- anyControlSeq
+ guard $ name /= "begin" && name /= "end"
star <- option "" (string "*" <* optional sp)
let name' = name ++ star
case M.lookup name' blockCommands of
@@ -265,8 +266,6 @@ blockCommands = M.fromList $
, ("closing", skipopts *> closing)
--
, ("rule", skipopts *> tok *> tok *> pure horizontalRule)
- , ("begin", mzero) -- these are here so they won't be interpreted as inline
- , ("end", mzero)
, ("item", skipopts *> loose_item)
, ("documentclass", skipopts *> braced *> preamble)
, ("centerline", (para . trimInlines) <$> (skipopts *> tok))
@@ -321,6 +320,7 @@ section lvl = do
inlineCommand :: LP Inlines
inlineCommand = try $ do
name <- anyControlSeq
+ guard $ name /= "begin" && name /= "end"
guard $ not $ isBlockCommand name
parseRaw <- getOption readerParseRaw
star <- option "" (string "*")
@@ -352,6 +352,7 @@ inlineCommands = M.fromList $
, ("textsubscript", subscript <$> tok)
, ("textbackslash", lit "\\")
, ("backslash", lit "\\")
+ , ("slash", lit "/")
, ("textbf", strong <$> tok)
, ("ldots", lit "…")
, ("dots", lit "…")
@@ -644,11 +645,7 @@ inlineText :: LP Inlines
inlineText = str <$> many1 inlineChar
inlineChar :: LP Char
-inlineChar = satisfy $ \c ->
- not (c == '\\' || c == '$' || c == '%' || c == '^' || c == '_' ||
- c == '&' || c == '~' || c == '#' || c == '{' || c == '}' ||
- c == '^' || c == '\'' || c == '`' || c == '-' || c == ']' ||
- c == ' ' || c == '\t' || c == '\n' )
+inlineChar = noneOf "\\$%^_&~#{}^'`-[] \t\n"
environment :: LP Blocks
environment = do
@@ -745,6 +742,9 @@ environments = M.fromList
, ("lstlisting", codeBlock <$> (verbEnv "lstlisting"))
, ("minted", liftA2 (\l c -> codeBlockWith ("",[l],[]) c)
(grouped (many1 $ satisfy (/= '}'))) (verbEnv "minted"))
+ , ("obeylines", parseFromString
+ (para . trimInlines . mconcat <$> many inline) =<<
+ intercalate "\\\\\n" . lines <$> verbEnv "obeylines")
, ("displaymath", mathEnv Nothing "displaymath")
, ("equation", mathEnv Nothing "equation")
, ("equation*", mathEnv Nothing "equation*")
@@ -801,7 +801,9 @@ descItem = do
return (ils, [bs])
env :: String -> LP a -> LP a
-env name p = p <* (controlSeq "end" *> braced >>= guard . (== name))
+env name p = p <*
+ (try (controlSeq "end" *> braced >>= guard . (== name))
+ <?> ("\\end{" ++ name ++ "}"))
listenv :: String -> LP a -> LP a
listenv name p = try $ do
diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs
index 2407e137c..1c2cc12f1 100644
--- a/src/Text/Pandoc/Readers/Markdown.hs
+++ b/src/Text/Pandoc/Readers/Markdown.hs
@@ -1,6 +1,4 @@
{-# LANGUAGE RelaxedPolyRec #-} -- needed for inlinesBetween on GHC < 7
-{-# LANGUAGE FlexibleInstances, TypeSynonymInstances,
- GeneralizedNewtypeDeriving #-}
{-
Copyright (C) 2006-2010 John MacFarlane <jgm@berkeley.edu>
@@ -1294,18 +1292,6 @@ inlinesBetween start end =
where inner = innerSpace <|> (notFollowedBy' (() <$ whitespace) >> inline)
innerSpace = try $ whitespace >>~ notFollowedBy' end
--- This is used to prevent exponential blowups for things like:
--- a**a*a**a*a**a*a**a*a**a*a**a*a**
-nested :: Parser [Char] ParserState a
- -> Parser [Char] ParserState a
-nested p = do
- nestlevel <- stateMaxNestingLevel `fmap` getState
- guard $ nestlevel > 0
- updateState $ \st -> st{ stateMaxNestingLevel = stateMaxNestingLevel st - 1 }
- res <- p
- updateState $ \st -> st{ stateMaxNestingLevel = nestlevel }
- return res
-
emph :: Parser [Char] ParserState (F Inlines)
emph = fmap B.emph <$> nested
(inlinesBetween starStart starEnd <|> inlinesBetween ulStart ulEnd)
diff --git a/src/Text/Pandoc/Readers/MediaWiki.hs b/src/Text/Pandoc/Readers/MediaWiki.hs
new file mode 100644
index 000000000..7936be38b
--- /dev/null
+++ b/src/Text/Pandoc/Readers/MediaWiki.hs
@@ -0,0 +1,594 @@
+{-# LANGUAGE RelaxedPolyRec #-} -- needed for inlinesBetween on GHC < 7
+{-
+ Copyright (C) 2012 John MacFarlane <jgm@berkeley.edu>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+-}
+
+{- |
+ Module : Text.Pandoc.Readers.MediaWiki
+ Copyright : Copyright (C) 2012 John MacFarlane
+ License : GNU GPL, version 2 or above
+
+ Maintainer : John MacFarlane <jgm@berkeley.edu>
+ Stability : alpha
+ Portability : portable
+
+Conversion of mediawiki text to 'Pandoc' document.
+-}
+{-
+TODO:
+_ correctly handle tables within tables
+_ parse templates?
+-}
+module Text.Pandoc.Readers.MediaWiki ( readMediaWiki ) where
+
+import Text.Pandoc.Definition
+import qualified Text.Pandoc.Builder as B
+import Text.Pandoc.Builder (Inlines, Blocks, trimInlines, (<>))
+import Text.Pandoc.Options
+import Text.Pandoc.Readers.HTML ( htmlTag, isBlockTag, isCommentTag )
+import Text.Pandoc.XML ( fromEntities )
+import Text.Pandoc.Parsing hiding ( nested )
+import Text.Pandoc.Generic ( bottomUp )
+import Text.Pandoc.Shared ( stripTrailingNewlines, safeRead )
+import Data.Monoid (mconcat, mempty)
+import Control.Applicative ((<$>), (<*), (*>), (<$))
+import Control.Monad
+import Data.List (intersperse, intercalate, isPrefixOf )
+import Text.HTML.TagSoup
+import Data.Sequence (viewl, ViewL(..), (<|))
+import Data.Char (isDigit)
+
+-- | Read mediawiki from an input string and return a Pandoc document.
+readMediaWiki :: ReaderOptions -- ^ Reader options
+ -> String -- ^ String to parse (assuming @'\n'@ line endings)
+ -> Pandoc
+readMediaWiki opts s =
+ case runParser parseMediaWiki MWState{ mwOptions = opts
+ , mwMaxNestingLevel = 4
+ , mwNextLinkNumber = 1
+ , mwCategoryLinks = []
+ }
+ "source" (s ++ "\n") of
+ Left err' -> error $ "\nError:\n" ++ show err'
+ Right result -> result
+
+data MWState = MWState { mwOptions :: ReaderOptions
+ , mwMaxNestingLevel :: Int
+ , mwNextLinkNumber :: Int
+ , mwCategoryLinks :: [Inlines]
+ }
+
+type MWParser = Parser [Char] MWState
+
+--
+-- auxiliary functions
+--
+
+-- This is used to prevent exponential blowups for things like:
+-- ''a'''a''a'''a''a'''a''a'''a
+nested :: MWParser a -> MWParser a
+nested p = do
+ nestlevel <- mwMaxNestingLevel `fmap` getState
+ guard $ nestlevel > 0
+ updateState $ \st -> st{ mwMaxNestingLevel = mwMaxNestingLevel st - 1 }
+ res <- p
+ updateState $ \st -> st{ mwMaxNestingLevel = nestlevel }
+ return res
+
+specialChars :: [Char]
+specialChars = "'[]<=&*{}|\""
+
+spaceChars :: [Char]
+spaceChars = " \n\t"
+
+sym :: String -> MWParser ()
+sym s = () <$ try (string s)
+
+newBlockTags :: [String]
+newBlockTags = ["haskell","syntaxhighlight","source","gallery","references"]
+
+isBlockTag' :: Tag String -> Bool
+isBlockTag' tag@(TagOpen t _) = (isBlockTag tag || t `elem` newBlockTags) &&
+ t `notElem` eitherBlockOrInline
+isBlockTag' tag@(TagClose t) = (isBlockTag tag || t `elem` newBlockTags) &&
+ t `notElem` eitherBlockOrInline
+isBlockTag' tag = isBlockTag tag
+
+isInlineTag' :: Tag String -> Bool
+isInlineTag' (TagComment _) = True
+isInlineTag' t = not (isBlockTag' t)
+
+eitherBlockOrInline :: [String]
+eitherBlockOrInline = ["applet", "button", "del", "iframe", "ins",
+ "map", "area", "object"]
+
+htmlComment :: MWParser ()
+htmlComment = () <$ htmlTag isCommentTag
+
+inlinesInTags :: String -> MWParser Inlines
+inlinesInTags tag = try $ do
+ (_,raw) <- htmlTag (~== TagOpen tag [])
+ if '/' `elem` raw -- self-closing tag
+ then return mempty
+ else trimInlines . mconcat <$>
+ manyTill inline (htmlTag (~== TagClose tag))
+
+blocksInTags :: String -> MWParser Blocks
+blocksInTags tag = try $ do
+ (_,raw) <- htmlTag (~== TagOpen tag [])
+ if '/' `elem` raw -- self-closing tag
+ then return mempty
+ else mconcat <$> manyTill block (htmlTag (~== TagClose tag))
+
+charsInTags :: String -> MWParser [Char]
+charsInTags tag = try $ do
+ (_,raw) <- htmlTag (~== TagOpen tag [])
+ if '/' `elem` raw -- self-closing tag
+ then return ""
+ else innerText . parseTags <$>
+ manyTill anyChar (htmlTag (~== TagClose tag))
+
+--
+-- main parser
+--
+
+parseMediaWiki :: MWParser Pandoc
+parseMediaWiki = do
+ bs <- mconcat <$> many block
+ spaces
+ eof
+ categoryLinks <- reverse . mwCategoryLinks <$> getState
+ let categories = if null categoryLinks
+ then mempty
+ else B.para $ mconcat $ intersperse B.space categoryLinks
+ return $ B.doc $ bs <> categories
+
+--
+-- block parsers
+--
+
+block :: MWParser Blocks
+block = mempty <$ skipMany1 blankline
+ <|> table
+ <|> header
+ <|> hrule
+ <|> orderedList
+ <|> bulletList
+ <|> definitionList
+ <|> mempty <$ try (spaces *> htmlComment)
+ <|> preformatted
+ <|> blockTag
+ <|> (B.rawBlock "mediawiki" <$> template)
+ <|> para
+
+para :: MWParser Blocks
+para = B.para . trimInlines . mconcat <$> many1 inline
+
+table :: MWParser Blocks
+table = do
+ tableStart
+ styles <- manyTill anyChar newline
+ let tableWidth = case lookup "width" $ parseAttrs styles of
+ Just w -> maybe 1.0 id $ parseWidth w
+ Nothing -> 1.0
+ caption <- option mempty tableCaption
+ optional rowsep
+ hasheader <- option False $ True <$ (lookAhead (char '!'))
+ (cellspecs',hdr) <- unzip <$> tableRow
+ let widths = map ((tableWidth *) . snd) cellspecs'
+ let restwidth = tableWidth - sum widths
+ let zerocols = length $ filter (==0.0) widths
+ let defaultwidth = if zerocols == 0 || zerocols == length widths
+ then 0.0
+ else restwidth / fromIntegral zerocols
+ let widths' = map (\w -> if w == 0 then defaultwidth else w) widths
+ let cellspecs = zip (map fst cellspecs') widths'
+ rows' <- many $ try $ rowsep *> (map snd <$> tableRow)
+ tableEnd
+ let cols = length hdr
+ let (headers,rows) = if hasheader
+ then (hdr, rows')
+ else (replicate cols mempty, hdr:rows')
+ return $ B.table caption cellspecs headers rows
+
+parseAttrs :: String -> [(String,String)]
+parseAttrs s = case parse (many parseAttr) "attributes" s of
+ Right r -> r
+ Left _ -> []
+
+parseAttr :: Parser String () (String, String)
+parseAttr = try $ do
+ skipMany spaceChar
+ k <- many1 letter
+ char '='
+ char '"'
+ v <- many1Till anyChar (char '"')
+ return (k,v)
+
+tableStart :: MWParser ()
+tableStart = try $ guardColumnOne *> sym "{|"
+
+tableEnd :: MWParser ()
+tableEnd = try $ guardColumnOne *> sym "|}" <* blanklines
+
+rowsep :: MWParser ()
+rowsep = try $ guardColumnOne *> sym "|-" <* blanklines
+
+cellsep :: MWParser ()
+cellsep = try $
+ (guardColumnOne <*
+ ( (char '|' <* notFollowedBy (oneOf "-}+"))
+ <|> (char '!')
+ )
+ )
+ <|> (() <$ try (string "||"))
+ <|> (() <$ try (string "!!"))
+
+tableCaption :: MWParser Inlines
+tableCaption = try $ do
+ guardColumnOne
+ sym "|+"
+ skipMany spaceChar
+ res <- manyTill anyChar newline >>= parseFromString (many inline)
+ return $ trimInlines $ mconcat res
+
+tableRow :: MWParser [((Alignment, Double), Blocks)]
+tableRow = try $ many tableCell
+
+tableCell :: MWParser ((Alignment, Double), Blocks)
+tableCell = try $ do
+ cellsep
+ skipMany spaceChar
+ attrs <- option [] $ try $ parseAttrs <$>
+ manyTill (satisfy (/='\n')) (char '|' <* notFollowedBy (char '|'))
+ skipMany spaceChar
+ ls <- concat <$> many (notFollowedBy (cellsep <|> rowsep <|> tableEnd) *>
+ ((snd <$> withRaw table) <|> count 1 anyChar))
+ bs <- parseFromString (mconcat <$> many block) ls
+ let align = case lookup "align" attrs of
+ Just "left" -> AlignLeft
+ Just "right" -> AlignRight
+ Just "center" -> AlignCenter
+ _ -> AlignDefault
+ let width = case lookup "width" attrs of
+ Just xs -> maybe 0.0 id $ parseWidth xs
+ Nothing -> 0.0
+ return ((align, width), bs)
+
+parseWidth :: String -> Maybe Double
+parseWidth s =
+ case reverse s of
+ ('%':ds) | all isDigit ds -> safeRead ('0':'.':reverse ds)
+ _ -> Nothing
+
+template :: MWParser String
+template = try $ do
+ string "{{"
+ notFollowedBy (char '{')
+ let chunk = template <|> variable <|> many1 (noneOf "{}") <|> count 1 anyChar
+ contents <- manyTill chunk (try $ string "}}")
+ return $ "{{" ++ concat contents ++ "}}"
+
+blockTag :: MWParser Blocks
+blockTag = do
+ (tag, _) <- lookAhead $ htmlTag isBlockTag'
+ case tag of
+ TagOpen "blockquote" _ -> B.blockQuote <$> blocksInTags "blockquote"
+ TagOpen "pre" _ -> B.codeBlock . trimCode <$> charsInTags "pre"
+ TagOpen "syntaxhighlight" attrs -> syntaxhighlight "syntaxhighlight" attrs
+ TagOpen "source" attrs -> syntaxhighlight "source" attrs
+ TagOpen "haskell" _ -> B.codeBlockWith ("",["haskell"],[]) . trimCode <$>
+ charsInTags "haskell"
+ TagOpen "gallery" _ -> blocksInTags "gallery"
+ TagOpen "p" _ -> mempty <$ htmlTag (~== tag)
+ TagClose "p" -> mempty <$ htmlTag (~== tag)
+ _ -> B.rawBlock "html" . snd <$> htmlTag (~== tag)
+
+trimCode :: String -> String
+trimCode ('\n':xs) = stripTrailingNewlines xs
+trimCode xs = stripTrailingNewlines xs
+
+syntaxhighlight :: String -> [Attribute String] -> MWParser Blocks
+syntaxhighlight tag attrs = try $ do
+ let mblang = lookup "lang" attrs
+ let mbstart = lookup "start" attrs
+ let mbline = lookup "line" attrs
+ let classes = maybe [] (:[]) mblang ++ maybe [] (const ["numberLines"]) mbline
+ let kvs = maybe [] (\x -> [("startFrom",x)]) mbstart
+ contents <- charsInTags tag
+ return $ B.codeBlockWith ("",classes,kvs) $ trimCode contents
+
+hrule :: MWParser Blocks
+hrule = B.horizontalRule <$ try (string "----" *> many (char '-') *> newline)
+
+guardColumnOne :: MWParser ()
+guardColumnOne = getPosition >>= \pos -> guard (sourceColumn pos == 1)
+
+preformatted :: MWParser Blocks
+preformatted = try $ do
+ guardColumnOne
+ char ' '
+ let endline' = B.linebreak <$ (try $ newline <* char ' ')
+ let whitespace' = B.str <$> many1 ('\160' <$ spaceChar)
+ let spToNbsp ' ' = '\160'
+ spToNbsp x = x
+ let nowiki' = mconcat . intersperse B.linebreak . map B.str .
+ lines . fromEntities . map spToNbsp <$> try
+ (htmlTag (~== TagOpen "nowiki" []) *>
+ manyTill anyChar (htmlTag (~== TagClose "nowiki")))
+ let inline' = whitespace' <|> endline' <|> nowiki' <|> inline
+ let strToCode (Str s) = Code ("",[],[]) s
+ strToCode x = x
+ B.para . bottomUp strToCode . mconcat <$> many1 inline'
+
+header :: MWParser Blocks
+header = try $ do
+ guardColumnOne
+ eqs <- many1 (char '=')
+ let lev = length eqs
+ guard $ lev <= 6
+ contents <- trimInlines . mconcat <$> manyTill inline (count lev $ char '=')
+ return $ B.header lev contents
+
+bulletList :: MWParser Blocks
+bulletList = B.bulletList <$>
+ ( many1 (listItem '*')
+ <|> (htmlTag (~== TagOpen "ul" []) *> spaces *> many (listItem '*' <|> li) <*
+ optional (htmlTag (~== TagClose "ul"))) )
+
+orderedList :: MWParser Blocks
+orderedList =
+ (B.orderedList <$> many1 (listItem '#'))
+ <|> (B.orderedList <$> (htmlTag (~== TagOpen "ul" []) *> spaces *>
+ many (listItem '#' <|> li) <*
+ optional (htmlTag (~== TagClose "ul"))))
+ <|> do (tag,_) <- htmlTag (~== TagOpen "ol" [])
+ spaces
+ items <- many (listItem '#' <|> li)
+ optional (htmlTag (~== TagClose "ol"))
+ let start = maybe 1 id $ safeRead $ fromAttrib "start" tag
+ return $ B.orderedListWith (start, DefaultStyle, DefaultDelim) items
+
+definitionList :: MWParser Blocks
+definitionList = B.definitionList <$> many1 defListItem
+
+defListItem :: MWParser (Inlines, [Blocks])
+defListItem = try $ do
+ terms <- mconcat . intersperse B.linebreak <$> many defListTerm
+ -- we allow dd with no dt, or dt with no dd
+ defs <- if B.isNull terms
+ then many1 $ listItem ':'
+ else many $ listItem ':'
+ return (terms, defs)
+
+defListTerm :: MWParser Inlines
+defListTerm = char ';' >> skipMany spaceChar >> manyTill anyChar newline >>=
+ parseFromString (trimInlines . mconcat <$> many inline)
+
+listStart :: Char -> MWParser ()
+listStart c = char c *> notFollowedBy listStartChar
+
+listStartChar :: MWParser Char
+listStartChar = oneOf "*#;:"
+
+anyListStart :: MWParser Char
+anyListStart = char '*'
+ <|> char '#'
+ <|> char ':'
+ <|> char ';'
+
+li :: MWParser Blocks
+li = lookAhead (htmlTag (~== TagOpen "li" [])) *>
+ (firstParaToPlain <$> blocksInTags "li") <* spaces
+
+listItem :: Char -> MWParser Blocks
+listItem c = try $ do
+ extras <- many (try $ char c <* lookAhead listStartChar)
+ if null extras
+ then listItem' c
+ else do
+ skipMany spaceChar
+ first <- concat <$> manyTill listChunk newline
+ rest <- many
+ (try $ string extras *> (concat <$> manyTill listChunk newline))
+ contents <- parseFromString (many1 $ listItem' c)
+ (unlines (first : rest))
+ case c of
+ '*' -> return $ B.bulletList contents
+ '#' -> return $ B.orderedList contents
+ ':' -> return $ B.definitionList [(mempty, contents)]
+ _ -> mzero
+
+-- The point of this is to handle stuff like
+-- * {{cite book
+-- | blah
+-- | blah
+-- }}
+-- * next list item
+-- which seems to be valid mediawiki.
+listChunk :: MWParser String
+listChunk = template <|> count 1 anyChar
+
+listItem' :: Char -> MWParser Blocks
+listItem' c = try $ do
+ listStart c
+ skipMany spaceChar
+ first <- concat <$> manyTill listChunk newline
+ rest <- many (try $ char c *> lookAhead listStartChar *>
+ (concat <$> manyTill listChunk newline))
+ parseFromString (firstParaToPlain . mconcat <$> many1 block)
+ $ unlines $ first : rest
+
+firstParaToPlain :: Blocks -> Blocks
+firstParaToPlain contents =
+ case viewl (B.unMany contents) of
+ (Para xs) :< ys -> B.Many $ (Plain xs) <| ys
+ _ -> contents
+
+--
+-- inline parsers
+--
+
+inline :: MWParser Inlines
+inline = whitespace
+ <|> url
+ <|> str
+ <|> doubleQuotes
+ <|> strong
+ <|> emph
+ <|> image
+ <|> internalLink
+ <|> externalLink
+ <|> inlineTag
+ <|> B.singleton <$> charRef
+ <|> inlineHtml
+ <|> (B.rawInline "mediawiki" <$> variable)
+ <|> (B.rawInline "mediawiki" <$> template)
+ <|> special
+
+str :: MWParser Inlines
+str = B.str <$> many1 (noneOf $ specialChars ++ spaceChars)
+
+variable :: MWParser String
+variable = try $ do
+ string "{{{"
+ contents <- manyTill anyChar (try $ string "}}}")
+ return $ "{{{" ++ contents ++ "}}}"
+
+inlineTag :: MWParser Inlines
+inlineTag = do
+ (tag, _) <- lookAhead $ htmlTag isInlineTag'
+ case tag of
+ TagOpen "ref" _ -> B.note . B.plain <$> inlinesInTags "ref"
+ TagOpen "nowiki" _ -> try $ do
+ (_,raw) <- htmlTag (~== tag)
+ if '/' `elem` raw
+ then return mempty
+ else B.text . fromEntities <$>
+ manyTill anyChar (htmlTag (~== TagClose "nowiki"))
+ TagOpen "br" _ -> B.linebreak <$ (htmlTag (~== TagOpen "br" []) -- will get /> too
+ *> optional blankline)
+ TagOpen "strike" _ -> B.strikeout <$> inlinesInTags "strike"
+ TagOpen "del" _ -> B.strikeout <$> inlinesInTags "del"
+ TagOpen "sub" _ -> B.subscript <$> inlinesInTags "sub"
+ TagOpen "sup" _ -> B.superscript <$> inlinesInTags "sup"
+ TagOpen "math" _ -> B.math <$> charsInTags "math"
+ TagOpen "code" _ -> B.code <$> charsInTags "code"
+ TagOpen "tt" _ -> B.code <$> charsInTags "tt"
+ TagOpen "hask" _ -> B.codeWith ("",["haskell"],[]) <$> charsInTags "hask"
+ _ -> B.rawInline "html" . snd <$> htmlTag (~== tag)
+
+special :: MWParser Inlines
+special = B.str <$> count 1 (notFollowedBy' (htmlTag isBlockTag') *>
+ oneOf specialChars)
+
+inlineHtml :: MWParser Inlines
+inlineHtml = B.rawInline "html" . snd <$> htmlTag isInlineTag'
+
+whitespace :: MWParser Inlines
+whitespace = B.space <$ (skipMany1 spaceChar <|> endline <|> htmlComment)
+
+endline :: MWParser ()
+endline = () <$ try (newline <*
+ notFollowedBy blankline <*
+ notFollowedBy' hrule <*
+ notFollowedBy tableStart <*
+ notFollowedBy' header <*
+ notFollowedBy anyListStart)
+
+image :: MWParser Inlines
+image = try $ do
+ sym "[["
+ sym "File:"
+ fname <- many1 (noneOf "|]")
+ _ <- many (try $ char '|' *> imageOption)
+ caption <- (B.str fname <$ sym "]]")
+ <|> try (char '|' *> (mconcat <$> manyTill inline (sym "]]")))
+ return $ B.image fname "image" caption
+
+imageOption :: MWParser String
+imageOption =
+ try (oneOfStrings [ "border", "thumbnail", "frameless"
+ , "thumb", "upright", "left", "right"
+ , "center", "none", "baseline", "sub"
+ , "super", "top", "text-top", "middle"
+ , "bottom", "text-bottom" ])
+ <|> try (string "frame")
+ <|> try (many1 (oneOf "x0123456789") <* string "px")
+ <|> try (oneOfStrings ["link=","alt=","page=","class="] <* many (noneOf "|]"))
+
+internalLink :: MWParser Inlines
+internalLink = try $ do
+ sym "[["
+ let addUnderscores x = let (pref,suff) = break (=='#') x
+ in pref ++ intercalate "_" (words suff)
+ pagename <- unwords . words <$> many (noneOf "|]")
+ label <- option (B.text pagename) $ char '|' *>
+ ( (mconcat <$> many1 (notFollowedBy (char ']') *> inline))
+ -- the "pipe trick"
+ -- [[Help:Contents|] -> "Contents"
+ <|> (return $ B.text $ drop 1 $ dropWhile (/=':') pagename) )
+ sym "]]"
+ linktrail <- B.text <$> many letter
+ let link = B.link (addUnderscores pagename) "wikilink" (label <> linktrail)
+ if "Category:" `isPrefixOf` pagename
+ then do
+ updateState $ \st -> st{ mwCategoryLinks = link : mwCategoryLinks st }
+ return mempty
+ else return link
+
+externalLink :: MWParser Inlines
+externalLink = try $ do
+ char '['
+ (_, src) <- uri
+ lab <- try (trimInlines . mconcat <$>
+ (skipMany1 spaceChar *> manyTill inline (char ']')))
+ <|> do char ']'
+ num <- mwNextLinkNumber <$> getState
+ updateState $ \st -> st{ mwNextLinkNumber = num + 1 }
+ return $ B.str $ show num
+ return $ B.link src "" lab
+
+url :: MWParser Inlines
+url = do
+ (orig, src) <- uri
+ return $ B.link src "" (B.str orig)
+
+-- | Parses a list of inlines between start and end delimiters.
+inlinesBetween :: (Show b) => MWParser a -> MWParser b -> MWParser Inlines
+inlinesBetween start end =
+ (trimInlines . mconcat) <$> try (start >> many1Till inner end)
+ where inner = innerSpace <|> (notFollowedBy' (() <$ whitespace) >> inline)
+ innerSpace = try $ whitespace >>~ notFollowedBy' end
+
+emph :: MWParser Inlines
+emph = B.emph <$> nested (inlinesBetween start end)
+ where start = sym "''" >> lookAhead nonspaceChar
+ end = try $ notFollowedBy' (() <$ strong) >> sym "''"
+
+strong :: MWParser Inlines
+strong = B.strong <$> nested (inlinesBetween start end)
+ where start = sym "'''" >> lookAhead nonspaceChar
+ end = try $ sym "'''"
+
+doubleQuotes :: MWParser Inlines
+doubleQuotes = B.doubleQuoted . trimInlines . mconcat <$> try
+ ((getState >>= guard . readerSmart . mwOptions) *>
+ openDoubleQuote *> manyTill inline closeDoubleQuote )
+ where openDoubleQuote = char '"' <* lookAhead alphaNum
+ closeDoubleQuote = char '"' <* notFollowedBy alphaNum
+
diff --git a/src/Text/Pandoc/Readers/RST.hs b/src/Text/Pandoc/Readers/RST.hs
index a26b1623d..74653efcf 100644
--- a/src/Text/Pandoc/Readers/RST.hs
+++ b/src/Text/Pandoc/Readers/RST.hs
@@ -757,7 +757,7 @@ simpleTableHeader headless = try $ do
rawContent <- if headless
then return ""
else simpleTableSep '=' >> anyLine
- dashes <- simpleDashedLines '='
+ dashes <- simpleDashedLines '=' <|> simpleDashedLines '-'
newline
let lines' = map snd dashes
let indices = scanl (+) 0 lines'
diff --git a/src/Text/Pandoc/Readers/Textile.hs b/src/Text/Pandoc/Readers/Textile.hs
index 89f281ae8..dc95d9a56 100644
--- a/src/Text/Pandoc/Readers/Textile.hs
+++ b/src/Text/Pandoc/Readers/Textile.hs
@@ -47,7 +47,6 @@ Left to be implemented:
TODO : refactor common patterns across readers :
- autolink
- - smartPunctuation
- more ...
-}
@@ -62,6 +61,7 @@ import Text.Pandoc.Parsing
import Text.Pandoc.Readers.HTML ( htmlTag, isInlineTag, isBlockTag )
import Text.Pandoc.Readers.LaTeX ( rawLaTeXInline, rawLaTeXBlock )
import Text.HTML.TagSoup.Match
+import Data.List ( intercalate )
import Data.Char ( digitToInt, isUpper )
import Control.Monad ( guard, liftM )
import Control.Applicative ((<$>), (*>), (<*))
@@ -412,7 +412,7 @@ note = try $ do
-- | Special chars
markupChars :: [Char]
-markupChars = "\\[]*#_@~-+^|%="
+markupChars = "\\*#_@~-+^|%=[]"
-- | Break strings on following chars. Space tab and newline break for
-- inlines breaking. Open paren breaks for mark. Quote, dash and dot
@@ -427,13 +427,15 @@ wordBoundaries = markupChars ++ stringBreakers
-- | Parse a hyphened sequence of words
hyphenedWords :: Parser [Char] ParserState String
-hyphenedWords = try $ do
+hyphenedWords = intercalate "-" <$> sepBy1 wordChunk (char '-')
+
+wordChunk :: Parser [Char] ParserState String
+wordChunk = try $ do
hd <- noneOf wordBoundaries
tl <- many ( (noneOf wordBoundaries) <|>
- try (oneOf markupChars <* lookAhead (noneOf wordBoundaries) ) )
- let wd = hd:tl
- option wd $ try $
- (\r -> concat [wd, "-", r]) <$> (char '-' *> hyphenedWords)
+ try (notFollowedBy' note *> oneOf markupChars
+ <* lookAhead (noneOf wordBoundaries) ) )
+ return $ hd:tl
-- | Any string
str :: Parser [Char] ParserState Inline
diff --git a/src/Text/Pandoc/Writers/Docbook.hs b/src/Text/Pandoc/Writers/Docbook.hs
index e696fc63e..a38f57074 100644
--- a/src/Text/Pandoc/Writers/Docbook.hs
+++ b/src/Text/Pandoc/Writers/Docbook.hs
@@ -103,7 +103,7 @@ elementToDocbook opts lvl (Sec _ _num id' title elements) =
n | n == 0 -> "chapter"
| n >= 1 && n <= 5 -> "sect" ++ show n
| otherwise -> "simplesect"
- in inTags True tag [("id",id')] $
+ in inTags True tag [("id", writerIdentifierPrefix opts ++ id')] $
inTagsSimple "title" (inlinesToDocbook opts title) $$
vcat (map (elementToDocbook opts (lvl + 1)) elements')
diff --git a/src/Text/Pandoc/Writers/Docx.hs b/src/Text/Pandoc/Writers/Docx.hs
index 05c9555c6..84bf95dfb 100644
--- a/src/Text/Pandoc/Writers/Docx.hs
+++ b/src/Text/Pandoc/Writers/Docx.hs
@@ -60,7 +60,7 @@ data WriterState = WriterState{
, stExternalLinks :: M.Map String String
, stImages :: M.Map FilePath (String, B.ByteString)
, stListLevel :: Int
- , stListMarker :: ListMarker
+ , stListNumId :: Int
, stNumStyles :: M.Map ListMarker Int
, stLists :: [ListMarker]
}
@@ -79,7 +79,7 @@ defaultWriterState = WriterState{
, stExternalLinks = M.empty
, stImages = M.empty
, stListLevel = -1
- , stListMarker = NoMarker
+ , stListNumId = 1
, stNumStyles = M.fromList [(NoMarker, 0)]
, stLists = [NoMarker]
}
@@ -285,6 +285,9 @@ mkLvl marker lvl =
patternFor TwoParens s = "(" ++ s ++ ")"
patternFor _ s = s ++ "."
+getNumId :: WS Int
+getNumId = length `fmap` gets stLists
+
-- | Convert Pandoc document to string in OpenXML format.
writeOpenXML :: WriterOptions -> Pandoc -> WS Element
writeOpenXML opts (Pandoc (Meta tit auths dat) blocks) = do
@@ -402,11 +405,13 @@ blockToOpenXML opts (Table caption aligns widths headers rows) = do
blockToOpenXML opts (BulletList lst) = do
let marker = BulletMarker
addList marker
- asList $ concat `fmap` mapM (listItemToOpenXML opts marker) lst
+ numid <- getNumId
+ asList $ concat `fmap` mapM (listItemToOpenXML opts numid) lst
blockToOpenXML opts (OrderedList (start, numstyle, numdelim) lst) = do
let marker = NumberMarker numstyle numdelim start
addList marker
- asList $ concat `fmap` mapM (listItemToOpenXML opts marker) lst
+ numid <- getNumId
+ asList $ concat `fmap` mapM (listItemToOpenXML opts numid) lst
blockToOpenXML opts (DefinitionList items) =
concat `fmap` mapM (definitionListItemToOpenXML opts) items
@@ -418,9 +423,6 @@ definitionListItemToOpenXML opts (term,defs) = do
$ concat `fmap` mapM (blocksToOpenXML opts) defs
return $ term' ++ defs'
-getNumId :: WS Int
-getNumId = length `fmap` gets stLists
-
addList :: ListMarker -> WS ()
addList marker = do
lists <- gets stLists
@@ -431,11 +433,11 @@ addList marker = do
Nothing -> modify $ \st ->
st{ stNumStyles = M.insert marker (M.size numStyles + 1) numStyles }
-listItemToOpenXML :: WriterOptions -> ListMarker -> [Block] -> WS [Element]
+listItemToOpenXML :: WriterOptions -> Int -> [Block] -> WS [Element]
listItemToOpenXML _ _ [] = return []
-listItemToOpenXML opts marker (first:rest) = do
- first' <- withMarker marker $ blockToOpenXML opts first
- rest' <- withMarker NoMarker $ blocksToOpenXML opts rest
+listItemToOpenXML opts numid (first:rest) = do
+ first' <- withNumId numid $ blockToOpenXML opts first
+ rest' <- withNumId 1 $ blocksToOpenXML opts rest
return $ first' ++ rest'
alignmentToString :: Alignment -> [Char]
@@ -449,12 +451,12 @@ alignmentToString alignment = case alignment of
inlinesToOpenXML :: WriterOptions -> [Inline] -> WS [Element]
inlinesToOpenXML opts lst = concat `fmap` mapM (inlineToOpenXML opts) lst
-withMarker :: ListMarker -> WS a -> WS a
-withMarker m p = do
- origMarker <- gets stListMarker
- modify $ \st -> st{ stListMarker = m }
+withNumId :: Int -> WS a -> WS a
+withNumId numid p = do
+ origNumId <- gets stListNumId
+ modify $ \st -> st{ stListNumId = numid }
result <- p
- modify $ \st -> st{ stListMarker = origMarker }
+ modify $ \st -> st{ stListNumId = origNumId }
return result
asList :: WS a -> WS a
@@ -489,10 +491,7 @@ getParaProps :: WS [Element]
getParaProps = do
props <- gets stParaProperties
listLevel <- gets stListLevel
- listMarker <- gets stListMarker
- numid <- case listMarker of
- NoMarker -> return 1
- _ -> getNumId
+ numid <- gets stListNumId
let listPr = if listLevel >= 0
then [ mknode "w:numPr" []
[ mknode "w:numId" [("w:val",show numid)] ()
diff --git a/src/Text/Pandoc/Writers/EPUB.hs b/src/Text/Pandoc/Writers/EPUB.hs
index b6527c6c8..6f8931caa 100644
--- a/src/Text/Pandoc/Writers/EPUB.hs
+++ b/src/Text/Pandoc/Writers/EPUB.hs
@@ -32,6 +32,7 @@ import Data.IORef
import Data.Maybe ( fromMaybe, isNothing )
import Data.List ( findIndices, isPrefixOf )
import System.Environment ( getEnv )
+import Text.Printf (printf)
import System.FilePath ( (</>), (<.>), takeBaseName, takeExtension, takeFileName )
import qualified Data.ByteString.Lazy as B
import Data.ByteString.Lazy.UTF8 ( fromString )
@@ -122,8 +123,9 @@ writeEPUB opts doc@(Pandoc meta _) = do
let chapters = map titleize chunks
let chapToHtml = writeHtmlString opts'{ writerTemplate = pageTemplate }
let chapterToEntry :: Int -> Pandoc -> Entry
- chapterToEntry num chap = mkEntry ("ch" ++ show num ++ ".xhtml") $
- fromString $ chapToHtml chap
+ chapterToEntry num chap = mkEntry
+ (showChapter num) $
+ fromString $ chapToHtml chap
let chapterEntries = zipWith chapterToEntry [1..] chapters
-- contents.opf
@@ -334,11 +336,15 @@ data IdentState = IdentState{
identTable :: [(String,String)]
} deriving (Read, Show)
+-- Returns filename for chapter number.
+showChapter :: Int -> String
+showChapter = printf "ch%03d.xhtml"
+
-- Go through a block list and construct a table
-- correlating the automatically constructed references
-- that would be used in a normal pandoc document with
-- new URLs to be used in the EPUB. For example, what
--- was "header-1" might turn into "ch6.xhtml#header".
+-- was "header-1" might turn into "ch006.xhtml#header".
correlateRefs :: [Block] -> [(String,String)]
correlateRefs bs = identTable $ execState (mapM_ go bs)
IdentState{ chapterNumber = 0
@@ -358,8 +364,9 @@ correlateRefs bs = identTable $ execState (mapM_ go bs)
modify $ \s -> s{ runningIdents = runningid : runningIdents st
, chapterIdents = maybe (chapterIdents st)
(: chapterIdents st) chapid
- , identTable = (runningid, "ch" ++ show (chapterNumber st) ++
- ".xhtml" ++ maybe "" ('#':) chapid) : identTable st
+ , identTable = (runningid,
+ showChapter (chapterNumber st) ++
+ maybe "" ('#':) chapid) : identTable st
}
go _ = return ()
diff --git a/src/Text/Pandoc/Writers/HTML.hs b/src/Text/Pandoc/Writers/HTML.hs
index c6c4a8fd7..ebb705a61 100644
--- a/src/Text/Pandoc/Writers/HTML.hs
+++ b/src/Text/Pandoc/Writers/HTML.hs
@@ -482,10 +482,13 @@ blockToHtml opts (OrderedList (startnum, numstyle, _) lst) = do
return $ foldl (!) (ordList opts contents) attribs
blockToHtml opts (DefinitionList lst) = do
contents <- mapM (\(term, defs) ->
- do term' <- liftM (H.dt) $ inlineListToHtml opts term
+ do term' <- if null term
+ then return mempty
+ else liftM (H.dt) $ inlineListToHtml opts term
defs' <- mapM ((liftM (\x -> H.dd $ (x >> nl opts))) .
blockListToHtml opts) defs
- return $ mconcat $ nl opts : term' : nl opts : defs') lst
+ return $ mconcat $ nl opts : term' : nl opts :
+ intersperse (nl opts) defs') lst
let lst' = H.dl $ mconcat contents >> nl opts
let lst'' = if writerIncremental opts
then lst' ! A.class_ "incremental"
diff --git a/src/Text/Pandoc/Writers/LaTeX.hs b/src/Text/Pandoc/Writers/LaTeX.hs
index abbbd4d01..2b5c7e84b 100644
--- a/src/Text/Pandoc/Writers/LaTeX.hs
+++ b/src/Text/Pandoc/Writers/LaTeX.hs
@@ -49,8 +49,7 @@ import Text.Pandoc.Highlighting (highlight, styleToLaTeX,
data WriterState =
WriterState { stInNote :: Bool -- true if we're in a note
, stInTable :: Bool -- true if we're in a table
- , stTableNotes :: [(Char, Doc)] -- List of markers, notes
- -- in current table
+ , stTableNotes :: [Doc] -- List of notes in current table
, stOLLevel :: Int -- level of ordered list nesting
, stOptions :: WriterOptions -- writer options, so they don't have to be parameter
, stVerbInNote :: Bool -- true if document has verbatim text in note
@@ -190,7 +189,7 @@ stringToLaTeX isUrl (x:xs) = do
'$' -> "\\$" ++ rest
'%' -> "\\%" ++ rest
'&' -> "\\&" ++ rest
- '_' -> "\\_" ++ rest
+ '_' | not isUrl -> "\\_" ++ rest
'#' -> "\\#" ++ rest
'-' -> case xs of -- prevent adjacent hyphens from forming ligatures
('-':_) -> "-{}" ++ rest
@@ -382,27 +381,27 @@ blockToLaTeX (Table caption aligns widths heads rows) = do
modify $ \s -> s{ stInTable = True, stTableNotes = [] }
headers <- if all null heads
then return empty
- else liftM ($$ "\\ML")
- $ (tableRowToLaTeX True aligns widths) heads
+ else ($$ "\\hline\\noalign{\\medskip}") `fmap`
+ (tableRowToLaTeX True aligns widths) heads
captionText <- inlineListToLaTeX caption
let capt = if isEmpty captionText
then empty
- else text "caption = {" <> captionText <> "}," <> space
+ else text "\\noalign{\\medskip}"
+ $$ text "\\caption" <> braces captionText
rows' <- mapM (tableRowToLaTeX False aligns widths) rows
- let rows'' = intersperse ("\\\\\\noalign{\\medskip}") rows'
tableNotes <- liftM (reverse . stTableNotes) get
- let toNote (marker, x) = "\\tnote" <> brackets (char marker) <>
- braces (nest 2 x)
+ let toNote x = "\\footnotetext" <> braces (nest 2 x)
let notes = vcat $ map toNote tableNotes
let colDescriptors = text $ concat $ map toColDescriptor aligns
- let tableBody =
- ("\\ctable" <> brackets (capt <> text "pos = H, center, botcap"))
- <> braces colDescriptors
- $$ braces ("% notes" <> cr <> notes <> cr)
- $$ braces (text "% rows" $$ "\\FL" $$
- vcat (headers : rows'') $$ "\\LL" <> cr)
modify $ \s -> s{ stTable = True, stInTable = False, stTableNotes = [] }
- return $ tableBody
+ return $ "\\begin{longtable}[c]" <> braces colDescriptors
+ $$ "\\hline\\noalign{\\medskip}"
+ $$ headers
+ $$ vcat rows'
+ $$ "\\hline"
+ $$ capt
+ $$ notes
+ $$ "\\end{longtable}"
toColDescriptor :: Alignment -> String
toColDescriptor align =
@@ -433,7 +432,7 @@ tableRowToLaTeX header aligns widths cols = do
braces (text (printf "%.2f\\columnwidth" w)) <>
braces (halign a <> cr <> c <> cr)
let cells = zipWith3 toCell widths aligns renderedCells
- return $ hcat $ intersperse (" & ") cells
+ return $ hsep (intersperse "&" cells) $$ "\\\\\\noalign{\\medskip}"
listItemToLaTeX :: [Block] -> State WriterState Doc
listItemToLaTeX lst = blockListToLaTeX lst >>= return . (text "\\item" $$) .
@@ -572,7 +571,7 @@ inlineToLaTeX (LineBreak) = return "\\\\"
inlineToLaTeX Space = return space
inlineToLaTeX (Link txt ('#':ident, _)) = do
contents <- inlineListToLaTeX txt
- ident' <- stringToLaTeX False ident
+ ident' <- stringToLaTeX True ident
return $ text "\\hyperref" <> brackets (text ident') <> braces contents
inlineToLaTeX (Link txt (src, _)) =
case txt of
@@ -600,9 +599,8 @@ inlineToLaTeX (Note contents) = do
if inTable
then do
curnotes <- liftM stTableNotes get
- let marker = cycle ['a'..'z'] !! length curnotes
- modify $ \s -> s{ stTableNotes = (marker, contents') : curnotes }
- return $ "\\tmark" <> brackets (char marker) <> space
+ modify $ \s -> s{ stTableNotes = contents' : curnotes }
+ return $ "\\footnotemark" <> space
else return $ "\\footnote" <> braces (nest 2 contents' <> optnl)
-- note: a \n before } needed when note ends with a Verbatim environment
diff --git a/src/Text/Pandoc/Writers/Markdown.hs b/src/Text/Pandoc/Writers/Markdown.hs
index d88419feb..1a0731710 100644
--- a/src/Text/Pandoc/Writers/Markdown.hs
+++ b/src/Text/Pandoc/Writers/Markdown.hs
@@ -54,9 +54,12 @@ data WriterState = WriterState { stNotes :: Notes
-- | Convert Pandoc to Markdown.
writeMarkdown :: WriterOptions -> Pandoc -> String
writeMarkdown opts document =
- evalState (pandocToMarkdown opts document) WriterState{ stNotes = []
- , stRefs = []
- , stPlain = False }
+ evalState (pandocToMarkdown opts{
+ writerWrapText = writerWrapText opts &&
+ not (isEnabled Ext_hard_line_breaks opts) }
+ document) WriterState{ stNotes = []
+ , stRefs = []
+ , stPlain = False }
-- | Convert Pandoc to plain text (like markdown, but without links,
-- pictures, or inline formatting).
@@ -588,8 +591,9 @@ inlineToMarkdown opts (RawInline f str)
return $ text str
inlineToMarkdown _ (RawInline _ _) = return empty
inlineToMarkdown opts (LineBreak)
+ | isEnabled Ext_hard_line_breaks opts = return cr
| isEnabled Ext_escaped_line_breaks opts = return $ "\\" <> cr
- | otherwise = return $ " " <> cr
+ | otherwise = return $ " " <> cr
inlineToMarkdown _ Space = return space
inlineToMarkdown opts (Cite (c:cs) lst)
| writerCiteMethod opts == Citeproc = inlineListToMarkdown opts lst
diff --git a/src/pandoc.hs b/src/pandoc.hs
index 63a0df51a..cb561e817 100644
--- a/src/pandoc.hs
+++ b/src/pandoc.hs
@@ -653,7 +653,7 @@ options =
(\arg opt -> do
let url' = case arg of
Just u -> u
- Nothing -> "https://d3eoax9i5htok0.cloudfront.net/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"
+ Nothing -> "http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"
return opt { optHTMLMathMethod = MathJax url'})
"URL")
"" -- "Use MathJax for HTML math"
@@ -721,6 +721,7 @@ defaultReaderName fallback (x:xs) =
".rst" -> "rst"
".lhs" -> "markdown+lhs"
".db" -> "docbook"
+ ".wiki" -> "mediawiki"
".textile" -> "textile"
".native" -> "native"
".json" -> "json"
diff --git a/templates b/templates
-Subproject 150f3ce9c67a2b4c948fcdc5793344bf8b53072
+Subproject 22628ea9aa4e65ac11d489f22be5c765b510760
diff --git a/tests/Tests/Old.hs b/tests/Tests/Old.hs
index 8899fef6f..5360126c2 100644
--- a/tests/Tests/Old.hs
+++ b/tests/Tests/Old.hs
@@ -119,9 +119,14 @@ tests = [ testGroup "markdown"
, fb2WriterTest "math" [] "fb2.math.markdown" "fb2.math.fb2"
, fb2WriterTest "testsuite" [] "testsuite.native" "writer.fb2"
]
+ , testGroup "mediawiki"
+ [ testGroup "writer" $ writerTests "mediawiki"
+ , test "reader" ["-r", "mediawiki", "-w", "native", "-s"]
+ "mediawiki-reader.wiki" "mediawiki-reader.native"
+ ]
, testGroup "other writers" $ map (\f -> testGroup f $ writerTests f)
[ "opendocument" , "context" , "texinfo"
- , "man" , "plain" , "mediawiki", "rtf", "org", "asciidoc"
+ , "man" , "plain" , "rtf", "org", "asciidoc"
]
]
diff --git a/tests/latex-reader.latex b/tests/latex-reader.latex
index cd424baec..dd7854eb1 100644
--- a/tests/latex-reader.latex
+++ b/tests/latex-reader.latex
@@ -155,6 +155,11 @@ And:
These should not be escaped: \$ \\ \> \[ \{
\end{verbatim}
+
+\begin{obeylines}
+this has \emph{two
+lines}
+\end{obeylines}
\begin{center}\rule{3in}{0.4pt}\end{center}
\section{Lists}
diff --git a/tests/latex-reader.native b/tests/latex-reader.native
index 2873529ae..b7cf39bc3 100644
--- a/tests/latex-reader.native
+++ b/tests/latex-reader.native
@@ -57,6 +57,7 @@ Pandoc (Meta {docTitle = [Str "Pandoc",Space,Str "Test",Space,Str "Suite"], docA
,CodeBlock ("",[],[]) "---- (should be four hyphens)\n\nsub status {\n print \"working\";\n}\n\nthis code block is indented by one tab"
,Para [Str "And:"]
,CodeBlock ("",[],[]) " this code block is indented by two tabs\n\nThese should not be escaped: \\$ \\\\ \\> \\[ \\{"
+,Para [Str "this",Space,Str "has",Space,Emph [Str "two",LineBreak,Str "lines"]]
,HorizontalRule
,Header 1 [Str "Lists"]
,Header 2 [Str "Unordered"]
@@ -301,7 +302,7 @@ Pandoc (Meta {docTitle = [Str "Pandoc",Space,Str "Test",Space,Str "Suite"], docA
,Para [Str "4",Space,Str "<",Space,Str "5."]
,Para [Str "6",Space,Str ">",Space,Str "5."]
,Para [Str "Backslash:",Space,Str "\\"]
-,Para [Str "Backtick:",Space,Str "`"]
+,Para [Str "Backtick:",Space,Str "\8216"]
,Para [Str "Asterisk:",Space,Str "*"]
,Para [Str "Underscore:",Space,Str "_"]
,Para [Str "Left",Space,Str "brace:",Space,Str "{"]
diff --git a/tests/mediawiki-reader.native b/tests/mediawiki-reader.native
new file mode 100644
index 000000000..122b52545
--- /dev/null
+++ b/tests/mediawiki-reader.native
@@ -0,0 +1,241 @@
+Pandoc (Meta {docTitle = [], docAuthors = [], docDate = []})
+[Header 1 [Str "header"]
+,Header 2 [Str "header",Space,Str "level",Space,Str "two"]
+,Header 3 [Str "header",Space,Str "level",Space,Str "3"]
+,Header 4 [Str "header",Space,Emph [Str "level"],Space,Str "four"]
+,Header 5 [Str "header",Space,Str "level",Space,Str "5"]
+,Header 6 [Str "header",Space,Str "level",Space,Str "6"]
+,Para [Str "=======",Space,Str "not",Space,Str "a",Space,Str "header",Space,Str "========"]
+,Para [Code ("",[],[]) "==\160not\160a\160header\160=="]
+,Header 2 [Str "emph",Space,Str "and",Space,Str "strong"]
+,Para [Emph [Str "emph"],Space,Strong [Str "strong"]]
+,Para [Strong [Emph [Str "strong",Space,Str "and",Space,Str "emph"]]]
+,Para [Strong [Emph [Str "emph",Space,Str "inside"],Space,Str "strong"]]
+,Para [Strong [Str "strong",Space,Str "with",Space,Emph [Str "emph"]]]
+,Para [Emph [Strong [Str "strong",Space,Str "inside"],Space,Str "emph"]]
+,Header 2 [Str "horizontal",Space,Str "rule"]
+,Para [Str "top"]
+,HorizontalRule
+,Para [Str "bottom"]
+,HorizontalRule
+,Header 2 [Str "nowiki"]
+,Para [Str "''not",Space,Str "emph''"]
+,Header 2 [Str "strikeout"]
+,Para [Strikeout [Str "This",Space,Str "is",Space,Emph [Str "struck",Space,Str "out"]]]
+,Header 2 [Str "entities"]
+,Para [Str "hi",Space,Str "&",Space,Str "low"]
+,Para [Str "hi",Space,Str "&",Space,Str "low"]
+,Para [Str "G\246del"]
+,Para [Str "\777\2730"]
+,Header 2 [Str "comments"]
+,Para [Str "inline",Space,Str "comment"]
+,Para [Str "between",Space,Str "blocks"]
+,Header 2 [Str "linebreaks"]
+,Para [Str "hi",LineBreak,Str "there"]
+,Para [Str "hi",LineBreak,Str "there"]
+,Header 2 [Str ":",Space,Str "indents"]
+,Para [Str "hi"]
+,DefinitionList
+ [([],
+ [[Plain [Str "there"]]])]
+,Para [Str "bud"]
+,Para [Str "hi"]
+,DefinitionList
+ [([],
+ [[DefinitionList
+ [([],
+ [[Plain [Str "there"]]])]]])]
+,Para [Str "bud"]
+,Header 2 [Str "p",Space,Str "tags"]
+,Para [Str "hi",Space,Str "there"]
+,Para [Str "bud"]
+,Para [Str "another"]
+,Header 2 [Str "raw",Space,Str "html"]
+,Para [Str "hi",Space,RawInline "html" "<span style=\"color:red\">",Emph [Str "there"],RawInline "html" "</span>",Str "."]
+,Para [RawInline "html" "<ins>",Str "inserted",RawInline "html" "</ins>"]
+,RawBlock "html" "<div class=\"special\">"
+,Para [Str "hi",Space,Emph [Str "there"]]
+,RawBlock "html" "</div>"
+,Header 2 [Str "sup,",Space,Str "sub,",Space,Str "del"]
+,Para [Str "H",Subscript [Str "2"],Str "O",Space,Str "base",Superscript [Emph [Str "exponent"]],Space,Strikeout [Str "hello"]]
+,Header 2 [Str "inline",Space,Str "code"]
+,Para [Code ("",[],[]) "*\8594*",Space,Code ("",[],[]) "typed",Space,Code ("",["haskell"],[]) ">>="]
+,Header 2 [Str "code",Space,Str "blocks"]
+,CodeBlock ("",[],[]) "case xs of\n (_:_) -> reverse xs\n [] -> ['*']"
+,CodeBlock ("",["haskell"],[]) "case xs of\n (_:_) -> reverse xs\n [] -> ['*']"
+,CodeBlock ("",["ruby","numberLines"],[("startFrom","100")]) "widgets.each do |w|\n print w.price\nend"
+,Header 2 [Str "block",Space,Str "quotes"]
+,Para [Str "Regular",Space,Str "paragraph"]
+,BlockQuote
+ [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "block",Space,Str "quote."]
+ ,Para [Str "With",Space,Str "two",Space,Str "paragraphs."]]
+,Para [Str "Nother",Space,Str "paragraph."]
+,Header 2 [Str "external",Space,Str "links"]
+,Para [Link [Emph [Str "Google"],Space,Str "search",Space,Str "engine"] ("http://google.com","")]
+,Para [Link [Str "http://johnmacfarlane.net/pandoc/"] ("http://johnmacfarlane.net/pandoc/","")]
+,Para [Link [Str "1"] ("http://google.com",""),Space,Link [Str "2"] ("http://yahoo.com","")]
+,Para [Link [Str "email",Space,Str "me"] ("mailto:info@example.org","")]
+,Header 2 [Str "internal",Space,Str "links"]
+,Para [Link [Str "Help"] ("Help","wikilink")]
+,Para [Link [Str "the",Space,Str "help",Space,Str "page"] ("Help","wikilink")]
+,Para [Link [Str "Helpers"] ("Help","wikilink")]
+,Para [Link [Str "Help"] ("Help","wikilink"),Str "ers"]
+,Para [Link [Str "Contents"] ("Help:Contents","wikilink")]
+,Para [Link [Str "#My",Space,Str "anchor"] ("#My_anchor","wikilink")]
+,Para [Link [Str "and",Space,Str "text"] ("Page#with_anchor","wikilink")]
+,Header 2 [Str "images"]
+,Para [Image [Str "caption"] ("example.jpg","image")]
+,Para [Image [Str "the",Space,Emph [Str "caption"],Space,Str "with",Space,Link [Str "external",Space,Str "link"] ("http://google.com","")] ("example.jpg","image")]
+,Para [Image [Str "caption"] ("example.jpg","image")]
+,Para [Image [Str "example.jpg"] ("example.jpg","image")]
+,Header 2 [Str "lists"]
+,BulletList
+ [[Plain [Str "Start",Space,Str "each",Space,Str "line"]]
+ ,[Plain [Str "with",Space,Str "an",Space,Str "asterisk",Space,Str "(*)."]
+ ,BulletList
+ [[Plain [Str "More",Space,Str "asterisks",Space,Str "gives",Space,Str "deeper"]
+ ,BulletList
+ [[Plain [Str "and",Space,Str "deeper",Space,Str "levels."]]]]]]
+ ,[Plain [Str "Line",Space,Str "breaks",LineBreak,Str "don't",Space,Str "break",Space,Str "levels."]
+ ,BulletList
+ [[BulletList
+ [[Plain [Str "But",Space,Str "jumping",Space,Str "levels",Space,Str "creates",Space,Str "empty",Space,Str "space."]]]]]]]
+,Para [Str "Any",Space,Str "other",Space,Str "start",Space,Str "ends",Space,Str "the",Space,Str "list."]
+,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "Start",Space,Str "each",Space,Str "line"]]
+ ,[Plain [Str "with",Space,Str "a",Space,Str "number",Space,Str "sign",Space,Str "(#)."]
+ ,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "More",Space,Str "number",Space,Str "signs",Space,Str "gives",Space,Str "deeper"]
+ ,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "and",Space,Str "deeper"]]
+ ,[Plain [Str "levels."]]]]]]
+ ,[Plain [Str "Line",Space,Str "breaks",LineBreak,Str "don't",Space,Str "break",Space,Str "levels."]
+ ,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "But",Space,Str "jumping",Space,Str "levels",Space,Str "creates",Space,Str "empty",Space,Str "space."]]]]]]
+ ,[Plain [Str "Blank",Space,Str "lines"]]]
+,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "end",Space,Str "the",Space,Str "list",Space,Str "and",Space,Str "start",Space,Str "another."]]]
+,Para [Str "Any",Space,Str "other",Space,Str "start",Space,Str "also",Space,Str "ends",Space,Str "the",Space,Str "list."]
+,DefinitionList
+ [([Str "item",Space,Str "1"],
+ [[Plain [Str "definition",Space,Str "1"]]])
+ ,([Str "item",Space,Str "2"],
+ [[Plain [Str "definition",Space,Str "2-1"]]
+ ,[Plain [Str "definition",Space,Str "2-2"]]])]
+,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "one"]]
+ ,[Plain [Str "two"]
+ ,BulletList
+ [[Plain [Str "two",Space,Str "point",Space,Str "one"]]
+ ,[Plain [Str "two",Space,Str "point",Space,Str "two"]]]]
+ ,[Plain [Str "three"]
+ ,DefinitionList
+ [([Str "three",Space,Str "item",Space,Str "one"],
+ [[Plain [Str "three",Space,Str "def",Space,Str "one"]]])]]
+ ,[Plain [Str "four"]
+ ,DefinitionList
+ [([],
+ [[Plain [Str "four",Space,Str "def",Space,Str "one"]]
+ ,[Plain [Str "this",Space,Str "looks",Space,Str "like",Space,Str "a",Space,Str "continuation"]]
+ ,[Plain [Str "and",Space,Str "is",Space,Str "often",Space,Str "used"]]
+ ,[Plain [Str "instead",LineBreak,Str "of",Space,Str "<br/>"]]])]]
+ ,[Plain [RawInline "mediawiki" "{{{template\n|author=John\n|title=My Book\n}}}"]
+ ,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "five",Space,Str "sub",Space,Str "1"]
+ ,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "five",Space,Str "sub",Space,Str "1",Space,Str "sub",Space,Str "1"]]]]
+ ,[Plain [Str "five",Space,Str "sub",Space,Str "2"]]]]]
+,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "list",Space,Str "item",Space,Emph [Str "emph"]]
+ ,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "list",Space,Str "item",Space,Str "B1"]]
+ ,[Plain [Str "list",Space,Str "item",Space,Str "B2"]]]
+ ,Para [Str "continuing",Space,Str "list",Space,Str "item",Space,Str "A1"]]
+ ,[Plain [Str "list",Space,Str "item",Space,Str "A2"]]]
+,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "abc"]]
+ ,[Plain [Str "def"]]
+ ,[Plain [Str "ghi"]]]
+,OrderedList (9,DefaultStyle,DefaultDelim)
+ [[Plain [Str "Amsterdam"]]
+ ,[Plain [Str "Rotterdam"]]
+ ,[Plain [Str "The",Space,Str "Hague"]]]
+,Header 2 [Str "math"]
+,Para [Str "Here",Space,Str "is",Space,Str "some",Space,Math InlineMath "x=\\frac{y^\\pi}{z}",Str "."]
+,Header 2 [Str "preformatted",Space,Str "blocks"]
+,Para [Code ("",[],[]) "Start\160each\160line\160with\160a\160space.",LineBreak,Code ("",[],[]) "Text\160is\160",Strong [Code ("",[],[]) "preformatted"],Code ("",[],[]) "\160and",LineBreak,Emph [Code ("",[],[]) "markups"],Code ("",[],[]) "\160",Strong [Emph [Code ("",[],[]) "can"]],Code ("",[],[]) "\160be\160done."]
+,Para [Code ("",[],[]) "\160hell\160\160\160\160\160\160yeah"]
+,Para [Code ("",[],[]) "Start\160with\160a\160space\160in\160the\160first\160column,",LineBreak,Code ("",[],[]) "(before\160the\160<nowiki>).",LineBreak,Code ("",[],[]) "",LineBreak,Code ("",[],[]) "Then\160your\160block\160format\160will\160be",LineBreak,Code ("",[],[]) "\160\160\160\160maintained.",LineBreak,Code ("",[],[]) "",LineBreak,Code ("",[],[]) "This\160is\160good\160for\160copying\160in\160code\160blocks:",LineBreak,Code ("",[],[]) "",LineBreak,Code ("",[],[]) "def\160function():",LineBreak,Code ("",[],[]) "\160\160\160\160\"\"\"documentation\160string\"\"\"",LineBreak,Code ("",[],[]) "",LineBreak,Code ("",[],[]) "\160\160\160\160if\160True:",LineBreak,Code ("",[],[]) "\160\160\160\160\160\160\160\160print\160True",LineBreak,Code ("",[],[]) "\160\160\160\160else:",LineBreak,Code ("",[],[]) "\160\160\160\160\160\160\160\160print\160False"]
+,Para [Str "Not"]
+,RawBlock "html" "<hr/>"
+,Para [Str "preformatted"]
+,Header 2 [Str "templates"]
+,RawBlock "mediawiki" "{{Welcome}}"
+,RawBlock "mediawiki" "{{Foo:Bar}}"
+,RawBlock "mediawiki" "{{Thankyou|all your effort|Me}}"
+,Para [Str "Written",Space,RawInline "mediawiki" "{{{date}}}",Space,Str "by",Space,RawInline "mediawiki" "{{{name}}}",Str "."]
+,Header 2 [Str "tables"]
+,Table [] [AlignDefault,AlignDefault] [0.0,0.0]
+ [[]
+ ,[]]
+ [[[Para [Str "Orange"]]
+ ,[Para [Str "Apple"]]]
+ ,[[Para [Str "Bread"]]
+ ,[Para [Str "Pie"]]]
+ ,[[Para [Str "Butter"]]
+ ,[Para [Str "Ice",Space,Str "cream"]]]]
+,Table [Str "Food",Space,Str "complements"] [AlignDefault,AlignDefault] [0.0,0.0]
+ [[Para [Str "Orange"]]
+ ,[Para [Str "Apple"]]]
+ [[[Para [Str "Bread"]]
+ ,[Para [Str "Pie"]]]
+ ,[[Para [Str "Butter"]]
+ ,[Para [Str "Ice",Space,Str "cream"]]]]
+,Table [Str "Food",Space,Str "complements"] [AlignDefault,AlignDefault] [0.0,0.0]
+ [[Para [Str "Orange"]]
+ ,[Para [Str "Apple"]]]
+ [[[Para [Str "Bread"]
+ ,Para [Str "and",Space,Str "cheese"]]
+ ,[Para [Str "Pie"]
+ ,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "apple"]]
+ ,[Plain [Str "carrot"]]]]]]
+,Table [] [AlignDefault,AlignDefault,AlignDefault] [0.0,0.0,0.0]
+ [[]
+ ,[]
+ ,[]]
+ [[[Para [Str "Orange"]]
+ ,[Para [Str "Apple"]]
+ ,[Para [Str "more"]]]
+ ,[[Para [Str "Bread"]]
+ ,[Para [Str "Pie"]]
+ ,[Para [Str "more"]]]
+ ,[[Para [Str "Butter"]]
+ ,[Para [Str "Ice",Space,Str "cream"]]
+ ,[Para [Str "and",Space,Str "more"]]]]
+,Table [] [AlignLeft,AlignRight,AlignCenter] [0.25,0.125,0.125]
+ [[Para [Str "Left"]]
+ ,[Para [Str "Right"]]
+ ,[Para [Str "Center"]]]
+ [[[Para [Str "left"]]
+ ,[Para [Str "15.00"]]
+ ,[Para [Str "centered"]]]
+ ,[[Para [Str "more"]]
+ ,[Para [Str "2.0"]]
+ ,[Para [Str "more"]]]]
+,Table [] [AlignDefault,AlignDefault] [0.0,0.0]
+ [[]
+ ,[]]
+ [[[Para [Str "Orange"]]
+ ,[Para [Str "Apple"]]]
+ ,[[Para [Str "Bread"]]
+ ,[Table [] [AlignDefault,AlignDefault] [0.0,0.0]
+ [[Para [Str "fruit"]]
+ ,[Para [Str "topping"]]]
+ [[[Para [Str "apple"]]
+ ,[Para [Str "ice",Space,Str "cream"]]]]]]
+ ,[[Para [Str "Butter"]]
+ ,[Para [Str "Ice",Space,Str "cream"]]]]
+,Header 2 [Str "notes"]
+,Para [Str "My",Space,Str "note!",Note [Plain [Str "This."]]]]
diff --git a/tests/mediawiki-reader.wiki b/tests/mediawiki-reader.wiki
new file mode 100644
index 000000000..a9769594d
--- /dev/null
+++ b/tests/mediawiki-reader.wiki
@@ -0,0 +1,369 @@
+= header =
+
+== header level two ==
+
+===header level 3===
+
+====header ''level'' four====
+
+===== header level 5 =====
+
+====== header level 6 ======
+
+======= not a header ========
+
+ == not a header ==
+
+== emph and strong ==
+
+''emph'' '''strong'''
+
+'''''strong and emph'''''
+
+'''''emph inside'' strong'''
+
+'''strong with ''emph'''''
+
+'''''strong inside''' emph''
+
+== horizontal rule ==
+
+top
+----
+bottom
+
+----
+
+== nowiki ==
+
+<nowiki>''not emph''</nowiki>
+
+== strikeout ==
+
+<strike> This is ''struck out''</strike>
+
+== entities ==
+
+hi & low
+
+hi &amp; low
+
+G&ouml;del
+
+&#777;&#xAAA;
+
+== comments ==
+
+inline<!-- secret --> comment
+
+<!-- secret -->
+
+between blocks
+
+ <!-- secret -->
+
+== linebreaks ==
+
+hi<br/>there
+
+hi<br>
+there
+
+== : indents ==
+
+hi
+: there
+bud
+
+hi
+:: there
+bud
+
+== p tags ==
+
+hi there
+<p>
+bud
+<p>
+another
+</p>
+
+== raw html ==
+
+hi <span style="color:red">''there''</span>.
+
+<ins>inserted</ins>
+
+<div class="special">
+hi ''there''
+</div>
+
+== sup, sub, del ==
+
+H<sub>2</sub>O base<sup>''exponent''</sup>
+<del>hello</del>
+
+== inline code ==
+
+<code>*&rarr;*</code> <tt>typed</tt> <hask>&gt;&gt;=</hask>
+
+== code blocks ==
+
+<pre>
+case xs of
+ (_:_) -> reverse xs
+ [] -> ['*']
+</pre>
+
+<haskell>
+case xs of
+ (_:_) -> reverse xs
+ [] -> ['*']
+</haskell>
+
+<syntaxhighlight lang="ruby" line start=100>
+widgets.each do |w|
+ print w.price
+end
+</syntaxhighlight>
+
+== block quotes ==
+
+Regular paragraph
+<blockquote>
+This is a block quote.
+
+With two paragraphs.
+</blockquote>
+Nother paragraph.
+
+== external links ==
+
+[http://google.com ''Google'' search engine]
+
+http://johnmacfarlane.net/pandoc/
+
+[http://google.com] [http://yahoo.com]
+
+[mailto:info@example.org email me]
+
+== internal links ==
+
+[[Help]]
+
+[[Help|the help page]]
+
+[[Help]]ers
+
+[[Help]]<nowiki/>ers
+
+[[Help:Contents|]]
+
+[[#My anchor]]
+
+[[Page#with anchor|and text]]
+
+== images ==
+
+[[File:example.jpg|caption]]
+
+[[File:example.jpg|border|the ''caption'' with [http://google.com external link]]]
+
+[[File:example.jpg|frameless|border|30x40px|caption]]
+
+[[File:example.jpg]]
+
+== lists ==
+
+* Start each line
+* with an asterisk (*).
+** More asterisks gives deeper
+*** and deeper levels.
+* Line breaks<br/>don't break levels.
+*** But jumping levels creates empty space.
+Any other start ends the list.
+
+# Start each line
+# with a number sign (#).
+## More number signs gives deeper
+### and deeper
+### levels.
+# Line breaks<br/>don't break levels.
+### But jumping levels creates empty space.
+# Blank lines
+
+# end the list and start another.
+Any other start also
+ends the list.
+
+;item 1
+: definition 1
+;item 2
+: definition 2-1
+: definition 2-2
+
+# one
+# two
+#* two point one
+#* two point two
+# three
+#; three item one
+#: three def one
+# four
+#: four def one
+#: this looks like a continuation
+#: and is often used
+#: instead<br/>of <nowiki><br/></nowiki>
+# {{{template
+|author=John
+|title=My Book
+}}}
+## five sub 1
+### five sub 1 sub 1
+## five sub 2
+
+<ol>
+ <li>list item ''emph''
+ <ol>
+ <li>list item B1</li>
+ <li>list item B2</li>
+ </ol>continuing list item A1
+ </li>
+ <li>list item A2</li>
+</ol>
+
+<ul>
+#abc
+#def
+#ghi
+</ul>
+
+<ol start="9">
+<li>Amsterdam</li>
+<li>Rotterdam</li>
+<li>The Hague</li>
+</ol>
+
+== math ==
+
+Here is some <math>x=\frac{y^\pi}{z}</math>.
+
+== preformatted blocks ==
+
+ Start each line with a space.
+ Text is '''preformatted''' and
+ ''markups'' '''''can''''' be done.
+
+ hell yeah
+
+ <nowiki>Start with a space in the first column,
+(before the <nowiki>).
+
+Then your block format will be
+ maintained.
+
+This is good for copying in code blocks:
+
+def function():
+ """documentation string"""
+
+ if True:
+ print True
+ else:
+ print False</nowiki>
+
+Not<hr/> preformatted
+
+== templates ==
+
+{{Welcome}}
+
+{{Foo:Bar}}
+
+{{Thankyou|all your effort|Me}}
+
+Written {{{date}}} by {{{name}}}.
+
+== tables ==
+
+{|
+|-
+|Orange
+|Apple
+|-
+|Bread
+|Pie
+|-
+|Butter
+|Ice cream
+|}
+
+{|
+|+Food complements
+!Orange
+!Apple
+|-
+|Bread
+|Pie
+|-
+!Butter
+|Ice cream
+|}
+
+{|
+|+Food complements
+!Orange
+!Apple
+|-
+|Bread
+
+and cheese
+|Pie
+
+# apple
+# carrot
+
+|}
+
+{|
+| Orange || Apple || more
+|-
+| Bread || Pie || more
+|-
+| Butter || Ice cream || and more
+|}
+
+{|width="50%"
+! align="left" width="50%"| Left
+! align="right"|Right
+! align="center"|Center
+|-
+| left || 15.00 || centered
+|-
+| more || 2.0 || more
+|}
+
+{|
+|-
+|Orange
+|Apple
+|-
+|Bread
+|
+{|
+!fruit
+!topping
+|-
+|apple
+|ice cream
+|}
+|-
+|Butter
+|Ice cream
+|}
+
+
+== notes ==
+
+My note!<ref>This.</ref>
+
diff --git a/tests/tables.latex b/tests/tables.latex
index 56b469a54..2a8013413 100644
--- a/tests/tables.latex
+++ b/tests/tables.latex
@@ -1,64 +1,59 @@
Simple table with caption:
-\ctable[caption = {Demonstration of simple table syntax.},
-pos = H, center, botcap]{rlcl}
-{% notes
-}
-{% rows
-\FL
+\begin{longtable}[c]{rlcl}
+\hline\noalign{\medskip}
Right & Left & Center & Default
-\ML
+\\\noalign{\medskip}
+\hline\noalign{\medskip}
12 & 12 & 12 & 12
\\\noalign{\medskip}
123 & 123 & 123 & 123
\\\noalign{\medskip}
1 & 1 & 1 & 1
-\LL
-}
+\\\noalign{\medskip}
+\hline
+\noalign{\medskip}
+\caption{Demonstration of simple table syntax.}
+\end{longtable}
Simple table without caption:
-\ctable[pos = H, center, botcap]{rlcl}
-{% notes
-}
-{% rows
-\FL
+\begin{longtable}[c]{rlcl}
+\hline\noalign{\medskip}
Right & Left & Center & Default
-\ML
+\\\noalign{\medskip}
+\hline\noalign{\medskip}
12 & 12 & 12 & 12
\\\noalign{\medskip}
123 & 123 & 123 & 123
\\\noalign{\medskip}
1 & 1 & 1 & 1
-\LL
-}
+\\\noalign{\medskip}
+\hline
+\end{longtable}
Simple table indented two spaces:
-\ctable[caption = {Demonstration of simple table syntax.},
-pos = H, center, botcap]{rlcl}
-{% notes
-}
-{% rows
-\FL
+\begin{longtable}[c]{rlcl}
+\hline\noalign{\medskip}
Right & Left & Center & Default
-\ML
+\\\noalign{\medskip}
+\hline\noalign{\medskip}
12 & 12 & 12 & 12
\\\noalign{\medskip}
123 & 123 & 123 & 123
\\\noalign{\medskip}
1 & 1 & 1 & 1
-\LL
-}
+\\\noalign{\medskip}
+\hline
+\noalign{\medskip}
+\caption{Demonstration of simple table syntax.}
+\end{longtable}
Multiline table with caption:
-\ctable[caption = {Here's the caption. It may span multiple lines.},
-pos = H, center, botcap]{clrl}
-{% notes
-}
-{% rows
-\FL
+\begin{longtable}[c]{clrl}
+\hline\noalign{\medskip}
\parbox[b]{0.15\columnwidth}{\centering
Centered Header
} & \parbox[b]{0.14\columnwidth}{\raggedright
@@ -68,7 +63,8 @@ Right Aligned
} & \parbox[b]{0.34\columnwidth}{\raggedright
Default aligned
}
-\ML
+\\\noalign{\medskip}
+\hline\noalign{\medskip}
\parbox[t]{0.15\columnwidth}{\centering
First
} & \parbox[t]{0.14\columnwidth}{\raggedright
@@ -88,16 +84,16 @@ row
} & \parbox[t]{0.34\columnwidth}{\raggedright
Here's another one. Note the blank line between rows.
}
-\LL
-}
+\\\noalign{\medskip}
+\hline
+\noalign{\medskip}
+\caption{Here's the caption. It may span multiple lines.}
+\end{longtable}
Multiline table without caption:
-\ctable[pos = H, center, botcap]{clrl}
-{% notes
-}
-{% rows
-\FL
+\begin{longtable}[c]{clrl}
+\hline\noalign{\medskip}
\parbox[b]{0.15\columnwidth}{\centering
Centered Header
} & \parbox[b]{0.14\columnwidth}{\raggedright
@@ -107,7 +103,8 @@ Right Aligned
} & \parbox[b]{0.34\columnwidth}{\raggedright
Default aligned
}
-\ML
+\\\noalign{\medskip}
+\hline\noalign{\medskip}
\parbox[t]{0.15\columnwidth}{\centering
First
} & \parbox[t]{0.14\columnwidth}{\raggedright
@@ -127,31 +124,27 @@ row
} & \parbox[t]{0.34\columnwidth}{\raggedright
Here's another one. Note the blank line between rows.
}
-\LL
-}
+\\\noalign{\medskip}
+\hline
+\end{longtable}
Table without column headers:
-\ctable[pos = H, center, botcap]{rlcr}
-{% notes
-}
-{% rows
-\FL
+\begin{longtable}[c]{rlcr}
+\hline\noalign{\medskip}
12 & 12 & 12 & 12
\\\noalign{\medskip}
123 & 123 & 123 & 123
\\\noalign{\medskip}
1 & 1 & 1 & 1
-\LL
-}
+\\\noalign{\medskip}
+\hline
+\end{longtable}
Multiline table without column headers:
-\ctable[pos = H, center, botcap]{clrl}
-{% notes
-}
-{% rows
-\FL
+\begin{longtable}[c]{clrl}
+\hline\noalign{\medskip}
\parbox[t]{0.15\columnwidth}{\centering
First
} & \parbox[t]{0.14\columnwidth}{\raggedright
@@ -171,5 +164,6 @@ row
} & \parbox[t]{0.34\columnwidth}{\raggedright
Here's another one. Note the blank line between rows.
}
-\LL
-}
+\\\noalign{\medskip}
+\hline
+\end{longtable}
diff --git a/tests/textile-reader.native b/tests/textile-reader.native
index 7532a8fda..84dc50516 100644
--- a/tests/textile-reader.native
+++ b/tests/textile-reader.native
@@ -155,4 +155,4 @@ Pandoc (Meta {docTitle = [], docAuthors = [], docDate = []})
,Para [Str "\174",Space,Str "Hi",Str "\174"]
,Para [Str "Hi",Str "\169",Str "2008",Space,Str "\169",Space,Str "2008"]
,Header 1 [Str "Footnotes"]
-,Para [Str "A",Space,Str "note",Str ".",Note [Para [Str "The",Space,Str "note",LineBreak,Str "is",Space,Str "here",Str "!",Space,Str "Another",Space,Str "note",Note [Para [Str "Other",Space,Str "note."]],Str "."]]]]
+,Para [Str "A",Space,Str "note",Str ".",Note [Para [Str "The",Space,Str "note",LineBreak,Str "is",Space,Str "here",Str "!"]],Space,Str "Another",Space,Str "note",Note [Para [Str "Other",Space,Str "note",Str "."]],Str "."]]
diff --git a/tests/writer.context b/tests/writer.context
index bb070ce67..ff82aa457 100644
--- a/tests/writer.context
+++ b/tests/writer.context
@@ -6,7 +6,7 @@
% Enable hyperlinks
\setupinteraction[state=start, color=middleblue]
-\setuppapersize [letter][letter]
+\setuppapersize [letter][letter]
\setuplayout [width=middle, backspace=1.5in, cutspace=1.5in,
height=middle, topspace=0.75in, bottomspace=0.75in]
diff --git a/tests/writer.html b/tests/writer.html
index c9ef6f1f1..67bb57ead 100644
--- a/tests/writer.html
+++ b/tests/writer.html
@@ -281,29 +281,34 @@ These should not be escaped: \$ \\ \&gt; \[ \{</code></pre>
<dl>
<dt>apple</dt>
<dd>red fruit
-</dd><dd>computer
+</dd>
+<dd>computer
</dd>
<dt>orange</dt>
<dd>orange fruit
-</dd><dd>bank
+</dd>
+<dd>bank
</dd>
</dl>
<p>Multiple definitions, loose:</p>
<dl>
<dt>apple</dt>
<dd><p>red fruit</p>
-</dd><dd><p>computer</p>
+</dd>
+<dd><p>computer</p>
</dd>
<dt>orange</dt>
<dd><p>orange fruit</p>
-</dd><dd><p>bank</p>
+</dd>
+<dd><p>bank</p>
</dd>
</dl>
<p>Blank line after term, indented marker, alternate markers:</p>
<dl>
<dt>apple</dt>
<dd><p>red fruit</p>
-</dd><dd><p>computer</p>
+</dd>
+<dd><p>computer</p>
</dd>
<dt>orange</dt>
<dd><p>orange fruit</p>
diff --git a/tests/writer.latex b/tests/writer.latex
index 3efc08277..bab23cd64 100644
--- a/tests/writer.latex
+++ b/tests/writer.latex
@@ -786,7 +786,7 @@ Just a \href{/url/}{URL}.
\href{/url/}{URL and title}
-\href{/url/with\_underscore}{with\_underscore}
+\href{/url/with_underscore}{with\_underscore}
\href{mailto:nobody@nowhere.net}{Email link}
diff --git a/windows/pandoc-setup.iss b/windows/pandoc-setup.iss
index 712cf4f0d..47a480b1f 100644
--- a/windows/pandoc-setup.iss
+++ b/windows/pandoc-setup.iss
@@ -71,8 +71,6 @@ Source: "..\slidy\graphics\*"; DestDir: "{app}\slidy\graphics"; Flags: ignorever
Source: "..\slidy\scripts\*"; DestDir: "{app}\slidy\scripts"; Flags: ignoreversion
Source: "..\slidy\styles\*"; DestDir: "{app}\slidy\styles"; Flags: ignoreversion
Source: "..\dzslides\*"; DestDir: "{app}\dzslides"; Flags: ignoreversion
-Source: "pcre-license.txt"; DestDir: "{app}"; Flags: ignoreversion
-Source: "pcre3.dll"; DestDir: "{sys}"; Flags: onlyifdoesntexist sharedfile
; NOTE: Don't use "Flags: ignoreversion" on any shared system files
[Icons]
diff --git a/windows/pcre-license.txt b/windows/pcre-license.txt
deleted file mode 100644
index 73f8cde3d..000000000
--- a/windows/pcre-license.txt
+++ /dev/null
@@ -1,68 +0,0 @@
-PCRE LICENCE
-------------
-
-PCRE is a library of functions to support regular expressions whose syntax
-and semantics are as close as possible to those of the Perl 5 language.
-
-Release 8 of PCRE is distributed under the terms of the "BSD" licence, as
-specified below. The documentation for PCRE, supplied in the "doc"
-directory, is distributed under the same terms as the software itself.
-
-The basic library functions are written in C and are freestanding. Also
-included in the distribution is a set of C++ wrapper functions.
-
-
-THE BASIC LIBRARY FUNCTIONS
----------------------------
-
-Written by: Philip Hazel
-Email local part: ph10
-Email domain: cam.ac.uk
-
-University of Cambridge Computing Service,
-Cambridge, England.
-
-Copyright (c) 1997-2009 University of Cambridge
-All rights reserved.
-
-
-THE C++ WRAPPER FUNCTIONS
--------------------------
-
-Contributed by: Google Inc.
-
-Copyright (c) 2007-2008, Google Inc.
-All rights reserved.
-
-
-THE "BSD" LICENCE
------------------
-
-Redistribution and use in source and binary forms, with or without
-modification, are permitted provided that the following conditions are met:
-
- * Redistributions of source code must retain the above copyright notice,
- this list of conditions and the following disclaimer.
-
- * Redistributions in binary form must reproduce the above copyright
- notice, this list of conditions and the following disclaimer in the
- documentation and/or other materials provided with the distribution.
-
- * Neither the name of the University of Cambridge nor the name of Google
- Inc. nor the names of their contributors may be used to endorse or
- promote products derived from this software without specific prior
- written permission.
-
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
-AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
-IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
-ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
-LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
-CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
-SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
-INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
-CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
-ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
-POSSIBILITY OF SUCH DAMAGE.
-
-End
diff --git a/windows/pcre3.dll b/windows/pcre3.dll
deleted file mode 100644
index b5fd2a637..000000000
--- a/windows/pcre3.dll
+++ /dev/null
Binary files differ