aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--INSTALL21
-rw-r--r--README60
m---------data/templates14
-rw-r--r--pandoc.cabal25
-rw-r--r--pandoc.hs49
-rw-r--r--src/Text/Pandoc.hs3
-rw-r--r--src/Text/Pandoc/MIME.hs3
-rw-r--r--src/Text/Pandoc/MediaBag.hs10
-rw-r--r--src/Text/Pandoc/Options.hs1
-rw-r--r--src/Text/Pandoc/Parsing.hs24
-rw-r--r--src/Text/Pandoc/Pretty.hs5
-rw-r--r--src/Text/Pandoc/Readers/Docx.hs33
-rw-r--r--src/Text/Pandoc/Readers/Docx/Parse.hs172
-rw-r--r--src/Text/Pandoc/Readers/HTML.hs11
-rw-r--r--src/Text/Pandoc/Readers/LaTeX.hs2
-rw-r--r--src/Text/Pandoc/Readers/Markdown.hs15
-rw-r--r--src/Text/Pandoc/Readers/Org.hs120
-rw-r--r--src/Text/Pandoc/Readers/RST.hs20
-rw-r--r--src/Text/Pandoc/Readers/TWiki.hs526
-rw-r--r--src/Text/Pandoc/SelfContained.hs3
-rw-r--r--src/Text/Pandoc/Shared.hs24
-rw-r--r--src/Text/Pandoc/Writers/ConTeXt.hs24
-rw-r--r--src/Text/Pandoc/Writers/Docx.hs93
-rw-r--r--src/Text/Pandoc/Writers/DokuWiki.hs6
-rw-r--r--src/Text/Pandoc/Writers/EPUB.hs57
-rw-r--r--src/Text/Pandoc/Writers/HTML.hs63
-rw-r--r--src/Text/Pandoc/Writers/ICML.hs2
-rw-r--r--src/Text/Pandoc/Writers/ODT.hs10
-rw-r--r--src/Text/Pandoc/Writers/RST.hs15
-rw-r--r--tests/Tests/Old.hs3
-rw-r--r--tests/Tests/Readers/Docx.hs9
-rw-r--r--tests/Tests/Readers/EPUB.hs5
-rw-r--r--tests/Tests/Readers/Markdown.hs12
-rw-r--r--tests/Tests/Readers/Org.hs92
-rw-r--r--tests/Tests/Readers/RST.hs23
-rw-r--r--tests/Tests/Shared.hs37
-rw-r--r--tests/docx/danish_headers.docxbin17691 -> 0 bytes
-rw-r--r--tests/docx/danish_headers.native10
-rw-r--r--tests/docx/i18n_blocks.docxbin0 -> 13680 bytes
-rw-r--r--tests/docx/i18n_blocks.native8
-rw-r--r--tests/docx/links.docxbin41751 -> 45361 bytes
-rw-r--r--tests/docx/links.native1
-rw-r--r--tests/epub/features.epubbin67495 -> 66370 bytes
-rw-r--r--tests/epub/features.native38
-rw-r--r--tests/epub/img.epubbin0 -> 61768 bytes
-rw-r--r--tests/html-reader.html1
-rw-r--r--tests/markdown-reader-more.native3
-rw-r--r--tests/markdown-reader-more.txt4
-rw-r--r--tests/tables.asciidoc1
-rw-r--r--tests/tables.haddock1
-rw-r--r--tests/tables.org1
-rw-r--r--tests/tables.rst1
-rw-r--r--tests/twiki-reader.native174
-rw-r--r--tests/twiki-reader.twiki221
-rw-r--r--tests/writer.dokuwiki5
-rw-r--r--tests/writer.opml10
-rw-r--r--tests/writer.rst1
57 files changed, 1732 insertions, 340 deletions
diff --git a/INSTALL b/INSTALL
index eb9b2b030..a8ba3ddef 100644
--- a/INSTALL
+++ b/INSTALL
@@ -44,12 +44,7 @@ Quick install
[Not sure where `$CABALDIR` is?](http://www.haskell.org/haskellwiki/Cabal-Install#The_cabal-install_configuration_file)
-5. Make sure the `$CABALDIR/share/man/man1` directory is in your `MANPATH`.
- You should now be able to access the `pandoc` man page:
-
- man pandoc
-
-6. If you want to process citations with pandoc, you will also need to
+5. If you want to process citations with pandoc, you will also need to
install a separate package, `pandoc-citeproc`. This can be installed
using cabal:
@@ -71,6 +66,20 @@ Quick install
--extra-include-dirs=/usr/local/Cellar/icu4c/51.1/include \
-funicode_collation text-icu pandoc-citeproc
+The cabal installation procedure does not generate man pages.
+To build the `pandoc` man pages, build pandoc with the
+`make-pandoc-man-pages` flag, and then use the command
+`make-pandoc-man-pages` from the pandoc source directory.
+This will create the man pages in `man/man1` and `man/man5`.
+
+To build the `pandoc-citeproc` man pages, go to the pandoc-citeproc
+build directory, and
+
+ cd man
+ make
+
+The man page will be created in the `man1` subdirectory.
+
[GHC]: http://www.haskell.org/ghc/
[Haskell platform]: http://hackage.haskell.org/platform/
[cabal-install]: http://hackage.haskell.org/trac/hackage/wiki/CabalInstall
diff --git a/README b/README
index efd183407..de688aa1c 100644
--- a/README
+++ b/README
@@ -13,15 +13,16 @@ Description
Pandoc is a [Haskell] library for converting from one markup format to
another, and a command-line tool that uses this library. It can read
[markdown] and (subsets of) [Textile], [reStructuredText], [HTML],
-[LaTeX], [MediaWiki markup], [Haddock markup], [OPML], [Emacs
-Org-mode], [DocBook], [txt2tags], [EPUB] and [Word docx]; and it can write plain text,
-[markdown], [reStructuredText], [XHTML], [HTML 5], [LaTeX] (including
-[beamer] slide shows), [ConTeXt], [RTF], [OPML], [DocBook],
-[OpenDocument], [ODT], [Word docx], [GNU Texinfo], [MediaWiki markup],
-[DokuWiki markup], [Haddock markup], [EPUB] (v2 or v3), [FictionBook2],
-[Textile], [groff man] pages, [Emacs Org-Mode], [AsciiDoc], [InDesign ICML],
-and [Slidy], [Slideous], [DZSlides], [reveal.js] or [S5] HTML slide shows.
-It can also produce [PDF] output on systems where LaTeX is installed.
+[LaTeX], [MediaWiki markup], [TWiki markup], [Haddock markup], [OPML],
+[Emacs Org-mode], [DocBook], [txt2tags], [EPUB] and [Word docx]; and
+it can write plain text, [markdown], [reStructuredText], [XHTML],
+[HTML 5], [LaTeX] (including [beamer] slide shows), [ConTeXt], [RTF],
+[OPML], [DocBook], [OpenDocument], [ODT], [Word docx], [GNU Texinfo],
+[MediaWiki markup], [DokuWiki markup], [Haddock markup], [EPUB] (v2 or v3),
+[FictionBook2], [Textile], [groff man] pages, [Emacs Org-Mode], [AsciiDoc],
+[InDesign ICML], and [Slidy], [Slideous], [DZSlides], [reveal.js] or
+[S5] HTML slide shows. It can also produce [PDF] output on systems where
+LaTeX is installed.
Pandoc's enhanced version of markdown includes syntax for footnotes,
tables, flexible ordered lists, definition lists, fenced code blocks,
@@ -50,6 +51,15 @@ default (though output to *stdout* is disabled for the `odt`, `docx`,
pandoc -o output.html input.txt
+By default, pandoc produces a document fragment, not a standalone
+document with a proper header and footer. To produce a standalone
+document, use the `-s` or `--standalone` flag:
+
+ pandoc -s -o output.html input.txt
+
+For more information on how standalone documents are produced, see
+[Templates](#templates), below.
+
Instead of a file, an absolute URI may be given. In this case
pandoc will fetch the content using HTTP:
@@ -95,6 +105,11 @@ should pipe input and output through `iconv`:
iconv -t utf-8 input.txt | pandoc | iconv -f utf-8
+Note that in some output formats (such as HTML, LaTeX, ConTeXt,
+RTF, OPML, DocBook, and Texinfo), information about
+the character encoding is included in the document header, which
+will only be included if you use the `-s/--standalone` option.
+
Creating a PDF
--------------
@@ -147,8 +162,8 @@ General options
`textile` (Textile), `rst` (reStructuredText), `html` (HTML),
`docbook` (DocBook), `t2t` (txt2tags), `docx` (docx), `epub` (EPUB),
`opml` (OPML), `org` (Emacs Org-mode), `mediawiki` (MediaWiki markup),
- `haddock` (Haddock markup), or `latex` (LaTeX). If `+lhs` is appended
- to `markdown`, `rst`,
+ `twiki` (TWiki markup), `haddock` (Haddock markup), or `latex` (LaTeX).
+ If `+lhs` is appended to `markdown`, `rst`,
`latex`, or `html`, the input will be treated as literate Haskell
source: see [Literate Haskell support](#literate-haskell-support),
below. Markdown syntax extensions can be individually enabled or
@@ -239,8 +254,8 @@ Reader options
to curly quotes, `---` to em-dashes, `--` to en-dashes, and
`...` to ellipses. Nonbreaking spaces are inserted after certain
abbreviations, such as "Mr." (Note: This option is significant only when
- the input format is `markdown`, `markdown_strict`, or `textile`. It
- is selected automatically when the input format is `textile` or the
+ the input format is `markdown`, `markdown_strict`, `textile` or `twiki`.
+ It is selected automatically when the input format is `textile` or the
output format is `latex` or `context`, unless `--no-tex-ligatures`
is used.)
@@ -710,6 +725,16 @@ Math rendering in HTML
formulas to images. The formula will be concatenated with the URL
provided. If *URL* is not specified, the Google Chart API will be used.
+`--katex`[=*URL*]
+: Use [KaTeX] to display embedded TeX math in HTML output.
+ The *URL* should point to the `katex.js` load script. If a *URL* is
+ not provided, a link to the KaTeX CDN will be inserted.
+
+`--katex-stylesheet=*URL*`
+: The *URL* should point to the `katex.css` stylesheet. If this option is
+ not specified, a link to the KaTeX CDN will be inserted. Note that this
+ option does not imply `--katex`.
+
Options for wrapper scripts
---------------------------
@@ -1652,7 +1677,7 @@ proportionally spaced fonts, as it does not require lining up columns.
#### Extension: `table_captions` ####
-A caption may optionally be provided with all 4 kinds of tables (as
+A caption may optionally be provided with all 4 kinds of tables (as
illustrated in the examples below). A caption is a paragraph beginning
with the string `Table:` (or just `:`), which will be stripped off.
It may appear either before or after the table.
@@ -2121,8 +2146,9 @@ Math
#### Extension: `tex_math_dollars` ####
Anything between two `$` characters will be treated as TeX math. The
-opening `$` must have a character immediately to its right, while the
-closing `$` must have a character immediately to its left. Thus,
+opening `$` must have a non-space character immediately to its right,
+while the closing `$` must have a non-space character immediately to its
+left, and must not be followed immediately by a digit. Thus,
`$20,000 and $30,000` won't parse as math. If for some reason
you need to enclose text in literal `$` characters, backslash-escape
them and they won't be treated as math delimiters.
@@ -3161,6 +3187,7 @@ Rosenthal.
[Textile]: http://redcloth.org/textile
[MediaWiki markup]: http://www.mediawiki.org/wiki/Help:Formatting
[DokuWiki markup]: https://www.dokuwiki.org/dokuwiki
+[TWiki markup]: http://twiki.org/cgi-bin/view/TWiki/TextFormattingRules
[Haddock markup]: http://www.haskell.org/haddock/doc/html/ch03s08.html
[groff man]: http://developer.apple.com/DOCUMENTATION/Darwin/Reference/ManPages/man7/groff_man.7.html
[Haskell]: http://www.haskell.org/
@@ -3181,3 +3208,4 @@ Rosenthal.
[txt2tags]: http://txt2tags.org/
[EPUB]: http://idpf.org/epub
[EPUBspine]: http://www.idpf.org/epub/301/spec/epub-publications.html#sec-spine-elem
+[KaTeX]: https://github.com/Khan/KaTeX
diff --git a/data/templates b/data/templates
-Subproject a63c58b23ecb159ed463beb084d6f36b1bd9f56
+Subproject c76c6c52249118e49c5dd1b99fe916724350a59
diff --git a/pandoc.cabal b/pandoc.cabal
index 0e805d724..20e06121b 100644
--- a/pandoc.cabal
+++ b/pandoc.cabal
@@ -1,5 +1,5 @@
Name: pandoc
-Version: 1.13.1
+Version: 1.13.2
Cabal-Version: >= 1.10
Build-Type: Custom
License: GPL
@@ -16,14 +16,15 @@ Synopsis: Conversion between markup formats
Description: Pandoc is a Haskell library for converting from one markup
format to another, and a command-line tool that uses
this library. It can read markdown and (subsets of) HTML,
- reStructuredText, LaTeX, DocBook, MediaWiki markup, Haddock
- markup, OPML, Emacs Org-Mode, txt2tags and Textile, and it can write
- markdown, reStructuredText, HTML, LaTeX, ConTeXt, Docbook,
- OPML, OpenDocument, ODT, Word docx, RTF, MediaWiki, DokuWiki,
- Textile, groff man pages, plain text, Emacs Org-Mode, AsciiDoc,
- Haddock markup, EPUB (v2 and v3), FictionBook2,
- InDesign ICML, and several kinds of HTML/javascript
- slide shows (S5, Slidy, Slideous, DZSlides, reveal.js).
+ reStructuredText, LaTeX, DocBook, MediaWiki markup, TWiki
+ markup, Haddock markup, OPML, Emacs Org-Mode, txt2tags and
+ Textile, and it can write markdown, reStructuredText, XHTML,
+ HTML 5, LaTeX, ConTeXt, DocBook, OPML, OpenDocument, ODT,
+ Word docx, RTF, MediaWiki, DokuWiki, Textile, groff man
+ pages, plain text, Emacs Org-Mode, AsciiDoc, Haddock markup,
+ EPUB (v2 and v3), FictionBook2, InDesign ICML, and several
+ kinds of HTML/javascript slide shows (S5, Slidy, Slideous,
+ DZSlides, reveal.js).
.
Pandoc extends standard markdown syntax with footnotes,
embedded LaTeX, definition lists, tables, and other
@@ -182,6 +183,7 @@ Extra-Source-Files:
tests/epub/*.epub
tests/epub/*.native
tests/txt2tags.t2t
+ tests/twiki-reader.twiki
Source-repository head
type: git
@@ -247,7 +249,7 @@ Library
haddock-library >= 1.1 && < 1.2,
old-time,
deepseq-generics >= 0.1 && < 0.2,
- JuicyPixels >= 3.1.6.1 && < 3.2
+ JuicyPixels >= 3.1.6.1 && < 3.3
if flag(network-uri)
Build-Depends: network-uri >= 2.6 && < 2.7, network >= 2.6
else
@@ -289,6 +291,7 @@ Library
Text.Pandoc.Readers.Textile,
Text.Pandoc.Readers.Native,
Text.Pandoc.Readers.Haddock,
+ Text.Pandoc.Readers.TWiki,
Text.Pandoc.Readers.Docx,
Text.Pandoc.Readers.EPUB,
Text.Pandoc.Writers.Native,
@@ -357,7 +360,7 @@ Executable pandoc
containers >= 0.1 && < 0.6,
HTTP >= 4000.0.5 && < 4000.3
if flag(network-uri)
- Build-Depends: network-uri >= 2.6 && < 2.7
+ Build-Depends: network-uri >= 2.6 && < 2.7, network >= 2.6
else
Build-Depends: network >= 2 && < 2.6
Ghc-Options: -rtsopts -with-rtsopts=-K16m -Wall -fno-warn-unused-do-bind
diff --git a/pandoc.hs b/pandoc.hs
index a9bea12c6..7b910be71 100644
--- a/pandoc.hs
+++ b/pandoc.hs
@@ -58,7 +58,7 @@ import qualified Control.Exception as E
import Control.Exception.Extensible ( throwIO )
import qualified Text.Pandoc.UTF8 as UTF8
import Control.Monad (when, unless, (>=>))
-import Data.Maybe (isJust)
+import Data.Maybe (isJust, fromMaybe)
import Data.Foldable (foldrM)
import Network.URI (parseURI, isURI, URI(..))
import qualified Data.ByteString.Lazy as B
@@ -68,7 +68,7 @@ import qualified Data.Map as M
import Data.Yaml (decode)
import qualified Data.Yaml as Yaml
import qualified Data.Text as T
-import Control.Applicative ((<$>))
+import Control.Applicative ((<$>), (<|>))
import Text.Pandoc.Readers.Txt2Tags (getT2TMeta)
import Data.Monoid
@@ -205,6 +205,8 @@ data Opt = Opt
, optExtractMedia :: Maybe FilePath -- ^ Path to extract embedded media
, optTrace :: Bool -- ^ Print debug information
, optTrackChanges :: TrackChanges -- ^ Accept or reject MS Word track-changes.
+ , optKaTeXStylesheet :: Maybe String -- ^ Path to stylesheet for KaTeX
+ , optKaTeXJS :: Maybe String -- ^ Path to js file for KaTeX
}
-- | Defaults for command-line options.
@@ -263,6 +265,8 @@ defaultOpts = Opt
, optExtractMedia = Nothing
, optTrace = False
, optTrackChanges = AcceptChanges
+ , optKaTeXStylesheet = Nothing
+ , optKaTeXJS = Nothing
}
-- | A list of functions, each transforming the options data structure
@@ -818,6 +822,21 @@ options =
return opt { optHTMLMathMethod = MathJax url'})
"URL")
"" -- "Use MathJax for HTML math"
+ , Option "" ["katex"]
+ (OptArg
+ (\arg opt ->
+ return opt
+ { optKaTeXJS =
+ arg <|> Just "http://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.1.0/katex.min.js"})
+ "URL")
+ "" -- Use KaTeX for HTML Math
+
+ , Option "" ["katex-stylesheet"]
+ (ReqArg
+ (\arg opt ->
+ return opt { optKaTeXStylesheet = Just arg })
+ "URL")
+ "" -- Set the KaTeX Stylesheet location
, Option "" ["gladtex"]
(NoArg
@@ -860,6 +879,7 @@ options =
]
+
addMetadata :: String -> MetaValue -> M.Map String MetaValue
-> M.Map String MetaValue
addMetadata k v m = case M.lookup k m of
@@ -910,6 +930,9 @@ defaultReaderName fallback (x:xs) =
".docx" -> "docx"
".t2t" -> "t2t"
".epub" -> "epub"
+ ".odt" -> "odt" -- so we get an "unknown reader" error
+ ".pdf" -> "pdf" -- so we get an "unknown reader" error
+ ".doc" -> "doc" -- so we get an "unknown reader" error
_ -> defaultReaderName fallback xs
-- Returns True if extension of first source is .lhs
@@ -950,6 +973,7 @@ defaultWriterName x =
".pdf" -> "latex"
".fb2" -> "fb2"
".opml" -> "opml"
+ ".icml" -> "icml"
['.',y] | y `elem` ['1'..'9'] -> "man"
_ -> "html"
@@ -1027,7 +1051,7 @@ main = do
, optHighlight = highlight
, optHighlightStyle = highlightStyle
, optChapters = chapters
- , optHTMLMathMethod = mathMethod
+ , optHTMLMathMethod = mathMethod'
, optReferenceODT = referenceODT
, optReferenceDocx = referenceDocx
, optEpubStylesheet = epubStylesheet
@@ -1056,6 +1080,8 @@ main = do
, optExtractMedia = mbExtractMedia
, optTrace = trace
, optTrackChanges = trackChanges
+ , optKaTeXStylesheet = katexStylesheet
+ , optKaTeXJS = katexJS
} = opts
when dumpArgs $
@@ -1063,6 +1089,13 @@ main = do
mapM_ (\arg -> UTF8.hPutStrLn stdout arg) args
exitWith ExitSuccess
+ let csscdn = "http://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.1.0/katex.min.css"
+ let mathMethod =
+ case (katexJS, katexStylesheet) of
+ (Nothing, _) -> mathMethod'
+ (Just js, ss) -> KaTeX js (fromMaybe csscdn ss)
+
+
-- --bibliography implies -F pandoc-citeproc for backwards compatibility:
let needsCiteproc = isJust (M.lookup "bibliography" metadata) &&
optCiteMethod opts `notElem` [Natbib, Biblatex] &&
@@ -1119,7 +1152,15 @@ main = do
(getT2TMeta sources outputFile)
else case getReader readerName' of
Right r -> return r
- Left e -> err 7 e
+ Left e -> err 7 e'
+ where e' = case readerName' of
+ "odt" -> e ++
+ "\nPandoc can convert to ODT, but not from ODT.\nTry using LibreOffice to export as HTML, and convert that with pandoc."
+ "pdf" -> e ++
+ "\nPandoc can convert to PDF, but not from PDF."
+ "doc" -> e ++
+ "\nPandoc can convert from DOCX, but not from DOC.\nTry using Word to save your DOC file as DOCX, and convert that with pandoc."
+ _ -> e
let standalone' = standalone || not (isTextFormat writerName') || pdfOutput
diff --git a/src/Text/Pandoc.hs b/src/Text/Pandoc.hs
index fd849316b..1f22122ac 100644
--- a/src/Text/Pandoc.hs
+++ b/src/Text/Pandoc.hs
@@ -77,6 +77,7 @@ module Text.Pandoc
, readHaddock
, readNative
, readJSON
+ , readTWiki
, readTxt2Tags
, readTxt2TagsNoMacros
, readEPUB
@@ -133,6 +134,7 @@ import Text.Pandoc.Readers.HTML
import Text.Pandoc.Readers.Textile
import Text.Pandoc.Readers.Native
import Text.Pandoc.Readers.Haddock
+import Text.Pandoc.Readers.TWiki
import Text.Pandoc.Readers.Docx
import Text.Pandoc.Readers.Txt2Tags
import Text.Pandoc.Readers.EPUB
@@ -233,6 +235,7 @@ readers = [ ("native" , StringReader $ \_ s -> return $ readNative s)
,("html" , mkStringReader readHtml)
,("latex" , mkStringReader readLaTeX)
,("haddock" , mkStringReader readHaddock)
+ ,("twiki" , mkStringReader readTWiki)
,("docx" , mkBSReader readDocx)
,("t2t" , mkStringReader readTxt2TagsNoMacros)
,("epub" , mkBSReader readEPUB)
diff --git a/src/Text/Pandoc/MIME.hs b/src/Text/Pandoc/MIME.hs
index 3b3b3b5b3..2fdba93e0 100644
--- a/src/Text/Pandoc/MIME.hs
+++ b/src/Text/Pandoc/MIME.hs
@@ -328,7 +328,7 @@ mimeTypesList = -- List borrowed from happstack-server.
,("oth","application/vnd.oasis.opendocument.text-web")
,("otp","application/vnd.oasis.opendocument.presentation-template")
,("ots","application/vnd.oasis.opendocument.spreadsheet-template")
- ,("otf","application/x-font-opentype")
+ ,("otf","application/vnd.ms-opentype")
,("ott","application/vnd.oasis.opendocument.text-template")
,("oza","application/x-oz-application")
,("p","text/x-pascal")
@@ -477,6 +477,7 @@ mimeTypesList = -- List borrowed from happstack-server.
,("vrml","model/vrml")
,("vs","text/plain")
,("vsd","application/vnd.visio")
+ ,("vtt","text/vtt")
,("wad","application/x-doom")
,("wav","audio/x-wav")
,("wax","audio/x-ms-wax")
diff --git a/src/Text/Pandoc/MediaBag.hs b/src/Text/Pandoc/MediaBag.hs
index 5921b56cf..a55d5417e 100644
--- a/src/Text/Pandoc/MediaBag.hs
+++ b/src/Text/Pandoc/MediaBag.hs
@@ -51,7 +51,7 @@ import System.IO (stderr)
-- mime types. Note that a 'MediaBag' is a Monoid, so 'mempty'
-- can be used for an empty 'MediaBag', and '<>' can be used to append
-- two 'MediaBag's.
-newtype MediaBag = MediaBag (M.Map String (MimeType, BL.ByteString))
+newtype MediaBag = MediaBag (M.Map [String] (MimeType, BL.ByteString))
deriving (Monoid)
instance Show MediaBag where
@@ -65,7 +65,7 @@ insertMedia :: FilePath -- ^ relative path and canonical name of resource
-> MediaBag
-> MediaBag
insertMedia fp mbMime contents (MediaBag mediamap) =
- MediaBag (M.insert fp (mime, contents) mediamap)
+ MediaBag (M.insert (splitPath fp) (mime, contents) mediamap)
where mime = fromMaybe fallback mbMime
fallback = case takeExtension fp of
".gz" -> getMimeTypeDef $ dropExtension fp
@@ -75,14 +75,14 @@ insertMedia fp mbMime contents (MediaBag mediamap) =
lookupMedia :: FilePath
-> MediaBag
-> Maybe (MimeType, BL.ByteString)
-lookupMedia fp (MediaBag mediamap) = M.lookup fp mediamap
+lookupMedia fp (MediaBag mediamap) = M.lookup (splitPath fp) mediamap
-- | Get a list of the file paths stored in a 'MediaBag', with
-- their corresponding mime types and the lengths in bytes of the contents.
mediaDirectory :: MediaBag -> [(String, MimeType, Int)]
mediaDirectory (MediaBag mediamap) =
M.foldWithKey (\fp (mime,contents) ->
- ((fp, mime, fromIntegral $ BL.length contents):)) [] mediamap
+ (((joinPath fp), mime, fromIntegral $ BL.length contents):)) [] mediamap
-- | Extract contents of MediaBag to a given directory. Print informational
-- messages if 'verbose' is true.
@@ -93,7 +93,7 @@ extractMediaBag :: Bool
extractMediaBag verbose dir (MediaBag mediamap) = do
sequence_ $ M.foldWithKey
(\fp (_ ,contents) ->
- ((writeMedia verbose dir (fp, contents)):)) [] mediamap
+ ((writeMedia verbose dir (joinPath fp, contents)):)) [] mediamap
writeMedia :: Bool -> FilePath -> (FilePath, BL.ByteString) -> IO ()
writeMedia verbose dir (subpath, bs) = do
diff --git a/src/Text/Pandoc/Options.hs b/src/Text/Pandoc/Options.hs
index 84ccbbdc9..ebfd8f8a9 100644
--- a/src/Text/Pandoc/Options.hs
+++ b/src/Text/Pandoc/Options.hs
@@ -251,6 +251,7 @@ data HTMLMathMethod = PlainMath
| WebTeX String -- url of TeX->image script.
| MathML (Maybe String) -- url of MathMLinHTML.js
| MathJax String -- url of MathJax.js
+ | KaTeX String String -- url of stylesheet and katex.js
deriving (Show, Read, Eq)
data CiteMethod = Citeproc -- use citeproc to render them
diff --git a/src/Text/Pandoc/Parsing.hs b/src/Text/Pandoc/Parsing.hs
index d1fba1e21..e0f5f65bb 100644
--- a/src/Text/Pandoc/Parsing.hs
+++ b/src/Text/Pandoc/Parsing.hs
@@ -472,7 +472,12 @@ mathInlineWith op cl = try $ do
string op
notFollowedBy space
words' <- many1Till (count 1 (noneOf " \t\n\\")
- <|> (char '\\' >> anyChar >>= \c -> return ['\\',c])
+ <|> (char '\\' >>
+ -- This next clause is needed because \text{..} can
+ -- contain $, \(\), etc.
+ (try (string "text" >>
+ (("\\text" ++) <$> inBalancedBraces 0 ""))
+ <|> (\c -> ['\\',c]) <$> anyChar))
<|> do (blankline <* notFollowedBy' blankline) <|>
(oneOf " \t" <* skipMany (oneOf " \t"))
notFollowedBy (char '$')
@@ -480,6 +485,23 @@ mathInlineWith op cl = try $ do
) (try $ string cl)
notFollowedBy digit -- to prevent capture of $5
return $ concat words'
+ where
+ inBalancedBraces :: Stream s m Char => Int -> String -> ParserT s st m String
+ inBalancedBraces 0 "" = do
+ c <- anyChar
+ if c == '{'
+ then inBalancedBraces 1 "{"
+ else mzero
+ inBalancedBraces 0 s = return $ reverse s
+ inBalancedBraces numOpen ('\\':xs) = do
+ c <- anyChar
+ inBalancedBraces numOpen (c:'\\':xs)
+ inBalancedBraces numOpen xs = do
+ c <- anyChar
+ case c of
+ '}' -> inBalancedBraces (numOpen - 1) (c:xs)
+ '{' -> inBalancedBraces (numOpen + 1) (c:xs)
+ _ -> inBalancedBraces numOpen (c:xs)
mathDisplayWith :: Stream s m Char => String -> String -> ParserT s st m String
mathDisplayWith op cl = try $ do
diff --git a/src/Text/Pandoc/Pretty.hs b/src/Text/Pandoc/Pretty.hs
index 1e72c2040..2f2656086 100644
--- a/src/Text/Pandoc/Pretty.hs
+++ b/src/Text/Pandoc/Pretty.hs
@@ -286,6 +286,9 @@ renderList (BlankLines num : xs) = do
| otherwise -> replicateM_ (1 + num - newlines st) (outp (-1) "\n")
renderList xs
+renderList (CarriageReturn : BlankLines m : xs) =
+ renderList (BlankLines m : xs)
+
renderList (CarriageReturn : xs) = do
st <- get
if newlines st > 0 || null xs
@@ -531,4 +534,4 @@ charWidth c =
-- | Get real length of string, taking into account combining and double-wide
-- characters.
realLength :: String -> Int
-realLength = sum . map charWidth
+realLength = foldr (\a b -> charWidth a + b) 0
diff --git a/src/Text/Pandoc/Readers/Docx.hs b/src/Text/Pandoc/Readers/Docx.hs
index 4b5fbfdfc..64eb0322f 100644
--- a/src/Text/Pandoc/Readers/Docx.hs
+++ b/src/Text/Pandoc/Readers/Docx.hs
@@ -84,8 +84,7 @@ import Text.Pandoc.Readers.Docx.Lists
import Text.Pandoc.Readers.Docx.Reducible
import Text.Pandoc.Shared
import Text.Pandoc.MediaBag (insertMedia, MediaBag)
-import Data.Maybe (isJust)
-import Data.List (delete, stripPrefix, (\\), intersect, isPrefixOf)
+import Data.List (delete, (\\), intersect)
import Data.Monoid
import Text.TeXMath (writeTeX)
import Data.Default (Default)
@@ -197,19 +196,9 @@ fixAuthors mv = mv
codeStyles :: [String]
codeStyles = ["VerbatimChar"]
-blockQuoteDivs :: [String]
-blockQuoteDivs = ["Quote", "BlockQuote", "BlockQuotation"]
-
codeDivs :: [String]
codeDivs = ["SourceCode"]
-
--- For the moment, we have English, Danish, German, and French. This
--- is fairly ad-hoc, and there might be a more systematic way to do
--- it, but it's better than nothing.
-headerPrefixes :: [String]
-headerPrefixes = ["Heading", "Overskrift", "berschrift", "Titre"]
-
runElemToInlines :: RunElem -> Inlines
runElemToInlines (TextRun s) = text s
runElemToInlines (LnBrk) = linebreak
@@ -434,9 +423,9 @@ parStyleToTransform pPr
let pPr' = pPr { pStyle = cs, indentation = Nothing}
in
(divWith ("", [c], [])) . (parStyleToTransform pPr')
- | (c:cs) <- pStyle pPr
- , c `elem` blockQuoteDivs =
- let pPr' = pPr { pStyle = cs \\ blockQuoteDivs }
+ | (_:cs) <- pStyle pPr
+ , Just True <- pBlockQuote pPr =
+ let pPr' = pPr { pStyle = cs }
in
blockQuote . (parStyleToTransform pPr')
| (_:cs) <- pStyle pPr =
@@ -467,12 +456,11 @@ bodyPartToBlocks (Paragraph pPr parparts)
$ parStyleToTransform pPr
$ codeBlock
$ concatMap parPartToString parparts
- | (c : cs) <- filter (isJust . isHeaderClass) $ pStyle pPr
- , Just (prefix, n) <- isHeaderClass c = do
+ | Just (style, n) <- pHeading pPr = do
ils <- local (\s-> s{docxInHeaderBlock=True}) $
(concatReduce <$> mapM parPartToInlines parparts)
makeHeaderAnchor $
- headerWith ("", delete (prefix ++ show n) cs, []) n ils
+ headerWith ("", delete style (pStyle pPr), []) n ils
| otherwise = do
ils <- concatReduce <$> mapM parPartToInlines parparts >>=
(return . fromList . trimLineBreaks . normalizeSpaces . toList)
@@ -559,12 +547,3 @@ docxToOutput :: ReaderOptions -> Docx -> (Meta, [Block], MediaBag)
docxToOutput opts (Docx (Document _ body)) =
let dEnv = def { docxOptions = opts} in
evalDocxContext (bodyToOutput body) dEnv def
-
-isHeaderClass :: String -> Maybe (String, Int)
-isHeaderClass s | (pref:_) <- filter (\h -> isPrefixOf h s) headerPrefixes
- , Just s' <- stripPrefix pref s =
- case reads s' :: [(Int, String)] of
- [] -> Nothing
- ((n, "") : []) -> Just (pref, n)
- _ -> Nothing
-isHeaderClass _ = Nothing
diff --git a/src/Text/Pandoc/Readers/Docx/Parse.hs b/src/Text/Pandoc/Readers/Docx/Parse.hs
index e7a6c3ffb..5fd6b7a81 100644
--- a/src/Text/Pandoc/Readers/Docx/Parse.hs
+++ b/src/Text/Pandoc/Readers/Docx/Parse.hs
@@ -1,4 +1,4 @@
-{-# LANGUAGE PatternGuards, ViewPatterns #-}
+{-# LANGUAGE PatternGuards, ViewPatterns, FlexibleInstances #-}
{-
Copyright (C) 2014 Jesse Rosenthal <jrosenthal@jhu.edu>
@@ -65,7 +65,7 @@ import Text.Pandoc.Compat.Except
import Text.TeXMath.Readers.OMML (readOMML)
import Text.Pandoc.Readers.Docx.Fonts (getUnicode, Font(..))
import Text.TeXMath (Exp)
-import Data.Char (readLitChar, ord, chr)
+import Data.Char (readLitChar, ord, chr, isDigit)
data ReaderEnv = ReaderEnv { envNotes :: Notes
, envNumbering :: Numbering
@@ -73,6 +73,7 @@ data ReaderEnv = ReaderEnv { envNotes :: Notes
, envMedia :: Media
, envFont :: Maybe Font
, envCharStyles :: CharStyleMap
+ , envParStyles :: ParStyleMap
}
deriving Show
@@ -122,8 +123,12 @@ type Media = [(FilePath, B.ByteString)]
type CharStyle = (String, RunStyle)
+type ParStyle = (String, ParStyleData)
+
type CharStyleMap = M.Map String RunStyle
+type ParStyleMap = M.Map String ParStyleData
+
data Numbering = Numbering NameSpaces [Numb] [AbstractNumb]
deriving Show
@@ -152,6 +157,8 @@ data ParIndentation = ParIndentation { leftParIndent :: Maybe Integer
data ParagraphStyle = ParagraphStyle { pStyle :: [String]
, indentation :: Maybe ParIndentation
, dropCap :: Bool
+ , pHeading :: Maybe (String, Int)
+ , pBlockQuote :: Maybe Bool
}
deriving Show
@@ -159,6 +166,8 @@ defaultParagraphStyle :: ParagraphStyle
defaultParagraphStyle = ParagraphStyle { pStyle = []
, indentation = Nothing
, dropCap = False
+ , pHeading = Nothing
+ , pBlockQuote = Nothing
}
@@ -213,6 +222,11 @@ data RunStyle = RunStyle { isBold :: Maybe Bool
, rStyle :: Maybe CharStyle}
deriving Show
+data ParStyleData = ParStyleData { headingLev :: Maybe (String, Int)
+ , isBlockQuote :: Maybe Bool
+ , psStyle :: Maybe ParStyle}
+ deriving Show
+
defaultRunStyle :: RunStyle
defaultRunStyle = RunStyle { isBold = Nothing
, isItalic = Nothing
@@ -242,8 +256,8 @@ archiveToDocx archive = do
numbering = archiveToNumbering archive
rels = archiveToRelationships archive
media = archiveToMedia archive
- styles = archiveToStyles archive
- rEnv = ReaderEnv notes numbering rels media Nothing styles
+ (styles, parstyles) = archiveToStyles archive
+ rEnv = ReaderEnv notes numbering rels media Nothing styles parstyles
doc <- runD (archiveToDocument archive) rEnv
return $ Docx doc
@@ -263,47 +277,69 @@ elemToBody ns element | isElem ns "w" "body" element =
(\bps -> return $ Body bps)
elemToBody _ _ = throwError WrongElem
-archiveToStyles :: Archive -> CharStyleMap
+archiveToStyles :: Archive -> (CharStyleMap, ParStyleMap)
archiveToStyles zf =
let stylesElem = findEntryByPath "word/styles.xml" zf >>=
(parseXMLDoc . UTF8.toStringLazy . fromEntry)
in
case stylesElem of
- Nothing -> M.empty
+ Nothing -> (M.empty, M.empty)
Just styElem ->
let namespaces = mapMaybe attrToNSPair (elAttribs styElem)
in
- M.fromList $ buildBasedOnList namespaces styElem Nothing
+ ( M.fromList $ buildBasedOnList namespaces styElem
+ (Nothing :: Maybe CharStyle),
+ M.fromList $ buildBasedOnList namespaces styElem
+ (Nothing :: Maybe ParStyle) )
-isBasedOnStyle :: NameSpaces -> Element -> Maybe CharStyle -> Bool
+isBasedOnStyle :: (ElemToStyle a) => NameSpaces -> Element -> Maybe a -> Bool
isBasedOnStyle ns element parentStyle
| isElem ns "w" "style" element
- , Just "character" <- findAttr (elemName ns "w" "type") element
+ , Just styleType <- findAttr (elemName ns "w" "type") element
+ , styleType == cStyleType parentStyle
, Just basedOnVal <- findChild (elemName ns "w" "basedOn") element >>=
findAttr (elemName ns "w" "val")
- , Just (parentId, _) <- parentStyle = (basedOnVal == parentId)
+ , Just ps <- parentStyle = (basedOnVal == getStyleId ps)
| isElem ns "w" "style" element
- , Just "character" <- findAttr (elemName ns "w" "type") element
+ , Just styleType <- findAttr (elemName ns "w" "type") element
+ , styleType == cStyleType parentStyle
, Nothing <- findChild (elemName ns "w" "basedOn") element
, Nothing <- parentStyle = True
| otherwise = False
-elemToCharStyle :: NameSpaces -> Element -> Maybe CharStyle -> Maybe CharStyle
-elemToCharStyle ns element parentStyle
- | isElem ns "w" "style" element
- , Just "character" <- findAttr (elemName ns "w" "type") element
- , Just styleId <- findAttr (elemName ns "w" "styleId") element =
- Just (styleId, elemToRunStyle ns element parentStyle)
- | otherwise = Nothing
-
-getStyleChildren :: NameSpaces -> Element -> Maybe CharStyle -> [CharStyle]
+class ElemToStyle a where
+ cStyleType :: Maybe a -> String
+ elemToStyle :: NameSpaces -> Element -> Maybe a -> Maybe a
+ getStyleId :: a -> String
+
+instance ElemToStyle CharStyle where
+ cStyleType _ = "character"
+ elemToStyle ns element parentStyle
+ | isElem ns "w" "style" element
+ , Just "character" <- findAttr (elemName ns "w" "type") element
+ , Just styleId <- findAttr (elemName ns "w" "styleId") element =
+ Just (styleId, elemToRunStyle ns element parentStyle)
+ | otherwise = Nothing
+ getStyleId s = fst s
+
+instance ElemToStyle ParStyle where
+ cStyleType _ = "paragraph"
+ elemToStyle ns element parentStyle
+ | isElem ns "w" "style" element
+ , Just "paragraph" <- findAttr (elemName ns "w" "type") element
+ , Just styleId <- findAttr (elemName ns "w" "styleId") element =
+ Just (styleId, elemToParStyleData ns element parentStyle)
+ | otherwise = Nothing
+ getStyleId s = fst s
+
+getStyleChildren :: (ElemToStyle a) => NameSpaces -> Element -> Maybe a -> [a]
getStyleChildren ns element parentStyle
| isElem ns "w" "styles" element =
- mapMaybe (\e -> elemToCharStyle ns e parentStyle) $
+ mapMaybe (\e -> elemToStyle ns e parentStyle) $
filterChildren (\e' -> isBasedOnStyle ns e' parentStyle) element
| otherwise = []
-buildBasedOnList :: NameSpaces -> Element -> Maybe CharStyle -> [CharStyle]
+buildBasedOnList :: (ElemToStyle a) => NameSpaces -> Element -> Maybe a -> [a]
buildBasedOnList ns element rootStyle =
case (getStyleChildren ns element rootStyle) of
[] -> []
@@ -543,7 +579,8 @@ elemToBodyPart ns element
elemToBodyPart ns element
| isElem ns "w" "p" element
, Just (numId, lvl) <- elemToNumInfo ns element = do
- let parstyle = elemToParagraphStyle ns element
+ sty <- asks envParStyles
+ let parstyle = elemToParagraphStyle ns element sty
parparts <- mapD (elemToParPart ns) (elChildren element)
num <- asks envNumbering
case lookupLevel numId lvl num of
@@ -551,7 +588,8 @@ elemToBodyPart ns element
Nothing -> throwError WrongElem
elemToBodyPart ns element
| isElem ns "w" "p" element = do
- let parstyle = elemToParagraphStyle ns element
+ sty <- asks envParStyles
+ let parstyle = elemToParagraphStyle ns element sty
parparts <- mapD (elemToParPart ns) (elChildren element)
return $ Paragraph parstyle parparts
elemToBodyPart ns element
@@ -584,7 +622,7 @@ expandDrawingId s = do
target <- asks (lookupRelationship s . envRelationships)
case target of
Just filepath -> do
- bytes <- asks (lookup (combine "word" filepath) . envMedia)
+ bytes <- asks (lookup ("word/" ++ filepath) . envMedia)
case bytes of
Just bs -> return (filepath, bs)
Nothing -> throwError DocxError
@@ -625,17 +663,20 @@ elemToParPart ns element
return $ BookMark bmId bmName
elemToParPart ns element
| isElem ns "w" "hyperlink" element
- , Just anchor <- findAttr (elemName ns "w" "anchor") element = do
+ , Just relId <- findAttr (elemName ns "r" "id") element = do
runs <- mapD (elemToRun ns) (elChildren element)
- return $ InternalHyperLink anchor runs
+ rels <- asks envRelationships
+ case lookupRelationship relId rels of
+ Just target -> do
+ case findAttr (elemName ns "w" "anchor") element of
+ Just anchor -> return $ ExternalHyperLink (target ++ '#':anchor) runs
+ Nothing -> return $ ExternalHyperLink target runs
+ Nothing -> return $ ExternalHyperLink "" runs
elemToParPart ns element
| isElem ns "w" "hyperlink" element
- , Just relId <- findAttr (elemName ns "r" "id") element = do
+ , Just anchor <- findAttr (elemName ns "w" "anchor") element = do
runs <- mapD (elemToRun ns) (elChildren element)
- rels <- asks envRelationships
- return $ case lookupRelationship relId rels of
- Just target -> ExternalHyperLink target runs
- Nothing -> ExternalHyperLink "" runs
+ return $ InternalHyperLink anchor runs
elemToParPart ns element
| isElem ns "m" "oMath" element =
(eitherToD $ readOMML $ showElement element) >>= (return . PlainOMath)
@@ -684,14 +725,30 @@ elemToRun ns element
return $ Run runStyle runElems
elemToRun _ _ = throwError WrongElem
-elemToParagraphStyle :: NameSpaces -> Element -> ParagraphStyle
-elemToParagraphStyle ns element
+getParentStyleValue :: (ParStyleData -> Maybe a) -> ParStyleData -> Maybe a
+getParentStyleValue field style
+ | Just value <- field style = Just value
+ | Just parentStyle <- psStyle style
+ = getParentStyleValue field (snd parentStyle)
+getParentStyleValue _ _ = Nothing
+
+getParStyleField :: (ParStyleData -> Maybe a) -> ParStyleMap -> [String] ->
+ Maybe a
+getParStyleField field stylemap styles
+ | x <- mapMaybe (\x -> M.lookup x stylemap) styles
+ , (y:_) <- mapMaybe (getParentStyleValue field) x
+ = Just y
+getParStyleField _ _ _ = Nothing
+
+elemToParagraphStyle :: NameSpaces -> Element -> ParStyleMap -> ParagraphStyle
+elemToParagraphStyle ns element sty
| Just pPr <- findChild (elemName ns "w" "pPr") element =
- ParagraphStyle
- {pStyle =
+ let style =
mapMaybe
(findAttr (elemName ns "w" "val"))
(findChildren (elemName ns "w" "pStyle") pPr)
+ in ParagraphStyle
+ {pStyle = style
, indentation =
findChild (elemName ns "w" "ind") pPr >>=
elemToParIndentation ns
@@ -703,8 +760,10 @@ elemToParagraphStyle ns element
Just "none" -> False
Just _ -> True
Nothing -> False
+ , pHeading = getParStyleField headingLev sty style
+ , pBlockQuote = getParStyleField isBlockQuote sty style
}
-elemToParagraphStyle _ _ = defaultParagraphStyle
+elemToParagraphStyle _ _ _ = defaultParagraphStyle
checkOnOff :: NameSpaces -> Element -> QName -> Maybe Bool
checkOnOff ns rPr tag
@@ -758,6 +817,45 @@ elemToRunStyle ns element parentStyle
}
elemToRunStyle _ _ _ = defaultRunStyle
+isNumericNotNull :: String -> Bool
+isNumericNotNull str = (str /= []) && (all isDigit str)
+
+getHeaderLevel :: NameSpaces -> Element -> Maybe (String,Int)
+getHeaderLevel ns element
+ | Just styleId <- findAttr (elemName ns "w" "styleId") element
+ , Just index <- stripPrefix "Heading" styleId
+ , isNumericNotNull index = Just (styleId, read index)
+ | Just styleId <- findAttr (elemName ns "w" "styleId") element
+ , Just index <- findChild (elemName ns "w" "name") element >>=
+ findAttr (elemName ns "w" "val") >>=
+ stripPrefix "heading "
+ , isNumericNotNull index = Just (styleId, read index)
+getHeaderLevel _ _ = Nothing
+
+blockQuoteStyleIds :: [String]
+blockQuoteStyleIds = ["Quote", "BlockQuote", "BlockQuotation"]
+
+blockQuoteStyleNames :: [String]
+blockQuoteStyleNames = ["Quote", "Block Text"]
+
+getBlockQuote :: NameSpaces -> Element -> Maybe Bool
+getBlockQuote ns element
+ | Just styleId <- findAttr (elemName ns "w" "styleId") element
+ , styleId `elem` blockQuoteStyleIds = Just True
+ | Just styleName <- findChild (elemName ns "w" "name") element >>=
+ findAttr (elemName ns "w" "val")
+ , styleName `elem` blockQuoteStyleNames = Just True
+getBlockQuote _ _ = Nothing
+
+elemToParStyleData :: NameSpaces -> Element -> Maybe ParStyle -> ParStyleData
+elemToParStyleData ns element parentStyle =
+ ParStyleData
+ {
+ headingLev = getHeaderLevel ns element
+ , isBlockQuote = getBlockQuote ns element
+ , psStyle = parentStyle
+ }
+
elemToRunElem :: NameSpaces -> Element -> D RunElem
elemToRunElem ns element
| isElem ns "w" "t" element
diff --git a/src/Text/Pandoc/Readers/HTML.hs b/src/Text/Pandoc/Readers/HTML.hs
index 4ea5f41d5..2a23f2a62 100644
--- a/src/Text/Pandoc/Readers/HTML.hs
+++ b/src/Text/Pandoc/Readers/HTML.hs
@@ -440,7 +440,7 @@ pCodeBlock :: TagParser Blocks
pCodeBlock = try $ do
TagOpen _ attr <- pSatisfy (~== TagOpen "pre" [])
contents <- manyTill pAnyTag (pCloses "pre" <|> eof)
- let rawText = concatMap fromTagText $ filter isTagText contents
+ let rawText = concatMap tagToString contents
-- drop leading newline if any
let result' = case rawText of
'\n':xs -> xs
@@ -451,6 +451,11 @@ pCodeBlock = try $ do
_ -> result'
return $ B.codeBlockWith (mkAttr attr) result
+tagToString :: Tag String -> String
+tagToString (TagText s) = s
+tagToString (TagOpen "br" _) = "\n"
+tagToString _ = ""
+
inline :: TagParser Inlines
inline = choice
[ eNoteref
@@ -735,7 +740,7 @@ pSpace = many1 (satisfy isSpace) >> return B.space
--
eitherBlockOrInline :: [String]
-eitherBlockOrInline = ["audio", "applet", "button", "iframe",
+eitherBlockOrInline = ["audio", "applet", "button", "iframe", "embed",
"del", "ins",
"progress", "map", "area", "noscript", "script",
"object", "svg", "video", "source"]
@@ -753,7 +758,7 @@ blockHtmlTags :: [String]
blockHtmlTags = ["?xml", "!DOCTYPE", "address", "article", "aside",
"blockquote", "body", "button", "canvas",
"caption", "center", "col", "colgroup", "dd", "dir", "div",
- "dl", "dt", "embed", "fieldset", "figcaption", "figure",
+ "dl", "dt", "fieldset", "figcaption", "figure",
"footer", "form", "h1", "h2", "h3", "h4",
"h5", "h6", "head", "header", "hgroup", "hr", "html",
"isindex", "menu", "noframes", "ol", "output", "p", "pre",
diff --git a/src/Text/Pandoc/Readers/LaTeX.hs b/src/Text/Pandoc/Readers/LaTeX.hs
index 9f51e9a8f..9420d602f 100644
--- a/src/Text/Pandoc/Readers/LaTeX.hs
+++ b/src/Text/Pandoc/Readers/LaTeX.hs
@@ -494,6 +494,7 @@ inlineCommands = M.fromList $
, ("citealp", citation "citealp" NormalCitation False)
, ("citealp*", citation "citealp*" NormalCitation False)
, ("autocite", citation "autocite" NormalCitation False)
+ , ("smartcite", citation "smartcite" NormalCitation False)
, ("footcite", inNote <$> citation "footcite" NormalCitation False)
, ("parencite", citation "parencite" NormalCitation False)
, ("supercite", citation "supercite" NormalCitation False)
@@ -516,6 +517,7 @@ inlineCommands = M.fromList $
, ("supercites", citation "supercites" NormalCitation True)
, ("footcitetexts", inNote <$> citation "footcitetexts" NormalCitation True)
, ("Autocite", citation "Autocite" NormalCitation False)
+ , ("Smartcite", citation "Smartcite" NormalCitation False)
, ("Footcite", citation "Footcite" NormalCitation False)
, ("Parencite", citation "Parencite" NormalCitation False)
, ("Supercite", citation "Supercite" NormalCitation False)
diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs
index 02a787670..b8487b4e6 100644
--- a/src/Text/Pandoc/Readers/Markdown.hs
+++ b/src/Text/Pandoc/Readers/Markdown.hs
@@ -117,6 +117,12 @@ isBlank _ = False
-- auxiliary functions
--
+-- | Succeeds when we're in list context.
+inList :: MarkdownParser ()
+inList = do
+ ctx <- stateParserContext <$> getState
+ guard (ctx == ListItemState)
+
isNull :: F Inlines -> Bool
isNull ils = B.isNull $ runF ils def
@@ -735,9 +741,9 @@ anyOrderedListStart = try $ do
skipNonindentSpaces
notFollowedBy $ string "p." >> spaceChar >> digit -- page number
res <- do guardDisabled Ext_fancy_lists
- many1 digit
+ start <- many1 digit >>= safeRead
char '.'
- return (1, DefaultStyle, DefaultDelim)
+ return (start, DefaultStyle, DefaultDelim)
<|> do (num, style, delim) <- anyOrderedListMarker
-- if it could be an abbreviated first name,
-- insist on more than one space
@@ -926,6 +932,8 @@ para = try $ do
<|> (guardEnabled Ext_backtick_code_blocks >> () <$ lookAhead codeBlockFenced)
<|> (guardDisabled Ext_blank_before_header >> () <$ lookAhead header)
<|> (guardEnabled Ext_lists_without_preceding_blankline >>
+ -- Avoid creating a paragraph in a nested list.
+ notFollowedBy' inList >>
() <$ lookAhead listStart)
<|> do guardEnabled Ext_native_divs
inHtmlBlock <- stateInHtmlBlock <$> getState
@@ -1610,8 +1618,7 @@ endline = try $ do
newline
notFollowedBy blankline
-- parse potential list-starts differently if in a list:
- st <- getState
- when (stateParserContext st == ListItemState) $ notFollowedBy listStart
+ notFollowedBy (inList >> listStart)
guardDisabled Ext_lists_without_preceding_blankline <|> notFollowedBy listStart
guardEnabled Ext_blank_before_blockquote <|> notFollowedBy emailBlockQuoteStart
guardEnabled Ext_blank_before_header <|> notFollowedBy (char '#') -- atx header
diff --git a/src/Text/Pandoc/Readers/Org.hs b/src/Text/Pandoc/Readers/Org.hs
index b07f96846..579e38a38 100644
--- a/src/Text/Pandoc/Readers/Org.hs
+++ b/src/Text/Pandoc/Readers/Org.hs
@@ -42,6 +42,7 @@ import Text.Pandoc.Parsing hiding ( F, unF, askF, asksF, runF
import Text.Pandoc.Readers.LaTeX (inlineCommand, rawLaTeXInline)
import Text.Pandoc.Shared (compactify', compactify'DL)
import Text.TeXMath (readTeX, writePandoc, DisplayType(..))
+import qualified Text.TeXMath.Readers.MathML.EntityMap as MathMLEntityMap
import Control.Applicative ( Applicative, pure
, (<$>), (<$), (<*>), (<*), (*>) )
@@ -69,7 +70,32 @@ parseOrg = do
blocks' <- parseBlocks
st <- getState
let meta = runF (orgStateMeta' st) st
- return $ Pandoc meta $ filter (/= Null) (B.toList $ runF blocks' st)
+ let removeUnwantedBlocks = dropCommentTrees . filter (/= Null)
+ return $ Pandoc meta $ removeUnwantedBlocks (B.toList $ runF blocks' st)
+
+-- | Drop COMMENT headers and the document tree below those headers.
+dropCommentTrees :: [Block] -> [Block]
+dropCommentTrees [] = []
+dropCommentTrees blks@(b:bs) =
+ maybe blks (flip dropUntilHeaderAboveLevel bs) $ commentHeaderLevel b
+
+-- | Return the level of a header starting a comment tree and Nothing
+-- otherwise.
+commentHeaderLevel :: Block -> Maybe Int
+commentHeaderLevel blk =
+ case blk of
+ (Header level _ ((Str "COMMENT"):_)) -> Just level
+ _ -> Nothing
+
+-- | Drop blocks until a header on or above the given level is seen
+dropUntilHeaderAboveLevel :: Int -> [Block] -> [Block]
+dropUntilHeaderAboveLevel n = dropWhile (not . isHeaderLevelLowerEq n)
+
+isHeaderLevelLowerEq :: Int -> Block -> Bool
+isHeaderLevelLowerEq n blk =
+ case blk of
+ (Header level _ _) -> n >= level
+ _ -> False
--
-- Parser State for Org
@@ -828,12 +854,14 @@ list :: OrgParser (F Blocks)
list = choice [ definitionList, bulletList, orderedList ] <?> "list"
definitionList :: OrgParser (F Blocks)
-definitionList = fmap B.definitionList . fmap compactify'DL . sequence
- <$> many1 (definitionListItem bulletListStart)
+definitionList = try $ do n <- lookAhead (bulletListStart' Nothing)
+ fmap B.definitionList . fmap compactify'DL . sequence
+ <$> many1 (definitionListItem $ bulletListStart' (Just n))
bulletList :: OrgParser (F Blocks)
-bulletList = fmap B.bulletList . fmap compactify' . sequence
- <$> many1 (listItem bulletListStart)
+bulletList = try $ do n <- lookAhead (bulletListStart' Nothing)
+ fmap B.bulletList . fmap compactify' . sequence
+ <$> many1 (listItem (bulletListStart' $ Just n))
orderedList :: OrgParser (F Blocks)
orderedList = fmap B.orderedList . fmap compactify' . sequence
@@ -845,10 +873,27 @@ genericListStart listMarker = try $
(+) <$> (length <$> many spaceChar)
<*> (length <$> listMarker <* many1 spaceChar)
--- parses bullet list start and returns its length (excl. following whitespace)
+-- parses bullet list marker. maybe we know the indent level
bulletListStart :: OrgParser Int
-bulletListStart = genericListStart bulletListMarker
- where bulletListMarker = pure <$> oneOf "*-+"
+bulletListStart = bulletListStart' Nothing
+
+bulletListStart' :: Maybe Int -> OrgParser Int
+-- returns length of bulletList prefix, inclusive of marker
+bulletListStart' Nothing = do ind <- length <$> many spaceChar
+ when (ind == 0) $ notFollowedBy (char '*')
+ oneOf bullets
+ many1 spaceChar
+ return (ind + 1)
+ -- Unindented lists are legal, but they can't use '*' bullets
+ -- We return n to maintain compatibility with the generic listItem
+bulletListStart' (Just n) = do count (n-1) spaceChar
+ when (n == 1) $ notFollowedBy (char '*')
+ oneOf bullets
+ many1 spaceChar
+ return n
+
+bullets :: String
+bullets = "*+-"
orderedListStart :: OrgParser Int
orderedListStart = genericListStart orderedListMarker
@@ -863,7 +908,7 @@ definitionListItem parseMarkerGetLength = try $ do
line1 <- anyLineNewline
blank <- option "" ("\n" <$ blankline)
cont <- concat <$> many (listContinuation markerLength)
- term' <- parseFromString inline term
+ term' <- parseFromString parseInlines term
contents' <- parseFromString parseBlocks $ line1 ++ blank ++ cont
return $ (,) <$> term' <*> fmap (:[]) contents'
@@ -927,7 +972,7 @@ parseInlines = trimInlinesF . mconcat <$> many1 inline
-- treat these as potentially non-text when parsing inline:
specialChars :: [Char]
-specialChars = "\"$'()*+-./:<=>[\\]^_{|}~"
+specialChars = "\"$'()*+-,./:<=>[\\]^_{|}~"
whitespace :: OrgParser (F Inlines)
@@ -1054,7 +1099,7 @@ linkOrImage = explicitOrImageLink
explicitOrImageLink :: OrgParser (F Inlines)
explicitOrImageLink = try $ do
char '['
- srcF <- applyCustomLinkFormat =<< linkTarget
+ srcF <- applyCustomLinkFormat =<< possiblyEmptyLinkTarget
title <- enclosedRaw (char '[') (char ']')
title' <- parseFromString (mconcat <$> many inline) title
char ']'
@@ -1087,6 +1132,9 @@ selfTarget = try $ char '[' *> linkTarget <* char ']'
linkTarget :: OrgParser String
linkTarget = enclosedByPair '[' ']' (noneOf "\n\r[]")
+possiblyEmptyLinkTarget :: OrgParser String
+possiblyEmptyLinkTarget = try linkTarget <|> ("" <$ string "[]")
+
applyCustomLinkFormat :: String -> OrgParser (F String)
applyCustomLinkFormat link = do
let (linkType, rest) = break (== ':') link
@@ -1094,27 +1142,33 @@ applyCustomLinkFormat link = do
formatter <- M.lookup linkType <$> asksF orgStateLinkFormatters
return $ maybe link ($ drop 1 rest) formatter
-
linkToInlinesF :: String -> Inlines -> F Inlines
-linkToInlinesF s@('#':_) = pure . B.link s ""
-linkToInlinesF s
- | isImageFilename s = const . pure $ B.image s "" ""
- | isUri s = pure . B.link s ""
- | isRelativeUrl s = pure . B.link s ""
-linkToInlinesF s = \title -> do
- anchorB <- (s `elem`) <$> asksF orgStateAnchorIds
- if anchorB
- then pure $ B.link ('#':s) "" title
- else pure $ B.emph title
-
-isRelativeUrl :: String -> Bool
-isRelativeUrl s = (':' `notElem` s) && ("./" `isPrefixOf` s)
+linkToInlinesF s =
+ case s of
+ "" -> pure . B.link "" ""
+ ('#':_) -> pure . B.link s ""
+ _ | isImageFilename s -> const . pure $ B.image s "" ""
+ _ | isUri s -> pure . B.link s ""
+ _ | isRelativeFilePath s -> pure . B.link s ""
+ _ | isAbsoluteFilePath s -> pure . B.link ("file://" ++ s) ""
+ _ -> \title -> do
+ anchorB <- (s `elem`) <$> asksF orgStateAnchorIds
+ if anchorB
+ then pure $ B.link ('#':s) "" title
+ else pure $ B.emph title
+
+isRelativeFilePath :: String -> Bool
+isRelativeFilePath s = (("./" `isPrefixOf` s) || ("../" `isPrefixOf` s)) &&
+ (':' `notElem` s)
isUri :: String -> Bool
isUri s = let (scheme, path) = break (== ':') s
in all (\c -> isAlphaNum c || c `elem` ".-") scheme
&& not (null path)
+isAbsoluteFilePath :: String -> Bool
+isAbsoluteFilePath = ('/' ==) . head
+
isImageFilename :: String -> Bool
isImageFilename filename =
any (\x -> ('.':x) `isSuffixOf` filename) imageExtensions &&
@@ -1205,10 +1259,10 @@ displayMath = return . B.displayMath <$> choice [ rawMathBetween "\\[" "\\]"
]
symbol :: OrgParser (F Inlines)
symbol = return . B.str . (: "") <$> (oneOf specialChars >>= updatePositions)
- where updatePositions c
- | c `elem` emphasisPreChars = c <$ updateLastPreCharPos
- | c `elem` emphasisForbiddenBorderChars = c <$ updateLastForbiddenCharPos
- | otherwise = return c
+ where updatePositions c = do
+ when (c `elem` emphasisPreChars) updateLastPreCharPos
+ when (c `elem` emphasisForbiddenBorderChars) updateLastForbiddenCharPos
+ return c
emphasisBetween :: Char
-> OrgParser (F Inlines)
@@ -1387,7 +1441,8 @@ simpleSubOrSuperString = try $
inlineLaTeX :: OrgParser (F Inlines)
inlineLaTeX = try $ do
cmd <- inlineLaTeXCommand
- maybe mzero returnF $ parseAsMath cmd `mplus` parseAsInlineLaTeX cmd
+ maybe mzero returnF $
+ parseAsMath cmd `mplus` parseAsMathMLSym cmd `mplus` parseAsInlineLaTeX cmd
where
parseAsMath :: String -> Maybe Inlines
parseAsMath cs = B.fromList <$> texMathToPandoc cs
@@ -1395,6 +1450,11 @@ inlineLaTeX = try $ do
parseAsInlineLaTeX :: String -> Maybe Inlines
parseAsInlineLaTeX cs = maybeRight $ runParser inlineCommand state "" cs
+ parseAsMathMLSym :: String -> Maybe Inlines
+ parseAsMathMLSym cs = B.str <$> MathMLEntityMap.getUnicode (clean cs)
+ -- dropWhileEnd would be nice here, but it's not available before base 4.5
+ where clean = reverse . dropWhile (`elem` "{}") . reverse . drop 1
+
state :: ParserState
state = def{ stateOptions = def{ readerParseRaw = True }}
diff --git a/src/Text/Pandoc/Readers/RST.hs b/src/Text/Pandoc/Readers/RST.hs
index e5eccb116..732956981 100644
--- a/src/Text/Pandoc/Readers/RST.hs
+++ b/src/Text/Pandoc/Readers/RST.hs
@@ -47,7 +47,7 @@ import Text.Pandoc.Builder (Inlines, Blocks, trimInlines, (<>))
import qualified Text.Pandoc.Builder as B
import Data.Monoid (mconcat, mempty)
import Data.Sequence (viewr, ViewR(..))
-import Data.Char (toLower, isHexDigit)
+import Data.Char (toLower, isHexDigit, isSpace)
-- | Parse reStructuredText string and return Pandoc document.
readRST :: ReaderOptions -- ^ Reader options
@@ -335,6 +335,13 @@ indentedBlock = try $ do
optional blanklines
return $ unlines lns
+quotedBlock :: Parser [Char] st [Char]
+quotedBlock = try $ do
+ quote <- lookAhead $ oneOf "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
+ lns <- many1 $ lookAhead (char quote) >> anyLine
+ optional blanklines
+ return $ unlines lns
+
codeBlockStart :: Parser [Char] st Char
codeBlockStart = string "::" >> blankline >> blankline
@@ -342,7 +349,8 @@ codeBlock :: Parser [Char] st Blocks
codeBlock = try $ codeBlockStart >> codeBlockBody
codeBlockBody :: Parser [Char] st Blocks
-codeBlockBody = try $ B.codeBlock . stripTrailingNewlines <$> indentedBlock
+codeBlockBody = try $ B.codeBlock . stripTrailingNewlines <$>
+ (indentedBlock <|> quotedBlock)
lhsCodeBlock :: RSTParser Blocks
lhsCodeBlock = try $ do
@@ -513,7 +521,6 @@ directive = try $ do
-- TODO: line-block, parsed-literal, table, csv-table, list-table
-- date
-- include
--- class
-- title
directive' :: RSTParser Blocks
directive' = do
@@ -594,6 +601,13 @@ directive' = do
Just t -> B.link (escapeURI $ trim t) ""
$ B.image src "" alt
Nothing -> B.image src "" alt
+ "class" -> do
+ let attrs = ("", (splitBy isSpace $ trim top), map (\(k,v) -> (k, trimr v)) fields)
+ -- directive content or the first immediately following element
+ children <- case body of
+ "" -> block
+ _ -> parseFromString parseBlocks body'
+ return $ B.divWith attrs children
_ -> return mempty
-- TODO:
diff --git a/src/Text/Pandoc/Readers/TWiki.hs b/src/Text/Pandoc/Readers/TWiki.hs
new file mode 100644
index 000000000..c2325c0ea
--- /dev/null
+++ b/src/Text/Pandoc/Readers/TWiki.hs
@@ -0,0 +1,526 @@
+{-# LANGUAGE RelaxedPolyRec, FlexibleInstances, TypeSynonymInstances #-}
+-- RelaxedPolyRec needed for inlinesBetween on GHC < 7
+{-
+ Copyright (C) 2014 Alexander Sulfrian <alexander.sulfrian@fu-berlin.de>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+-}
+
+{- |
+ Module : Text.Pandoc.Readers.TWiki
+ Copyright : Copyright (C) 2014 Alexander Sulfrian
+ License : GNU GPL, version 2 or above
+
+ Maintainer : Alexander Sulfrian <alexander.sulfrian@fu-berlin.de>
+ Stability : alpha
+ Portability : portable
+
+Conversion of twiki text to 'Pandoc' document.
+-}
+module Text.Pandoc.Readers.TWiki ( readTWiki
+ , readTWikiWithWarnings
+ ) where
+
+import Text.Pandoc.Definition
+import qualified Text.Pandoc.Builder as B
+import Text.Pandoc.Options
+import Text.Pandoc.Parsing hiding (enclosed, macro, nested)
+import Text.Pandoc.Readers.HTML (htmlTag, isCommentTag)
+import Data.Monoid (Monoid, mconcat, mempty)
+import Control.Applicative ((<$>), (<*), (*>), (<$))
+import Control.Monad
+import Text.Printf (printf)
+import Debug.Trace (trace)
+import Text.Pandoc.XML (fromEntities)
+import Data.Maybe (fromMaybe)
+import Text.HTML.TagSoup
+import Data.Char (isAlphaNum)
+import qualified Data.Foldable as F
+
+-- | Read twiki from an input string and return a Pandoc document.
+readTWiki :: ReaderOptions -- ^ Reader options
+ -> String -- ^ String to parse (assuming @'\n'@ line endings)
+ -> Pandoc
+readTWiki opts s =
+ (readWith parseTWiki) def{ stateOptions = opts } (s ++ "\n\n")
+
+readTWikiWithWarnings :: ReaderOptions -- ^ Reader options
+ -> String -- ^ String to parse (assuming @'\n'@ line endings)
+ -> (Pandoc, [String])
+readTWikiWithWarnings opts s =
+ (readWith parseTWikiWithWarnings) def{ stateOptions = opts } (s ++ "\n\n")
+ where parseTWikiWithWarnings = do
+ doc <- parseTWiki
+ warnings <- stateWarnings <$> getState
+ return (doc, warnings)
+
+type TWParser = Parser [Char] ParserState
+
+--
+-- utility functions
+--
+
+tryMsg :: String -> TWParser a -> TWParser a
+tryMsg msg p = try p <?> msg
+
+skip :: TWParser a -> TWParser ()
+skip parser = parser >> return ()
+
+nested :: TWParser a -> TWParser a
+nested p = do
+ nestlevel <- stateMaxNestingLevel <$> getState
+ guard $ nestlevel > 0
+ updateState $ \st -> st{ stateMaxNestingLevel = stateMaxNestingLevel st - 1 }
+ res <- p
+ updateState $ \st -> st{ stateMaxNestingLevel = nestlevel }
+ return res
+
+htmlElement :: String -> TWParser (Attr, String)
+htmlElement tag = tryMsg tag $ do
+ (TagOpen _ attr, _) <- htmlTag (~== TagOpen tag [])
+ content <- manyTill anyChar (endtag <|> endofinput)
+ return (htmlAttrToPandoc attr, trim content)
+ where
+ endtag = skip $ htmlTag (~== TagClose tag)
+ endofinput = lookAhead $ try $ skipMany blankline >> skipSpaces >> eof
+ trim = dropWhile (=='\n') . reverse . dropWhile (=='\n') . reverse
+
+htmlAttrToPandoc :: [Attribute String] -> Attr
+htmlAttrToPandoc attrs = (ident, classes, keyvals)
+ where
+ ident = fromMaybe "" $ lookup "id" attrs
+ classes = maybe [] words $ lookup "class" attrs
+ keyvals = [(k,v) | (k,v) <- attrs, k /= "id" && k /= "class"]
+
+parseHtmlContentWithAttrs :: String -> TWParser a -> TWParser (Attr, [a])
+parseHtmlContentWithAttrs tag parser = do
+ (attr, content) <- htmlElement tag
+ parsedContent <- try $ parseContent content
+ return (attr, parsedContent)
+ where
+ parseContent = parseFromString $ nested $ manyTill parser endOfContent
+ endOfContent = try $ skipMany blankline >> skipSpaces >> eof
+
+parseHtmlContent :: String -> TWParser a -> TWParser [a]
+parseHtmlContent tag p = parseHtmlContentWithAttrs tag p >>= return . snd
+
+--
+-- main parser
+--
+
+parseTWiki :: TWParser Pandoc
+parseTWiki = do
+ bs <- mconcat <$> many block
+ spaces
+ eof
+ return $ B.doc bs
+
+
+--
+-- block parsers
+--
+
+block :: TWParser B.Blocks
+block = do
+ tr <- getOption readerTrace
+ pos <- getPosition
+ res <- mempty <$ skipMany1 blankline
+ <|> blockElements
+ <|> para
+ skipMany blankline
+ when tr $
+ trace (printf "line %d: %s" (sourceLine pos)
+ (take 60 $ show $ B.toList res)) (return ())
+ return res
+
+blockElements :: TWParser B.Blocks
+blockElements = choice [ separator
+ , header
+ , verbatim
+ , literal
+ , list ""
+ , table
+ , blockQuote
+ , noautolink
+ ]
+
+separator :: TWParser B.Blocks
+separator = tryMsg "separator" $ string "---" >> newline >> return B.horizontalRule
+
+header :: TWParser B.Blocks
+header = tryMsg "header" $ do
+ string "---"
+ level <- many1 (char '+') >>= return . length
+ guard $ level <= 6
+ classes <- option [] $ string "!!" >> return ["unnumbered"]
+ skipSpaces
+ content <- B.trimInlines . mconcat <$> manyTill inline newline
+ attr <- registerHeader ("", classes, []) content
+ return $ B.headerWith attr level $ content
+
+verbatim :: TWParser B.Blocks
+verbatim = (htmlElement "verbatim" <|> htmlElement "pre")
+ >>= return . (uncurry B.codeBlockWith)
+
+literal :: TWParser B.Blocks
+literal = htmlElement "literal" >>= return . rawBlock
+ where
+ format (_, _, kvs) = fromMaybe "html" $ lookup "format" kvs
+ rawBlock (attrs, content) = B.rawBlock (format attrs) content
+
+list :: String -> TWParser B.Blocks
+list prefix = choice [ bulletList prefix
+ , orderedList prefix
+ , definitionList prefix]
+
+definitionList :: String -> TWParser B.Blocks
+definitionList prefix = tryMsg "definitionList" $ do
+ indent <- lookAhead $ string prefix *> (many1 $ string " ") <* string "$ "
+ elements <- many $ parseDefinitionListItem (prefix ++ concat indent)
+ return $ B.definitionList elements
+ where
+ parseDefinitionListItem :: String -> TWParser (B.Inlines, [B.Blocks])
+ parseDefinitionListItem indent = do
+ string (indent ++ "$ ") >> skipSpaces
+ term <- many1Till inline $ string ": "
+ line <- listItemLine indent $ string "$ "
+ return $ (mconcat term, [line])
+
+bulletList :: String -> TWParser B.Blocks
+bulletList prefix = tryMsg "bulletList" $
+ parseList prefix (char '*') (char ' ')
+
+orderedList :: String -> TWParser B.Blocks
+orderedList prefix = tryMsg "orderedList" $
+ parseList prefix (oneOf "1iIaA") (string ". ")
+
+parseList :: Show a => String -> TWParser Char -> TWParser a -> TWParser B.Blocks
+parseList prefix marker delim = do
+ (indent, style) <- lookAhead $ string prefix *> listStyle <* delim
+ blocks <- many $ parseListItem (prefix ++ indent) (char style <* delim)
+ return $ case style of
+ '1' -> B.orderedListWith (1, DefaultStyle, DefaultDelim) blocks
+ 'i' -> B.orderedListWith (1, LowerRoman, DefaultDelim) blocks
+ 'I' -> B.orderedListWith (1, UpperRoman, DefaultDelim) blocks
+ 'a' -> B.orderedListWith (1, LowerAlpha, DefaultDelim) blocks
+ 'A' -> B.orderedListWith (1, UpperAlpha, DefaultDelim) blocks
+ _ -> B.bulletList blocks
+ where
+ listStyle = do
+ indent <- many1 $ string " "
+ style <- marker
+ return (concat indent, style)
+
+parseListItem :: Show a => String -> TWParser a -> TWParser B.Blocks
+parseListItem prefix marker = string prefix >> marker >> listItemLine prefix marker
+
+listItemLine :: Show a => String -> TWParser a -> TWParser B.Blocks
+listItemLine prefix marker = lineContent >>= parseContent >>= return . mconcat
+ where
+ lineContent = do
+ content <- anyLine
+ continuation <- optionMaybe listContinuation
+ return $ filterSpaces content ++ "\n" ++ (maybe "" (" " ++) continuation)
+ filterSpaces = reverse . dropWhile (== ' ') . reverse
+ listContinuation = notFollowedBy (string prefix >> marker) >>
+ string " " >> lineContent
+ parseContent = parseFromString $ many1 $ nestedList <|> parseInline
+ parseInline = many1Till inline (lastNewline <|> newlineBeforeNestedList) >>=
+ return . B.plain . mconcat
+ nestedList = list prefix
+ lastNewline = try $ char '\n' <* eof
+ newlineBeforeNestedList = try $ char '\n' <* lookAhead nestedList
+
+table :: TWParser B.Blocks
+table = try $ do
+ tableHead <- optionMaybe $ many1Till tableParseHeader newline >>= return . unzip
+ rows <- many1 tableParseRow
+ return $ buildTable mempty rows $ fromMaybe (align rows, columns rows) tableHead
+ where
+ buildTable caption rows (aligns, heads)
+ = B.table caption aligns heads rows
+ align rows = replicate (columCount rows) (AlignDefault, 0)
+ columns rows = replicate (columCount rows) mempty
+ columCount rows = length $ head rows
+
+tableParseHeader :: TWParser ((Alignment, Double), B.Blocks)
+tableParseHeader = try $ do
+ char '|'
+ leftSpaces <- many spaceChar >>= return . length
+ char '*'
+ content <- tableColumnContent (char '*' >> skipSpaces >> char '|')
+ char '*'
+ rightSpaces <- many spaceChar >>= return . length
+ optional tableEndOfRow
+ return (tableAlign leftSpaces rightSpaces, content)
+ where
+ tableAlign left right
+ | left >= 2 && left == right = (AlignCenter, 0)
+ | left > right = (AlignRight, 0)
+ | otherwise = (AlignLeft, 0)
+
+tableParseRow :: TWParser [B.Blocks]
+tableParseRow = many1Till tableParseColumn newline
+
+tableParseColumn :: TWParser B.Blocks
+tableParseColumn = char '|' *> skipSpaces *>
+ tableColumnContent (skipSpaces >> char '|')
+ <* skipSpaces <* optional tableEndOfRow
+
+tableEndOfRow :: TWParser Char
+tableEndOfRow = lookAhead (try $ char '|' >> char '\n') >> char '|'
+
+tableColumnContent :: Show a => TWParser a -> TWParser B.Blocks
+tableColumnContent end = manyTill content (lookAhead $ try end) >>= return . B.plain . mconcat
+ where
+ content = continuation <|> inline
+ continuation = try $ char '\\' >> newline >> return mempty
+
+blockQuote :: TWParser B.Blocks
+blockQuote = parseHtmlContent "blockquote" block >>= return . B.blockQuote . mconcat
+
+noautolink :: TWParser B.Blocks
+noautolink = do
+ (_, content) <- htmlElement "noautolink"
+ st <- getState
+ setState $ st{ stateAllowLinks = False }
+ blocks <- try $ parseContent content
+ setState $ st{ stateAllowLinks = True }
+ return $ mconcat blocks
+ where
+ parseContent = parseFromString $ many $ block
+
+para :: TWParser B.Blocks
+para = many1Till inline endOfParaElement >>= return . result . mconcat
+ where
+ endOfParaElement = lookAhead $ endOfInput <|> endOfPara <|> newBlockElement
+ endOfInput = try $ skipMany blankline >> skipSpaces >> eof
+ endOfPara = try $ blankline >> skipMany1 blankline
+ newBlockElement = try $ blankline >> skip blockElements
+ result content = if F.all (==Space) content
+ then mempty
+ else B.para $ B.trimInlines content
+
+
+--
+-- inline parsers
+--
+
+inline :: TWParser B.Inlines
+inline = choice [ whitespace
+ , br
+ , macro
+ , strong
+ , strongHtml
+ , strongAndEmph
+ , emph
+ , emphHtml
+ , boldCode
+ , smart
+ , link
+ , htmlComment
+ , code
+ , codeHtml
+ , nop
+ , autoLink
+ , str
+ , symbol
+ ] <?> "inline"
+
+whitespace :: TWParser B.Inlines
+whitespace = (lb <|> regsp) >>= return
+ where lb = try $ skipMany spaceChar >> linebreak >> return B.space
+ regsp = try $ skipMany1 spaceChar >> return B.space
+
+br :: TWParser B.Inlines
+br = try $ string "%BR%" >> return B.linebreak
+
+linebreak :: TWParser B.Inlines
+linebreak = newline >> notFollowedBy newline >> (lastNewline <|> innerNewline)
+ where lastNewline = eof >> return mempty
+ innerNewline = return B.space
+
+between :: (Show b, Monoid c) => TWParser a -> TWParser b -> (TWParser b -> TWParser c) -> TWParser c
+between start end p =
+ mconcat <$> try (start >> notFollowedBy whitespace >> many1Till (p end) end)
+
+enclosed :: (Show a, Monoid b) => TWParser a -> (TWParser a -> TWParser b) -> TWParser b
+enclosed sep p = between sep (try $ sep <* endMarker) p
+ where
+ endMarker = lookAhead $ skip endSpace <|> skip (oneOf ".,!?:)|") <|> eof
+ endSpace = (spaceChar <|> newline) >> return B.space
+
+macro :: TWParser B.Inlines
+macro = macroWithParameters <|> withoutParameters
+ where
+ withoutParameters = enclosed (char '%') (\_ -> macroName) >>= return . emptySpan
+ emptySpan name = buildSpan name [] mempty
+
+macroWithParameters :: TWParser B.Inlines
+macroWithParameters = try $ do
+ char '%'
+ name <- macroName
+ (content, kvs) <- attributes
+ char '%'
+ return $ buildSpan name kvs $ B.str content
+
+buildSpan :: String -> [(String, String)] -> B.Inlines -> B.Inlines
+buildSpan className kvs = B.spanWith attrs
+ where
+ attrs = ("", ["twiki-macro", className] ++ additionalClasses, kvsWithoutClasses)
+ additionalClasses = maybe [] words $ lookup "class" kvs
+ kvsWithoutClasses = [(k,v) | (k,v) <- kvs, k /= "class"]
+
+macroName :: TWParser String
+macroName = do
+ first <- letter
+ rest <- many $ alphaNum <|> char '_'
+ return (first:rest)
+
+attributes :: TWParser (String, [(String, String)])
+attributes = char '{' *> spnl *> many (attribute <* spnl) <* char '}' >>=
+ return . foldr (either mkContent mkKvs) ([], [])
+ where
+ spnl = skipMany (spaceChar <|> newline)
+ mkContent c ([], kvs) = (c, kvs)
+ mkContent c (rest, kvs) = (c ++ " " ++ rest, kvs)
+ mkKvs kv (cont, rest) = (cont, (kv : rest))
+
+attribute :: TWParser (Either String (String, String))
+attribute = withKey <|> withoutKey
+ where
+ withKey = try $ do
+ key <- macroName
+ char '='
+ parseValue False >>= return . (curry Right key)
+ withoutKey = try $ parseValue True >>= return . Left
+ parseValue allowSpaces = (withQuotes <|> withoutQuotes allowSpaces) >>= return . fromEntities
+ withQuotes = between (char '"') (char '"') (\_ -> count 1 $ noneOf ['"'])
+ withoutQuotes allowSpaces
+ | allowSpaces == True = many1 $ noneOf "}"
+ | otherwise = many1 $ noneOf " }"
+
+nestedInlines :: Show a => TWParser a -> TWParser B.Inlines
+nestedInlines end = innerSpace <|> nestedInline
+ where
+ innerSpace = try $ whitespace <* (notFollowedBy end)
+ nestedInline = notFollowedBy whitespace >> nested inline
+
+strong :: TWParser B.Inlines
+strong = try $ enclosed (char '*') nestedInlines >>= return . B.strong
+
+strongHtml :: TWParser B.Inlines
+strongHtml = (parseHtmlContent "strong" inline <|> parseHtmlContent "b" inline)
+ >>= return . B.strong . mconcat
+
+strongAndEmph :: TWParser B.Inlines
+strongAndEmph = try $ enclosed (string "__") nestedInlines >>= return . B.emph . B.strong
+
+emph :: TWParser B.Inlines
+emph = try $ enclosed (char '_') nestedInlines >>= return . B.emph
+
+emphHtml :: TWParser B.Inlines
+emphHtml = (parseHtmlContent "em" inline <|> parseHtmlContent "i" inline)
+ >>= return . B.emph . mconcat
+
+nestedString :: Show a => TWParser a -> TWParser String
+nestedString end = innerSpace <|> (count 1 nonspaceChar)
+ where
+ innerSpace = try $ many1 spaceChar <* notFollowedBy end
+
+boldCode :: TWParser B.Inlines
+boldCode = try $ enclosed (string "==") nestedString >>= return . B.strong . B.code . fromEntities
+
+htmlComment :: TWParser B.Inlines
+htmlComment = htmlTag isCommentTag >> return mempty
+
+code :: TWParser B.Inlines
+code = try $ enclosed (char '=') nestedString >>= return . B.code . fromEntities
+
+codeHtml :: TWParser B.Inlines
+codeHtml = do
+ (attrs, content) <- parseHtmlContentWithAttrs "code" anyChar
+ return $ B.codeWith attrs $ fromEntities content
+
+autoLink :: TWParser B.Inlines
+autoLink = try $ do
+ state <- getState
+ guard $ stateAllowLinks state
+ (text, url) <- parseLink
+ guard $ checkLink (head $ reverse url)
+ return $ makeLink (text, url)
+ where
+ parseLink = notFollowedBy nop >> (uri <|> emailAddress)
+ makeLink (text, url) = B.link url "" $ B.str text
+ checkLink c
+ | c == '/' = True
+ | otherwise = isAlphaNum c
+
+str :: TWParser B.Inlines
+str = (many1 alphaNum <|> count 1 characterReference) >>= return . B.str
+
+nop :: TWParser B.Inlines
+nop = try $ (skip exclamation <|> skip nopTag) >> followContent
+ where
+ exclamation = char '!'
+ nopTag = stringAnyCase "<nop>"
+ followContent = many1 nonspaceChar >>= return . B.str . fromEntities
+
+symbol :: TWParser B.Inlines
+symbol = count 1 nonspaceChar >>= return . B.str
+
+smart :: TWParser B.Inlines
+smart = do
+ getOption readerSmart >>= guard
+ doubleQuoted <|> singleQuoted <|>
+ choice [ apostrophe
+ , dash
+ , ellipses
+ ]
+
+singleQuoted :: TWParser B.Inlines
+singleQuoted = try $ do
+ singleQuoteStart
+ withQuoteContext InSingleQuote $
+ many1Till inline singleQuoteEnd >>=
+ (return . B.singleQuoted . B.trimInlines . mconcat)
+
+doubleQuoted :: TWParser B.Inlines
+doubleQuoted = try $ do
+ doubleQuoteStart
+ contents <- mconcat <$> many (try $ notFollowedBy doubleQuoteEnd >> inline)
+ (withQuoteContext InDoubleQuote $ doubleQuoteEnd >>
+ return (B.doubleQuoted $ B.trimInlines contents))
+ <|> (return $ (B.str "\8220") B.<> contents)
+
+link :: TWParser B.Inlines
+link = try $ do
+ st <- getState
+ guard $ stateAllowLinks st
+ setState $ st{ stateAllowLinks = False }
+ (url, title, content) <- linkText
+ setState $ st{ stateAllowLinks = True }
+ return $ B.link url title content
+
+linkText :: TWParser (String, String, B.Inlines)
+linkText = do
+ string "[["
+ url <- many1Till anyChar (char ']')
+ content <- option [B.str url] linkContent
+ char ']'
+ return (url, "", mconcat content)
+ where
+ linkContent = (char '[') >> many1Till anyChar (char ']') >>= parseLinkContent
+ parseLinkContent = parseFromString $ many1 inline
diff --git a/src/Text/Pandoc/SelfContained.hs b/src/Text/Pandoc/SelfContained.hs
index 36839ddd0..5b8f7a75a 100644
--- a/src/Text/Pandoc/SelfContained.hs
+++ b/src/Text/Pandoc/SelfContained.hs
@@ -51,7 +51,8 @@ isOk c = isAscii c && isAlphaNum c
convertTag :: MediaBag -> Maybe String -> Tag String -> IO (Tag String)
convertTag media sourceURL t@(TagOpen tagname as)
- | tagname `elem` ["img", "embed", "video", "input", "audio", "source"] = do
+ | tagname `elem`
+ ["img", "embed", "video", "input", "audio", "source", "track"] = do
as' <- mapM processAttribute as
return $ TagOpen tagname as'
where processAttribute (x,y) =
diff --git a/src/Text/Pandoc/Shared.hs b/src/Text/Pandoc/Shared.hs
index 2d7c08718..9aa70e6f2 100644
--- a/src/Text/Pandoc/Shared.hs
+++ b/src/Text/Pandoc/Shared.hs
@@ -1,5 +1,6 @@
{-# LANGUAGE DeriveDataTypeable, CPP, MultiParamTypeClasses,
- FlexibleContexts, ScopedTypeVariables, PatternGuards #-}
+ FlexibleContexts, ScopedTypeVariables, PatternGuards,
+ ViewPatterns #-}
{-
Copyright (C) 2006-2014 John MacFarlane <jgm@berkeley.edu>
@@ -106,7 +107,7 @@ import Network.URI ( escapeURIString, isURI, nonStrictRelativeTo,
unEscapeString, parseURIReference, isAllowedInURI )
import qualified Data.Set as Set
import System.Directory
-import System.FilePath (joinPath, splitDirectories)
+import System.FilePath (joinPath, splitDirectories, pathSeparator, isPathSeparator)
import Text.Pandoc.MIME (MimeType, getMimeType)
import System.FilePath ( (</>), takeExtension, dropExtension)
import Data.Generics (Typeable, Data)
@@ -734,12 +735,10 @@ renderTags' = renderTagsOptions
-- | Perform an IO action in a directory, returning to starting directory.
inDirectory :: FilePath -> IO a -> IO a
-inDirectory path action = do
- oldDir <- getCurrentDirectory
- setCurrentDirectory path
- result <- action
- setCurrentDirectory oldDir
- return result
+inDirectory path action = E.bracket
+ getCurrentDirectory
+ setCurrentDirectory
+ (const $ setCurrentDirectory path >> action)
readDefaultDataFile :: FilePath -> IO BS.ByteString
readDefaultDataFile fname =
@@ -871,11 +870,14 @@ collapseFilePath = joinPath . reverse . foldl go [] . splitDirectories
go rs "." = rs
go r@(p:rs) ".." = case p of
".." -> ("..":r)
- "/" -> ("..":r)
+ (checkPathSeperator -> Just True) -> ("..":r)
_ -> rs
- go _ "/" = ["/"]
+ go _ (checkPathSeperator -> Just True) = [[pathSeparator]]
go rs x = x:rs
-
+ isSingleton [] = Nothing
+ isSingleton [x] = Just x
+ isSingleton _ = Nothing
+ checkPathSeperator = fmap isPathSeparator . isSingleton
--
-- Safe read
diff --git a/src/Text/Pandoc/Writers/ConTeXt.hs b/src/Text/Pandoc/Writers/ConTeXt.hs
index bbca7f858..ebdc4a3d3 100644
--- a/src/Text/Pandoc/Writers/ConTeXt.hs
+++ b/src/Text/Pandoc/Writers/ConTeXt.hs
@@ -36,6 +36,7 @@ import Text.Pandoc.Options
import Text.Pandoc.Walk (query)
import Text.Printf ( printf )
import Data.List ( intercalate )
+import Data.Char ( ord )
import Control.Monad.State
import Text.Pandoc.Pretty
import Text.Pandoc.Templates ( renderTemplate' )
@@ -114,6 +115,13 @@ escapeCharForConTeXt opts ch =
stringToConTeXt :: WriterOptions -> String -> String
stringToConTeXt opts = concatMap (escapeCharForConTeXt opts)
+-- | Sanitize labels
+toLabel :: String -> String
+toLabel z = concatMap go z
+ where go x
+ | elem x "\\#[]\",{}%()|=" = "ux" ++ printf "%x" (ord x)
+ | otherwise = [x]
+
-- | Convert Elements to ConTeXt
elementToConTeXt :: WriterOptions -> Element -> State WriterState Doc
elementToConTeXt _ (Blk block) = blockToConTeXt block
@@ -286,15 +294,16 @@ inlineToConTeXt Space = return space
-- Handle HTML-like internal document references to sections
inlineToConTeXt (Link txt (('#' : ref), _)) = do
opts <- gets stOptions
- label <- inlineListToConTeXt txt
+ contents <- inlineListToConTeXt txt
+ let ref' = toLabel $ stringToConTeXt opts ref
return $ text "\\in"
<> braces (if writerNumberSections opts
- then label <+> text "(\\S"
- else label) -- prefix
+ then contents <+> text "(\\S"
+ else contents) -- prefix
<> braces (if writerNumberSections opts
then text ")"
else empty) -- suffix
- <> brackets (text ref)
+ <> brackets (text ref')
inlineToConTeXt (Link txt (src, _)) = do
let isAutolink = txt == [Str (unEscapeString src)]
@@ -302,13 +311,13 @@ inlineToConTeXt (Link txt (src, _)) = do
let next = stNextRef st
put $ st {stNextRef = next + 1}
let ref = "url" ++ show next
- label <- inlineListToConTeXt txt
+ contents <- inlineListToConTeXt txt
return $ "\\useURL"
<> brackets (text ref)
<> brackets (text $ escapeStringUsing [('#',"\\#"),('%',"\\%")] src)
<> (if isAutolink
then empty
- else brackets empty <> brackets label)
+ else brackets empty <> brackets contents)
<> "\\from"
<> brackets (text ref)
inlineToConTeXt (Image _ (src, _)) = do
@@ -337,6 +346,7 @@ sectionHeader (ident,classes,_) hdrLevel lst = do
st <- get
let opts = stOptions st
let level' = if writerChapters opts then hdrLevel - 1 else hdrLevel
+ let ident' = toLabel ident
let (section, chapter) = if "unnumbered" `elem` classes
then (text "subject", text "title")
else (text "section", text "chapter")
@@ -344,7 +354,7 @@ sectionHeader (ident,classes,_) hdrLevel lst = do
then char '\\'
<> text (concat (replicate (level' - 1) "sub"))
<> section
- <> (if (not . null) ident then brackets (text ident) else empty)
+ <> (if (not . null) ident' then brackets (text ident') else empty)
<> braces contents
<> blankline
else if level' == 0
diff --git a/src/Text/Pandoc/Writers/Docx.hs b/src/Text/Pandoc/Writers/Docx.hs
index 5320a2816..f8e8cc34d 100644
--- a/src/Text/Pandoc/Writers/Docx.hs
+++ b/src/Text/Pandoc/Writers/Docx.hs
@@ -29,7 +29,7 @@ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Conversion of 'Pandoc' documents to docx.
-}
module Text.Pandoc.Writers.Docx ( writeDocx ) where
-import Data.List ( intercalate, isPrefixOf, isSuffixOf )
+import Data.List ( intercalate, isPrefixOf, isSuffixOf, stripPrefix )
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BL8
@@ -62,8 +62,9 @@ import Text.Printf (printf)
import qualified Control.Exception as E
import Text.Pandoc.MIME (MimeType, getMimeType, getMimeTypeDef,
extensionFromMimeType)
-import Control.Applicative ((<$>), (<|>))
+import Control.Applicative ((<$>), (<|>), (<*>))
import Data.Maybe (fromMaybe, mapMaybe)
+import Data.Char (isDigit)
data ListMarker = NoMarker
| BulletMarker
@@ -104,6 +105,8 @@ data WriterState = WriterState{
, stInDel :: Bool
, stChangesAuthor :: String
, stChangesDate :: String
+ , stPrintWidth :: Integer
+ , stHeadingStyles :: [(Int,String)]
}
defaultWriterState :: WriterState
@@ -122,6 +125,8 @@ defaultWriterState = WriterState{
, stInDel = False
, stChangesAuthor = "unknown"
, stChangesDate = "1969-12-31T19:00:00Z"
+ , stPrintWidth = 1
+ , stHeadingStyles = []
}
type WS a = StateT WriterState IO a
@@ -181,11 +186,59 @@ writeDocx opts doc@(Pandoc meta _) = do
case writerReferenceDocx opts of
Just f -> B.readFile f
Nothing -> readDataFile datadir "reference.docx"
- distArchive <- liftM (toArchive . toLazy) $ readDataFile Nothing "reference.docx"
+ distArchive <- liftM (toArchive . toLazy) $ readDataFile datadir "reference.docx"
+
+ parsedDoc <- parseXml refArchive distArchive "word/document.xml"
+ let wname f qn = qPrefix qn == Just "w" && f (qName qn)
+ let mbsectpr = filterElementName (wname (=="sectPr")) parsedDoc
+
+ -- Gets the template size
+ let mbpgsz = mbsectpr >>= (filterElementName (wname (=="pgSz")))
+ let mbAttrSzWidth = (elAttribs <$> mbpgsz) >>= (lookupAttrBy ((=="w") . qName))
+
+ let mbpgmar = mbsectpr >>= (filterElementName (wname (=="pgMar")))
+ let mbAttrMarLeft = (elAttribs <$> mbpgmar) >>= (lookupAttrBy ((=="left") . qName))
+ let mbAttrMarRight = (elAttribs <$> mbpgmar) >>= (lookupAttrBy ((=="right") . qName))
+
+ -- Get the avaible area (converting the size and the margins to int and
+ -- doing the difference
+ let pgContentWidth = (-) <$> (read <$> mbAttrSzWidth ::Maybe Integer)
+ <*> (
+ (+) <$> (read <$> mbAttrMarRight ::Maybe Integer)
+ <*> (read <$> mbAttrMarLeft ::Maybe Integer)
+ )
+
+ -- styles
+ let stylepath = "word/styles.xml"
+ styledoc <- parseXml refArchive distArchive stylepath
+
+ -- parse styledoc for heading styles
+ let styleNamespaces = map ((,) <$> qName . attrKey <*> attrVal) .
+ filter ((==Just "xmlns") . qPrefix . attrKey) .
+ elAttribs $ styledoc
+ let headingStyles =
+ let
+ mywURI = lookup "w" styleNamespaces
+ myName name = QName name mywURI (Just "w")
+ getAttrStyleId = findAttr (myName "styleId")
+ getNameVal = findChild (myName "name") >=> findAttr (myName "val")
+ getNum s | not $ null s, all isDigit s = Just (read s :: Int)
+ | otherwise = Nothing
+ getEngHeader = getAttrStyleId >=> stripPrefix "Heading" >=> getNum
+ getIntHeader = getNameVal >=> stripPrefix "heading " >=> getNum
+ toTuple getF = liftM2 (,) <$> getF <*> getAttrStyleId
+ toMap getF = mapMaybe (toTuple getF) $
+ findChildren (myName "style") styledoc
+ select a b | not $ null a = a
+ | otherwise = b
+ in
+ select (toMap getEngHeader) (toMap getIntHeader)
((contents, footnotes), st) <- runStateT (writeOpenXML opts{writerWrapText = False} doc')
defaultWriterState{ stChangesAuthor = fromMaybe "unknown" username
- , stChangesDate = formatTime defaultTimeLocale "%FT%XZ" utctime}
+ , stChangesDate = formatTime defaultTimeLocale "%FT%XZ" utctime
+ , stPrintWidth = (maybe 420 (\x -> quot x 20) pgContentWidth)
+ , stHeadingStyles = headingStyles}
let epochtime = floor $ utcTimeToPOSIXSeconds utctime
let imgs = M.elems $ stImages st
@@ -193,9 +246,6 @@ writeDocx opts doc@(Pandoc meta _) = do
let toImageEntry (_,path,_,_,img) = toEntry ("word/" ++ path) epochtime $ toLazy img
let imageEntries = map toImageEntry imgs
-
-
-
let stdAttributes =
[("xmlns:w","http://schemas.openxmlformats.org/wordprocessingml/2006/main")
,("xmlns:m","http://schemas.openxmlformats.org/officeDocument/2006/math")
@@ -310,10 +360,7 @@ writeDocx opts doc@(Pandoc meta _) = do
$ renderXml reldoc
- -- adjust contents to add sectPr from reference.docx
- parsedDoc <- parseXml refArchive distArchive "word/document.xml"
- let wname f qn = qPrefix qn == Just "w" && f (qName qn)
- let mbsectpr = filterElementName (wname (=="sectPr")) parsedDoc
+ -- adjust contents to add sectPr from reference.docx
let sectpr = case mbsectpr of
Just sectpr' -> let cs = renumIds
(\q -> qName q == "id" && qPrefix q == Just "r")
@@ -323,8 +370,6 @@ writeDocx opts doc@(Pandoc meta _) = do
add_attrs (elAttribs sectpr') $ mknode "w:sectPr" [] cs
Nothing -> (mknode "w:sectPr" [] ())
-
-
-- let sectpr = fromMaybe (mknode "w:sectPr" [] ()) mbsectpr'
let contents' = contents ++ [sectpr]
let docContents = mknode "w:document" stdAttributes
@@ -347,8 +392,6 @@ writeDocx opts doc@(Pandoc meta _) = do
-- styles
let newstyles = styleToOpenXml $ writerHighlightStyle opts
- let stylepath = "word/styles.xml"
- styledoc <- parseXml refArchive distArchive stylepath
let styledoc' = styledoc{ elContent = elContent styledoc ++
[Elem x | x <- newstyles, writerHighlight opts] }
let styleEntry = toEntry stylepath epochtime $ renderXml styledoc'
@@ -600,8 +643,9 @@ blockToOpenXML opts (Div (_,["references"],_) bs) = do
return (header ++ rest)
blockToOpenXML opts (Div _ bs) = blocksToOpenXML opts bs
blockToOpenXML opts (Header lev (ident,_,_) lst) = do
- paraProps <- withParaProp (pStyle $ "Heading" ++ show lev) $
- getParaProps False
+ headingStyles <- gets stHeadingStyles
+ paraProps <- maybe id (withParaProp . pStyle) (lookup lev headingStyles) $
+ getParaProps False
contents <- inlinesToOpenXML opts lst
usedIdents <- gets stSectionIds
let bookmarkName = if null ident
@@ -927,6 +971,7 @@ inlineToOpenXML opts (Link txt (src,_)) = do
return [ mknode "w:hyperlink" [("r:id",id')] contents ]
inlineToOpenXML opts (Image alt (src, tit)) = do
-- first, check to see if we've already done this image
+ pageWidth <- gets stPrintWidth
imgs <- gets stImages
case M.lookup src imgs of
Just (_,_,_,elt,_) -> return [elt]
@@ -943,7 +988,7 @@ inlineToOpenXML opts (Image alt (src, tit)) = do
let size = imageSize img
let (xpt,ypt) = maybe (120,120) sizeInPoints size
-- 12700 emu = 1 pt
- let (xemu,yemu) = fitToPage (xpt * 12700, ypt * 12700)
+ let (xemu,yemu) = fitToPage (xpt * 12700, ypt * 12700) (pageWidth * 12700)
let cNvPicPr = mknode "pic:cNvPicPr" [] $
mknode "a:picLocks" [("noChangeArrowheads","1"),("noChangeAspect","1")] ()
let nvPicPr = mknode "pic:nvPicPr" []
@@ -1010,9 +1055,11 @@ parseXml refArchive distArchive relpath =
Nothing -> fail $ relpath ++ " corrupt or missing in reference docx"
-- | Scales the image to fit the page
-fitToPage :: (Integer, Integer) -> (Integer, Integer)
-fitToPage (x, y)
- --5440680 is the emu width size of a letter page in portrait, minus the margins
- | x > 5440680 =
- (5440680, round $ (5440680 / ((fromIntegral :: Integer -> Double) x)) * (fromIntegral y))
+-- sizes are passed in emu
+fitToPage :: (Integer, Integer) -> Integer -> (Integer, Integer)
+fitToPage (x, y) pageWidth
+ -- Fixes width to the page width and scales the height
+ | x > pageWidth =
+ (pageWidth, round $
+ ((fromIntegral pageWidth) / ((fromIntegral :: Integer -> Double) x)) * (fromIntegral y))
| otherwise = (x, y)
diff --git a/src/Text/Pandoc/Writers/DokuWiki.hs b/src/Text/Pandoc/Writers/DokuWiki.hs
index 8c1d360aa..74418aa7e 100644
--- a/src/Text/Pandoc/Writers/DokuWiki.hs
+++ b/src/Text/Pandoc/Writers/DokuWiki.hs
@@ -178,7 +178,7 @@ blockToDokuWiki _ (CodeBlock (_,classes,_) str) = do
blockToDokuWiki opts (BlockQuote blocks) = do
contents <- blockListToDokuWiki opts blocks
if isSimpleBlockQuote blocks
- then return $ "> " ++ contents
+ then return $ unlines $ map ("> " ++) $ lines contents
else return $ "<HTML><blockquote>\n" ++ contents ++ "</blockquote></HTML>"
blockToDokuWiki opts (Table capt aligns _ headers rows) = do
@@ -352,9 +352,7 @@ isPlainOrPara (Para _) = True
isPlainOrPara _ = False
isSimpleBlockQuote :: [Block] -> Bool
-isSimpleBlockQuote [BlockQuote bs] = isSimpleBlockQuote bs
-isSimpleBlockQuote [b] = isPlainOrPara b
-isSimpleBlockQuote _ = False
+isSimpleBlockQuote bs = all isPlainOrPara bs
-- | Concatenates strings with line breaks between them.
vcat :: [String] -> String
diff --git a/src/Text/Pandoc/Writers/EPUB.hs b/src/Text/Pandoc/Writers/EPUB.hs
index 32256cb42..2291c7184 100644
--- a/src/Text/Pandoc/Writers/EPUB.hs
+++ b/src/Text/Pandoc/Writers/EPUB.hs
@@ -35,7 +35,7 @@ import Data.Maybe ( fromMaybe )
import Data.List ( isPrefixOf, isInfixOf, intercalate )
import System.Environment ( getEnv )
import Text.Printf (printf)
-import System.FilePath ( (</>), takeExtension, takeFileName )
+import System.FilePath ( takeExtension, takeFileName )
import qualified Data.ByteString.Lazy as B
import qualified Data.ByteString.Lazy.Char8 as B8
import qualified Text.Pandoc.UTF8 as UTF8
@@ -64,7 +64,6 @@ import Text.XML.Light ( unode, Element(..), unqual, Attr(..), add_attrs
import Text.Pandoc.UUID (getRandomUUID)
import Text.Pandoc.Writers.HTML (writeHtmlString, writeHtml)
import Data.Char ( toLower, isDigit, isAlphaNum )
-import Network.URI ( unEscapeString )
import Text.Pandoc.MIME (MimeType, getMimeType)
import qualified Control.Exception as E
import Text.Blaze.Html.Renderer.Utf8 (renderHtml)
@@ -359,8 +358,9 @@ writeEPUB opts doc@(Pandoc meta _) = do
Nothing -> return ([],[])
Just img -> do
let coverImage = "media/" ++ takeFileName img
- let cpContent = renderHtml $ writeHtml opts'
- (Pandoc meta [RawBlock (Format "html") $ "<div id=\"cover-image\">\n<img src=\"" ++ coverImage ++ "\" alt=\"cover image\" />\n</div>"])
+ let cpContent = renderHtml $ writeHtml
+ opts'{ writerVariables = ("coverpage","true"):vars }
+ (Pandoc meta [RawBlock (Format "html") $ "<div id=\"cover-image\">\n<img src=\"" ++ coverImage ++ "\" alt=\"cover image\" />\n</div>"])
imgContent <- B.readFile img
return ( [mkEntry "cover.xhtml" cpContent]
, [mkEntry coverImage imgContent] )
@@ -612,17 +612,14 @@ writeEPUB opts doc@(Pandoc meta _) = do
(_:_) -> [unode "ol" ! [("class","toc")] $ subs]
let navtag = if epub3 then "nav" else "div"
- let navData = UTF8.fromStringLazy $ ppTopElement $
- unode "html" ! [("xmlns","http://www.w3.org/1999/xhtml")
- ,("xmlns:epub","http://www.idpf.org/2007/ops")] $
- [ unode "head" $
- [ unode "title" plainTitle
- , unode "link" ! [("rel","stylesheet"),("type","text/css"),("href","stylesheet.css")] $ () ]
- , unode "body" $
- unode navtag ! [("epub:type","toc") | epub3] $
- [ unode "h1" ! [("id","toc-title")] $ plainTitle
- , unode "ol" ! [("class","toc")] $ evalState (mapM (navPointNode navXhtmlFormatter) secs) 1]
- ]
+ let navBlocks = [RawBlock (Format "html") $ ppElement $
+ unode navtag ! [("epub:type","toc") | epub3] $
+ [ unode "h1" ! [("id","toc-title")] $ plainTitle
+ , unode "ol" ! [("class","toc")] $ evalState (mapM (navPointNode navXhtmlFormatter) secs) 1]]
+ let navData = renderHtml $ writeHtml opts'
+ (Pandoc (setMeta "title"
+ (walk removeNote $ fromList $ docTitle' meta) nullMeta)
+ navBlocks)
let navEntry = mkEntry "nav.xhtml" navData
-- mimetype
@@ -765,23 +762,20 @@ metadataElement version md currentTime =
showDateTimeISO8601 :: UTCTime -> String
showDateTimeISO8601 = formatTime defaultTimeLocale "%FT%TZ"
-transformTag :: WriterOptions
- -> IORef [(FilePath, FilePath)] -- ^ (oldpath, newpath) media
+transformTag :: IORef [(FilePath, FilePath)] -- ^ (oldpath, newpath) media
-> Tag String
-> IO (Tag String)
-transformTag opts mediaRef tag@(TagOpen name attr)
+transformTag mediaRef tag@(TagOpen name attr)
| name `elem` ["video", "source", "img", "audio"] = do
let src = fromAttrib "src" tag
let poster = fromAttrib "poster" tag
- let oldsrc = maybe src (</> src) $ writerSourceURL opts
- let oldposter = maybe poster (</> poster) $ writerSourceURL opts
- newsrc <- modifyMediaRef mediaRef oldsrc
- newposter <- modifyMediaRef mediaRef oldposter
+ newsrc <- modifyMediaRef mediaRef src
+ newposter <- modifyMediaRef mediaRef poster
let attr' = filter (\(x,_) -> x /= "src" && x /= "poster") attr ++
[("src", newsrc) | not (null newsrc)] ++
[("poster", newposter) | not (null newposter)]
return $ TagOpen name attr'
-transformTag _ _ tag = return tag
+transformTag _ tag = return tag
modifyMediaRef :: IORef [(FilePath, FilePath)] -> FilePath -> IO FilePath
modifyMediaRef _ "" = return ""
@@ -791,7 +785,7 @@ modifyMediaRef mediaRef oldsrc = do
Just n -> return n
Nothing -> do
let new = "media/file" ++ show (length media) ++
- takeExtension oldsrc
+ takeExtension (takeWhile (/='?') oldsrc) -- remove query
modifyIORef mediaRef ( (oldsrc, new): )
return new
@@ -799,10 +793,10 @@ transformBlock :: WriterOptions
-> IORef [(FilePath, FilePath)] -- ^ (oldpath, newpath) media
-> Block
-> IO Block
-transformBlock opts mediaRef (RawBlock fmt raw)
+transformBlock _ mediaRef (RawBlock fmt raw)
| fmt == Format "html" = do
let tags = parseTags raw
- tags' <- mapM (transformTag opts mediaRef) tags
+ tags' <- mapM (transformTag mediaRef) tags
return $ RawBlock fmt (renderTags' tags')
transformBlock _ _ b = return b
@@ -810,19 +804,17 @@ transformInline :: WriterOptions
-> IORef [(FilePath, FilePath)] -- ^ (oldpath, newpath) media
-> Inline
-> IO Inline
-transformInline opts mediaRef (Image lab (src,tit)) = do
- let src' = unEscapeString src
- let oldsrc = maybe src' (</> src) $ writerSourceURL opts
- newsrc <- modifyMediaRef mediaRef oldsrc
+transformInline _ mediaRef (Image lab (src,tit)) = do
+ newsrc <- modifyMediaRef mediaRef src
return $ Image lab (newsrc, tit)
transformInline opts _ (x@(Math _ _))
| WebTeX _ <- writerHTMLMathMethod opts = do
raw <- makeSelfContained opts $ writeHtmlInline opts x
return $ RawInline (Format "html") raw
-transformInline opts mediaRef (RawInline fmt raw)
+transformInline _ mediaRef (RawInline fmt raw)
| fmt == Format "html" = do
let tags = parseTags raw
- tags' <- mapM (transformTag opts mediaRef) tags
+ tags' <- mapM (transformTag mediaRef) tags
return $ RawInline fmt (renderTags' tags')
transformInline _ _ x = return x
@@ -1204,3 +1196,4 @@ docTitle' meta = fromMaybe [] $ go <$> lookupMeta "title" meta
_ -> []
go (MetaList xs) = concatMap go xs
go _ = []
+
diff --git a/src/Text/Pandoc/Writers/HTML.hs b/src/Text/Pandoc/Writers/HTML.hs
index 9ead604d7..e261cfca8 100644
--- a/src/Text/Pandoc/Writers/HTML.hs
+++ b/src/Text/Pandoc/Writers/HTML.hs
@@ -1,4 +1,4 @@
-{-# LANGUAGE OverloadedStrings, CPP #-}
+{-# LANGUAGE OverloadedStrings, CPP, ViewPatterns #-}
{-# OPTIONS_GHC -fno-warn-deprecations #-}
{-
Copyright (C) 2006-2014 John MacFarlane <jgm@berkeley.edu>
@@ -60,6 +60,8 @@ import qualified Text.Blaze.XHtml1.Transitional.Attributes as A
import Text.Blaze.Renderer.String (renderHtml)
import Text.TeXMath
import Text.XML.Light.Output
+import Text.XML.Light (unode, elChildren, add_attr, unqual)
+import qualified Text.XML.Light as XML
import System.FilePath (takeExtension)
import Data.Monoid
import Data.Aeson (Value)
@@ -71,11 +73,13 @@ data WriterState = WriterState
, stQuotes :: Bool -- ^ <q> tag is used
, stHighlighting :: Bool -- ^ Syntax highlighting is used
, stSecNum :: [Int] -- ^ Number of current section
+ , stElement :: Bool -- ^ Processing an Element
}
defaultWriterState :: WriterState
defaultWriterState = WriterState {stNotes= [], stMath = False, stQuotes = False,
- stHighlighting = False, stSecNum = []}
+ stHighlighting = False, stSecNum = [],
+ stElement = False}
-- Helpers to render HTML with the appropriate function.
@@ -155,6 +159,10 @@ pandocToHtml opts (Pandoc meta blocks) = do
H.script ! A.src (toValue url)
! A.type_ "text/javascript"
$ mempty
+ KaTeX js css ->
+ (H.script ! A.src (toValue js) $ mempty) <>
+ (H.link ! A.rel "stylesheet" ! A.href (toValue css)) <>
+ (H.script ! A.type_ "text/javascript" $ toHtml renderKaTeX)
_ -> case lookup "mathml-script" (writerVariables opts) of
Just s | not (writerHtml5 opts) ->
H.script ! A.type_ "text/javascript"
@@ -274,7 +282,13 @@ elementToHtml slideLevel opts (Sec level num (id',classes,keyvals) title' elemen
let titleSlide = slide && level < slideLevel
header' <- if title' == [Str "\0"] -- marker for hrule
then return mempty
- else blockToHtml opts (Header level' (id',classes,keyvals) title')
+ else do
+ modify (\st -> st{ stElement = True})
+ res <- blockToHtml opts
+ (Header level' (id',classes,keyvals) title')
+ modify (\st -> st{ stElement = False})
+ return res
+
let isSec (Sec _ _ _ _ _) = True
isSec (Blk _) = False
let isPause (Blk x) = x == Para [Str ".",Space,Str ".",Space,Str "."]
@@ -342,10 +356,10 @@ parseMailto s = do
_ -> fail "not a mailto: URL"
-- | Obfuscate a "mailto:" link.
-obfuscateLink :: WriterOptions -> String -> String -> Html
+obfuscateLink :: WriterOptions -> Html -> String -> Html
obfuscateLink opts txt s | writerEmailObfuscation opts == NoObfuscation =
- H.a ! A.href (toValue s) $ toHtml txt
-obfuscateLink opts txt s =
+ H.a ! A.href (toValue s) $ txt
+obfuscateLink opts (renderHtml -> txt) s =
let meth = writerEmailObfuscation opts
s' = map toLower (take 7 s) ++ drop 7 s
in case parseMailto s' of
@@ -485,7 +499,7 @@ blockToHtml opts (BlockQuote blocks) =
else do
contents <- blockListToHtml opts blocks
return $ H.blockquote $ nl opts >> contents >> nl opts
-blockToHtml opts (Header level (_,classes,_) lst) = do
+blockToHtml opts (Header level attr@(_,classes,_) lst) = do
contents <- inlineListToHtml opts lst
secnum <- liftM stSecNum get
let contents' = if writerNumberSections opts && not (null secnum)
@@ -493,7 +507,9 @@ blockToHtml opts (Header level (_,classes,_) lst) = do
then (H.span ! A.class_ "header-section-number" $ toHtml
$ showSecNum secnum) >> strToHtml " " >> contents
else contents
- return $ case level of
+ inElement <- gets stElement
+ return $ (if inElement then id else addAttrs opts attr)
+ $ case level of
1 -> H.h1 contents'
2 -> H.h2 contents'
3 -> H.h3 contents'
@@ -615,6 +631,18 @@ inlineListToHtml :: WriterOptions -> [Inline] -> State WriterState Html
inlineListToHtml opts lst =
mapM (inlineToHtml opts) lst >>= return . mconcat
+-- | Annotates a MathML expression with the tex source
+annotateMML :: XML.Element -> String -> XML.Element
+annotateMML e tex = math (unode "semantics" [cs, unode "annotation" (annotAttrs, tex)])
+ where
+ cs = case elChildren e of
+ [] -> unode "mrow" ()
+ [x] -> x
+ xs -> unode "mrow" xs
+ math = add_attr (XML.Attr (unqual "xmlns") "http://www.w3.org/1998/Math/MathML") . unode "math"
+ annotAttrs = [XML.Attr (unqual "encoding") "application/x-tex"]
+
+
-- | Convert Pandoc inline element to HTML.
inlineToHtml :: WriterOptions -> Inline -> State WriterState Html
inlineToHtml opts inline =
@@ -706,7 +734,7 @@ inlineToHtml opts inline =
defaultConfigPP
case writeMathML dt <$> readTeX str of
Right r -> return $ preEscapedString $
- ppcElement conf r
+ ppcElement conf (annotateMML r str)
Left _ -> inlineListToHtml opts
(texMathToInlines t str) >>=
return . (H.span ! A.class_ "math")
@@ -714,6 +742,10 @@ inlineToHtml opts inline =
case t of
InlineMath -> "\\(" ++ str ++ "\\)"
DisplayMath -> "\\[" ++ str ++ "\\]"
+ KaTeX _ _ -> return $ H.span ! A.class_ "math" $
+ toHtml (case t of
+ InlineMath -> str
+ DisplayMath -> "\\displaystyle " ++ str)
PlainMath -> do
x <- inlineListToHtml opts (texMathToInlines t str)
let m = H.span ! A.class_ "math" $ x
@@ -731,7 +763,7 @@ inlineToHtml opts inline =
| otherwise -> return mempty
(Link txt (s,_)) | "mailto:" `isPrefixOf` s -> do
linkText <- inlineListToHtml opts txt
- return $ obfuscateLink opts (renderHtml linkText) s
+ return $ obfuscateLink opts linkText s
(Link txt (s,tit)) -> do
linkText <- inlineListToHtml opts txt
let s' = case s of
@@ -815,3 +847,14 @@ blockListToNote opts ref blocks =
Just EPUB3 -> noteItem ! customAttribute "epub:type" "footnote"
_ -> noteItem
return $ nl opts >> noteItem'
+
+-- Javascript snippet to render all KaTeX elements
+renderKaTeX :: String
+renderKaTeX = unlines [
+ "window.onload = function(){var mathElements = document.getElementsByClassName(\"math\");"
+ , "for (var i=0; i < mathElements.length; i++)"
+ , "{"
+ , " var texText = mathElements[i].firstChild"
+ , " katex.render(texText.data, mathElements[i])"
+ , "}}"
+ ]
diff --git a/src/Text/Pandoc/Writers/ICML.hs b/src/Text/Pandoc/Writers/ICML.hs
index ae20efd4b..181c63df7 100644
--- a/src/Text/Pandoc/Writers/ICML.hs
+++ b/src/Text/Pandoc/Writers/ICML.hs
@@ -399,7 +399,7 @@ inlineToICML opts style (Subscript lst) = inlinesToICML opts (subscriptName:styl
inlineToICML opts style (SmallCaps lst) = inlinesToICML opts (smallCapsName:style) lst
inlineToICML opts style (Quoted SingleQuote lst) = inlinesToICML opts style $ [Str "‘"] ++ lst ++ [Str "’"]
inlineToICML opts style (Quoted DoubleQuote lst) = inlinesToICML opts style $ [Str "“"] ++ lst ++ [Str "”"]
-inlineToICML opts style (Cite _ lst) = footnoteToICML opts style [Para lst]
+inlineToICML opts style (Cite _ lst) = inlinesToICML opts style lst
inlineToICML _ style (Code _ str) = charStyle (codeName:style) $ text $ escapeStringForXML str
inlineToICML _ style Space = charStyle style space
inlineToICML _ style LineBreak = charStyle style $ text lineSeparator
diff --git a/src/Text/Pandoc/Writers/ODT.hs b/src/Text/Pandoc/Writers/ODT.hs
index 03f8e8ba4..2a4129512 100644
--- a/src/Text/Pandoc/Writers/ODT.hs
+++ b/src/Text/Pandoc/Writers/ODT.hs
@@ -41,7 +41,7 @@ import Control.Applicative ((<$>))
import Text.Pandoc.Options ( WriterOptions(..) )
import Text.Pandoc.Shared ( stringify, readDataFile, fetchItem', warn )
import Text.Pandoc.ImageSize ( imageSize, sizeInPoints )
-import Text.Pandoc.MIME ( getMimeType )
+import Text.Pandoc.MIME ( getMimeType, extensionFromMimeType )
import Text.Pandoc.Definition
import Text.Pandoc.Walk
import Text.Pandoc.Writers.Shared ( fixDisplayMath )
@@ -51,7 +51,7 @@ import Text.Pandoc.XML
import Text.Pandoc.Pretty
import qualified Control.Exception as E
import Data.Time.Clock.POSIX ( getPOSIXTime )
-import System.FilePath ( takeExtension, takeDirectory )
+import System.FilePath ( takeExtension, takeDirectory, (<.>))
-- | Produce an ODT file from a Pandoc document.
writeODT :: WriterOptions -- ^ Writer options
@@ -133,12 +133,14 @@ transformPicMath opts entriesRef (Image lab (src,_)) = do
Left (_ :: E.SomeException) -> do
warn $ "Could not find image `" ++ src ++ "', skipping..."
return $ Emph lab
- Right (img, _) -> do
+ Right (img, mbMimeType) -> do
let size = imageSize img
let (w,h) = fromMaybe (0,0) $ sizeInPoints `fmap` size
let tit' = show w ++ "x" ++ show h
entries <- readIORef entriesRef
- let newsrc = "Pictures/" ++ show (length entries) ++ takeExtension src
+ let extension = fromMaybe (takeExtension $ takeWhile (/='?') src)
+ (mbMimeType >>= extensionFromMimeType)
+ let newsrc = "Pictures/" ++ show (length entries) <.> extension
let toLazy = B.fromChunks . (:[])
epochtime <- floor `fmap` getPOSIXTime
let entry = toEntry newsrc epochtime $ toLazy img
diff --git a/src/Text/Pandoc/Writers/RST.hs b/src/Text/Pandoc/Writers/RST.hs
index 57ebfc360..5ba4c9983 100644
--- a/src/Text/Pandoc/Writers/RST.hs
+++ b/src/Text/Pandoc/Writers/RST.hs
@@ -173,11 +173,11 @@ blockToRST (Para [Image txt (src,'f':'i':'g':':':tit)]) = do
capt <- inlineListToRST txt
let fig = "figure:: " <> text src
let alt = ":alt: " <> if null tit then capt else text tit
- return $ hang 3 ".. " $ fig $$ alt $+$ capt $$ blankline
+ return $ hang 3 ".. " (fig $$ alt $+$ capt) $$ blankline
blockToRST (Para inlines)
| LineBreak `elem` inlines = do -- use line block if LineBreaks
lns <- mapM inlineListToRST $ splitBy (==LineBreak) inlines
- return $ (vcat $ map (text "| " <>) lns) <> blankline
+ return $ (vcat $ map (hang 2 (text "| ")) lns) <> blankline
| otherwise = do
contents <- inlineListToRST inlines
return $ contents <> blankline
@@ -239,8 +239,7 @@ blockToRST (Table caption _ widths headers rows) = do
middle = hcat $ intersperse sep' blocks
let makeRow = hpipeBlocks . zipWith lblock widthsInChars
let head' = makeRow headers'
- rows' <- mapM (\row -> do cols <- mapM blockListToRST row
- return $ makeRow cols) rows
+ let rows' = map makeRow rawRows
let border ch = char '+' <> char ch <>
(hcat $ intersperse (char ch <> char '+' <> char ch) $
map (\l -> text $ replicate l ch) widthsInChars) <>
@@ -253,7 +252,7 @@ blockToRST (Table caption _ widths headers rows) = do
blockToRST (BulletList items) = do
contents <- mapM bulletListItemToRST items
-- ensure that sublists have preceding blank line
- return $ blankline $$ vcat contents $$ blankline
+ return $ blankline $$ chomp (vcat contents) $$ blankline
blockToRST (OrderedList (start, style', delim) items) = do
let markers = if start == 1 && style' == DefaultStyle && delim == DefaultDelim
then take (length items) $ repeat "#."
@@ -265,11 +264,11 @@ blockToRST (OrderedList (start, style', delim) items) = do
contents <- mapM (\(item, num) -> orderedListItemToRST item num) $
zip markers' items
-- ensure that sublists have preceding blank line
- return $ blankline $$ vcat contents $$ blankline
+ return $ blankline $$ chomp (vcat contents) $$ blankline
blockToRST (DefinitionList items) = do
contents <- mapM definitionListItemToRST items
-- ensure that sublists have preceding blank line
- return $ blankline $$ vcat contents $$ blankline
+ return $ blankline $$ chomp (vcat contents) $$ blankline
-- | Convert bullet list item (list of blocks) to RST.
bulletListItemToRST :: [Block] -> State WriterState Doc
@@ -427,7 +426,7 @@ inlineToRST (Image alternate (source, tit)) = do
return $ "|" <> label <> "|"
inlineToRST (Note contents) = do
-- add to notes in state
- notes <- get >>= return . stNotes
+ notes <- gets stNotes
modify $ \st -> st { stNotes = contents:notes }
let ref = show $ (length notes) + 1
return $ " [" <> text ref <> "]_"
diff --git a/tests/Tests/Old.hs b/tests/Tests/Old.hs
index 8a256b761..0f7b33dd1 100644
--- a/tests/Tests/Old.hs
+++ b/tests/Tests/Old.hs
@@ -153,6 +153,9 @@ tests = [ testGroup "markdown"
, test "formatting" ["-r", "epub", "-w", "native"]
"epub/formatting.epub" "epub/formatting.native"
]
+ , testGroup "twiki"
+ [ test "reader" ["-r", "twiki", "-w", "native", "-s"]
+ "twiki-reader.twiki" "twiki-reader.native" ]
, testGroup "other writers" $ map (\f -> testGroup f $ writerTests f)
[ "opendocument" , "context" , "texinfo", "icml"
, "man" , "plain" , "rtf", "org", "asciidoc"
diff --git a/tests/Tests/Readers/Docx.hs b/tests/Tests/Readers/Docx.hs
index 565d117e9..2963c34da 100644
--- a/tests/Tests/Readers/Docx.hs
+++ b/tests/Tests/Readers/Docx.hs
@@ -13,7 +13,6 @@ import Text.Pandoc.Writers.Native (writeNative)
import qualified Data.Map as M
import Text.Pandoc.MediaBag (MediaBag, lookupMedia, mediaDirectory)
import Codec.Archive.Zip
-import System.FilePath (combine)
-- We define a wrapper around pandoc that doesn't normalize in the
-- tests. Since we do our own normalization, we want to make sure
@@ -60,7 +59,7 @@ testCompare = testCompareWithOpts def
getMedia :: FilePath -> FilePath -> IO (Maybe B.ByteString)
getMedia archivePath mediaPath = do
zf <- B.readFile archivePath >>= return . toArchive
- return $ findEntryByPath (combine "word" mediaPath) zf >>= (Just . fromEntry)
+ return $ findEntryByPath ("word/" ++ mediaPath) zf >>= (Just . fromEntry)
compareMediaPathIO :: FilePath -> MediaBag -> FilePath -> IO Bool
compareMediaPathIO mediaPath mediaBag docxPath = do
@@ -157,9 +156,9 @@ tests = [ testGroup "inlines"
"docx/numbered_header.docx"
"docx/numbered_header.native"
, testCompare
- "headers in other languages"
- "docx/danish_headers.docx"
- "docx/danish_headers.native"
+ "i18n blocks (headers and blockquotes)"
+ "docx/i18n_blocks.docx"
+ "docx/i18n_blocks.native"
, testCompare
"lists"
"docx/lists.docx"
diff --git a/tests/Tests/Readers/EPUB.hs b/tests/Tests/Readers/EPUB.hs
index f27ea979f..0d19a8400 100644
--- a/tests/Tests/Readers/EPUB.hs
+++ b/tests/Tests/Readers/EPUB.hs
@@ -8,6 +8,7 @@ import qualified Data.ByteString.Lazy as BL
import Text.Pandoc.Readers.EPUB
import Text.Pandoc.MediaBag (MediaBag, mediaDirectory)
import Control.Applicative
+import System.FilePath (joinPath)
getMediaBag :: FilePath -> IO MediaBag
getMediaBag fp = snd . readEPUB def <$> BL.readFile fp
@@ -22,12 +23,12 @@ testMediaBag fp bag = do
(actBag == bag)
featuresBag :: [(String, String, Int)]
-featuresBag = [("img/ElementaryMathExample.png","image/png",1331),("img/Maghreb1.png","image/png",2520),("img/check.gif","image/gif",1340),("img/check.jpg","image/jpeg",2661),("img/check.png","image/png",2815),("img/cichons_diagram.png","image/png",7045),("img/complex_number.png","image/png",5238),("img/multiscripts_and_greek_alphabet.png","image/png",10060)]
+featuresBag = [(joinPath ["img","check.gif"],"image/gif",1340),(joinPath ["img","check.jpg"],"image/jpeg",2661),(joinPath ["img","check.png"],"image/png",2815),(joinPath ["img","multiscripts_and_greek_alphabet.png"],"image/png",10060)]
tests :: [Test]
tests =
[ testGroup "EPUB Mediabag"
[ testCase "features bag"
- (testMediaBag "epub/features.epub" featuresBag)
+ (testMediaBag "epub/img.epub" featuresBag)
]
]
diff --git a/tests/Tests/Readers/Markdown.hs b/tests/Tests/Readers/Markdown.hs
index b45d94032..a7e322306 100644
--- a/tests/Tests/Readers/Markdown.hs
+++ b/tests/Tests/Readers/Markdown.hs
@@ -20,6 +20,9 @@ markdownCDL :: String -> Pandoc
markdownCDL = readMarkdown def { readerExtensions = Set.insert
Ext_compact_definition_lists $ readerExtensions def }
+markdownGH :: String -> Pandoc
+markdownGH = readMarkdown def { readerExtensions = githubMarkdownExtensions }
+
infix 4 =:
(=:) :: ToString c
=> String -> (String, c) -> Test
@@ -271,5 +274,14 @@ tests = [ testGroup "inline code"
plain (text "if this button exists") <>
rawBlock "html" "</button>" <>
divWith nullAttr (para $ text "with this div too.")]
+ , test markdownGH "issue #1636" $
+ unlines [ "* a"
+ , "* b"
+ , "* c"
+ , " * d" ]
+ =?>
+ bulletList [ plain "a"
+ , plain "b"
+ , plain "c" <> bulletList [plain "d"] ]
]
]
diff --git a/tests/Tests/Readers/Org.hs b/tests/Tests/Readers/Org.hs
index b9896e1b0..d1f673514 100644
--- a/tests/Tests/Readers/Org.hs
+++ b/tests/Tests/Readers/Org.hs
@@ -126,6 +126,14 @@ tests =
, (emph "b") <> "."
])
+ , "Quotes are forbidden border chars" =:
+ "/'nope/ *nope\"*" =?>
+ para ("/'nope/" <> space <> "*nope\"*")
+
+ , "Commata are forbidden border chars" =:
+ "/nada,/" =?>
+ para "/nada,/"
+
, "Markup should work properly after a blank line" =:
unlines ["foo", "", "/bar/"] =?>
(para $ text "foo") <> (para $ emph $ text "bar")
@@ -189,6 +197,18 @@ tests =
"[[http://zeitlens.com/]]" =?>
(para $ link "http://zeitlens.com/" "" "http://zeitlens.com/")
+ , "Absolute file link" =:
+ "[[/url][hi]]" =?>
+ (para $ link "file:///url" "" "hi")
+
+ , "Link to file in parent directory" =:
+ "[[../file.txt][moin]]" =?>
+ (para $ link "../file.txt" "" "moin")
+
+ , "Empty link (for gitit interop)" =:
+ "[[][New Link]]" =?>
+ (para $ link "" "" "New Link")
+
, "Image link" =:
"[[sunset.png][dusk.svg]]" =?>
(para $ link "sunset.png" "" (image "dusk.svg" "" ""))
@@ -268,6 +288,18 @@ tests =
"\\notacommand{foo}" =?>
para (rawInline "latex" "\\notacommand{foo}")
+ , "MathML symbol in LaTeX-style" =:
+ "There is a hackerspace in Lübeck, Germany, called nbsp (unicode symbol: '\\nbsp')." =?>
+ para ("There is a hackerspace in Lübeck, Germany, called nbsp (unicode symbol: ' ').")
+
+ , "MathML symbol in LaTeX-style, including braces" =:
+ "\\Aacute{}stor" =?>
+ para "Ástor"
+
+ , "MathML copy sign" =:
+ "\\copy" =?>
+ para "©"
+
, "LaTeX citation" =:
"\\cite{Coffee}" =?>
let citation = Citation
@@ -450,6 +482,18 @@ tests =
, header 2 ("walk" <> space <> "dog")
]
+ , "Comment Trees" =:
+ unlines [ "* COMMENT A comment tree"
+ , " Not much going on here"
+ , "** This will be dropped"
+ , "* Comment tree above"
+ ] =?>
+ header 1 "Comment tree above"
+
+ , "Nothing but a COMMENT header" =:
+ "* COMMENT Test" =?>
+ (mempty::Blocks)
+
, "Paragraph starting with an asterisk" =:
"*five" =?>
para "*five"
@@ -578,6 +622,13 @@ tests =
, plain "Item2"
]
+ , "Unindented *" =:
+ ("- Item1\n" ++
+ "* Item2\n") =?>
+ bulletList [ plain "Item1"
+ ] <>
+ header 1 "Item2"
+
, "Multi-line Bullet Lists" =:
("- *Fat\n" ++
" Tony*\n" ++
@@ -622,6 +673,33 @@ tests =
]
]
+ , "Bullet List with Decreasing Indent" =:
+ (" - Discovery\n\
+ \ - Human After All\n") =?>
+ mconcat [ bulletList [ plain "Discovery" ]
+ , bulletList [ plain ("Human" <> space <> "After" <> space <> "All")]
+ ]
+
+ , "Header follows Bullet List" =:
+ (" - Discovery\n\
+ \ - Human After All\n\
+ \* Homework") =?>
+ mconcat [ bulletList [ plain "Discovery"
+ , plain ("Human" <> space <> "After" <> space <> "All")
+ ]
+ , header 1 "Homework"
+ ]
+
+ , "Bullet List Unindented with trailing Header" =:
+ ("- Discovery\n\
+ \- Homework\n\
+ \* NotValidListItem") =?>
+ mconcat [ bulletList [ plain "Discovery"
+ , plain "Homework"
+ ]
+ , header 1 "NotValidListItem"
+ ]
+
, "Simple Ordered List" =:
("1. Item1\n" ++
"2. Item2\n") =?>
@@ -702,7 +780,9 @@ tests =
]
])
]
-
+ , "Definition list with multi-word term" =:
+ " - Elijah Wood :: He plays Frodo" =?>
+ definitionList [ ("Elijah" <> space <> "Wood", [plain $ "He" <> space <> "plays" <> space <> "Frodo"])]
, "Compact definition list" =:
unlines [ "- ATP :: adenosine 5' triphosphate"
, "- DNA :: deoxyribonucleic acid"
@@ -715,6 +795,16 @@ tests =
, ("PCR", [ plain $ spcSep [ "polymerase", "chain", "reaction" ] ])
]
+ , "Definition List With Trailing Header" =:
+ "- definition :: list\n\
+ \- cool :: defs\n\
+ \* header" =?>
+ mconcat [ definitionList [ ("definition", [plain "list"])
+ , ("cool", [plain "defs"])
+ ]
+ , header 1 "header"
+ ]
+
, "Loose bullet list" =:
unlines [ "- apple"
, ""
diff --git a/tests/Tests/Readers/RST.hs b/tests/Tests/Readers/RST.hs
index a80dc32b7..c97dcb149 100644
--- a/tests/Tests/Readers/RST.hs
+++ b/tests/Tests/Readers/RST.hs
@@ -67,5 +67,26 @@ tests = [ "line block with blank line" =:
link "http://foo.bar.baz" "" "http://foo.bar.baz" <> ". " <>
link "http://foo.bar/baz_(bam)" "" "http://foo.bar/baz_(bam)"
<> " (" <> link "http://foo.bar" "" "http://foo.bar" <> ")")
+ , "indented literal block" =: unlines
+ [ "::"
+ , ""
+ , " block quotes"
+ , ""
+ , " can go on for many lines"
+ , "but must stop here"]
+ =?> (doc $
+ codeBlock "block quotes\n\ncan go on for many lines" <>
+ para "but must stop here")
+ , "line block with 3 lines" =: "| a\n| b\n| c"
+ =?> para ("a" <> linebreak <> "b" <> linebreak <> "c")
+ , "quoted literal block using >" =: "::\n\n> quoted\n> block\n\nOrdinary paragraph"
+ =?> codeBlock "> quoted\n> block" <> para "Ordinary paragraph"
+ , "quoted literal block using | (not a line block)" =: "::\n\n| quoted\n| block\n\nOrdinary paragraph"
+ =?> codeBlock "| quoted\n| block" <> para "Ordinary paragraph"
+ , "class directive with single paragraph" =: ".. class:: special\n\nThis is a \"special\" paragraph."
+ =?> divWith ("", ["special"], []) (para "This is a \"special\" paragraph.")
+ , "class directive with two paragraphs" =: ".. class:: exceptional remarkable\n\n First paragraph.\n\n Second paragraph."
+ =?> divWith ("", ["exceptional", "remarkable"], []) (para "First paragraph." <> para "Second paragraph.")
+ , "class directive around literal block" =: ".. class:: classy\n\n::\n\n a\n b"
+ =?> divWith ("", ["classy"], []) (codeBlock "a\nb")
]
-
diff --git a/tests/Tests/Shared.hs b/tests/Tests/Shared.hs
index b6671835c..9b55b7b1d 100644
--- a/tests/Tests/Shared.hs
+++ b/tests/Tests/Shared.hs
@@ -9,6 +9,7 @@ import Test.Framework.Providers.HUnit
import Test.HUnit ( assertBool, (@?=) )
import Text.Pandoc.Builder
import Data.Monoid
+import System.FilePath (joinPath)
tests :: [Test]
tests = [ testGroup "normalize"
@@ -40,21 +41,21 @@ p_normalize_no_trailing_spaces ils = null ils' || last ils' /= Space
testCollapse :: [Test]
testCollapse = map (testCase "collapse")
- [ (collapseFilePath "" @?= "")
- , (collapseFilePath "./foo" @?= "foo")
- , (collapseFilePath "././../foo" @?= "../foo")
- , (collapseFilePath "../foo" @?= "../foo")
- , (collapseFilePath "/bar/../baz" @?= "/baz")
- , (collapseFilePath "/../baz" @?= "/../baz")
- , (collapseFilePath "./foo/.././bar/../././baz" @?= "baz")
- , (collapseFilePath "./" @?= "")
- , (collapseFilePath "././" @?= "")
- , (collapseFilePath "../" @?= "..")
- , (collapseFilePath ".././" @?= "..")
- , (collapseFilePath "./../" @?= "..")
- , (collapseFilePath "../../" @?= "../..")
- , (collapseFilePath "parent/foo/baz/../bar" @?= "parent/foo/bar")
- , (collapseFilePath "parent/foo/baz/../../bar" @?= "parent/bar")
- , (collapseFilePath "parent/foo/.." @?= "parent")
- , (collapseFilePath "/parent/foo/../../bar" @?= "/bar")
- , (collapseFilePath "/./parent/foo" @?= "/parent/foo")]
+ [ (collapseFilePath (joinPath [ ""]) @?= (joinPath [ ""]))
+ , (collapseFilePath (joinPath [ ".","foo"]) @?= (joinPath [ "foo"]))
+ , (collapseFilePath (joinPath [ ".",".","..","foo"]) @?= (joinPath [ joinPath ["..", "foo"]]))
+ , (collapseFilePath (joinPath [ "..","foo"]) @?= (joinPath [ "..","foo"]))
+ , (collapseFilePath (joinPath [ "","bar","..","baz"]) @?= (joinPath [ "","baz"]))
+ , (collapseFilePath (joinPath [ "","..","baz"]) @?= (joinPath [ "","..","baz"]))
+ , (collapseFilePath (joinPath [ ".","foo","..",".","bar","..",".",".","baz"]) @?= (joinPath [ "baz"]))
+ , (collapseFilePath (joinPath [ ".",""]) @?= (joinPath [ ""]))
+ , (collapseFilePath (joinPath [ ".",".",""]) @?= (joinPath [ ""]))
+ , (collapseFilePath (joinPath [ "..",""]) @?= (joinPath [ ".."]))
+ , (collapseFilePath (joinPath [ "..",".",""]) @?= (joinPath [ ".."]))
+ , (collapseFilePath (joinPath [ ".","..",""]) @?= (joinPath [ ".."]))
+ , (collapseFilePath (joinPath [ "..","..",""]) @?= (joinPath [ "..",".."]))
+ , (collapseFilePath (joinPath [ "parent","foo","baz","..","bar"]) @?= (joinPath [ "parent","foo","bar"]))
+ , (collapseFilePath (joinPath [ "parent","foo","baz","..","..","bar"]) @?= (joinPath [ "parent","bar"]))
+ , (collapseFilePath (joinPath [ "parent","foo",".."]) @?= (joinPath [ "parent"]))
+ , (collapseFilePath (joinPath [ "","parent","foo","..","..","bar"]) @?= (joinPath [ "","bar"]))
+ , (collapseFilePath (joinPath [ "",".","parent","foo"]) @?= (joinPath [ "","parent","foo"]))]
diff --git a/tests/docx/danish_headers.docx b/tests/docx/danish_headers.docx
deleted file mode 100644
index 345fc632c..000000000
--- a/tests/docx/danish_headers.docx
+++ /dev/null
Binary files differ
diff --git a/tests/docx/danish_headers.native b/tests/docx/danish_headers.native
deleted file mode 100644
index 12a857811..000000000
--- a/tests/docx/danish_headers.native
+++ /dev/null
@@ -1,10 +0,0 @@
-[Header 1 ("testoverskrift",[],[]) [Str "Testoverskrift"]
-,Para [Str "Normal"]
-,Header 2 ("testoverskrift-2",[],[]) [Str "Testoverskrift",Space,Str "2"]
-,Table [] [AlignDefault,AlignDefault] [0.0,0.0]
- [[Plain [Str "Kolonne1"]]
- ,[Plain [Str "Kolonne2"]]]
- [[[Plain [Str "Testdata",Space,Str "1"]]
- ,[Plain [Str "Tester2"]]]
- ,[[Plain [Str "Testdata",Space,Str "2"]]
- ,[Plain [Str "Tester3"]]]]]
diff --git a/tests/docx/i18n_blocks.docx b/tests/docx/i18n_blocks.docx
new file mode 100644
index 000000000..36341c363
--- /dev/null
+++ b/tests/docx/i18n_blocks.docx
Binary files differ
diff --git a/tests/docx/i18n_blocks.native b/tests/docx/i18n_blocks.native
new file mode 100644
index 000000000..582a7360d
--- /dev/null
+++ b/tests/docx/i18n_blocks.native
@@ -0,0 +1,8 @@
+[Header 1 ("this-is-heading-1",[],[]) [Str "This",Space,Str "is",Space,Str "Heading",Space,Str "1"]
+,Header 2 ("this-is-heading-2",[],[]) [Str "This",Space,Str "is",Space,Str "Heading",Space,Str "2"]
+,BlockQuote
+ [Para [Str "This",Space,Str "is",Space,Str "Quote"]
+ ,Para [Str "This",Space,Str "is",Space,Str "Block",Space,Str "Text"]]
+,BulletList
+ [[Para [Str "This",Space,Str "is",Space,Str "list",Space,Str "item",Space,Str "1"]]
+ ,[Para [Str "This",Space,Str "is",Space,Str "list",Space,Str "item",Space,Str "2"]]]]
diff --git a/tests/docx/links.docx b/tests/docx/links.docx
index 10ec62fd7..538b84b08 100644
--- a/tests/docx/links.docx
+++ b/tests/docx/links.docx
Binary files differ
diff --git a/tests/docx/links.native b/tests/docx/links.native
index c741fe875..cd7ab6fb6 100644
--- a/tests/docx/links.native
+++ b/tests/docx/links.native
@@ -1,5 +1,6 @@
[Header 2 ("an-internal-link-and-an-external-link",[],[]) [Str "An",Space,Str "internal",Space,Str "link",Space,Str "and",Space,Str "an",Space,Str "external",Space,Str "link"]
,Para [Str "An",Space,Link [Str "external",Space,Str "link"] ("http://google.com",""),Space,Str "to",Space,Str "a",Space,Str "popular",Space,Str "website."]
+,Para [Str "An",Space,Link [Str "external",Space,Str "link"] ("http://johnmacfarlane.net/pandoc/README.html#synopsis",""),Space,Str "to",Space,Str "a",Space,Str "website",Space,Str "with",Space,Str "an",Space,Str "anchor."]
,Para [Str "An",Space,Link [Str "internal",Space,Str "link"] ("#a-section-for-testing-link-targets",""),Space,Str "to",Space,Str "a",Space,Str "section",Space,Str "header."]
,Para [Str "An",Space,Link [Str "internal",Space,Str "link"] ("#my_bookmark",""),Space,Str "to",Space,Str "a",Space,Str "bookmark."]
,Header 2 ("a-section-for-testing-link-targets",[],[]) [Str "A",Space,Str "section",Space,Str "for",Space,Str "testing",Space,Str "link",Space,Str "targets"]
diff --git a/tests/epub/features.epub b/tests/epub/features.epub
index 8dcae384b..2690eec8b 100644
--- a/tests/epub/features.epub
+++ b/tests/epub/features.epub
Binary files differ
diff --git a/tests/epub/features.native b/tests/epub/features.native
index f01070383..6ccc04f43 100644
--- a/tests/epub/features.native
+++ b/tests/epub/features.native
@@ -1,5 +1,4 @@
-[Para [Image [] ("img/multiscripts_and_greek_alphabet.png","")]
-,Para [Span ("front.xhtml",[],[]) []]
+[Para [Span ("front.xhtml",[],[]) []]
,RawBlock (Format "html") "<section>"
,Header 1 ("",[],[]) [Str "Reflowable",Space,Str "EPUB",Space,Str "3",Space,Str "Conformance",Space,Str "Test",Space,Str "Document:",Space,Str "0100"]
,RawBlock (Format "html") "<section>"
@@ -28,31 +27,6 @@
[[Plain [Str "@@@TODO",Space,Str "provide",Space,Str "info",Space,Str "on",Space,Str "where",Space,Str "to",Space,Str "get",Space,Str "the",Space,Str "results",Space,Str "form"]]])]
,RawBlock (Format "html") "</section>"
,RawBlock (Format "html") "</section>"
-,Para [Span ("content-images-001.xhtml",[],[]) []]
-,RawBlock (Format "html") "<section>"
-,Header 2 ("content-images-001.xhtml#multimedia",[],[]) [Str "Multimedia"]
-,RawBlock (Format "html") "<section>"
-,Header 3 ("content-images-001.xhtml#images",[],[]) [Str "Images"]
-,RawBlock (Format "html") "<section id=\"img-010\" class=\"ctest\">"
-,Header 4 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "img-010"],Space,Str "GIF"]
-,Para [Str "Tests",Space,Str "whether",Space,Str "the",Space,Str "GIF",Space,Str "image",Space,Str "format",Space,Str "is",Space,Str "supported."]
-,Para [Image [Str "gif",Space,Str "test"] ("img/check.gif","")]
-,Para [Str "If",Space,Str "a",Space,Str "checkmark",Space,Str "precedes",Space,Str "this",Space,Str "paragaph,",Space,Str "the",Space,Str "test",Space,Str "passes."]
-,RawBlock (Format "html") "</section>"
-,RawBlock (Format "html") "<section id=\"img-020\" class=\"ctest\">"
-,Header 4 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "img-020"],Space,Str "PNG"]
-,Para [Str "Tests",Space,Str "whether",Space,Str "the",Space,Str "PNG",Space,Str "image",Space,Str "format",Space,Str "is",Space,Str "supported."]
-,Para [Image [Str "png",Space,Str "test"] ("img/check.png","")]
-,Para [Str "If",Space,Str "a",Space,Str "checkmark",Space,Str "precedes",Space,Str "this",Space,Str "paragaph,",Space,Str "the",Space,Str "test",Space,Str "passes."]
-,RawBlock (Format "html") "</section>"
-,RawBlock (Format "html") "<section id=\"img-030\" class=\"ctest\">"
-,Header 4 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "img-030"],Space,Str "JPEG"]
-,Para [Str "Tests",Space,Str "whether",Space,Str "the",Space,Str "JPEG",Space,Str "image",Space,Str "format",Space,Str "is",Space,Str "supported."]
-,Para [Image [Str "jpeg",Space,Str "test"] ("img/check.jpg","")]
-,Para [Str "If",Space,Str "a",Space,Str "checkmark",Space,Str "precedes",Space,Str "this",Space,Str "paragaph,",Space,Str "the",Space,Str "test",Space,Str "passes."]
-,RawBlock (Format "html") "</section>"
-,RawBlock (Format "html") "</section>"
-,RawBlock (Format "html") "</section>"
,Para [Span ("content-mathml-001.xhtml",[],[]) []]
,RawBlock (Format "html") "<section>"
,Header 2 ("content-mathml-001.xhtml#mathml",[],[]) [Str "MathML"]
@@ -94,13 +68,13 @@
,Header 2 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "mathml-024"],Str "Horizontal",Space,Str "stretch,",Space,Code ("",[],[]) "mover",Str ",",Space,Code ("",[],[]) "munder",Str ",",Space,Str "and",Space,Code ("",[],[]) "mspace",Space,Str "elements"]
,Para [Str "Tests",Space,Str "whether",Space,Str "horizontal",Space,Str "stretch,",Space,Code ("",[],[]) "mover",Str ",",Space,Code ("",[],[]) "munder",Str ",",Space,Code ("",[],[]) "mspace",Space,Str "elements",Space,Str "are",Space,Str "supported."]
,Para [Math DisplayMath "c = \\overset{\\text{complex\\ number}}{\\overbrace{\\underset{\\text{real}}{\\underbrace{\\mspace{20mu} a\\mspace{20mu}}} + \\underset{\\text{imaginary}}{\\underbrace{\\quad b{\\mathbb{i}}\\quad}}}}"]
-,Para [Str "The",Space,Str "test",Space,Str "passes",Space,Str "if",Space,Str "the",Space,Str "rendering",Space,Str "looks",Space,Str "like",Space,Image [Str "description",Space,Str "of",Space,Str "imaginary",Space,Str "number:",Space,Str "c",Space,Str "=",Space,Str "a",Space,Str "+bi",Space,Str "with",Space,Str "an",Space,Str "overbrace",Space,Str "reading",Space,Str "'complex",Space,Str "number'",Space,Str "and",Space,Str "underbraces",Space,Str "below",Space,Str "'a'",Space,Str "and",Space,Str "'b",Space,Str "i'",Space,Str "reading",Space,Str "'real'",Space,Str "and",Space,Str "'imaginary'",Space,Str "respectively."] ("img/complex_number.png",""),Str "."]
+,Para [Str "The",Space,Str "test",Space,Str "passes",Space,Str "if",Space,Str "the",Space,Str "rendering",Space,Str "looks",Space,Str "like",Space,Str "."]
,RawBlock (Format "html") "</section>"
,RawBlock (Format "html") "<section id=\"mathml-025\" class=\"ctest\">"
,Header 2 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "mathml-025"],Str "Testing",Space,Code ("",[],[]) "mtable",Space,Str "with",Space,Code ("",[],[]) "colspan",Space,Str "and",Space,Code ("",[],[]) "rowspan",Space,Str "attributes,",Space,Str "Hebrew",Space,Str "and",Space,Str "Script",Space,Str "fonts"]
,Para [Str "Tests",Space,Str "whether",Space,Code ("",[],[]) "mtable",Space,Str "with",Space,Code ("",[],[]) "colspan",Space,Str "and",Space,Code ("",[],[]) "mspace",Space,Str "attributes",Space,Str "(colum",Space,Str "and",Space,Str "row",Space,Str "spanning)",Space,Str "are",Space,Str "supported;",Space,Str "uses",Space,Str "Hebrew",Space,Str "and",Space,Str "Script",Space,Str "alphabets."]
,Para [Math DisplayMath "\\begin{array}{llllllllll}\n & {\\operatorname{cov}\\left( \\mathcal{L} \\right)} & \\longrightarrow & {\\operatorname{non}\\left( \\mathcal{K} \\right)} & \\longrightarrow & {\\operatorname{cof}\\left( \\mathcal{K} \\right)} & \\longrightarrow & {\\operatorname{cof}\\left( \\mathcal{L} \\right)} & \\longrightarrow & 2^{\\aleph_{0}} \\\\\n & \\uparrow & & \\uparrow & & \\uparrow & & \\uparrow & & \\\\\n & {\\mathfrak{b}} & \\longrightarrow & {\\mathfrak{d}} & & & & & & \\\\\n & \\uparrow & & \\uparrow & & & & & & \\\\\n\\aleph_{1} & \\longrightarrow & {\\operatorname{add}\\left( \\mathcal{L} \\right)} & \\longrightarrow & {\\operatorname{add}\\left( \\mathcal{K} \\right)} & \\longrightarrow & {\\operatorname{cov}\\left( \\mathcal{K} \\right)} & \\longrightarrow & {\\operatorname{non}\\left( \\mathcal{L} \\right)} & \\\\\n\\end{array}"]
-,Para [Str "The",Space,Str "test",Space,Str "passes",Space,Str "if",Space,Str "the",Space,Str "rendering",Space,Str "looks",Space,Str "like",Space,Link [Str "Cicho\324's",Space,Str "Diagram"] ("Cicho%C5%84's_diagram",""),Str ":",Space,Image [Str "rendering",Space,Str "of",Space,Str "Cicho\324's",Space,Str "diagram."] ("img/cichons_diagram.png",""),Str "."]
+,Para [Str "The",Space,Str "test",Space,Str "passes",Space,Str "if",Space,Str "the",Space,Str "rendering",Space,Str "looks",Space,Str "like",Space,Link [Str "Cicho\324's",Space,Str "Diagram"] ("Cicho%C5%84's_diagram",""),Str ":",Space,Str "."]
,RawBlock (Format "html") "</section>"
,RawBlock (Format "html") "<section id=\"mathml-026\" class=\"ctest\">"
,Header 2 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "mathml-026"],Str "BiDi,",Space,Str "RTL",Space,Str "and",Space,Str "Arabic",Space,Str "alphabets"]
@@ -111,7 +85,7 @@
,RawBlock (Format "html") "<section id=\"mathml-027\" class=\"ctest\">"
,Header 2 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "mathml-027"],Str "Elementary",Space,Str "math:",Space,Str "long",Space,Str "division",Space,Str "notation"]
,Para [Str "Tests",Space,Str "whether",Space,Code ("",[],[]) "mlongdiv",Space,Str "elements",Space,Str "(from",Space,Str "elementary",Space,Str "math)",Space,Str "are",Space,Str "supported."]
-,Para [Str "The",Space,Str "test",Space,Str "passes",Space,Str "if",Space,Str "the",Space,Str "rendering",Space,Str "looks",Space,Str "like",Space,Str "the",Space,Str "following",Space,Str "image:",Space,Image [Str "A",Space,Str "long",Space,Str "division",Space,Str "dividing",Space,Str "1306",Space,Str "by",Space,Str "3,",Space,Str "presented",Space,Str "in",Space,Str "'lefttop'",Space,Str "(US)",Space,Str "notation"] ("img/ElementaryMathExample.png",""),Str "."]
+,Para [Str "The",Space,Str "test",Space,Str "passes",Space,Str "if",Space,Str "the",Space,Str "rendering",Space,Str "looks",Space,Str "like",Space,Str "the",Space,Str "following",Space,Str "image:",Space,Str "."]
,RawBlock (Format "html") "</section>"
,RawBlock (Format "html") "</section>"
,Para [Span ("content-switch-001.xhtml",[],[]) []]
@@ -130,6 +104,4 @@
,Para [Str "If",Space,Str "a",Space,Str "MathML",Space,Str "equation",Space,Str "is",Space,Str "rendered",Space,Str "before",Space,Str "this",Space,Str "paragraph,",Space,Str "the",Space,Str "test",Space,Str "passes."]
,Para [Str "If",Space,Str "test",Space,Code ("",[],[]) "switch-010",Space,Str "did",Space,Str "not",Space,Str "pass,",Space,Str "this",Space,Str "test",Space,Str "should",Space,Str "be",Space,Str "marked",Space,Code ("",[],[]) "Not Supported",Str "."]
,RawBlock (Format "html") "</section>"
-,RawBlock (Format "html") "</section>"
-,Para [Span ("Maghreb1.png",[],[]) []]
-,Para [Image [] ("img/Maghreb1.png","")]]
+,RawBlock (Format "html") "</section>"]
diff --git a/tests/epub/img.epub b/tests/epub/img.epub
new file mode 100644
index 000000000..ebe80d935
--- /dev/null
+++ b/tests/epub/img.epub
Binary files differ
diff --git a/tests/html-reader.html b/tests/html-reader.html
index 14ad3ed54..e9ba2a68b 100644
--- a/tests/html-reader.html
+++ b/tests/html-reader.html
@@ -451,5 +451,4 @@ An e-mail address: nobody [at] nowhere.net<blockquote>
</tr>
</table>
</body>
-</body>
</html>
diff --git a/tests/markdown-reader-more.native b/tests/markdown-reader-more.native
index 30da0afbb..00313a0ac 100644
--- a/tests/markdown-reader-more.native
+++ b/tests/markdown-reader-more.native
@@ -17,6 +17,7 @@
,Header 3 ("my-header",[],[]) [Str "my",Space,Str "header"]
,Header 2 ("in-math",[],[]) [Str "$",Space,Str "in",Space,Str "math"]
,Para [Math InlineMath "\\$2 + \\$3"]
+,Para [Math InlineMath "x = \\text{the $n$th root of $y$}"]
,Para [Str "This",Space,Str "should",Space,Str "not",Space,Str "be",Space,Str "math:"]
,Para [Str "$PATH",Space,Str "90",Space,Str "$PATH"]
,Header 2 ("commented-out-list-item",[],[]) [Str "Commented-out",Space,Str "list",Space,Str "item"]
@@ -78,6 +79,8 @@
,Para [Str "Link",Space,Str "to",Space,Link [Str "Explicit",Space,Str "header",Space,Str "attributes"] ("#foobar",""),Str "."]
,Para [Str "But",Space,Str "this",Space,Str "is",Space,Str "not",Space,Str "a",Space,Str "link",Space,Str "to",Space,Link [Str "My",Space,Str "other",Space,Str "header"] ("/foo",""),Str ",",Space,Str "since",Space,Str "the",Space,Str "reference",Space,Str "is",Space,Str "defined."]
,Header 2 ("foobar",["baz"],[("key","val")]) [Str "Explicit",Space,Str "header",Space,Str "attributes"]
+,BlockQuote
+ [Header 2 ("foobar",["baz"],[("key","val")]) [Str "Header",Space,Str "attributes",Space,Str "inside",Space,Str "block",Space,Str "quote"]]
,Header 2 ("line-blocks",[],[]) [Str "Line",Space,Str "blocks"]
,Para [Str "But",Space,Str "can",Space,Str "a",Space,Str "bee",Space,Str "be",Space,Str "said",Space,Str "to",Space,Str "be",LineBreak,Str "\160\160\160\160or",Space,Str "not",Space,Str "to",Space,Str "be",Space,Str "an",Space,Str "entire",Space,Str "bee,",LineBreak,Str "\160\160\160\160\160\160\160\160when",Space,Str "half",Space,Str "the",Space,Str "bee",Space,Str "is",Space,Str "not",Space,Str "a",Space,Str "bee,",LineBreak,Str "\160\160\160\160\160\160\160\160\160\160\160\160due",Space,Str "to",Space,Str "some",Space,Str "ancient",Space,Str "injury?"]
,Para [Str "Continuation",Space,Str "line",LineBreak,Str "\160\160and",Space,Str "another"]
diff --git a/tests/markdown-reader-more.txt b/tests/markdown-reader-more.txt
index c486f8885..ea4fc9cad 100644
--- a/tests/markdown-reader-more.txt
+++ b/tests/markdown-reader-more.txt
@@ -60,6 +60,8 @@
$\$2 + \$3$
+$x = \text{the $n$th root of $y$}$
+
This should not be math:
$PATH 90 $PATH
@@ -174,6 +176,8 @@ But this is not a link to [My other header], since the reference is defined.
## Explicit header attributes {#foobar .baz key="val"}
+> ## Header attributes inside block quote {#foobar .baz key="val"}
+
## Line blocks
| But can a bee be said to be
diff --git a/tests/tables.asciidoc b/tests/tables.asciidoc
index ba647866a..2a24544a3 100644
--- a/tests/tables.asciidoc
+++ b/tests/tables.asciidoc
@@ -65,4 +65,3 @@ Multiline table without column headers:
|First |row |12.0 |Example of a row that spans multiple lines.
|Second |row |5.0 |Here's another one. Note the blank line between rows.
|=======================================================================
-
diff --git a/tests/tables.haddock b/tests/tables.haddock
index 413ec97ad..f9efdc0de 100644
--- a/tests/tables.haddock
+++ b/tests/tables.haddock
@@ -74,4 +74,3 @@ Multiline table without column headers:
> the blank line between
> rows.
> ----------- ---------- ------------ --------------------------
-
diff --git a/tests/tables.org b/tests/tables.org
index 8d9100d07..9eaf5e706 100644
--- a/tests/tables.org
+++ b/tests/tables.org
@@ -49,4 +49,3 @@ Multiline table without column headers:
| First | row | 12.0 | Example of a row that spans multiple lines. |
| Second | row | 5.0 | Here's another one. Note the blank line between rows. |
-
diff --git a/tests/tables.rst b/tests/tables.rst
index e77f69d97..25d5932ea 100644
--- a/tests/tables.rst
+++ b/tests/tables.rst
@@ -88,4 +88,3 @@ Multiline table without column headers:
| | | | the blank line between |
| | | | rows. |
+-------------+------------+--------------+----------------------------+
-
diff --git a/tests/twiki-reader.native b/tests/twiki-reader.native
new file mode 100644
index 000000000..bde55a378
--- /dev/null
+++ b/tests/twiki-reader.native
@@ -0,0 +1,174 @@
+Pandoc (Meta {unMeta = fromList []})
+[Header 1 ("header",[],[]) [Str "header"]
+,Header 2 ("header-level-two",[],[]) [Str "header",Space,Str "level",Space,Str "two"]
+,Header 3 ("header-level-3",[],[]) [Str "header",Space,Str "level",Space,Str "3"]
+,Header 4 ("header-level-four",[],[]) [Str "header",Space,Emph [Str "level"],Space,Str "four"]
+,Header 5 ("header-level-5",[],[]) [Str "header",Space,Str "level",Space,Str "5"]
+,Header 6 ("header-level-6",[],[]) [Str "header",Space,Str "level",Space,Str "6"]
+,Para [Str "---+++++++",Space,Str "not",Space,Str "a",Space,Str "header"]
+,Para [Str "--++",Space,Str "not",Space,Str "a",Space,Str "header"]
+,Header 1 ("emph-and-strong",[],[]) [Str "emph",Space,Str "and",Space,Str "strong"]
+,Para [Emph [Str "emph"],Space,Strong [Str "strong"]]
+,Para [Emph [Strong [Str "strong",Space,Str "and",Space,Str "emph"]]]
+,Para [Strong [Emph [Str "emph",Space,Str "inside"],Space,Str "strong"]]
+,Para [Strong [Str "strong",Space,Str "with",Space,Emph [Str "emph"]]]
+,Para [Emph [Strong [Str "strong",Space,Str "inside"],Space,Str "emph"]]
+,Header 1 ("horizontal-rule",[],[]) [Str "horizontal",Space,Str "rule"]
+,Para [Str "top"]
+,HorizontalRule
+,Para [Str "bottom"]
+,HorizontalRule
+,Header 1 ("nop",[],[]) [Str "nop"]
+,Para [Str "_not",Space,Str "emph_"]
+,Header 1 ("entities",[],[]) [Str "entities"]
+,Para [Str "hi",Space,Str "&",Space,Str "low"]
+,Para [Str "hi",Space,Str "&",Space,Str "low"]
+,Para [Str "G\246del"]
+,Para [Str "\777\2730"]
+,Header 1 ("comments",[],[]) [Str "comments"]
+,Para [Str "inline",Space,Str "comment"]
+,Para [Str "between",Space,Str "blocks"]
+,Header 1 ("linebreaks",[],[]) [Str "linebreaks"]
+,Para [Str "hi",LineBreak,Str "there"]
+,Para [Str "hi",LineBreak,Space,Str "there"]
+,Header 1 ("inline-code",[],[]) [Str "inline",Space,Str "code"]
+,Para [Code ("",[],[]) "*\8594*",Space,Code ("",[],[]) "typed",Space,Code ("",["haskell"],[]) ">>="]
+,Header 1 ("code-blocks",[],[]) [Str "code",Space,Str "blocks"]
+,CodeBlock ("",[],[]) "case xs of\n (_:_) -> reverse xs\n [] -> ['*']"
+,CodeBlock ("",["haskell"],[]) "case xs of\n (_:_) -> reverse xs\n [] -> ['*']"
+,Header 1 ("block-quotes",[],[]) [Str "block",Space,Str "quotes"]
+,Para [Str "Regular",Space,Str "paragraph"]
+,BlockQuote
+ [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "block",Space,Str "quote."]
+ ,Para [Str "With",Space,Str "two",Space,Str "paragraphs."]]
+,Para [Str "Nother",Space,Str "paragraph."]
+,Header 1 ("external-links",[],[]) [Str "external",Space,Str "links"]
+,Para [Link [Emph [Str "Google"],Space,Str "search",Space,Str "engine"] ("http://google.com","")]
+,Para [Link [Str "http://johnmacfarlane.net/pandoc/"] ("http://johnmacfarlane.net/pandoc/","")]
+,Para [Link [Str "http://google.com"] ("http://google.com",""),Space,Link [Str "http://yahoo.com"] ("http://yahoo.com","")]
+,Para [Link [Str "email",Space,Str "me"] ("mailto:info@example.org","")]
+,Para [Str "http://google.com"]
+,Para [Str "http://google.com"]
+,Para [Str "http://google.com"]
+,Para [Str "info@example.org"]
+,Para [Str "info@example.org"]
+,Para [Str "info@example.org"]
+,Header 1 ("lists",[],[]) [Str "lists"]
+,BulletList
+ [[Plain [Str "Start",Space,Str "each",Space,Str "line"]]
+ ,[Plain [Str "with",Space,Str "an",Space,Str "asterisk",Space,Str "(*)."]
+ ,BulletList
+ [[Plain [Str "More",Space,Str "asterisks",Space,Str "gives",Space,Str "deeper"]
+ ,BulletList
+ [[Plain [Str "and",Space,Str "deeper",Space,Str "levels."]]]]]]
+ ,[Plain [Str "Line",Space,Str "breaks",LineBreak,Str "don't",Space,Str "break",Space,Str "levels."]]
+ ,[Plain [Str "Continuations",Space,Str "are",Space,Str "also",Space,Str "possible"]
+ ,BulletList
+ [[Plain [Str "and",Space,Str "do",Space,Str "not",Space,Str "break",Space,Str "the",Space,Str "list",Space,Str "flow"]]]]
+ ,[Plain [Str "Level",Space,Str "one"]]]
+,Para [Str "Any",Space,Str "other",Space,Str "start",Space,Str "ends",Space,Str "the",Space,Str "list."]
+,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "Start",Space,Str "each",Space,Str "line"]]
+ ,[Plain [Str "with",Space,Str "a",Space,Str "number",Space,Str "(1.)."]
+ ,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "More",Space,Str "number",Space,Str "signs",Space,Str "gives",Space,Str "deeper"]
+ ,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "and",Space,Str "deeper"]]
+ ,[Plain [Str "levels."]]]]]]
+ ,[Plain [Str "Line",Space,Str "breaks",LineBreak,Str "don't",Space,Str "break",Space,Str "levels."]]
+ ,[Plain [Str "Blank",Space,Str "lines"]]]
+,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "end",Space,Str "the",Space,Str "list",Space,Str "and",Space,Str "start",Space,Str "another."]]]
+,Para [Str "Any",Space,Str "other",Space,Str "start",Space,Str "also",Space,Str "ends",Space,Str "the",Space,Str "list."]
+,DefinitionList
+ [([Str "item",Space,Str "1"],
+ [[Plain [Str "definition",Space,Str "1"]]])
+ ,([Str "item",Space,Str "2"],
+ [[Plain [Str "definition",Space,Str "2-1",Space,Str "definition",Space,Str "2-2"]]])
+ ,([Str "item",Space,Emph [Str "3"]],
+ [[Plain [Str "definition",Space,Emph [Str "3"]]]])]
+,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "one"]]
+ ,[Plain [Str "two"]
+ ,BulletList
+ [[Plain [Str "two",Space,Str "point",Space,Str "one"]]
+ ,[Plain [Str "two",Space,Str "point",Space,Str "two"]]]]
+ ,[Plain [Str "three"]
+ ,DefinitionList
+ [([Str "three",Space,Str "item",Space,Str "one"],
+ [[Plain [Str "three",Space,Str "def",Space,Str "one"]]])]]
+ ,[Plain [Str "four"]
+ ,DefinitionList
+ [([Str "four",Space,Str "def",Space,Str "one"],
+ [[Plain [Str "this",Space,Str "is",Space,Str "a",Space,Str "continuation"]]])]]
+ ,[Plain [Str "five"]
+ ,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "five",Space,Str "sub",Space,Str "1"]
+ ,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "five",Space,Str "sub",Space,Str "1",Space,Str "sub",Space,Str "1"]]]]
+ ,[Plain [Str "five",Space,Str "sub",Space,Str "2"]]]]]
+,OrderedList (1,DefaultStyle,DefaultDelim)
+ [[Plain [Str "other"]
+ ,OrderedList (1,UpperRoman,DefaultDelim)
+ [[Plain [Str "list"]]
+ ,[Plain [Str "styles"]]]]
+ ,[Plain [Str "are"]
+ ,OrderedList (1,LowerRoman,DefaultDelim)
+ [[Plain [Str "also"]]
+ ,[Plain [Str "possible"]]]]
+ ,[Plain [Str "all"]
+ ,OrderedList (1,LowerAlpha,DefaultDelim)
+ [[Plain [Str "the"]]
+ ,[Plain [Str "different"]]
+ ,[Plain [Str "styles"]]]]
+ ,[Plain [Str "are"]
+ ,OrderedList (1,UpperAlpha,DefaultDelim)
+ [[Plain [Str "implemented"]]
+ ,[Plain [Str "and"]]
+ ,[Plain [Str "supported"]]]]]
+,Header 1 ("tables",[],[]) [Str "tables"]
+,Table [] [AlignDefault,AlignDefault] [0.0,0.0]
+ [[]
+ ,[]]
+ [[[Plain [Str "Orange"]]
+ ,[Plain [Str "Apple"]]]
+ ,[[Plain [Str "Bread"]]
+ ,[Plain [Str "Pie"]]]
+ ,[[Plain [Str "Butter"]]
+ ,[Plain [Str "Ice",Space,Str "cream"]]]]
+,Table [] [AlignLeft,AlignLeft] [0.0,0.0]
+ [[Plain [Str "Orange"]]
+ ,[Plain [Str "Apple"]]]
+ [[[Plain [Str "Bread"]]
+ ,[Plain [Str "Pie"]]]
+ ,[[Plain [Strong [Str "Butter"]]]
+ ,[Plain [Str "Ice",Space,Str "cream"]]]]
+,Table [] [AlignLeft,AlignLeft] [0.0,0.0]
+ [[Plain [Str "Orange"]]
+ ,[Plain [Str "Apple"]]]
+ [[[Plain [Str "Bread",LineBreak,LineBreak,Str "and",Space,Str "cheese"]]
+ ,[Plain [Str "Pie",LineBreak,LineBreak,Strong [Str "apple"],Space,Str "and",Space,Emph [Str "carrot"]]]]]
+,Table [] [AlignDefault,AlignDefault,AlignDefault] [0.0,0.0,0.0]
+ [[]
+ ,[]
+ ,[]]
+ [[[Plain [Str "Orange"]]
+ ,[Plain [Str "Apple"]]
+ ,[Plain [Str "more"]]]
+ ,[[Plain [Str "Bread"]]
+ ,[Plain [Str "Pie"]]
+ ,[Plain [Str "more"]]]
+ ,[[Plain [Str "Butter"]]
+ ,[Plain [Str "Ice",Space,Str "cream"]]
+ ,[Plain [Str "and",Space,Str "more"]]]]
+,Header 1 ("macros",[],[]) [Str "macros"]
+,Para [Span ("",["twiki-macro","TEST"],[]) []]
+,Para [Span ("",["twiki-macro","TEST"],[]) [Str ""]]
+,Para [Span ("",["twiki-macro","TEST"],[]) [Str "content with spaces"]]
+,Para [Span ("",["twiki-macro","TEST"],[]) [Str "content with spaces"]]
+,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test")]) [Str "content with spaces"]]
+,Para [Span ("",["twiki-macro","TEST"],[]) [Str "content with spaces ARG1=test"]]
+,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test")]) [Str "content with spaces"]]
+,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test"),("ARG2","test2")]) [Str ""]]
+,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test"),("ARG2","test2")]) [Str ""]]
+,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test"),("ARG2","test2")]) [Str "multiline\ndoes also work"]]]
diff --git a/tests/twiki-reader.twiki b/tests/twiki-reader.twiki
new file mode 100644
index 000000000..51828ef80
--- /dev/null
+++ b/tests/twiki-reader.twiki
@@ -0,0 +1,221 @@
+---+ header
+
+---++ header level two
+
+---+++ header level 3
+
+---++++ header _level_ four
+
+---+++++ header level 5
+
+---++++++ header level 6
+
+---+++++++ not a header
+
+ --++ not a header
+
+---+ emph and strong
+
+_emph_ *strong*
+
+__strong and emph__
+
+*<i>emph inside</i> strong*
+
+*strong with <i>emph</i>*
+
+_<b>strong inside</b> emph_
+
+---+ horizontal rule
+
+top
+---
+bottom
+
+---
+
+---+ nop
+
+<nop>_not emph_
+
+---+ entities
+
+hi & low
+
+hi &amp; low
+
+G&ouml;del
+
+&#777;&#xAAA;
+
+---+ comments
+
+inline <!-- secret --> comment
+
+<!-- secret -->
+
+between blocks
+
+ <!-- secret -->
+
+---+ linebreaks
+
+hi%BR%there
+
+hi%BR%
+there
+
+---+ inline code
+
+<code>*→*</code> =typed= <code class="haskell">>>=</code>
+
+---+ code blocks
+
+<verbatim>
+case xs of
+ (_:_) -> reverse xs
+ [] -> ['*']
+</verbatim>
+
+<verbatim class="haskell">
+case xs of
+ (_:_) -> reverse xs
+ [] -> ['*']
+</verbatim>
+
+---+ block quotes
+
+Regular paragraph
+<blockquote>
+This is a block quote.
+
+With two paragraphs.
+</blockquote>
+Nother paragraph.
+
+---+ external links
+
+[[http://google.com][<i>Google</i> search engine]]
+
+http://johnmacfarlane.net/pandoc/
+
+[[http://google.com]] [[http://yahoo.com]]
+
+[[mailto:info@example.org][email me]]
+
+!http://google.com
+
+<nop>http://google.com
+
+<noautolink>
+http://google.com
+</noautolink>
+
+!info@example.org
+
+<nop>info@example.org
+
+<noautolink>
+info@example.org
+</noautolink>
+
+---+ lists
+
+ * Start each line
+ * with an asterisk (*).
+ * More asterisks gives deeper
+ * and deeper levels.
+ * Line breaks%BR%don't break levels.
+ * Continuations
+ are also possible
+ * and do not break the list flow
+ * Level one
+Any other start ends the list.
+
+ 1. Start each line
+ 1. with a number (1.).
+ 1. More number signs gives deeper
+ 1. and deeper
+ 1. levels.
+ 1. Line breaks%BR%don't break levels.
+ 1. Blank lines
+
+ 1. end the list and start another.
+Any other start also
+ends the list.
+
+ $ item 1: definition 1
+ $ item 2: definition 2-1
+ definition 2-2
+ $ item _3_: definition _3_
+
+ 1. one
+ 1. two
+ * two point one
+ * two point two
+ 1. three
+ $ three item one: three def one
+ 1. four
+ $ four def one: this
+ is a continuation
+ 1. five
+ 1. five sub 1
+ 1. five sub 1 sub 1
+ 1. five sub 2
+
+ 1. other
+ I. list
+ I. styles
+ 1. are
+ i. also
+ i. possible
+ 1. all
+ a. the
+ a. different
+ a. styles
+ 1. are
+ A. implemented
+ A. and
+ A. supported
+
+---+ tables
+
+|Orange|Apple|
+|Bread|Pie|
+|Butter|Ice cream|
+
+|*Orange*|*Apple*|
+|Bread|Pie|
+|*Butter*|Ice cream|
+
+|*Orange*|*Apple*|
+|Bread%BR%%BR%and cheese|Pie%BR%%BR%*apple* and <i>carrot</i>|
+
+| Orange | Apple | more |
+| Bread | Pie | more |
+| Butter | Ice cream | and more |
+
+---+ macros
+
+%TEST%
+
+%TEST{}%
+
+%TEST{content with spaces}%
+
+%TEST{"content with spaces"}%
+
+%TEST{"content with spaces" ARG1="test"}%
+
+%TEST{content with spaces ARG1=test}%
+
+%TEST{ARG1=test content with spaces}%
+
+%TEST{ARG1=test ARG2=test2}%
+
+%TEST{ARG1="test" ARG2="test2"}%
+
+%TEST{ARG1="test"
+ARG2="test2"
+multiline
+does also work}%
diff --git a/tests/writer.dokuwiki b/tests/writer.dokuwiki
index 704e79b87..dc14e9b00 100644
--- a/tests/writer.dokuwiki
+++ b/tests/writer.dokuwiki
@@ -268,7 +268,8 @@ Multiple blocks with italics:
<HTML><dt></HTML>//orange//<HTML></dt></HTML>
<HTML><dd></HTML><HTML><p></HTML>orange fruit<HTML></p></HTML>
<code>{ orange code block }</code>
-> <HTML><p></HTML>orange block quote<HTML></p></HTML><HTML></dd></HTML><HTML></dl></HTML>
+> <HTML><p></HTML>orange block quote<HTML></p></HTML>
+<HTML></dd></HTML><HTML></dl></HTML>
Multiple definitions, tight:
@@ -611,7 +612,7 @@ If you want, you can indent every line, but you can also be lazy and just indent
))
> Notes can go in quotes.((In quote.
-))
+> ))
- And in list items.((In list.))
diff --git a/tests/writer.opml b/tests/writer.opml
index 54be4b671..840c3c6e1 100644
--- a/tests/writer.opml
+++ b/tests/writer.opml
@@ -18,7 +18,7 @@
</outline>
<outline text="Level 1">
<outline text="Level 2 with &lt;em&gt;emphasis&lt;/em&gt;">
- <outline text="Level 3" _note="with no blank line&#10;">
+ <outline text="Level 3" _note="with no blank line">
</outline>
</outline>
<outline text="Level 2" _note="with no blank line&#10;&#10;------------------------------------------------------------------------">
@@ -55,18 +55,18 @@
<outline text="Special Characters" _note="Here is some unicode:&#10;&#10;- I hat: Î&#10;- o umlaut: ö&#10;- section: §&#10;- set membership: ∈&#10;- copyright: ©&#10;&#10;AT&amp;T has an ampersand in their name.&#10;&#10;AT&amp;T is another way to write it.&#10;&#10;This &amp; that.&#10;&#10;4 \&lt; 5.&#10;&#10;6 \&gt; 5.&#10;&#10;Backslash: \\&#10;&#10;Backtick: \`&#10;&#10;Asterisk: \*&#10;&#10;Underscore: \_&#10;&#10;Left brace: {&#10;&#10;Right brace: }&#10;&#10;Left bracket: [&#10;&#10;Right bracket: ]&#10;&#10;Left paren: (&#10;&#10;Right paren: )&#10;&#10;Greater-than: \&gt;&#10;&#10;Hash: \#&#10;&#10;Period: .&#10;&#10;Bang: !&#10;&#10;Plus: +&#10;&#10;Minus: -&#10;&#10;------------------------------------------------------------------------">
</outline>
<outline text="Links">
- <outline text="Explicit" _note="Just a [URL](/url/).&#10;&#10;[URL and title](/url/ &quot;title&quot;).&#10;&#10;[URL and title](/url/ &quot;title preceded by two spaces&quot;).&#10;&#10;[URL and title](/url/ &quot;title preceded by a tab&quot;).&#10;&#10;[URL and title](/url/ &quot;title with &quot;quotes&quot; in it&quot;)&#10;&#10;[URL and title](/url/ &quot;title with single quotes&quot;)&#10;&#10;[with\_underscore](/url/with_underscore)&#10;&#10;[Email link](mailto:nobody@nowhere.net)&#10;&#10;[Empty]().&#10;">
+ <outline text="Explicit" _note="Just a [URL](/url/).&#10;&#10;[URL and title](/url/ &quot;title&quot;).&#10;&#10;[URL and title](/url/ &quot;title preceded by two spaces&quot;).&#10;&#10;[URL and title](/url/ &quot;title preceded by a tab&quot;).&#10;&#10;[URL and title](/url/ &quot;title with &quot;quotes&quot; in it&quot;)&#10;&#10;[URL and title](/url/ &quot;title with single quotes&quot;)&#10;&#10;[with\_underscore](/url/with_underscore)&#10;&#10;[Email link](mailto:nobody@nowhere.net)&#10;&#10;[Empty]().">
</outline>
- <outline text="Reference" _note="Foo [bar](/url/).&#10;&#10;Foo [bar](/url/).&#10;&#10;Foo [bar](/url/).&#10;&#10;With [embedded [brackets]](/url/).&#10;&#10;[b](/url/) by itself should be a link.&#10;&#10;Indented [once](/url).&#10;&#10;Indented [twice](/url).&#10;&#10;Indented [thrice](/url).&#10;&#10;This should [not][] be a link.&#10;&#10; [not]: /url&#10;&#10;Foo [bar](/url/ &quot;Title with &quot;quotes&quot; inside&quot;).&#10;&#10;Foo [biz](/url/ &quot;Title with &quot;quote&quot; inside&quot;).&#10;">
+ <outline text="Reference" _note="Foo [bar](/url/).&#10;&#10;Foo [bar](/url/).&#10;&#10;Foo [bar](/url/).&#10;&#10;With [embedded [brackets]](/url/).&#10;&#10;[b](/url/) by itself should be a link.&#10;&#10;Indented [once](/url).&#10;&#10;Indented [twice](/url).&#10;&#10;Indented [thrice](/url).&#10;&#10;This should [not][] be a link.&#10;&#10; [not]: /url&#10;&#10;Foo [bar](/url/ &quot;Title with &quot;quotes&quot; inside&quot;).&#10;&#10;Foo [biz](/url/ &quot;Title with &quot;quote&quot; inside&quot;).">
</outline>
- <outline text="With ampersands" _note="Here’s a [link with an ampersand in the&#10;URL](http://example.com/?foo=1&amp;bar=2).&#10;&#10;Here’s a link with an amersand in the link text:&#10;[AT&amp;T](http://att.com/ &quot;AT&amp;T&quot;).&#10;&#10;Here’s an [inline link](/script?foo=1&amp;bar=2).&#10;&#10;Here’s an [inline link in pointy braces](/script?foo=1&amp;bar=2).&#10;">
+ <outline text="With ampersands" _note="Here’s a [link with an ampersand in the&#10;URL](http://example.com/?foo=1&amp;bar=2).&#10;&#10;Here’s a link with an amersand in the link text:&#10;[AT&amp;T](http://att.com/ &quot;AT&amp;T&quot;).&#10;&#10;Here’s an [inline link](/script?foo=1&amp;bar=2).&#10;&#10;Here’s an [inline link in pointy braces](/script?foo=1&amp;bar=2).">
</outline>
<outline text="Autolinks" _note="With an ampersand: &lt;http://example.com/?foo=1&amp;bar=2&gt;&#10;&#10;- In a list?&#10;- &lt;http://example.com/&gt;&#10;- It should.&#10;&#10;An e-mail address: &lt;nobody@nowhere.net&gt;&#10;&#10;&gt; Blockquoted: &lt;http://example.com/&gt;&#10;&#10;Auto-links should not occur here: `&lt;http://example.com/&gt;`&#10;&#10; or here: &lt;http://example.com/&gt;&#10;&#10;------------------------------------------------------------------------">
</outline>
</outline>
<outline text="Images" _note="From “Voyage dans la Lune” by Georges Melies (1902):&#10;&#10;![lalune](lalune.jpg &quot;Voyage dans la Lune&quot;)&#10;&#10;Here is a movie ![movie](movie.jpg) icon.&#10;&#10;------------------------------------------------------------------------">
</outline>
-<outline text="Footnotes" _note="Here is a footnote reference,[^1] and another.[^2] This should *not* be&#10;a footnote reference, because it contains a space.[\^my note] Here is an&#10;inline note.[^3]&#10;&#10;&gt; Notes can go in quotes.[^4]&#10;&#10;1. And in list items.[^5]&#10;&#10;This paragraph should not be part of the note, as it is not indented.&#10;&#10;[^1]: Here is the footnote. It can go anywhere after the footnote&#10; reference. It need not be placed at the end of the document.&#10;&#10;[^2]: Here’s the long note. This one contains multiple blocks.&#10;&#10; Subsequent blocks are indented to show that they belong to the&#10; footnote (as with list items).&#10;&#10; { &lt;code&gt; }&#10;&#10; If you want, you can indent every line, but you can also be lazy and&#10; just indent the first line of each block.&#10;&#10;[^3]: This is *easier* to type. Inline notes may contain&#10; [links](http://google.com) and `]` verbatim characters, as well as&#10; [bracketed text].&#10;&#10;[^4]: In quote.&#10;&#10;[^5]: In list.&#10;">
+<outline text="Footnotes" _note="Here is a footnote reference,[^1] and another.[^2] This should *not* be&#10;a footnote reference, because it contains a space.[\^my note] Here is an&#10;inline note.[^3]&#10;&#10;&gt; Notes can go in quotes.[^4]&#10;&#10;1. And in list items.[^5]&#10;&#10;This paragraph should not be part of the note, as it is not indented.&#10;&#10;[^1]: Here is the footnote. It can go anywhere after the footnote&#10; reference. It need not be placed at the end of the document.&#10;&#10;[^2]: Here’s the long note. This one contains multiple blocks.&#10;&#10; Subsequent blocks are indented to show that they belong to the&#10; footnote (as with list items).&#10;&#10; { &lt;code&gt; }&#10;&#10; If you want, you can indent every line, but you can also be lazy and&#10; just indent the first line of each block.&#10;&#10;[^3]: This is *easier* to type. Inline notes may contain&#10; [links](http://google.com) and `]` verbatim characters, as well as&#10; [bracketed text].&#10;&#10;[^4]: In quote.&#10;&#10;[^5]: In list.">
</outline>
</body>
</opml>
diff --git a/tests/writer.rst b/tests/writer.rst
index 1a998d2ae..f09871a34 100644
--- a/tests/writer.rst
+++ b/tests/writer.rst
@@ -839,6 +839,7 @@ From “Voyage dans la Lune” by Georges Melies (1902):
:alt: Voyage dans la Lune
lalune
+
Here is a movie |movie| icon.
--------------