diff options
57 files changed, 1732 insertions, 340 deletions
@@ -44,12 +44,7 @@ Quick install [Not sure where `$CABALDIR` is?](http://www.haskell.org/haskellwiki/Cabal-Install#The_cabal-install_configuration_file) -5. Make sure the `$CABALDIR/share/man/man1` directory is in your `MANPATH`. - You should now be able to access the `pandoc` man page: - - man pandoc - -6. If you want to process citations with pandoc, you will also need to +5. If you want to process citations with pandoc, you will also need to install a separate package, `pandoc-citeproc`. This can be installed using cabal: @@ -71,6 +66,20 @@ Quick install --extra-include-dirs=/usr/local/Cellar/icu4c/51.1/include \ -funicode_collation text-icu pandoc-citeproc +The cabal installation procedure does not generate man pages. +To build the `pandoc` man pages, build pandoc with the +`make-pandoc-man-pages` flag, and then use the command +`make-pandoc-man-pages` from the pandoc source directory. +This will create the man pages in `man/man1` and `man/man5`. + +To build the `pandoc-citeproc` man pages, go to the pandoc-citeproc +build directory, and + + cd man + make + +The man page will be created in the `man1` subdirectory. + [GHC]: http://www.haskell.org/ghc/ [Haskell platform]: http://hackage.haskell.org/platform/ [cabal-install]: http://hackage.haskell.org/trac/hackage/wiki/CabalInstall @@ -13,15 +13,16 @@ Description Pandoc is a [Haskell] library for converting from one markup format to another, and a command-line tool that uses this library. It can read [markdown] and (subsets of) [Textile], [reStructuredText], [HTML], -[LaTeX], [MediaWiki markup], [Haddock markup], [OPML], [Emacs -Org-mode], [DocBook], [txt2tags], [EPUB] and [Word docx]; and it can write plain text, -[markdown], [reStructuredText], [XHTML], [HTML 5], [LaTeX] (including -[beamer] slide shows), [ConTeXt], [RTF], [OPML], [DocBook], -[OpenDocument], [ODT], [Word docx], [GNU Texinfo], [MediaWiki markup], -[DokuWiki markup], [Haddock markup], [EPUB] (v2 or v3), [FictionBook2], -[Textile], [groff man] pages, [Emacs Org-Mode], [AsciiDoc], [InDesign ICML], -and [Slidy], [Slideous], [DZSlides], [reveal.js] or [S5] HTML slide shows. -It can also produce [PDF] output on systems where LaTeX is installed. +[LaTeX], [MediaWiki markup], [TWiki markup], [Haddock markup], [OPML], +[Emacs Org-mode], [DocBook], [txt2tags], [EPUB] and [Word docx]; and +it can write plain text, [markdown], [reStructuredText], [XHTML], +[HTML 5], [LaTeX] (including [beamer] slide shows), [ConTeXt], [RTF], +[OPML], [DocBook], [OpenDocument], [ODT], [Word docx], [GNU Texinfo], +[MediaWiki markup], [DokuWiki markup], [Haddock markup], [EPUB] (v2 or v3), +[FictionBook2], [Textile], [groff man] pages, [Emacs Org-Mode], [AsciiDoc], +[InDesign ICML], and [Slidy], [Slideous], [DZSlides], [reveal.js] or +[S5] HTML slide shows. It can also produce [PDF] output on systems where +LaTeX is installed. Pandoc's enhanced version of markdown includes syntax for footnotes, tables, flexible ordered lists, definition lists, fenced code blocks, @@ -50,6 +51,15 @@ default (though output to *stdout* is disabled for the `odt`, `docx`, pandoc -o output.html input.txt +By default, pandoc produces a document fragment, not a standalone +document with a proper header and footer. To produce a standalone +document, use the `-s` or `--standalone` flag: + + pandoc -s -o output.html input.txt + +For more information on how standalone documents are produced, see +[Templates](#templates), below. + Instead of a file, an absolute URI may be given. In this case pandoc will fetch the content using HTTP: @@ -95,6 +105,11 @@ should pipe input and output through `iconv`: iconv -t utf-8 input.txt | pandoc | iconv -f utf-8 +Note that in some output formats (such as HTML, LaTeX, ConTeXt, +RTF, OPML, DocBook, and Texinfo), information about +the character encoding is included in the document header, which +will only be included if you use the `-s/--standalone` option. + Creating a PDF -------------- @@ -147,8 +162,8 @@ General options `textile` (Textile), `rst` (reStructuredText), `html` (HTML), `docbook` (DocBook), `t2t` (txt2tags), `docx` (docx), `epub` (EPUB), `opml` (OPML), `org` (Emacs Org-mode), `mediawiki` (MediaWiki markup), - `haddock` (Haddock markup), or `latex` (LaTeX). If `+lhs` is appended - to `markdown`, `rst`, + `twiki` (TWiki markup), `haddock` (Haddock markup), or `latex` (LaTeX). + If `+lhs` is appended to `markdown`, `rst`, `latex`, or `html`, the input will be treated as literate Haskell source: see [Literate Haskell support](#literate-haskell-support), below. Markdown syntax extensions can be individually enabled or @@ -239,8 +254,8 @@ Reader options to curly quotes, `---` to em-dashes, `--` to en-dashes, and `...` to ellipses. Nonbreaking spaces are inserted after certain abbreviations, such as "Mr." (Note: This option is significant only when - the input format is `markdown`, `markdown_strict`, or `textile`. It - is selected automatically when the input format is `textile` or the + the input format is `markdown`, `markdown_strict`, `textile` or `twiki`. + It is selected automatically when the input format is `textile` or the output format is `latex` or `context`, unless `--no-tex-ligatures` is used.) @@ -710,6 +725,16 @@ Math rendering in HTML formulas to images. The formula will be concatenated with the URL provided. If *URL* is not specified, the Google Chart API will be used. +`--katex`[=*URL*] +: Use [KaTeX] to display embedded TeX math in HTML output. + The *URL* should point to the `katex.js` load script. If a *URL* is + not provided, a link to the KaTeX CDN will be inserted. + +`--katex-stylesheet=*URL*` +: The *URL* should point to the `katex.css` stylesheet. If this option is + not specified, a link to the KaTeX CDN will be inserted. Note that this + option does not imply `--katex`. + Options for wrapper scripts --------------------------- @@ -1652,7 +1677,7 @@ proportionally spaced fonts, as it does not require lining up columns. #### Extension: `table_captions` #### -A caption may optionally be provided with all 4 kinds of tables (as +A caption may optionally be provided with all 4 kinds of tables (as illustrated in the examples below). A caption is a paragraph beginning with the string `Table:` (or just `:`), which will be stripped off. It may appear either before or after the table. @@ -2121,8 +2146,9 @@ Math #### Extension: `tex_math_dollars` #### Anything between two `$` characters will be treated as TeX math. The -opening `$` must have a character immediately to its right, while the -closing `$` must have a character immediately to its left. Thus, +opening `$` must have a non-space character immediately to its right, +while the closing `$` must have a non-space character immediately to its +left, and must not be followed immediately by a digit. Thus, `$20,000 and $30,000` won't parse as math. If for some reason you need to enclose text in literal `$` characters, backslash-escape them and they won't be treated as math delimiters. @@ -3161,6 +3187,7 @@ Rosenthal. [Textile]: http://redcloth.org/textile [MediaWiki markup]: http://www.mediawiki.org/wiki/Help:Formatting [DokuWiki markup]: https://www.dokuwiki.org/dokuwiki +[TWiki markup]: http://twiki.org/cgi-bin/view/TWiki/TextFormattingRules [Haddock markup]: http://www.haskell.org/haddock/doc/html/ch03s08.html [groff man]: http://developer.apple.com/DOCUMENTATION/Darwin/Reference/ManPages/man7/groff_man.7.html [Haskell]: http://www.haskell.org/ @@ -3181,3 +3208,4 @@ Rosenthal. [txt2tags]: http://txt2tags.org/ [EPUB]: http://idpf.org/epub [EPUBspine]: http://www.idpf.org/epub/301/spec/epub-publications.html#sec-spine-elem +[KaTeX]: https://github.com/Khan/KaTeX diff --git a/data/templates b/data/templates -Subproject a63c58b23ecb159ed463beb084d6f36b1bd9f56 +Subproject c76c6c52249118e49c5dd1b99fe916724350a59 diff --git a/pandoc.cabal b/pandoc.cabal index 0e805d724..20e06121b 100644 --- a/pandoc.cabal +++ b/pandoc.cabal @@ -1,5 +1,5 @@ Name: pandoc -Version: 1.13.1 +Version: 1.13.2 Cabal-Version: >= 1.10 Build-Type: Custom License: GPL @@ -16,14 +16,15 @@ Synopsis: Conversion between markup formats Description: Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library. It can read markdown and (subsets of) HTML, - reStructuredText, LaTeX, DocBook, MediaWiki markup, Haddock - markup, OPML, Emacs Org-Mode, txt2tags and Textile, and it can write - markdown, reStructuredText, HTML, LaTeX, ConTeXt, Docbook, - OPML, OpenDocument, ODT, Word docx, RTF, MediaWiki, DokuWiki, - Textile, groff man pages, plain text, Emacs Org-Mode, AsciiDoc, - Haddock markup, EPUB (v2 and v3), FictionBook2, - InDesign ICML, and several kinds of HTML/javascript - slide shows (S5, Slidy, Slideous, DZSlides, reveal.js). + reStructuredText, LaTeX, DocBook, MediaWiki markup, TWiki + markup, Haddock markup, OPML, Emacs Org-Mode, txt2tags and + Textile, and it can write markdown, reStructuredText, XHTML, + HTML 5, LaTeX, ConTeXt, DocBook, OPML, OpenDocument, ODT, + Word docx, RTF, MediaWiki, DokuWiki, Textile, groff man + pages, plain text, Emacs Org-Mode, AsciiDoc, Haddock markup, + EPUB (v2 and v3), FictionBook2, InDesign ICML, and several + kinds of HTML/javascript slide shows (S5, Slidy, Slideous, + DZSlides, reveal.js). . Pandoc extends standard markdown syntax with footnotes, embedded LaTeX, definition lists, tables, and other @@ -182,6 +183,7 @@ Extra-Source-Files: tests/epub/*.epub tests/epub/*.native tests/txt2tags.t2t + tests/twiki-reader.twiki Source-repository head type: git @@ -247,7 +249,7 @@ Library haddock-library >= 1.1 && < 1.2, old-time, deepseq-generics >= 0.1 && < 0.2, - JuicyPixels >= 3.1.6.1 && < 3.2 + JuicyPixels >= 3.1.6.1 && < 3.3 if flag(network-uri) Build-Depends: network-uri >= 2.6 && < 2.7, network >= 2.6 else @@ -289,6 +291,7 @@ Library Text.Pandoc.Readers.Textile, Text.Pandoc.Readers.Native, Text.Pandoc.Readers.Haddock, + Text.Pandoc.Readers.TWiki, Text.Pandoc.Readers.Docx, Text.Pandoc.Readers.EPUB, Text.Pandoc.Writers.Native, @@ -357,7 +360,7 @@ Executable pandoc containers >= 0.1 && < 0.6, HTTP >= 4000.0.5 && < 4000.3 if flag(network-uri) - Build-Depends: network-uri >= 2.6 && < 2.7 + Build-Depends: network-uri >= 2.6 && < 2.7, network >= 2.6 else Build-Depends: network >= 2 && < 2.6 Ghc-Options: -rtsopts -with-rtsopts=-K16m -Wall -fno-warn-unused-do-bind @@ -58,7 +58,7 @@ import qualified Control.Exception as E import Control.Exception.Extensible ( throwIO ) import qualified Text.Pandoc.UTF8 as UTF8 import Control.Monad (when, unless, (>=>)) -import Data.Maybe (isJust) +import Data.Maybe (isJust, fromMaybe) import Data.Foldable (foldrM) import Network.URI (parseURI, isURI, URI(..)) import qualified Data.ByteString.Lazy as B @@ -68,7 +68,7 @@ import qualified Data.Map as M import Data.Yaml (decode) import qualified Data.Yaml as Yaml import qualified Data.Text as T -import Control.Applicative ((<$>)) +import Control.Applicative ((<$>), (<|>)) import Text.Pandoc.Readers.Txt2Tags (getT2TMeta) import Data.Monoid @@ -205,6 +205,8 @@ data Opt = Opt , optExtractMedia :: Maybe FilePath -- ^ Path to extract embedded media , optTrace :: Bool -- ^ Print debug information , optTrackChanges :: TrackChanges -- ^ Accept or reject MS Word track-changes. + , optKaTeXStylesheet :: Maybe String -- ^ Path to stylesheet for KaTeX + , optKaTeXJS :: Maybe String -- ^ Path to js file for KaTeX } -- | Defaults for command-line options. @@ -263,6 +265,8 @@ defaultOpts = Opt , optExtractMedia = Nothing , optTrace = False , optTrackChanges = AcceptChanges + , optKaTeXStylesheet = Nothing + , optKaTeXJS = Nothing } -- | A list of functions, each transforming the options data structure @@ -818,6 +822,21 @@ options = return opt { optHTMLMathMethod = MathJax url'}) "URL") "" -- "Use MathJax for HTML math" + , Option "" ["katex"] + (OptArg + (\arg opt -> + return opt + { optKaTeXJS = + arg <|> Just "http://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.1.0/katex.min.js"}) + "URL") + "" -- Use KaTeX for HTML Math + + , Option "" ["katex-stylesheet"] + (ReqArg + (\arg opt -> + return opt { optKaTeXStylesheet = Just arg }) + "URL") + "" -- Set the KaTeX Stylesheet location , Option "" ["gladtex"] (NoArg @@ -860,6 +879,7 @@ options = ] + addMetadata :: String -> MetaValue -> M.Map String MetaValue -> M.Map String MetaValue addMetadata k v m = case M.lookup k m of @@ -910,6 +930,9 @@ defaultReaderName fallback (x:xs) = ".docx" -> "docx" ".t2t" -> "t2t" ".epub" -> "epub" + ".odt" -> "odt" -- so we get an "unknown reader" error + ".pdf" -> "pdf" -- so we get an "unknown reader" error + ".doc" -> "doc" -- so we get an "unknown reader" error _ -> defaultReaderName fallback xs -- Returns True if extension of first source is .lhs @@ -950,6 +973,7 @@ defaultWriterName x = ".pdf" -> "latex" ".fb2" -> "fb2" ".opml" -> "opml" + ".icml" -> "icml" ['.',y] | y `elem` ['1'..'9'] -> "man" _ -> "html" @@ -1027,7 +1051,7 @@ main = do , optHighlight = highlight , optHighlightStyle = highlightStyle , optChapters = chapters - , optHTMLMathMethod = mathMethod + , optHTMLMathMethod = mathMethod' , optReferenceODT = referenceODT , optReferenceDocx = referenceDocx , optEpubStylesheet = epubStylesheet @@ -1056,6 +1080,8 @@ main = do , optExtractMedia = mbExtractMedia , optTrace = trace , optTrackChanges = trackChanges + , optKaTeXStylesheet = katexStylesheet + , optKaTeXJS = katexJS } = opts when dumpArgs $ @@ -1063,6 +1089,13 @@ main = do mapM_ (\arg -> UTF8.hPutStrLn stdout arg) args exitWith ExitSuccess + let csscdn = "http://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.1.0/katex.min.css" + let mathMethod = + case (katexJS, katexStylesheet) of + (Nothing, _) -> mathMethod' + (Just js, ss) -> KaTeX js (fromMaybe csscdn ss) + + -- --bibliography implies -F pandoc-citeproc for backwards compatibility: let needsCiteproc = isJust (M.lookup "bibliography" metadata) && optCiteMethod opts `notElem` [Natbib, Biblatex] && @@ -1119,7 +1152,15 @@ main = do (getT2TMeta sources outputFile) else case getReader readerName' of Right r -> return r - Left e -> err 7 e + Left e -> err 7 e' + where e' = case readerName' of + "odt" -> e ++ + "\nPandoc can convert to ODT, but not from ODT.\nTry using LibreOffice to export as HTML, and convert that with pandoc." + "pdf" -> e ++ + "\nPandoc can convert to PDF, but not from PDF." + "doc" -> e ++ + "\nPandoc can convert from DOCX, but not from DOC.\nTry using Word to save your DOC file as DOCX, and convert that with pandoc." + _ -> e let standalone' = standalone || not (isTextFormat writerName') || pdfOutput diff --git a/src/Text/Pandoc.hs b/src/Text/Pandoc.hs index fd849316b..1f22122ac 100644 --- a/src/Text/Pandoc.hs +++ b/src/Text/Pandoc.hs @@ -77,6 +77,7 @@ module Text.Pandoc , readHaddock , readNative , readJSON + , readTWiki , readTxt2Tags , readTxt2TagsNoMacros , readEPUB @@ -133,6 +134,7 @@ import Text.Pandoc.Readers.HTML import Text.Pandoc.Readers.Textile import Text.Pandoc.Readers.Native import Text.Pandoc.Readers.Haddock +import Text.Pandoc.Readers.TWiki import Text.Pandoc.Readers.Docx import Text.Pandoc.Readers.Txt2Tags import Text.Pandoc.Readers.EPUB @@ -233,6 +235,7 @@ readers = [ ("native" , StringReader $ \_ s -> return $ readNative s) ,("html" , mkStringReader readHtml) ,("latex" , mkStringReader readLaTeX) ,("haddock" , mkStringReader readHaddock) + ,("twiki" , mkStringReader readTWiki) ,("docx" , mkBSReader readDocx) ,("t2t" , mkStringReader readTxt2TagsNoMacros) ,("epub" , mkBSReader readEPUB) diff --git a/src/Text/Pandoc/MIME.hs b/src/Text/Pandoc/MIME.hs index 3b3b3b5b3..2fdba93e0 100644 --- a/src/Text/Pandoc/MIME.hs +++ b/src/Text/Pandoc/MIME.hs @@ -328,7 +328,7 @@ mimeTypesList = -- List borrowed from happstack-server. ,("oth","application/vnd.oasis.opendocument.text-web") ,("otp","application/vnd.oasis.opendocument.presentation-template") ,("ots","application/vnd.oasis.opendocument.spreadsheet-template") - ,("otf","application/x-font-opentype") + ,("otf","application/vnd.ms-opentype") ,("ott","application/vnd.oasis.opendocument.text-template") ,("oza","application/x-oz-application") ,("p","text/x-pascal") @@ -477,6 +477,7 @@ mimeTypesList = -- List borrowed from happstack-server. ,("vrml","model/vrml") ,("vs","text/plain") ,("vsd","application/vnd.visio") + ,("vtt","text/vtt") ,("wad","application/x-doom") ,("wav","audio/x-wav") ,("wax","audio/x-ms-wax") diff --git a/src/Text/Pandoc/MediaBag.hs b/src/Text/Pandoc/MediaBag.hs index 5921b56cf..a55d5417e 100644 --- a/src/Text/Pandoc/MediaBag.hs +++ b/src/Text/Pandoc/MediaBag.hs @@ -51,7 +51,7 @@ import System.IO (stderr) -- mime types. Note that a 'MediaBag' is a Monoid, so 'mempty' -- can be used for an empty 'MediaBag', and '<>' can be used to append -- two 'MediaBag's. -newtype MediaBag = MediaBag (M.Map String (MimeType, BL.ByteString)) +newtype MediaBag = MediaBag (M.Map [String] (MimeType, BL.ByteString)) deriving (Monoid) instance Show MediaBag where @@ -65,7 +65,7 @@ insertMedia :: FilePath -- ^ relative path and canonical name of resource -> MediaBag -> MediaBag insertMedia fp mbMime contents (MediaBag mediamap) = - MediaBag (M.insert fp (mime, contents) mediamap) + MediaBag (M.insert (splitPath fp) (mime, contents) mediamap) where mime = fromMaybe fallback mbMime fallback = case takeExtension fp of ".gz" -> getMimeTypeDef $ dropExtension fp @@ -75,14 +75,14 @@ insertMedia fp mbMime contents (MediaBag mediamap) = lookupMedia :: FilePath -> MediaBag -> Maybe (MimeType, BL.ByteString) -lookupMedia fp (MediaBag mediamap) = M.lookup fp mediamap +lookupMedia fp (MediaBag mediamap) = M.lookup (splitPath fp) mediamap -- | Get a list of the file paths stored in a 'MediaBag', with -- their corresponding mime types and the lengths in bytes of the contents. mediaDirectory :: MediaBag -> [(String, MimeType, Int)] mediaDirectory (MediaBag mediamap) = M.foldWithKey (\fp (mime,contents) -> - ((fp, mime, fromIntegral $ BL.length contents):)) [] mediamap + (((joinPath fp), mime, fromIntegral $ BL.length contents):)) [] mediamap -- | Extract contents of MediaBag to a given directory. Print informational -- messages if 'verbose' is true. @@ -93,7 +93,7 @@ extractMediaBag :: Bool extractMediaBag verbose dir (MediaBag mediamap) = do sequence_ $ M.foldWithKey (\fp (_ ,contents) -> - ((writeMedia verbose dir (fp, contents)):)) [] mediamap + ((writeMedia verbose dir (joinPath fp, contents)):)) [] mediamap writeMedia :: Bool -> FilePath -> (FilePath, BL.ByteString) -> IO () writeMedia verbose dir (subpath, bs) = do diff --git a/src/Text/Pandoc/Options.hs b/src/Text/Pandoc/Options.hs index 84ccbbdc9..ebfd8f8a9 100644 --- a/src/Text/Pandoc/Options.hs +++ b/src/Text/Pandoc/Options.hs @@ -251,6 +251,7 @@ data HTMLMathMethod = PlainMath | WebTeX String -- url of TeX->image script. | MathML (Maybe String) -- url of MathMLinHTML.js | MathJax String -- url of MathJax.js + | KaTeX String String -- url of stylesheet and katex.js deriving (Show, Read, Eq) data CiteMethod = Citeproc -- use citeproc to render them diff --git a/src/Text/Pandoc/Parsing.hs b/src/Text/Pandoc/Parsing.hs index d1fba1e21..e0f5f65bb 100644 --- a/src/Text/Pandoc/Parsing.hs +++ b/src/Text/Pandoc/Parsing.hs @@ -472,7 +472,12 @@ mathInlineWith op cl = try $ do string op notFollowedBy space words' <- many1Till (count 1 (noneOf " \t\n\\") - <|> (char '\\' >> anyChar >>= \c -> return ['\\',c]) + <|> (char '\\' >> + -- This next clause is needed because \text{..} can + -- contain $, \(\), etc. + (try (string "text" >> + (("\\text" ++) <$> inBalancedBraces 0 "")) + <|> (\c -> ['\\',c]) <$> anyChar)) <|> do (blankline <* notFollowedBy' blankline) <|> (oneOf " \t" <* skipMany (oneOf " \t")) notFollowedBy (char '$') @@ -480,6 +485,23 @@ mathInlineWith op cl = try $ do ) (try $ string cl) notFollowedBy digit -- to prevent capture of $5 return $ concat words' + where + inBalancedBraces :: Stream s m Char => Int -> String -> ParserT s st m String + inBalancedBraces 0 "" = do + c <- anyChar + if c == '{' + then inBalancedBraces 1 "{" + else mzero + inBalancedBraces 0 s = return $ reverse s + inBalancedBraces numOpen ('\\':xs) = do + c <- anyChar + inBalancedBraces numOpen (c:'\\':xs) + inBalancedBraces numOpen xs = do + c <- anyChar + case c of + '}' -> inBalancedBraces (numOpen - 1) (c:xs) + '{' -> inBalancedBraces (numOpen + 1) (c:xs) + _ -> inBalancedBraces numOpen (c:xs) mathDisplayWith :: Stream s m Char => String -> String -> ParserT s st m String mathDisplayWith op cl = try $ do diff --git a/src/Text/Pandoc/Pretty.hs b/src/Text/Pandoc/Pretty.hs index 1e72c2040..2f2656086 100644 --- a/src/Text/Pandoc/Pretty.hs +++ b/src/Text/Pandoc/Pretty.hs @@ -286,6 +286,9 @@ renderList (BlankLines num : xs) = do | otherwise -> replicateM_ (1 + num - newlines st) (outp (-1) "\n") renderList xs +renderList (CarriageReturn : BlankLines m : xs) = + renderList (BlankLines m : xs) + renderList (CarriageReturn : xs) = do st <- get if newlines st > 0 || null xs @@ -531,4 +534,4 @@ charWidth c = -- | Get real length of string, taking into account combining and double-wide -- characters. realLength :: String -> Int -realLength = sum . map charWidth +realLength = foldr (\a b -> charWidth a + b) 0 diff --git a/src/Text/Pandoc/Readers/Docx.hs b/src/Text/Pandoc/Readers/Docx.hs index 4b5fbfdfc..64eb0322f 100644 --- a/src/Text/Pandoc/Readers/Docx.hs +++ b/src/Text/Pandoc/Readers/Docx.hs @@ -84,8 +84,7 @@ import Text.Pandoc.Readers.Docx.Lists import Text.Pandoc.Readers.Docx.Reducible import Text.Pandoc.Shared import Text.Pandoc.MediaBag (insertMedia, MediaBag) -import Data.Maybe (isJust) -import Data.List (delete, stripPrefix, (\\), intersect, isPrefixOf) +import Data.List (delete, (\\), intersect) import Data.Monoid import Text.TeXMath (writeTeX) import Data.Default (Default) @@ -197,19 +196,9 @@ fixAuthors mv = mv codeStyles :: [String] codeStyles = ["VerbatimChar"] -blockQuoteDivs :: [String] -blockQuoteDivs = ["Quote", "BlockQuote", "BlockQuotation"] - codeDivs :: [String] codeDivs = ["SourceCode"] - --- For the moment, we have English, Danish, German, and French. This --- is fairly ad-hoc, and there might be a more systematic way to do --- it, but it's better than nothing. -headerPrefixes :: [String] -headerPrefixes = ["Heading", "Overskrift", "berschrift", "Titre"] - runElemToInlines :: RunElem -> Inlines runElemToInlines (TextRun s) = text s runElemToInlines (LnBrk) = linebreak @@ -434,9 +423,9 @@ parStyleToTransform pPr let pPr' = pPr { pStyle = cs, indentation = Nothing} in (divWith ("", [c], [])) . (parStyleToTransform pPr') - | (c:cs) <- pStyle pPr - , c `elem` blockQuoteDivs = - let pPr' = pPr { pStyle = cs \\ blockQuoteDivs } + | (_:cs) <- pStyle pPr + , Just True <- pBlockQuote pPr = + let pPr' = pPr { pStyle = cs } in blockQuote . (parStyleToTransform pPr') | (_:cs) <- pStyle pPr = @@ -467,12 +456,11 @@ bodyPartToBlocks (Paragraph pPr parparts) $ parStyleToTransform pPr $ codeBlock $ concatMap parPartToString parparts - | (c : cs) <- filter (isJust . isHeaderClass) $ pStyle pPr - , Just (prefix, n) <- isHeaderClass c = do + | Just (style, n) <- pHeading pPr = do ils <- local (\s-> s{docxInHeaderBlock=True}) $ (concatReduce <$> mapM parPartToInlines parparts) makeHeaderAnchor $ - headerWith ("", delete (prefix ++ show n) cs, []) n ils + headerWith ("", delete style (pStyle pPr), []) n ils | otherwise = do ils <- concatReduce <$> mapM parPartToInlines parparts >>= (return . fromList . trimLineBreaks . normalizeSpaces . toList) @@ -559,12 +547,3 @@ docxToOutput :: ReaderOptions -> Docx -> (Meta, [Block], MediaBag) docxToOutput opts (Docx (Document _ body)) = let dEnv = def { docxOptions = opts} in evalDocxContext (bodyToOutput body) dEnv def - -isHeaderClass :: String -> Maybe (String, Int) -isHeaderClass s | (pref:_) <- filter (\h -> isPrefixOf h s) headerPrefixes - , Just s' <- stripPrefix pref s = - case reads s' :: [(Int, String)] of - [] -> Nothing - ((n, "") : []) -> Just (pref, n) - _ -> Nothing -isHeaderClass _ = Nothing diff --git a/src/Text/Pandoc/Readers/Docx/Parse.hs b/src/Text/Pandoc/Readers/Docx/Parse.hs index e7a6c3ffb..5fd6b7a81 100644 --- a/src/Text/Pandoc/Readers/Docx/Parse.hs +++ b/src/Text/Pandoc/Readers/Docx/Parse.hs @@ -1,4 +1,4 @@ -{-# LANGUAGE PatternGuards, ViewPatterns #-} +{-# LANGUAGE PatternGuards, ViewPatterns, FlexibleInstances #-} {- Copyright (C) 2014 Jesse Rosenthal <jrosenthal@jhu.edu> @@ -65,7 +65,7 @@ import Text.Pandoc.Compat.Except import Text.TeXMath.Readers.OMML (readOMML) import Text.Pandoc.Readers.Docx.Fonts (getUnicode, Font(..)) import Text.TeXMath (Exp) -import Data.Char (readLitChar, ord, chr) +import Data.Char (readLitChar, ord, chr, isDigit) data ReaderEnv = ReaderEnv { envNotes :: Notes , envNumbering :: Numbering @@ -73,6 +73,7 @@ data ReaderEnv = ReaderEnv { envNotes :: Notes , envMedia :: Media , envFont :: Maybe Font , envCharStyles :: CharStyleMap + , envParStyles :: ParStyleMap } deriving Show @@ -122,8 +123,12 @@ type Media = [(FilePath, B.ByteString)] type CharStyle = (String, RunStyle) +type ParStyle = (String, ParStyleData) + type CharStyleMap = M.Map String RunStyle +type ParStyleMap = M.Map String ParStyleData + data Numbering = Numbering NameSpaces [Numb] [AbstractNumb] deriving Show @@ -152,6 +157,8 @@ data ParIndentation = ParIndentation { leftParIndent :: Maybe Integer data ParagraphStyle = ParagraphStyle { pStyle :: [String] , indentation :: Maybe ParIndentation , dropCap :: Bool + , pHeading :: Maybe (String, Int) + , pBlockQuote :: Maybe Bool } deriving Show @@ -159,6 +166,8 @@ defaultParagraphStyle :: ParagraphStyle defaultParagraphStyle = ParagraphStyle { pStyle = [] , indentation = Nothing , dropCap = False + , pHeading = Nothing + , pBlockQuote = Nothing } @@ -213,6 +222,11 @@ data RunStyle = RunStyle { isBold :: Maybe Bool , rStyle :: Maybe CharStyle} deriving Show +data ParStyleData = ParStyleData { headingLev :: Maybe (String, Int) + , isBlockQuote :: Maybe Bool + , psStyle :: Maybe ParStyle} + deriving Show + defaultRunStyle :: RunStyle defaultRunStyle = RunStyle { isBold = Nothing , isItalic = Nothing @@ -242,8 +256,8 @@ archiveToDocx archive = do numbering = archiveToNumbering archive rels = archiveToRelationships archive media = archiveToMedia archive - styles = archiveToStyles archive - rEnv = ReaderEnv notes numbering rels media Nothing styles + (styles, parstyles) = archiveToStyles archive + rEnv = ReaderEnv notes numbering rels media Nothing styles parstyles doc <- runD (archiveToDocument archive) rEnv return $ Docx doc @@ -263,47 +277,69 @@ elemToBody ns element | isElem ns "w" "body" element = (\bps -> return $ Body bps) elemToBody _ _ = throwError WrongElem -archiveToStyles :: Archive -> CharStyleMap +archiveToStyles :: Archive -> (CharStyleMap, ParStyleMap) archiveToStyles zf = let stylesElem = findEntryByPath "word/styles.xml" zf >>= (parseXMLDoc . UTF8.toStringLazy . fromEntry) in case stylesElem of - Nothing -> M.empty + Nothing -> (M.empty, M.empty) Just styElem -> let namespaces = mapMaybe attrToNSPair (elAttribs styElem) in - M.fromList $ buildBasedOnList namespaces styElem Nothing + ( M.fromList $ buildBasedOnList namespaces styElem + (Nothing :: Maybe CharStyle), + M.fromList $ buildBasedOnList namespaces styElem + (Nothing :: Maybe ParStyle) ) -isBasedOnStyle :: NameSpaces -> Element -> Maybe CharStyle -> Bool +isBasedOnStyle :: (ElemToStyle a) => NameSpaces -> Element -> Maybe a -> Bool isBasedOnStyle ns element parentStyle | isElem ns "w" "style" element - , Just "character" <- findAttr (elemName ns "w" "type") element + , Just styleType <- findAttr (elemName ns "w" "type") element + , styleType == cStyleType parentStyle , Just basedOnVal <- findChild (elemName ns "w" "basedOn") element >>= findAttr (elemName ns "w" "val") - , Just (parentId, _) <- parentStyle = (basedOnVal == parentId) + , Just ps <- parentStyle = (basedOnVal == getStyleId ps) | isElem ns "w" "style" element - , Just "character" <- findAttr (elemName ns "w" "type") element + , Just styleType <- findAttr (elemName ns "w" "type") element + , styleType == cStyleType parentStyle , Nothing <- findChild (elemName ns "w" "basedOn") element , Nothing <- parentStyle = True | otherwise = False -elemToCharStyle :: NameSpaces -> Element -> Maybe CharStyle -> Maybe CharStyle -elemToCharStyle ns element parentStyle - | isElem ns "w" "style" element - , Just "character" <- findAttr (elemName ns "w" "type") element - , Just styleId <- findAttr (elemName ns "w" "styleId") element = - Just (styleId, elemToRunStyle ns element parentStyle) - | otherwise = Nothing - -getStyleChildren :: NameSpaces -> Element -> Maybe CharStyle -> [CharStyle] +class ElemToStyle a where + cStyleType :: Maybe a -> String + elemToStyle :: NameSpaces -> Element -> Maybe a -> Maybe a + getStyleId :: a -> String + +instance ElemToStyle CharStyle where + cStyleType _ = "character" + elemToStyle ns element parentStyle + | isElem ns "w" "style" element + , Just "character" <- findAttr (elemName ns "w" "type") element + , Just styleId <- findAttr (elemName ns "w" "styleId") element = + Just (styleId, elemToRunStyle ns element parentStyle) + | otherwise = Nothing + getStyleId s = fst s + +instance ElemToStyle ParStyle where + cStyleType _ = "paragraph" + elemToStyle ns element parentStyle + | isElem ns "w" "style" element + , Just "paragraph" <- findAttr (elemName ns "w" "type") element + , Just styleId <- findAttr (elemName ns "w" "styleId") element = + Just (styleId, elemToParStyleData ns element parentStyle) + | otherwise = Nothing + getStyleId s = fst s + +getStyleChildren :: (ElemToStyle a) => NameSpaces -> Element -> Maybe a -> [a] getStyleChildren ns element parentStyle | isElem ns "w" "styles" element = - mapMaybe (\e -> elemToCharStyle ns e parentStyle) $ + mapMaybe (\e -> elemToStyle ns e parentStyle) $ filterChildren (\e' -> isBasedOnStyle ns e' parentStyle) element | otherwise = [] -buildBasedOnList :: NameSpaces -> Element -> Maybe CharStyle -> [CharStyle] +buildBasedOnList :: (ElemToStyle a) => NameSpaces -> Element -> Maybe a -> [a] buildBasedOnList ns element rootStyle = case (getStyleChildren ns element rootStyle) of [] -> [] @@ -543,7 +579,8 @@ elemToBodyPart ns element elemToBodyPart ns element | isElem ns "w" "p" element , Just (numId, lvl) <- elemToNumInfo ns element = do - let parstyle = elemToParagraphStyle ns element + sty <- asks envParStyles + let parstyle = elemToParagraphStyle ns element sty parparts <- mapD (elemToParPart ns) (elChildren element) num <- asks envNumbering case lookupLevel numId lvl num of @@ -551,7 +588,8 @@ elemToBodyPart ns element Nothing -> throwError WrongElem elemToBodyPart ns element | isElem ns "w" "p" element = do - let parstyle = elemToParagraphStyle ns element + sty <- asks envParStyles + let parstyle = elemToParagraphStyle ns element sty parparts <- mapD (elemToParPart ns) (elChildren element) return $ Paragraph parstyle parparts elemToBodyPart ns element @@ -584,7 +622,7 @@ expandDrawingId s = do target <- asks (lookupRelationship s . envRelationships) case target of Just filepath -> do - bytes <- asks (lookup (combine "word" filepath) . envMedia) + bytes <- asks (lookup ("word/" ++ filepath) . envMedia) case bytes of Just bs -> return (filepath, bs) Nothing -> throwError DocxError @@ -625,17 +663,20 @@ elemToParPart ns element return $ BookMark bmId bmName elemToParPart ns element | isElem ns "w" "hyperlink" element - , Just anchor <- findAttr (elemName ns "w" "anchor") element = do + , Just relId <- findAttr (elemName ns "r" "id") element = do runs <- mapD (elemToRun ns) (elChildren element) - return $ InternalHyperLink anchor runs + rels <- asks envRelationships + case lookupRelationship relId rels of + Just target -> do + case findAttr (elemName ns "w" "anchor") element of + Just anchor -> return $ ExternalHyperLink (target ++ '#':anchor) runs + Nothing -> return $ ExternalHyperLink target runs + Nothing -> return $ ExternalHyperLink "" runs elemToParPart ns element | isElem ns "w" "hyperlink" element - , Just relId <- findAttr (elemName ns "r" "id") element = do + , Just anchor <- findAttr (elemName ns "w" "anchor") element = do runs <- mapD (elemToRun ns) (elChildren element) - rels <- asks envRelationships - return $ case lookupRelationship relId rels of - Just target -> ExternalHyperLink target runs - Nothing -> ExternalHyperLink "" runs + return $ InternalHyperLink anchor runs elemToParPart ns element | isElem ns "m" "oMath" element = (eitherToD $ readOMML $ showElement element) >>= (return . PlainOMath) @@ -684,14 +725,30 @@ elemToRun ns element return $ Run runStyle runElems elemToRun _ _ = throwError WrongElem -elemToParagraphStyle :: NameSpaces -> Element -> ParagraphStyle -elemToParagraphStyle ns element +getParentStyleValue :: (ParStyleData -> Maybe a) -> ParStyleData -> Maybe a +getParentStyleValue field style + | Just value <- field style = Just value + | Just parentStyle <- psStyle style + = getParentStyleValue field (snd parentStyle) +getParentStyleValue _ _ = Nothing + +getParStyleField :: (ParStyleData -> Maybe a) -> ParStyleMap -> [String] -> + Maybe a +getParStyleField field stylemap styles + | x <- mapMaybe (\x -> M.lookup x stylemap) styles + , (y:_) <- mapMaybe (getParentStyleValue field) x + = Just y +getParStyleField _ _ _ = Nothing + +elemToParagraphStyle :: NameSpaces -> Element -> ParStyleMap -> ParagraphStyle +elemToParagraphStyle ns element sty | Just pPr <- findChild (elemName ns "w" "pPr") element = - ParagraphStyle - {pStyle = + let style = mapMaybe (findAttr (elemName ns "w" "val")) (findChildren (elemName ns "w" "pStyle") pPr) + in ParagraphStyle + {pStyle = style , indentation = findChild (elemName ns "w" "ind") pPr >>= elemToParIndentation ns @@ -703,8 +760,10 @@ elemToParagraphStyle ns element Just "none" -> False Just _ -> True Nothing -> False + , pHeading = getParStyleField headingLev sty style + , pBlockQuote = getParStyleField isBlockQuote sty style } -elemToParagraphStyle _ _ = defaultParagraphStyle +elemToParagraphStyle _ _ _ = defaultParagraphStyle checkOnOff :: NameSpaces -> Element -> QName -> Maybe Bool checkOnOff ns rPr tag @@ -758,6 +817,45 @@ elemToRunStyle ns element parentStyle } elemToRunStyle _ _ _ = defaultRunStyle +isNumericNotNull :: String -> Bool +isNumericNotNull str = (str /= []) && (all isDigit str) + +getHeaderLevel :: NameSpaces -> Element -> Maybe (String,Int) +getHeaderLevel ns element + | Just styleId <- findAttr (elemName ns "w" "styleId") element + , Just index <- stripPrefix "Heading" styleId + , isNumericNotNull index = Just (styleId, read index) + | Just styleId <- findAttr (elemName ns "w" "styleId") element + , Just index <- findChild (elemName ns "w" "name") element >>= + findAttr (elemName ns "w" "val") >>= + stripPrefix "heading " + , isNumericNotNull index = Just (styleId, read index) +getHeaderLevel _ _ = Nothing + +blockQuoteStyleIds :: [String] +blockQuoteStyleIds = ["Quote", "BlockQuote", "BlockQuotation"] + +blockQuoteStyleNames :: [String] +blockQuoteStyleNames = ["Quote", "Block Text"] + +getBlockQuote :: NameSpaces -> Element -> Maybe Bool +getBlockQuote ns element + | Just styleId <- findAttr (elemName ns "w" "styleId") element + , styleId `elem` blockQuoteStyleIds = Just True + | Just styleName <- findChild (elemName ns "w" "name") element >>= + findAttr (elemName ns "w" "val") + , styleName `elem` blockQuoteStyleNames = Just True +getBlockQuote _ _ = Nothing + +elemToParStyleData :: NameSpaces -> Element -> Maybe ParStyle -> ParStyleData +elemToParStyleData ns element parentStyle = + ParStyleData + { + headingLev = getHeaderLevel ns element + , isBlockQuote = getBlockQuote ns element + , psStyle = parentStyle + } + elemToRunElem :: NameSpaces -> Element -> D RunElem elemToRunElem ns element | isElem ns "w" "t" element diff --git a/src/Text/Pandoc/Readers/HTML.hs b/src/Text/Pandoc/Readers/HTML.hs index 4ea5f41d5..2a23f2a62 100644 --- a/src/Text/Pandoc/Readers/HTML.hs +++ b/src/Text/Pandoc/Readers/HTML.hs @@ -440,7 +440,7 @@ pCodeBlock :: TagParser Blocks pCodeBlock = try $ do TagOpen _ attr <- pSatisfy (~== TagOpen "pre" []) contents <- manyTill pAnyTag (pCloses "pre" <|> eof) - let rawText = concatMap fromTagText $ filter isTagText contents + let rawText = concatMap tagToString contents -- drop leading newline if any let result' = case rawText of '\n':xs -> xs @@ -451,6 +451,11 @@ pCodeBlock = try $ do _ -> result' return $ B.codeBlockWith (mkAttr attr) result +tagToString :: Tag String -> String +tagToString (TagText s) = s +tagToString (TagOpen "br" _) = "\n" +tagToString _ = "" + inline :: TagParser Inlines inline = choice [ eNoteref @@ -735,7 +740,7 @@ pSpace = many1 (satisfy isSpace) >> return B.space -- eitherBlockOrInline :: [String] -eitherBlockOrInline = ["audio", "applet", "button", "iframe", +eitherBlockOrInline = ["audio", "applet", "button", "iframe", "embed", "del", "ins", "progress", "map", "area", "noscript", "script", "object", "svg", "video", "source"] @@ -753,7 +758,7 @@ blockHtmlTags :: [String] blockHtmlTags = ["?xml", "!DOCTYPE", "address", "article", "aside", "blockquote", "body", "button", "canvas", "caption", "center", "col", "colgroup", "dd", "dir", "div", - "dl", "dt", "embed", "fieldset", "figcaption", "figure", + "dl", "dt", "fieldset", "figcaption", "figure", "footer", "form", "h1", "h2", "h3", "h4", "h5", "h6", "head", "header", "hgroup", "hr", "html", "isindex", "menu", "noframes", "ol", "output", "p", "pre", diff --git a/src/Text/Pandoc/Readers/LaTeX.hs b/src/Text/Pandoc/Readers/LaTeX.hs index 9f51e9a8f..9420d602f 100644 --- a/src/Text/Pandoc/Readers/LaTeX.hs +++ b/src/Text/Pandoc/Readers/LaTeX.hs @@ -494,6 +494,7 @@ inlineCommands = M.fromList $ , ("citealp", citation "citealp" NormalCitation False) , ("citealp*", citation "citealp*" NormalCitation False) , ("autocite", citation "autocite" NormalCitation False) + , ("smartcite", citation "smartcite" NormalCitation False) , ("footcite", inNote <$> citation "footcite" NormalCitation False) , ("parencite", citation "parencite" NormalCitation False) , ("supercite", citation "supercite" NormalCitation False) @@ -516,6 +517,7 @@ inlineCommands = M.fromList $ , ("supercites", citation "supercites" NormalCitation True) , ("footcitetexts", inNote <$> citation "footcitetexts" NormalCitation True) , ("Autocite", citation "Autocite" NormalCitation False) + , ("Smartcite", citation "Smartcite" NormalCitation False) , ("Footcite", citation "Footcite" NormalCitation False) , ("Parencite", citation "Parencite" NormalCitation False) , ("Supercite", citation "Supercite" NormalCitation False) diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs index 02a787670..b8487b4e6 100644 --- a/src/Text/Pandoc/Readers/Markdown.hs +++ b/src/Text/Pandoc/Readers/Markdown.hs @@ -117,6 +117,12 @@ isBlank _ = False -- auxiliary functions -- +-- | Succeeds when we're in list context. +inList :: MarkdownParser () +inList = do + ctx <- stateParserContext <$> getState + guard (ctx == ListItemState) + isNull :: F Inlines -> Bool isNull ils = B.isNull $ runF ils def @@ -735,9 +741,9 @@ anyOrderedListStart = try $ do skipNonindentSpaces notFollowedBy $ string "p." >> spaceChar >> digit -- page number res <- do guardDisabled Ext_fancy_lists - many1 digit + start <- many1 digit >>= safeRead char '.' - return (1, DefaultStyle, DefaultDelim) + return (start, DefaultStyle, DefaultDelim) <|> do (num, style, delim) <- anyOrderedListMarker -- if it could be an abbreviated first name, -- insist on more than one space @@ -926,6 +932,8 @@ para = try $ do <|> (guardEnabled Ext_backtick_code_blocks >> () <$ lookAhead codeBlockFenced) <|> (guardDisabled Ext_blank_before_header >> () <$ lookAhead header) <|> (guardEnabled Ext_lists_without_preceding_blankline >> + -- Avoid creating a paragraph in a nested list. + notFollowedBy' inList >> () <$ lookAhead listStart) <|> do guardEnabled Ext_native_divs inHtmlBlock <- stateInHtmlBlock <$> getState @@ -1610,8 +1618,7 @@ endline = try $ do newline notFollowedBy blankline -- parse potential list-starts differently if in a list: - st <- getState - when (stateParserContext st == ListItemState) $ notFollowedBy listStart + notFollowedBy (inList >> listStart) guardDisabled Ext_lists_without_preceding_blankline <|> notFollowedBy listStart guardEnabled Ext_blank_before_blockquote <|> notFollowedBy emailBlockQuoteStart guardEnabled Ext_blank_before_header <|> notFollowedBy (char '#') -- atx header diff --git a/src/Text/Pandoc/Readers/Org.hs b/src/Text/Pandoc/Readers/Org.hs index b07f96846..579e38a38 100644 --- a/src/Text/Pandoc/Readers/Org.hs +++ b/src/Text/Pandoc/Readers/Org.hs @@ -42,6 +42,7 @@ import Text.Pandoc.Parsing hiding ( F, unF, askF, asksF, runF import Text.Pandoc.Readers.LaTeX (inlineCommand, rawLaTeXInline) import Text.Pandoc.Shared (compactify', compactify'DL) import Text.TeXMath (readTeX, writePandoc, DisplayType(..)) +import qualified Text.TeXMath.Readers.MathML.EntityMap as MathMLEntityMap import Control.Applicative ( Applicative, pure , (<$>), (<$), (<*>), (<*), (*>) ) @@ -69,7 +70,32 @@ parseOrg = do blocks' <- parseBlocks st <- getState let meta = runF (orgStateMeta' st) st - return $ Pandoc meta $ filter (/= Null) (B.toList $ runF blocks' st) + let removeUnwantedBlocks = dropCommentTrees . filter (/= Null) + return $ Pandoc meta $ removeUnwantedBlocks (B.toList $ runF blocks' st) + +-- | Drop COMMENT headers and the document tree below those headers. +dropCommentTrees :: [Block] -> [Block] +dropCommentTrees [] = [] +dropCommentTrees blks@(b:bs) = + maybe blks (flip dropUntilHeaderAboveLevel bs) $ commentHeaderLevel b + +-- | Return the level of a header starting a comment tree and Nothing +-- otherwise. +commentHeaderLevel :: Block -> Maybe Int +commentHeaderLevel blk = + case blk of + (Header level _ ((Str "COMMENT"):_)) -> Just level + _ -> Nothing + +-- | Drop blocks until a header on or above the given level is seen +dropUntilHeaderAboveLevel :: Int -> [Block] -> [Block] +dropUntilHeaderAboveLevel n = dropWhile (not . isHeaderLevelLowerEq n) + +isHeaderLevelLowerEq :: Int -> Block -> Bool +isHeaderLevelLowerEq n blk = + case blk of + (Header level _ _) -> n >= level + _ -> False -- -- Parser State for Org @@ -828,12 +854,14 @@ list :: OrgParser (F Blocks) list = choice [ definitionList, bulletList, orderedList ] <?> "list" definitionList :: OrgParser (F Blocks) -definitionList = fmap B.definitionList . fmap compactify'DL . sequence - <$> many1 (definitionListItem bulletListStart) +definitionList = try $ do n <- lookAhead (bulletListStart' Nothing) + fmap B.definitionList . fmap compactify'DL . sequence + <$> many1 (definitionListItem $ bulletListStart' (Just n)) bulletList :: OrgParser (F Blocks) -bulletList = fmap B.bulletList . fmap compactify' . sequence - <$> many1 (listItem bulletListStart) +bulletList = try $ do n <- lookAhead (bulletListStart' Nothing) + fmap B.bulletList . fmap compactify' . sequence + <$> many1 (listItem (bulletListStart' $ Just n)) orderedList :: OrgParser (F Blocks) orderedList = fmap B.orderedList . fmap compactify' . sequence @@ -845,10 +873,27 @@ genericListStart listMarker = try $ (+) <$> (length <$> many spaceChar) <*> (length <$> listMarker <* many1 spaceChar) --- parses bullet list start and returns its length (excl. following whitespace) +-- parses bullet list marker. maybe we know the indent level bulletListStart :: OrgParser Int -bulletListStart = genericListStart bulletListMarker - where bulletListMarker = pure <$> oneOf "*-+" +bulletListStart = bulletListStart' Nothing + +bulletListStart' :: Maybe Int -> OrgParser Int +-- returns length of bulletList prefix, inclusive of marker +bulletListStart' Nothing = do ind <- length <$> many spaceChar + when (ind == 0) $ notFollowedBy (char '*') + oneOf bullets + many1 spaceChar + return (ind + 1) + -- Unindented lists are legal, but they can't use '*' bullets + -- We return n to maintain compatibility with the generic listItem +bulletListStart' (Just n) = do count (n-1) spaceChar + when (n == 1) $ notFollowedBy (char '*') + oneOf bullets + many1 spaceChar + return n + +bullets :: String +bullets = "*+-" orderedListStart :: OrgParser Int orderedListStart = genericListStart orderedListMarker @@ -863,7 +908,7 @@ definitionListItem parseMarkerGetLength = try $ do line1 <- anyLineNewline blank <- option "" ("\n" <$ blankline) cont <- concat <$> many (listContinuation markerLength) - term' <- parseFromString inline term + term' <- parseFromString parseInlines term contents' <- parseFromString parseBlocks $ line1 ++ blank ++ cont return $ (,) <$> term' <*> fmap (:[]) contents' @@ -927,7 +972,7 @@ parseInlines = trimInlinesF . mconcat <$> many1 inline -- treat these as potentially non-text when parsing inline: specialChars :: [Char] -specialChars = "\"$'()*+-./:<=>[\\]^_{|}~" +specialChars = "\"$'()*+-,./:<=>[\\]^_{|}~" whitespace :: OrgParser (F Inlines) @@ -1054,7 +1099,7 @@ linkOrImage = explicitOrImageLink explicitOrImageLink :: OrgParser (F Inlines) explicitOrImageLink = try $ do char '[' - srcF <- applyCustomLinkFormat =<< linkTarget + srcF <- applyCustomLinkFormat =<< possiblyEmptyLinkTarget title <- enclosedRaw (char '[') (char ']') title' <- parseFromString (mconcat <$> many inline) title char ']' @@ -1087,6 +1132,9 @@ selfTarget = try $ char '[' *> linkTarget <* char ']' linkTarget :: OrgParser String linkTarget = enclosedByPair '[' ']' (noneOf "\n\r[]") +possiblyEmptyLinkTarget :: OrgParser String +possiblyEmptyLinkTarget = try linkTarget <|> ("" <$ string "[]") + applyCustomLinkFormat :: String -> OrgParser (F String) applyCustomLinkFormat link = do let (linkType, rest) = break (== ':') link @@ -1094,27 +1142,33 @@ applyCustomLinkFormat link = do formatter <- M.lookup linkType <$> asksF orgStateLinkFormatters return $ maybe link ($ drop 1 rest) formatter - linkToInlinesF :: String -> Inlines -> F Inlines -linkToInlinesF s@('#':_) = pure . B.link s "" -linkToInlinesF s - | isImageFilename s = const . pure $ B.image s "" "" - | isUri s = pure . B.link s "" - | isRelativeUrl s = pure . B.link s "" -linkToInlinesF s = \title -> do - anchorB <- (s `elem`) <$> asksF orgStateAnchorIds - if anchorB - then pure $ B.link ('#':s) "" title - else pure $ B.emph title - -isRelativeUrl :: String -> Bool -isRelativeUrl s = (':' `notElem` s) && ("./" `isPrefixOf` s) +linkToInlinesF s = + case s of + "" -> pure . B.link "" "" + ('#':_) -> pure . B.link s "" + _ | isImageFilename s -> const . pure $ B.image s "" "" + _ | isUri s -> pure . B.link s "" + _ | isRelativeFilePath s -> pure . B.link s "" + _ | isAbsoluteFilePath s -> pure . B.link ("file://" ++ s) "" + _ -> \title -> do + anchorB <- (s `elem`) <$> asksF orgStateAnchorIds + if anchorB + then pure $ B.link ('#':s) "" title + else pure $ B.emph title + +isRelativeFilePath :: String -> Bool +isRelativeFilePath s = (("./" `isPrefixOf` s) || ("../" `isPrefixOf` s)) && + (':' `notElem` s) isUri :: String -> Bool isUri s = let (scheme, path) = break (== ':') s in all (\c -> isAlphaNum c || c `elem` ".-") scheme && not (null path) +isAbsoluteFilePath :: String -> Bool +isAbsoluteFilePath = ('/' ==) . head + isImageFilename :: String -> Bool isImageFilename filename = any (\x -> ('.':x) `isSuffixOf` filename) imageExtensions && @@ -1205,10 +1259,10 @@ displayMath = return . B.displayMath <$> choice [ rawMathBetween "\\[" "\\]" ] symbol :: OrgParser (F Inlines) symbol = return . B.str . (: "") <$> (oneOf specialChars >>= updatePositions) - where updatePositions c - | c `elem` emphasisPreChars = c <$ updateLastPreCharPos - | c `elem` emphasisForbiddenBorderChars = c <$ updateLastForbiddenCharPos - | otherwise = return c + where updatePositions c = do + when (c `elem` emphasisPreChars) updateLastPreCharPos + when (c `elem` emphasisForbiddenBorderChars) updateLastForbiddenCharPos + return c emphasisBetween :: Char -> OrgParser (F Inlines) @@ -1387,7 +1441,8 @@ simpleSubOrSuperString = try $ inlineLaTeX :: OrgParser (F Inlines) inlineLaTeX = try $ do cmd <- inlineLaTeXCommand - maybe mzero returnF $ parseAsMath cmd `mplus` parseAsInlineLaTeX cmd + maybe mzero returnF $ + parseAsMath cmd `mplus` parseAsMathMLSym cmd `mplus` parseAsInlineLaTeX cmd where parseAsMath :: String -> Maybe Inlines parseAsMath cs = B.fromList <$> texMathToPandoc cs @@ -1395,6 +1450,11 @@ inlineLaTeX = try $ do parseAsInlineLaTeX :: String -> Maybe Inlines parseAsInlineLaTeX cs = maybeRight $ runParser inlineCommand state "" cs + parseAsMathMLSym :: String -> Maybe Inlines + parseAsMathMLSym cs = B.str <$> MathMLEntityMap.getUnicode (clean cs) + -- dropWhileEnd would be nice here, but it's not available before base 4.5 + where clean = reverse . dropWhile (`elem` "{}") . reverse . drop 1 + state :: ParserState state = def{ stateOptions = def{ readerParseRaw = True }} diff --git a/src/Text/Pandoc/Readers/RST.hs b/src/Text/Pandoc/Readers/RST.hs index e5eccb116..732956981 100644 --- a/src/Text/Pandoc/Readers/RST.hs +++ b/src/Text/Pandoc/Readers/RST.hs @@ -47,7 +47,7 @@ import Text.Pandoc.Builder (Inlines, Blocks, trimInlines, (<>)) import qualified Text.Pandoc.Builder as B import Data.Monoid (mconcat, mempty) import Data.Sequence (viewr, ViewR(..)) -import Data.Char (toLower, isHexDigit) +import Data.Char (toLower, isHexDigit, isSpace) -- | Parse reStructuredText string and return Pandoc document. readRST :: ReaderOptions -- ^ Reader options @@ -335,6 +335,13 @@ indentedBlock = try $ do optional blanklines return $ unlines lns +quotedBlock :: Parser [Char] st [Char] +quotedBlock = try $ do + quote <- lookAhead $ oneOf "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~" + lns <- many1 $ lookAhead (char quote) >> anyLine + optional blanklines + return $ unlines lns + codeBlockStart :: Parser [Char] st Char codeBlockStart = string "::" >> blankline >> blankline @@ -342,7 +349,8 @@ codeBlock :: Parser [Char] st Blocks codeBlock = try $ codeBlockStart >> codeBlockBody codeBlockBody :: Parser [Char] st Blocks -codeBlockBody = try $ B.codeBlock . stripTrailingNewlines <$> indentedBlock +codeBlockBody = try $ B.codeBlock . stripTrailingNewlines <$> + (indentedBlock <|> quotedBlock) lhsCodeBlock :: RSTParser Blocks lhsCodeBlock = try $ do @@ -513,7 +521,6 @@ directive = try $ do -- TODO: line-block, parsed-literal, table, csv-table, list-table -- date -- include --- class -- title directive' :: RSTParser Blocks directive' = do @@ -594,6 +601,13 @@ directive' = do Just t -> B.link (escapeURI $ trim t) "" $ B.image src "" alt Nothing -> B.image src "" alt + "class" -> do + let attrs = ("", (splitBy isSpace $ trim top), map (\(k,v) -> (k, trimr v)) fields) + -- directive content or the first immediately following element + children <- case body of + "" -> block + _ -> parseFromString parseBlocks body' + return $ B.divWith attrs children _ -> return mempty -- TODO: diff --git a/src/Text/Pandoc/Readers/TWiki.hs b/src/Text/Pandoc/Readers/TWiki.hs new file mode 100644 index 000000000..c2325c0ea --- /dev/null +++ b/src/Text/Pandoc/Readers/TWiki.hs @@ -0,0 +1,526 @@ +{-# LANGUAGE RelaxedPolyRec, FlexibleInstances, TypeSynonymInstances #-} +-- RelaxedPolyRec needed for inlinesBetween on GHC < 7 +{- + Copyright (C) 2014 Alexander Sulfrian <alexander.sulfrian@fu-berlin.de> + +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2 of the License, or +(at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; if not, write to the Free Software +Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +-} + +{- | + Module : Text.Pandoc.Readers.TWiki + Copyright : Copyright (C) 2014 Alexander Sulfrian + License : GNU GPL, version 2 or above + + Maintainer : Alexander Sulfrian <alexander.sulfrian@fu-berlin.de> + Stability : alpha + Portability : portable + +Conversion of twiki text to 'Pandoc' document. +-} +module Text.Pandoc.Readers.TWiki ( readTWiki + , readTWikiWithWarnings + ) where + +import Text.Pandoc.Definition +import qualified Text.Pandoc.Builder as B +import Text.Pandoc.Options +import Text.Pandoc.Parsing hiding (enclosed, macro, nested) +import Text.Pandoc.Readers.HTML (htmlTag, isCommentTag) +import Data.Monoid (Monoid, mconcat, mempty) +import Control.Applicative ((<$>), (<*), (*>), (<$)) +import Control.Monad +import Text.Printf (printf) +import Debug.Trace (trace) +import Text.Pandoc.XML (fromEntities) +import Data.Maybe (fromMaybe) +import Text.HTML.TagSoup +import Data.Char (isAlphaNum) +import qualified Data.Foldable as F + +-- | Read twiki from an input string and return a Pandoc document. +readTWiki :: ReaderOptions -- ^ Reader options + -> String -- ^ String to parse (assuming @'\n'@ line endings) + -> Pandoc +readTWiki opts s = + (readWith parseTWiki) def{ stateOptions = opts } (s ++ "\n\n") + +readTWikiWithWarnings :: ReaderOptions -- ^ Reader options + -> String -- ^ String to parse (assuming @'\n'@ line endings) + -> (Pandoc, [String]) +readTWikiWithWarnings opts s = + (readWith parseTWikiWithWarnings) def{ stateOptions = opts } (s ++ "\n\n") + where parseTWikiWithWarnings = do + doc <- parseTWiki + warnings <- stateWarnings <$> getState + return (doc, warnings) + +type TWParser = Parser [Char] ParserState + +-- +-- utility functions +-- + +tryMsg :: String -> TWParser a -> TWParser a +tryMsg msg p = try p <?> msg + +skip :: TWParser a -> TWParser () +skip parser = parser >> return () + +nested :: TWParser a -> TWParser a +nested p = do + nestlevel <- stateMaxNestingLevel <$> getState + guard $ nestlevel > 0 + updateState $ \st -> st{ stateMaxNestingLevel = stateMaxNestingLevel st - 1 } + res <- p + updateState $ \st -> st{ stateMaxNestingLevel = nestlevel } + return res + +htmlElement :: String -> TWParser (Attr, String) +htmlElement tag = tryMsg tag $ do + (TagOpen _ attr, _) <- htmlTag (~== TagOpen tag []) + content <- manyTill anyChar (endtag <|> endofinput) + return (htmlAttrToPandoc attr, trim content) + where + endtag = skip $ htmlTag (~== TagClose tag) + endofinput = lookAhead $ try $ skipMany blankline >> skipSpaces >> eof + trim = dropWhile (=='\n') . reverse . dropWhile (=='\n') . reverse + +htmlAttrToPandoc :: [Attribute String] -> Attr +htmlAttrToPandoc attrs = (ident, classes, keyvals) + where + ident = fromMaybe "" $ lookup "id" attrs + classes = maybe [] words $ lookup "class" attrs + keyvals = [(k,v) | (k,v) <- attrs, k /= "id" && k /= "class"] + +parseHtmlContentWithAttrs :: String -> TWParser a -> TWParser (Attr, [a]) +parseHtmlContentWithAttrs tag parser = do + (attr, content) <- htmlElement tag + parsedContent <- try $ parseContent content + return (attr, parsedContent) + where + parseContent = parseFromString $ nested $ manyTill parser endOfContent + endOfContent = try $ skipMany blankline >> skipSpaces >> eof + +parseHtmlContent :: String -> TWParser a -> TWParser [a] +parseHtmlContent tag p = parseHtmlContentWithAttrs tag p >>= return . snd + +-- +-- main parser +-- + +parseTWiki :: TWParser Pandoc +parseTWiki = do + bs <- mconcat <$> many block + spaces + eof + return $ B.doc bs + + +-- +-- block parsers +-- + +block :: TWParser B.Blocks +block = do + tr <- getOption readerTrace + pos <- getPosition + res <- mempty <$ skipMany1 blankline + <|> blockElements + <|> para + skipMany blankline + when tr $ + trace (printf "line %d: %s" (sourceLine pos) + (take 60 $ show $ B.toList res)) (return ()) + return res + +blockElements :: TWParser B.Blocks +blockElements = choice [ separator + , header + , verbatim + , literal + , list "" + , table + , blockQuote + , noautolink + ] + +separator :: TWParser B.Blocks +separator = tryMsg "separator" $ string "---" >> newline >> return B.horizontalRule + +header :: TWParser B.Blocks +header = tryMsg "header" $ do + string "---" + level <- many1 (char '+') >>= return . length + guard $ level <= 6 + classes <- option [] $ string "!!" >> return ["unnumbered"] + skipSpaces + content <- B.trimInlines . mconcat <$> manyTill inline newline + attr <- registerHeader ("", classes, []) content + return $ B.headerWith attr level $ content + +verbatim :: TWParser B.Blocks +verbatim = (htmlElement "verbatim" <|> htmlElement "pre") + >>= return . (uncurry B.codeBlockWith) + +literal :: TWParser B.Blocks +literal = htmlElement "literal" >>= return . rawBlock + where + format (_, _, kvs) = fromMaybe "html" $ lookup "format" kvs + rawBlock (attrs, content) = B.rawBlock (format attrs) content + +list :: String -> TWParser B.Blocks +list prefix = choice [ bulletList prefix + , orderedList prefix + , definitionList prefix] + +definitionList :: String -> TWParser B.Blocks +definitionList prefix = tryMsg "definitionList" $ do + indent <- lookAhead $ string prefix *> (many1 $ string " ") <* string "$ " + elements <- many $ parseDefinitionListItem (prefix ++ concat indent) + return $ B.definitionList elements + where + parseDefinitionListItem :: String -> TWParser (B.Inlines, [B.Blocks]) + parseDefinitionListItem indent = do + string (indent ++ "$ ") >> skipSpaces + term <- many1Till inline $ string ": " + line <- listItemLine indent $ string "$ " + return $ (mconcat term, [line]) + +bulletList :: String -> TWParser B.Blocks +bulletList prefix = tryMsg "bulletList" $ + parseList prefix (char '*') (char ' ') + +orderedList :: String -> TWParser B.Blocks +orderedList prefix = tryMsg "orderedList" $ + parseList prefix (oneOf "1iIaA") (string ". ") + +parseList :: Show a => String -> TWParser Char -> TWParser a -> TWParser B.Blocks +parseList prefix marker delim = do + (indent, style) <- lookAhead $ string prefix *> listStyle <* delim + blocks <- many $ parseListItem (prefix ++ indent) (char style <* delim) + return $ case style of + '1' -> B.orderedListWith (1, DefaultStyle, DefaultDelim) blocks + 'i' -> B.orderedListWith (1, LowerRoman, DefaultDelim) blocks + 'I' -> B.orderedListWith (1, UpperRoman, DefaultDelim) blocks + 'a' -> B.orderedListWith (1, LowerAlpha, DefaultDelim) blocks + 'A' -> B.orderedListWith (1, UpperAlpha, DefaultDelim) blocks + _ -> B.bulletList blocks + where + listStyle = do + indent <- many1 $ string " " + style <- marker + return (concat indent, style) + +parseListItem :: Show a => String -> TWParser a -> TWParser B.Blocks +parseListItem prefix marker = string prefix >> marker >> listItemLine prefix marker + +listItemLine :: Show a => String -> TWParser a -> TWParser B.Blocks +listItemLine prefix marker = lineContent >>= parseContent >>= return . mconcat + where + lineContent = do + content <- anyLine + continuation <- optionMaybe listContinuation + return $ filterSpaces content ++ "\n" ++ (maybe "" (" " ++) continuation) + filterSpaces = reverse . dropWhile (== ' ') . reverse + listContinuation = notFollowedBy (string prefix >> marker) >> + string " " >> lineContent + parseContent = parseFromString $ many1 $ nestedList <|> parseInline + parseInline = many1Till inline (lastNewline <|> newlineBeforeNestedList) >>= + return . B.plain . mconcat + nestedList = list prefix + lastNewline = try $ char '\n' <* eof + newlineBeforeNestedList = try $ char '\n' <* lookAhead nestedList + +table :: TWParser B.Blocks +table = try $ do + tableHead <- optionMaybe $ many1Till tableParseHeader newline >>= return . unzip + rows <- many1 tableParseRow + return $ buildTable mempty rows $ fromMaybe (align rows, columns rows) tableHead + where + buildTable caption rows (aligns, heads) + = B.table caption aligns heads rows + align rows = replicate (columCount rows) (AlignDefault, 0) + columns rows = replicate (columCount rows) mempty + columCount rows = length $ head rows + +tableParseHeader :: TWParser ((Alignment, Double), B.Blocks) +tableParseHeader = try $ do + char '|' + leftSpaces <- many spaceChar >>= return . length + char '*' + content <- tableColumnContent (char '*' >> skipSpaces >> char '|') + char '*' + rightSpaces <- many spaceChar >>= return . length + optional tableEndOfRow + return (tableAlign leftSpaces rightSpaces, content) + where + tableAlign left right + | left >= 2 && left == right = (AlignCenter, 0) + | left > right = (AlignRight, 0) + | otherwise = (AlignLeft, 0) + +tableParseRow :: TWParser [B.Blocks] +tableParseRow = many1Till tableParseColumn newline + +tableParseColumn :: TWParser B.Blocks +tableParseColumn = char '|' *> skipSpaces *> + tableColumnContent (skipSpaces >> char '|') + <* skipSpaces <* optional tableEndOfRow + +tableEndOfRow :: TWParser Char +tableEndOfRow = lookAhead (try $ char '|' >> char '\n') >> char '|' + +tableColumnContent :: Show a => TWParser a -> TWParser B.Blocks +tableColumnContent end = manyTill content (lookAhead $ try end) >>= return . B.plain . mconcat + where + content = continuation <|> inline + continuation = try $ char '\\' >> newline >> return mempty + +blockQuote :: TWParser B.Blocks +blockQuote = parseHtmlContent "blockquote" block >>= return . B.blockQuote . mconcat + +noautolink :: TWParser B.Blocks +noautolink = do + (_, content) <- htmlElement "noautolink" + st <- getState + setState $ st{ stateAllowLinks = False } + blocks <- try $ parseContent content + setState $ st{ stateAllowLinks = True } + return $ mconcat blocks + where + parseContent = parseFromString $ many $ block + +para :: TWParser B.Blocks +para = many1Till inline endOfParaElement >>= return . result . mconcat + where + endOfParaElement = lookAhead $ endOfInput <|> endOfPara <|> newBlockElement + endOfInput = try $ skipMany blankline >> skipSpaces >> eof + endOfPara = try $ blankline >> skipMany1 blankline + newBlockElement = try $ blankline >> skip blockElements + result content = if F.all (==Space) content + then mempty + else B.para $ B.trimInlines content + + +-- +-- inline parsers +-- + +inline :: TWParser B.Inlines +inline = choice [ whitespace + , br + , macro + , strong + , strongHtml + , strongAndEmph + , emph + , emphHtml + , boldCode + , smart + , link + , htmlComment + , code + , codeHtml + , nop + , autoLink + , str + , symbol + ] <?> "inline" + +whitespace :: TWParser B.Inlines +whitespace = (lb <|> regsp) >>= return + where lb = try $ skipMany spaceChar >> linebreak >> return B.space + regsp = try $ skipMany1 spaceChar >> return B.space + +br :: TWParser B.Inlines +br = try $ string "%BR%" >> return B.linebreak + +linebreak :: TWParser B.Inlines +linebreak = newline >> notFollowedBy newline >> (lastNewline <|> innerNewline) + where lastNewline = eof >> return mempty + innerNewline = return B.space + +between :: (Show b, Monoid c) => TWParser a -> TWParser b -> (TWParser b -> TWParser c) -> TWParser c +between start end p = + mconcat <$> try (start >> notFollowedBy whitespace >> many1Till (p end) end) + +enclosed :: (Show a, Monoid b) => TWParser a -> (TWParser a -> TWParser b) -> TWParser b +enclosed sep p = between sep (try $ sep <* endMarker) p + where + endMarker = lookAhead $ skip endSpace <|> skip (oneOf ".,!?:)|") <|> eof + endSpace = (spaceChar <|> newline) >> return B.space + +macro :: TWParser B.Inlines +macro = macroWithParameters <|> withoutParameters + where + withoutParameters = enclosed (char '%') (\_ -> macroName) >>= return . emptySpan + emptySpan name = buildSpan name [] mempty + +macroWithParameters :: TWParser B.Inlines +macroWithParameters = try $ do + char '%' + name <- macroName + (content, kvs) <- attributes + char '%' + return $ buildSpan name kvs $ B.str content + +buildSpan :: String -> [(String, String)] -> B.Inlines -> B.Inlines +buildSpan className kvs = B.spanWith attrs + where + attrs = ("", ["twiki-macro", className] ++ additionalClasses, kvsWithoutClasses) + additionalClasses = maybe [] words $ lookup "class" kvs + kvsWithoutClasses = [(k,v) | (k,v) <- kvs, k /= "class"] + +macroName :: TWParser String +macroName = do + first <- letter + rest <- many $ alphaNum <|> char '_' + return (first:rest) + +attributes :: TWParser (String, [(String, String)]) +attributes = char '{' *> spnl *> many (attribute <* spnl) <* char '}' >>= + return . foldr (either mkContent mkKvs) ([], []) + where + spnl = skipMany (spaceChar <|> newline) + mkContent c ([], kvs) = (c, kvs) + mkContent c (rest, kvs) = (c ++ " " ++ rest, kvs) + mkKvs kv (cont, rest) = (cont, (kv : rest)) + +attribute :: TWParser (Either String (String, String)) +attribute = withKey <|> withoutKey + where + withKey = try $ do + key <- macroName + char '=' + parseValue False >>= return . (curry Right key) + withoutKey = try $ parseValue True >>= return . Left + parseValue allowSpaces = (withQuotes <|> withoutQuotes allowSpaces) >>= return . fromEntities + withQuotes = between (char '"') (char '"') (\_ -> count 1 $ noneOf ['"']) + withoutQuotes allowSpaces + | allowSpaces == True = many1 $ noneOf "}" + | otherwise = many1 $ noneOf " }" + +nestedInlines :: Show a => TWParser a -> TWParser B.Inlines +nestedInlines end = innerSpace <|> nestedInline + where + innerSpace = try $ whitespace <* (notFollowedBy end) + nestedInline = notFollowedBy whitespace >> nested inline + +strong :: TWParser B.Inlines +strong = try $ enclosed (char '*') nestedInlines >>= return . B.strong + +strongHtml :: TWParser B.Inlines +strongHtml = (parseHtmlContent "strong" inline <|> parseHtmlContent "b" inline) + >>= return . B.strong . mconcat + +strongAndEmph :: TWParser B.Inlines +strongAndEmph = try $ enclosed (string "__") nestedInlines >>= return . B.emph . B.strong + +emph :: TWParser B.Inlines +emph = try $ enclosed (char '_') nestedInlines >>= return . B.emph + +emphHtml :: TWParser B.Inlines +emphHtml = (parseHtmlContent "em" inline <|> parseHtmlContent "i" inline) + >>= return . B.emph . mconcat + +nestedString :: Show a => TWParser a -> TWParser String +nestedString end = innerSpace <|> (count 1 nonspaceChar) + where + innerSpace = try $ many1 spaceChar <* notFollowedBy end + +boldCode :: TWParser B.Inlines +boldCode = try $ enclosed (string "==") nestedString >>= return . B.strong . B.code . fromEntities + +htmlComment :: TWParser B.Inlines +htmlComment = htmlTag isCommentTag >> return mempty + +code :: TWParser B.Inlines +code = try $ enclosed (char '=') nestedString >>= return . B.code . fromEntities + +codeHtml :: TWParser B.Inlines +codeHtml = do + (attrs, content) <- parseHtmlContentWithAttrs "code" anyChar + return $ B.codeWith attrs $ fromEntities content + +autoLink :: TWParser B.Inlines +autoLink = try $ do + state <- getState + guard $ stateAllowLinks state + (text, url) <- parseLink + guard $ checkLink (head $ reverse url) + return $ makeLink (text, url) + where + parseLink = notFollowedBy nop >> (uri <|> emailAddress) + makeLink (text, url) = B.link url "" $ B.str text + checkLink c + | c == '/' = True + | otherwise = isAlphaNum c + +str :: TWParser B.Inlines +str = (many1 alphaNum <|> count 1 characterReference) >>= return . B.str + +nop :: TWParser B.Inlines +nop = try $ (skip exclamation <|> skip nopTag) >> followContent + where + exclamation = char '!' + nopTag = stringAnyCase "<nop>" + followContent = many1 nonspaceChar >>= return . B.str . fromEntities + +symbol :: TWParser B.Inlines +symbol = count 1 nonspaceChar >>= return . B.str + +smart :: TWParser B.Inlines +smart = do + getOption readerSmart >>= guard + doubleQuoted <|> singleQuoted <|> + choice [ apostrophe + , dash + , ellipses + ] + +singleQuoted :: TWParser B.Inlines +singleQuoted = try $ do + singleQuoteStart + withQuoteContext InSingleQuote $ + many1Till inline singleQuoteEnd >>= + (return . B.singleQuoted . B.trimInlines . mconcat) + +doubleQuoted :: TWParser B.Inlines +doubleQuoted = try $ do + doubleQuoteStart + contents <- mconcat <$> many (try $ notFollowedBy doubleQuoteEnd >> inline) + (withQuoteContext InDoubleQuote $ doubleQuoteEnd >> + return (B.doubleQuoted $ B.trimInlines contents)) + <|> (return $ (B.str "\8220") B.<> contents) + +link :: TWParser B.Inlines +link = try $ do + st <- getState + guard $ stateAllowLinks st + setState $ st{ stateAllowLinks = False } + (url, title, content) <- linkText + setState $ st{ stateAllowLinks = True } + return $ B.link url title content + +linkText :: TWParser (String, String, B.Inlines) +linkText = do + string "[[" + url <- many1Till anyChar (char ']') + content <- option [B.str url] linkContent + char ']' + return (url, "", mconcat content) + where + linkContent = (char '[') >> many1Till anyChar (char ']') >>= parseLinkContent + parseLinkContent = parseFromString $ many1 inline diff --git a/src/Text/Pandoc/SelfContained.hs b/src/Text/Pandoc/SelfContained.hs index 36839ddd0..5b8f7a75a 100644 --- a/src/Text/Pandoc/SelfContained.hs +++ b/src/Text/Pandoc/SelfContained.hs @@ -51,7 +51,8 @@ isOk c = isAscii c && isAlphaNum c convertTag :: MediaBag -> Maybe String -> Tag String -> IO (Tag String) convertTag media sourceURL t@(TagOpen tagname as) - | tagname `elem` ["img", "embed", "video", "input", "audio", "source"] = do + | tagname `elem` + ["img", "embed", "video", "input", "audio", "source", "track"] = do as' <- mapM processAttribute as return $ TagOpen tagname as' where processAttribute (x,y) = diff --git a/src/Text/Pandoc/Shared.hs b/src/Text/Pandoc/Shared.hs index 2d7c08718..9aa70e6f2 100644 --- a/src/Text/Pandoc/Shared.hs +++ b/src/Text/Pandoc/Shared.hs @@ -1,5 +1,6 @@ {-# LANGUAGE DeriveDataTypeable, CPP, MultiParamTypeClasses, - FlexibleContexts, ScopedTypeVariables, PatternGuards #-} + FlexibleContexts, ScopedTypeVariables, PatternGuards, + ViewPatterns #-} {- Copyright (C) 2006-2014 John MacFarlane <jgm@berkeley.edu> @@ -106,7 +107,7 @@ import Network.URI ( escapeURIString, isURI, nonStrictRelativeTo, unEscapeString, parseURIReference, isAllowedInURI ) import qualified Data.Set as Set import System.Directory -import System.FilePath (joinPath, splitDirectories) +import System.FilePath (joinPath, splitDirectories, pathSeparator, isPathSeparator) import Text.Pandoc.MIME (MimeType, getMimeType) import System.FilePath ( (</>), takeExtension, dropExtension) import Data.Generics (Typeable, Data) @@ -734,12 +735,10 @@ renderTags' = renderTagsOptions -- | Perform an IO action in a directory, returning to starting directory. inDirectory :: FilePath -> IO a -> IO a -inDirectory path action = do - oldDir <- getCurrentDirectory - setCurrentDirectory path - result <- action - setCurrentDirectory oldDir - return result +inDirectory path action = E.bracket + getCurrentDirectory + setCurrentDirectory + (const $ setCurrentDirectory path >> action) readDefaultDataFile :: FilePath -> IO BS.ByteString readDefaultDataFile fname = @@ -871,11 +870,14 @@ collapseFilePath = joinPath . reverse . foldl go [] . splitDirectories go rs "." = rs go r@(p:rs) ".." = case p of ".." -> ("..":r) - "/" -> ("..":r) + (checkPathSeperator -> Just True) -> ("..":r) _ -> rs - go _ "/" = ["/"] + go _ (checkPathSeperator -> Just True) = [[pathSeparator]] go rs x = x:rs - + isSingleton [] = Nothing + isSingleton [x] = Just x + isSingleton _ = Nothing + checkPathSeperator = fmap isPathSeparator . isSingleton -- -- Safe read diff --git a/src/Text/Pandoc/Writers/ConTeXt.hs b/src/Text/Pandoc/Writers/ConTeXt.hs index bbca7f858..ebdc4a3d3 100644 --- a/src/Text/Pandoc/Writers/ConTeXt.hs +++ b/src/Text/Pandoc/Writers/ConTeXt.hs @@ -36,6 +36,7 @@ import Text.Pandoc.Options import Text.Pandoc.Walk (query) import Text.Printf ( printf ) import Data.List ( intercalate ) +import Data.Char ( ord ) import Control.Monad.State import Text.Pandoc.Pretty import Text.Pandoc.Templates ( renderTemplate' ) @@ -114,6 +115,13 @@ escapeCharForConTeXt opts ch = stringToConTeXt :: WriterOptions -> String -> String stringToConTeXt opts = concatMap (escapeCharForConTeXt opts) +-- | Sanitize labels +toLabel :: String -> String +toLabel z = concatMap go z + where go x + | elem x "\\#[]\",{}%()|=" = "ux" ++ printf "%x" (ord x) + | otherwise = [x] + -- | Convert Elements to ConTeXt elementToConTeXt :: WriterOptions -> Element -> State WriterState Doc elementToConTeXt _ (Blk block) = blockToConTeXt block @@ -286,15 +294,16 @@ inlineToConTeXt Space = return space -- Handle HTML-like internal document references to sections inlineToConTeXt (Link txt (('#' : ref), _)) = do opts <- gets stOptions - label <- inlineListToConTeXt txt + contents <- inlineListToConTeXt txt + let ref' = toLabel $ stringToConTeXt opts ref return $ text "\\in" <> braces (if writerNumberSections opts - then label <+> text "(\\S" - else label) -- prefix + then contents <+> text "(\\S" + else contents) -- prefix <> braces (if writerNumberSections opts then text ")" else empty) -- suffix - <> brackets (text ref) + <> brackets (text ref') inlineToConTeXt (Link txt (src, _)) = do let isAutolink = txt == [Str (unEscapeString src)] @@ -302,13 +311,13 @@ inlineToConTeXt (Link txt (src, _)) = do let next = stNextRef st put $ st {stNextRef = next + 1} let ref = "url" ++ show next - label <- inlineListToConTeXt txt + contents <- inlineListToConTeXt txt return $ "\\useURL" <> brackets (text ref) <> brackets (text $ escapeStringUsing [('#',"\\#"),('%',"\\%")] src) <> (if isAutolink then empty - else brackets empty <> brackets label) + else brackets empty <> brackets contents) <> "\\from" <> brackets (text ref) inlineToConTeXt (Image _ (src, _)) = do @@ -337,6 +346,7 @@ sectionHeader (ident,classes,_) hdrLevel lst = do st <- get let opts = stOptions st let level' = if writerChapters opts then hdrLevel - 1 else hdrLevel + let ident' = toLabel ident let (section, chapter) = if "unnumbered" `elem` classes then (text "subject", text "title") else (text "section", text "chapter") @@ -344,7 +354,7 @@ sectionHeader (ident,classes,_) hdrLevel lst = do then char '\\' <> text (concat (replicate (level' - 1) "sub")) <> section - <> (if (not . null) ident then brackets (text ident) else empty) + <> (if (not . null) ident' then brackets (text ident') else empty) <> braces contents <> blankline else if level' == 0 diff --git a/src/Text/Pandoc/Writers/Docx.hs b/src/Text/Pandoc/Writers/Docx.hs index 5320a2816..f8e8cc34d 100644 --- a/src/Text/Pandoc/Writers/Docx.hs +++ b/src/Text/Pandoc/Writers/Docx.hs @@ -29,7 +29,7 @@ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Conversion of 'Pandoc' documents to docx. -} module Text.Pandoc.Writers.Docx ( writeDocx ) where -import Data.List ( intercalate, isPrefixOf, isSuffixOf ) +import Data.List ( intercalate, isPrefixOf, isSuffixOf, stripPrefix ) import qualified Data.ByteString as B import qualified Data.ByteString.Lazy as BL import qualified Data.ByteString.Lazy.Char8 as BL8 @@ -62,8 +62,9 @@ import Text.Printf (printf) import qualified Control.Exception as E import Text.Pandoc.MIME (MimeType, getMimeType, getMimeTypeDef, extensionFromMimeType) -import Control.Applicative ((<$>), (<|>)) +import Control.Applicative ((<$>), (<|>), (<*>)) import Data.Maybe (fromMaybe, mapMaybe) +import Data.Char (isDigit) data ListMarker = NoMarker | BulletMarker @@ -104,6 +105,8 @@ data WriterState = WriterState{ , stInDel :: Bool , stChangesAuthor :: String , stChangesDate :: String + , stPrintWidth :: Integer + , stHeadingStyles :: [(Int,String)] } defaultWriterState :: WriterState @@ -122,6 +125,8 @@ defaultWriterState = WriterState{ , stInDel = False , stChangesAuthor = "unknown" , stChangesDate = "1969-12-31T19:00:00Z" + , stPrintWidth = 1 + , stHeadingStyles = [] } type WS a = StateT WriterState IO a @@ -181,11 +186,59 @@ writeDocx opts doc@(Pandoc meta _) = do case writerReferenceDocx opts of Just f -> B.readFile f Nothing -> readDataFile datadir "reference.docx" - distArchive <- liftM (toArchive . toLazy) $ readDataFile Nothing "reference.docx" + distArchive <- liftM (toArchive . toLazy) $ readDataFile datadir "reference.docx" + + parsedDoc <- parseXml refArchive distArchive "word/document.xml" + let wname f qn = qPrefix qn == Just "w" && f (qName qn) + let mbsectpr = filterElementName (wname (=="sectPr")) parsedDoc + + -- Gets the template size + let mbpgsz = mbsectpr >>= (filterElementName (wname (=="pgSz"))) + let mbAttrSzWidth = (elAttribs <$> mbpgsz) >>= (lookupAttrBy ((=="w") . qName)) + + let mbpgmar = mbsectpr >>= (filterElementName (wname (=="pgMar"))) + let mbAttrMarLeft = (elAttribs <$> mbpgmar) >>= (lookupAttrBy ((=="left") . qName)) + let mbAttrMarRight = (elAttribs <$> mbpgmar) >>= (lookupAttrBy ((=="right") . qName)) + + -- Get the avaible area (converting the size and the margins to int and + -- doing the difference + let pgContentWidth = (-) <$> (read <$> mbAttrSzWidth ::Maybe Integer) + <*> ( + (+) <$> (read <$> mbAttrMarRight ::Maybe Integer) + <*> (read <$> mbAttrMarLeft ::Maybe Integer) + ) + + -- styles + let stylepath = "word/styles.xml" + styledoc <- parseXml refArchive distArchive stylepath + + -- parse styledoc for heading styles + let styleNamespaces = map ((,) <$> qName . attrKey <*> attrVal) . + filter ((==Just "xmlns") . qPrefix . attrKey) . + elAttribs $ styledoc + let headingStyles = + let + mywURI = lookup "w" styleNamespaces + myName name = QName name mywURI (Just "w") + getAttrStyleId = findAttr (myName "styleId") + getNameVal = findChild (myName "name") >=> findAttr (myName "val") + getNum s | not $ null s, all isDigit s = Just (read s :: Int) + | otherwise = Nothing + getEngHeader = getAttrStyleId >=> stripPrefix "Heading" >=> getNum + getIntHeader = getNameVal >=> stripPrefix "heading " >=> getNum + toTuple getF = liftM2 (,) <$> getF <*> getAttrStyleId + toMap getF = mapMaybe (toTuple getF) $ + findChildren (myName "style") styledoc + select a b | not $ null a = a + | otherwise = b + in + select (toMap getEngHeader) (toMap getIntHeader) ((contents, footnotes), st) <- runStateT (writeOpenXML opts{writerWrapText = False} doc') defaultWriterState{ stChangesAuthor = fromMaybe "unknown" username - , stChangesDate = formatTime defaultTimeLocale "%FT%XZ" utctime} + , stChangesDate = formatTime defaultTimeLocale "%FT%XZ" utctime + , stPrintWidth = (maybe 420 (\x -> quot x 20) pgContentWidth) + , stHeadingStyles = headingStyles} let epochtime = floor $ utcTimeToPOSIXSeconds utctime let imgs = M.elems $ stImages st @@ -193,9 +246,6 @@ writeDocx opts doc@(Pandoc meta _) = do let toImageEntry (_,path,_,_,img) = toEntry ("word/" ++ path) epochtime $ toLazy img let imageEntries = map toImageEntry imgs - - - let stdAttributes = [("xmlns:w","http://schemas.openxmlformats.org/wordprocessingml/2006/main") ,("xmlns:m","http://schemas.openxmlformats.org/officeDocument/2006/math") @@ -310,10 +360,7 @@ writeDocx opts doc@(Pandoc meta _) = do $ renderXml reldoc - -- adjust contents to add sectPr from reference.docx - parsedDoc <- parseXml refArchive distArchive "word/document.xml" - let wname f qn = qPrefix qn == Just "w" && f (qName qn) - let mbsectpr = filterElementName (wname (=="sectPr")) parsedDoc + -- adjust contents to add sectPr from reference.docx let sectpr = case mbsectpr of Just sectpr' -> let cs = renumIds (\q -> qName q == "id" && qPrefix q == Just "r") @@ -323,8 +370,6 @@ writeDocx opts doc@(Pandoc meta _) = do add_attrs (elAttribs sectpr') $ mknode "w:sectPr" [] cs Nothing -> (mknode "w:sectPr" [] ()) - - -- let sectpr = fromMaybe (mknode "w:sectPr" [] ()) mbsectpr' let contents' = contents ++ [sectpr] let docContents = mknode "w:document" stdAttributes @@ -347,8 +392,6 @@ writeDocx opts doc@(Pandoc meta _) = do -- styles let newstyles = styleToOpenXml $ writerHighlightStyle opts - let stylepath = "word/styles.xml" - styledoc <- parseXml refArchive distArchive stylepath let styledoc' = styledoc{ elContent = elContent styledoc ++ [Elem x | x <- newstyles, writerHighlight opts] } let styleEntry = toEntry stylepath epochtime $ renderXml styledoc' @@ -600,8 +643,9 @@ blockToOpenXML opts (Div (_,["references"],_) bs) = do return (header ++ rest) blockToOpenXML opts (Div _ bs) = blocksToOpenXML opts bs blockToOpenXML opts (Header lev (ident,_,_) lst) = do - paraProps <- withParaProp (pStyle $ "Heading" ++ show lev) $ - getParaProps False + headingStyles <- gets stHeadingStyles + paraProps <- maybe id (withParaProp . pStyle) (lookup lev headingStyles) $ + getParaProps False contents <- inlinesToOpenXML opts lst usedIdents <- gets stSectionIds let bookmarkName = if null ident @@ -927,6 +971,7 @@ inlineToOpenXML opts (Link txt (src,_)) = do return [ mknode "w:hyperlink" [("r:id",id')] contents ] inlineToOpenXML opts (Image alt (src, tit)) = do -- first, check to see if we've already done this image + pageWidth <- gets stPrintWidth imgs <- gets stImages case M.lookup src imgs of Just (_,_,_,elt,_) -> return [elt] @@ -943,7 +988,7 @@ inlineToOpenXML opts (Image alt (src, tit)) = do let size = imageSize img let (xpt,ypt) = maybe (120,120) sizeInPoints size -- 12700 emu = 1 pt - let (xemu,yemu) = fitToPage (xpt * 12700, ypt * 12700) + let (xemu,yemu) = fitToPage (xpt * 12700, ypt * 12700) (pageWidth * 12700) let cNvPicPr = mknode "pic:cNvPicPr" [] $ mknode "a:picLocks" [("noChangeArrowheads","1"),("noChangeAspect","1")] () let nvPicPr = mknode "pic:nvPicPr" [] @@ -1010,9 +1055,11 @@ parseXml refArchive distArchive relpath = Nothing -> fail $ relpath ++ " corrupt or missing in reference docx" -- | Scales the image to fit the page -fitToPage :: (Integer, Integer) -> (Integer, Integer) -fitToPage (x, y) - --5440680 is the emu width size of a letter page in portrait, minus the margins - | x > 5440680 = - (5440680, round $ (5440680 / ((fromIntegral :: Integer -> Double) x)) * (fromIntegral y)) +-- sizes are passed in emu +fitToPage :: (Integer, Integer) -> Integer -> (Integer, Integer) +fitToPage (x, y) pageWidth + -- Fixes width to the page width and scales the height + | x > pageWidth = + (pageWidth, round $ + ((fromIntegral pageWidth) / ((fromIntegral :: Integer -> Double) x)) * (fromIntegral y)) | otherwise = (x, y) diff --git a/src/Text/Pandoc/Writers/DokuWiki.hs b/src/Text/Pandoc/Writers/DokuWiki.hs index 8c1d360aa..74418aa7e 100644 --- a/src/Text/Pandoc/Writers/DokuWiki.hs +++ b/src/Text/Pandoc/Writers/DokuWiki.hs @@ -178,7 +178,7 @@ blockToDokuWiki _ (CodeBlock (_,classes,_) str) = do blockToDokuWiki opts (BlockQuote blocks) = do contents <- blockListToDokuWiki opts blocks if isSimpleBlockQuote blocks - then return $ "> " ++ contents + then return $ unlines $ map ("> " ++) $ lines contents else return $ "<HTML><blockquote>\n" ++ contents ++ "</blockquote></HTML>" blockToDokuWiki opts (Table capt aligns _ headers rows) = do @@ -352,9 +352,7 @@ isPlainOrPara (Para _) = True isPlainOrPara _ = False isSimpleBlockQuote :: [Block] -> Bool -isSimpleBlockQuote [BlockQuote bs] = isSimpleBlockQuote bs -isSimpleBlockQuote [b] = isPlainOrPara b -isSimpleBlockQuote _ = False +isSimpleBlockQuote bs = all isPlainOrPara bs -- | Concatenates strings with line breaks between them. vcat :: [String] -> String diff --git a/src/Text/Pandoc/Writers/EPUB.hs b/src/Text/Pandoc/Writers/EPUB.hs index 32256cb42..2291c7184 100644 --- a/src/Text/Pandoc/Writers/EPUB.hs +++ b/src/Text/Pandoc/Writers/EPUB.hs @@ -35,7 +35,7 @@ import Data.Maybe ( fromMaybe ) import Data.List ( isPrefixOf, isInfixOf, intercalate ) import System.Environment ( getEnv ) import Text.Printf (printf) -import System.FilePath ( (</>), takeExtension, takeFileName ) +import System.FilePath ( takeExtension, takeFileName ) import qualified Data.ByteString.Lazy as B import qualified Data.ByteString.Lazy.Char8 as B8 import qualified Text.Pandoc.UTF8 as UTF8 @@ -64,7 +64,6 @@ import Text.XML.Light ( unode, Element(..), unqual, Attr(..), add_attrs import Text.Pandoc.UUID (getRandomUUID) import Text.Pandoc.Writers.HTML (writeHtmlString, writeHtml) import Data.Char ( toLower, isDigit, isAlphaNum ) -import Network.URI ( unEscapeString ) import Text.Pandoc.MIME (MimeType, getMimeType) import qualified Control.Exception as E import Text.Blaze.Html.Renderer.Utf8 (renderHtml) @@ -359,8 +358,9 @@ writeEPUB opts doc@(Pandoc meta _) = do Nothing -> return ([],[]) Just img -> do let coverImage = "media/" ++ takeFileName img - let cpContent = renderHtml $ writeHtml opts' - (Pandoc meta [RawBlock (Format "html") $ "<div id=\"cover-image\">\n<img src=\"" ++ coverImage ++ "\" alt=\"cover image\" />\n</div>"]) + let cpContent = renderHtml $ writeHtml + opts'{ writerVariables = ("coverpage","true"):vars } + (Pandoc meta [RawBlock (Format "html") $ "<div id=\"cover-image\">\n<img src=\"" ++ coverImage ++ "\" alt=\"cover image\" />\n</div>"]) imgContent <- B.readFile img return ( [mkEntry "cover.xhtml" cpContent] , [mkEntry coverImage imgContent] ) @@ -612,17 +612,14 @@ writeEPUB opts doc@(Pandoc meta _) = do (_:_) -> [unode "ol" ! [("class","toc")] $ subs] let navtag = if epub3 then "nav" else "div" - let navData = UTF8.fromStringLazy $ ppTopElement $ - unode "html" ! [("xmlns","http://www.w3.org/1999/xhtml") - ,("xmlns:epub","http://www.idpf.org/2007/ops")] $ - [ unode "head" $ - [ unode "title" plainTitle - , unode "link" ! [("rel","stylesheet"),("type","text/css"),("href","stylesheet.css")] $ () ] - , unode "body" $ - unode navtag ! [("epub:type","toc") | epub3] $ - [ unode "h1" ! [("id","toc-title")] $ plainTitle - , unode "ol" ! [("class","toc")] $ evalState (mapM (navPointNode navXhtmlFormatter) secs) 1] - ] + let navBlocks = [RawBlock (Format "html") $ ppElement $ + unode navtag ! [("epub:type","toc") | epub3] $ + [ unode "h1" ! [("id","toc-title")] $ plainTitle + , unode "ol" ! [("class","toc")] $ evalState (mapM (navPointNode navXhtmlFormatter) secs) 1]] + let navData = renderHtml $ writeHtml opts' + (Pandoc (setMeta "title" + (walk removeNote $ fromList $ docTitle' meta) nullMeta) + navBlocks) let navEntry = mkEntry "nav.xhtml" navData -- mimetype @@ -765,23 +762,20 @@ metadataElement version md currentTime = showDateTimeISO8601 :: UTCTime -> String showDateTimeISO8601 = formatTime defaultTimeLocale "%FT%TZ" -transformTag :: WriterOptions - -> IORef [(FilePath, FilePath)] -- ^ (oldpath, newpath) media +transformTag :: IORef [(FilePath, FilePath)] -- ^ (oldpath, newpath) media -> Tag String -> IO (Tag String) -transformTag opts mediaRef tag@(TagOpen name attr) +transformTag mediaRef tag@(TagOpen name attr) | name `elem` ["video", "source", "img", "audio"] = do let src = fromAttrib "src" tag let poster = fromAttrib "poster" tag - let oldsrc = maybe src (</> src) $ writerSourceURL opts - let oldposter = maybe poster (</> poster) $ writerSourceURL opts - newsrc <- modifyMediaRef mediaRef oldsrc - newposter <- modifyMediaRef mediaRef oldposter + newsrc <- modifyMediaRef mediaRef src + newposter <- modifyMediaRef mediaRef poster let attr' = filter (\(x,_) -> x /= "src" && x /= "poster") attr ++ [("src", newsrc) | not (null newsrc)] ++ [("poster", newposter) | not (null newposter)] return $ TagOpen name attr' -transformTag _ _ tag = return tag +transformTag _ tag = return tag modifyMediaRef :: IORef [(FilePath, FilePath)] -> FilePath -> IO FilePath modifyMediaRef _ "" = return "" @@ -791,7 +785,7 @@ modifyMediaRef mediaRef oldsrc = do Just n -> return n Nothing -> do let new = "media/file" ++ show (length media) ++ - takeExtension oldsrc + takeExtension (takeWhile (/='?') oldsrc) -- remove query modifyIORef mediaRef ( (oldsrc, new): ) return new @@ -799,10 +793,10 @@ transformBlock :: WriterOptions -> IORef [(FilePath, FilePath)] -- ^ (oldpath, newpath) media -> Block -> IO Block -transformBlock opts mediaRef (RawBlock fmt raw) +transformBlock _ mediaRef (RawBlock fmt raw) | fmt == Format "html" = do let tags = parseTags raw - tags' <- mapM (transformTag opts mediaRef) tags + tags' <- mapM (transformTag mediaRef) tags return $ RawBlock fmt (renderTags' tags') transformBlock _ _ b = return b @@ -810,19 +804,17 @@ transformInline :: WriterOptions -> IORef [(FilePath, FilePath)] -- ^ (oldpath, newpath) media -> Inline -> IO Inline -transformInline opts mediaRef (Image lab (src,tit)) = do - let src' = unEscapeString src - let oldsrc = maybe src' (</> src) $ writerSourceURL opts - newsrc <- modifyMediaRef mediaRef oldsrc +transformInline _ mediaRef (Image lab (src,tit)) = do + newsrc <- modifyMediaRef mediaRef src return $ Image lab (newsrc, tit) transformInline opts _ (x@(Math _ _)) | WebTeX _ <- writerHTMLMathMethod opts = do raw <- makeSelfContained opts $ writeHtmlInline opts x return $ RawInline (Format "html") raw -transformInline opts mediaRef (RawInline fmt raw) +transformInline _ mediaRef (RawInline fmt raw) | fmt == Format "html" = do let tags = parseTags raw - tags' <- mapM (transformTag opts mediaRef) tags + tags' <- mapM (transformTag mediaRef) tags return $ RawInline fmt (renderTags' tags') transformInline _ _ x = return x @@ -1204,3 +1196,4 @@ docTitle' meta = fromMaybe [] $ go <$> lookupMeta "title" meta _ -> [] go (MetaList xs) = concatMap go xs go _ = [] + diff --git a/src/Text/Pandoc/Writers/HTML.hs b/src/Text/Pandoc/Writers/HTML.hs index 9ead604d7..e261cfca8 100644 --- a/src/Text/Pandoc/Writers/HTML.hs +++ b/src/Text/Pandoc/Writers/HTML.hs @@ -1,4 +1,4 @@ -{-# LANGUAGE OverloadedStrings, CPP #-} +{-# LANGUAGE OverloadedStrings, CPP, ViewPatterns #-} {-# OPTIONS_GHC -fno-warn-deprecations #-} {- Copyright (C) 2006-2014 John MacFarlane <jgm@berkeley.edu> @@ -60,6 +60,8 @@ import qualified Text.Blaze.XHtml1.Transitional.Attributes as A import Text.Blaze.Renderer.String (renderHtml) import Text.TeXMath import Text.XML.Light.Output +import Text.XML.Light (unode, elChildren, add_attr, unqual) +import qualified Text.XML.Light as XML import System.FilePath (takeExtension) import Data.Monoid import Data.Aeson (Value) @@ -71,11 +73,13 @@ data WriterState = WriterState , stQuotes :: Bool -- ^ <q> tag is used , stHighlighting :: Bool -- ^ Syntax highlighting is used , stSecNum :: [Int] -- ^ Number of current section + , stElement :: Bool -- ^ Processing an Element } defaultWriterState :: WriterState defaultWriterState = WriterState {stNotes= [], stMath = False, stQuotes = False, - stHighlighting = False, stSecNum = []} + stHighlighting = False, stSecNum = [], + stElement = False} -- Helpers to render HTML with the appropriate function. @@ -155,6 +159,10 @@ pandocToHtml opts (Pandoc meta blocks) = do H.script ! A.src (toValue url) ! A.type_ "text/javascript" $ mempty + KaTeX js css -> + (H.script ! A.src (toValue js) $ mempty) <> + (H.link ! A.rel "stylesheet" ! A.href (toValue css)) <> + (H.script ! A.type_ "text/javascript" $ toHtml renderKaTeX) _ -> case lookup "mathml-script" (writerVariables opts) of Just s | not (writerHtml5 opts) -> H.script ! A.type_ "text/javascript" @@ -274,7 +282,13 @@ elementToHtml slideLevel opts (Sec level num (id',classes,keyvals) title' elemen let titleSlide = slide && level < slideLevel header' <- if title' == [Str "\0"] -- marker for hrule then return mempty - else blockToHtml opts (Header level' (id',classes,keyvals) title') + else do + modify (\st -> st{ stElement = True}) + res <- blockToHtml opts + (Header level' (id',classes,keyvals) title') + modify (\st -> st{ stElement = False}) + return res + let isSec (Sec _ _ _ _ _) = True isSec (Blk _) = False let isPause (Blk x) = x == Para [Str ".",Space,Str ".",Space,Str "."] @@ -342,10 +356,10 @@ parseMailto s = do _ -> fail "not a mailto: URL" -- | Obfuscate a "mailto:" link. -obfuscateLink :: WriterOptions -> String -> String -> Html +obfuscateLink :: WriterOptions -> Html -> String -> Html obfuscateLink opts txt s | writerEmailObfuscation opts == NoObfuscation = - H.a ! A.href (toValue s) $ toHtml txt -obfuscateLink opts txt s = + H.a ! A.href (toValue s) $ txt +obfuscateLink opts (renderHtml -> txt) s = let meth = writerEmailObfuscation opts s' = map toLower (take 7 s) ++ drop 7 s in case parseMailto s' of @@ -485,7 +499,7 @@ blockToHtml opts (BlockQuote blocks) = else do contents <- blockListToHtml opts blocks return $ H.blockquote $ nl opts >> contents >> nl opts -blockToHtml opts (Header level (_,classes,_) lst) = do +blockToHtml opts (Header level attr@(_,classes,_) lst) = do contents <- inlineListToHtml opts lst secnum <- liftM stSecNum get let contents' = if writerNumberSections opts && not (null secnum) @@ -493,7 +507,9 @@ blockToHtml opts (Header level (_,classes,_) lst) = do then (H.span ! A.class_ "header-section-number" $ toHtml $ showSecNum secnum) >> strToHtml " " >> contents else contents - return $ case level of + inElement <- gets stElement + return $ (if inElement then id else addAttrs opts attr) + $ case level of 1 -> H.h1 contents' 2 -> H.h2 contents' 3 -> H.h3 contents' @@ -615,6 +631,18 @@ inlineListToHtml :: WriterOptions -> [Inline] -> State WriterState Html inlineListToHtml opts lst = mapM (inlineToHtml opts) lst >>= return . mconcat +-- | Annotates a MathML expression with the tex source +annotateMML :: XML.Element -> String -> XML.Element +annotateMML e tex = math (unode "semantics" [cs, unode "annotation" (annotAttrs, tex)]) + where + cs = case elChildren e of + [] -> unode "mrow" () + [x] -> x + xs -> unode "mrow" xs + math = add_attr (XML.Attr (unqual "xmlns") "http://www.w3.org/1998/Math/MathML") . unode "math" + annotAttrs = [XML.Attr (unqual "encoding") "application/x-tex"] + + -- | Convert Pandoc inline element to HTML. inlineToHtml :: WriterOptions -> Inline -> State WriterState Html inlineToHtml opts inline = @@ -706,7 +734,7 @@ inlineToHtml opts inline = defaultConfigPP case writeMathML dt <$> readTeX str of Right r -> return $ preEscapedString $ - ppcElement conf r + ppcElement conf (annotateMML r str) Left _ -> inlineListToHtml opts (texMathToInlines t str) >>= return . (H.span ! A.class_ "math") @@ -714,6 +742,10 @@ inlineToHtml opts inline = case t of InlineMath -> "\\(" ++ str ++ "\\)" DisplayMath -> "\\[" ++ str ++ "\\]" + KaTeX _ _ -> return $ H.span ! A.class_ "math" $ + toHtml (case t of + InlineMath -> str + DisplayMath -> "\\displaystyle " ++ str) PlainMath -> do x <- inlineListToHtml opts (texMathToInlines t str) let m = H.span ! A.class_ "math" $ x @@ -731,7 +763,7 @@ inlineToHtml opts inline = | otherwise -> return mempty (Link txt (s,_)) | "mailto:" `isPrefixOf` s -> do linkText <- inlineListToHtml opts txt - return $ obfuscateLink opts (renderHtml linkText) s + return $ obfuscateLink opts linkText s (Link txt (s,tit)) -> do linkText <- inlineListToHtml opts txt let s' = case s of @@ -815,3 +847,14 @@ blockListToNote opts ref blocks = Just EPUB3 -> noteItem ! customAttribute "epub:type" "footnote" _ -> noteItem return $ nl opts >> noteItem' + +-- Javascript snippet to render all KaTeX elements +renderKaTeX :: String +renderKaTeX = unlines [ + "window.onload = function(){var mathElements = document.getElementsByClassName(\"math\");" + , "for (var i=0; i < mathElements.length; i++)" + , "{" + , " var texText = mathElements[i].firstChild" + , " katex.render(texText.data, mathElements[i])" + , "}}" + ] diff --git a/src/Text/Pandoc/Writers/ICML.hs b/src/Text/Pandoc/Writers/ICML.hs index ae20efd4b..181c63df7 100644 --- a/src/Text/Pandoc/Writers/ICML.hs +++ b/src/Text/Pandoc/Writers/ICML.hs @@ -399,7 +399,7 @@ inlineToICML opts style (Subscript lst) = inlinesToICML opts (subscriptName:styl inlineToICML opts style (SmallCaps lst) = inlinesToICML opts (smallCapsName:style) lst inlineToICML opts style (Quoted SingleQuote lst) = inlinesToICML opts style $ [Str "‘"] ++ lst ++ [Str "’"] inlineToICML opts style (Quoted DoubleQuote lst) = inlinesToICML opts style $ [Str "“"] ++ lst ++ [Str "”"] -inlineToICML opts style (Cite _ lst) = footnoteToICML opts style [Para lst] +inlineToICML opts style (Cite _ lst) = inlinesToICML opts style lst inlineToICML _ style (Code _ str) = charStyle (codeName:style) $ text $ escapeStringForXML str inlineToICML _ style Space = charStyle style space inlineToICML _ style LineBreak = charStyle style $ text lineSeparator diff --git a/src/Text/Pandoc/Writers/ODT.hs b/src/Text/Pandoc/Writers/ODT.hs index 03f8e8ba4..2a4129512 100644 --- a/src/Text/Pandoc/Writers/ODT.hs +++ b/src/Text/Pandoc/Writers/ODT.hs @@ -41,7 +41,7 @@ import Control.Applicative ((<$>)) import Text.Pandoc.Options ( WriterOptions(..) ) import Text.Pandoc.Shared ( stringify, readDataFile, fetchItem', warn ) import Text.Pandoc.ImageSize ( imageSize, sizeInPoints ) -import Text.Pandoc.MIME ( getMimeType ) +import Text.Pandoc.MIME ( getMimeType, extensionFromMimeType ) import Text.Pandoc.Definition import Text.Pandoc.Walk import Text.Pandoc.Writers.Shared ( fixDisplayMath ) @@ -51,7 +51,7 @@ import Text.Pandoc.XML import Text.Pandoc.Pretty import qualified Control.Exception as E import Data.Time.Clock.POSIX ( getPOSIXTime ) -import System.FilePath ( takeExtension, takeDirectory ) +import System.FilePath ( takeExtension, takeDirectory, (<.>)) -- | Produce an ODT file from a Pandoc document. writeODT :: WriterOptions -- ^ Writer options @@ -133,12 +133,14 @@ transformPicMath opts entriesRef (Image lab (src,_)) = do Left (_ :: E.SomeException) -> do warn $ "Could not find image `" ++ src ++ "', skipping..." return $ Emph lab - Right (img, _) -> do + Right (img, mbMimeType) -> do let size = imageSize img let (w,h) = fromMaybe (0,0) $ sizeInPoints `fmap` size let tit' = show w ++ "x" ++ show h entries <- readIORef entriesRef - let newsrc = "Pictures/" ++ show (length entries) ++ takeExtension src + let extension = fromMaybe (takeExtension $ takeWhile (/='?') src) + (mbMimeType >>= extensionFromMimeType) + let newsrc = "Pictures/" ++ show (length entries) <.> extension let toLazy = B.fromChunks . (:[]) epochtime <- floor `fmap` getPOSIXTime let entry = toEntry newsrc epochtime $ toLazy img diff --git a/src/Text/Pandoc/Writers/RST.hs b/src/Text/Pandoc/Writers/RST.hs index 57ebfc360..5ba4c9983 100644 --- a/src/Text/Pandoc/Writers/RST.hs +++ b/src/Text/Pandoc/Writers/RST.hs @@ -173,11 +173,11 @@ blockToRST (Para [Image txt (src,'f':'i':'g':':':tit)]) = do capt <- inlineListToRST txt let fig = "figure:: " <> text src let alt = ":alt: " <> if null tit then capt else text tit - return $ hang 3 ".. " $ fig $$ alt $+$ capt $$ blankline + return $ hang 3 ".. " (fig $$ alt $+$ capt) $$ blankline blockToRST (Para inlines) | LineBreak `elem` inlines = do -- use line block if LineBreaks lns <- mapM inlineListToRST $ splitBy (==LineBreak) inlines - return $ (vcat $ map (text "| " <>) lns) <> blankline + return $ (vcat $ map (hang 2 (text "| ")) lns) <> blankline | otherwise = do contents <- inlineListToRST inlines return $ contents <> blankline @@ -239,8 +239,7 @@ blockToRST (Table caption _ widths headers rows) = do middle = hcat $ intersperse sep' blocks let makeRow = hpipeBlocks . zipWith lblock widthsInChars let head' = makeRow headers' - rows' <- mapM (\row -> do cols <- mapM blockListToRST row - return $ makeRow cols) rows + let rows' = map makeRow rawRows let border ch = char '+' <> char ch <> (hcat $ intersperse (char ch <> char '+' <> char ch) $ map (\l -> text $ replicate l ch) widthsInChars) <> @@ -253,7 +252,7 @@ blockToRST (Table caption _ widths headers rows) = do blockToRST (BulletList items) = do contents <- mapM bulletListItemToRST items -- ensure that sublists have preceding blank line - return $ blankline $$ vcat contents $$ blankline + return $ blankline $$ chomp (vcat contents) $$ blankline blockToRST (OrderedList (start, style', delim) items) = do let markers = if start == 1 && style' == DefaultStyle && delim == DefaultDelim then take (length items) $ repeat "#." @@ -265,11 +264,11 @@ blockToRST (OrderedList (start, style', delim) items) = do contents <- mapM (\(item, num) -> orderedListItemToRST item num) $ zip markers' items -- ensure that sublists have preceding blank line - return $ blankline $$ vcat contents $$ blankline + return $ blankline $$ chomp (vcat contents) $$ blankline blockToRST (DefinitionList items) = do contents <- mapM definitionListItemToRST items -- ensure that sublists have preceding blank line - return $ blankline $$ vcat contents $$ blankline + return $ blankline $$ chomp (vcat contents) $$ blankline -- | Convert bullet list item (list of blocks) to RST. bulletListItemToRST :: [Block] -> State WriterState Doc @@ -427,7 +426,7 @@ inlineToRST (Image alternate (source, tit)) = do return $ "|" <> label <> "|" inlineToRST (Note contents) = do -- add to notes in state - notes <- get >>= return . stNotes + notes <- gets stNotes modify $ \st -> st { stNotes = contents:notes } let ref = show $ (length notes) + 1 return $ " [" <> text ref <> "]_" diff --git a/tests/Tests/Old.hs b/tests/Tests/Old.hs index 8a256b761..0f7b33dd1 100644 --- a/tests/Tests/Old.hs +++ b/tests/Tests/Old.hs @@ -153,6 +153,9 @@ tests = [ testGroup "markdown" , test "formatting" ["-r", "epub", "-w", "native"] "epub/formatting.epub" "epub/formatting.native" ] + , testGroup "twiki" + [ test "reader" ["-r", "twiki", "-w", "native", "-s"] + "twiki-reader.twiki" "twiki-reader.native" ] , testGroup "other writers" $ map (\f -> testGroup f $ writerTests f) [ "opendocument" , "context" , "texinfo", "icml" , "man" , "plain" , "rtf", "org", "asciidoc" diff --git a/tests/Tests/Readers/Docx.hs b/tests/Tests/Readers/Docx.hs index 565d117e9..2963c34da 100644 --- a/tests/Tests/Readers/Docx.hs +++ b/tests/Tests/Readers/Docx.hs @@ -13,7 +13,6 @@ import Text.Pandoc.Writers.Native (writeNative) import qualified Data.Map as M import Text.Pandoc.MediaBag (MediaBag, lookupMedia, mediaDirectory) import Codec.Archive.Zip -import System.FilePath (combine) -- We define a wrapper around pandoc that doesn't normalize in the -- tests. Since we do our own normalization, we want to make sure @@ -60,7 +59,7 @@ testCompare = testCompareWithOpts def getMedia :: FilePath -> FilePath -> IO (Maybe B.ByteString) getMedia archivePath mediaPath = do zf <- B.readFile archivePath >>= return . toArchive - return $ findEntryByPath (combine "word" mediaPath) zf >>= (Just . fromEntry) + return $ findEntryByPath ("word/" ++ mediaPath) zf >>= (Just . fromEntry) compareMediaPathIO :: FilePath -> MediaBag -> FilePath -> IO Bool compareMediaPathIO mediaPath mediaBag docxPath = do @@ -157,9 +156,9 @@ tests = [ testGroup "inlines" "docx/numbered_header.docx" "docx/numbered_header.native" , testCompare - "headers in other languages" - "docx/danish_headers.docx" - "docx/danish_headers.native" + "i18n blocks (headers and blockquotes)" + "docx/i18n_blocks.docx" + "docx/i18n_blocks.native" , testCompare "lists" "docx/lists.docx" diff --git a/tests/Tests/Readers/EPUB.hs b/tests/Tests/Readers/EPUB.hs index f27ea979f..0d19a8400 100644 --- a/tests/Tests/Readers/EPUB.hs +++ b/tests/Tests/Readers/EPUB.hs @@ -8,6 +8,7 @@ import qualified Data.ByteString.Lazy as BL import Text.Pandoc.Readers.EPUB import Text.Pandoc.MediaBag (MediaBag, mediaDirectory) import Control.Applicative +import System.FilePath (joinPath) getMediaBag :: FilePath -> IO MediaBag getMediaBag fp = snd . readEPUB def <$> BL.readFile fp @@ -22,12 +23,12 @@ testMediaBag fp bag = do (actBag == bag) featuresBag :: [(String, String, Int)] -featuresBag = [("img/ElementaryMathExample.png","image/png",1331),("img/Maghreb1.png","image/png",2520),("img/check.gif","image/gif",1340),("img/check.jpg","image/jpeg",2661),("img/check.png","image/png",2815),("img/cichons_diagram.png","image/png",7045),("img/complex_number.png","image/png",5238),("img/multiscripts_and_greek_alphabet.png","image/png",10060)] +featuresBag = [(joinPath ["img","check.gif"],"image/gif",1340),(joinPath ["img","check.jpg"],"image/jpeg",2661),(joinPath ["img","check.png"],"image/png",2815),(joinPath ["img","multiscripts_and_greek_alphabet.png"],"image/png",10060)] tests :: [Test] tests = [ testGroup "EPUB Mediabag" [ testCase "features bag" - (testMediaBag "epub/features.epub" featuresBag) + (testMediaBag "epub/img.epub" featuresBag) ] ] diff --git a/tests/Tests/Readers/Markdown.hs b/tests/Tests/Readers/Markdown.hs index b45d94032..a7e322306 100644 --- a/tests/Tests/Readers/Markdown.hs +++ b/tests/Tests/Readers/Markdown.hs @@ -20,6 +20,9 @@ markdownCDL :: String -> Pandoc markdownCDL = readMarkdown def { readerExtensions = Set.insert Ext_compact_definition_lists $ readerExtensions def } +markdownGH :: String -> Pandoc +markdownGH = readMarkdown def { readerExtensions = githubMarkdownExtensions } + infix 4 =: (=:) :: ToString c => String -> (String, c) -> Test @@ -271,5 +274,14 @@ tests = [ testGroup "inline code" plain (text "if this button exists") <> rawBlock "html" "</button>" <> divWith nullAttr (para $ text "with this div too.")] + , test markdownGH "issue #1636" $ + unlines [ "* a" + , "* b" + , "* c" + , " * d" ] + =?> + bulletList [ plain "a" + , plain "b" + , plain "c" <> bulletList [plain "d"] ] ] ] diff --git a/tests/Tests/Readers/Org.hs b/tests/Tests/Readers/Org.hs index b9896e1b0..d1f673514 100644 --- a/tests/Tests/Readers/Org.hs +++ b/tests/Tests/Readers/Org.hs @@ -126,6 +126,14 @@ tests = , (emph "b") <> "." ]) + , "Quotes are forbidden border chars" =: + "/'nope/ *nope\"*" =?> + para ("/'nope/" <> space <> "*nope\"*") + + , "Commata are forbidden border chars" =: + "/nada,/" =?> + para "/nada,/" + , "Markup should work properly after a blank line" =: unlines ["foo", "", "/bar/"] =?> (para $ text "foo") <> (para $ emph $ text "bar") @@ -189,6 +197,18 @@ tests = "[[http://zeitlens.com/]]" =?> (para $ link "http://zeitlens.com/" "" "http://zeitlens.com/") + , "Absolute file link" =: + "[[/url][hi]]" =?> + (para $ link "file:///url" "" "hi") + + , "Link to file in parent directory" =: + "[[../file.txt][moin]]" =?> + (para $ link "../file.txt" "" "moin") + + , "Empty link (for gitit interop)" =: + "[[][New Link]]" =?> + (para $ link "" "" "New Link") + , "Image link" =: "[[sunset.png][dusk.svg]]" =?> (para $ link "sunset.png" "" (image "dusk.svg" "" "")) @@ -268,6 +288,18 @@ tests = "\\notacommand{foo}" =?> para (rawInline "latex" "\\notacommand{foo}") + , "MathML symbol in LaTeX-style" =: + "There is a hackerspace in Lübeck, Germany, called nbsp (unicode symbol: '\\nbsp')." =?> + para ("There is a hackerspace in Lübeck, Germany, called nbsp (unicode symbol: ' ').") + + , "MathML symbol in LaTeX-style, including braces" =: + "\\Aacute{}stor" =?> + para "Ástor" + + , "MathML copy sign" =: + "\\copy" =?> + para "©" + , "LaTeX citation" =: "\\cite{Coffee}" =?> let citation = Citation @@ -450,6 +482,18 @@ tests = , header 2 ("walk" <> space <> "dog") ] + , "Comment Trees" =: + unlines [ "* COMMENT A comment tree" + , " Not much going on here" + , "** This will be dropped" + , "* Comment tree above" + ] =?> + header 1 "Comment tree above" + + , "Nothing but a COMMENT header" =: + "* COMMENT Test" =?> + (mempty::Blocks) + , "Paragraph starting with an asterisk" =: "*five" =?> para "*five" @@ -578,6 +622,13 @@ tests = , plain "Item2" ] + , "Unindented *" =: + ("- Item1\n" ++ + "* Item2\n") =?> + bulletList [ plain "Item1" + ] <> + header 1 "Item2" + , "Multi-line Bullet Lists" =: ("- *Fat\n" ++ " Tony*\n" ++ @@ -622,6 +673,33 @@ tests = ] ] + , "Bullet List with Decreasing Indent" =: + (" - Discovery\n\ + \ - Human After All\n") =?> + mconcat [ bulletList [ plain "Discovery" ] + , bulletList [ plain ("Human" <> space <> "After" <> space <> "All")] + ] + + , "Header follows Bullet List" =: + (" - Discovery\n\ + \ - Human After All\n\ + \* Homework") =?> + mconcat [ bulletList [ plain "Discovery" + , plain ("Human" <> space <> "After" <> space <> "All") + ] + , header 1 "Homework" + ] + + , "Bullet List Unindented with trailing Header" =: + ("- Discovery\n\ + \- Homework\n\ + \* NotValidListItem") =?> + mconcat [ bulletList [ plain "Discovery" + , plain "Homework" + ] + , header 1 "NotValidListItem" + ] + , "Simple Ordered List" =: ("1. Item1\n" ++ "2. Item2\n") =?> @@ -702,7 +780,9 @@ tests = ] ]) ] - + , "Definition list with multi-word term" =: + " - Elijah Wood :: He plays Frodo" =?> + definitionList [ ("Elijah" <> space <> "Wood", [plain $ "He" <> space <> "plays" <> space <> "Frodo"])] , "Compact definition list" =: unlines [ "- ATP :: adenosine 5' triphosphate" , "- DNA :: deoxyribonucleic acid" @@ -715,6 +795,16 @@ tests = , ("PCR", [ plain $ spcSep [ "polymerase", "chain", "reaction" ] ]) ] + , "Definition List With Trailing Header" =: + "- definition :: list\n\ + \- cool :: defs\n\ + \* header" =?> + mconcat [ definitionList [ ("definition", [plain "list"]) + , ("cool", [plain "defs"]) + ] + , header 1 "header" + ] + , "Loose bullet list" =: unlines [ "- apple" , "" diff --git a/tests/Tests/Readers/RST.hs b/tests/Tests/Readers/RST.hs index a80dc32b7..c97dcb149 100644 --- a/tests/Tests/Readers/RST.hs +++ b/tests/Tests/Readers/RST.hs @@ -67,5 +67,26 @@ tests = [ "line block with blank line" =: link "http://foo.bar.baz" "" "http://foo.bar.baz" <> ". " <> link "http://foo.bar/baz_(bam)" "" "http://foo.bar/baz_(bam)" <> " (" <> link "http://foo.bar" "" "http://foo.bar" <> ")") + , "indented literal block" =: unlines + [ "::" + , "" + , " block quotes" + , "" + , " can go on for many lines" + , "but must stop here"] + =?> (doc $ + codeBlock "block quotes\n\ncan go on for many lines" <> + para "but must stop here") + , "line block with 3 lines" =: "| a\n| b\n| c" + =?> para ("a" <> linebreak <> "b" <> linebreak <> "c") + , "quoted literal block using >" =: "::\n\n> quoted\n> block\n\nOrdinary paragraph" + =?> codeBlock "> quoted\n> block" <> para "Ordinary paragraph" + , "quoted literal block using | (not a line block)" =: "::\n\n| quoted\n| block\n\nOrdinary paragraph" + =?> codeBlock "| quoted\n| block" <> para "Ordinary paragraph" + , "class directive with single paragraph" =: ".. class:: special\n\nThis is a \"special\" paragraph." + =?> divWith ("", ["special"], []) (para "This is a \"special\" paragraph.") + , "class directive with two paragraphs" =: ".. class:: exceptional remarkable\n\n First paragraph.\n\n Second paragraph." + =?> divWith ("", ["exceptional", "remarkable"], []) (para "First paragraph." <> para "Second paragraph.") + , "class directive around literal block" =: ".. class:: classy\n\n::\n\n a\n b" + =?> divWith ("", ["classy"], []) (codeBlock "a\nb") ] - diff --git a/tests/Tests/Shared.hs b/tests/Tests/Shared.hs index b6671835c..9b55b7b1d 100644 --- a/tests/Tests/Shared.hs +++ b/tests/Tests/Shared.hs @@ -9,6 +9,7 @@ import Test.Framework.Providers.HUnit import Test.HUnit ( assertBool, (@?=) ) import Text.Pandoc.Builder import Data.Monoid +import System.FilePath (joinPath) tests :: [Test] tests = [ testGroup "normalize" @@ -40,21 +41,21 @@ p_normalize_no_trailing_spaces ils = null ils' || last ils' /= Space testCollapse :: [Test] testCollapse = map (testCase "collapse") - [ (collapseFilePath "" @?= "") - , (collapseFilePath "./foo" @?= "foo") - , (collapseFilePath "././../foo" @?= "../foo") - , (collapseFilePath "../foo" @?= "../foo") - , (collapseFilePath "/bar/../baz" @?= "/baz") - , (collapseFilePath "/../baz" @?= "/../baz") - , (collapseFilePath "./foo/.././bar/../././baz" @?= "baz") - , (collapseFilePath "./" @?= "") - , (collapseFilePath "././" @?= "") - , (collapseFilePath "../" @?= "..") - , (collapseFilePath ".././" @?= "..") - , (collapseFilePath "./../" @?= "..") - , (collapseFilePath "../../" @?= "../..") - , (collapseFilePath "parent/foo/baz/../bar" @?= "parent/foo/bar") - , (collapseFilePath "parent/foo/baz/../../bar" @?= "parent/bar") - , (collapseFilePath "parent/foo/.." @?= "parent") - , (collapseFilePath "/parent/foo/../../bar" @?= "/bar") - , (collapseFilePath "/./parent/foo" @?= "/parent/foo")] + [ (collapseFilePath (joinPath [ ""]) @?= (joinPath [ ""])) + , (collapseFilePath (joinPath [ ".","foo"]) @?= (joinPath [ "foo"])) + , (collapseFilePath (joinPath [ ".",".","..","foo"]) @?= (joinPath [ joinPath ["..", "foo"]])) + , (collapseFilePath (joinPath [ "..","foo"]) @?= (joinPath [ "..","foo"])) + , (collapseFilePath (joinPath [ "","bar","..","baz"]) @?= (joinPath [ "","baz"])) + , (collapseFilePath (joinPath [ "","..","baz"]) @?= (joinPath [ "","..","baz"])) + , (collapseFilePath (joinPath [ ".","foo","..",".","bar","..",".",".","baz"]) @?= (joinPath [ "baz"])) + , (collapseFilePath (joinPath [ ".",""]) @?= (joinPath [ ""])) + , (collapseFilePath (joinPath [ ".",".",""]) @?= (joinPath [ ""])) + , (collapseFilePath (joinPath [ "..",""]) @?= (joinPath [ ".."])) + , (collapseFilePath (joinPath [ "..",".",""]) @?= (joinPath [ ".."])) + , (collapseFilePath (joinPath [ ".","..",""]) @?= (joinPath [ ".."])) + , (collapseFilePath (joinPath [ "..","..",""]) @?= (joinPath [ "..",".."])) + , (collapseFilePath (joinPath [ "parent","foo","baz","..","bar"]) @?= (joinPath [ "parent","foo","bar"])) + , (collapseFilePath (joinPath [ "parent","foo","baz","..","..","bar"]) @?= (joinPath [ "parent","bar"])) + , (collapseFilePath (joinPath [ "parent","foo",".."]) @?= (joinPath [ "parent"])) + , (collapseFilePath (joinPath [ "","parent","foo","..","..","bar"]) @?= (joinPath [ "","bar"])) + , (collapseFilePath (joinPath [ "",".","parent","foo"]) @?= (joinPath [ "","parent","foo"]))] diff --git a/tests/docx/danish_headers.docx b/tests/docx/danish_headers.docx Binary files differdeleted file mode 100644 index 345fc632c..000000000 --- a/tests/docx/danish_headers.docx +++ /dev/null diff --git a/tests/docx/danish_headers.native b/tests/docx/danish_headers.native deleted file mode 100644 index 12a857811..000000000 --- a/tests/docx/danish_headers.native +++ /dev/null @@ -1,10 +0,0 @@ -[Header 1 ("testoverskrift",[],[]) [Str "Testoverskrift"] -,Para [Str "Normal"] -,Header 2 ("testoverskrift-2",[],[]) [Str "Testoverskrift",Space,Str "2"] -,Table [] [AlignDefault,AlignDefault] [0.0,0.0] - [[Plain [Str "Kolonne1"]] - ,[Plain [Str "Kolonne2"]]] - [[[Plain [Str "Testdata",Space,Str "1"]] - ,[Plain [Str "Tester2"]]] - ,[[Plain [Str "Testdata",Space,Str "2"]] - ,[Plain [Str "Tester3"]]]]] diff --git a/tests/docx/i18n_blocks.docx b/tests/docx/i18n_blocks.docx Binary files differnew file mode 100644 index 000000000..36341c363 --- /dev/null +++ b/tests/docx/i18n_blocks.docx diff --git a/tests/docx/i18n_blocks.native b/tests/docx/i18n_blocks.native new file mode 100644 index 000000000..582a7360d --- /dev/null +++ b/tests/docx/i18n_blocks.native @@ -0,0 +1,8 @@ +[Header 1 ("this-is-heading-1",[],[]) [Str "This",Space,Str "is",Space,Str "Heading",Space,Str "1"] +,Header 2 ("this-is-heading-2",[],[]) [Str "This",Space,Str "is",Space,Str "Heading",Space,Str "2"] +,BlockQuote + [Para [Str "This",Space,Str "is",Space,Str "Quote"] + ,Para [Str "This",Space,Str "is",Space,Str "Block",Space,Str "Text"]] +,BulletList + [[Para [Str "This",Space,Str "is",Space,Str "list",Space,Str "item",Space,Str "1"]] + ,[Para [Str "This",Space,Str "is",Space,Str "list",Space,Str "item",Space,Str "2"]]]] diff --git a/tests/docx/links.docx b/tests/docx/links.docx Binary files differindex 10ec62fd7..538b84b08 100644 --- a/tests/docx/links.docx +++ b/tests/docx/links.docx diff --git a/tests/docx/links.native b/tests/docx/links.native index c741fe875..cd7ab6fb6 100644 --- a/tests/docx/links.native +++ b/tests/docx/links.native @@ -1,5 +1,6 @@ [Header 2 ("an-internal-link-and-an-external-link",[],[]) [Str "An",Space,Str "internal",Space,Str "link",Space,Str "and",Space,Str "an",Space,Str "external",Space,Str "link"] ,Para [Str "An",Space,Link [Str "external",Space,Str "link"] ("http://google.com",""),Space,Str "to",Space,Str "a",Space,Str "popular",Space,Str "website."] +,Para [Str "An",Space,Link [Str "external",Space,Str "link"] ("http://johnmacfarlane.net/pandoc/README.html#synopsis",""),Space,Str "to",Space,Str "a",Space,Str "website",Space,Str "with",Space,Str "an",Space,Str "anchor."] ,Para [Str "An",Space,Link [Str "internal",Space,Str "link"] ("#a-section-for-testing-link-targets",""),Space,Str "to",Space,Str "a",Space,Str "section",Space,Str "header."] ,Para [Str "An",Space,Link [Str "internal",Space,Str "link"] ("#my_bookmark",""),Space,Str "to",Space,Str "a",Space,Str "bookmark."] ,Header 2 ("a-section-for-testing-link-targets",[],[]) [Str "A",Space,Str "section",Space,Str "for",Space,Str "testing",Space,Str "link",Space,Str "targets"] diff --git a/tests/epub/features.epub b/tests/epub/features.epub Binary files differindex 8dcae384b..2690eec8b 100644 --- a/tests/epub/features.epub +++ b/tests/epub/features.epub diff --git a/tests/epub/features.native b/tests/epub/features.native index f01070383..6ccc04f43 100644 --- a/tests/epub/features.native +++ b/tests/epub/features.native @@ -1,5 +1,4 @@ -[Para [Image [] ("img/multiscripts_and_greek_alphabet.png","")] -,Para [Span ("front.xhtml",[],[]) []] +[Para [Span ("front.xhtml",[],[]) []] ,RawBlock (Format "html") "<section>" ,Header 1 ("",[],[]) [Str "Reflowable",Space,Str "EPUB",Space,Str "3",Space,Str "Conformance",Space,Str "Test",Space,Str "Document:",Space,Str "0100"] ,RawBlock (Format "html") "<section>" @@ -28,31 +27,6 @@ [[Plain [Str "@@@TODO",Space,Str "provide",Space,Str "info",Space,Str "on",Space,Str "where",Space,Str "to",Space,Str "get",Space,Str "the",Space,Str "results",Space,Str "form"]]])] ,RawBlock (Format "html") "</section>" ,RawBlock (Format "html") "</section>" -,Para [Span ("content-images-001.xhtml",[],[]) []] -,RawBlock (Format "html") "<section>" -,Header 2 ("content-images-001.xhtml#multimedia",[],[]) [Str "Multimedia"] -,RawBlock (Format "html") "<section>" -,Header 3 ("content-images-001.xhtml#images",[],[]) [Str "Images"] -,RawBlock (Format "html") "<section id=\"img-010\" class=\"ctest\">" -,Header 4 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "img-010"],Space,Str "GIF"] -,Para [Str "Tests",Space,Str "whether",Space,Str "the",Space,Str "GIF",Space,Str "image",Space,Str "format",Space,Str "is",Space,Str "supported."] -,Para [Image [Str "gif",Space,Str "test"] ("img/check.gif","")] -,Para [Str "If",Space,Str "a",Space,Str "checkmark",Space,Str "precedes",Space,Str "this",Space,Str "paragaph,",Space,Str "the",Space,Str "test",Space,Str "passes."] -,RawBlock (Format "html") "</section>" -,RawBlock (Format "html") "<section id=\"img-020\" class=\"ctest\">" -,Header 4 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "img-020"],Space,Str "PNG"] -,Para [Str "Tests",Space,Str "whether",Space,Str "the",Space,Str "PNG",Space,Str "image",Space,Str "format",Space,Str "is",Space,Str "supported."] -,Para [Image [Str "png",Space,Str "test"] ("img/check.png","")] -,Para [Str "If",Space,Str "a",Space,Str "checkmark",Space,Str "precedes",Space,Str "this",Space,Str "paragaph,",Space,Str "the",Space,Str "test",Space,Str "passes."] -,RawBlock (Format "html") "</section>" -,RawBlock (Format "html") "<section id=\"img-030\" class=\"ctest\">" -,Header 4 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "img-030"],Space,Str "JPEG"] -,Para [Str "Tests",Space,Str "whether",Space,Str "the",Space,Str "JPEG",Space,Str "image",Space,Str "format",Space,Str "is",Space,Str "supported."] -,Para [Image [Str "jpeg",Space,Str "test"] ("img/check.jpg","")] -,Para [Str "If",Space,Str "a",Space,Str "checkmark",Space,Str "precedes",Space,Str "this",Space,Str "paragaph,",Space,Str "the",Space,Str "test",Space,Str "passes."] -,RawBlock (Format "html") "</section>" -,RawBlock (Format "html") "</section>" -,RawBlock (Format "html") "</section>" ,Para [Span ("content-mathml-001.xhtml",[],[]) []] ,RawBlock (Format "html") "<section>" ,Header 2 ("content-mathml-001.xhtml#mathml",[],[]) [Str "MathML"] @@ -94,13 +68,13 @@ ,Header 2 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "mathml-024"],Str "Horizontal",Space,Str "stretch,",Space,Code ("",[],[]) "mover",Str ",",Space,Code ("",[],[]) "munder",Str ",",Space,Str "and",Space,Code ("",[],[]) "mspace",Space,Str "elements"] ,Para [Str "Tests",Space,Str "whether",Space,Str "horizontal",Space,Str "stretch,",Space,Code ("",[],[]) "mover",Str ",",Space,Code ("",[],[]) "munder",Str ",",Space,Code ("",[],[]) "mspace",Space,Str "elements",Space,Str "are",Space,Str "supported."] ,Para [Math DisplayMath "c = \\overset{\\text{complex\\ number}}{\\overbrace{\\underset{\\text{real}}{\\underbrace{\\mspace{20mu} a\\mspace{20mu}}} + \\underset{\\text{imaginary}}{\\underbrace{\\quad b{\\mathbb{i}}\\quad}}}}"] -,Para [Str "The",Space,Str "test",Space,Str "passes",Space,Str "if",Space,Str "the",Space,Str "rendering",Space,Str "looks",Space,Str "like",Space,Image [Str "description",Space,Str "of",Space,Str "imaginary",Space,Str "number:",Space,Str "c",Space,Str "=",Space,Str "a",Space,Str "+bi",Space,Str "with",Space,Str "an",Space,Str "overbrace",Space,Str "reading",Space,Str "'complex",Space,Str "number'",Space,Str "and",Space,Str "underbraces",Space,Str "below",Space,Str "'a'",Space,Str "and",Space,Str "'b",Space,Str "i'",Space,Str "reading",Space,Str "'real'",Space,Str "and",Space,Str "'imaginary'",Space,Str "respectively."] ("img/complex_number.png",""),Str "."] +,Para [Str "The",Space,Str "test",Space,Str "passes",Space,Str "if",Space,Str "the",Space,Str "rendering",Space,Str "looks",Space,Str "like",Space,Str "."] ,RawBlock (Format "html") "</section>" ,RawBlock (Format "html") "<section id=\"mathml-025\" class=\"ctest\">" ,Header 2 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "mathml-025"],Str "Testing",Space,Code ("",[],[]) "mtable",Space,Str "with",Space,Code ("",[],[]) "colspan",Space,Str "and",Space,Code ("",[],[]) "rowspan",Space,Str "attributes,",Space,Str "Hebrew",Space,Str "and",Space,Str "Script",Space,Str "fonts"] ,Para [Str "Tests",Space,Str "whether",Space,Code ("",[],[]) "mtable",Space,Str "with",Space,Code ("",[],[]) "colspan",Space,Str "and",Space,Code ("",[],[]) "mspace",Space,Str "attributes",Space,Str "(colum",Space,Str "and",Space,Str "row",Space,Str "spanning)",Space,Str "are",Space,Str "supported;",Space,Str "uses",Space,Str "Hebrew",Space,Str "and",Space,Str "Script",Space,Str "alphabets."] ,Para [Math DisplayMath "\\begin{array}{llllllllll}\n & {\\operatorname{cov}\\left( \\mathcal{L} \\right)} & \\longrightarrow & {\\operatorname{non}\\left( \\mathcal{K} \\right)} & \\longrightarrow & {\\operatorname{cof}\\left( \\mathcal{K} \\right)} & \\longrightarrow & {\\operatorname{cof}\\left( \\mathcal{L} \\right)} & \\longrightarrow & 2^{\\aleph_{0}} \\\\\n & \\uparrow & & \\uparrow & & \\uparrow & & \\uparrow & & \\\\\n & {\\mathfrak{b}} & \\longrightarrow & {\\mathfrak{d}} & & & & & & \\\\\n & \\uparrow & & \\uparrow & & & & & & \\\\\n\\aleph_{1} & \\longrightarrow & {\\operatorname{add}\\left( \\mathcal{L} \\right)} & \\longrightarrow & {\\operatorname{add}\\left( \\mathcal{K} \\right)} & \\longrightarrow & {\\operatorname{cov}\\left( \\mathcal{K} \\right)} & \\longrightarrow & {\\operatorname{non}\\left( \\mathcal{L} \\right)} & \\\\\n\\end{array}"] -,Para [Str "The",Space,Str "test",Space,Str "passes",Space,Str "if",Space,Str "the",Space,Str "rendering",Space,Str "looks",Space,Str "like",Space,Link [Str "Cicho\324's",Space,Str "Diagram"] ("Cicho%C5%84's_diagram",""),Str ":",Space,Image [Str "rendering",Space,Str "of",Space,Str "Cicho\324's",Space,Str "diagram."] ("img/cichons_diagram.png",""),Str "."] +,Para [Str "The",Space,Str "test",Space,Str "passes",Space,Str "if",Space,Str "the",Space,Str "rendering",Space,Str "looks",Space,Str "like",Space,Link [Str "Cicho\324's",Space,Str "Diagram"] ("Cicho%C5%84's_diagram",""),Str ":",Space,Str "."] ,RawBlock (Format "html") "</section>" ,RawBlock (Format "html") "<section id=\"mathml-026\" class=\"ctest\">" ,Header 2 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "mathml-026"],Str "BiDi,",Space,Str "RTL",Space,Str "and",Space,Str "Arabic",Space,Str "alphabets"] @@ -111,7 +85,7 @@ ,RawBlock (Format "html") "<section id=\"mathml-027\" class=\"ctest\">" ,Header 2 ("",[],[]) [Span ("",["nature"],[]) [Str "[REQUIRED]"],Space,Span ("",["test-id"],[]) [Str "mathml-027"],Str "Elementary",Space,Str "math:",Space,Str "long",Space,Str "division",Space,Str "notation"] ,Para [Str "Tests",Space,Str "whether",Space,Code ("",[],[]) "mlongdiv",Space,Str "elements",Space,Str "(from",Space,Str "elementary",Space,Str "math)",Space,Str "are",Space,Str "supported."] -,Para [Str "The",Space,Str "test",Space,Str "passes",Space,Str "if",Space,Str "the",Space,Str "rendering",Space,Str "looks",Space,Str "like",Space,Str "the",Space,Str "following",Space,Str "image:",Space,Image [Str "A",Space,Str "long",Space,Str "division",Space,Str "dividing",Space,Str "1306",Space,Str "by",Space,Str "3,",Space,Str "presented",Space,Str "in",Space,Str "'lefttop'",Space,Str "(US)",Space,Str "notation"] ("img/ElementaryMathExample.png",""),Str "."] +,Para [Str "The",Space,Str "test",Space,Str "passes",Space,Str "if",Space,Str "the",Space,Str "rendering",Space,Str "looks",Space,Str "like",Space,Str "the",Space,Str "following",Space,Str "image:",Space,Str "."] ,RawBlock (Format "html") "</section>" ,RawBlock (Format "html") "</section>" ,Para [Span ("content-switch-001.xhtml",[],[]) []] @@ -130,6 +104,4 @@ ,Para [Str "If",Space,Str "a",Space,Str "MathML",Space,Str "equation",Space,Str "is",Space,Str "rendered",Space,Str "before",Space,Str "this",Space,Str "paragraph,",Space,Str "the",Space,Str "test",Space,Str "passes."] ,Para [Str "If",Space,Str "test",Space,Code ("",[],[]) "switch-010",Space,Str "did",Space,Str "not",Space,Str "pass,",Space,Str "this",Space,Str "test",Space,Str "should",Space,Str "be",Space,Str "marked",Space,Code ("",[],[]) "Not Supported",Str "."] ,RawBlock (Format "html") "</section>" -,RawBlock (Format "html") "</section>" -,Para [Span ("Maghreb1.png",[],[]) []] -,Para [Image [] ("img/Maghreb1.png","")]] +,RawBlock (Format "html") "</section>"] diff --git a/tests/epub/img.epub b/tests/epub/img.epub Binary files differnew file mode 100644 index 000000000..ebe80d935 --- /dev/null +++ b/tests/epub/img.epub diff --git a/tests/html-reader.html b/tests/html-reader.html index 14ad3ed54..e9ba2a68b 100644 --- a/tests/html-reader.html +++ b/tests/html-reader.html @@ -451,5 +451,4 @@ An e-mail address: nobody [at] nowhere.net<blockquote> </tr> </table> </body> -</body> </html> diff --git a/tests/markdown-reader-more.native b/tests/markdown-reader-more.native index 30da0afbb..00313a0ac 100644 --- a/tests/markdown-reader-more.native +++ b/tests/markdown-reader-more.native @@ -17,6 +17,7 @@ ,Header 3 ("my-header",[],[]) [Str "my",Space,Str "header"] ,Header 2 ("in-math",[],[]) [Str "$",Space,Str "in",Space,Str "math"] ,Para [Math InlineMath "\\$2 + \\$3"] +,Para [Math InlineMath "x = \\text{the $n$th root of $y$}"] ,Para [Str "This",Space,Str "should",Space,Str "not",Space,Str "be",Space,Str "math:"] ,Para [Str "$PATH",Space,Str "90",Space,Str "$PATH"] ,Header 2 ("commented-out-list-item",[],[]) [Str "Commented-out",Space,Str "list",Space,Str "item"] @@ -78,6 +79,8 @@ ,Para [Str "Link",Space,Str "to",Space,Link [Str "Explicit",Space,Str "header",Space,Str "attributes"] ("#foobar",""),Str "."] ,Para [Str "But",Space,Str "this",Space,Str "is",Space,Str "not",Space,Str "a",Space,Str "link",Space,Str "to",Space,Link [Str "My",Space,Str "other",Space,Str "header"] ("/foo",""),Str ",",Space,Str "since",Space,Str "the",Space,Str "reference",Space,Str "is",Space,Str "defined."] ,Header 2 ("foobar",["baz"],[("key","val")]) [Str "Explicit",Space,Str "header",Space,Str "attributes"] +,BlockQuote + [Header 2 ("foobar",["baz"],[("key","val")]) [Str "Header",Space,Str "attributes",Space,Str "inside",Space,Str "block",Space,Str "quote"]] ,Header 2 ("line-blocks",[],[]) [Str "Line",Space,Str "blocks"] ,Para [Str "But",Space,Str "can",Space,Str "a",Space,Str "bee",Space,Str "be",Space,Str "said",Space,Str "to",Space,Str "be",LineBreak,Str "\160\160\160\160or",Space,Str "not",Space,Str "to",Space,Str "be",Space,Str "an",Space,Str "entire",Space,Str "bee,",LineBreak,Str "\160\160\160\160\160\160\160\160when",Space,Str "half",Space,Str "the",Space,Str "bee",Space,Str "is",Space,Str "not",Space,Str "a",Space,Str "bee,",LineBreak,Str "\160\160\160\160\160\160\160\160\160\160\160\160due",Space,Str "to",Space,Str "some",Space,Str "ancient",Space,Str "injury?"] ,Para [Str "Continuation",Space,Str "line",LineBreak,Str "\160\160and",Space,Str "another"] diff --git a/tests/markdown-reader-more.txt b/tests/markdown-reader-more.txt index c486f8885..ea4fc9cad 100644 --- a/tests/markdown-reader-more.txt +++ b/tests/markdown-reader-more.txt @@ -60,6 +60,8 @@ $\$2 + \$3$ +$x = \text{the $n$th root of $y$}$ + This should not be math: $PATH 90 $PATH @@ -174,6 +176,8 @@ But this is not a link to [My other header], since the reference is defined. ## Explicit header attributes {#foobar .baz key="val"} +> ## Header attributes inside block quote {#foobar .baz key="val"} + ## Line blocks | But can a bee be said to be diff --git a/tests/tables.asciidoc b/tests/tables.asciidoc index ba647866a..2a24544a3 100644 --- a/tests/tables.asciidoc +++ b/tests/tables.asciidoc @@ -65,4 +65,3 @@ Multiline table without column headers: |First |row |12.0 |Example of a row that spans multiple lines. |Second |row |5.0 |Here's another one. Note the blank line between rows. |======================================================================= - diff --git a/tests/tables.haddock b/tests/tables.haddock index 413ec97ad..f9efdc0de 100644 --- a/tests/tables.haddock +++ b/tests/tables.haddock @@ -74,4 +74,3 @@ Multiline table without column headers: > the blank line between > rows. > ----------- ---------- ------------ -------------------------- - diff --git a/tests/tables.org b/tests/tables.org index 8d9100d07..9eaf5e706 100644 --- a/tests/tables.org +++ b/tests/tables.org @@ -49,4 +49,3 @@ Multiline table without column headers: | First | row | 12.0 | Example of a row that spans multiple lines. | | Second | row | 5.0 | Here's another one. Note the blank line between rows. | - diff --git a/tests/tables.rst b/tests/tables.rst index e77f69d97..25d5932ea 100644 --- a/tests/tables.rst +++ b/tests/tables.rst @@ -88,4 +88,3 @@ Multiline table without column headers: | | | | the blank line between | | | | | rows. | +-------------+------------+--------------+----------------------------+ - diff --git a/tests/twiki-reader.native b/tests/twiki-reader.native new file mode 100644 index 000000000..bde55a378 --- /dev/null +++ b/tests/twiki-reader.native @@ -0,0 +1,174 @@ +Pandoc (Meta {unMeta = fromList []}) +[Header 1 ("header",[],[]) [Str "header"] +,Header 2 ("header-level-two",[],[]) [Str "header",Space,Str "level",Space,Str "two"] +,Header 3 ("header-level-3",[],[]) [Str "header",Space,Str "level",Space,Str "3"] +,Header 4 ("header-level-four",[],[]) [Str "header",Space,Emph [Str "level"],Space,Str "four"] +,Header 5 ("header-level-5",[],[]) [Str "header",Space,Str "level",Space,Str "5"] +,Header 6 ("header-level-6",[],[]) [Str "header",Space,Str "level",Space,Str "6"] +,Para [Str "---+++++++",Space,Str "not",Space,Str "a",Space,Str "header"] +,Para [Str "--++",Space,Str "not",Space,Str "a",Space,Str "header"] +,Header 1 ("emph-and-strong",[],[]) [Str "emph",Space,Str "and",Space,Str "strong"] +,Para [Emph [Str "emph"],Space,Strong [Str "strong"]] +,Para [Emph [Strong [Str "strong",Space,Str "and",Space,Str "emph"]]] +,Para [Strong [Emph [Str "emph",Space,Str "inside"],Space,Str "strong"]] +,Para [Strong [Str "strong",Space,Str "with",Space,Emph [Str "emph"]]] +,Para [Emph [Strong [Str "strong",Space,Str "inside"],Space,Str "emph"]] +,Header 1 ("horizontal-rule",[],[]) [Str "horizontal",Space,Str "rule"] +,Para [Str "top"] +,HorizontalRule +,Para [Str "bottom"] +,HorizontalRule +,Header 1 ("nop",[],[]) [Str "nop"] +,Para [Str "_not",Space,Str "emph_"] +,Header 1 ("entities",[],[]) [Str "entities"] +,Para [Str "hi",Space,Str "&",Space,Str "low"] +,Para [Str "hi",Space,Str "&",Space,Str "low"] +,Para [Str "G\246del"] +,Para [Str "\777\2730"] +,Header 1 ("comments",[],[]) [Str "comments"] +,Para [Str "inline",Space,Str "comment"] +,Para [Str "between",Space,Str "blocks"] +,Header 1 ("linebreaks",[],[]) [Str "linebreaks"] +,Para [Str "hi",LineBreak,Str "there"] +,Para [Str "hi",LineBreak,Space,Str "there"] +,Header 1 ("inline-code",[],[]) [Str "inline",Space,Str "code"] +,Para [Code ("",[],[]) "*\8594*",Space,Code ("",[],[]) "typed",Space,Code ("",["haskell"],[]) ">>="] +,Header 1 ("code-blocks",[],[]) [Str "code",Space,Str "blocks"] +,CodeBlock ("",[],[]) "case xs of\n (_:_) -> reverse xs\n [] -> ['*']" +,CodeBlock ("",["haskell"],[]) "case xs of\n (_:_) -> reverse xs\n [] -> ['*']" +,Header 1 ("block-quotes",[],[]) [Str "block",Space,Str "quotes"] +,Para [Str "Regular",Space,Str "paragraph"] +,BlockQuote + [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "block",Space,Str "quote."] + ,Para [Str "With",Space,Str "two",Space,Str "paragraphs."]] +,Para [Str "Nother",Space,Str "paragraph."] +,Header 1 ("external-links",[],[]) [Str "external",Space,Str "links"] +,Para [Link [Emph [Str "Google"],Space,Str "search",Space,Str "engine"] ("http://google.com","")] +,Para [Link [Str "http://johnmacfarlane.net/pandoc/"] ("http://johnmacfarlane.net/pandoc/","")] +,Para [Link [Str "http://google.com"] ("http://google.com",""),Space,Link [Str "http://yahoo.com"] ("http://yahoo.com","")] +,Para [Link [Str "email",Space,Str "me"] ("mailto:info@example.org","")] +,Para [Str "http://google.com"] +,Para [Str "http://google.com"] +,Para [Str "http://google.com"] +,Para [Str "info@example.org"] +,Para [Str "info@example.org"] +,Para [Str "info@example.org"] +,Header 1 ("lists",[],[]) [Str "lists"] +,BulletList + [[Plain [Str "Start",Space,Str "each",Space,Str "line"]] + ,[Plain [Str "with",Space,Str "an",Space,Str "asterisk",Space,Str "(*)."] + ,BulletList + [[Plain [Str "More",Space,Str "asterisks",Space,Str "gives",Space,Str "deeper"] + ,BulletList + [[Plain [Str "and",Space,Str "deeper",Space,Str "levels."]]]]]] + ,[Plain [Str "Line",Space,Str "breaks",LineBreak,Str "don't",Space,Str "break",Space,Str "levels."]] + ,[Plain [Str "Continuations",Space,Str "are",Space,Str "also",Space,Str "possible"] + ,BulletList + [[Plain [Str "and",Space,Str "do",Space,Str "not",Space,Str "break",Space,Str "the",Space,Str "list",Space,Str "flow"]]]] + ,[Plain [Str "Level",Space,Str "one"]]] +,Para [Str "Any",Space,Str "other",Space,Str "start",Space,Str "ends",Space,Str "the",Space,Str "list."] +,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "Start",Space,Str "each",Space,Str "line"]] + ,[Plain [Str "with",Space,Str "a",Space,Str "number",Space,Str "(1.)."] + ,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "More",Space,Str "number",Space,Str "signs",Space,Str "gives",Space,Str "deeper"] + ,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "and",Space,Str "deeper"]] + ,[Plain [Str "levels."]]]]]] + ,[Plain [Str "Line",Space,Str "breaks",LineBreak,Str "don't",Space,Str "break",Space,Str "levels."]] + ,[Plain [Str "Blank",Space,Str "lines"]]] +,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "end",Space,Str "the",Space,Str "list",Space,Str "and",Space,Str "start",Space,Str "another."]]] +,Para [Str "Any",Space,Str "other",Space,Str "start",Space,Str "also",Space,Str "ends",Space,Str "the",Space,Str "list."] +,DefinitionList + [([Str "item",Space,Str "1"], + [[Plain [Str "definition",Space,Str "1"]]]) + ,([Str "item",Space,Str "2"], + [[Plain [Str "definition",Space,Str "2-1",Space,Str "definition",Space,Str "2-2"]]]) + ,([Str "item",Space,Emph [Str "3"]], + [[Plain [Str "definition",Space,Emph [Str "3"]]]])] +,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "one"]] + ,[Plain [Str "two"] + ,BulletList + [[Plain [Str "two",Space,Str "point",Space,Str "one"]] + ,[Plain [Str "two",Space,Str "point",Space,Str "two"]]]] + ,[Plain [Str "three"] + ,DefinitionList + [([Str "three",Space,Str "item",Space,Str "one"], + [[Plain [Str "three",Space,Str "def",Space,Str "one"]]])]] + ,[Plain [Str "four"] + ,DefinitionList + [([Str "four",Space,Str "def",Space,Str "one"], + [[Plain [Str "this",Space,Str "is",Space,Str "a",Space,Str "continuation"]]])]] + ,[Plain [Str "five"] + ,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "five",Space,Str "sub",Space,Str "1"] + ,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "five",Space,Str "sub",Space,Str "1",Space,Str "sub",Space,Str "1"]]]] + ,[Plain [Str "five",Space,Str "sub",Space,Str "2"]]]]] +,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "other"] + ,OrderedList (1,UpperRoman,DefaultDelim) + [[Plain [Str "list"]] + ,[Plain [Str "styles"]]]] + ,[Plain [Str "are"] + ,OrderedList (1,LowerRoman,DefaultDelim) + [[Plain [Str "also"]] + ,[Plain [Str "possible"]]]] + ,[Plain [Str "all"] + ,OrderedList (1,LowerAlpha,DefaultDelim) + [[Plain [Str "the"]] + ,[Plain [Str "different"]] + ,[Plain [Str "styles"]]]] + ,[Plain [Str "are"] + ,OrderedList (1,UpperAlpha,DefaultDelim) + [[Plain [Str "implemented"]] + ,[Plain [Str "and"]] + ,[Plain [Str "supported"]]]]] +,Header 1 ("tables",[],[]) [Str "tables"] +,Table [] [AlignDefault,AlignDefault] [0.0,0.0] + [[] + ,[]] + [[[Plain [Str "Orange"]] + ,[Plain [Str "Apple"]]] + ,[[Plain [Str "Bread"]] + ,[Plain [Str "Pie"]]] + ,[[Plain [Str "Butter"]] + ,[Plain [Str "Ice",Space,Str "cream"]]]] +,Table [] [AlignLeft,AlignLeft] [0.0,0.0] + [[Plain [Str "Orange"]] + ,[Plain [Str "Apple"]]] + [[[Plain [Str "Bread"]] + ,[Plain [Str "Pie"]]] + ,[[Plain [Strong [Str "Butter"]]] + ,[Plain [Str "Ice",Space,Str "cream"]]]] +,Table [] [AlignLeft,AlignLeft] [0.0,0.0] + [[Plain [Str "Orange"]] + ,[Plain [Str "Apple"]]] + [[[Plain [Str "Bread",LineBreak,LineBreak,Str "and",Space,Str "cheese"]] + ,[Plain [Str "Pie",LineBreak,LineBreak,Strong [Str "apple"],Space,Str "and",Space,Emph [Str "carrot"]]]]] +,Table [] [AlignDefault,AlignDefault,AlignDefault] [0.0,0.0,0.0] + [[] + ,[] + ,[]] + [[[Plain [Str "Orange"]] + ,[Plain [Str "Apple"]] + ,[Plain [Str "more"]]] + ,[[Plain [Str "Bread"]] + ,[Plain [Str "Pie"]] + ,[Plain [Str "more"]]] + ,[[Plain [Str "Butter"]] + ,[Plain [Str "Ice",Space,Str "cream"]] + ,[Plain [Str "and",Space,Str "more"]]]] +,Header 1 ("macros",[],[]) [Str "macros"] +,Para [Span ("",["twiki-macro","TEST"],[]) []] +,Para [Span ("",["twiki-macro","TEST"],[]) [Str ""]] +,Para [Span ("",["twiki-macro","TEST"],[]) [Str "content with spaces"]] +,Para [Span ("",["twiki-macro","TEST"],[]) [Str "content with spaces"]] +,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test")]) [Str "content with spaces"]] +,Para [Span ("",["twiki-macro","TEST"],[]) [Str "content with spaces ARG1=test"]] +,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test")]) [Str "content with spaces"]] +,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test"),("ARG2","test2")]) [Str ""]] +,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test"),("ARG2","test2")]) [Str ""]] +,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test"),("ARG2","test2")]) [Str "multiline\ndoes also work"]]] diff --git a/tests/twiki-reader.twiki b/tests/twiki-reader.twiki new file mode 100644 index 000000000..51828ef80 --- /dev/null +++ b/tests/twiki-reader.twiki @@ -0,0 +1,221 @@ +---+ header + +---++ header level two + +---+++ header level 3 + +---++++ header _level_ four + +---+++++ header level 5 + +---++++++ header level 6 + +---+++++++ not a header + + --++ not a header + +---+ emph and strong + +_emph_ *strong* + +__strong and emph__ + +*<i>emph inside</i> strong* + +*strong with <i>emph</i>* + +_<b>strong inside</b> emph_ + +---+ horizontal rule + +top +--- +bottom + +--- + +---+ nop + +<nop>_not emph_ + +---+ entities + +hi & low + +hi & low + +Gödel + +̉પ + +---+ comments + +inline <!-- secret --> comment + +<!-- secret --> + +between blocks + + <!-- secret --> + +---+ linebreaks + +hi%BR%there + +hi%BR% +there + +---+ inline code + +<code>*→*</code> =typed= <code class="haskell">>>=</code> + +---+ code blocks + +<verbatim> +case xs of + (_:_) -> reverse xs + [] -> ['*'] +</verbatim> + +<verbatim class="haskell"> +case xs of + (_:_) -> reverse xs + [] -> ['*'] +</verbatim> + +---+ block quotes + +Regular paragraph +<blockquote> +This is a block quote. + +With two paragraphs. +</blockquote> +Nother paragraph. + +---+ external links + +[[http://google.com][<i>Google</i> search engine]] + +http://johnmacfarlane.net/pandoc/ + +[[http://google.com]] [[http://yahoo.com]] + +[[mailto:info@example.org][email me]] + +!http://google.com + +<nop>http://google.com + +<noautolink> +http://google.com +</noautolink> + +!info@example.org + +<nop>info@example.org + +<noautolink> +info@example.org +</noautolink> + +---+ lists + + * Start each line + * with an asterisk (*). + * More asterisks gives deeper + * and deeper levels. + * Line breaks%BR%don't break levels. + * Continuations + are also possible + * and do not break the list flow + * Level one +Any other start ends the list. + + 1. Start each line + 1. with a number (1.). + 1. More number signs gives deeper + 1. and deeper + 1. levels. + 1. Line breaks%BR%don't break levels. + 1. Blank lines + + 1. end the list and start another. +Any other start also +ends the list. + + $ item 1: definition 1 + $ item 2: definition 2-1 + definition 2-2 + $ item _3_: definition _3_ + + 1. one + 1. two + * two point one + * two point two + 1. three + $ three item one: three def one + 1. four + $ four def one: this + is a continuation + 1. five + 1. five sub 1 + 1. five sub 1 sub 1 + 1. five sub 2 + + 1. other + I. list + I. styles + 1. are + i. also + i. possible + 1. all + a. the + a. different + a. styles + 1. are + A. implemented + A. and + A. supported + +---+ tables + +|Orange|Apple| +|Bread|Pie| +|Butter|Ice cream| + +|*Orange*|*Apple*| +|Bread|Pie| +|*Butter*|Ice cream| + +|*Orange*|*Apple*| +|Bread%BR%%BR%and cheese|Pie%BR%%BR%*apple* and <i>carrot</i>| + +| Orange | Apple | more | +| Bread | Pie | more | +| Butter | Ice cream | and more | + +---+ macros + +%TEST% + +%TEST{}% + +%TEST{content with spaces}% + +%TEST{"content with spaces"}% + +%TEST{"content with spaces" ARG1="test"}% + +%TEST{content with spaces ARG1=test}% + +%TEST{ARG1=test content with spaces}% + +%TEST{ARG1=test ARG2=test2}% + +%TEST{ARG1="test" ARG2="test2"}% + +%TEST{ARG1="test" +ARG2="test2" +multiline +does also work}% diff --git a/tests/writer.dokuwiki b/tests/writer.dokuwiki index 704e79b87..dc14e9b00 100644 --- a/tests/writer.dokuwiki +++ b/tests/writer.dokuwiki @@ -268,7 +268,8 @@ Multiple blocks with italics: <HTML><dt></HTML>//orange//<HTML></dt></HTML> <HTML><dd></HTML><HTML><p></HTML>orange fruit<HTML></p></HTML> <code>{ orange code block }</code> -> <HTML><p></HTML>orange block quote<HTML></p></HTML><HTML></dd></HTML><HTML></dl></HTML> +> <HTML><p></HTML>orange block quote<HTML></p></HTML> +<HTML></dd></HTML><HTML></dl></HTML> Multiple definitions, tight: @@ -611,7 +612,7 @@ If you want, you can indent every line, but you can also be lazy and just indent )) > Notes can go in quotes.((In quote. -)) +> )) - And in list items.((In list.)) diff --git a/tests/writer.opml b/tests/writer.opml index 54be4b671..840c3c6e1 100644 --- a/tests/writer.opml +++ b/tests/writer.opml @@ -18,7 +18,7 @@ </outline> <outline text="Level 1"> <outline text="Level 2 with <em>emphasis</em>"> - <outline text="Level 3" _note="with no blank line "> + <outline text="Level 3" _note="with no blank line"> </outline> </outline> <outline text="Level 2" _note="with no blank line ------------------------------------------------------------------------"> @@ -55,18 +55,18 @@ <outline text="Special Characters" _note="Here is some unicode: - I hat: Î - o umlaut: ö - section: § - set membership: ∈ - copyright: © AT&T has an ampersand in their name. AT&T is another way to write it. This & that. 4 \< 5. 6 \> 5. Backslash: \\ Backtick: \` Asterisk: \* Underscore: \_ Left brace: { Right brace: } Left bracket: [ Right bracket: ] Left paren: ( Right paren: ) Greater-than: \> Hash: \# Period: . Bang: ! Plus: + Minus: - ------------------------------------------------------------------------"> </outline> <outline text="Links"> - <outline text="Explicit" _note="Just a [URL](/url/). [URL and title](/url/ "title"). [URL and title](/url/ "title preceded by two spaces"). [URL and title](/url/ "title preceded by a tab"). [URL and title](/url/ "title with "quotes" in it") [URL and title](/url/ "title with single quotes") [with\_underscore](/url/with_underscore) [Email link](mailto:nobody@nowhere.net) [Empty](). "> + <outline text="Explicit" _note="Just a [URL](/url/). [URL and title](/url/ "title"). [URL and title](/url/ "title preceded by two spaces"). [URL and title](/url/ "title preceded by a tab"). [URL and title](/url/ "title with "quotes" in it") [URL and title](/url/ "title with single quotes") [with\_underscore](/url/with_underscore) [Email link](mailto:nobody@nowhere.net) [Empty]()."> </outline> - <outline text="Reference" _note="Foo [bar](/url/). Foo [bar](/url/). Foo [bar](/url/). With [embedded [brackets]](/url/). [b](/url/) by itself should be a link. Indented [once](/url). Indented [twice](/url). Indented [thrice](/url). This should [not][] be a link. [not]: /url Foo [bar](/url/ "Title with "quotes" inside"). Foo [biz](/url/ "Title with "quote" inside"). "> + <outline text="Reference" _note="Foo [bar](/url/). Foo [bar](/url/). Foo [bar](/url/). With [embedded [brackets]](/url/). [b](/url/) by itself should be a link. Indented [once](/url). Indented [twice](/url). Indented [thrice](/url). This should [not][] be a link. [not]: /url Foo [bar](/url/ "Title with "quotes" inside"). Foo [biz](/url/ "Title with "quote" inside")."> </outline> - <outline text="With ampersands" _note="Here’s a [link with an ampersand in the URL](http://example.com/?foo=1&bar=2). Here’s a link with an amersand in the link text: [AT&T](http://att.com/ "AT&T"). Here’s an [inline link](/script?foo=1&bar=2). Here’s an [inline link in pointy braces](/script?foo=1&bar=2). "> + <outline text="With ampersands" _note="Here’s a [link with an ampersand in the URL](http://example.com/?foo=1&bar=2). Here’s a link with an amersand in the link text: [AT&T](http://att.com/ "AT&T"). Here’s an [inline link](/script?foo=1&bar=2). Here’s an [inline link in pointy braces](/script?foo=1&bar=2)."> </outline> <outline text="Autolinks" _note="With an ampersand: <http://example.com/?foo=1&bar=2> - In a list? - <http://example.com/> - It should. An e-mail address: <nobody@nowhere.net> > Blockquoted: <http://example.com/> Auto-links should not occur here: `<http://example.com/>` or here: <http://example.com/> ------------------------------------------------------------------------"> </outline> </outline> <outline text="Images" _note="From “Voyage dans la Lune” by Georges Melies (1902): ![lalune](lalune.jpg "Voyage dans la Lune") Here is a movie ![movie](movie.jpg) icon. ------------------------------------------------------------------------"> </outline> -<outline text="Footnotes" _note="Here is a footnote reference,[^1] and another.[^2] This should *not* be a footnote reference, because it contains a space.[\^my note] Here is an inline note.[^3] > Notes can go in quotes.[^4] 1. And in list items.[^5] This paragraph should not be part of the note, as it is not indented. [^1]: Here is the footnote. It can go anywhere after the footnote reference. It need not be placed at the end of the document. [^2]: Here’s the long note. This one contains multiple blocks. Subsequent blocks are indented to show that they belong to the footnote (as with list items). { <code> } If you want, you can indent every line, but you can also be lazy and just indent the first line of each block. [^3]: This is *easier* to type. Inline notes may contain [links](http://google.com) and `]` verbatim characters, as well as [bracketed text]. [^4]: In quote. [^5]: In list. "> +<outline text="Footnotes" _note="Here is a footnote reference,[^1] and another.[^2] This should *not* be a footnote reference, because it contains a space.[\^my note] Here is an inline note.[^3] > Notes can go in quotes.[^4] 1. And in list items.[^5] This paragraph should not be part of the note, as it is not indented. [^1]: Here is the footnote. It can go anywhere after the footnote reference. It need not be placed at the end of the document. [^2]: Here’s the long note. This one contains multiple blocks. Subsequent blocks are indented to show that they belong to the footnote (as with list items). { <code> } If you want, you can indent every line, but you can also be lazy and just indent the first line of each block. [^3]: This is *easier* to type. Inline notes may contain [links](http://google.com) and `]` verbatim characters, as well as [bracketed text]. [^4]: In quote. [^5]: In list."> </outline> </body> </opml> diff --git a/tests/writer.rst b/tests/writer.rst index 1a998d2ae..f09871a34 100644 --- a/tests/writer.rst +++ b/tests/writer.rst @@ -839,6 +839,7 @@ From “Voyage dans la Lune” by Georges Melies (1902): :alt: Voyage dans la Lune lalune + Here is a movie |movie| icon. -------------- |