diff options
59 files changed, 1888 insertions, 384 deletions
@@ -44,12 +44,7 @@ Quick install [Not sure where `$CABALDIR` is?](http://www.haskell.org/haskellwiki/Cabal-Install#The_cabal-install_configuration_file) -5. Make sure the `$CABALDIR/share/man/man1` directory is in your `MANPATH`. - You should now be able to access the `pandoc` man page: - - man pandoc - -6. If you want to process citations with pandoc, you will also need to +5. If you want to process citations with pandoc, you will also need to install a separate package, `pandoc-citeproc`. This can be installed using cabal: @@ -71,6 +66,20 @@ Quick install --extra-include-dirs=/usr/local/Cellar/icu4c/51.1/include \ -funicode_collation text-icu pandoc-citeproc +The cabal installation procedure does not generate man pages. +To build the `pandoc` man pages, build pandoc with the +`make-pandoc-man-pages` flag, and then use the command +`make-pandoc-man-pages` from the pandoc source directory. +This will create the man pages in `man/man1` and `man/man5`. + +To build the `pandoc-citeproc` man pages, go to the pandoc-citeproc +build directory, and + + cd man + make + +The man page will be created in the `man1` subdirectory. + [GHC]: http://www.haskell.org/ghc/ [Haskell platform]: http://hackage.haskell.org/platform/ [cabal-install]: http://hackage.haskell.org/trac/hackage/wiki/CabalInstall @@ -13,15 +13,16 @@ Description Pandoc is a [Haskell] library for converting from one markup format to another, and a command-line tool that uses this library. It can read [markdown] and (subsets of) [Textile], [reStructuredText], [HTML], -[LaTeX], [MediaWiki markup], [Haddock markup], [OPML], [Emacs -Org-mode], [DocBook], [txt2tags], [EPUB] and [Word docx]; and it can write plain text, -[markdown], [reStructuredText], [XHTML], [HTML 5], [LaTeX] (including -[beamer] slide shows), [ConTeXt], [RTF], [OPML], [DocBook], -[OpenDocument], [ODT], [Word docx], [GNU Texinfo], [MediaWiki markup], -[DokuWiki markup], [Haddock markup], [EPUB] (v2 or v3), [FictionBook2], -[Textile], [groff man] pages, [Emacs Org-Mode], [AsciiDoc], [InDesign ICML], -and [Slidy], [Slideous], [DZSlides], [reveal.js] or [S5] HTML slide shows. -It can also produce [PDF] output on systems where LaTeX is installed. +[LaTeX], [MediaWiki markup], [TWiki markup], [Haddock markup], [OPML], +[Emacs Org-mode], [DocBook], [txt2tags], [EPUB] and [Word docx]; and +it can write plain text, [markdown], [reStructuredText], [XHTML], +[HTML 5], [LaTeX] (including [beamer] slide shows), [ConTeXt], [RTF], +[OPML], [DocBook], [OpenDocument], [ODT], [Word docx], [GNU Texinfo], +[MediaWiki markup], [DokuWiki markup], [Haddock markup], [EPUB] (v2 or v3), +[FictionBook2], [Textile], [groff man] pages, [Emacs Org-Mode], [AsciiDoc], +[InDesign ICML], and [Slidy], [Slideous], [DZSlides], [reveal.js] or +[S5] HTML slide shows. It can also produce [PDF] output on systems where +LaTeX is installed. Pandoc's enhanced version of markdown includes syntax for footnotes, tables, flexible ordered lists, definition lists, fenced code blocks, @@ -50,6 +51,15 @@ default (though output to *stdout* is disabled for the `odt`, `docx`, pandoc -o output.html input.txt +By default, pandoc produces a document fragment, not a standalone +document with a proper header and footer. To produce a standalone +document, use the `-s` or `--standalone` flag: + + pandoc -s -o output.html input.txt + +For more information on how standalone documents are produced, see +[Templates](#templates), below. + Instead of a file, an absolute URI may be given. In this case pandoc will fetch the content using HTTP: @@ -95,6 +105,11 @@ should pipe input and output through `iconv`: iconv -t utf-8 input.txt | pandoc | iconv -f utf-8 +Note that in some output formats (such as HTML, LaTeX, ConTeXt, +RTF, OPML, DocBook, and Texinfo), information about +the character encoding is included in the document header, which +will only be included if you use the `-s/--standalone` option. + Creating a PDF -------------- @@ -147,8 +162,8 @@ General options `textile` (Textile), `rst` (reStructuredText), `html` (HTML), `docbook` (DocBook), `t2t` (txt2tags), `docx` (docx), `epub` (EPUB), `opml` (OPML), `org` (Emacs Org-mode), `mediawiki` (MediaWiki markup), - `haddock` (Haddock markup), or `latex` (LaTeX). If `+lhs` is appended - to `markdown`, `rst`, + `twiki` (TWiki markup), `haddock` (Haddock markup), or `latex` (LaTeX). + If `+lhs` is appended to `markdown`, `rst`, `latex`, or `html`, the input will be treated as literate Haskell source: see [Literate Haskell support](#literate-haskell-support), below. Markdown syntax extensions can be individually enabled or @@ -239,8 +254,8 @@ Reader options to curly quotes, `---` to em-dashes, `--` to en-dashes, and `...` to ellipses. Nonbreaking spaces are inserted after certain abbreviations, such as "Mr." (Note: This option is significant only when - the input format is `markdown`, `markdown_strict`, or `textile`. It - is selected automatically when the input format is `textile` or the + the input format is `markdown`, `markdown_strict`, `textile` or `twiki`. + It is selected automatically when the input format is `textile` or the output format is `latex` or `context`, unless `--no-tex-ligatures` is used.) @@ -1938,6 +1953,11 @@ A document may contain multiple metadata blocks. The metadata fields will be combined through a *left-biased union*: if two metadata blocks attempt to set the same field, the value from the first block will be taken. +When pandoc is used with `-t markdown` to create a markdown document, +a YAML metadata block will be produced only if the `-s/--standalone` +option is used. All of the metadata will appear in a single block +at the beginning of the document. + Note that YAML escaping rules must be followed. Thus, for example, if a title contains a colon, it must be quoted. The pipe character (`|`) can be used to begin an indented block that will be interpreted @@ -2365,6 +2385,10 @@ be followed by a link title, in quotes.) There can be no space between the bracketed part and the parenthesized part. The link text can contain formatting (such as emphasis), but the title cannot. +Email addresses in inline links are not autodetected, so they have to be +prefixed with `mailto`: + + [Write me!](mailto:sam@green.eggs.ham) ### Reference links ### @@ -2376,7 +2400,9 @@ The link consists of link text in square brackets, followed by a label in square brackets. (There can be space between the two.) The link definition consists of the bracketed label, followed by a colon and a space, followed by the URL, and optionally (after a space) a link title either in quotes or in -parentheses. +parentheses. The label must not be parseable as a citation (assuming +the `citations` extension is enabled): citations take precedence over +link labels. Here are some examples: @@ -3168,6 +3194,7 @@ Rosenthal. [Textile]: http://redcloth.org/textile [MediaWiki markup]: http://www.mediawiki.org/wiki/Help:Formatting [DokuWiki markup]: https://www.dokuwiki.org/dokuwiki +[TWiki markup]: http://twiki.org/cgi-bin/view/TWiki/TextFormattingRules [Haddock markup]: http://www.haskell.org/haddock/doc/html/ch03s08.html [groff man]: http://developer.apple.com/DOCUMENTATION/Darwin/Reference/ManPages/man7/groff_man.7.html [Haskell]: http://www.haskell.org/ diff --git a/data/templates b/data/templates -Subproject ec057f0ad3d191d18a2842e6a2fd41c471c6d97 +Subproject c76c6c52249118e49c5dd1b99fe916724350a59 diff --git a/pandoc.cabal b/pandoc.cabal index b2f82c04b..6abc8b0bc 100644 --- a/pandoc.cabal +++ b/pandoc.cabal @@ -1,5 +1,5 @@ Name: pandoc -Version: 1.13.1 +Version: 1.13.2 Cabal-Version: >= 1.10 Build-Type: Custom License: GPL @@ -16,14 +16,15 @@ Synopsis: Conversion between markup formats Description: Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library. It can read markdown and (subsets of) HTML, - reStructuredText, LaTeX, DocBook, MediaWiki markup, Haddock - markup, OPML, Emacs Org-Mode, txt2tags and Textile, and it can write - markdown, reStructuredText, HTML, LaTeX, ConTeXt, Docbook, - OPML, OpenDocument, ODT, Word docx, RTF, MediaWiki, DokuWiki, - Textile, groff man pages, plain text, Emacs Org-Mode, AsciiDoc, - Haddock markup, EPUB (v2 and v3), FictionBook2, - InDesign ICML, and several kinds of HTML/javascript - slide shows (S5, Slidy, Slideous, DZSlides, reveal.js). + reStructuredText, LaTeX, DocBook, MediaWiki markup, TWiki + markup, Haddock markup, OPML, Emacs Org-Mode, txt2tags and + Textile, and it can write markdown, reStructuredText, XHTML, + HTML 5, LaTeX, ConTeXt, DocBook, OPML, OpenDocument, ODT, + Word docx, RTF, MediaWiki, DokuWiki, Textile, groff man + pages, plain text, Emacs Org-Mode, AsciiDoc, Haddock markup, + EPUB (v2 and v3), FictionBook2, InDesign ICML, and several + kinds of HTML/javascript slide shows (S5, Slidy, Slideous, + DZSlides, reveal.js). . Pandoc extends standard markdown syntax with footnotes, embedded LaTeX, definition lists, tables, and other @@ -169,6 +170,8 @@ Extra-Source-Files: tests/lhs-test.html+lhs tests/lhs-test.fragment.html+lhs tests/pipe-tables.txt + tests/dokuwiki_external_images.dokuwiki + tests/dokuwiki_external_images.native tests/dokuwiki_multiblock_table.dokuwiki tests/dokuwiki_multiblock_table.native tests/fb2/*.markdown @@ -182,6 +185,7 @@ Extra-Source-Files: tests/epub/*.epub tests/epub/*.native tests/txt2tags.t2t + tests/twiki-reader.twiki Source-repository head type: git @@ -247,7 +251,7 @@ Library haddock-library >= 1.1 && < 1.2, old-time, deepseq-generics >= 0.1 && < 0.2, - JuicyPixels >= 3.1.6.1 && < 3.2 + JuicyPixels >= 3.1.6.1 && < 3.3 if flag(network-uri) Build-Depends: network-uri >= 2.6 && < 2.7, network >= 2.6 else @@ -289,6 +293,7 @@ Library Text.Pandoc.Readers.Textile, Text.Pandoc.Readers.Native, Text.Pandoc.Readers.Haddock, + Text.Pandoc.Readers.TWiki, Text.Pandoc.Readers.Docx, Text.Pandoc.Readers.EPUB, Text.Pandoc.Writers.Native, @@ -444,7 +449,7 @@ Test-Suite test-pandoc Tests.Writers.Plain Tests.Writers.AsciiDoc Tests.Writers.LaTeX - Ghc-Options: -rtsopts -Wall -fno-warn-unused-do-bind + Ghc-Options: -rtsopts -Wall -fno-warn-unused-do-bind -threaded Default-Language: Haskell98 benchmark benchmark-pandoc @@ -930,6 +930,9 @@ defaultReaderName fallback (x:xs) = ".docx" -> "docx" ".t2t" -> "t2t" ".epub" -> "epub" + ".odt" -> "odt" -- so we get an "unknown reader" error + ".pdf" -> "pdf" -- so we get an "unknown reader" error + ".doc" -> "doc" -- so we get an "unknown reader" error _ -> defaultReaderName fallback xs -- Returns True if extension of first source is .lhs @@ -970,6 +973,7 @@ defaultWriterName x = ".pdf" -> "latex" ".fb2" -> "fb2" ".opml" -> "opml" + ".icml" -> "icml" ['.',y] | y `elem` ['1'..'9'] -> "man" _ -> "html" @@ -1148,7 +1152,15 @@ main = do (getT2TMeta sources outputFile) else case getReader readerName' of Right r -> return r - Left e -> err 7 e + Left e -> err 7 e' + where e' = case readerName' of + "odt" -> e ++ + "\nPandoc can convert to ODT, but not from ODT.\nTry using LibreOffice to export as HTML, and convert that with pandoc." + "pdf" -> e ++ + "\nPandoc can convert to PDF, but not from PDF." + "doc" -> e ++ + "\nPandoc can convert from DOCX, but not from DOC.\nTry using Word to save your DOC file as DOCX, and convert that with pandoc." + _ -> e let standalone' = standalone || not (isTextFormat writerName') || pdfOutput diff --git a/src/Text/Pandoc.hs b/src/Text/Pandoc.hs index fd849316b..d2bb85699 100644 --- a/src/Text/Pandoc.hs +++ b/src/Text/Pandoc.hs @@ -77,6 +77,7 @@ module Text.Pandoc , readHaddock , readNative , readJSON + , readTWiki , readTxt2Tags , readTxt2TagsNoMacros , readEPUB @@ -133,6 +134,7 @@ import Text.Pandoc.Readers.HTML import Text.Pandoc.Readers.Textile import Text.Pandoc.Readers.Native import Text.Pandoc.Readers.Haddock +import Text.Pandoc.Readers.TWiki import Text.Pandoc.Readers.Docx import Text.Pandoc.Readers.Txt2Tags import Text.Pandoc.Readers.EPUB @@ -199,19 +201,18 @@ parseFormatSpec = parse formatSpec "" '-' -> Set.delete ext _ -> Set.insert ext --- auxiliary function for readers: -markdown :: ReaderOptions -> String -> IO Pandoc -markdown o s = do - let (doc, warnings) = readMarkdownWithWarnings o s - mapM_ warn warnings - return doc - data Reader = StringReader (ReaderOptions -> String -> IO Pandoc) | ByteStringReader (ReaderOptions -> BL.ByteString -> IO (Pandoc, MediaBag)) mkStringReader :: (ReaderOptions -> String -> Pandoc) -> Reader mkStringReader r = StringReader (\o s -> return $ r o s) +mkStringReaderWithWarnings :: (ReaderOptions -> String -> (Pandoc, [String])) -> Reader +mkStringReaderWithWarnings r = StringReader $ \o s -> do + let (doc, warnings) = r o s + mapM_ warn warnings + return doc + mkBSReader :: (ReaderOptions -> BL.ByteString -> (Pandoc, MediaBag)) -> Reader mkBSReader r = ByteStringReader (\o s -> return $ r o s) @@ -219,12 +220,12 @@ mkBSReader r = ByteStringReader (\o s -> return $ r o s) readers :: [(String, Reader)] readers = [ ("native" , StringReader $ \_ s -> return $ readNative s) ,("json" , mkStringReader readJSON ) - ,("markdown" , StringReader markdown) - ,("markdown_strict" , StringReader markdown) - ,("markdown_phpextra" , StringReader markdown) - ,("markdown_github" , StringReader markdown) - ,("markdown_mmd", StringReader markdown) - ,("rst" , mkStringReader readRST ) + ,("markdown" , mkStringReaderWithWarnings readMarkdownWithWarnings) + ,("markdown_strict" , mkStringReaderWithWarnings readMarkdownWithWarnings) + ,("markdown_phpextra" , mkStringReaderWithWarnings readMarkdownWithWarnings) + ,("markdown_github" , mkStringReaderWithWarnings readMarkdownWithWarnings) + ,("markdown_mmd", mkStringReaderWithWarnings readMarkdownWithWarnings) + ,("rst" , mkStringReaderWithWarnings readRSTWithWarnings ) ,("mediawiki" , mkStringReader readMediaWiki) ,("docbook" , mkStringReader readDocBook) ,("opml" , mkStringReader readOPML) @@ -233,6 +234,7 @@ readers = [ ("native" , StringReader $ \_ s -> return $ readNative s) ,("html" , mkStringReader readHtml) ,("latex" , mkStringReader readLaTeX) ,("haddock" , mkStringReader readHaddock) + ,("twiki" , mkStringReader readTWiki) ,("docx" , mkBSReader readDocx) ,("t2t" , mkStringReader readTxt2TagsNoMacros) ,("epub" , mkBSReader readEPUB) diff --git a/src/Text/Pandoc/MIME.hs b/src/Text/Pandoc/MIME.hs index 3b3b3b5b3..2fdba93e0 100644 --- a/src/Text/Pandoc/MIME.hs +++ b/src/Text/Pandoc/MIME.hs @@ -328,7 +328,7 @@ mimeTypesList = -- List borrowed from happstack-server. ,("oth","application/vnd.oasis.opendocument.text-web") ,("otp","application/vnd.oasis.opendocument.presentation-template") ,("ots","application/vnd.oasis.opendocument.spreadsheet-template") - ,("otf","application/x-font-opentype") + ,("otf","application/vnd.ms-opentype") ,("ott","application/vnd.oasis.opendocument.text-template") ,("oza","application/x-oz-application") ,("p","text/x-pascal") @@ -477,6 +477,7 @@ mimeTypesList = -- List borrowed from happstack-server. ,("vrml","model/vrml") ,("vs","text/plain") ,("vsd","application/vnd.visio") + ,("vtt","text/vtt") ,("wad","application/x-doom") ,("wav","audio/x-wav") ,("wax","audio/x-ms-wax") diff --git a/src/Text/Pandoc/Parsing.hs b/src/Text/Pandoc/Parsing.hs index d1fba1e21..28ea2bd2f 100644 --- a/src/Text/Pandoc/Parsing.hs +++ b/src/Text/Pandoc/Parsing.hs @@ -65,6 +65,7 @@ module Text.Pandoc.Parsing ( anyLine, widthsFromIndices, gridTableWith, readWith, + readWithWarnings, readWithM, testStringWith, guardEnabled, @@ -162,6 +163,7 @@ module Text.Pandoc.Parsing ( anyLine, setSourceColumn, setSourceLine, newPos, + addWarning ) where @@ -312,12 +314,14 @@ stringAnyCase (x:xs) = do return (firstChar:rest) -- | Parse contents of 'str' using 'parser' and return result. -parseFromString :: Stream s m t => ParserT s st m a -> s -> ParserT s st m a +parseFromString :: Monad m => ParserT String st m a -> String -> ParserT String st m a parseFromString parser str = do oldPos <- getPosition oldInput <- getInput setInput str result <- parser + spaces + eof setInput oldInput setPosition oldPos return result @@ -452,7 +456,7 @@ uri = try $ do let percentEscaped = try $ char '%' >> skipMany1 (satisfy isHexDigit) let entity = () <$ characterReference let punct = skipMany1 (char ',') - <|> () <$ (satisfy (\c -> not (isSpace c) && c /= '<')) + <|> () <$ (satisfy (\c -> not (isSpace c) && c /= '<' && c /= '>')) let uriChunk = skipMany1 wordChar <|> percentEscaped <|> entity @@ -472,7 +476,12 @@ mathInlineWith op cl = try $ do string op notFollowedBy space words' <- many1Till (count 1 (noneOf " \t\n\\") - <|> (char '\\' >> anyChar >>= \c -> return ['\\',c]) + <|> (char '\\' >> + -- This next clause is needed because \text{..} can + -- contain $, \(\), etc. + (try (string "text" >> + (("\\text" ++) <$> inBalancedBraces 0 "")) + <|> (\c -> ['\\',c]) <$> anyChar)) <|> do (blankline <* notFollowedBy' blankline) <|> (oneOf " \t" <* skipMany (oneOf " \t")) notFollowedBy (char '$') @@ -480,6 +489,23 @@ mathInlineWith op cl = try $ do ) (try $ string cl) notFollowedBy digit -- to prevent capture of $5 return $ concat words' + where + inBalancedBraces :: Stream s m Char => Int -> String -> ParserT s st m String + inBalancedBraces 0 "" = do + c <- anyChar + if c == '{' + then inBalancedBraces 1 "{" + else mzero + inBalancedBraces 0 s = return $ reverse s + inBalancedBraces numOpen ('\\':xs) = do + c <- anyChar + inBalancedBraces numOpen (c:'\\':xs) + inBalancedBraces numOpen xs = do + c <- anyChar + case c of + '}' -> inBalancedBraces (numOpen - 1) (c:xs) + '{' -> inBalancedBraces (numOpen + 1) (c:xs) + _ -> inBalancedBraces numOpen (c:xs) mathDisplayWith :: Stream s m Char => String -> String -> ParserT s st m String mathDisplayWith op cl = try $ do @@ -858,6 +884,15 @@ readWith :: Parser [Char] st a -> a readWith p t inp = runIdentity $ readWithM p t inp +readWithWarnings :: Parser [Char] ParserState a + -> ParserState + -> String + -> (a, [String]) +readWithWarnings p = readWith $ do + doc <- p + warnings <- stateWarnings <$> getState + return (doc, warnings) + -- | Parse a string with @parser@ (for testing). testStringWith :: (Show a, Stream [Char] Identity Char) => ParserT [Char] ParserState Identity a @@ -888,10 +923,9 @@ data ParserState = ParserState stateHasChapters :: Bool, -- ^ True if \chapter encountered stateMacros :: [Macro], -- ^ List of macros defined so far stateRstDefaultRole :: String, -- ^ Current rST default interpreted text role - stateRstCustomRoles :: M.Map String (String, Maybe String, Attr -> (String, Attr)), -- ^ Current rST custom text roles + stateRstCustomRoles :: M.Map String (String, Maybe String, Attr), -- ^ Current rST custom text roles -- Triple represents: 1) Base role, 2) Optional format (only for :raw: - -- roles), 3) Source language annotation for code (could be used to - -- annotate role classes too). + -- roles), 3) Additional classes (rest of Attr is unused)). stateCaption :: Maybe Inlines, -- ^ Caption in current environment stateInHtmlBlock :: Maybe String, -- ^ Tag type of HTML block being parsed stateMarkdownAttribute :: Bool, -- ^ True if in markdown=1 context @@ -1223,3 +1257,10 @@ applyMacros' target = do then do macros <- extractMacros <$> getState return $ applyMacros macros target else return target + +-- | Append a warning to the log. +addWarning :: Maybe SourcePos -> String -> Parser [Char] ParserState () +addWarning mbpos msg = + updateState $ \st -> st{ + stateWarnings = (msg ++ maybe "" (\pos -> " " ++ show pos) mbpos) : + stateWarnings st } diff --git a/src/Text/Pandoc/Pretty.hs b/src/Text/Pandoc/Pretty.hs index 1e72c2040..2f2656086 100644 --- a/src/Text/Pandoc/Pretty.hs +++ b/src/Text/Pandoc/Pretty.hs @@ -286,6 +286,9 @@ renderList (BlankLines num : xs) = do | otherwise -> replicateM_ (1 + num - newlines st) (outp (-1) "\n") renderList xs +renderList (CarriageReturn : BlankLines m : xs) = + renderList (BlankLines m : xs) + renderList (CarriageReturn : xs) = do st <- get if newlines st > 0 || null xs @@ -531,4 +534,4 @@ charWidth c = -- | Get real length of string, taking into account combining and double-wide -- characters. realLength :: String -> Int -realLength = sum . map charWidth +realLength = foldr (\a b -> charWidth a + b) 0 diff --git a/src/Text/Pandoc/Readers/DocBook.hs b/src/Text/Pandoc/Readers/DocBook.hs index 59ff3e717..663960a87 100644 --- a/src/Text/Pandoc/Readers/DocBook.hs +++ b/src/Text/Pandoc/Readers/DocBook.hs @@ -70,8 +70,8 @@ List of all DocBook tags, with [x] indicating implemented, [x] book - A book [x] bookinfo - Meta-information for a Book [x] bridgehead - A free-floating heading -[ ] callout - A “called out” description of a marked Area -[ ] calloutlist - A list of Callouts +[x] callout - A “called out” description of a marked Area +[x] calloutlist - A list of Callouts [x] caption - A caption [x] caution - A note of caution [x] chapter - A chapter, as of a book @@ -81,7 +81,7 @@ List of all DocBook tags, with [x] indicating implemented, [ ] citerefentry - A citation to a reference page [ ] citetitle - The title of a cited work [ ] city - The name of a city in an address -[ ] classname - The name of a class, in the object-oriented programming sense +[x] classname - The name of a class, in the object-oriented programming sense [ ] classsynopsis - The syntax summary for a class definition [ ] classsynopsisinfo - Information supplementing the contents of a ClassSynopsis @@ -169,9 +169,9 @@ List of all DocBook tags, with [x] indicating implemented, [ ] guibutton - The text on a button in a GUI [ ] guiicon - Graphic and/or text appearing as a icon in a GUI [ ] guilabel - The text of a label in a GUI -[ ] guimenu - The name of a menu in a GUI -[ ] guimenuitem - The name of a terminal menu item in a GUI -[ ] guisubmenu - The name of a submenu in a GUI +[x] guimenu - The name of a menu in a GUI +[x] guimenuitem - The name of a terminal menu item in a GUI +[x] guisubmenu - The name of a submenu in a GUI [ ] hardware - A physical part of a computer system [ ] highlights - A summary of the main points of the discussed component [ ] holder - The name of the individual or organization that holds a copyright @@ -206,10 +206,10 @@ List of all DocBook tags, with [x] indicating implemented, other dingbat [ ] itermset - A set of index terms in the meta-information of a document [ ] jobtitle - The title of an individual in an organization -[ ] keycap - The text printed on a key on a keyboard +[x] keycap - The text printed on a key on a keyboard [ ] keycode - The internal, frequently numeric, identifier for a key on a keyboard -[ ] keycombo - A combination of input actions +[x] keycombo - A combination of input actions [ ] keysym - The symbolic name of a key on a keyboard [ ] keyword - One of a set of keywords describing the content of a document [ ] keywordset - A set of keywords describing the content of a document @@ -237,7 +237,7 @@ List of all DocBook tags, with [x] indicating implemented, [x] mediaobject - A displayed media object (video, audio, image, etc.) [ ] mediaobjectco - A media object that contains callouts [x] member - An element of a simple list -[ ] menuchoice - A selection or series of selections from a menu +[x] menuchoice - A selection or series of selections from a menu [ ] methodname - The name of a method [ ] methodparam - Parameters to a method [ ] methodsynopsis - A syntax summary for a method @@ -471,7 +471,7 @@ List of all DocBook tags, with [x] indicating implemented, [ ] token - A unit of information [x] tr - A row in an HTML table [ ] trademark - A trademark -[ ] type - The classification of a value +[x] type - The classification of a value [x] ulink - A link that addresses its target by means of a URL (Uniform Resource Locator) [x] uri - A Uniform Resource Identifier @@ -603,7 +603,7 @@ isBlockElement (Elem e) = qName (elName e) `elem` blocktags "important","caution","note","tip","warning","qandadiv", "question","answer","abstract","itemizedlist","orderedlist", "variablelist","article","book","table","informaltable", - "screen","programlisting","example"] + "screen","programlisting","example","calloutlist"] isBlockElement _ = False -- Trim leading and trailing newline characters @@ -712,6 +712,7 @@ parseBlock (Elem e) = "question" -> addToStart (strong (str "Q:") <> str " ") <$> getBlocks e "answer" -> addToStart (strong (str "A:") <> str " ") <$> getBlocks e "abstract" -> blockQuote <$> getBlocks e + "calloutlist" -> bulletList <$> callouts "itemizedlist" -> bulletList <$> listitems "orderedlist" -> do let listStyle = case attrValue "numeration" e of @@ -772,11 +773,6 @@ parseBlock (Elem e) = x -> [x] return $ codeBlockWith (attrValue "id" e, classes', []) $ trimNl $ strContentRecursive e - strContentRecursive = strContent . (\e' -> e'{ elContent = - map elementToStr $ elContent e' }) - elementToStr :: Content -> Content - elementToStr (Elem e') = Text $ CData CDataText (strContentRecursive e') Nothing - elementToStr x = x parseBlockquote = do attrib <- case filterChild (named "attribution") e of Nothing -> return mempty @@ -785,6 +781,7 @@ parseBlock (Elem e) = contents <- getBlocks e return $ blockQuote (contents <> attrib) listitems = mapM getBlocks $ filterChildren (named "listitem") e + callouts = mapM getBlocks $ filterChildren (named "callout") e deflistitems = mapM parseVarListEntry $ filterChildren (named "varlistentry") e parseVarListEntry e' = do @@ -871,13 +868,22 @@ parseBlock (Elem e) = Nothing -> return mempty modify $ \st -> st{ dbSectionLevel = n } b <- getBlocks e + let ident = attrValue "id" e modify $ \st -> st{ dbSectionLevel = n - 1 } - return $ header n' headerText <> b + return $ headerWith (ident,[],[]) n' headerText <> b metaBlock = acceptingMetadata (getBlocks e) >> return mempty getInlines :: Element -> DB Inlines getInlines e' = (trimInlines . mconcat) <$> (mapM parseInline $ elContent e') +strContentRecursive :: Element -> String +strContentRecursive = strContent . + (\e' -> e'{ elContent = map elementToStr $ elContent e' }) + +elementToStr :: Content -> Content +elementToStr (Elem e') = Text $ CData CDataText (strContentRecursive e') Nothing +elementToStr x = x + parseInline :: Content -> DB Inlines parseInline (Text (CData _ s _)) = return $ text s parseInline (CRef ref) = @@ -901,6 +907,7 @@ parseInline (Elem e) = else doubleQuoted contents "simplelist" -> simpleList "segmentedlist" -> segmentedList + "classname" -> codeWithLang "code" -> codeWithLang "filename" -> codeWithLang "literal" -> codeWithLang @@ -920,6 +927,10 @@ parseInline (Elem e) = "constant" -> codeWithLang "userinput" -> codeWithLang "varargs" -> return $ code "(...)" + "keycap" -> return (str $ strContent e) + "keycombo" -> keycombo <$> (mapM parseInline $ elContent e) + "menuchoice" -> menuchoice <$> (mapM parseInline $ + filter isGuiMenu $ elContent e) "xref" -> return $ str "?" -- so at least you know something is there "email" -> return $ link ("mailto:" ++ strContent e) "" $ str $ strContent e @@ -959,7 +970,7 @@ parseInline (Elem e) = let classes' = case attrValue "language" e of "" -> [] l -> [l] - return $ codeWith (attrValue "id" e,classes',[]) $ strContent e + return $ codeWith (attrValue "id" e,classes',[]) $ strContentRecursive e simpleList = (mconcat . intersperse (str "," <> space)) <$> mapM getInlines (filterChildren (named "member") e) segmentedList = do @@ -974,3 +985,10 @@ parseInline (Elem e) = then mempty else strong tit <> linebreak return $ linebreak <> tit' <> segs + keycombo = spanWith ("",["keycombo"],[]) . + mconcat . intersperse (str "+") + menuchoice = spanWith ("",["menuchoice"],[]) . + mconcat . intersperse (text " > ") + isGuiMenu (Elem x) = named "guimenu" x || named "guisubmenu" x || + named "guimenuitem" x + isGuiMenu _ = False diff --git a/src/Text/Pandoc/Readers/Docx.hs b/src/Text/Pandoc/Readers/Docx.hs index 4b5fbfdfc..64eb0322f 100644 --- a/src/Text/Pandoc/Readers/Docx.hs +++ b/src/Text/Pandoc/Readers/Docx.hs @@ -84,8 +84,7 @@ import Text.Pandoc.Readers.Docx.Lists import Text.Pandoc.Readers.Docx.Reducible import Text.Pandoc.Shared import Text.Pandoc.MediaBag (insertMedia, MediaBag) -import Data.Maybe (isJust) -import Data.List (delete, stripPrefix, (\\), intersect, isPrefixOf) +import Data.List (delete, (\\), intersect) import Data.Monoid import Text.TeXMath (writeTeX) import Data.Default (Default) @@ -197,19 +196,9 @@ fixAuthors mv = mv codeStyles :: [String] codeStyles = ["VerbatimChar"] -blockQuoteDivs :: [String] -blockQuoteDivs = ["Quote", "BlockQuote", "BlockQuotation"] - codeDivs :: [String] codeDivs = ["SourceCode"] - --- For the moment, we have English, Danish, German, and French. This --- is fairly ad-hoc, and there might be a more systematic way to do --- it, but it's better than nothing. -headerPrefixes :: [String] -headerPrefixes = ["Heading", "Overskrift", "berschrift", "Titre"] - runElemToInlines :: RunElem -> Inlines runElemToInlines (TextRun s) = text s runElemToInlines (LnBrk) = linebreak @@ -434,9 +423,9 @@ parStyleToTransform pPr let pPr' = pPr { pStyle = cs, indentation = Nothing} in (divWith ("", [c], [])) . (parStyleToTransform pPr') - | (c:cs) <- pStyle pPr - , c `elem` blockQuoteDivs = - let pPr' = pPr { pStyle = cs \\ blockQuoteDivs } + | (_:cs) <- pStyle pPr + , Just True <- pBlockQuote pPr = + let pPr' = pPr { pStyle = cs } in blockQuote . (parStyleToTransform pPr') | (_:cs) <- pStyle pPr = @@ -467,12 +456,11 @@ bodyPartToBlocks (Paragraph pPr parparts) $ parStyleToTransform pPr $ codeBlock $ concatMap parPartToString parparts - | (c : cs) <- filter (isJust . isHeaderClass) $ pStyle pPr - , Just (prefix, n) <- isHeaderClass c = do + | Just (style, n) <- pHeading pPr = do ils <- local (\s-> s{docxInHeaderBlock=True}) $ (concatReduce <$> mapM parPartToInlines parparts) makeHeaderAnchor $ - headerWith ("", delete (prefix ++ show n) cs, []) n ils + headerWith ("", delete style (pStyle pPr), []) n ils | otherwise = do ils <- concatReduce <$> mapM parPartToInlines parparts >>= (return . fromList . trimLineBreaks . normalizeSpaces . toList) @@ -559,12 +547,3 @@ docxToOutput :: ReaderOptions -> Docx -> (Meta, [Block], MediaBag) docxToOutput opts (Docx (Document _ body)) = let dEnv = def { docxOptions = opts} in evalDocxContext (bodyToOutput body) dEnv def - -isHeaderClass :: String -> Maybe (String, Int) -isHeaderClass s | (pref:_) <- filter (\h -> isPrefixOf h s) headerPrefixes - , Just s' <- stripPrefix pref s = - case reads s' :: [(Int, String)] of - [] -> Nothing - ((n, "") : []) -> Just (pref, n) - _ -> Nothing -isHeaderClass _ = Nothing diff --git a/src/Text/Pandoc/Readers/Docx/Parse.hs b/src/Text/Pandoc/Readers/Docx/Parse.hs index 2945a1eda..5fd6b7a81 100644 --- a/src/Text/Pandoc/Readers/Docx/Parse.hs +++ b/src/Text/Pandoc/Readers/Docx/Parse.hs @@ -1,4 +1,4 @@ -{-# LANGUAGE PatternGuards, ViewPatterns #-} +{-# LANGUAGE PatternGuards, ViewPatterns, FlexibleInstances #-} {- Copyright (C) 2014 Jesse Rosenthal <jrosenthal@jhu.edu> @@ -65,7 +65,7 @@ import Text.Pandoc.Compat.Except import Text.TeXMath.Readers.OMML (readOMML) import Text.Pandoc.Readers.Docx.Fonts (getUnicode, Font(..)) import Text.TeXMath (Exp) -import Data.Char (readLitChar, ord, chr) +import Data.Char (readLitChar, ord, chr, isDigit) data ReaderEnv = ReaderEnv { envNotes :: Notes , envNumbering :: Numbering @@ -73,6 +73,7 @@ data ReaderEnv = ReaderEnv { envNotes :: Notes , envMedia :: Media , envFont :: Maybe Font , envCharStyles :: CharStyleMap + , envParStyles :: ParStyleMap } deriving Show @@ -122,8 +123,12 @@ type Media = [(FilePath, B.ByteString)] type CharStyle = (String, RunStyle) +type ParStyle = (String, ParStyleData) + type CharStyleMap = M.Map String RunStyle +type ParStyleMap = M.Map String ParStyleData + data Numbering = Numbering NameSpaces [Numb] [AbstractNumb] deriving Show @@ -152,6 +157,8 @@ data ParIndentation = ParIndentation { leftParIndent :: Maybe Integer data ParagraphStyle = ParagraphStyle { pStyle :: [String] , indentation :: Maybe ParIndentation , dropCap :: Bool + , pHeading :: Maybe (String, Int) + , pBlockQuote :: Maybe Bool } deriving Show @@ -159,6 +166,8 @@ defaultParagraphStyle :: ParagraphStyle defaultParagraphStyle = ParagraphStyle { pStyle = [] , indentation = Nothing , dropCap = False + , pHeading = Nothing + , pBlockQuote = Nothing } @@ -213,6 +222,11 @@ data RunStyle = RunStyle { isBold :: Maybe Bool , rStyle :: Maybe CharStyle} deriving Show +data ParStyleData = ParStyleData { headingLev :: Maybe (String, Int) + , isBlockQuote :: Maybe Bool + , psStyle :: Maybe ParStyle} + deriving Show + defaultRunStyle :: RunStyle defaultRunStyle = RunStyle { isBold = Nothing , isItalic = Nothing @@ -242,8 +256,8 @@ archiveToDocx archive = do numbering = archiveToNumbering archive rels = archiveToRelationships archive media = archiveToMedia archive - styles = archiveToStyles archive - rEnv = ReaderEnv notes numbering rels media Nothing styles + (styles, parstyles) = archiveToStyles archive + rEnv = ReaderEnv notes numbering rels media Nothing styles parstyles doc <- runD (archiveToDocument archive) rEnv return $ Docx doc @@ -263,47 +277,69 @@ elemToBody ns element | isElem ns "w" "body" element = (\bps -> return $ Body bps) elemToBody _ _ = throwError WrongElem -archiveToStyles :: Archive -> CharStyleMap +archiveToStyles :: Archive -> (CharStyleMap, ParStyleMap) archiveToStyles zf = let stylesElem = findEntryByPath "word/styles.xml" zf >>= (parseXMLDoc . UTF8.toStringLazy . fromEntry) in case stylesElem of - Nothing -> M.empty + Nothing -> (M.empty, M.empty) Just styElem -> let namespaces = mapMaybe attrToNSPair (elAttribs styElem) in - M.fromList $ buildBasedOnList namespaces styElem Nothing + ( M.fromList $ buildBasedOnList namespaces styElem + (Nothing :: Maybe CharStyle), + M.fromList $ buildBasedOnList namespaces styElem + (Nothing :: Maybe ParStyle) ) -isBasedOnStyle :: NameSpaces -> Element -> Maybe CharStyle -> Bool +isBasedOnStyle :: (ElemToStyle a) => NameSpaces -> Element -> Maybe a -> Bool isBasedOnStyle ns element parentStyle | isElem ns "w" "style" element - , Just "character" <- findAttr (elemName ns "w" "type") element + , Just styleType <- findAttr (elemName ns "w" "type") element + , styleType == cStyleType parentStyle , Just basedOnVal <- findChild (elemName ns "w" "basedOn") element >>= findAttr (elemName ns "w" "val") - , Just (parentId, _) <- parentStyle = (basedOnVal == parentId) + , Just ps <- parentStyle = (basedOnVal == getStyleId ps) | isElem ns "w" "style" element - , Just "character" <- findAttr (elemName ns "w" "type") element + , Just styleType <- findAttr (elemName ns "w" "type") element + , styleType == cStyleType parentStyle , Nothing <- findChild (elemName ns "w" "basedOn") element , Nothing <- parentStyle = True | otherwise = False -elemToCharStyle :: NameSpaces -> Element -> Maybe CharStyle -> Maybe CharStyle -elemToCharStyle ns element parentStyle - | isElem ns "w" "style" element - , Just "character" <- findAttr (elemName ns "w" "type") element - , Just styleId <- findAttr (elemName ns "w" "styleId") element = - Just (styleId, elemToRunStyle ns element parentStyle) - | otherwise = Nothing - -getStyleChildren :: NameSpaces -> Element -> Maybe CharStyle -> [CharStyle] +class ElemToStyle a where + cStyleType :: Maybe a -> String + elemToStyle :: NameSpaces -> Element -> Maybe a -> Maybe a + getStyleId :: a -> String + +instance ElemToStyle CharStyle where + cStyleType _ = "character" + elemToStyle ns element parentStyle + | isElem ns "w" "style" element + , Just "character" <- findAttr (elemName ns "w" "type") element + , Just styleId <- findAttr (elemName ns "w" "styleId") element = + Just (styleId, elemToRunStyle ns element parentStyle) + | otherwise = Nothing + getStyleId s = fst s + +instance ElemToStyle ParStyle where + cStyleType _ = "paragraph" + elemToStyle ns element parentStyle + | isElem ns "w" "style" element + , Just "paragraph" <- findAttr (elemName ns "w" "type") element + , Just styleId <- findAttr (elemName ns "w" "styleId") element = + Just (styleId, elemToParStyleData ns element parentStyle) + | otherwise = Nothing + getStyleId s = fst s + +getStyleChildren :: (ElemToStyle a) => NameSpaces -> Element -> Maybe a -> [a] getStyleChildren ns element parentStyle | isElem ns "w" "styles" element = - mapMaybe (\e -> elemToCharStyle ns e parentStyle) $ + mapMaybe (\e -> elemToStyle ns e parentStyle) $ filterChildren (\e' -> isBasedOnStyle ns e' parentStyle) element | otherwise = [] -buildBasedOnList :: NameSpaces -> Element -> Maybe CharStyle -> [CharStyle] +buildBasedOnList :: (ElemToStyle a) => NameSpaces -> Element -> Maybe a -> [a] buildBasedOnList ns element rootStyle = case (getStyleChildren ns element rootStyle) of [] -> [] @@ -543,7 +579,8 @@ elemToBodyPart ns element elemToBodyPart ns element | isElem ns "w" "p" element , Just (numId, lvl) <- elemToNumInfo ns element = do - let parstyle = elemToParagraphStyle ns element + sty <- asks envParStyles + let parstyle = elemToParagraphStyle ns element sty parparts <- mapD (elemToParPart ns) (elChildren element) num <- asks envNumbering case lookupLevel numId lvl num of @@ -551,7 +588,8 @@ elemToBodyPart ns element Nothing -> throwError WrongElem elemToBodyPart ns element | isElem ns "w" "p" element = do - let parstyle = elemToParagraphStyle ns element + sty <- asks envParStyles + let parstyle = elemToParagraphStyle ns element sty parparts <- mapD (elemToParPart ns) (elChildren element) return $ Paragraph parstyle parparts elemToBodyPart ns element @@ -625,17 +663,20 @@ elemToParPart ns element return $ BookMark bmId bmName elemToParPart ns element | isElem ns "w" "hyperlink" element - , Just anchor <- findAttr (elemName ns "w" "anchor") element = do + , Just relId <- findAttr (elemName ns "r" "id") element = do runs <- mapD (elemToRun ns) (elChildren element) - return $ InternalHyperLink anchor runs + rels <- asks envRelationships + case lookupRelationship relId rels of + Just target -> do + case findAttr (elemName ns "w" "anchor") element of + Just anchor -> return $ ExternalHyperLink (target ++ '#':anchor) runs + Nothing -> return $ ExternalHyperLink target runs + Nothing -> return $ ExternalHyperLink "" runs elemToParPart ns element | isElem ns "w" "hyperlink" element - , Just relId <- findAttr (elemName ns "r" "id") element = do + , Just anchor <- findAttr (elemName ns "w" "anchor") element = do runs <- mapD (elemToRun ns) (elChildren element) - rels <- asks envRelationships - return $ case lookupRelationship relId rels of - Just target -> ExternalHyperLink target runs - Nothing -> ExternalHyperLink "" runs + return $ InternalHyperLink anchor runs elemToParPart ns element | isElem ns "m" "oMath" element = (eitherToD $ readOMML $ showElement element) >>= (return . PlainOMath) @@ -684,14 +725,30 @@ elemToRun ns element return $ Run runStyle runElems elemToRun _ _ = throwError WrongElem -elemToParagraphStyle :: NameSpaces -> Element -> ParagraphStyle -elemToParagraphStyle ns element +getParentStyleValue :: (ParStyleData -> Maybe a) -> ParStyleData -> Maybe a +getParentStyleValue field style + | Just value <- field style = Just value + | Just parentStyle <- psStyle style + = getParentStyleValue field (snd parentStyle) +getParentStyleValue _ _ = Nothing + +getParStyleField :: (ParStyleData -> Maybe a) -> ParStyleMap -> [String] -> + Maybe a +getParStyleField field stylemap styles + | x <- mapMaybe (\x -> M.lookup x stylemap) styles + , (y:_) <- mapMaybe (getParentStyleValue field) x + = Just y +getParStyleField _ _ _ = Nothing + +elemToParagraphStyle :: NameSpaces -> Element -> ParStyleMap -> ParagraphStyle +elemToParagraphStyle ns element sty | Just pPr <- findChild (elemName ns "w" "pPr") element = - ParagraphStyle - {pStyle = + let style = mapMaybe (findAttr (elemName ns "w" "val")) (findChildren (elemName ns "w" "pStyle") pPr) + in ParagraphStyle + {pStyle = style , indentation = findChild (elemName ns "w" "ind") pPr >>= elemToParIndentation ns @@ -703,8 +760,10 @@ elemToParagraphStyle ns element Just "none" -> False Just _ -> True Nothing -> False + , pHeading = getParStyleField headingLev sty style + , pBlockQuote = getParStyleField isBlockQuote sty style } -elemToParagraphStyle _ _ = defaultParagraphStyle +elemToParagraphStyle _ _ _ = defaultParagraphStyle checkOnOff :: NameSpaces -> Element -> QName -> Maybe Bool checkOnOff ns rPr tag @@ -758,6 +817,45 @@ elemToRunStyle ns element parentStyle } elemToRunStyle _ _ _ = defaultRunStyle +isNumericNotNull :: String -> Bool +isNumericNotNull str = (str /= []) && (all isDigit str) + +getHeaderLevel :: NameSpaces -> Element -> Maybe (String,Int) +getHeaderLevel ns element + | Just styleId <- findAttr (elemName ns "w" "styleId") element + , Just index <- stripPrefix "Heading" styleId + , isNumericNotNull index = Just (styleId, read index) + | Just styleId <- findAttr (elemName ns "w" "styleId") element + , Just index <- findChild (elemName ns "w" "name") element >>= + findAttr (elemName ns "w" "val") >>= + stripPrefix "heading " + , isNumericNotNull index = Just (styleId, read index) +getHeaderLevel _ _ = Nothing + +blockQuoteStyleIds :: [String] +blockQuoteStyleIds = ["Quote", "BlockQuote", "BlockQuotation"] + +blockQuoteStyleNames :: [String] +blockQuoteStyleNames = ["Quote", "Block Text"] + +getBlockQuote :: NameSpaces -> Element -> Maybe Bool +getBlockQuote ns element + | Just styleId <- findAttr (elemName ns "w" "styleId") element + , styleId `elem` blockQuoteStyleIds = Just True + | Just styleName <- findChild (elemName ns "w" "name") element >>= + findAttr (elemName ns "w" "val") + , styleName `elem` blockQuoteStyleNames = Just True +getBlockQuote _ _ = Nothing + +elemToParStyleData :: NameSpaces -> Element -> Maybe ParStyle -> ParStyleData +elemToParStyleData ns element parentStyle = + ParStyleData + { + headingLev = getHeaderLevel ns element + , isBlockQuote = getBlockQuote ns element + , psStyle = parentStyle + } + elemToRunElem :: NameSpaces -> Element -> D RunElem elemToRunElem ns element | isElem ns "w" "t" element diff --git a/src/Text/Pandoc/Readers/HTML.hs b/src/Text/Pandoc/Readers/HTML.hs index 4e0bb375a..2a23f2a62 100644 --- a/src/Text/Pandoc/Readers/HTML.hs +++ b/src/Text/Pandoc/Readers/HTML.hs @@ -740,7 +740,7 @@ pSpace = many1 (satisfy isSpace) >> return B.space -- eitherBlockOrInline :: [String] -eitherBlockOrInline = ["audio", "applet", "button", "iframe", +eitherBlockOrInline = ["audio", "applet", "button", "iframe", "embed", "del", "ins", "progress", "map", "area", "noscript", "script", "object", "svg", "video", "source"] @@ -758,7 +758,7 @@ blockHtmlTags :: [String] blockHtmlTags = ["?xml", "!DOCTYPE", "address", "article", "aside", "blockquote", "body", "button", "canvas", "caption", "center", "col", "colgroup", "dd", "dir", "div", - "dl", "dt", "embed", "fieldset", "figcaption", "figure", + "dl", "dt", "fieldset", "figcaption", "figure", "footer", "form", "h1", "h2", "h3", "h4", "h5", "h6", "head", "header", "hgroup", "hr", "html", "isindex", "menu", "noframes", "ol", "output", "p", "pre", diff --git a/src/Text/Pandoc/Readers/LaTeX.hs b/src/Text/Pandoc/Readers/LaTeX.hs index 6748f1efd..5a3ba4808 100644 --- a/src/Text/Pandoc/Readers/LaTeX.hs +++ b/src/Text/Pandoc/Readers/LaTeX.hs @@ -414,11 +414,14 @@ inlineCommands = M.fromList $ , ("sim", lit "~") , ("label", unlessParseRaw >> (inBrackets <$> tok)) , ("ref", unlessParseRaw >> (inBrackets <$> tok)) + , ("noindent", unlessParseRaw >> return mempty) + , ("textgreek", tok) , ("sep", lit ",") - , ("cref", unlessParseRaw >> (inBrackets <$> tok)) + , ("cref", unlessParseRaw >> (inBrackets <$> tok)) -- from cleveref.sty , ("(", mathInline $ manyTill anyChar (try $ string "\\)")) , ("[", mathDisplay $ manyTill anyChar (try $ string "\\]")) , ("ensuremath", mathInline $ braced) + , ("texorpdfstring", (\_ x -> x) <$> tok <*> tok) , ("P", lit "¶") , ("S", lit "§") , ("$", lit "$") @@ -496,6 +499,7 @@ inlineCommands = M.fromList $ , ("citealp", citation "citealp" NormalCitation False) , ("citealp*", citation "citealp*" NormalCitation False) , ("autocite", citation "autocite" NormalCitation False) + , ("smartcite", citation "smartcite" NormalCitation False) , ("footcite", inNote <$> citation "footcite" NormalCitation False) , ("parencite", citation "parencite" NormalCitation False) , ("supercite", citation "supercite" NormalCitation False) @@ -518,6 +522,7 @@ inlineCommands = M.fromList $ , ("supercites", citation "supercites" NormalCitation True) , ("footcitetexts", inNote <$> citation "footcitetexts" NormalCitation True) , ("Autocite", citation "Autocite" NormalCitation False) + , ("Smartcite", citation "Smartcite" NormalCitation False) , ("Footcite", citation "Footcite" NormalCitation False) , ("Parencite", citation "Parencite" NormalCitation False) , ("Supercite", citation "Supercite" NormalCitation False) @@ -544,7 +549,7 @@ inlineCommands = M.fromList $ ] ++ map ignoreInlines -- these commands will be ignored unless --parse-raw is specified, -- in which case they will appear as raw latex blocks: - [ "noindent", "index" ] + [ "index" ] mkImage :: String -> LP Inlines mkImage src = do diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs index 02a787670..187b479c3 100644 --- a/src/Text/Pandoc/Readers/Markdown.hs +++ b/src/Text/Pandoc/Readers/Markdown.hs @@ -79,11 +79,7 @@ readMarkdownWithWarnings :: ReaderOptions -- ^ Reader options -> String -- ^ String to parse (assuming @'\n'@ line endings) -> (Pandoc, [String]) readMarkdownWithWarnings opts s = - (readWith parseMarkdownWithWarnings) def{ stateOptions = opts } (s ++ "\n\n") - where parseMarkdownWithWarnings = do - doc <- parseMarkdown - warnings <- stateWarnings <$> getState - return (doc, warnings) + (readWithWarnings parseMarkdown) def{ stateOptions = opts } (s ++ "\n\n") trimInlinesF :: F Inlines -> F Inlines trimInlinesF = liftM trimInlines @@ -117,6 +113,12 @@ isBlank _ = False -- auxiliary functions -- +-- | Succeeds when we're in list context. +inList :: MarkdownParser () +inList = do + ctx <- stateParserContext <$> getState + guard (ctx == ListItemState) + isNull :: F Inlines -> Bool isNull ils = B.isNull $ runF ils def @@ -337,12 +339,6 @@ parseMarkdown = do let Pandoc _ bs = B.doc $ runF blocks st return $ Pandoc meta bs -addWarning :: Maybe SourcePos -> String -> MarkdownParser () -addWarning mbpos msg = - updateState $ \st -> st{ - stateWarnings = (msg ++ maybe "" (\pos -> " " ++ show pos) mbpos) : - stateWarnings st } - referenceKey :: MarkdownParser (F Blocks) referenceKey = try $ do pos <- getPosition @@ -735,9 +731,9 @@ anyOrderedListStart = try $ do skipNonindentSpaces notFollowedBy $ string "p." >> spaceChar >> digit -- page number res <- do guardDisabled Ext_fancy_lists - many1 digit + start <- many1 digit >>= safeRead char '.' - return (1, DefaultStyle, DefaultDelim) + return (start, DefaultStyle, DefaultDelim) <|> do (num, style, delim) <- anyOrderedListMarker -- if it could be an abbreviated first name, -- insist on more than one space @@ -926,6 +922,8 @@ para = try $ do <|> (guardEnabled Ext_backtick_code_blocks >> () <$ lookAhead codeBlockFenced) <|> (guardDisabled Ext_blank_before_header >> () <$ lookAhead header) <|> (guardEnabled Ext_lists_without_preceding_blankline >> + -- Avoid creating a paragraph in a nested list. + notFollowedBy' inList >> () <$ lookAhead listStart) <|> do guardEnabled Ext_native_divs inHtmlBlock <- stateInHtmlBlock <$> getState @@ -1610,8 +1608,7 @@ endline = try $ do newline notFollowedBy blankline -- parse potential list-starts differently if in a list: - st <- getState - when (stateParserContext st == ListItemState) $ notFollowedBy listStart + notFollowedBy (inList >> listStart) guardDisabled Ext_lists_without_preceding_blankline <|> notFollowedBy listStart guardEnabled Ext_blank_before_blockquote <|> notFollowedBy emailBlockQuoteStart guardEnabled Ext_blank_before_header <|> notFollowedBy (char '#') -- atx header @@ -1678,9 +1675,10 @@ referenceLink :: (String -> String -> Inlines -> Inlines) -> (F Inlines, String) -> MarkdownParser (F Inlines) referenceLink constructor (lab, raw) = do sp <- (True <$ lookAhead (char ' ')) <|> return False - (ref,raw') <- try - (skipSpaces >> optional (newline >> skipSpaces) >> reference) - <|> return (mempty, "") + (ref,raw') <- option (mempty, "") $ + lookAhead (try (spnl >> normalCite >> return (mempty, ""))) + <|> + try (spnl >> reference) let labIsRef = raw' == "" || raw' == "[]" let key = toKey $ if labIsRef then raw else raw' parsedRaw <- parseFromString (mconcat <$> many inline) raw' diff --git a/src/Text/Pandoc/Readers/Org.hs b/src/Text/Pandoc/Readers/Org.hs index 5c00a1b27..440b6d144 100644 --- a/src/Text/Pandoc/Readers/Org.hs +++ b/src/Text/Pandoc/Readers/Org.hs @@ -42,6 +42,7 @@ import Text.Pandoc.Parsing hiding ( F, unF, askF, asksF, runF import Text.Pandoc.Readers.LaTeX (inlineCommand, rawLaTeXInline) import Text.Pandoc.Shared (compactify', compactify'DL) import Text.TeXMath (readTeX, writePandoc, DisplayType(..)) +import qualified Text.TeXMath.Readers.MathML.EntityMap as MathMLEntityMap import Control.Applicative ( Applicative, pure , (<$>), (<$), (<*>), (<*), (*>) ) @@ -69,7 +70,32 @@ parseOrg = do blocks' <- parseBlocks st <- getState let meta = runF (orgStateMeta' st) st - return $ Pandoc meta $ filter (/= Null) (B.toList $ runF blocks' st) + let removeUnwantedBlocks = dropCommentTrees . filter (/= Null) + return $ Pandoc meta $ removeUnwantedBlocks (B.toList $ runF blocks' st) + +-- | Drop COMMENT headers and the document tree below those headers. +dropCommentTrees :: [Block] -> [Block] +dropCommentTrees [] = [] +dropCommentTrees blks@(b:bs) = + maybe blks (flip dropUntilHeaderAboveLevel bs) $ commentHeaderLevel b + +-- | Return the level of a header starting a comment tree and Nothing +-- otherwise. +commentHeaderLevel :: Block -> Maybe Int +commentHeaderLevel blk = + case blk of + (Header level _ ((Str "COMMENT"):_)) -> Just level + _ -> Nothing + +-- | Drop blocks until a header on or above the given level is seen +dropUntilHeaderAboveLevel :: Int -> [Block] -> [Block] +dropUntilHeaderAboveLevel n = dropWhile (not . isHeaderLevelLowerEq n) + +isHeaderLevelLowerEq :: Int -> Block -> Bool +isHeaderLevelLowerEq n blk = + case blk of + (Header level _ _) -> n >= level + _ -> False -- -- Parser State for Org @@ -828,12 +854,14 @@ list :: OrgParser (F Blocks) list = choice [ definitionList, bulletList, orderedList ] <?> "list" definitionList :: OrgParser (F Blocks) -definitionList = fmap B.definitionList . fmap compactify'DL . sequence - <$> many1 (definitionListItem bulletListStart) +definitionList = try $ do n <- lookAhead (bulletListStart' Nothing) + fmap B.definitionList . fmap compactify'DL . sequence + <$> many1 (definitionListItem $ bulletListStart' (Just n)) bulletList :: OrgParser (F Blocks) -bulletList = fmap B.bulletList . fmap compactify' . sequence - <$> many1 (listItem bulletListStart) +bulletList = try $ do n <- lookAhead (bulletListStart' Nothing) + fmap B.bulletList . fmap compactify' . sequence + <$> many1 (listItem (bulletListStart' $ Just n)) orderedList :: OrgParser (F Blocks) orderedList = fmap B.orderedList . fmap compactify' . sequence @@ -845,10 +873,27 @@ genericListStart listMarker = try $ (+) <$> (length <$> many spaceChar) <*> (length <$> listMarker <* many1 spaceChar) --- parses bullet list start and returns its length (excl. following whitespace) +-- parses bullet list marker. maybe we know the indent level bulletListStart :: OrgParser Int -bulletListStart = genericListStart bulletListMarker - where bulletListMarker = pure <$> oneOf "*-+" +bulletListStart = bulletListStart' Nothing + +bulletListStart' :: Maybe Int -> OrgParser Int +-- returns length of bulletList prefix, inclusive of marker +bulletListStart' Nothing = do ind <- length <$> many spaceChar + when (ind == 0) $ notFollowedBy (char '*') + oneOf bullets + many1 spaceChar + return (ind + 1) + -- Unindented lists are legal, but they can't use '*' bullets + -- We return n to maintain compatibility with the generic listItem +bulletListStart' (Just n) = do count (n-1) spaceChar + when (n == 1) $ notFollowedBy (char '*') + oneOf bullets + many1 spaceChar + return n + +bullets :: String +bullets = "*+-" orderedListStart :: OrgParser Int orderedListStart = genericListStart orderedListMarker @@ -927,7 +972,7 @@ parseInlines = trimInlinesF . mconcat <$> many1 inline -- treat these as potentially non-text when parsing inline: specialChars :: [Char] -specialChars = "\"$'()*+-./:<=>[\\]^_{|}~" +specialChars = "\"$'()*+-,./:<=>[\\]^_{|}~" whitespace :: OrgParser (F Inlines) @@ -1054,7 +1099,7 @@ linkOrImage = explicitOrImageLink explicitOrImageLink :: OrgParser (F Inlines) explicitOrImageLink = try $ do char '[' - srcF <- applyCustomLinkFormat =<< linkTarget + srcF <- applyCustomLinkFormat =<< possiblyEmptyLinkTarget title <- enclosedRaw (char '[') (char ']') title' <- parseFromString (mconcat <$> many inline) title char ']' @@ -1087,6 +1132,9 @@ selfTarget = try $ char '[' *> linkTarget <* char ']' linkTarget :: OrgParser String linkTarget = enclosedByPair '[' ']' (noneOf "\n\r[]") +possiblyEmptyLinkTarget :: OrgParser String +possiblyEmptyLinkTarget = try linkTarget <|> ("" <$ string "[]") + applyCustomLinkFormat :: String -> OrgParser (F String) applyCustomLinkFormat link = do let (linkType, rest) = break (== ':') link @@ -1094,27 +1142,38 @@ applyCustomLinkFormat link = do formatter <- M.lookup linkType <$> asksF orgStateLinkFormatters return $ maybe link ($ drop 1 rest) formatter - +-- TODO: might be a lot smarter/cleaner to use parsec and ADTs for this kind +-- of parsing. linkToInlinesF :: String -> Inlines -> F Inlines -linkToInlinesF s@('#':_) = pure . B.link s "" -linkToInlinesF s - | isImageFilename s = const . pure $ B.image s "" "" - | isUri s = pure . B.link s "" - | isRelativeUrl s = pure . B.link s "" -linkToInlinesF s = \title -> do - anchorB <- (s `elem`) <$> asksF orgStateAnchorIds - if anchorB - then pure $ B.link ('#':s) "" title - else pure $ B.emph title - -isRelativeUrl :: String -> Bool -isRelativeUrl s = (':' `notElem` s) && ("./" `isPrefixOf` s) +linkToInlinesF s = + case s of + "" -> pure . B.link "" "" + ('#':_) -> pure . B.link s "" + _ | isImageFilename s -> const . pure $ B.image s "" "" + _ | isFileLink s -> pure . B.link (dropLinkType s) "" + _ | isUri s -> pure . B.link s "" + _ | isAbsoluteFilePath s -> pure . B.link ("file://" ++ s) "" + _ | isRelativeFilePath s -> pure . B.link s "" + _ -> internalLink s + +isFileLink :: String -> Bool +isFileLink s = ("file:" `isPrefixOf` s) && not ("file://" `isPrefixOf` s) + +dropLinkType :: String -> String +dropLinkType = tail . snd . break (== ':') + +isRelativeFilePath :: String -> Bool +isRelativeFilePath s = (("./" `isPrefixOf` s) || ("../" `isPrefixOf` s)) && + (':' `notElem` s) isUri :: String -> Bool isUri s = let (scheme, path) = break (== ':') s in all (\c -> isAlphaNum c || c `elem` ".-") scheme && not (null path) +isAbsoluteFilePath :: String -> Bool +isAbsoluteFilePath = ('/' ==) . head + isImageFilename :: String -> Bool isImageFilename filename = any (\x -> ('.':x) `isSuffixOf` filename) imageExtensions && @@ -1124,6 +1183,13 @@ isImageFilename filename = imageExtensions = [ "jpeg" , "jpg" , "png" , "gif" , "svg" ] protocols = [ "file", "http", "https" ] +internalLink :: String -> Inlines -> F Inlines +internalLink link title = do + anchorB <- (link `elem`) <$> asksF orgStateAnchorIds + if anchorB + then return $ B.link ('#':link) "" title + else return $ B.emph title + -- | Parse an anchor like @<<anchor-id>>@ and return an empty span with -- @anchor-id@ set as id. Legal anchors in org-mode are defined through -- @org-target-regexp@, which is fairly liberal. Since no link is created if @@ -1205,10 +1271,10 @@ displayMath = return . B.displayMath <$> choice [ rawMathBetween "\\[" "\\]" ] symbol :: OrgParser (F Inlines) symbol = return . B.str . (: "") <$> (oneOf specialChars >>= updatePositions) - where updatePositions c - | c `elem` emphasisPreChars = c <$ updateLastPreCharPos - | c `elem` emphasisForbiddenBorderChars = c <$ updateLastForbiddenCharPos - | otherwise = return c + where updatePositions c = do + when (c `elem` emphasisPreChars) updateLastPreCharPos + when (c `elem` emphasisForbiddenBorderChars) updateLastForbiddenCharPos + return c emphasisBetween :: Char -> OrgParser (F Inlines) @@ -1387,7 +1453,8 @@ simpleSubOrSuperString = try $ inlineLaTeX :: OrgParser (F Inlines) inlineLaTeX = try $ do cmd <- inlineLaTeXCommand - maybe mzero returnF $ parseAsMath cmd `mplus` parseAsInlineLaTeX cmd + maybe mzero returnF $ + parseAsMath cmd `mplus` parseAsMathMLSym cmd `mplus` parseAsInlineLaTeX cmd where parseAsMath :: String -> Maybe Inlines parseAsMath cs = B.fromList <$> texMathToPandoc cs @@ -1395,6 +1462,11 @@ inlineLaTeX = try $ do parseAsInlineLaTeX :: String -> Maybe Inlines parseAsInlineLaTeX cs = maybeRight $ runParser inlineCommand state "" cs + parseAsMathMLSym :: String -> Maybe Inlines + parseAsMathMLSym cs = B.str <$> MathMLEntityMap.getUnicode (clean cs) + -- dropWhileEnd would be nice here, but it's not available before base 4.5 + where clean = reverse . dropWhile (`elem` "{}") . reverse . drop 1 + state :: ParserState state = def{ stateOptions = def{ readerParseRaw = True }} diff --git a/src/Text/Pandoc/Readers/RST.hs b/src/Text/Pandoc/Readers/RST.hs index e5eccb116..8bfc6f606 100644 --- a/src/Text/Pandoc/Readers/RST.hs +++ b/src/Text/Pandoc/Readers/RST.hs @@ -29,25 +29,26 @@ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Conversion from reStructuredText to 'Pandoc' document. -} module Text.Pandoc.Readers.RST ( - readRST + readRST, + readRSTWithWarnings ) where import Text.Pandoc.Definition import Text.Pandoc.Builder (setMeta, fromList) import Text.Pandoc.Shared import Text.Pandoc.Parsing import Text.Pandoc.Options -import Control.Monad ( when, liftM, guard, mzero, mplus ) +import Control.Monad ( when, liftM, guard, mzero ) import Data.List ( findIndex, intersperse, intercalate, - transpose, sort, deleteFirstsBy, isSuffixOf ) + transpose, sort, deleteFirstsBy, isSuffixOf , nub, union) import Data.Maybe (fromMaybe) import qualified Data.Map as M import Text.Printf ( printf ) -import Control.Applicative ((<$>), (<$), (<*), (*>), (<*>)) +import Control.Applicative ((<$>), (<$), (<*), (*>), (<*>), pure) import Text.Pandoc.Builder (Inlines, Blocks, trimInlines, (<>)) import qualified Text.Pandoc.Builder as B import Data.Monoid (mconcat, mempty) import Data.Sequence (viewr, ViewR(..)) -import Data.Char (toLower, isHexDigit) +import Data.Char (toLower, isHexDigit, isSpace) -- | Parse reStructuredText string and return Pandoc document. readRST :: ReaderOptions -- ^ Reader options @@ -55,6 +56,9 @@ readRST :: ReaderOptions -- ^ Reader options -> Pandoc readRST opts s = (readWith parseRST) def{ stateOptions = opts } (s ++ "\n\n") +readRSTWithWarnings :: ReaderOptions -> String -> (Pandoc, [String]) +readRSTWithWarnings opts s = (readWithWarnings parseRST) def{ stateOptions = opts } (s ++ "\n\n") + type RSTParser = Parser [Char] ParserState -- @@ -335,6 +339,13 @@ indentedBlock = try $ do optional blanklines return $ unlines lns +quotedBlock :: Parser [Char] st [Char] +quotedBlock = try $ do + quote <- lookAhead $ oneOf "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~" + lns <- many1 $ lookAhead (char quote) >> anyLine + optional blanklines + return $ unlines lns + codeBlockStart :: Parser [Char] st Char codeBlockStart = string "::" >> blankline >> blankline @@ -342,7 +353,8 @@ codeBlock :: Parser [Char] st Blocks codeBlock = try $ codeBlockStart >> codeBlockBody codeBlockBody :: Parser [Char] st Blocks -codeBlockBody = try $ B.codeBlock . stripTrailingNewlines <$> indentedBlock +codeBlockBody = try $ B.codeBlock . stripTrailingNewlines <$> + (indentedBlock <|> quotedBlock) lhsCodeBlock :: RSTParser Blocks lhsCodeBlock = try $ do @@ -513,7 +525,6 @@ directive = try $ do -- TODO: line-block, parsed-literal, table, csv-table, list-table -- date -- include --- class -- title directive' :: RSTParser Blocks directive' = do @@ -594,38 +605,69 @@ directive' = do Just t -> B.link (escapeURI $ trim t) "" $ B.image src "" alt Nothing -> B.image src "" alt - _ -> return mempty + "class" -> do + let attrs = ("", (splitBy isSpace $ trim top), map (\(k,v) -> (k, trimr v)) fields) + -- directive content or the first immediately following element + children <- case body of + "" -> block + _ -> parseFromString parseBlocks body' + return $ B.divWith attrs children + other -> do + pos <- getPosition + addWarning (Just pos) $ "ignoring unknown directive: " ++ other + return mempty -- TODO: -- - Silently ignores illegal fields --- - Silently drops classes -- - Only supports :format: fields with a single format for :raw: roles, -- change Text.Pandoc.Definition.Format to fix addNewRole :: String -> [(String, String)] -> RSTParser Blocks addNewRole roleString fields = do (role, parentRole) <- parseFromString inheritedRole roleString customRoles <- stateRstCustomRoles <$> getState - baseRole <- case M.lookup parentRole customRoles of - Just (base, _, _) -> return base - Nothing -> return parentRole - - let fmt = if baseRole == "raw" then lookup "format" fields else Nothing - annotate = maybe id addLanguage $ - if baseRole == "code" + let (baseRole, baseFmt, baseAttr) = + maybe (parentRole, Nothing, nullAttr) id $ + M.lookup parentRole customRoles + fmt = if parentRole == "raw" then lookup "format" fields else baseFmt + annotate :: [String] -> [String] + annotate = maybe id (:) $ + if parentRole == "code" then lookup "language" fields else Nothing + attr = let (ident, classes, keyValues) = baseAttr + -- nub in case role name & language class are the same + in (ident, nub . (role :) . annotate $ classes, keyValues) + + -- warn about syntax we ignore + flip mapM_ fields $ \(key, _) -> case key of + "language" -> when (parentRole /= "code") $ addWarning Nothing $ + "ignoring :language: field because the parent of role :" ++ + role ++ ": is :" ++ parentRole ++ ": not :code:" + "format" -> when (parentRole /= "raw") $ addWarning Nothing $ + "ignoring :format: field because the parent of role :" ++ + role ++ ": is :" ++ parentRole ++ ": not :raw:" + _ -> addWarning Nothing $ "ignoring unknown field :" ++ key ++ + ": in definition of role :" ++ role ++ ": in" + when (parentRole == "raw" && countKeys "format" > 1) $ + addWarning Nothing $ + "ignoring :format: fields after the first in the definition of role :" + ++ role ++": in" + when (parentRole == "code" && countKeys "language" > 1) $ + addWarning Nothing $ + "ignoring :language: fields after the first in the definition of role :" + ++ role ++": in" updateState $ \s -> s { stateRstCustomRoles = - M.insert role (baseRole, fmt, (,) parentRole . annotate) customRoles + M.insert role (baseRole, fmt, attr) customRoles } return $ B.singleton Null where - addLanguage lang (ident, classes, keyValues) = - (ident, "sourceCode" : lang : classes, keyValues) + countKeys k = length . filter (== k) . map fst $ fields inheritedRole = - (,) <$> roleNameEndingIn (char '(') <*> roleNameEndingIn (char ')') + (,) <$> roleName <*> ((char '(' *> roleName <* char ')') <|> pure "span") + -- Can contain character codes as decimal numbers or -- hexadecimal numbers, prefixed by 0x, x, \x, U+, u, or \u @@ -985,21 +1027,23 @@ renderRole contents fmt role attr = case role of "RFC" -> return $ rfcLink contents "pep-reference" -> return $ pepLink contents "PEP" -> return $ pepLink contents - "literal" -> return $ B.str contents + "literal" -> return $ B.codeWith attr contents "math" -> return $ B.math contents "title-reference" -> titleRef contents "title" -> titleRef contents "t" -> titleRef contents - "code" -> return $ B.codeWith attr contents + "code" -> return $ B.codeWith (addClass "sourceCode" attr) contents + "span" -> return $ B.spanWith attr $ B.str contents "raw" -> return $ B.rawInline (fromMaybe "" fmt) contents custom -> do - customRole <- stateRstCustomRoles <$> getState - case M.lookup custom customRole of - Just (_, newFmt, inherit) -> let - fmtStr = fmt `mplus` newFmt - (newRole, newAttr) = inherit attr - in renderRole contents fmtStr newRole newAttr - Nothing -> return $ B.str contents -- Undefined role + customRoles <- stateRstCustomRoles <$> getState + case M.lookup custom customRoles of + Just (newRole, newFmt, newAttr) -> + renderRole contents newFmt newRole newAttr + Nothing -> do + pos <- getPosition + addWarning (Just pos) $ "ignoring unknown role :" ++ custom ++ ": in" + return $ B.str contents -- Undefined role where titleRef ref = return $ B.str ref -- FIXME: Not a sensible behaviour rfcLink rfcNo = B.link rfcUrl ("RFC " ++ rfcNo) $ B.str ("RFC " ++ rfcNo) @@ -1008,11 +1052,14 @@ renderRole contents fmt role attr = case role of where padNo = replicate (4 - length pepNo) '0' ++ pepNo pepUrl = "http://www.python.org/dev/peps/pep-" ++ padNo ++ "/" -roleNameEndingIn :: RSTParser Char -> RSTParser String -roleNameEndingIn end = many1Till (letter <|> char '-') end +addClass :: String -> Attr -> Attr +addClass c (ident, classes, keyValues) = (ident, union classes [c], keyValues) + +roleName :: RSTParser String +roleName = many1 (letter <|> char '-') roleMarker :: RSTParser String -roleMarker = char ':' *> roleNameEndingIn (char ':') +roleMarker = char ':' *> roleName <* char ':' roleBefore :: RSTParser (String,String) roleBefore = try $ do diff --git a/src/Text/Pandoc/Readers/TWiki.hs b/src/Text/Pandoc/Readers/TWiki.hs new file mode 100644 index 000000000..c2325c0ea --- /dev/null +++ b/src/Text/Pandoc/Readers/TWiki.hs @@ -0,0 +1,526 @@ +{-# LANGUAGE RelaxedPolyRec, FlexibleInstances, TypeSynonymInstances #-} +-- RelaxedPolyRec needed for inlinesBetween on GHC < 7 +{- + Copyright (C) 2014 Alexander Sulfrian <alexander.sulfrian@fu-berlin.de> + +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2 of the License, or +(at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; if not, write to the Free Software +Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +-} + +{- | + Module : Text.Pandoc.Readers.TWiki + Copyright : Copyright (C) 2014 Alexander Sulfrian + License : GNU GPL, version 2 or above + + Maintainer : Alexander Sulfrian <alexander.sulfrian@fu-berlin.de> + Stability : alpha + Portability : portable + +Conversion of twiki text to 'Pandoc' document. +-} +module Text.Pandoc.Readers.TWiki ( readTWiki + , readTWikiWithWarnings + ) where + +import Text.Pandoc.Definition +import qualified Text.Pandoc.Builder as B +import Text.Pandoc.Options +import Text.Pandoc.Parsing hiding (enclosed, macro, nested) +import Text.Pandoc.Readers.HTML (htmlTag, isCommentTag) +import Data.Monoid (Monoid, mconcat, mempty) +import Control.Applicative ((<$>), (<*), (*>), (<$)) +import Control.Monad +import Text.Printf (printf) +import Debug.Trace (trace) +import Text.Pandoc.XML (fromEntities) +import Data.Maybe (fromMaybe) +import Text.HTML.TagSoup +import Data.Char (isAlphaNum) +import qualified Data.Foldable as F + +-- | Read twiki from an input string and return a Pandoc document. +readTWiki :: ReaderOptions -- ^ Reader options + -> String -- ^ String to parse (assuming @'\n'@ line endings) + -> Pandoc +readTWiki opts s = + (readWith parseTWiki) def{ stateOptions = opts } (s ++ "\n\n") + +readTWikiWithWarnings :: ReaderOptions -- ^ Reader options + -> String -- ^ String to parse (assuming @'\n'@ line endings) + -> (Pandoc, [String]) +readTWikiWithWarnings opts s = + (readWith parseTWikiWithWarnings) def{ stateOptions = opts } (s ++ "\n\n") + where parseTWikiWithWarnings = do + doc <- parseTWiki + warnings <- stateWarnings <$> getState + return (doc, warnings) + +type TWParser = Parser [Char] ParserState + +-- +-- utility functions +-- + +tryMsg :: String -> TWParser a -> TWParser a +tryMsg msg p = try p <?> msg + +skip :: TWParser a -> TWParser () +skip parser = parser >> return () + +nested :: TWParser a -> TWParser a +nested p = do + nestlevel <- stateMaxNestingLevel <$> getState + guard $ nestlevel > 0 + updateState $ \st -> st{ stateMaxNestingLevel = stateMaxNestingLevel st - 1 } + res <- p + updateState $ \st -> st{ stateMaxNestingLevel = nestlevel } + return res + +htmlElement :: String -> TWParser (Attr, String) +htmlElement tag = tryMsg tag $ do + (TagOpen _ attr, _) <- htmlTag (~== TagOpen tag []) + content <- manyTill anyChar (endtag <|> endofinput) + return (htmlAttrToPandoc attr, trim content) + where + endtag = skip $ htmlTag (~== TagClose tag) + endofinput = lookAhead $ try $ skipMany blankline >> skipSpaces >> eof + trim = dropWhile (=='\n') . reverse . dropWhile (=='\n') . reverse + +htmlAttrToPandoc :: [Attribute String] -> Attr +htmlAttrToPandoc attrs = (ident, classes, keyvals) + where + ident = fromMaybe "" $ lookup "id" attrs + classes = maybe [] words $ lookup "class" attrs + keyvals = [(k,v) | (k,v) <- attrs, k /= "id" && k /= "class"] + +parseHtmlContentWithAttrs :: String -> TWParser a -> TWParser (Attr, [a]) +parseHtmlContentWithAttrs tag parser = do + (attr, content) <- htmlElement tag + parsedContent <- try $ parseContent content + return (attr, parsedContent) + where + parseContent = parseFromString $ nested $ manyTill parser endOfContent + endOfContent = try $ skipMany blankline >> skipSpaces >> eof + +parseHtmlContent :: String -> TWParser a -> TWParser [a] +parseHtmlContent tag p = parseHtmlContentWithAttrs tag p >>= return . snd + +-- +-- main parser +-- + +parseTWiki :: TWParser Pandoc +parseTWiki = do + bs <- mconcat <$> many block + spaces + eof + return $ B.doc bs + + +-- +-- block parsers +-- + +block :: TWParser B.Blocks +block = do + tr <- getOption readerTrace + pos <- getPosition + res <- mempty <$ skipMany1 blankline + <|> blockElements + <|> para + skipMany blankline + when tr $ + trace (printf "line %d: %s" (sourceLine pos) + (take 60 $ show $ B.toList res)) (return ()) + return res + +blockElements :: TWParser B.Blocks +blockElements = choice [ separator + , header + , verbatim + , literal + , list "" + , table + , blockQuote + , noautolink + ] + +separator :: TWParser B.Blocks +separator = tryMsg "separator" $ string "---" >> newline >> return B.horizontalRule + +header :: TWParser B.Blocks +header = tryMsg "header" $ do + string "---" + level <- many1 (char '+') >>= return . length + guard $ level <= 6 + classes <- option [] $ string "!!" >> return ["unnumbered"] + skipSpaces + content <- B.trimInlines . mconcat <$> manyTill inline newline + attr <- registerHeader ("", classes, []) content + return $ B.headerWith attr level $ content + +verbatim :: TWParser B.Blocks +verbatim = (htmlElement "verbatim" <|> htmlElement "pre") + >>= return . (uncurry B.codeBlockWith) + +literal :: TWParser B.Blocks +literal = htmlElement "literal" >>= return . rawBlock + where + format (_, _, kvs) = fromMaybe "html" $ lookup "format" kvs + rawBlock (attrs, content) = B.rawBlock (format attrs) content + +list :: String -> TWParser B.Blocks +list prefix = choice [ bulletList prefix + , orderedList prefix + , definitionList prefix] + +definitionList :: String -> TWParser B.Blocks +definitionList prefix = tryMsg "definitionList" $ do + indent <- lookAhead $ string prefix *> (many1 $ string " ") <* string "$ " + elements <- many $ parseDefinitionListItem (prefix ++ concat indent) + return $ B.definitionList elements + where + parseDefinitionListItem :: String -> TWParser (B.Inlines, [B.Blocks]) + parseDefinitionListItem indent = do + string (indent ++ "$ ") >> skipSpaces + term <- many1Till inline $ string ": " + line <- listItemLine indent $ string "$ " + return $ (mconcat term, [line]) + +bulletList :: String -> TWParser B.Blocks +bulletList prefix = tryMsg "bulletList" $ + parseList prefix (char '*') (char ' ') + +orderedList :: String -> TWParser B.Blocks +orderedList prefix = tryMsg "orderedList" $ + parseList prefix (oneOf "1iIaA") (string ". ") + +parseList :: Show a => String -> TWParser Char -> TWParser a -> TWParser B.Blocks +parseList prefix marker delim = do + (indent, style) <- lookAhead $ string prefix *> listStyle <* delim + blocks <- many $ parseListItem (prefix ++ indent) (char style <* delim) + return $ case style of + '1' -> B.orderedListWith (1, DefaultStyle, DefaultDelim) blocks + 'i' -> B.orderedListWith (1, LowerRoman, DefaultDelim) blocks + 'I' -> B.orderedListWith (1, UpperRoman, DefaultDelim) blocks + 'a' -> B.orderedListWith (1, LowerAlpha, DefaultDelim) blocks + 'A' -> B.orderedListWith (1, UpperAlpha, DefaultDelim) blocks + _ -> B.bulletList blocks + where + listStyle = do + indent <- many1 $ string " " + style <- marker + return (concat indent, style) + +parseListItem :: Show a => String -> TWParser a -> TWParser B.Blocks +parseListItem prefix marker = string prefix >> marker >> listItemLine prefix marker + +listItemLine :: Show a => String -> TWParser a -> TWParser B.Blocks +listItemLine prefix marker = lineContent >>= parseContent >>= return . mconcat + where + lineContent = do + content <- anyLine + continuation <- optionMaybe listContinuation + return $ filterSpaces content ++ "\n" ++ (maybe "" (" " ++) continuation) + filterSpaces = reverse . dropWhile (== ' ') . reverse + listContinuation = notFollowedBy (string prefix >> marker) >> + string " " >> lineContent + parseContent = parseFromString $ many1 $ nestedList <|> parseInline + parseInline = many1Till inline (lastNewline <|> newlineBeforeNestedList) >>= + return . B.plain . mconcat + nestedList = list prefix + lastNewline = try $ char '\n' <* eof + newlineBeforeNestedList = try $ char '\n' <* lookAhead nestedList + +table :: TWParser B.Blocks +table = try $ do + tableHead <- optionMaybe $ many1Till tableParseHeader newline >>= return . unzip + rows <- many1 tableParseRow + return $ buildTable mempty rows $ fromMaybe (align rows, columns rows) tableHead + where + buildTable caption rows (aligns, heads) + = B.table caption aligns heads rows + align rows = replicate (columCount rows) (AlignDefault, 0) + columns rows = replicate (columCount rows) mempty + columCount rows = length $ head rows + +tableParseHeader :: TWParser ((Alignment, Double), B.Blocks) +tableParseHeader = try $ do + char '|' + leftSpaces <- many spaceChar >>= return . length + char '*' + content <- tableColumnContent (char '*' >> skipSpaces >> char '|') + char '*' + rightSpaces <- many spaceChar >>= return . length + optional tableEndOfRow + return (tableAlign leftSpaces rightSpaces, content) + where + tableAlign left right + | left >= 2 && left == right = (AlignCenter, 0) + | left > right = (AlignRight, 0) + | otherwise = (AlignLeft, 0) + +tableParseRow :: TWParser [B.Blocks] +tableParseRow = many1Till tableParseColumn newline + +tableParseColumn :: TWParser B.Blocks +tableParseColumn = char '|' *> skipSpaces *> + tableColumnContent (skipSpaces >> char '|') + <* skipSpaces <* optional tableEndOfRow + +tableEndOfRow :: TWParser Char +tableEndOfRow = lookAhead (try $ char '|' >> char '\n') >> char '|' + +tableColumnContent :: Show a => TWParser a -> TWParser B.Blocks +tableColumnContent end = manyTill content (lookAhead $ try end) >>= return . B.plain . mconcat + where + content = continuation <|> inline + continuation = try $ char '\\' >> newline >> return mempty + +blockQuote :: TWParser B.Blocks +blockQuote = parseHtmlContent "blockquote" block >>= return . B.blockQuote . mconcat + +noautolink :: TWParser B.Blocks +noautolink = do + (_, content) <- htmlElement "noautolink" + st <- getState + setState $ st{ stateAllowLinks = False } + blocks <- try $ parseContent content + setState $ st{ stateAllowLinks = True } + return $ mconcat blocks + where + parseContent = parseFromString $ many $ block + +para :: TWParser B.Blocks +para = many1Till inline endOfParaElement >>= return . result . mconcat + where + endOfParaElement = lookAhead $ endOfInput <|> endOfPara <|> newBlockElement + endOfInput = try $ skipMany blankline >> skipSpaces >> eof + endOfPara = try $ blankline >> skipMany1 blankline + newBlockElement = try $ blankline >> skip blockElements + result content = if F.all (==Space) content + then mempty + else B.para $ B.trimInlines content + + +-- +-- inline parsers +-- + +inline :: TWParser B.Inlines +inline = choice [ whitespace + , br + , macro + , strong + , strongHtml + , strongAndEmph + , emph + , emphHtml + , boldCode + , smart + , link + , htmlComment + , code + , codeHtml + , nop + , autoLink + , str + , symbol + ] <?> "inline" + +whitespace :: TWParser B.Inlines +whitespace = (lb <|> regsp) >>= return + where lb = try $ skipMany spaceChar >> linebreak >> return B.space + regsp = try $ skipMany1 spaceChar >> return B.space + +br :: TWParser B.Inlines +br = try $ string "%BR%" >> return B.linebreak + +linebreak :: TWParser B.Inlines +linebreak = newline >> notFollowedBy newline >> (lastNewline <|> innerNewline) + where lastNewline = eof >> return mempty + innerNewline = return B.space + +between :: (Show b, Monoid c) => TWParser a -> TWParser b -> (TWParser b -> TWParser c) -> TWParser c +between start end p = + mconcat <$> try (start >> notFollowedBy whitespace >> many1Till (p end) end) + +enclosed :: (Show a, Monoid b) => TWParser a -> (TWParser a -> TWParser b) -> TWParser b +enclosed sep p = between sep (try $ sep <* endMarker) p + where + endMarker = lookAhead $ skip endSpace <|> skip (oneOf ".,!?:)|") <|> eof + endSpace = (spaceChar <|> newline) >> return B.space + +macro :: TWParser B.Inlines +macro = macroWithParameters <|> withoutParameters + where + withoutParameters = enclosed (char '%') (\_ -> macroName) >>= return . emptySpan + emptySpan name = buildSpan name [] mempty + +macroWithParameters :: TWParser B.Inlines +macroWithParameters = try $ do + char '%' + name <- macroName + (content, kvs) <- attributes + char '%' + return $ buildSpan name kvs $ B.str content + +buildSpan :: String -> [(String, String)] -> B.Inlines -> B.Inlines +buildSpan className kvs = B.spanWith attrs + where + attrs = ("", ["twiki-macro", className] ++ additionalClasses, kvsWithoutClasses) + additionalClasses = maybe [] words $ lookup "class" kvs + kvsWithoutClasses = [(k,v) | (k,v) <- kvs, k /= "class"] + +macroName :: TWParser String +macroName = do + first <- letter + rest <- many $ alphaNum <|> char '_' + return (first:rest) + +attributes :: TWParser (String, [(String, String)]) +attributes = char '{' *> spnl *> many (attribute <* spnl) <* char '}' >>= + return . foldr (either mkContent mkKvs) ([], []) + where + spnl = skipMany (spaceChar <|> newline) + mkContent c ([], kvs) = (c, kvs) + mkContent c (rest, kvs) = (c ++ " " ++ rest, kvs) + mkKvs kv (cont, rest) = (cont, (kv : rest)) + +attribute :: TWParser (Either String (String, String)) +attribute = withKey <|> withoutKey + where + withKey = try $ do + key <- macroName + char '=' + parseValue False >>= return . (curry Right key) + withoutKey = try $ parseValue True >>= return . Left + parseValue allowSpaces = (withQuotes <|> withoutQuotes allowSpaces) >>= return . fromEntities + withQuotes = between (char '"') (char '"') (\_ -> count 1 $ noneOf ['"']) + withoutQuotes allowSpaces + | allowSpaces == True = many1 $ noneOf "}" + | otherwise = many1 $ noneOf " }" + +nestedInlines :: Show a => TWParser a -> TWParser B.Inlines +nestedInlines end = innerSpace <|> nestedInline + where + innerSpace = try $ whitespace <* (notFollowedBy end) + nestedInline = notFollowedBy whitespace >> nested inline + +strong :: TWParser B.Inlines +strong = try $ enclosed (char '*') nestedInlines >>= return . B.strong + +strongHtml :: TWParser B.Inlines +strongHtml = (parseHtmlContent "strong" inline <|> parseHtmlContent "b" inline) + >>= return . B.strong . mconcat + +strongAndEmph :: TWParser B.Inlines +strongAndEmph = try $ enclosed (string "__") nestedInlines >>= return . B.emph . B.strong + +emph :: TWParser B.Inlines +emph = try $ enclosed (char '_') nestedInlines >>= return . B.emph + +emphHtml :: TWParser B.Inlines +emphHtml = (parseHtmlContent "em" inline <|> parseHtmlContent "i" inline) + >>= return . B.emph . mconcat + +nestedString :: Show a => TWParser a -> TWParser String +nestedString end = innerSpace <|> (count 1 nonspaceChar) + where + innerSpace = try $ many1 spaceChar <* notFollowedBy end + +boldCode :: TWParser B.Inlines +boldCode = try $ enclosed (string "==") nestedString >>= return . B.strong . B.code . fromEntities + +htmlComment :: TWParser B.Inlines +htmlComment = htmlTag isCommentTag >> return mempty + +code :: TWParser B.Inlines +code = try $ enclosed (char '=') nestedString >>= return . B.code . fromEntities + +codeHtml :: TWParser B.Inlines +codeHtml = do + (attrs, content) <- parseHtmlContentWithAttrs "code" anyChar + return $ B.codeWith attrs $ fromEntities content + +autoLink :: TWParser B.Inlines +autoLink = try $ do + state <- getState + guard $ stateAllowLinks state + (text, url) <- parseLink + guard $ checkLink (head $ reverse url) + return $ makeLink (text, url) + where + parseLink = notFollowedBy nop >> (uri <|> emailAddress) + makeLink (text, url) = B.link url "" $ B.str text + checkLink c + | c == '/' = True + | otherwise = isAlphaNum c + +str :: TWParser B.Inlines +str = (many1 alphaNum <|> count 1 characterReference) >>= return . B.str + +nop :: TWParser B.Inlines +nop = try $ (skip exclamation <|> skip nopTag) >> followContent + where + exclamation = char '!' + nopTag = stringAnyCase "<nop>" + followContent = many1 nonspaceChar >>= return . B.str . fromEntities + +symbol :: TWParser B.Inlines +symbol = count 1 nonspaceChar >>= return . B.str + +smart :: TWParser B.Inlines +smart = do + getOption readerSmart >>= guard + doubleQuoted <|> singleQuoted <|> + choice [ apostrophe + , dash + , ellipses + ] + +singleQuoted :: TWParser B.Inlines +singleQuoted = try $ do + singleQuoteStart + withQuoteContext InSingleQuote $ + many1Till inline singleQuoteEnd >>= + (return . B.singleQuoted . B.trimInlines . mconcat) + +doubleQuoted :: TWParser B.Inlines +doubleQuoted = try $ do + doubleQuoteStart + contents <- mconcat <$> many (try $ notFollowedBy doubleQuoteEnd >> inline) + (withQuoteContext InDoubleQuote $ doubleQuoteEnd >> + return (B.doubleQuoted $ B.trimInlines contents)) + <|> (return $ (B.str "\8220") B.<> contents) + +link :: TWParser B.Inlines +link = try $ do + st <- getState + guard $ stateAllowLinks st + setState $ st{ stateAllowLinks = False } + (url, title, content) <- linkText + setState $ st{ stateAllowLinks = True } + return $ B.link url title content + +linkText :: TWParser (String, String, B.Inlines) +linkText = do + string "[[" + url <- many1Till anyChar (char ']') + content <- option [B.str url] linkContent + char ']' + return (url, "", mconcat content) + where + linkContent = (char '[') >> many1Till anyChar (char ']') >>= parseLinkContent + parseLinkContent = parseFromString $ many1 inline diff --git a/src/Text/Pandoc/SelfContained.hs b/src/Text/Pandoc/SelfContained.hs index 36839ddd0..5b8f7a75a 100644 --- a/src/Text/Pandoc/SelfContained.hs +++ b/src/Text/Pandoc/SelfContained.hs @@ -51,7 +51,8 @@ isOk c = isAscii c && isAlphaNum c convertTag :: MediaBag -> Maybe String -> Tag String -> IO (Tag String) convertTag media sourceURL t@(TagOpen tagname as) - | tagname `elem` ["img", "embed", "video", "input", "audio", "source"] = do + | tagname `elem` + ["img", "embed", "video", "input", "audio", "source", "track"] = do as' <- mapM processAttribute as return $ TagOpen tagname as' where processAttribute (x,y) = diff --git a/src/Text/Pandoc/Shared.hs b/src/Text/Pandoc/Shared.hs index 6e1f84335..9aa70e6f2 100644 --- a/src/Text/Pandoc/Shared.hs +++ b/src/Text/Pandoc/Shared.hs @@ -735,12 +735,10 @@ renderTags' = renderTagsOptions -- | Perform an IO action in a directory, returning to starting directory. inDirectory :: FilePath -> IO a -> IO a -inDirectory path action = do - oldDir <- getCurrentDirectory - setCurrentDirectory path - result <- action - setCurrentDirectory oldDir - return result +inDirectory path action = E.bracket + getCurrentDirectory + setCurrentDirectory + (const $ setCurrentDirectory path >> action) readDefaultDataFile :: FilePath -> IO BS.ByteString readDefaultDataFile fname = diff --git a/src/Text/Pandoc/Writers/Docx.hs b/src/Text/Pandoc/Writers/Docx.hs index 5320a2816..5b9cc62ab 100644 --- a/src/Text/Pandoc/Writers/Docx.hs +++ b/src/Text/Pandoc/Writers/Docx.hs @@ -29,7 +29,7 @@ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Conversion of 'Pandoc' documents to docx. -} module Text.Pandoc.Writers.Docx ( writeDocx ) where -import Data.List ( intercalate, isPrefixOf, isSuffixOf ) +import Data.List ( intercalate, isPrefixOf, isSuffixOf, stripPrefix ) import qualified Data.ByteString as B import qualified Data.ByteString.Lazy as BL import qualified Data.ByteString.Lazy.Char8 as BL8 @@ -62,8 +62,9 @@ import Text.Printf (printf) import qualified Control.Exception as E import Text.Pandoc.MIME (MimeType, getMimeType, getMimeTypeDef, extensionFromMimeType) -import Control.Applicative ((<$>), (<|>)) +import Control.Applicative ((<$>), (<|>), (<*>)) import Data.Maybe (fromMaybe, mapMaybe) +import Data.Char (isDigit) data ListMarker = NoMarker | BulletMarker @@ -104,6 +105,8 @@ data WriterState = WriterState{ , stInDel :: Bool , stChangesAuthor :: String , stChangesDate :: String + , stPrintWidth :: Integer + , stHeadingStyles :: [(Int,String)] } defaultWriterState :: WriterState @@ -122,6 +125,8 @@ defaultWriterState = WriterState{ , stInDel = False , stChangesAuthor = "unknown" , stChangesDate = "1969-12-31T19:00:00Z" + , stPrintWidth = 1 + , stHeadingStyles = [] } type WS a = StateT WriterState IO a @@ -181,11 +186,59 @@ writeDocx opts doc@(Pandoc meta _) = do case writerReferenceDocx opts of Just f -> B.readFile f Nothing -> readDataFile datadir "reference.docx" - distArchive <- liftM (toArchive . toLazy) $ readDataFile Nothing "reference.docx" + distArchive <- liftM (toArchive . toLazy) $ readDataFile datadir "reference.docx" + + parsedDoc <- parseXml refArchive distArchive "word/document.xml" + let wname f qn = qPrefix qn == Just "w" && f (qName qn) + let mbsectpr = filterElementName (wname (=="sectPr")) parsedDoc + + -- Gets the template size + let mbpgsz = mbsectpr >>= (filterElementName (wname (=="pgSz"))) + let mbAttrSzWidth = (elAttribs <$> mbpgsz) >>= (lookupAttrBy ((=="w") . qName)) + + let mbpgmar = mbsectpr >>= (filterElementName (wname (=="pgMar"))) + let mbAttrMarLeft = (elAttribs <$> mbpgmar) >>= (lookupAttrBy ((=="left") . qName)) + let mbAttrMarRight = (elAttribs <$> mbpgmar) >>= (lookupAttrBy ((=="right") . qName)) + + -- Get the avaible area (converting the size and the margins to int and + -- doing the difference + let pgContentWidth = (-) <$> (read <$> mbAttrSzWidth ::Maybe Integer) + <*> ( + (+) <$> (read <$> mbAttrMarRight ::Maybe Integer) + <*> (read <$> mbAttrMarLeft ::Maybe Integer) + ) + + -- styles + let stylepath = "word/styles.xml" + styledoc <- parseXml refArchive distArchive stylepath + + -- parse styledoc for heading styles + let styleNamespaces = map ((,) <$> qName . attrKey <*> attrVal) . + filter ((==Just "xmlns") . qPrefix . attrKey) . + elAttribs $ styledoc + let headingStyles = + let + mywURI = lookup "w" styleNamespaces + myName name = QName name mywURI (Just "w") + getAttrStyleId = findAttr (myName "styleId") + getNameVal = findChild (myName "name") >=> findAttr (myName "val") + getNum s | not $ null s, all isDigit s = Just (read s :: Int) + | otherwise = Nothing + getEngHeader = getAttrStyleId >=> stripPrefix "Heading" >=> getNum + getIntHeader = getNameVal >=> stripPrefix "heading " >=> getNum + toTuple getF = liftM2 (,) <$> getF <*> getAttrStyleId + toMap getF = mapMaybe (toTuple getF) $ + findChildren (myName "style") styledoc + select a b | not $ null a = a + | otherwise = b + in + select (toMap getEngHeader) (toMap getIntHeader) ((contents, footnotes), st) <- runStateT (writeOpenXML opts{writerWrapText = False} doc') defaultWriterState{ stChangesAuthor = fromMaybe "unknown" username - , stChangesDate = formatTime defaultTimeLocale "%FT%XZ" utctime} + , stChangesDate = formatTime defaultTimeLocale "%FT%XZ" utctime + , stPrintWidth = (maybe 420 (\x -> quot x 20) pgContentWidth) + , stHeadingStyles = headingStyles} let epochtime = floor $ utcTimeToPOSIXSeconds utctime let imgs = M.elems $ stImages st @@ -193,9 +246,6 @@ writeDocx opts doc@(Pandoc meta _) = do let toImageEntry (_,path,_,_,img) = toEntry ("word/" ++ path) epochtime $ toLazy img let imageEntries = map toImageEntry imgs - - - let stdAttributes = [("xmlns:w","http://schemas.openxmlformats.org/wordprocessingml/2006/main") ,("xmlns:m","http://schemas.openxmlformats.org/officeDocument/2006/math") @@ -310,10 +360,7 @@ writeDocx opts doc@(Pandoc meta _) = do $ renderXml reldoc - -- adjust contents to add sectPr from reference.docx - parsedDoc <- parseXml refArchive distArchive "word/document.xml" - let wname f qn = qPrefix qn == Just "w" && f (qName qn) - let mbsectpr = filterElementName (wname (=="sectPr")) parsedDoc + -- adjust contents to add sectPr from reference.docx let sectpr = case mbsectpr of Just sectpr' -> let cs = renumIds (\q -> qName q == "id" && qPrefix q == Just "r") @@ -323,8 +370,6 @@ writeDocx opts doc@(Pandoc meta _) = do add_attrs (elAttribs sectpr') $ mknode "w:sectPr" [] cs Nothing -> (mknode "w:sectPr" [] ()) - - -- let sectpr = fromMaybe (mknode "w:sectPr" [] ()) mbsectpr' let contents' = contents ++ [sectpr] let docContents = mknode "w:document" stdAttributes @@ -347,8 +392,6 @@ writeDocx opts doc@(Pandoc meta _) = do -- styles let newstyles = styleToOpenXml $ writerHighlightStyle opts - let stylepath = "word/styles.xml" - styledoc <- parseXml refArchive distArchive stylepath let styledoc' = styledoc{ elContent = elContent styledoc ++ [Elem x | x <- newstyles, writerHighlight opts] } let styleEntry = toEntry stylepath epochtime $ renderXml styledoc' @@ -600,8 +643,9 @@ blockToOpenXML opts (Div (_,["references"],_) bs) = do return (header ++ rest) blockToOpenXML opts (Div _ bs) = blocksToOpenXML opts bs blockToOpenXML opts (Header lev (ident,_,_) lst) = do - paraProps <- withParaProp (pStyle $ "Heading" ++ show lev) $ - getParaProps False + headingStyles <- gets stHeadingStyles + paraProps <- maybe id (withParaProp . pStyle) (lookup lev headingStyles) $ + getParaProps False contents <- inlinesToOpenXML opts lst usedIdents <- gets stSectionIds let bookmarkName = if null ident @@ -671,7 +715,8 @@ blockToOpenXML opts (Table caption aligns widths headers rows) = do let mkgridcol w = mknode "w:gridCol" [("w:w", show (floor (textwidth * w) :: Integer))] () return $ - mknode "w:tbl" [] + caption' ++ + [mknode "w:tbl" [] ( mknode "w:tblPr" [] ( mknode "w:tblStyle" [("w:val","TableNormal")] () : mknode "w:tblW" [("w:type", "pct"), ("w:w", show rowwidth)] () : @@ -683,7 +728,7 @@ blockToOpenXML opts (Table caption aligns widths headers rows) = do else map mkgridcol widths) : [ mkrow True headers' | not (all null headers) ] ++ map (mkrow False) rows' - ) : caption' + )] blockToOpenXML opts (BulletList lst) = do let marker = BulletMarker addList marker @@ -927,6 +972,7 @@ inlineToOpenXML opts (Link txt (src,_)) = do return [ mknode "w:hyperlink" [("r:id",id')] contents ] inlineToOpenXML opts (Image alt (src, tit)) = do -- first, check to see if we've already done this image + pageWidth <- gets stPrintWidth imgs <- gets stImages case M.lookup src imgs of Just (_,_,_,elt,_) -> return [elt] @@ -943,7 +989,7 @@ inlineToOpenXML opts (Image alt (src, tit)) = do let size = imageSize img let (xpt,ypt) = maybe (120,120) sizeInPoints size -- 12700 emu = 1 pt - let (xemu,yemu) = fitToPage (xpt * 12700, ypt * 12700) + let (xemu,yemu) = fitToPage (xpt * 12700, ypt * 12700) (pageWidth * 12700) let cNvPicPr = mknode "pic:cNvPicPr" [] $ mknode "a:picLocks" [("noChangeArrowheads","1"),("noChangeAspect","1")] () let nvPicPr = mknode "pic:nvPicPr" [] @@ -1010,9 +1056,11 @@ parseXml refArchive distArchive relpath = Nothing -> fail $ relpath ++ " corrupt or missing in reference docx" -- | Scales the image to fit the page -fitToPage :: (Integer, Integer) -> (Integer, Integer) -fitToPage (x, y) - --5440680 is the emu width size of a letter page in portrait, minus the margins - | x > 5440680 = - (5440680, round $ (5440680 / ((fromIntegral :: Integer -> Double) x)) * (fromIntegral y)) +-- sizes are passed in emu +fitToPage :: (Integer, Integer) -> Integer -> (Integer, Integer) +fitToPage (x, y) pageWidth + -- Fixes width to the page width and scales the height + | x > pageWidth = + (pageWidth, round $ + ((fromIntegral pageWidth) / ((fromIntegral :: Integer -> Double) x)) * (fromIntegral y)) | otherwise = (x, y) diff --git a/src/Text/Pandoc/Writers/DokuWiki.hs b/src/Text/Pandoc/Writers/DokuWiki.hs index 8c1d360aa..eed45a965 100644 --- a/src/Text/Pandoc/Writers/DokuWiki.hs +++ b/src/Text/Pandoc/Writers/DokuWiki.hs @@ -134,7 +134,9 @@ blockToDokuWiki opts (Para [Image txt (src,'f':'i':'g':':':tit)]) = do let opt = if null txt then "" else "|" ++ if null tit then capt else tit ++ capt - return $ "{{:" ++ src ++ opt ++ "}}\n" + -- Relative links fail isURI and receive a colon + prefix = if isURI src then "" else ":" + return $ "{{" ++ prefix ++ src ++ opt ++ "}}\n" blockToDokuWiki opts (Para inlines) = do indent <- stIndent <$> ask @@ -178,7 +180,7 @@ blockToDokuWiki _ (CodeBlock (_,classes,_) str) = do blockToDokuWiki opts (BlockQuote blocks) = do contents <- blockListToDokuWiki opts blocks if isSimpleBlockQuote blocks - then return $ "> " ++ contents + then return $ unlines $ map ("> " ++) $ lines contents else return $ "<HTML><blockquote>\n" ++ contents ++ "</blockquote></HTML>" blockToDokuWiki opts (Table capt aligns _ headers rows) = do @@ -352,9 +354,7 @@ isPlainOrPara (Para _) = True isPlainOrPara _ = False isSimpleBlockQuote :: [Block] -> Bool -isSimpleBlockQuote [BlockQuote bs] = isSimpleBlockQuote bs -isSimpleBlockQuote [b] = isPlainOrPara b -isSimpleBlockQuote _ = False +isSimpleBlockQuote bs = all isPlainOrPara bs -- | Concatenates strings with line breaks between them. vcat :: [String] -> String @@ -480,7 +480,9 @@ inlineToDokuWiki opts (Image alt (source, tit)) = do ("", []) -> "" ("", _ ) -> "|" ++ alt' (_ , _ ) -> "|" ++ tit - return $ "{{:" ++ source ++ txt ++ "}}" + -- Relative links fail isURI and receive a colon + prefix = if isURI source then "" else ":" + return $ "{{" ++ prefix ++ source ++ txt ++ "}}" inlineToDokuWiki opts (Note contents) = do contents' <- blockListToDokuWiki opts contents diff --git a/src/Text/Pandoc/Writers/EPUB.hs b/src/Text/Pandoc/Writers/EPUB.hs index 32256cb42..3f9c6a606 100644 --- a/src/Text/Pandoc/Writers/EPUB.hs +++ b/src/Text/Pandoc/Writers/EPUB.hs @@ -35,7 +35,7 @@ import Data.Maybe ( fromMaybe ) import Data.List ( isPrefixOf, isInfixOf, intercalate ) import System.Environment ( getEnv ) import Text.Printf (printf) -import System.FilePath ( (</>), takeExtension, takeFileName ) +import System.FilePath ( takeExtension, takeFileName ) import qualified Data.ByteString.Lazy as B import qualified Data.ByteString.Lazy.Char8 as B8 import qualified Text.Pandoc.UTF8 as UTF8 @@ -64,7 +64,6 @@ import Text.XML.Light ( unode, Element(..), unqual, Attr(..), add_attrs import Text.Pandoc.UUID (getRandomUUID) import Text.Pandoc.Writers.HTML (writeHtmlString, writeHtml) import Data.Char ( toLower, isDigit, isAlphaNum ) -import Network.URI ( unEscapeString ) import Text.Pandoc.MIME (MimeType, getMimeType) import qualified Control.Exception as E import Text.Blaze.Html.Renderer.Utf8 (renderHtml) @@ -359,8 +358,9 @@ writeEPUB opts doc@(Pandoc meta _) = do Nothing -> return ([],[]) Just img -> do let coverImage = "media/" ++ takeFileName img - let cpContent = renderHtml $ writeHtml opts' - (Pandoc meta [RawBlock (Format "html") $ "<div id=\"cover-image\">\n<img src=\"" ++ coverImage ++ "\" alt=\"cover image\" />\n</div>"]) + let cpContent = renderHtml $ writeHtml + opts'{ writerVariables = ("coverpage","true"):vars } + (Pandoc meta [RawBlock (Format "html") $ "<div id=\"cover-image\">\n<img src=\"" ++ coverImage ++ "\" alt=\"cover image\" />\n</div>"]) imgContent <- B.readFile img return ( [mkEntry "cover.xhtml" cpContent] , [mkEntry coverImage imgContent] ) @@ -570,8 +570,7 @@ writeEPUB opts doc@(Pandoc meta _) = do let navMapFormatter :: Int -> String -> String -> [Element] -> Element navMapFormatter n tit src subs = unode "navPoint" ! - [("id", "navPoint-" ++ show n) - ,("playOrder", show n)] $ + [("id", "navPoint-" ++ show n)] $ [ unode "navLabel" $ unode "text" tit , unode "content" ! [("src", src)] $ () ] ++ subs @@ -612,17 +611,14 @@ writeEPUB opts doc@(Pandoc meta _) = do (_:_) -> [unode "ol" ! [("class","toc")] $ subs] let navtag = if epub3 then "nav" else "div" - let navData = UTF8.fromStringLazy $ ppTopElement $ - unode "html" ! [("xmlns","http://www.w3.org/1999/xhtml") - ,("xmlns:epub","http://www.idpf.org/2007/ops")] $ - [ unode "head" $ - [ unode "title" plainTitle - , unode "link" ! [("rel","stylesheet"),("type","text/css"),("href","stylesheet.css")] $ () ] - , unode "body" $ - unode navtag ! [("epub:type","toc") | epub3] $ - [ unode "h1" ! [("id","toc-title")] $ plainTitle - , unode "ol" ! [("class","toc")] $ evalState (mapM (navPointNode navXhtmlFormatter) secs) 1] - ] + let navBlocks = [RawBlock (Format "html") $ ppElement $ + unode navtag ! [("epub:type","toc") | epub3] $ + [ unode "h1" ! [("id","toc-title")] $ plainTitle + , unode "ol" ! [("class","toc")] $ evalState (mapM (navPointNode navXhtmlFormatter) secs) 1]] + let navData = renderHtml $ writeHtml opts' + (Pandoc (setMeta "title" + (walk removeNote $ fromList $ docTitle' meta) nullMeta) + navBlocks) let navEntry = mkEntry "nav.xhtml" navData -- mimetype @@ -765,23 +761,20 @@ metadataElement version md currentTime = showDateTimeISO8601 :: UTCTime -> String showDateTimeISO8601 = formatTime defaultTimeLocale "%FT%TZ" -transformTag :: WriterOptions - -> IORef [(FilePath, FilePath)] -- ^ (oldpath, newpath) media +transformTag :: IORef [(FilePath, FilePath)] -- ^ (oldpath, newpath) media -> Tag String -> IO (Tag String) -transformTag opts mediaRef tag@(TagOpen name attr) +transformTag mediaRef tag@(TagOpen name attr) | name `elem` ["video", "source", "img", "audio"] = do let src = fromAttrib "src" tag let poster = fromAttrib "poster" tag - let oldsrc = maybe src (</> src) $ writerSourceURL opts - let oldposter = maybe poster (</> poster) $ writerSourceURL opts - newsrc <- modifyMediaRef mediaRef oldsrc - newposter <- modifyMediaRef mediaRef oldposter + newsrc <- modifyMediaRef mediaRef src + newposter <- modifyMediaRef mediaRef poster let attr' = filter (\(x,_) -> x /= "src" && x /= "poster") attr ++ [("src", newsrc) | not (null newsrc)] ++ [("poster", newposter) | not (null newposter)] return $ TagOpen name attr' -transformTag _ _ tag = return tag +transformTag _ tag = return tag modifyMediaRef :: IORef [(FilePath, FilePath)] -> FilePath -> IO FilePath modifyMediaRef _ "" = return "" @@ -791,7 +784,7 @@ modifyMediaRef mediaRef oldsrc = do Just n -> return n Nothing -> do let new = "media/file" ++ show (length media) ++ - takeExtension oldsrc + takeExtension (takeWhile (/='?') oldsrc) -- remove query modifyIORef mediaRef ( (oldsrc, new): ) return new @@ -799,10 +792,10 @@ transformBlock :: WriterOptions -> IORef [(FilePath, FilePath)] -- ^ (oldpath, newpath) media -> Block -> IO Block -transformBlock opts mediaRef (RawBlock fmt raw) +transformBlock _ mediaRef (RawBlock fmt raw) | fmt == Format "html" = do let tags = parseTags raw - tags' <- mapM (transformTag opts mediaRef) tags + tags' <- mapM (transformTag mediaRef) tags return $ RawBlock fmt (renderTags' tags') transformBlock _ _ b = return b @@ -810,19 +803,17 @@ transformInline :: WriterOptions -> IORef [(FilePath, FilePath)] -- ^ (oldpath, newpath) media -> Inline -> IO Inline -transformInline opts mediaRef (Image lab (src,tit)) = do - let src' = unEscapeString src - let oldsrc = maybe src' (</> src) $ writerSourceURL opts - newsrc <- modifyMediaRef mediaRef oldsrc +transformInline _ mediaRef (Image lab (src,tit)) = do + newsrc <- modifyMediaRef mediaRef src return $ Image lab (newsrc, tit) transformInline opts _ (x@(Math _ _)) | WebTeX _ <- writerHTMLMathMethod opts = do raw <- makeSelfContained opts $ writeHtmlInline opts x return $ RawInline (Format "html") raw -transformInline opts mediaRef (RawInline fmt raw) +transformInline _ mediaRef (RawInline fmt raw) | fmt == Format "html" = do let tags = parseTags raw - tags' <- mapM (transformTag opts mediaRef) tags + tags' <- mapM (transformTag mediaRef) tags return $ RawInline fmt (renderTags' tags') transformInline _ _ x = return x @@ -1204,3 +1195,4 @@ docTitle' meta = fromMaybe [] $ go <$> lookupMeta "title" meta _ -> [] go (MetaList xs) = concatMap go xs go _ = [] + diff --git a/src/Text/Pandoc/Writers/HTML.hs b/src/Text/Pandoc/Writers/HTML.hs index 1a00c7660..1760ee006 100644 --- a/src/Text/Pandoc/Writers/HTML.hs +++ b/src/Text/Pandoc/Writers/HTML.hs @@ -60,7 +60,7 @@ import qualified Text.Blaze.XHtml1.Transitional.Attributes as A import Text.Blaze.Renderer.String (renderHtml) import Text.TeXMath import Text.XML.Light.Output -import Text.XML.Light (unode, elChildren, add_attr, unqual) +import Text.XML.Light (unode, elChildren, unqual) import qualified Text.XML.Light as XML import System.FilePath (takeExtension) import Data.Monoid @@ -73,11 +73,13 @@ data WriterState = WriterState , stQuotes :: Bool -- ^ <q> tag is used , stHighlighting :: Bool -- ^ Syntax highlighting is used , stSecNum :: [Int] -- ^ Number of current section + , stElement :: Bool -- ^ Processing an Element } defaultWriterState :: WriterState defaultWriterState = WriterState {stNotes= [], stMath = False, stQuotes = False, - stHighlighting = False, stSecNum = []} + stHighlighting = False, stSecNum = [], + stElement = False} -- Helpers to render HTML with the appropriate function. @@ -280,7 +282,13 @@ elementToHtml slideLevel opts (Sec level num (id',classes,keyvals) title' elemen let titleSlide = slide && level < slideLevel header' <- if title' == [Str "\0"] -- marker for hrule then return mempty - else blockToHtml opts (Header level' (id',classes,keyvals) title') + else do + modify (\st -> st{ stElement = True}) + res <- blockToHtml opts + (Header level' (id',classes,keyvals) title') + modify (\st -> st{ stElement = False}) + return res + let isSec (Sec _ _ _ _ _) = True isSec (Blk _) = False let isPause (Blk x) = x == Para [Str ".",Space,Str ".",Space,Str "."] @@ -491,7 +499,7 @@ blockToHtml opts (BlockQuote blocks) = else do contents <- blockListToHtml opts blocks return $ H.blockquote $ nl opts >> contents >> nl opts -blockToHtml opts (Header level (_,classes,_) lst) = do +blockToHtml opts (Header level attr@(_,classes,_) lst) = do contents <- inlineListToHtml opts lst secnum <- liftM stSecNum get let contents' = if writerNumberSections opts && not (null secnum) @@ -499,7 +507,9 @@ blockToHtml opts (Header level (_,classes,_) lst) = do then (H.span ! A.class_ "header-section-number" $ toHtml $ showSecNum secnum) >> strToHtml " " >> contents else contents - return $ case level of + inElement <- gets stElement + return $ (if inElement then id else addAttrs opts attr) + $ case level of 1 -> H.h1 contents' 2 -> H.h2 contents' 3 -> H.h3 contents' @@ -629,7 +639,9 @@ annotateMML e tex = math (unode "semantics" [cs, unode "annotation" (annotAttrs, [] -> unode "mrow" () [x] -> x xs -> unode "mrow" xs - math = add_attr (XML.Attr (unqual "xmlns") "http://www.w3.org/1998/Math/MathML") . unode "math" + math childs = XML.Element q as [XML.Elem childs] l + where + (XML.Element q as _ l) = e annotAttrs = [XML.Attr (unqual "encoding") "application/x-tex"] diff --git a/src/Text/Pandoc/Writers/ICML.hs b/src/Text/Pandoc/Writers/ICML.hs index ae20efd4b..181c63df7 100644 --- a/src/Text/Pandoc/Writers/ICML.hs +++ b/src/Text/Pandoc/Writers/ICML.hs @@ -399,7 +399,7 @@ inlineToICML opts style (Subscript lst) = inlinesToICML opts (subscriptName:styl inlineToICML opts style (SmallCaps lst) = inlinesToICML opts (smallCapsName:style) lst inlineToICML opts style (Quoted SingleQuote lst) = inlinesToICML opts style $ [Str "‘"] ++ lst ++ [Str "’"] inlineToICML opts style (Quoted DoubleQuote lst) = inlinesToICML opts style $ [Str "“"] ++ lst ++ [Str "”"] -inlineToICML opts style (Cite _ lst) = footnoteToICML opts style [Para lst] +inlineToICML opts style (Cite _ lst) = inlinesToICML opts style lst inlineToICML _ style (Code _ str) = charStyle (codeName:style) $ text $ escapeStringForXML str inlineToICML _ style Space = charStyle style space inlineToICML _ style LineBreak = charStyle style $ text lineSeparator diff --git a/src/Text/Pandoc/Writers/LaTeX.hs b/src/Text/Pandoc/Writers/LaTeX.hs index ae2f4e907..2c86d1a6e 100644 --- a/src/Text/Pandoc/Writers/LaTeX.hs +++ b/src/Text/Pandoc/Writers/LaTeX.hs @@ -613,6 +613,7 @@ sectionHeader :: Bool -- True for unnumbered sectionHeader unnumbered ref level lst = do txt <- inlineListToLaTeX lst lab <- text `fmap` toLabel ref + plain <- stringToLaTeX TextString $ foldl (++) "" $ map stringify lst let noNote (Note _) = Str "" noNote x = x let lstNoNotes = walk noNote lst @@ -625,7 +626,12 @@ sectionHeader unnumbered ref level lst = do then return empty else do return $ brackets txtNoNotes - let stuffing = star <> optional <> braces txt + let contents = if render Nothing txt == plain + then braces txt + else braces (text "\\texorpdfstring" + <> braces txt + <> braces (text plain)) + let stuffing = star <> optional <> contents book <- gets stBook opts <- gets stOptions let level' = if book || writerChapters opts then level - 1 else level @@ -746,8 +752,10 @@ inlineToLaTeX (Code (_,classes,_) str) = do Nothing -> rawCode Just h -> modify (\st -> st{ stHighlighting = True }) >> return (text h) - rawCode = liftM (text . (\s -> "\\texttt{" ++ s ++ "}")) + rawCode = liftM (text . (\s -> "\\texttt{" ++ escapeSpaces s ++ "}")) $ stringToLaTeX CodeString str + where + escapeSpaces = concatMap (\c -> if c == ' ' then "\\ " else [c]) inlineToLaTeX (Quoted qt lst) = do contents <- inlineListToLaTeX lst csquotes <- liftM stCsquotes get diff --git a/src/Text/Pandoc/Writers/ODT.hs b/src/Text/Pandoc/Writers/ODT.hs index 03f8e8ba4..2a4129512 100644 --- a/src/Text/Pandoc/Writers/ODT.hs +++ b/src/Text/Pandoc/Writers/ODT.hs @@ -41,7 +41,7 @@ import Control.Applicative ((<$>)) import Text.Pandoc.Options ( WriterOptions(..) ) import Text.Pandoc.Shared ( stringify, readDataFile, fetchItem', warn ) import Text.Pandoc.ImageSize ( imageSize, sizeInPoints ) -import Text.Pandoc.MIME ( getMimeType ) +import Text.Pandoc.MIME ( getMimeType, extensionFromMimeType ) import Text.Pandoc.Definition import Text.Pandoc.Walk import Text.Pandoc.Writers.Shared ( fixDisplayMath ) @@ -51,7 +51,7 @@ import Text.Pandoc.XML import Text.Pandoc.Pretty import qualified Control.Exception as E import Data.Time.Clock.POSIX ( getPOSIXTime ) -import System.FilePath ( takeExtension, takeDirectory ) +import System.FilePath ( takeExtension, takeDirectory, (<.>)) -- | Produce an ODT file from a Pandoc document. writeODT :: WriterOptions -- ^ Writer options @@ -133,12 +133,14 @@ transformPicMath opts entriesRef (Image lab (src,_)) = do Left (_ :: E.SomeException) -> do warn $ "Could not find image `" ++ src ++ "', skipping..." return $ Emph lab - Right (img, _) -> do + Right (img, mbMimeType) -> do let size = imageSize img let (w,h) = fromMaybe (0,0) $ sizeInPoints `fmap` size let tit' = show w ++ "x" ++ show h entries <- readIORef entriesRef - let newsrc = "Pictures/" ++ show (length entries) ++ takeExtension src + let extension = fromMaybe (takeExtension $ takeWhile (/='?') src) + (mbMimeType >>= extensionFromMimeType) + let newsrc = "Pictures/" ++ show (length entries) <.> extension let toLazy = B.fromChunks . (:[]) epochtime <- floor `fmap` getPOSIXTime let entry = toEntry newsrc epochtime $ toLazy img diff --git a/src/Text/Pandoc/Writers/RST.hs b/src/Text/Pandoc/Writers/RST.hs index a96670c96..5ba4c9983 100644 --- a/src/Text/Pandoc/Writers/RST.hs +++ b/src/Text/Pandoc/Writers/RST.hs @@ -173,7 +173,7 @@ blockToRST (Para [Image txt (src,'f':'i':'g':':':tit)]) = do capt <- inlineListToRST txt let fig = "figure:: " <> text src let alt = ":alt: " <> if null tit then capt else text tit - return $ hang 3 ".. " $ fig $$ alt $+$ capt $$ blankline + return $ hang 3 ".. " (fig $$ alt $+$ capt) $$ blankline blockToRST (Para inlines) | LineBreak `elem` inlines = do -- use line block if LineBreaks lns <- mapM inlineListToRST $ splitBy (==LineBreak) inlines @@ -239,8 +239,7 @@ blockToRST (Table caption _ widths headers rows) = do middle = hcat $ intersperse sep' blocks let makeRow = hpipeBlocks . zipWith lblock widthsInChars let head' = makeRow headers' - rows' <- mapM (\row -> do cols <- mapM blockListToRST row - return $ makeRow cols) rows + let rows' = map makeRow rawRows let border ch = char '+' <> char ch <> (hcat $ intersperse (char ch <> char '+' <> char ch) $ map (\l -> text $ replicate l ch) widthsInChars) <> @@ -253,7 +252,7 @@ blockToRST (Table caption _ widths headers rows) = do blockToRST (BulletList items) = do contents <- mapM bulletListItemToRST items -- ensure that sublists have preceding blank line - return $ blankline $$ vcat contents $$ blankline + return $ blankline $$ chomp (vcat contents) $$ blankline blockToRST (OrderedList (start, style', delim) items) = do let markers = if start == 1 && style' == DefaultStyle && delim == DefaultDelim then take (length items) $ repeat "#." @@ -265,11 +264,11 @@ blockToRST (OrderedList (start, style', delim) items) = do contents <- mapM (\(item, num) -> orderedListItemToRST item num) $ zip markers' items -- ensure that sublists have preceding blank line - return $ blankline $$ vcat contents $$ blankline + return $ blankline $$ chomp (vcat contents) $$ blankline blockToRST (DefinitionList items) = do contents <- mapM definitionListItemToRST items -- ensure that sublists have preceding blank line - return $ blankline $$ vcat contents $$ blankline + return $ blankline $$ chomp (vcat contents) $$ blankline -- | Convert bullet list item (list of blocks) to RST. bulletListItemToRST :: [Block] -> State WriterState Doc @@ -427,7 +426,7 @@ inlineToRST (Image alternate (source, tit)) = do return $ "|" <> label <> "|" inlineToRST (Note contents) = do -- add to notes in state - notes <- get >>= return . stNotes + notes <- gets stNotes modify $ \st -> st { stNotes = contents:notes } let ref = show $ (length notes) + 1 return $ " [" <> text ref <> "]_" diff --git a/src/Text/Pandoc/Writers/RTF.hs b/src/Text/Pandoc/Writers/RTF.hs index 43405ce3c..dfad4b0e2 100644 --- a/src/Text/Pandoc/Writers/RTF.hs +++ b/src/Text/Pandoc/Writers/RTF.hs @@ -106,7 +106,9 @@ writeRTF options (Pandoc meta@(Meta metamap) blocks) = $ metadata in if writerStandalone options then renderTemplate' (writerTemplate options) context - else body + else case reverse body of + ('\n':_) -> body + _ -> body ++ "\n" -- | Construct table of contents from list of header blocks. tableOfContents :: [Block] -> String diff --git a/tests/Tests/Old.hs b/tests/Tests/Old.hs index 8a256b761..5bdf325b1 100644 --- a/tests/Tests/Old.hs +++ b/tests/Tests/Old.hs @@ -130,6 +130,8 @@ tests = [ testGroup "markdown" "dokuwiki_inline_formatting.native" "dokuwiki_inline_formatting.dokuwiki" , test "multiblock table" ["-r", "native", "-w", "dokuwiki", "-s"] "dokuwiki_multiblock_table.native" "dokuwiki_multiblock_table.dokuwiki" + , test "external images" ["-r", "native", "-w", "dokuwiki", "-s"] + "dokuwiki_external_images.native" "dokuwiki_external_images.dokuwiki" ] , testGroup "opml" [ test "basic" ["-r", "native", "-w", "opml", "--columns=78", "-s"] @@ -153,6 +155,9 @@ tests = [ testGroup "markdown" , test "formatting" ["-r", "epub", "-w", "native"] "epub/formatting.epub" "epub/formatting.native" ] + , testGroup "twiki" + [ test "reader" ["-r", "twiki", "-w", "native", "-s"] + "twiki-reader.twiki" "twiki-reader.native" ] , testGroup "other writers" $ map (\f -> testGroup f $ writerTests f) [ "opendocument" , "context" , "texinfo", "icml" , "man" , "plain" , "rtf", "org", "asciidoc" diff --git a/tests/Tests/Readers/Docx.hs b/tests/Tests/Readers/Docx.hs index 584e7cc94..2963c34da 100644 --- a/tests/Tests/Readers/Docx.hs +++ b/tests/Tests/Readers/Docx.hs @@ -156,9 +156,9 @@ tests = [ testGroup "inlines" "docx/numbered_header.docx" "docx/numbered_header.native" , testCompare - "headers in other languages" - "docx/danish_headers.docx" - "docx/danish_headers.native" + "i18n blocks (headers and blockquotes)" + "docx/i18n_blocks.docx" + "docx/i18n_blocks.native" , testCompare "lists" "docx/lists.docx" diff --git a/tests/Tests/Readers/Markdown.hs b/tests/Tests/Readers/Markdown.hs index b45d94032..fdb1a7417 100644 --- a/tests/Tests/Readers/Markdown.hs +++ b/tests/Tests/Readers/Markdown.hs @@ -20,6 +20,9 @@ markdownCDL :: String -> Pandoc markdownCDL = readMarkdown def { readerExtensions = Set.insert Ext_compact_definition_lists $ readerExtensions def } +markdownGH :: String -> Pandoc +markdownGH = readMarkdown def { readerExtensions = githubMarkdownExtensions } + infix 4 =: (=:) :: ToString c => String -> (String, c) -> Test @@ -184,6 +187,11 @@ tests = [ testGroup "inline code" ] , testGroup "bare URIs" (map testBareLink bareLinkTests) + , testGroup "autolinks" + [ "with unicode dash following" =: + "<http://foo.bar>\8212" =?> para (autolink "http://foo.bar" <> + str "\8212") + ] , testGroup "Headers" [ "blank line before header" =: "\n# Header\n" @@ -271,5 +279,14 @@ tests = [ testGroup "inline code" plain (text "if this button exists") <> rawBlock "html" "</button>" <> divWith nullAttr (para $ text "with this div too.")] + , test markdownGH "issue #1636" $ + unlines [ "* a" + , "* b" + , "* c" + , " * d" ] + =?> + bulletList [ plain "a" + , plain "b" + , plain "c" <> bulletList [plain "d"] ] ] ] diff --git a/tests/Tests/Readers/Org.hs b/tests/Tests/Readers/Org.hs index 92ec8155b..39c40cd45 100644 --- a/tests/Tests/Readers/Org.hs +++ b/tests/Tests/Readers/Org.hs @@ -4,7 +4,6 @@ module Tests.Readers.Org (tests) where import Text.Pandoc.Definition import Test.Framework import Tests.Helpers -import Tests.Arbitrary() import Text.Pandoc.Builder import Text.Pandoc import Data.List (intersperse) @@ -126,6 +125,14 @@ tests = , (emph "b") <> "." ]) + , "Quotes are forbidden border chars" =: + "/'nope/ *nope\"*" =?> + para ("/'nope/" <> space <> "*nope\"*") + + , "Commata are forbidden border chars" =: + "/nada,/" =?> + para "/nada,/" + , "Markup should work properly after a blank line" =: unlines ["foo", "", "/bar/"] =?> (para $ text "foo") <> (para $ emph $ text "bar") @@ -189,6 +196,18 @@ tests = "[[http://zeitlens.com/]]" =?> (para $ link "http://zeitlens.com/" "" "http://zeitlens.com/") + , "Absolute file link" =: + "[[/url][hi]]" =?> + (para $ link "file:///url" "" "hi") + + , "Link to file in parent directory" =: + "[[../file.txt][moin]]" =?> + (para $ link "../file.txt" "" "moin") + + , "Empty link (for gitit interop)" =: + "[[][New Link]]" =?> + (para $ link "" "" "New Link") + , "Image link" =: "[[sunset.png][dusk.svg]]" =?> (para $ link "sunset.png" "" (image "dusk.svg" "" "")) @@ -207,6 +226,14 @@ tests = , "for", "fnords." ]) + , "Absolute file link" =: + "[[file:///etc/passwd][passwd]]" =?> + (para $ link "file:///etc/passwd" "" "passwd") + + , "File link" =: + "[[file:target][title]]" =?> + (para $ link "target" "" "title") + , "Anchor" =: "<<anchor>> Link here later." =?> (para $ spanWith ("anchor", [], []) mempty <> @@ -268,6 +295,18 @@ tests = "\\notacommand{foo}" =?> para (rawInline "latex" "\\notacommand{foo}") + , "MathML symbol in LaTeX-style" =: + "There is a hackerspace in Lübeck, Germany, called nbsp (unicode symbol: '\\nbsp')." =?> + para ("There is a hackerspace in Lübeck, Germany, called nbsp (unicode symbol: ' ').") + + , "MathML symbol in LaTeX-style, including braces" =: + "\\Aacute{}stor" =?> + para "Ástor" + + , "MathML copy sign" =: + "\\copy" =?> + para "©" + , "LaTeX citation" =: "\\cite{Coffee}" =?> let citation = Citation @@ -450,6 +489,18 @@ tests = , header 2 ("walk" <> space <> "dog") ] + , "Comment Trees" =: + unlines [ "* COMMENT A comment tree" + , " Not much going on here" + , "** This will be dropped" + , "* Comment tree above" + ] =?> + header 1 "Comment tree above" + + , "Nothing but a COMMENT header" =: + "* COMMENT Test" =?> + (mempty::Blocks) + , "Paragraph starting with an asterisk" =: "*five" =?> para "*five" @@ -578,6 +629,13 @@ tests = , plain "Item2" ] + , "Unindented *" =: + ("- Item1\n" ++ + "* Item2\n") =?> + bulletList [ plain "Item1" + ] <> + header 1 "Item2" + , "Multi-line Bullet Lists" =: ("- *Fat\n" ++ " Tony*\n" ++ @@ -622,6 +680,33 @@ tests = ] ] + , "Bullet List with Decreasing Indent" =: + (" - Discovery\n\ + \ - Human After All\n") =?> + mconcat [ bulletList [ plain "Discovery" ] + , bulletList [ plain ("Human" <> space <> "After" <> space <> "All")] + ] + + , "Header follows Bullet List" =: + (" - Discovery\n\ + \ - Human After All\n\ + \* Homework") =?> + mconcat [ bulletList [ plain "Discovery" + , plain ("Human" <> space <> "After" <> space <> "All") + ] + , header 1 "Homework" + ] + + , "Bullet List Unindented with trailing Header" =: + ("- Discovery\n\ + \- Homework\n\ + \* NotValidListItem") =?> + mconcat [ bulletList [ plain "Discovery" + , plain "Homework" + ] + , header 1 "NotValidListItem" + ] + , "Simple Ordered List" =: ("1. Item1\n" ++ "2. Item2\n") =?> @@ -717,6 +802,16 @@ tests = , ("PCR", [ plain $ spcSep [ "polymerase", "chain", "reaction" ] ]) ] + , "Definition List With Trailing Header" =: + "- definition :: list\n\ + \- cool :: defs\n\ + \* header" =?> + mconcat [ definitionList [ ("definition", [plain "list"]) + , ("cool", [plain "defs"]) + ] + , header 1 "header" + ] + , "Loose bullet list" =: unlines [ "- apple" , "" diff --git a/tests/Tests/Readers/RST.hs b/tests/Tests/Readers/RST.hs index a80dc32b7..1aaf4897f 100644 --- a/tests/Tests/Readers/RST.hs +++ b/tests/Tests/Readers/RST.hs @@ -67,5 +67,45 @@ tests = [ "line block with blank line" =: link "http://foo.bar.baz" "" "http://foo.bar.baz" <> ". " <> link "http://foo.bar/baz_(bam)" "" "http://foo.bar/baz_(bam)" <> " (" <> link "http://foo.bar" "" "http://foo.bar" <> ")") + , testGroup "literal / line / code blocks" + [ "indented literal block" =: unlines + [ "::" + , "" + , " block quotes" + , "" + , " can go on for many lines" + , "but must stop here"] + =?> (doc $ + codeBlock "block quotes\n\ncan go on for many lines" <> + para "but must stop here") + , "line block with 3 lines" =: "| a\n| b\n| c" + =?> para ("a" <> linebreak <> "b" <> linebreak <> "c") + , "quoted literal block using >" =: "::\n\n> quoted\n> block\n\nOrdinary paragraph" + =?> codeBlock "> quoted\n> block" <> para "Ordinary paragraph" + , "quoted literal block using | (not a line block)" =: "::\n\n| quoted\n| block\n\nOrdinary paragraph" + =?> codeBlock "| quoted\n| block" <> para "Ordinary paragraph" + , "class directive with single paragraph" =: ".. class:: special\n\nThis is a \"special\" paragraph." + =?> divWith ("", ["special"], []) (para "This is a \"special\" paragraph.") + , "class directive with two paragraphs" =: ".. class:: exceptional remarkable\n\n First paragraph.\n\n Second paragraph." + =?> divWith ("", ["exceptional", "remarkable"], []) (para "First paragraph." <> para "Second paragraph.") + , "class directive around literal block" =: ".. class:: classy\n\n::\n\n a\n b" + =?> divWith ("", ["classy"], []) (codeBlock "a\nb")] + , testGroup "interpreted text roles" + [ "literal role prefix" =: ":literal:`a`" =?> para (code "a") + , "literal role postfix" =: "`a`:literal:" =?> para (code "a") + , "literal text" =: "``text``" =?> para (code "text") + , "code role" =: ":code:`a`" =?> para (codeWith ("", ["sourceCode"], []) "a") + , "inherited code role" =: ".. role:: codeLike(code)\n\n:codeLike:`a`" + =?> para (codeWith ("", ["codeLike", "sourceCode"], []) "a") + , "custom code role with language field" + =: ".. role:: lhs(code)\n :language: haskell\n\n:lhs:`a`" + =?> para (codeWith ("", ["lhs", "haskell","sourceCode"], []) "a") + , "custom role with unspecified parent role" + =: ".. role:: classy\n\n:classy:`text`" + =?> para (spanWith ("", ["classy"], []) "text") + , "role with recursive inheritance" + =: ".. role:: haskell(code)\n.. role:: lhs(haskell)\n\n:lhs:`text`" + =?> para (codeWith ("", ["lhs", "haskell", "sourceCode"], []) "text") + , "unknown role" =: ":unknown:`text`" =?> para (str "text") + ] ] - diff --git a/tests/Tests/Writers/LaTeX.hs b/tests/Tests/Writers/LaTeX.hs index 4d2d3fc05..f083176d0 100644 --- a/tests/Tests/Writers/LaTeX.hs +++ b/tests/Tests/Writers/LaTeX.hs @@ -52,7 +52,7 @@ tests = [ testGroup "code blocks" [ "unnumbered header" =: headerWith ("foo",["unnumbered"],[]) 1 (text "Header 1" <> note (plain $ text "note")) =?> - "\\section*{Header 1\\footnote{note}}\\label{foo}\n\\addcontentsline{toc}{section}{Header 1}\n" + "\\section*{\\texorpdfstring{Header 1\\footnote{note}}{Header 1}}\\label{foo}\n\\addcontentsline{toc}{section}{Header 1}\n" , "in list item" =: bulletList [header 2 (text "foo")] =?> "\\begin{itemize}\n\\item ~\n \\subsection{foo}\n\\end{itemize}" @@ -62,7 +62,7 @@ tests = [ testGroup "code blocks" "\\begin{description}\n\\item[foo] ~ \n\\subsection{bar}\n\nbaz\n\\end{description}" , "containing image" =: header 1 (image "imgs/foo.jpg" "" (text "Alt text")) =?> - "\\section{\\protect\\includegraphics{imgs/foo.jpg}}" + "\\section{\\texorpdfstring{\\protect\\includegraphics{imgs/foo.jpg}}{Alt text}}" ] , testGroup "inline code" [ "struck out and highlighted" =: diff --git a/tests/docbook-reader.docbook b/tests/docbook-reader.docbook index 9ba965d9b..cf5059646 100644 --- a/tests/docbook-reader.docbook +++ b/tests/docbook-reader.docbook @@ -509,6 +509,25 @@ These should not be escaped: \$ \\ \> \[ \{ B. Williams </para> </sect2> + <sect2 id="callout"> + <title>Callout</title> + <para>Simple.</para> + <calloutlist> + <callout arearefs="loop1-letrec-co" id="loop1-letrec"> + <para id="x_QA1">A <code>__letrec</code> is equivalent to a normal + Haskell &let;.</para> + </callout> + <callout arearefs="loop1-def-co" id="loop1-def"> + <para id="x_RA1">&GHC; compiled the body of our list comprehension into + a loop named <function>go_s1YC</function>.</para> + </callout> + <callout arearefs="loop1-pat-empty-co" id="loop1-pat-empty"> + <para id="x_SA1">If our &case; expression matches the empty list, we + return the empty list. This is reassuringly + familiar.</para> + </callout> + </calloutlist> + </sect2> </sect1> <sect1 id="definition-lists"> <title>Definition Lists</title> @@ -691,6 +710,9 @@ These should not be escaped: \$ \\ \> \[ \{ <literal><html></literal>. </para> <para> + More code: <classname>Class</classname> and <type>Type</type> + </para> + <para> <emphasis role="strikethrough">This is <emphasis>strikeout</emphasis>.</emphasis> </para> diff --git a/tests/docbook-reader.native b/tests/docbook-reader.native index 90d76b3c2..353a352a2 100644 --- a/tests/docbook-reader.native +++ b/tests/docbook-reader.native @@ -1,22 +1,22 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Space,Str "MacFarlane"],MetaInlines [Str "Anonymous"]]),("date",MetaInlines [Str "July",Space,Str "17,",Space,Str "2006"]),("title",MetaInlines [Str "Pandoc",Space,Str "Test",Space,Str "Suite"])]}) [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "set",Space,Str "of",Space,Str "tests",Space,Str "for",Space,Str "pandoc.",Space,Str "Most",Space,Str "of",Space,Str "them",Space,Str "are",Space,Str "adapted",Space,Str "from",Space,Str "John",Space,Str "Gruber\8217s",Space,Str "markdown",Space,Str "test",Space,Str "suite."] -,Header 1 ("",[],[]) [Str "Headers"] -,Header 2 ("",[],[]) [Str "Level",Space,Str "2",Space,Str "with",Space,Str "an",Space,Link [Str "embedded",Space,Str "link"] ("/url","")] -,Header 3 ("",[],[]) [Str "Level",Space,Str "3",Space,Str "with",Space,Emph [Str "emphasis"]] -,Header 4 ("",[],[]) [Str "Level",Space,Str "4"] -,Header 5 ("",[],[]) [Str "Level",Space,Str "5"] +,Header 1 ("headers",[],[]) [Str "Headers"] +,Header 2 ("level-2-with-an-embedded-link",[],[]) [Str "Level",Space,Str "2",Space,Str "with",Space,Str "an",Space,Link [Str "embedded",Space,Str "link"] ("/url","")] +,Header 3 ("level-3-with-emphasis",[],[]) [Str "Level",Space,Str "3",Space,Str "with",Space,Emph [Str "emphasis"]] +,Header 4 ("level-4",[],[]) [Str "Level",Space,Str "4"] +,Header 5 ("level-5",[],[]) [Str "Level",Space,Str "5"] ,Para [Str "Hi."] -,Header 1 ("",[],[]) [Str "Level",Space,Str "1"] -,Header 2 ("",[],[]) [Str "Level",Space,Str "2",Space,Str "with",Space,Emph [Str "emphasis"]] -,Header 3 ("",[],[]) [Str "Level",Space,Str "3"] +,Header 1 ("level-1",[],[]) [Str "Level",Space,Str "1"] +,Header 2 ("level-2-with-emphasis",[],[]) [Str "Level",Space,Str "2",Space,Str "with",Space,Emph [Str "emphasis"]] +,Header 3 ("level-3",[],[]) [Str "Level",Space,Str "3"] ,Para [Str "with",Space,Str "no",Space,Str "blank",Space,Str "line"] -,Header 2 ("",[],[]) [Str "Level",Space,Str "2"] +,Header 2 ("level-2",[],[]) [Str "Level",Space,Str "2"] ,Para [Str "with",Space,Str "no",Space,Str "blank",Space,Str "line"] -,Header 1 ("",[],[]) [Str "Paragraphs"] +,Header 1 ("paragraphs",[],[]) [Str "Paragraphs"] ,Para [Str "Here\8217s",Space,Str "a",Space,Str "regular",Space,Str "paragraph."] ,Para [Str "In",Space,Str "Markdown",Space,Str "1.0.0",Space,Str "and",Space,Str "earlier.",Space,Str "Version",Space,Str "8.",Space,Str "This",Space,Str "line",Space,Str "turns",Space,Str "into",Space,Str "a",Space,Str "list",Space,Str "item.",Space,Str "Because",Space,Str "a",Space,Str "hard-wrapped",Space,Str "line",Space,Str "in",Space,Str "the",Space,Str "middle",Space,Str "of",Space,Str "a",Space,Str "paragraph",Space,Str "looked",Space,Str "like",Space,Str "a",Space,Str "list",Space,Str "item."] ,Para [Str "Here\8217s",Space,Str "one",Space,Str "with",Space,Str "a",Space,Str "bullet.",Space,Str "*",Space,Str "criminey."] -,Header 1 ("",[],[]) [Str "Block",Space,Str "Quotes"] +,Header 1 ("block-quotes",[],[]) [Str "Block",Space,Str "Quotes"] ,Para [Str "E-mail",Space,Str "style:"] ,BlockQuote [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "block",Space,Str "quote.",Space,Str "It",Space,Str "is",Space,Str "pretty",Space,Str "short."]] @@ -35,13 +35,13 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Spa [Para [Str "nested"]]] ,Para [Str "This",Space,Str "should",Space,Str "not",Space,Str "be",Space,Str "a",Space,Str "block",Space,Str "quote:",Space,Str "2",Space,Str ">",Space,Str "1."] ,Para [Str "And",Space,Str "a",Space,Str "following",Space,Str "paragraph."] -,Header 1 ("",[],[]) [Str "Code",Space,Str "Blocks"] +,Header 1 ("code-blocks",[],[]) [Str "Code",Space,Str "Blocks"] ,Para [Str "Code:"] ,CodeBlock ("",[],[]) "---- (should be four hyphens)\n\nsub status {\n print \"working\";\n}\n\nthis code block is indented by one tab" ,Para [Str "And:"] ,CodeBlock ("",[],[]) " this code block is indented by two tabs\n\nThese should not be escaped: \\$ \\\\ \\> \\[ \\{" -,Header 1 ("",[],[]) [Str "Lists"] -,Header 2 ("",[],[]) [Str "Unordered"] +,Header 1 ("lists",[],[]) [Str "Lists"] +,Header 2 ("unordered",[],[]) [Str "Unordered"] ,Para [Str "Asterisks",Space,Str "loose:"] ,BulletList [[Para [Str "asterisk",Space,Str "1"]] @@ -57,7 +57,7 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Spa [[Para [Str "Minus",Space,Str "1"]] ,[Para [Str "Minus",Space,Str "2"]] ,[Para [Str "Minus",Space,Str "3"]]] -,Header 2 ("",[],[]) [Str "Ordered"] +,Header 2 ("ordered",[],[]) [Str "Ordered"] ,OrderedList (1,Decimal,DefaultDelim) [[Para [Str "First"]] ,[Para [Str "Second"]] @@ -73,7 +73,7 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Spa ,Para [Str "Item",Space,Str "1.",Space,Str "graf",Space,Str "two.",Space,Str "The",Space,Str "quick",Space,Str "brown",Space,Str "fox",Space,Str "jumped",Space,Str "over",Space,Str "the",Space,Str "lazy",Space,Str "dog\8217s",Space,Str "back."]] ,[Para [Str "Item",Space,Str "2."]] ,[Para [Str "Item",Space,Str "3."]]] -,Header 2 ("",[],[]) [Str "Nested"] +,Header 2 ("nested",[],[]) [Str "Nested"] ,BulletList [[Para [Str "Tab"] ,BulletList @@ -98,14 +98,14 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Spa ,[Para [Str "Fie"]] ,[Para [Str "Foe"]]]] ,[Para [Str "Third"]]] -,Header 2 ("",[],[]) [Str "Tabs",Space,Str "and",Space,Str "spaces"] +,Header 2 ("tabs-and-spaces",[],[]) [Str "Tabs",Space,Str "and",Space,Str "spaces"] ,BulletList [[Para [Str "this",Space,Str "is",Space,Str "a",Space,Str "list",Space,Str "item",Space,Str "indented",Space,Str "with",Space,Str "tabs"]] ,[Para [Str "this",Space,Str "is",Space,Str "a",Space,Str "list",Space,Str "item",Space,Str "indented",Space,Str "with",Space,Str "spaces"] ,BulletList [[Para [Str "this",Space,Str "is",Space,Str "an",Space,Str "example",Space,Str "list",Space,Str "item",Space,Str "indented",Space,Str "with",Space,Str "tabs"]] ,[Para [Str "this",Space,Str "is",Space,Str "an",Space,Str "example",Space,Str "list",Space,Str "item",Space,Str "indented",Space,Str "with",Space,Str "spaces"]]]]] -,Header 2 ("",[],[]) [Str "Fancy",Space,Str "list",Space,Str "markers"] +,Header 2 ("fancy-list-markers",[],[]) [Str "Fancy",Space,Str "list",Space,Str "markers"] ,OrderedList (2,Decimal,DefaultDelim) [[Para [Str "begins",Space,Str "with",Space,Str "2"]] ,[Para [Str "and",Space,Str "now",Space,Str "3"] @@ -134,7 +134,13 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Spa ,Para [Str "Should",Space,Str "not",Space,Str "be",Space,Str "a",Space,Str "list",Space,Str "item:"] ,Para [Str "M.A.\160\&2007"] ,Para [Str "B.",Space,Str "Williams"] -,Header 1 ("",[],[]) [Str "Definition",Space,Str "Lists"] +,Header 2 ("callout",[],[]) [Str "Callout"] +,Para [Str "Simple."] +,BulletList + [[Para [Str "A",Space,Code ("",[],[]) "__letrec",Space,Str "is",Space,Str "equivalent",Space,Str "to",Space,Str "a",Space,Str "normal",Space,Str "Haskell",Space,Str "LET."]] + ,[Para [Str "GHC",Space,Str "compiled",Space,Str "the",Space,Str "body",Space,Str "of",Space,Str "our",Space,Str "list",Space,Str "comprehension",Space,Str "into",Space,Str "a",Space,Str "loop",Space,Str "named",Space,Code ("",[],[]) "go_s1YC",Str "."]] + ,[Para [Str "If",Space,Str "our",Space,Str "CASE",Space,Str "expression",Space,Str "matches",Space,Str "the",Space,Str "empty",Space,Str "list,",Space,Str "we",Space,Str "return",Space,Str "the",Space,Str "empty",Space,Str "list.",Space,Str "This",Space,Str "is",Space,Str "reassuringly",Space,Str "familiar."]]] +,Header 1 ("definition-lists",[],[]) [Str "Definition",Space,Str "Lists"] ,DefinitionList [([Str "apple"], [[Para [Str "red",Space,Str "fruit"]]]) @@ -170,7 +176,7 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Spa ,OrderedList (1,Decimal,DefaultDelim) [[Para [Str "sublist"]] ,[Para [Str "sublist"]]]]])] -,Header 1 ("",[],[]) [Str "Inline",Space,Str "Markup"] +,Header 1 ("inline-markup",[],[]) [Str "Inline",Space,Str "Markup"] ,Para [Str "This",Space,Str "is",Space,Emph [Str "emphasized"],Str ",",Space,Str "and",Space,Str "so",Space,Emph [Str "is",Space,Str "this"],Str "."] ,Para [Str "This",Space,Str "is",Space,Strong [Str "strong"],Str ",",Space,Str "and",Space,Str "so",Space,Strong [Str "is",Space,Str "this"],Str "."] ,Para [Str "An",Space,Emph [Link [Str "emphasized",Space,Str "link"] ("/url","")],Str "."] @@ -179,20 +185,21 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Spa ,Para [Strong [Emph [Str "This",Space,Str "is",Space,Str "strong",Space,Str "and",Space,Str "em."]]] ,Para [Str "So",Space,Str "is",Space,Strong [Emph [Str "this"]],Space,Str "word."] ,Para [Str "This",Space,Str "is",Space,Str "code:",Space,Code ("",[],[]) ">",Str ",",Space,Code ("",[],[]) "$",Str ",",Space,Code ("",[],[]) "\\",Str ",",Space,Code ("",[],[]) "\\$",Str ",",Space,Code ("",[],[]) "<html>",Str "."] +,Para [Str "More",Space,Str "code:",Space,Code ("",[],[]) "Class",Space,Str "and",Space,Code ("",[],[]) "Type"] ,Para [Strikeout [Str "This",Space,Str "is",Space,Emph [Str "strikeout"],Str "."]] ,Para [Str "Superscripts:",Space,Str "a",Superscript [Str "bc"],Str "d",Space,Str "a",Superscript [Emph [Str "hello"]],Space,Str "a",Superscript [Str "hello\160there"],Str "."] ,Para [Str "Subscripts:",Space,Str "H",Subscript [Str "2"],Str "O,",Space,Str "H",Subscript [Str "23"],Str "O,",Space,Str "H",Subscript [Str "many\160of\160them"],Str "O."] ,Para [Str "These",Space,Str "should",Space,Str "not",Space,Str "be",Space,Str "superscripts",Space,Str "or",Space,Str "subscripts,",Space,Str "because",Space,Str "of",Space,Str "the",Space,Str "unescaped",Space,Str "spaces:",Space,Str "a^b",Space,Str "c^d,",Space,Str "a~b",Space,Str "c~d."] -,Header 1 ("",[],[]) [Str "Smart",Space,Str "quotes,",Space,Str "ellipses,",Space,Str "dashes"] +,Header 1 ("smart-quotes-ellipses-dashes",[],[]) [Str "Smart",Space,Str "quotes,",Space,Str "ellipses,",Space,Str "dashes"] ,Para [Quoted DoubleQuote [Str "Hello,"],Space,Str "said",Space,Str "the",Space,Str "spider.",Space,Quoted DoubleQuote [Quoted SingleQuote [Str "Shelob"],Space,Str "is",Space,Str "my",Space,Str "name."]] ,Para [Quoted DoubleQuote [Str "A"],Str ",",Space,Quoted DoubleQuote [Str "B"],Str ",",Space,Str "and",Space,Quoted DoubleQuote [Str "C"],Space,Str "are",Space,Str "letters."] ,Para [Quoted DoubleQuote [Str "He",Space,Str "said,",Space,Quoted SingleQuote [Str "I",Space,Str "want",Space,Str "to",Space,Str "go."]],Space,Str "Were",Space,Str "you",Space,Str "alive",Space,Str "in",Space,Str "the",Space,Str "70\8217s?"] ,Para [Str "Some",Space,Str "dashes:",Space,Str "one\8212two",Space,Str "\8212",Space,Str "three\8212four",Space,Str "\8212",Space,Str "five."] ,Para [Str "Dashes",Space,Str "between",Space,Str "numbers:",Space,Str "5\8211\&7,",Space,Str "255\8211\&66,",Space,Str "1987\8211\&1999."] ,Para [Str "Ellipses\8230and\8230and\8230."] -,Header 1 ("",[],[]) [] +,Header 1 ("math",[],[]) [] ,Para [Math DisplayMath "e = mc^{2}",Math DisplayMath "1",Space,Math InlineMath "e = mc^{2}",Space,Math DisplayMath "e = mc^{2}"] -,Header 1 ("",[],[]) [Str "Special",Space,Str "Characters"] +,Header 1 ("special-characters",[],[]) [Str "Special",Space,Str "Characters"] ,Para [Str "Here",Space,Str "is",Space,Str "some",Space,Str "unicode:"] ,BulletList [[Para [Str "I",Space,Str "hat:",Space,Str "\206"]] @@ -221,8 +228,8 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Spa ,Para [Str "Bang:",Space,Str "!"] ,Para [Str "Plus:",Space,Str "+"] ,Para [Str "Minus:",Space,Str "-"] -,Header 1 ("",[],[]) [Str "Links"] -,Header 2 ("",[],[]) [Str "Explicit"] +,Header 1 ("links",[],[]) [Str "Links"] +,Header 2 ("explicit",[],[]) [Str "Explicit"] ,Para [Str "Just",Space,Str "a",Space,Link [Str "URL"] ("/url/",""),Str "."] ,Para [Link [Str "URL",Space,Str "and",Space,Str "title"] ("/url/",""),Str "."] ,Para [Link [Str "URL",Space,Str "and",Space,Str "title"] ("/url/",""),Str "."] @@ -232,7 +239,7 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Spa ,Para [Link [Str "with_underscore"] ("/url/with_underscore","")] ,Para [Link [Str "nobody@nowhere.net"] ("mailto:nobody@nowhere.net","")] ,Para [Link [Str "Empty"] ("",""),Str "."] -,Header 2 ("",[],[]) [Str "Reference"] +,Header 2 ("reference",[],[]) [Str "Reference"] ,Para [Str "Foo",Space,Link [Str "bar"] ("/url/",""),Str "."] ,Para [Str "Foo",Space,Link [Str "bar"] ("/url/",""),Str "."] ,Para [Str "Foo",Space,Link [Str "bar"] ("/url/",""),Str "."] @@ -245,12 +252,12 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Spa ,CodeBlock ("",[],[]) "[not]: /url" ,Para [Str "Foo",Space,Link [Str "bar"] ("/url/",""),Str "."] ,Para [Str "Foo",Space,Link [Str "biz"] ("/url/",""),Str "."] -,Header 2 ("",[],[]) [Str "With",Space,Str "ampersands"] +,Header 2 ("with-ampersands",[],[]) [Str "With",Space,Str "ampersands"] ,Para [Str "Here\8217s",Space,Str "a",Space,Link [Str "link",Space,Str "with",Space,Str "an",Space,Str "ampersand",Space,Str "in",Space,Str "the",Space,Str "URL"] ("http://example.com/?foo=1&bar=2",""),Str "."] ,Para [Str "Here\8217s",Space,Str "a",Space,Str "link",Space,Str "with",Space,Str "an",Space,Str "amersand",Space,Str "in",Space,Str "the",Space,Str "link",Space,Str "text:",Space,Link [Str "AT&T"] ("http://att.com/",""),Str "."] ,Para [Str "Here\8217s",Space,Str "an",Space,Link [Str "inline",Space,Str "link"] ("/script?foo=1&bar=2",""),Str "."] ,Para [Str "Here\8217s",Space,Str "an",Space,Link [Str "inline",Space,Str "link",Space,Str "in",Space,Str "pointy",Space,Str "braces"] ("/script?foo=1&bar=2",""),Str "."] -,Header 2 ("",[],[]) [Str "Autolinks"] +,Header 2 ("autolinks",[],[]) [Str "Autolinks"] ,Para [Str "With",Space,Str "an",Space,Str "ampersand:",Space,Link [Str "http://example.com/?foo=1&bar=2"] ("http://example.com/?foo=1&bar=2","")] ,BulletList [[Para [Str "In",Space,Str "a",Space,Str "list?"]] @@ -261,18 +268,18 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Spa [Para [Str "Blockquoted:",Space,Link [Str "http://example.com/"] ("http://example.com/","")]] ,Para [Str "Auto-links",Space,Str "should",Space,Str "not",Space,Str "occur",Space,Str "here:",Space,Code ("",[],[]) "<http://example.com/>"] ,CodeBlock ("",[],[]) "or here: <http://example.com/>" -,Header 1 ("",[],[]) [Str "Images"] +,Header 1 ("images",[],[]) [Str "Images"] ,Para [Str "From",Space,Quoted DoubleQuote [Str "Voyage",Space,Str "dans",Space,Str "la",Space,Str "Lune"],Space,Str "by",Space,Str "Georges",Space,Str "Melies",Space,Str "(1902):"] ,Para [Image [Str "lalune"] ("lalune.jpg","")] ,Para [Str "Here",Space,Str "is",Space,Str "a",Space,Str "movie",Space,Image [] ("movie.jpg",""),Space,Str "icon."] -,Header 1 ("",[],[]) [Str "Footnotes"] +,Header 1 ("footnotes",[],[]) [Str "Footnotes"] ,Para [Str "Here",Space,Str "is",Space,Str "a",Space,Str "footnote",Space,Str "reference,",Note [Para [Str "Here",Space,Str "is",Space,Str "the",Space,Str "footnote.",Space,Str "It",Space,Str "can",Space,Str "go",Space,Str "anywhere",Space,Str "after",Space,Str "the",Space,Str "footnote",Space,Str "reference.",Space,Str "It",Space,Str "need",Space,Str "not",Space,Str "be",Space,Str "placed",Space,Str "at",Space,Str "the",Space,Str "end",Space,Str "of",Space,Str "the",Space,Str "document."]],Space,Str "and",Space,Str "another.",Note [Para [Str "Here\8217s",Space,Str "the",Space,Str "long",Space,Str "note.",Space,Str "This",Space,Str "one",Space,Str "contains",Space,Str "multiple",Space,Str "blocks."],Para [Str "Subsequent",Space,Str "blocks",Space,Str "are",Space,Str "indented",Space,Str "to",Space,Str "show",Space,Str "that",Space,Str "they",Space,Str "belong",Space,Str "to",Space,Str "the",Space,Str "footnote",Space,Str "(as",Space,Str "with",Space,Str "list",Space,Str "items)."],CodeBlock ("",[],[]) " { <code> }",Para [Str "If",Space,Str "you",Space,Str "want,",Space,Str "you",Space,Str "can",Space,Str "indent",Space,Str "every",Space,Str "line,",Space,Str "but",Space,Str "you",Space,Str "can",Space,Str "also",Space,Str "be",Space,Str "lazy",Space,Str "and",Space,Str "just",Space,Str "indent",Space,Str "the",Space,Str "first",Space,Str "line",Space,Str "of",Space,Str "each",Space,Str "block."]],Space,Str "This",Space,Str "should",Space,Emph [Str "not"],Space,Str "be",Space,Str "a",Space,Str "footnote",Space,Str "reference,",Space,Str "because",Space,Str "it",Space,Str "contains",Space,Str "a",Space,Str "space.[^my",Space,Str "note]",Space,Str "Here",Space,Str "is",Space,Str "an",Space,Str "inline",Space,Str "note.",Note [Para [Str "This",Space,Str "is",Space,Emph [Str "easier"],Space,Str "to",Space,Str "type.",Space,Str "Inline",Space,Str "notes",Space,Str "may",Space,Str "contain",Space,Link [Str "links"] ("http://google.com",""),Space,Str "and",Space,Code ("",[],[]) "]",Space,Str "verbatim",Space,Str "characters,",Space,Str "as",Space,Str "well",Space,Str "as",Space,Str "[bracketed",Space,Str "text]."]]] ,BlockQuote [Para [Str "Notes",Space,Str "can",Space,Str "go",Space,Str "in",Space,Str "quotes.",Note [Para [Str "In",Space,Str "quote."]]]] ,OrderedList (1,Decimal,DefaultDelim) [[Para [Str "And",Space,Str "in",Space,Str "list",Space,Str "items.",Note [Para [Str "In",Space,Str "list."]]]]] ,Para [Str "This",Space,Str "paragraph",Space,Str "should",Space,Str "not",Space,Str "be",Space,Str "part",Space,Str "of",Space,Str "the",Space,Str "note,",Space,Str "as",Space,Str "it",Space,Str "is",Space,Str "not",Space,Str "indented."] -,Header 1 ("",[],[]) [Str "Tables"] +,Header 1 ("tables",[],[]) [Str "Tables"] ,Para [Str "Simple",Space,Str "table",Space,Str "with",Space,Str "caption:"] ,Table [Str "Demonstration",Space,Str "of",Space,Str "simple",Space,Str "table",Space,Str "syntax."] [AlignRight,AlignLeft,AlignCenter,AlignLeft] [0.0,0.0,0.0,0.0] [[Plain [Str "Right"]] diff --git a/tests/docx/danish_headers.docx b/tests/docx/danish_headers.docx Binary files differdeleted file mode 100644 index 345fc632c..000000000 --- a/tests/docx/danish_headers.docx +++ /dev/null diff --git a/tests/docx/danish_headers.native b/tests/docx/danish_headers.native deleted file mode 100644 index 12a857811..000000000 --- a/tests/docx/danish_headers.native +++ /dev/null @@ -1,10 +0,0 @@ -[Header 1 ("testoverskrift",[],[]) [Str "Testoverskrift"] -,Para [Str "Normal"] -,Header 2 ("testoverskrift-2",[],[]) [Str "Testoverskrift",Space,Str "2"] -,Table [] [AlignDefault,AlignDefault] [0.0,0.0] - [[Plain [Str "Kolonne1"]] - ,[Plain [Str "Kolonne2"]]] - [[[Plain [Str "Testdata",Space,Str "1"]] - ,[Plain [Str "Tester2"]]] - ,[[Plain [Str "Testdata",Space,Str "2"]] - ,[Plain [Str "Tester3"]]]]] diff --git a/tests/docx/i18n_blocks.docx b/tests/docx/i18n_blocks.docx Binary files differnew file mode 100644 index 000000000..36341c363 --- /dev/null +++ b/tests/docx/i18n_blocks.docx diff --git a/tests/docx/i18n_blocks.native b/tests/docx/i18n_blocks.native new file mode 100644 index 000000000..582a7360d --- /dev/null +++ b/tests/docx/i18n_blocks.native @@ -0,0 +1,8 @@ +[Header 1 ("this-is-heading-1",[],[]) [Str "This",Space,Str "is",Space,Str "Heading",Space,Str "1"] +,Header 2 ("this-is-heading-2",[],[]) [Str "This",Space,Str "is",Space,Str "Heading",Space,Str "2"] +,BlockQuote + [Para [Str "This",Space,Str "is",Space,Str "Quote"] + ,Para [Str "This",Space,Str "is",Space,Str "Block",Space,Str "Text"]] +,BulletList + [[Para [Str "This",Space,Str "is",Space,Str "list",Space,Str "item",Space,Str "1"]] + ,[Para [Str "This",Space,Str "is",Space,Str "list",Space,Str "item",Space,Str "2"]]]] diff --git a/tests/docx/links.docx b/tests/docx/links.docx Binary files differindex 10ec62fd7..538b84b08 100644 --- a/tests/docx/links.docx +++ b/tests/docx/links.docx diff --git a/tests/docx/links.native b/tests/docx/links.native index c741fe875..cd7ab6fb6 100644 --- a/tests/docx/links.native +++ b/tests/docx/links.native @@ -1,5 +1,6 @@ [Header 2 ("an-internal-link-and-an-external-link",[],[]) [Str "An",Space,Str "internal",Space,Str "link",Space,Str "and",Space,Str "an",Space,Str "external",Space,Str "link"] ,Para [Str "An",Space,Link [Str "external",Space,Str "link"] ("http://google.com",""),Space,Str "to",Space,Str "a",Space,Str "popular",Space,Str "website."] +,Para [Str "An",Space,Link [Str "external",Space,Str "link"] ("http://johnmacfarlane.net/pandoc/README.html#synopsis",""),Space,Str "to",Space,Str "a",Space,Str "website",Space,Str "with",Space,Str "an",Space,Str "anchor."] ,Para [Str "An",Space,Link [Str "internal",Space,Str "link"] ("#a-section-for-testing-link-targets",""),Space,Str "to",Space,Str "a",Space,Str "section",Space,Str "header."] ,Para [Str "An",Space,Link [Str "internal",Space,Str "link"] ("#my_bookmark",""),Space,Str "to",Space,Str "a",Space,Str "bookmark."] ,Header 2 ("a-section-for-testing-link-targets",[],[]) [Str "A",Space,Str "section",Space,Str "for",Space,Str "testing",Space,Str "link",Space,Str "targets"] diff --git a/tests/dokuwiki_external_images.dokuwiki b/tests/dokuwiki_external_images.dokuwiki new file mode 100644 index 000000000..cc7eddcda --- /dev/null +++ b/tests/dokuwiki_external_images.dokuwiki @@ -0,0 +1 @@ +{{https://cooluri.com/image.png|HTTPS image}} {{http://cooluri.com/image.png|HTTP image}} {{ftp://ftp.cooluri.com/image.png|FTP image}} {{file:///tmp/coolimage.png|Filesystem image}} {{:/image.jpg|Relative image 1}} {{:image.jpg|Relative image 2}} diff --git a/tests/dokuwiki_external_images.native b/tests/dokuwiki_external_images.native new file mode 100644 index 000000000..c2b8876d3 --- /dev/null +++ b/tests/dokuwiki_external_images.native @@ -0,0 +1 @@ +[Para [Image [Str "HTTPS",Space,Str "image"] ("https://cooluri.com/image.png",""),Space,Image [Str "HTTP",Space,Str "image"] ("http://cooluri.com/image.png",""),Space,Image [Str "FTP",Space,Str "image"] ("ftp://ftp.cooluri.com/image.png",""),Space,Image [Str "Filesystem",Space,Str "image"] ("file:///tmp/coolimage.png",""),Space,Image [Str "Relative",Space,Str "image",Space,Str "1"] ("/image.jpg",""),Space,Image [Str "Relative",Space,Str "image",Space,Str "2"] ("image.jpg","")]] diff --git a/tests/html-reader.html b/tests/html-reader.html index 14ad3ed54..e9ba2a68b 100644 --- a/tests/html-reader.html +++ b/tests/html-reader.html @@ -451,5 +451,4 @@ An e-mail address: nobody [at] nowhere.net<blockquote> </tr> </table> </body> -</body> </html> diff --git a/tests/markdown-reader-more.native b/tests/markdown-reader-more.native index 30da0afbb..3f4bb5740 100644 --- a/tests/markdown-reader-more.native +++ b/tests/markdown-reader-more.native @@ -17,6 +17,7 @@ ,Header 3 ("my-header",[],[]) [Str "my",Space,Str "header"] ,Header 2 ("in-math",[],[]) [Str "$",Space,Str "in",Space,Str "math"] ,Para [Math InlineMath "\\$2 + \\$3"] +,Para [Math InlineMath "x = \\text{the $n$th root of $y$}"] ,Para [Str "This",Space,Str "should",Space,Str "not",Space,Str "be",Space,Str "math:"] ,Para [Str "$PATH",Space,Str "90",Space,Str "$PATH"] ,Header 2 ("commented-out-list-item",[],[]) [Str "Commented-out",Space,Str "list",Space,Str "item"] @@ -78,6 +79,8 @@ ,Para [Str "Link",Space,Str "to",Space,Link [Str "Explicit",Space,Str "header",Space,Str "attributes"] ("#foobar",""),Str "."] ,Para [Str "But",Space,Str "this",Space,Str "is",Space,Str "not",Space,Str "a",Space,Str "link",Space,Str "to",Space,Link [Str "My",Space,Str "other",Space,Str "header"] ("/foo",""),Str ",",Space,Str "since",Space,Str "the",Space,Str "reference",Space,Str "is",Space,Str "defined."] ,Header 2 ("foobar",["baz"],[("key","val")]) [Str "Explicit",Space,Str "header",Space,Str "attributes"] +,BlockQuote + [Header 2 ("foobar",["baz"],[("key","val")]) [Str "Header",Space,Str "attributes",Space,Str "inside",Space,Str "block",Space,Str "quote"]] ,Header 2 ("line-blocks",[],[]) [Str "Line",Space,Str "blocks"] ,Para [Str "But",Space,Str "can",Space,Str "a",Space,Str "bee",Space,Str "be",Space,Str "said",Space,Str "to",Space,Str "be",LineBreak,Str "\160\160\160\160or",Space,Str "not",Space,Str "to",Space,Str "be",Space,Str "an",Space,Str "entire",Space,Str "bee,",LineBreak,Str "\160\160\160\160\160\160\160\160when",Space,Str "half",Space,Str "the",Space,Str "bee",Space,Str "is",Space,Str "not",Space,Str "a",Space,Str "bee,",LineBreak,Str "\160\160\160\160\160\160\160\160\160\160\160\160due",Space,Str "to",Space,Str "some",Space,Str "ancient",Space,Str "injury?"] ,Para [Str "Continuation",Space,Str "line",LineBreak,Str "\160\160and",Space,Str "another"] @@ -149,6 +152,8 @@ ,Para [Link [Str "linky"] ("hi_(there_(nested))","")] ,Header 2 ("reference-link-fallbacks",[],[]) [Str "Reference",Space,Str "link",Space,Str "fallbacks"] ,Para [Str "[",Emph [Str "not",Space,Str "a",Space,Str "link"],Str "]",Space,Str "[",Emph [Str "nope"],Str "]\8230"] +,Header 2 ("reference-link-followed-by-a-citation",[],[]) [Str "Reference",Space,Str "link",Space,Str "followed",Space,Str "by",Space,Str "a",Space,Str "citation"] +,Para [Str "MapReduce",Space,Str "is",Space,Str "a",Space,Str "paradigm",Space,Str "popularized",Space,Str "by",Space,Link [Str "Google"] ("http://google.com",""),Space,Cite [Citation {citationId = "mapreduce", citationPrefix = [], citationSuffix = [], citationMode = NormalCitation, citationNoteNum = 0, citationHash = 0}] [Str "[@mapreduce]"],Space,Str "as",Space,Str "its",Space,Str "most",Space,Str "vocal",Space,Str "proponent."] ,Header 2 ("empty-reference-links",[],[]) [Str "Empty",Space,Str "reference",Space,Str "links"] ,Para [Str "bar"] ,Para [Link [Str "foo2"] ("","")]] diff --git a/tests/markdown-reader-more.txt b/tests/markdown-reader-more.txt index c486f8885..d7439f6cb 100644 --- a/tests/markdown-reader-more.txt +++ b/tests/markdown-reader-more.txt @@ -60,6 +60,8 @@ $\$2 + \$3$ +$x = \text{the $n$th root of $y$}$ + This should not be math: $PATH 90 $PATH @@ -174,6 +176,8 @@ But this is not a link to [My other header], since the reference is defined. ## Explicit header attributes {#foobar .baz key="val"} +> ## Header attributes inside block quote {#foobar .baz key="val"} + ## Line blocks | But can a bee be said to be @@ -258,6 +262,13 @@ Empty cells [*not a link*] [*nope*]... +## Reference link followed by a citation + +MapReduce is a paradigm popularized by [Google] [@mapreduce] as its +most vocal proponent. + +[Google]: http://google.com + ## Empty reference links [foo2]: diff --git a/tests/rst-reader.native b/tests/rst-reader.native index c77d15775..1f402f835 100644 --- a/tests/rst-reader.native +++ b/tests/rst-reader.native @@ -322,12 +322,12 @@ Pandoc (Meta {unMeta = fromList [("author",MetaList [MetaInlines [Str "John",Spa ,Null ,Para [Str "And",Space,Str "now",Space,Str "with",Space,RawInline (Format "html") "<b>inline</b> <span id=\"test\">HTML</span>",Str "."] ,Null -,Para [Str "And",Space,Str "some",Space,Str "inline",Space,Str "haskell",Space,Code ("",["sourceCode","haskell"],[]) "fmap id [1,2..10]",Str "."] +,Para [Str "And",Space,Str "some",Space,Str "inline",Space,Str "haskell",Space,Code ("",["haskell","sourceCode"],[]) "fmap id [1,2..10]",Str "."] ,Null ,Null -,Para [Str "Indirect",Space,Str "python",Space,Str "role",Space,Code ("",["sourceCode","python"],[]) "[x*x for x in [1,2,3,4,5]]",Str "."] +,Para [Str "Indirect",Space,Str "python",Space,Str "role",Space,Code ("",["python","indirect","sourceCode"],[]) "[x*x for x in [1,2,3,4,5]]",Str "."] ,Null ,Null -,Para [Str "Different",Space,Str "indirect",Space,Str "C",Space,Code ("",["sourceCode","c"],[]) "int x = 15;",Str "."] +,Para [Str "Different",Space,Str "indirect",Space,Str "C",Space,Code ("",["c","different-indirect","sourceCode"],[]) "int x = 15;",Str "."] ,Header 2 ("literal-symbols",[],[]) [Str "Literal",Space,Str "symbols"] ,Para [Str "2*2",Space,Str "=",Space,Str "4*1"]] diff --git a/tests/tables.asciidoc b/tests/tables.asciidoc index ba647866a..2a24544a3 100644 --- a/tests/tables.asciidoc +++ b/tests/tables.asciidoc @@ -65,4 +65,3 @@ Multiline table without column headers: |First |row |12.0 |Example of a row that spans multiple lines. |Second |row |5.0 |Here's another one. Note the blank line between rows. |======================================================================= - diff --git a/tests/tables.haddock b/tests/tables.haddock index 413ec97ad..f9efdc0de 100644 --- a/tests/tables.haddock +++ b/tests/tables.haddock @@ -74,4 +74,3 @@ Multiline table without column headers: > the blank line between > rows. > ----------- ---------- ------------ -------------------------- - diff --git a/tests/tables.org b/tests/tables.org index 8d9100d07..9eaf5e706 100644 --- a/tests/tables.org +++ b/tests/tables.org @@ -49,4 +49,3 @@ Multiline table without column headers: | First | row | 12.0 | Example of a row that spans multiple lines. | | Second | row | 5.0 | Here's another one. Note the blank line between rows. | - diff --git a/tests/tables.rst b/tests/tables.rst index e77f69d97..25d5932ea 100644 --- a/tests/tables.rst +++ b/tests/tables.rst @@ -88,4 +88,3 @@ Multiline table without column headers: | | | | the blank line between | | | | | rows. | +-------------+------------+--------------+----------------------------+ - diff --git a/tests/twiki-reader.native b/tests/twiki-reader.native new file mode 100644 index 000000000..bde55a378 --- /dev/null +++ b/tests/twiki-reader.native @@ -0,0 +1,174 @@ +Pandoc (Meta {unMeta = fromList []}) +[Header 1 ("header",[],[]) [Str "header"] +,Header 2 ("header-level-two",[],[]) [Str "header",Space,Str "level",Space,Str "two"] +,Header 3 ("header-level-3",[],[]) [Str "header",Space,Str "level",Space,Str "3"] +,Header 4 ("header-level-four",[],[]) [Str "header",Space,Emph [Str "level"],Space,Str "four"] +,Header 5 ("header-level-5",[],[]) [Str "header",Space,Str "level",Space,Str "5"] +,Header 6 ("header-level-6",[],[]) [Str "header",Space,Str "level",Space,Str "6"] +,Para [Str "---+++++++",Space,Str "not",Space,Str "a",Space,Str "header"] +,Para [Str "--++",Space,Str "not",Space,Str "a",Space,Str "header"] +,Header 1 ("emph-and-strong",[],[]) [Str "emph",Space,Str "and",Space,Str "strong"] +,Para [Emph [Str "emph"],Space,Strong [Str "strong"]] +,Para [Emph [Strong [Str "strong",Space,Str "and",Space,Str "emph"]]] +,Para [Strong [Emph [Str "emph",Space,Str "inside"],Space,Str "strong"]] +,Para [Strong [Str "strong",Space,Str "with",Space,Emph [Str "emph"]]] +,Para [Emph [Strong [Str "strong",Space,Str "inside"],Space,Str "emph"]] +,Header 1 ("horizontal-rule",[],[]) [Str "horizontal",Space,Str "rule"] +,Para [Str "top"] +,HorizontalRule +,Para [Str "bottom"] +,HorizontalRule +,Header 1 ("nop",[],[]) [Str "nop"] +,Para [Str "_not",Space,Str "emph_"] +,Header 1 ("entities",[],[]) [Str "entities"] +,Para [Str "hi",Space,Str "&",Space,Str "low"] +,Para [Str "hi",Space,Str "&",Space,Str "low"] +,Para [Str "G\246del"] +,Para [Str "\777\2730"] +,Header 1 ("comments",[],[]) [Str "comments"] +,Para [Str "inline",Space,Str "comment"] +,Para [Str "between",Space,Str "blocks"] +,Header 1 ("linebreaks",[],[]) [Str "linebreaks"] +,Para [Str "hi",LineBreak,Str "there"] +,Para [Str "hi",LineBreak,Space,Str "there"] +,Header 1 ("inline-code",[],[]) [Str "inline",Space,Str "code"] +,Para [Code ("",[],[]) "*\8594*",Space,Code ("",[],[]) "typed",Space,Code ("",["haskell"],[]) ">>="] +,Header 1 ("code-blocks",[],[]) [Str "code",Space,Str "blocks"] +,CodeBlock ("",[],[]) "case xs of\n (_:_) -> reverse xs\n [] -> ['*']" +,CodeBlock ("",["haskell"],[]) "case xs of\n (_:_) -> reverse xs\n [] -> ['*']" +,Header 1 ("block-quotes",[],[]) [Str "block",Space,Str "quotes"] +,Para [Str "Regular",Space,Str "paragraph"] +,BlockQuote + [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "block",Space,Str "quote."] + ,Para [Str "With",Space,Str "two",Space,Str "paragraphs."]] +,Para [Str "Nother",Space,Str "paragraph."] +,Header 1 ("external-links",[],[]) [Str "external",Space,Str "links"] +,Para [Link [Emph [Str "Google"],Space,Str "search",Space,Str "engine"] ("http://google.com","")] +,Para [Link [Str "http://johnmacfarlane.net/pandoc/"] ("http://johnmacfarlane.net/pandoc/","")] +,Para [Link [Str "http://google.com"] ("http://google.com",""),Space,Link [Str "http://yahoo.com"] ("http://yahoo.com","")] +,Para [Link [Str "email",Space,Str "me"] ("mailto:info@example.org","")] +,Para [Str "http://google.com"] +,Para [Str "http://google.com"] +,Para [Str "http://google.com"] +,Para [Str "info@example.org"] +,Para [Str "info@example.org"] +,Para [Str "info@example.org"] +,Header 1 ("lists",[],[]) [Str "lists"] +,BulletList + [[Plain [Str "Start",Space,Str "each",Space,Str "line"]] + ,[Plain [Str "with",Space,Str "an",Space,Str "asterisk",Space,Str "(*)."] + ,BulletList + [[Plain [Str "More",Space,Str "asterisks",Space,Str "gives",Space,Str "deeper"] + ,BulletList + [[Plain [Str "and",Space,Str "deeper",Space,Str "levels."]]]]]] + ,[Plain [Str "Line",Space,Str "breaks",LineBreak,Str "don't",Space,Str "break",Space,Str "levels."]] + ,[Plain [Str "Continuations",Space,Str "are",Space,Str "also",Space,Str "possible"] + ,BulletList + [[Plain [Str "and",Space,Str "do",Space,Str "not",Space,Str "break",Space,Str "the",Space,Str "list",Space,Str "flow"]]]] + ,[Plain [Str "Level",Space,Str "one"]]] +,Para [Str "Any",Space,Str "other",Space,Str "start",Space,Str "ends",Space,Str "the",Space,Str "list."] +,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "Start",Space,Str "each",Space,Str "line"]] + ,[Plain [Str "with",Space,Str "a",Space,Str "number",Space,Str "(1.)."] + ,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "More",Space,Str "number",Space,Str "signs",Space,Str "gives",Space,Str "deeper"] + ,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "and",Space,Str "deeper"]] + ,[Plain [Str "levels."]]]]]] + ,[Plain [Str "Line",Space,Str "breaks",LineBreak,Str "don't",Space,Str "break",Space,Str "levels."]] + ,[Plain [Str "Blank",Space,Str "lines"]]] +,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "end",Space,Str "the",Space,Str "list",Space,Str "and",Space,Str "start",Space,Str "another."]]] +,Para [Str "Any",Space,Str "other",Space,Str "start",Space,Str "also",Space,Str "ends",Space,Str "the",Space,Str "list."] +,DefinitionList + [([Str "item",Space,Str "1"], + [[Plain [Str "definition",Space,Str "1"]]]) + ,([Str "item",Space,Str "2"], + [[Plain [Str "definition",Space,Str "2-1",Space,Str "definition",Space,Str "2-2"]]]) + ,([Str "item",Space,Emph [Str "3"]], + [[Plain [Str "definition",Space,Emph [Str "3"]]]])] +,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "one"]] + ,[Plain [Str "two"] + ,BulletList + [[Plain [Str "two",Space,Str "point",Space,Str "one"]] + ,[Plain [Str "two",Space,Str "point",Space,Str "two"]]]] + ,[Plain [Str "three"] + ,DefinitionList + [([Str "three",Space,Str "item",Space,Str "one"], + [[Plain [Str "three",Space,Str "def",Space,Str "one"]]])]] + ,[Plain [Str "four"] + ,DefinitionList + [([Str "four",Space,Str "def",Space,Str "one"], + [[Plain [Str "this",Space,Str "is",Space,Str "a",Space,Str "continuation"]]])]] + ,[Plain [Str "five"] + ,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "five",Space,Str "sub",Space,Str "1"] + ,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "five",Space,Str "sub",Space,Str "1",Space,Str "sub",Space,Str "1"]]]] + ,[Plain [Str "five",Space,Str "sub",Space,Str "2"]]]]] +,OrderedList (1,DefaultStyle,DefaultDelim) + [[Plain [Str "other"] + ,OrderedList (1,UpperRoman,DefaultDelim) + [[Plain [Str "list"]] + ,[Plain [Str "styles"]]]] + ,[Plain [Str "are"] + ,OrderedList (1,LowerRoman,DefaultDelim) + [[Plain [Str "also"]] + ,[Plain [Str "possible"]]]] + ,[Plain [Str "all"] + ,OrderedList (1,LowerAlpha,DefaultDelim) + [[Plain [Str "the"]] + ,[Plain [Str "different"]] + ,[Plain [Str "styles"]]]] + ,[Plain [Str "are"] + ,OrderedList (1,UpperAlpha,DefaultDelim) + [[Plain [Str "implemented"]] + ,[Plain [Str "and"]] + ,[Plain [Str "supported"]]]]] +,Header 1 ("tables",[],[]) [Str "tables"] +,Table [] [AlignDefault,AlignDefault] [0.0,0.0] + [[] + ,[]] + [[[Plain [Str "Orange"]] + ,[Plain [Str "Apple"]]] + ,[[Plain [Str "Bread"]] + ,[Plain [Str "Pie"]]] + ,[[Plain [Str "Butter"]] + ,[Plain [Str "Ice",Space,Str "cream"]]]] +,Table [] [AlignLeft,AlignLeft] [0.0,0.0] + [[Plain [Str "Orange"]] + ,[Plain [Str "Apple"]]] + [[[Plain [Str "Bread"]] + ,[Plain [Str "Pie"]]] + ,[[Plain [Strong [Str "Butter"]]] + ,[Plain [Str "Ice",Space,Str "cream"]]]] +,Table [] [AlignLeft,AlignLeft] [0.0,0.0] + [[Plain [Str "Orange"]] + ,[Plain [Str "Apple"]]] + [[[Plain [Str "Bread",LineBreak,LineBreak,Str "and",Space,Str "cheese"]] + ,[Plain [Str "Pie",LineBreak,LineBreak,Strong [Str "apple"],Space,Str "and",Space,Emph [Str "carrot"]]]]] +,Table [] [AlignDefault,AlignDefault,AlignDefault] [0.0,0.0,0.0] + [[] + ,[] + ,[]] + [[[Plain [Str "Orange"]] + ,[Plain [Str "Apple"]] + ,[Plain [Str "more"]]] + ,[[Plain [Str "Bread"]] + ,[Plain [Str "Pie"]] + ,[Plain [Str "more"]]] + ,[[Plain [Str "Butter"]] + ,[Plain [Str "Ice",Space,Str "cream"]] + ,[Plain [Str "and",Space,Str "more"]]]] +,Header 1 ("macros",[],[]) [Str "macros"] +,Para [Span ("",["twiki-macro","TEST"],[]) []] +,Para [Span ("",["twiki-macro","TEST"],[]) [Str ""]] +,Para [Span ("",["twiki-macro","TEST"],[]) [Str "content with spaces"]] +,Para [Span ("",["twiki-macro","TEST"],[]) [Str "content with spaces"]] +,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test")]) [Str "content with spaces"]] +,Para [Span ("",["twiki-macro","TEST"],[]) [Str "content with spaces ARG1=test"]] +,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test")]) [Str "content with spaces"]] +,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test"),("ARG2","test2")]) [Str ""]] +,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test"),("ARG2","test2")]) [Str ""]] +,Para [Span ("",["twiki-macro","TEST"],[("ARG1","test"),("ARG2","test2")]) [Str "multiline\ndoes also work"]]] diff --git a/tests/twiki-reader.twiki b/tests/twiki-reader.twiki new file mode 100644 index 000000000..51828ef80 --- /dev/null +++ b/tests/twiki-reader.twiki @@ -0,0 +1,221 @@ +---+ header + +---++ header level two + +---+++ header level 3 + +---++++ header _level_ four + +---+++++ header level 5 + +---++++++ header level 6 + +---+++++++ not a header + + --++ not a header + +---+ emph and strong + +_emph_ *strong* + +__strong and emph__ + +*<i>emph inside</i> strong* + +*strong with <i>emph</i>* + +_<b>strong inside</b> emph_ + +---+ horizontal rule + +top +--- +bottom + +--- + +---+ nop + +<nop>_not emph_ + +---+ entities + +hi & low + +hi & low + +Gödel + +̉પ + +---+ comments + +inline <!-- secret --> comment + +<!-- secret --> + +between blocks + + <!-- secret --> + +---+ linebreaks + +hi%BR%there + +hi%BR% +there + +---+ inline code + +<code>*→*</code> =typed= <code class="haskell">>>=</code> + +---+ code blocks + +<verbatim> +case xs of + (_:_) -> reverse xs + [] -> ['*'] +</verbatim> + +<verbatim class="haskell"> +case xs of + (_:_) -> reverse xs + [] -> ['*'] +</verbatim> + +---+ block quotes + +Regular paragraph +<blockquote> +This is a block quote. + +With two paragraphs. +</blockquote> +Nother paragraph. + +---+ external links + +[[http://google.com][<i>Google</i> search engine]] + +http://johnmacfarlane.net/pandoc/ + +[[http://google.com]] [[http://yahoo.com]] + +[[mailto:info@example.org][email me]] + +!http://google.com + +<nop>http://google.com + +<noautolink> +http://google.com +</noautolink> + +!info@example.org + +<nop>info@example.org + +<noautolink> +info@example.org +</noautolink> + +---+ lists + + * Start each line + * with an asterisk (*). + * More asterisks gives deeper + * and deeper levels. + * Line breaks%BR%don't break levels. + * Continuations + are also possible + * and do not break the list flow + * Level one +Any other start ends the list. + + 1. Start each line + 1. with a number (1.). + 1. More number signs gives deeper + 1. and deeper + 1. levels. + 1. Line breaks%BR%don't break levels. + 1. Blank lines + + 1. end the list and start another. +Any other start also +ends the list. + + $ item 1: definition 1 + $ item 2: definition 2-1 + definition 2-2 + $ item _3_: definition _3_ + + 1. one + 1. two + * two point one + * two point two + 1. three + $ three item one: three def one + 1. four + $ four def one: this + is a continuation + 1. five + 1. five sub 1 + 1. five sub 1 sub 1 + 1. five sub 2 + + 1. other + I. list + I. styles + 1. are + i. also + i. possible + 1. all + a. the + a. different + a. styles + 1. are + A. implemented + A. and + A. supported + +---+ tables + +|Orange|Apple| +|Bread|Pie| +|Butter|Ice cream| + +|*Orange*|*Apple*| +|Bread|Pie| +|*Butter*|Ice cream| + +|*Orange*|*Apple*| +|Bread%BR%%BR%and cheese|Pie%BR%%BR%*apple* and <i>carrot</i>| + +| Orange | Apple | more | +| Bread | Pie | more | +| Butter | Ice cream | and more | + +---+ macros + +%TEST% + +%TEST{}% + +%TEST{content with spaces}% + +%TEST{"content with spaces"}% + +%TEST{"content with spaces" ARG1="test"}% + +%TEST{content with spaces ARG1=test}% + +%TEST{ARG1=test content with spaces}% + +%TEST{ARG1=test ARG2=test2}% + +%TEST{ARG1="test" ARG2="test2"}% + +%TEST{ARG1="test" +ARG2="test2" +multiline +does also work}% diff --git a/tests/writer.dokuwiki b/tests/writer.dokuwiki index 704e79b87..dc14e9b00 100644 --- a/tests/writer.dokuwiki +++ b/tests/writer.dokuwiki @@ -268,7 +268,8 @@ Multiple blocks with italics: <HTML><dt></HTML>//orange//<HTML></dt></HTML> <HTML><dd></HTML><HTML><p></HTML>orange fruit<HTML></p></HTML> <code>{ orange code block }</code> -> <HTML><p></HTML>orange block quote<HTML></p></HTML><HTML></dd></HTML><HTML></dl></HTML> +> <HTML><p></HTML>orange block quote<HTML></p></HTML> +<HTML></dd></HTML><HTML></dl></HTML> Multiple definitions, tight: @@ -611,7 +612,7 @@ If you want, you can indent every line, but you can also be lazy and just indent )) > Notes can go in quotes.((In quote. -)) +> )) - And in list items.((In list.)) diff --git a/tests/writer.latex b/tests/writer.latex index a966e374b..5b41bac75 100644 --- a/tests/writer.latex +++ b/tests/writer.latex @@ -74,10 +74,11 @@ markdown test suite. \section{Headers}\label{headers} -\subsection{Level 2 with an \href{/url}{embedded -link}}\label{level-2-with-an-embedded-link} +\subsection{\texorpdfstring{Level 2 with an \href{/url}{embedded +link}}{Level 2 with an embedded link}}\label{level-2-with-an-embedded-link} -\subsubsection{Level 3 with \emph{emphasis}}\label{level-3-with-emphasis} +\subsubsection{\texorpdfstring{Level 3 with +\emph{emphasis}}{Level 3 with emphasis}}\label{level-3-with-emphasis} \paragraph{Level 4}\label{level-4} @@ -85,7 +86,8 @@ link}}\label{level-2-with-an-embedded-link} \section{Level 1}\label{level-1} -\subsection{Level 2 with \emph{emphasis}}\label{level-2-with-emphasis} +\subsection{\texorpdfstring{Level 2 with +\emph{emphasis}}{Level 2 with emphasis}}\label{level-2-with-emphasis} \subsubsection{Level 3}\label{level-3} @@ -734,7 +736,7 @@ These shouldn't be math: \begin{itemize} \itemsep1pt\parskip0pt\parsep0pt \item - To get the famous equation, write \texttt{\$e = mc\^{}2\$}. + To get the famous equation, write \texttt{\$e\ =\ mc\^{}2\$}. \item \$22,000 is a \emph{lot} of money. So is \$34,000. (It worked if ``lot'' is emphasized.) diff --git a/tests/writer.opml b/tests/writer.opml index 54be4b671..840c3c6e1 100644 --- a/tests/writer.opml +++ b/tests/writer.opml @@ -18,7 +18,7 @@ </outline> <outline text="Level 1"> <outline text="Level 2 with <em>emphasis</em>"> - <outline text="Level 3" _note="with no blank line "> + <outline text="Level 3" _note="with no blank line"> </outline> </outline> <outline text="Level 2" _note="with no blank line ------------------------------------------------------------------------"> @@ -55,18 +55,18 @@ <outline text="Special Characters" _note="Here is some unicode: - I hat: Î - o umlaut: ö - section: § - set membership: ∈ - copyright: © AT&T has an ampersand in their name. AT&T is another way to write it. This & that. 4 \< 5. 6 \> 5. Backslash: \\ Backtick: \` Asterisk: \* Underscore: \_ Left brace: { Right brace: } Left bracket: [ Right bracket: ] Left paren: ( Right paren: ) Greater-than: \> Hash: \# Period: . Bang: ! Plus: + Minus: - ------------------------------------------------------------------------"> </outline> <outline text="Links"> - <outline text="Explicit" _note="Just a [URL](/url/). [URL and title](/url/ "title"). [URL and title](/url/ "title preceded by two spaces"). [URL and title](/url/ "title preceded by a tab"). [URL and title](/url/ "title with "quotes" in it") [URL and title](/url/ "title with single quotes") [with\_underscore](/url/with_underscore) [Email link](mailto:nobody@nowhere.net) [Empty](). "> + <outline text="Explicit" _note="Just a [URL](/url/). [URL and title](/url/ "title"). [URL and title](/url/ "title preceded by two spaces"). [URL and title](/url/ "title preceded by a tab"). [URL and title](/url/ "title with "quotes" in it") [URL and title](/url/ "title with single quotes") [with\_underscore](/url/with_underscore) [Email link](mailto:nobody@nowhere.net) [Empty]()."> </outline> - <outline text="Reference" _note="Foo [bar](/url/). Foo [bar](/url/). Foo [bar](/url/). With [embedded [brackets]](/url/). [b](/url/) by itself should be a link. Indented [once](/url). Indented [twice](/url). Indented [thrice](/url). This should [not][] be a link. [not]: /url Foo [bar](/url/ "Title with "quotes" inside"). Foo [biz](/url/ "Title with "quote" inside"). "> + <outline text="Reference" _note="Foo [bar](/url/). Foo [bar](/url/). Foo [bar](/url/). With [embedded [brackets]](/url/). [b](/url/) by itself should be a link. Indented [once](/url). Indented [twice](/url). Indented [thrice](/url). This should [not][] be a link. [not]: /url Foo [bar](/url/ "Title with "quotes" inside"). Foo [biz](/url/ "Title with "quote" inside")."> </outline> - <outline text="With ampersands" _note="Here’s a [link with an ampersand in the URL](http://example.com/?foo=1&bar=2). Here’s a link with an amersand in the link text: [AT&T](http://att.com/ "AT&T"). Here’s an [inline link](/script?foo=1&bar=2). Here’s an [inline link in pointy braces](/script?foo=1&bar=2). "> + <outline text="With ampersands" _note="Here’s a [link with an ampersand in the URL](http://example.com/?foo=1&bar=2). Here’s a link with an amersand in the link text: [AT&T](http://att.com/ "AT&T"). Here’s an [inline link](/script?foo=1&bar=2). Here’s an [inline link in pointy braces](/script?foo=1&bar=2)."> </outline> <outline text="Autolinks" _note="With an ampersand: <http://example.com/?foo=1&bar=2> - In a list? - <http://example.com/> - It should. An e-mail address: <nobody@nowhere.net> > Blockquoted: <http://example.com/> Auto-links should not occur here: `<http://example.com/>` or here: <http://example.com/> ------------------------------------------------------------------------"> </outline> </outline> <outline text="Images" _note="From “Voyage dans la Lune” by Georges Melies (1902):  Here is a movie  icon. ------------------------------------------------------------------------"> </outline> -<outline text="Footnotes" _note="Here is a footnote reference,[^1] and another.[^2] This should *not* be a footnote reference, because it contains a space.[\^my note] Here is an inline note.[^3] > Notes can go in quotes.[^4] 1. And in list items.[^5] This paragraph should not be part of the note, as it is not indented. [^1]: Here is the footnote. It can go anywhere after the footnote reference. It need not be placed at the end of the document. [^2]: Here’s the long note. This one contains multiple blocks. Subsequent blocks are indented to show that they belong to the footnote (as with list items). { <code> } If you want, you can indent every line, but you can also be lazy and just indent the first line of each block. [^3]: This is *easier* to type. Inline notes may contain [links](http://google.com) and `]` verbatim characters, as well as [bracketed text]. [^4]: In quote. [^5]: In list. "> +<outline text="Footnotes" _note="Here is a footnote reference,[^1] and another.[^2] This should *not* be a footnote reference, because it contains a space.[\^my note] Here is an inline note.[^3] > Notes can go in quotes.[^4] 1. And in list items.[^5] This paragraph should not be part of the note, as it is not indented. [^1]: Here is the footnote. It can go anywhere after the footnote reference. It need not be placed at the end of the document. [^2]: Here’s the long note. This one contains multiple blocks. Subsequent blocks are indented to show that they belong to the footnote (as with list items). { <code> } If you want, you can indent every line, but you can also be lazy and just indent the first line of each block. [^3]: This is *easier* to type. Inline notes may contain [links](http://google.com) and `]` verbatim characters, as well as [bracketed text]. [^4]: In quote. [^5]: In list."> </outline> </body> </opml> diff --git a/tests/writer.rst b/tests/writer.rst index 1a998d2ae..f09871a34 100644 --- a/tests/writer.rst +++ b/tests/writer.rst @@ -839,6 +839,7 @@ From “Voyage dans la Lune” by Georges Melies (1902): :alt: Voyage dans la Lune lalune + Here is a movie |movie| icon. -------------- |
