diff options
-rw-r--r-- | README | 217 | ||||
-rw-r--r-- | changelog | 16 | ||||
-rw-r--r-- | man/man1/pandoc.1.md | 8 | ||||
-rw-r--r-- | pandoc.cabal | 11 | ||||
-rw-r--r-- | src/Text/Pandoc/Plugins.hs | 69 | ||||
-rw-r--r-- | src/pandoc.hs | 19 |
6 files changed, 8 insertions, 332 deletions
@@ -309,14 +309,6 @@ For further documentation, see the `pandoc(1)` man page. repeatedly to include multiple files. They will be included in the order specified. -`-P` *MODULE[,MODULE...]*, `--plugins`*=MODULE[,MODULE...]* -: specifies plugins to load, by module name or source file pathname. - Plugins should export a function `transform` of type `a -> a` - or `a -> IO a`, where `a` is `Inline`, `Block`, `Pandoc`, - `[Inline]`, or `[Block]`. This function will be used to transform - the pandoc document after it is parsed by the reader and before it - is written out by the writer. (See below on [Plugins](#plugins).) - `-T` or `--title-prefix` *string* : includes *string* as a prefix at the beginning of the title that appears in the HTML header (but not in the title as it appears at @@ -1124,212 +1116,3 @@ ordinary HTML (without bird tracks). writes HTML with the Haskell code in bird tracks, so it can be copied and pasted as literate Haskell source. -Plugins -======= - -Pandoc's plugin system allows users to modify pandoc's behavior by writing -short Haskell programs. To how this works, and why it is useful, we - need to understand that pandoc transforms one format (the source -format) into another (the target format) by first converting from the -source format into a Haskell data structure representing the document, -and then converting this data structure into the target format. For -example: - - ------------------------------------------------------------------------------- - Document Format Contents - ------------ --------- ----------------------------------------------------- - source markdown `Hello *world*.` - - ↓ ↓ ↓ - - intermediate native `Pandoc (Meta [] [] "") - [Para [Str "Hello",Space,Emph [Str "world"],Str "."]]` - - ↓ ↓ ↓ - - target HTML `<p>Hello <em>world</em>.</p>` - ------------------------------------------------------------------------------- - -We can use standard text-processing tools (`perl`, `sed`, `awk`, etc.) -to modify the source or target documents. But what if we want to modify -the intermediate representation -- the parsed document -- before it is -written to the target format? That's where plugins are needed. -A plugin is a Haskell module that exports a function `transform`, which -will be used to transform the native representation, after it is generated -by the reader, but before it has been transformed by the writer. - -An example will help make this clearer. Suppose we want to use pandoc with -the WordPress blog engine. WordPress provides support for LaTeX math, but -instead of `$e = mc^2$`, WordPress wants `$LaTeX e = mc^2$`. Prior to plugins, -there was no good way to make pandoc do this. We could have tried using -regex replacements on the markdown input or HTML output, but this would have -been error-prone: if someone writes `$e = mc^2$` in a code block, for -example, we wouldn't want to insert `LaTeX` there. There's no good way to -identify the math chunks without parsing the whole document. And pandoc -is already doing that, so why not make use of this work? By writing a -plugin, we can. Here's the whole plugin: - -~~~ {.haskell} --- WordPressPlugin.hs -module WordPressPlugin (transform) where -import Text.Pandoc - -transform :: Inline -> Inline -transform (Math x y) = Math x $ "LaTeX " ++ y -transform x = x -~~~ - -This is a Haskell program, but a very short one. The lines - -~~~ {.haskell} -module WordPressPlugin (transform) where -import Text.Pandoc -~~~ - -just define the name of the module (`WordPressPlugin`), the names of any -exported functions (for a plugin, this will always just be `transform`), -and the modules that will be used in the program itself (`Text.Pandoc`). -Every plugin must export a function named `transform`. - -The real meat of the program is the three-line definition of `transform`: - -~~~ {.haskell} -transform :: Inline -> Inline -transform (Math x y) = Math x $ "LaTeX " ++ y -transform x = x -~~~ - -The first line defines the type of the function: it is a function that -takes an `Inline` element and returns an `Inline` element. (For the definition -of `Inline`, see the module `Text.Pandoc.Definition`.) The `transform` -function in a plugin need not be `Inline -> Inline`, but it must have -type `a -> a` or `a -> IO a`, where `a` is `Pandoc`, `Block`, `Inline`, -`[Block]`, or `[Inline]`. - -The next line says that when the input matches the pattern `Math x y`, -the string `LaTeX ` should be inserted at the beginning of `y`. (The `x` -just specifies whether the math element is inline or display math, so -we leave it alone.) The last line says, in effect, that the `transform` -function has no effect on any other kind of `Inline` element -- it just -passes it through. When the plugin is applied, this transformation will -be used on every `Inline` element in the document, and `LaTeX ` will be -inserted where needed in math elements. - -To use this plugin, we just specify the module (or alternatively the filename) -with the `--plugins` option: - - % echo "Hello, $e=mc^2$." | pandoc -m --plugins=WordPressPlugin.hs - <p - >Hello, <span class="LaTeX" - >$LaTeX e=mc^2$</span - >.</p - > - -Let's look at a more complex example, involving IO. Suppose we want to include -some [graphviz](http://www.graphviz.org/) diagrams in our document. -Of course, we could use a Makefile to generate the diagrams, then use -regular images in our document. But wouldn't it be nicer just to include -the graphviz code in the document itself, perhaps in a specially marked -delimited code block? - - ~~~ {.dot name="diagram1"} - graph G { - e - subgraph clusterA { - a -- b; - subgraph clusterC { - C -- D; - } - } - subgraph clusterB { - d -- f - } - d -- D - e -- clusterB - clusterC -- clusterB - } - ~~~ - -This can be accomplished by a plugin: - -~~~ {.haskell} --- DotPlugin.hs -module DotPlugin (transform) where -import Text.Pandoc -import Text.Pandoc.Shared -import System.Process (readProcess) -import Data.Char (ord) --- from the utf8-string package on HackageDB: -import Data.ByteString.Lazy.UTF8 (fromString) --- from the SHA package on HackageDB: -import Data.Digest.Pure.SHA - -transform :: Block -> IO Block -transform (CodeBlock (id, classes, namevals) contents) | "dot" `elem` classes = do - let (name, outfile) = case lookup "name" namevals of - Just fn -> ([Str fn], fn ++ ".png") - Nothing -> ([], uniqueName contents ++ ".png") - result <- readProcess "dot" ["-Tpng"] contents - writeFile outfile result - return $ Para [Image name (outfile, "")] -transform x = return x - --- | Generate a unique filename given the file's contents. -uniqueName :: String -> String -uniqueName = showDigest . sha1 . fromString -~~~ - -The heart of this plugin is the `transform` function, which converts a `Block` -to a `Block`. Again, there are two clauses, one for code blocks that are marked -with the "dot" class, one for all other blocks. Code blocks with ".dot" are -replaced with links to an image file; this file is generated by running -`dot -Tpng` on the contents of the code block. - -Because `transform` performs file reads and writes, it needs to be in the -IO monad, hence the type: `Block -> IO Block`. - -One more example. Suppose we want emphasized text to be CAPITALIZED -instead of italicized. We could use a plugin: - -~~~ {.haskell} -module CapitalizeEmphasisPlugin (transform) where -import Text.Pandoc -import Data.Char (toUpper) - -transform :: [Inline] -> [Inline] -transform (Emph xs : ys) = processWith capStr xs ++ transform ys -transform (x : ys) = x : transform ys -transform [] = [] - -capStr :: Inline -> Inline -capStr (Str x) = Str (map toUpper x) -capStr x = x -~~~ - -Here `transform` converts a whole list of `Inline` elements to another -such list. The key clause is - -~~~ {.haskell} -transform (Emph xs : ys) = processWith capStr xs ++ transform ys -~~~ - -This applies the `capStr` function recursively to all inlines in the -list of emphasized inlines and puts the transformed list in place -of the original. `capStr` is a simple `Inline` transformation that -capitalizes `Str` elements and leaves everything else alone. The -function `processWith`, defined in `Text.Pandoc.Definition`, uses some -`Data.Generics` magic to apply its argument (here `capStr`) to every -`Inline` element in a list, including elements that are deeply buried in -other elements. Thus - - processWith captStr [Str "one", Strong [Str "two", Space]] ==> - [Str "ONE", Strong [Str "TWO", Space]] - -There are other sample plugins in the `plugins` subdirectory of the -pandoc source code. - -**Note:** Do not attempt to use plugins when running pandoc in the -directory containing pandoc's source code. The interpreter will try to -load the files directly from the source code, rather than reading the compiled -versions, and pandoc will hang. - @@ -16,19 +16,9 @@ pandoc (1.2) + Added tests (run only if highlighting support compiled in). + Documented lhs features in man page and README. - * Added a plugin system, based on hint. This allows users to - write small interpreted Haskell programs that transform the - AST between the reader and the writer. - - + In Text.Pandoc.Definition, added processWith, processWithM, - and queryWith, and deprecated processPandoc and queryPandoc - for these more general functions, which are useful in writing - plugins. - + Added module Text.Pandoc.Plugins. - + Added a --plugins option to pandoc, and code to run the parsed - document through all the plugins. - + Provided five sample plugin files in the plugins/ directory. - + Documented --plugin in the pandoc man page and README. + * In Text.Pandoc.Definition, added processWith, processWithM, + and queryWith, and deprecated processPandoc and queryPandoc + for these more general functions. * Fixed bug in mediawiki writer: improper closing tags in tables. Thanks to Benct Philip Jonsson for reporting the bug. diff --git a/man/man1/pandoc.1.md b/man/man1/pandoc.1.md index 467a62b0a..9a0907321 100644 --- a/man/man1/pandoc.1.md +++ b/man/man1/pandoc.1.md @@ -181,14 +181,6 @@ to Pandoc. Or use `html2markdown`(1), a wrapper around `pandoc`. `context`, `docbook`, `man`, `markdown`, `opendocument`, `rst`, `rtf`). --P *MODULE[,MODULE...]*, \--plugins=*MODULE[,MODULE...]* -: Specify plugins to load, by module name or source file pathname. - Plugins should export a function `transform` of type `a -> a` - or `a -> IO a`, where `a` is `Inline`, `Block`, `Pandoc`, - `[Inline]`, or `[Block]`. This function will be used to transform - the pandoc document after it is parsed by the reader and before it - is written out by the writer. - -T *STRING*, \--title-prefix=*STRING* : Specify *STRING* as a prefix to the HTML window title. diff --git a/pandoc.cabal b/pandoc.cabal index 5fdca06ab..0a0d10f0e 100644 --- a/pandoc.cabal +++ b/pandoc.cabal @@ -38,13 +38,7 @@ Data-Files: -- documentation README, INSTALL, COPYRIGHT, BUGS, changelog, -- wrappers - markdown2pdf, html2markdown, hsmarkdown, - -- sample plugins - plugins/WordPressPlugin.hs, - plugins/CapitalizeEmphasisPlugin.hs, - plugins/IncludeFilePlugin.hs, - plugins/DotPlugin.hs, - plugins/ListLinksPlugin.hs + markdown2pdf, html2markdown, hsmarkdown Extra-Source-Files: -- sources for man pages man/man1/pandoc.1.md, man/man1/markdown2pdf.1.md, @@ -159,7 +153,7 @@ Library mtl >= 1.1, network >= 2, filepath >= 1.1, process >= 1, directory >= 1, template-haskell >= 2.2, bytestring >= 0.9, zip-archive >= 0.1.1, - utf8-string >= 0.3, old-time >= 1, hint >= 0.3.0.0 + utf8-string >= 0.3, old-time >= 1 if impl(ghc >= 6.10) Build-depends: base >= 4, syb else @@ -178,7 +172,6 @@ Library Text.Pandoc.CharacterReferences, Text.Pandoc.Shared, Text.Pandoc.ODT, - Text.Pandoc.Plugins, Text.Pandoc.LaTeXMathML, Text.Pandoc.DefaultHeaders, Text.Pandoc.Highlighting, diff --git a/src/Text/Pandoc/Plugins.hs b/src/Text/Pandoc/Plugins.hs deleted file mode 100644 index cb8ad1e11..000000000 --- a/src/Text/Pandoc/Plugins.hs +++ /dev/null @@ -1,69 +0,0 @@ -{- -Copyright (C) 2008 John MacFarlane <jgm@berkeley.edu> - -This program is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 2 of the License, or -(at your option) any later version. - -This program is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -You should have received a copy of the GNU General Public License -along with this program; if not, write to the Free Software -Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA --} - -{- | - Module : Text.Pandoc.Pluigns - Copyright : Copyright (C) 2008 John MacFarlane - License : GNU GPL, version 2 or above - - Maintainer : John MacFarlane <jgm@berkeley.edu> - Stability : alpha - Portability : portable - -Support for plugins. --} - -module Text.Pandoc.Plugins (getPlugin) -where - -import Language.Haskell.Interpreter -import Text.Pandoc -import Control.Monad (unless, liftM) -import Control.Monad.Error (throwError) -import Data.List (isInfixOf) - --- | Returns the function named @transform@ in the specified --- module. The module may be identified either by module name --- or by path name. The @transform@ function should have type --- @a -> a@ or @a -> IO a@, where @a@ is an instance of 'Data': --- for example, @Pandoc -> Pandoc@, @Inline -> IO Inline@, --- @Block -> Block@, or @[Inline] -> IO [Inline]@. -getPlugin :: String -> IO (Pandoc -> IO Pandoc) -getPlugin modsrc = do - res <- runInterpreter (evaluatePlugin modsrc) - case res of - Right func -> return func - Left (WontCompile xs) -> error $ "WontCompile error for plugin '" ++ modsrc ++ "'\n" ++ unlines (map errMsg xs) - Left (NotAllowed x) -> error $ "NotAllowed error for plugin '" ++ modsrc ++ "'\n" ++ x - Left (UnknownError x) -> error $ "UnknownError for plugin '" ++ modsrc ++ "'\n" ++ x - Left (GhcException x) -> error $ "GhcException for plugin '" ++ modsrc ++ "'\n" ++ x - -evaluatePlugin :: String -> Interpreter (Pandoc -> IO Pandoc) -evaluatePlugin modsrc = do - set [installedModulesInScope := False] - loadModules [modsrc] - modnames <- getLoadedModules - setTopLevelModules modnames - setImports ["Prelude", "Text.Pandoc", "Text.Pandoc.Definition"] - exports <- liftM concat $ mapM getModuleExports modnames - unless ((Fun "transform") `elem` exports) $ - throwError $ UnknownError $ "The plugin module must define a function 'transform'." - transformType <- typeOf "transform" - if "-> IO" `isInfixOf` transformType - then interpret "processWithM transform" (as :: Pandoc -> IO Pandoc) - else interpret "return . (processWith transform)" (as :: Pandoc -> IO Pandoc) diff --git a/src/pandoc.hs b/src/pandoc.hs index 64b79619c..12605c401 100644 --- a/src/pandoc.hs +++ b/src/pandoc.hs @@ -32,7 +32,7 @@ writers. module Main where import Text.Pandoc import Text.Pandoc.ODT -import Text.Pandoc.Shared ( HTMLMathMethod (..), splitBy, tabFilter, ObfuscationMethod (..) ) +import Text.Pandoc.Shared ( HTMLMathMethod (..), tabFilter, ObfuscationMethod (..) ) import Text.Pandoc.Highlighting ( languages ) import System.Environment ( getArgs, getProgName, getEnvironment ) import System.Exit ( exitWith, ExitCode (..) ) @@ -48,8 +48,7 @@ import System.IO.UTF8 import Text.CSL import Text.Pandoc.Biblio #endif -import Text.Pandoc.Plugins (getPlugin) -import Control.Monad (foldM, when, unless) +import Control.Monad (when, unless) copyrightMessage :: String copyrightMessage = "\nCopyright (C) 2006-8 John MacFarlane\n" ++ @@ -365,15 +364,6 @@ options = "FILENAME") "" -- "File to use for custom header (implies -s)" - , Option "P" ["plugins"] - (ReqArg - (\arg opt -> do - let pluginModules = splitBy ',' arg - plugins <- mapM getPlugin pluginModules - return opt { optPlugins = plugins }) - "MODULE[,MODULE...]") - "" -- "Haskell modules" - , Option "T" ["title-prefix"] (ReqArg (\arg opt -> return opt { optTitlePrefix = arg, @@ -540,7 +530,6 @@ main = do , optReferenceLinks = referenceLinks , optWrapText = wrap , optSanitizeHTML = sanitize - , optPlugins = plugins , optEmailObfuscation = obfuscationMethod #ifdef _CITEPROC , optBiblioFile = biblioFile @@ -653,9 +642,7 @@ main = do return doc #endif - doc'' <- foldM (flip ($)) doc' plugins - - let writerOutput = writer writerOptions doc'' ++ "\n" + let writerOutput = writer writerOptions doc' ++ "\n" case writerName' of "odt" -> saveOpenDocumentAsODT outputFile sourceDirRelative writerOutput |