author fiddlosopher <fiddlosopher@788f1e2b-df1e-0410-8736-df70ead52e1b> 2009-01-24 19:58:06 +0000
committer fiddlosopher <fiddlosopher@788f1e2b-df1e-0410-8736-df70ead52e1b> 2009-01-24 19:58:06 +0000
commit 874c3e0deabab154548a3e91e271e86e94ba8502
tree e976f223d7d1c6435f93ebaf83e8ef0aa7be31a8 /README
parent 243008242d76017d3828550d2ec23580894d5490
Added a plugin system, based on hint.

+ In Text.Pandoc.Definition, added processIn, processInM, and queryIn, and
  deprecated processPandoc and queryPandoc for these more general functions,
  which are useful in writing plugins.
+ Added module Text.Pandoc.Plugins.
+ Added a --plugins option to Main, and code to run the parsed pandoc
  document through all the plugins.
+ Provided five sample plugin files in the plugins/ directory.
+ Documented --plugin in the pandoc man page and README.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1519 788f1e2b-df1e-0410-8736-df70ead52e1b
Diffstat (limited to 'README')
-rw-r--r-- README 170
1 files changed, 170 insertions, 0 deletions
diff --git a/README b/README
index 86c52ac2b..4746ffae0 100644
--- a/README
+++ b/README
@@ -309,6 +309,14 @@ For further documentation, see the `pandoc(1)` man page.
repeatedly to include multiple files. They will be included in the
order specified.
+`-P` *MODULE[,MODULE...]*, `--plugins`*=MODULE[,MODULE...]*
+:   specifies plugins to load, by module name or source file pathname.
+    Plugins should export a function `transform` of type `a -> a`
+    or `a -> IO a`, where `a` is `Inline`, `Block`, `Pandoc`,
+    `[Inline]`, or `[Block]`. This function will be used to transform
+    the pandoc document after it is parsed by the reader and before it
+    is written out by the writer. (See below on [Plugins](#plugins).)
+
`-T` or `--title-prefix` *string*
: includes *string* as a prefix at the beginning of the title that
appears in the HTML header (but not in the title as it appears at
@@ -1108,3 +1116,165 @@ ordinary HTML (without bird tracks).
writes HTML with the Haskell code in bird tracks, so it can be copied
and pasted as literate Haskell source.
+Plugins
+=======
+
+Pandoc's plugin system allows users to modify pandoc's behavior by writing
+short Haskell programs. A plugin is a Haskell module that exports a function
+`transform`, of type `a -> a` or `a -> IO a`, where `a` is `Pandoc`,
+`Block`, `Inline`, `[Block]`, or `[Inline]`. The `transform` function will
+be used to transform the pandoc document generated by the reader, before
+it is transformed by the writer.
+
+An example will help make this clearer. Suppose we want to use pandoc with
+the WordPress blog engine. WordPress provides support for LaTeX math, but
+instead of `$e = mc^2$`, WordPress wants `$LaTeX e = mc^2$`. Prior to plugins,
+there was no good way to make pandoc do this. We could have tried using
+regex replacements on the markdown input or HTML output, but this would have
+been error-prone: we'd have to make sure we weren't capturing non-math text
+between dollar signs (for example, text inside a code block). Besides,
+pandoc's markdown reader has already identified the math bits; why not
+make use of that? By writing a plugin, we can:
+
+~~~ {.haskell}
+-- WordPressPlugin.hs
+module WordPressPlugin (transform) where
+import Text.Pandoc
+
+transform :: Inline -> Inline
+transform (Math x y) = Math x $ "LaTeX " ++ y
+transform x = x
+~~~
+
+This is a Haskell program, but a very short one. The lines
+
+~~~ {.haskell}
+module WordPressPlugin (transform) where
+import Text.Pandoc
+~~~
+
+just define the name of the module (`WordPressPlugin`), the names of any
+exported functions (for a plugin, this will always just be `transform`),
+and the modules that will be used in the program itself (`Text.Pandoc`).
+The real meat of the program is the three-line definition of `transform`:
+
+~~~ {.haskell}
+transform :: Inline -> Inline
+transform (Math x y) = Math x $ "LaTeX " ++ y
+transform x = x
+~~~
+
+The first line defines the type of the function: it is a function that
+takes an `Inline` element and returns an `Inline` element. (For the definition
+of `Inline`, see the module `Text.Pandoc.Definition`.) The next line says
+that when the input matches the pattern `Math x y`, the string `LaTeX `
+should be inserted at the beginning of `y`. (`x` just specifies whether the
+math element is inline or display math, so we leave it alone.) The last
+line says, in effect, that the `transform` function has no effect on any
+other kind of `Inline` element -- it just passes it through. When the plugin
+is applied, this transformation will be used on every `Inline` element in
+the document, and `LaTeX ` will be inserted where needed in math elements.
+
+To use this plugin, we just specify the module (or alternatively the filename)
+with the `--plugins` option:
+
+    % echo 'Hello, $e=mc^2$.' | pandoc -m --plugins=WordPressPlugin.hs
+    <p
+    >Hello, <span class="LaTeX"
+    >$LaTeX e=mc^2$</span
+    >.</p
+    >
+
+Let's look at a more complex example, involving IO. Suppose we want to include
+some graphviz diagrams in our document. Of course, we could use a Makefile to
+generate the diagrams, then use regular images in our document. But wouldn't it
+be nicer just to include the graphviz code in the document itself, perhaps in
+a specially marked delimited code block?
+
+    ~~~ {.dot name="diagram1"}
+    digraph G {Hello->World}
+    ~~~
+
+This can be accomplished by a plugin:
+
+~~~ {.haskell}
+-- DotPlugin.hs
+module DotPlugin (transform) where
+import Text.Pandoc
+import Text.Pandoc.Shared
+import System.Process (readProcess)
+import Data.Char (ord)
+-- from the utf8-string package on HackageDB:
+import Data.ByteString.Lazy.UTF8 (fromString)
+-- from the SHA package on HackageDB:
+import Data.Digest.Pure.SHA
+
+transform :: Block -> IO Block
+transform (CodeBlock (id, classes, namevals) contents) | "dot" `elem` classes = do
+    let (name, outfile) = case lookup "name" namevals of
+                            Just fn -> ([Str fn], fn ++ ".png")
+                            Nothing -> ([], uniqueName contents ++ ".png")
+    result <- readProcess "dot" ["-Tpng"] contents
+    writeFile outfile result
+    return $ Para [Image name (outfile, "")]
+transform x = return x
+
+-- | Generate a unique filename given the file's contents.
+uniqueName :: String -> String
+uniqueName = showDigest . sha1 . fromString
+~~~
+
+The heart of this plugin is the `transform` function, which maps a `Block`
+to an `IO Block`. Again, there are two clauses: one for code blocks marked
+with the `dot` class, and one for all other blocks. A code block with the
+`dot` class is replaced by a paragraph containing an image; the image file
+is generated by running `dot -Tpng` on the contents of the code block.
+
+Because `transform` runs an external process and writes a file, it needs
+to be in the IO monad, hence the type: `Block -> IO Block`.
+
+One more example. Suppose we want emphasized text to be CAPITALIZED
+instead of italicized. We could use a plugin:
+
+~~~ {.haskell}
+module CapitalizeEmphasisPlugin (transform) where
+import Text.Pandoc
+import Data.Char (toUpper)
+
+transform :: [Inline] -> [Inline]
+transform (Emph xs : ys) = processIn capStr xs ++ transform ys
+transform (x : ys) = x : transform ys
+transform [] = []
+
+capStr :: Inline -> Inline
+capStr (Str x) = Str (map toUpper x)
+capStr x = x
+~~~
+
+Here `transform` converts a whole list of `Inline` elements to another
+such list. The key clause is
+
+~~~ {.haskell}
+transform (Emph xs : ys) = processIn capStr xs ++ transform ys
+~~~
+
+This applies the `capStr` function recursively to all inlines in the
+list of emphasized inlines and puts the transformed list in place
+of the original. `capStr` is a simple `Inline` transformation that
+capitalizes `Str` elements and leaves everything else alone. The
+function `processIn`, defined in `Text.Pandoc.Definition`, uses some
+`Data.Generics` magic to apply its argument (here `capStr`) to every
+`Inline` element in a list, including elements that are deeply buried in
+other elements. Thus
+
+    processIn capStr [Str "one", Strong [Str "two", Space]] ==>
+      [Str "ONE", Strong [Str "TWO", Space]]
+
+There are other sample plugins in the `plugins` subdirectory of the
+pandoc source code.
+
+**Note:** Do not attempt to use plugins when running pandoc in the
+directory containing pandoc's source code. The interpreter will try to
+load the files directly from the source code, rather than reading the compiled
+versions, and pandoc will hang.
+
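The README text added above says that `processIn` uses "some `Data.Generics` magic" to reach deeply nested elements. The following is a minimal sketch of how such a traversal could be written, not the actual definition from `Text.Pandoc.Definition`. It rests on several assumptions: the `Inline` type here is a cut-down stand-in, `processInSketch` is a hypothetical name, and `mkT` and `everywhere` are local re-implementations modelled on the combinators of the same names in the `syb` package, so that the example needs only `base`. The bottom-up traversal order is also an assumption.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
-- Sketch only: a toy Inline type stands in for the one in
-- Text.Pandoc.Definition, and processInSketch is a hypothetical name.
module Main (main) where

import Data.Char (toUpper)
import Data.Data (Data, gmapT)
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)

-- | Cut-down stand-in for pandoc's Inline type (assumption).
data Inline
  = Str String
  | Space
  | Emph [Inline]
  | Strong [Inline]
  deriving (Show, Eq, Data)

-- | Lift a function on one type to a function on any type: it behaves
-- like f where the types match and like the identity everywhere else.
-- (Local copy modelled on syb's mkT.)
mkT :: (Typeable a, Typeable b) => (b -> b) -> a -> a
mkT f = fromMaybe id (cast f)

-- | Apply a generic transformation to every node of a value,
-- children first, then the node itself.
-- (Local copy modelled on syb's everywhere.)
everywhere :: Data b => (forall a. Data a => a -> a) -> b -> b
everywhere f x = f (gmapT (everywhere f) x)

-- | Apply f to every value of type a found anywhere inside a b.
processInSketch :: (Data a, Data b) => (a -> a) -> b -> b
processInSketch f = everywhere (mkT f)

-- | Same capStr as in the CapitalizeEmphasisPlugin example.
capStr :: Inline -> Inline
capStr (Str x) = Str (map toUpper x)
capStr x = x

main :: IO ()
main = print (processInSketch capStr [Str "one", Strong [Str "two", Space]])
-- prints: [Str "ONE",Strong [Str "TWO",Space]]
```

Because `mkT` falls back to the identity whenever the types differ, the same `processInSketch` can be applied to a single `Inline`, a list of inlines, or any other type with a `Data` instance, which is what lets a plugin's `transform` be written against whichever element type is most convenient.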