author | fiddlosopher <fiddlosopher@788f1e2b-df1e-0410-8736-df70ead52e1b> | 2009-01-24 19:58:06 +0000 |
---|---|---|
committer | fiddlosopher <fiddlosopher@788f1e2b-df1e-0410-8736-df70ead52e1b> | 2009-01-24 19:58:06 +0000 |
commit | 874c3e0deabab154548a3e91e271e86e94ba8502 (patch) | |
tree | e976f223d7d1c6435f93ebaf83e8ef0aa7be31a8 /README | |
parent | 243008242d76017d3828550d2ec23580894d5490 (diff) | |
download | pandoc-874c3e0deabab154548a3e91e271e86e94ba8502.tar.gz |
Added a plugin system, based on hint.
+ In Text.Pandoc.Definition, added processIn, processInM,
and queryIn, and deprecated processPandoc and queryPandoc
for these more general functions, which are useful in writing
plugins.
+ Added module Text.Pandoc.Plugins.
+ Added a --plugins option to Main, and code to run the parsed pandoc
document through all the plugins.
+ Provided five sample plugin files in the plugins/ directory.
+ Documented --plugins in the pandoc man page and README.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1519 788f1e2b-df1e-0410-8736-df70ead52e1b
Diffstat (limited to 'README')
-rw-r--r-- | README | 170 |
1 files changed, 170 insertions, 0 deletions
@@ -309,6 +309,14 @@ For further documentation, see the `pandoc(1)` man page.
     repeatedly to include multiple files. They will be included in the
     order specified.
 
+`-P` *MODULE[,MODULE...]*, `--plugins`*=MODULE[,MODULE...]*
+:   specifies plugins to load, by module name or source file pathname.
+    Plugins should export a function `transform` of type `a -> a`
+    or `a -> IO a`, where `a` is `Inline`, `Block`, `Pandoc`,
+    `[Inline]`, or `[Block]`. This function will be used to transform
+    the pandoc document after it is parsed by the reader and before it
+    is written out by the writer. (See below on [Plugins](#plugins).)
+
 `-T` or `--title-prefix` *string*
 :   includes *string* as a prefix at the beginning of the title that
     appears in the HTML header (but not in the title as it appears at
@@ -1108,3 +1116,165 @@ ordinary HTML (without bird tracks).
 writes HTML with the Haskell code in bird tracks, so it can be copied
 and pasted as literate Haskell source.
 
+Plugins
+=======
+
+Pandoc's plugin system allows users to modify pandoc's behavior by writing
+short Haskell programs. A plugin is a Haskell module that exports a function
+`transform`, of type `a -> a` or `a -> IO a`, where `a` is `Pandoc`,
+`Block`, `Inline`, `[Block]`, or `[Inline]`. The `transform` function will
+be used to transform the pandoc document generated by the reader, before
+it is transformed by the writer.
+
+An example will help make this clearer. Suppose we want to use pandoc with
+the WordPress blog engine. WordPress provides support for LaTeX math, but
+instead of `$e = mc^2$`, WordPress wants `$LaTeX e = mc^2$`. Prior to plugins,
+there was no good way to make pandoc do this. We could have tried using
+regex replacements on the markdown input or HTML output, but this would have
+been error-prone: we'd have to make sure we weren't capturing non-math text
+between dollar signs (for example, text inside a code block). Besides,
+pandoc's markdown reader has already identified the math bits; why not
+make use of that? By writing a plugin, we can:
+
+~~~ {.haskell}
+-- WordPressPlugin.hs
+module WordPressPlugin (transform) where
+import Text.Pandoc
+
+transform :: Inline -> Inline
+transform (Math x y) = Math x $ "LaTeX " ++ y
+transform x = x
+~~~
+
+This is a Haskell program, but a very short one. The lines
+
+~~~ {.haskell}
+module WordPressPlugin (transform) where
+import Text.Pandoc
+~~~
+
+just define the name of the module (`WordPressPlugin`), the names of any
+exported functions (for a plugin, this will always just be `transform`),
+and the modules that will be used in the program itself (`Text.Pandoc`).
+The real meat of the program is the three-line definition of `transform`:
+
+~~~ {.haskell}
+transform :: Inline -> Inline
+transform (Math x y) = Math x $ "LaTeX " ++ y
+transform x = x
+~~~
+
+The first line defines the type of the function: it is a function that
+takes an `Inline` element and returns an `Inline` element. (For the definition
+of `Inline`, see the module `Text.Pandoc.Definition`.) The next line says
+that when the input matches the pattern `Math x y`, the string `LaTeX `
+should be inserted at the beginning of `y`. (`x` just specifies whether the
+math element is inline or display math, so we leave it alone.) The last
+line says, in effect, that the `transform` function has no effect on any
+other kind of `Inline` element -- it just passes it through. When the plugin
+is applied, this transformation will be used on every `Inline` element in
+the document, and `LaTeX ` will be inserted where needed in math elements.
+
+To use this plugin, we just specify the module (or alternatively the filename)
+with the `--plugins` option:
+
+    % echo "Hello, $e=mc^2$." | pandoc -m --plugins=WordPressPlugin.hs
+    <p
+    >Hello, <span class="LaTeX"
+    >$LaTeX e=mc^2$</span
+    >.</p
+    >
+
+Let's look at a more complex example, involving IO. Suppose we want to include
+some graphviz diagrams in our document. Of course, we could use a Makefile to
+generate the diagrams, then use regular images in our document. But wouldn't it
+be nicer just to include the graphviz code in the document itself, perhaps in
+a specially marked delimited code block?
+
+    ~~~ {.dot name="diagram1"}
+    digraph G {Hello->World}
+    ~~~
+
+This can be accomplished by a plugin:
+
+~~~ {.haskell}
+-- DotPlugin.hs
+module DotPlugin (transform) where
+import Text.Pandoc
+import Text.Pandoc.Shared
+import System.Process (readProcess)
+import Data.Char (ord)
+-- from the utf8-string package on HackageDB:
+import Data.ByteString.Lazy.UTF8 (fromString)
+-- from the SHA package on HackageDB:
+import Data.Digest.Pure.SHA
+
+transform :: Block -> IO Block
+transform (CodeBlock (id, classes, namevals) contents) | "dot" `elem` classes = do
+  let (name, outfile) = case lookup "name" namevals of
+                             Just fn -> ([Str fn], fn ++ ".png")
+                             Nothing -> ([], uniqueName contents ++ ".png")
+  result <- readProcess "dot" ["-Tpng"] contents
+  writeFile outfile result
+  return $ Para [Image name (outfile, "")]
+transform x = return x
+
+-- | Generate a unique filename given the file's contents.
+uniqueName :: String -> String
+uniqueName = showDigest . sha1 . fromString
+~~~
+
+The heart of this plugin is the `transform` function, which converts a `Block`
+to a `Block`. Again, there are two clauses, one for code blocks that are marked
+with the "dot" class, one for all other blocks. Code blocks with ".dot" are
+replaced with links to an image file; this file is generated by running
+`dot -Tpng` on the contents of the code block.
+
+Because `transform` performs file reads and writes, it needs to be in the
+IO monad, hence the type: `Block -> IO Block`.
+
+One more example. Suppose we want emphasized text to be CAPITALIZED
+instead of italicized. We could use a plugin:
+
+~~~ {.haskell}
+module CapitalizeEmphasisPlugin (transform) where
+import Text.Pandoc
+import Data.Char (toUpper)
+
+transform :: [Inline] -> [Inline]
+transform (Emph xs : ys) = processIn capStr xs ++ transform ys
+transform (x : ys) = x : transform ys
+transform [] = []
+
+capStr :: Inline -> Inline
+capStr (Str x) = Str (map toUpper x)
+capStr x = x
+~~~
+
+Here `transform` converts a whole list of `Inline` elements to another
+such list. The key clause is
+
+~~~ {.haskell}
+transform (Emph xs : ys) = processIn capStr xs ++ transform ys
+~~~
+
+This applies the `capStr` function recursively to all inlines in the
+list of emphasized inlines and puts the transformed list in place
+of the original. `capStr` is a simple `Inline` transformation that
+capitalizes `Str` elements and leaves everything else alone. The
+function `processIn`, defined in `Text.Pandoc.Definition`, uses some
+`Data.Generics` magic to apply its argument (here `capStr`) to every
+`Inline` element in a list, including elements that are deeply buried in
+other elements. Thus
+
+    processIn capStr [Str "one", Strong [Str "two", Space]] ==>
+            [Str "ONE", Strong [Str "TWO", Space]]
+
+There are other sample plugins in the `plugins` subdirectory of the
+pandoc source code.
+
+**Note:** Do not attempt to use plugins when running pandoc in the
+directory containing pandoc's source code. The interpreter will try to
+load the files directly from the source code, rather than reading the compiled
+versions, and pandoc will hang.
+
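Reviewer's note (not part of the commit): the README text added above says only that `processIn` uses "`Data.Generics` magic" to reach deeply nested `Inline` elements. The sketch below shows one way such a traversal can work, using nothing but `Data.Data` from `base`. Everything in it is an assumption made for illustration: the `Inline` type is a cut-down stand-in rather than pandoc's own, and `processInSketch` is a hypothetical name whose definition may differ from the real `processIn` in `Text.Pandoc.Definition`.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Char (toUpper)
import Data.Data (Data, cast, gmapT)
import Data.Maybe (fromMaybe)

-- Hypothetical, cut-down stand-in for pandoc's Inline type (not the real one).
data Inline
  = Str String
  | Space
  | Emph [Inline]
  | Strong [Inline]
  deriving (Show, Eq, Data)

-- Assumed behaviour of processIn: apply f to every value of f's argument
-- type found anywhere inside the structure, visiting children first.
processInSketch :: (Data a, Data b) => (a -> a) -> b -> b
processInSketch f = go
  where
    -- Rewrite the immediate children, then the node itself.
    go :: Data d => d -> d
    go x = atNode (gmapT go x)

    -- Use f when the node has f's argument type; otherwise leave it alone.
    atNode :: Data d => d -> d
    atNode = fromMaybe id (cast f)

-- Same definition as capStr in the README example.
capStr :: Inline -> Inline
capStr (Str x) = Str (map toUpper x)
capStr x = x

-- The list-level transform from the README, using the sketch above.
transform :: [Inline] -> [Inline]
transform (Emph xs : ys) = processInSketch capStr xs ++ transform ys
transform (x : ys) = x : transform ys
transform [] = []
```

With these definitions, `processInSketch capStr [Str "one", Strong [Str "two", Space]]` evaluates to `[Str "ONE", Strong [Str "TWO", Space]]`, matching the behaviour the README describes. Because `cast` succeeds only at nodes whose type is exactly `Inline`, strings and lists pass through the traversal untouched.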