7 files changed, 44 insertions, 506 deletions
diff --git a/README b/README
index b8c9db03e..dc7d1f63c 100644
--- a/README
+++ b/README
@@ -127,92 +127,49 @@ will convert `source.txt` from the local encoding to UTF-8, then
 convert it to HTML, then convert back to the local encoding,
 putting the output in `output.html`.
 
-The wrapper scripts (described below) automatically convert the input
-from the local encoding to UTF-8 before running them through `pandoc`,
-then convert the output back to the local encoding.
-
 Wrappers
 ========
 
-Three wrapper scripts, `markdown2pdf`, `html2markdown`, and
-`hsmarkdown`, are included in the standard Pandoc installation. (The
-Windows binary package does not include `html2markdown`, which is
-a POSIX shell script. It does include portable Haskell versions of
-`markdown2pdf` and `hsmarkdown`.)
-
-1.  `markdown2pdf` produces a PDF file from markdown-formatted
-    text, using `pandoc` and `pdflatex`.  The default
-    behavior of `markdown2pdf` is to create a file with the same
-    base name as the first argument and the extension `pdf`; thus,
-    for example,
-
-           markdown2pdf sample.txt endnotes.txt
-
-    will produce `sample.pdf`.  (If `sample.pdf` exists already,
-    it will be backed up before being overwritten.)  An output file
-    name can be specified explicitly using the `-o` option:
-
-           markdown2pdf -o book.pdf chap1 chap2
-
-    If no input file is specified, input will be taken from stdin.
-    All of `pandoc`'s options will work with `markdown2pdf` as well.
-
-    `markdown2pdf` assumes that `pdflatex` is in the path.  It also
-    assumes that the following LaTeX packages are available:
-    `unicode`, `fancyhdr` (if you have verbatim text in footnotes),
-    `graphicx` (if you use images), `array` (if you use tables),
-    and `ulem` (if you use strikeout text).  If they are not already
-    included in your LaTeX distribution, you can get them from
-    [CTAN]. A full [TeX Live] or [MacTeX] distribution will have all of
-    these packages.
-
-2.  `html2markdown` grabs a web page from a file or URL and converts
-    it to markdown-formatted text, using `tidy` and `pandoc`.
-
-    All of `pandoc`'s options will work with `html2markdown` as well.
-    In addition, the following special options may be used.
-    The special options must be separated from the `html2markdown`
-    command and any regular Pandoc options by the delimiter `--`:
-
-        html2markdown -o out.txt -- -e latin1 -g curl google.com 
-
-    The `-e` or `--encoding` option specifies the character encoding
-    of the HTML input.  If this option is not specified, and input
-    is not from stdin, `html2markdown` will attempt to determine the
-    page's character encoding from the "Content-type" meta tag.
-    If this is not present, UTF-8 is assumed.
-
-    The `-g` or `--grabber` option specifies the command to be used to
-    fetch the contents of a URL:
-
-        html2markdown -g 'curl --user foo:bar' www.mysite.com
-
-    If this option is not specified, `html2markdown` searches for an
-    available program (`wget`, `curl`, or a text-mode browser) to fetch
-    the contents of a URL.
-
-    `html2markdown` requires [HTML Tidy], which must be in the path.
-    It uses [`iconv`] for character encoding conversions; if `iconv`
-    is absent, it will still work, but it will treat everything as UTF-8.
-
-3.  `hsmarkdown` is designed to be used as a drop-in replacement for
-    `Markdown.pl`.  It forces `pandoc` to convert from markdown to
-    HTML, and to use the `--strict` flag for maximal compliance with
-    official markdown syntax.  (All of Pandoc's syntax extensions and
-    variants, described below, are disabled.)  No other command-line
-    options are allowed.  (In fact, options will be interpreted as
-    filenames.)
-
-    As an alternative to using the `hsmarkdown` script, the
-    user may create a symbolic link to `pandoc` called `hsmarkdown`.
-    When invoked under the name `hsmarkdown`, `pandoc` will behave
-    as if the `--strict` flag had been selected, and no command-line
-    options will be recognized.  However, this approach does not work
-    under Cygwin, due to problems with its simulation of symbolic
-    links.
+`markdown2pdf`
+--------------
+
+The standard Pandoc installation includes `markdown2pdf`, a wrapper
+around `pandoc` and `pdflatex` that produces PDFs directly from markdown
+sources. The default behavior of `markdown2pdf` is to create a file with
+the same base name as the first argument and the extension `pdf`; thus,
+for example,
+
+    markdown2pdf sample.txt endnotes.txt
+
+will produce `sample.pdf`.  (If `sample.pdf` exists already,
+it will be backed up before being overwritten.)  An output file
+name can be specified explicitly using the `-o` option:
+
+    markdown2pdf -o book.pdf chap1 chap2
+
+If no input file is specified, input will be taken from stdin.
+All of `pandoc`'s options will work with `markdown2pdf` as well.
+
+`markdown2pdf` assumes that `pdflatex` is in the path.  It also
+assumes that the following LaTeX packages are available:
+`unicode`, `fancyhdr` (if you have verbatim text in footnotes),
+`graphicx` (if you use images), `array` (if you use tables),
+and `ulem` (if you use strikeout text).  If they are not already
+included in your LaTeX distribution, you can get them from
+[CTAN]. A full [TeX Live] or [MacTeX] distribution will have all of
+these packages.
+
+`hsmarkdown`
+------------
+
+A user who wants a drop-in replacement for `Markdown.pl` may create
+a symbolic link named `hsmarkdown` that points to the `pandoc`
+executable. When invoked under the name `hsmarkdown`, `pandoc` will
+behave as if the `--strict` flag had been selected, and no command-line
+options will be recognized. However, this approach does not work under
+Cygwin, due to problems with its simulation of symbolic links.
 
 [Cygwin]:  http://www.cygwin.com/ 
-[HTML Tidy]:  http://tidy.sourceforge.net/
 [`iconv`]: http://www.gnu.org/software/libiconv/
 [CTAN]: http://www.ctan.org "Comprehensive TeX Archive Network"
 [TeX Live]: http://www.tug.org/texlive/
@@ -562,8 +519,7 @@ Pandoc's markdown vs. standard markdown
 
 In parsing markdown, Pandoc departs from and extends [standard markdown]
 in a few respects.  Except where noted, these differences can
-be suppressed by specifying the `--strict` command-line option or by
-using the `hsmarkdown` wrapper.
+be suppressed by specifying the `--strict` command-line option.
 
 [standard markdown]:  http://daringfireball.net/projects/markdown/syntax
   "Markdown syntax description"
diff --git a/Setup.hs b/Setup.hs
index bd48dbe6e..7284202f2 100644
--- a/Setup.hs
+++ b/Setup.hs
@@ -51,7 +51,7 @@ makeManPages :: Args -> BuildFlags -> PackageDescription -> LocalBuildInfo -> IO
 makeManPages _ flags _ _ = mapM_ (makeManPage (fromFlag $ buildVerbosity flags)) manpages
 
 manpages :: [FilePath]
-manpages = ["pandoc.1", "hsmarkdown.1", "html2markdown.1", "markdown2pdf.1"]
+manpages = ["pandoc.1", "markdown2pdf.1"]
 
 manDir :: FilePath
 manDir = "man" </> "man1"
@@ -80,7 +80,7 @@ installScripts pkg lbi verbosity copy =
       (zip (repeat ".") (wrappers \\ exes))
     where exes = map exeName $ filter isBuildable $ executables pkg
           isBuildable = buildable . buildInfo
-          wrappers = ["html2markdown", "hsmarkdown", "markdown2pdf"]
+          wrappers = ["markdown2pdf"]
 
 installManpages :: PackageDescription -> LocalBuildInfo
                 -> Verbosity -> CopyDest -> IO ()
diff --git a/html2markdown b/html2markdown
deleted file mode 100755
index 0649e0478..000000000
--- a/html2markdown
+++ /dev/null
@@ -1,221 +0,0 @@
-#!/bin/sh -e
-# converts HTML from a URL, file, or stdin to markdown
-# uses an available program to fetch URL and tidy to normalize it first
-
-REQUIRED="tidy"
-SYNOPSIS="converts HTML from a URL, file, or STDIN to markdown-formatted text."
-
-THIS=${0##*/}
-
-NEWLINE='
-'
-
-err ()  { echo "$*"   | fold -s -w ${COLUMNS:-110} >&2; }
-errn () { printf "$*" | fold -s -w ${COLUMNS:-110} >&2; }
-
-usage () {
-    err "$1 - $2" # short description
-    err "See the $1(1) man page for usage."
-}
-
-# Portable which(1).
-pathfind () {
-    oldifs="$IFS"; IFS=':'
-    for _p in $PATH; do
-        if [ -x "$_p/$*" ] && [ -f "$_p/$*" ]; then
-            IFS="$oldifs"
-            return 0
-        fi
-    done
-    IFS="$oldifs"
-    return 1
-}
-
-for p in pandoc $REQUIRED; do
-    pathfind $p || {
-        err "You need '$p' to use this program!"
-        exit 1
-    }
-done
-
-CONF=$(pandoc --dump-args "$@" 2>&1) || {
-    errcode=$?
-    echo "$CONF" | sed -e '/^pandoc \[OPTIONS\] \[FILES\]/,$d' >&2
-    [ $errcode -eq 2 ] && usage "$THIS" "$SYNOPSIS"
-    exit $errcode
-}
-
-OUTPUT=$(echo "$CONF" | sed -ne '1p')
-ARGS=$(echo "$CONF" | sed -e '1d')
-
-
-grab_url_with () {
-    url="${1:?internal error: grab_url_with: url required}"
-
-    shift
-    cmdline="$@"
-
-    prog=
-    prog_opts=
-    if [ -n "$cmdline" ]; then
-	eval "set -- $cmdline"
-	prog=$1
-	shift
-	prog_opts="$@"
-    fi
-
-    if [ -z "$prog" ]; then
-	# Locate a sensible web grabber (note the order).
-	for p in wget lynx w3m curl links w3c; do
-		if pathfind $p; then
-		    prog=$p
-		    break
-		fi
-	done
-
-	[ -n "$prog" ] || {
-            errn "$THIS:  Couldn't find a program to fetch the file from URL "
-	    err "(e.g. wget, w3m, lynx, w3c, or curl)."
-	    return 1
-	}
-    else
-	pathfind "$prog" || {
-	    err "$THIS:  No such web grabber '$prog' found; aborting."
-	    return 1
-	}
-    fi
-
-    # Setup proper base options for known grabbers.
-    base_opts=
-    case "$prog" in
-    wget)  base_opts="-O-" ;;
-    lynx)  base_opts="-source" ;;
-    w3m)   base_opts="-dump_source" ;;
-    curl)  base_opts="" ;;
-    links) base_opts="-source" ;;
-    w3c)   base_opts="-n -get" ;;
-    *)     err "$THIS:  unhandled web grabber '$prog'; hope it succeeds."
-    esac
-
-    err "$THIS: invoking '$prog $base_opts $prog_opts $url'..."
-    eval "set -- $base_opts $prog_opts"
-    $prog "$@" "$url"
-}
-
-# Parse command-line arguments
-parse_arguments () {
-    while [ $# -gt 0 ]; do
-        case "$1" in
-            --encoding=*)
-                wholeopt="$1"
-                # extract encoding from after =
-                encoding="${wholeopt#*=}" ;;
-            -e|--encoding|-encoding)
-                shift
-                encoding="$1" ;; 
-            --grabber=*)
-                wholeopt="$1"
-                # extract encoding from after =
-                grabber="\"${wholeopt#*=}\"" ;;
-            -g|--grabber|-grabber)
-                shift
-                grabber="$1" ;; 
-            *)
-                if [ -z "$argument" ]; then
-                    argument="$1"
-                else
-                    err "Warning:  extra argument '$1' will be ignored."
-                fi ;;
-            esac
-        shift
-    done
-}
-
-argument=
-encoding=
-grabber=
-
-oldifs="$IFS"
-IFS=$NEWLINE
-parse_arguments $ARGS
-IFS="$oldifs"
-
-inurl=
-if [ -n "$argument" ] && ! [ -f "$argument" ]; then
-    # Treat given argument as an URL.
-    inurl="$argument"
-fi
-
-# As a security measure refuse to proceed if mktemp is not available.
-pathfind mktemp || { err "Couldn't find 'mktemp'; aborting."; exit 1;  }
-
-# Avoid issues with /tmp directory on Windows/Cygwin 
-cygwin=
-cygwin=$(uname | sed -ne '/^CYGWIN/p')
-if [ -n "$cygwin" ]; then
-    TMPDIR=.
-    export TMPDIR
-fi
-
-THIS_TEMPDIR=
-THIS_TEMPDIR="$(mktemp -d -t $THIS.XXXXXXXX)" || exit 1
-readonly THIS_TEMPDIR
-
-trap 'exitcode=$?
-      [ -z "$THIS_TEMPDIR" ] || rm -rf "$THIS_TEMPDIR"
-      exit $exitcode' 0 1 2 3 13 15
-
-if [ -n "$inurl" ]; then
-    err "Attempting to fetch file from '$inurl'..."
-
-    grabber_out=$THIS_TEMPDIR/grabber.out
-    grabber_log=$THIS_TEMPDIR/grabber.log
-    if ! grab_url_with "$inurl" "$grabber" 1>$grabber_out 2>$grabber_log; then
-        errn "grab_url_with failed"
-        if [ -f $grabber_log ]; then
-            err " with the following error log."
-            err
-            cat >&2 $grabber_log
-        else
-            err .
-        fi
-        exit 1
-    fi
-
-    argument="$grabber_out"
-fi
-
-if [ -z "$encoding" ] && [ "x$argument" != "x" ]; then
-    # Try to determine character encoding if not specified
-    # and input is not STDIN.
-    encoding=$(
-        head "$argument" |
-        LC_ALL=C tr 'A-Z' 'a-z' |
-        sed -ne '/<meta .*content-type.*charset=/ {
-            s/.*charset=["'\'']*\([-a-zA-Z0-9]*\).*["'\'']*/\1/p
-        }'
-    )
-fi
-
-if [ -n "$encoding" ] && pathfind iconv; then
-    alias to_utf8='iconv -f "$encoding" -t utf-8'
-else # assume UTF-8
-    alias to_utf8='cat'
-fi 
-
-htmlinput=$THIS_TEMPDIR/htmlinput
-
-if [ -z "$argument" ]; then
-    to_utf8 > $htmlinput                # read from STDIN
-elif [ -f "$argument" ]; then
-    to_utf8 "$argument" > $htmlinput    # read from file
-else
-    err "File '$argument' not found."
-    exit 1
-fi
-
-if ! cat $htmlinput | pandoc --ignore-args -r html -w markdown "$@" ; then
-     err "Failed to parse HTML.  Trying again with tidy..."
-     tidy -q -asxhtml -utf8 $htmlinput | \
-        pandoc --ignore-args -r html -w markdown "$@"
-fi
diff --git a/man/man1/hsmarkdown.1.md b/man/man1/hsmarkdown.1.md
deleted file mode 100644
index a197ef2ca..000000000
--- a/man/man1/hsmarkdown.1.md
+++ /dev/null
@@ -1,42 +0,0 @@
-% HSMARKDOWN(1) Pandoc User Manuals
-% John MacFarlane
-% January 8, 2008
-
-# NAME
-
-hsmarkdown - convert markdown-formatted text to HTML
-
-# SYNOPSIS
-
-hsmarkdown [*input-file*]...
-
-# DESCRIPTION
-
-`hsmarkdown` converts markdown-formatted text to HTML. It is designed
-to be usable as a drop-in replacement for John Gruber's `Markdown.pl`.
-
-If no *input-file* is specified, input is read from *stdin*.
-Otherwise, the *input-files* are concatenated (with a blank
-line between each) and used as input.  Output goes to *stdout* by
-default.  For output to a file, use shell redirection:
-
-    hsmarkdown input.txt > output.html
-
-`hsmarkdown` uses the UTF-8 character encoding for both input and output.
-If your local character encoding is not UTF-8, you should pipe input
-and output through `iconv`:
-
-    iconv -t utf-8 input.txt | hsmarkdown | iconv -f utf-8
-
-`hsmarkdown` is implemented as a wrapper around `pandoc`(1).  It
-calls `pandoc` with the options `--from markdown --to html
---strict` and disables all other options.  (Command-line options
-will be interpreted as filenames, as they are by `Markdown.pl`.)
-
-# SEE ALSO
-
-`pandoc`(1).  The *README*
-file distributed with Pandoc contains full documentation.
-
-The Pandoc source code and all documentation may be downloaded from
-<http://johnmacfarlane.net/pandoc/>.
diff --git a/man/man1/html2markdown.1.md b/man/man1/html2markdown.1.md
deleted file mode 100644
index 73e3420dd..000000000
--- a/man/man1/html2markdown.1.md
+++ /dev/null
@@ -1,95 +0,0 @@
-% HTML2MARKDOWN(1) Pandoc User Manuals
-% John MacFarlane and Recai Oktas
-% January 8, 2008
-
-# NAME
-
-html2markdown - converts HTML to markdown-formatted text
-
-# SYNOPSIS
-
-html2markdown [*pandoc-options*] [\-- *special-options*] [*input-file* or
-*URL*]
-
-# DESCRIPTION
-
-`html2markdown` converts *input-file* or *URL* (or text
-from *stdin*) from HTML to markdown-formatted plain text.
-If a URL is specified, `html2markdown` uses an available program
-(e.g. wget, w3m, lynx or curl) to fetch its contents.  Output is sent
-to *stdout* unless an output file is specified using the `-o`
-option.
-
-`html2markdown` uses the character encoding specified in the
-"Content-type" meta tag.  If this is not present, or if input comes
-from *stdin*, UTF-8 is assumed.  A character encoding may be specified
-explicitly using the `-e` special option.
-
-# OPTIONS
-
-`html2markdown` is a wrapper for `pandoc`, so all of
-`pandoc`'s options may be used.  See `pandoc`(1) for
-a complete list.  The following options are most relevant:
-
--s, \--standalone
-:   Include title, author, and date information (if present) at the
-    top of markdown output.
-
--o *FILE*, \--output=*FILE*
-:   Write output to *FILE* instead of *stdout*.
-
-\--strict
-:   Use strict markdown syntax, with no extensions or variants.
-
-\--reference-links
-:   Use reference-style links, rather than inline links, in writing markdown
-    or reStructuredText.
-
--R, \--parse-raw
-:   Parse untranslatable HTML codes as raw HTML.
-
-\--no-wrap
-:   Disable text wrapping in output.  (Default is to wrap text.)
-
--H *FILE*, \--include-in-header=*FILE*
-:   Include contents of *FILE* at the end of the header.  Implies
-    `-s`.
-
--B *FILE*, \--include-before-body=*FILE*
-:   Include contents of *FILE* at the beginning of the document body.
-
--A *FILE*, \--include-after-body=*FILE*
-:   Include contents of *FILE* at the end of the document body.
-
--C *FILE*, \--custom-header=*FILE*
-:   Use contents of *FILE*
-    as the document header (overriding the default header, which can be
-    printed using `pandoc -D markdown`).  Implies `-s`.
-
-# SPECIAL OPTIONS
-
-In addition, the following special options may be used.  The special
-options must be separated from the `html2markdown` command and any
-regular `pandoc` options by the delimiter \``--`', as in
-
-    html2markdown -o foo.txt -- -g 'curl -u bar:baz' -e latin1  \
-    www.foo.com
-
--e *encoding*, \--encoding=*encoding* 
-:   Assume the character encoding *encoding* in reading HTML.
-    (Note: *encoding* will be passed to `iconv`; a list of
-    available encodings may be obtained using `iconv -l`.)
-    If this option is not specified and input is not from
-    *stdin*, `html2markdown` will try to extract the character encoding
-    from the "Content-type" meta tag.  If no character encoding is
-    specified in this way, or if input is from *stdin*, UTF-8 will be
-    assumed.
-
--g *command*, \--grabber=*command*
-:   Use *command* to fetch the contents of a URL.  (By default,
-    `html2markdown` searches for an available program or text-based
-    browser to fetch the contents of a URL.)
-
-# SEE ALSO
-
-`pandoc`(1), `iconv`(1)
diff --git a/pandoc.cabal b/pandoc.cabal
index 4a2120079..57ad24b78 100644
--- a/pandoc.cabal
+++ b/pandoc.cabal
@@ -59,11 +59,10 @@ Data-Files:
                  -- documentation
                  README, INSTALL, COPYRIGHT, BUGS, changelog,
                  -- wrappers
-                 markdown2pdf, html2markdown, hsmarkdown
+                 markdown2pdf
 Extra-Source-Files:
                  -- sources for man pages
                  man/man1/pandoc.1.md, man/man1/markdown2pdf.1.md,
-                 man/man1/html2markdown.1.md, man/man1/hsmarkdown.1.md,
                  -- tests
                  tests/bodybg.gif,
                  tests/writer.latex,
@@ -120,8 +119,7 @@ Extra-Source-Files:
                  tests/lhs-test.html+lhs,
                  tests/lhs-test.fragment.html+lhs,
                  tests/RunTests.hs
-Extra-Tmp-Files: man/man1/pandoc.1, man/man1/hsmarkdown.1,
-                 man/man1/html2markdown.1, man/man1/markdown2pdf.1
+Extra-Tmp-Files: man/man1/pandoc.1, man/man1/markdown2pdf.1
 
 Flag highlighting
   Description:   Compile in support for syntax highlighting of code blocks.
@@ -130,7 +128,7 @@ Flag executable
   Description:   Build the pandoc executable.
   Default:       True
 Flag wrappers
-  Description:   Build the wrappers (hsmarkdown, markdown2pdf).
+  Description:   Build the wrappers (markdown2pdf).
   Default:       True
 Flag library
   Description:   Build the pandoc library.
@@ -219,17 +217,6 @@ Executable pandoc
   else
     Buildable:      False
 
-Executable hsmarkdown
-  Hs-Source-Dirs:     src
-  Main-Is:            hsmarkdown.hs
-  Ghc-Options:        -Wall -threaded
-  Ghc-Prof-Options:   -auto-all
-  Extensions:         CPP
-  if flag(wrappers)
-    Buildable:      True
-  else
-    Buildable:      False
-
 Executable markdown2pdf
   Hs-Source-Dirs:     src
   Main-Is:            markdown2pdf.hs
diff --git a/src/hsmarkdown.hs b/src/hsmarkdown.hs
deleted file mode 100644
index 3f689d4ec..000000000
--- a/src/hsmarkdown.hs
+++ /dev/null
@@ -1,47 +0,0 @@
-{-
-Copyright (C) 2006-8 John MacFarlane <jgm@berkeley.edu>
-
-This program is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 2 of the License, or
-(at your option) any later version.
-
-This program is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with this program; if not, write to the Free Software
-Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
--}
-
-{- |
-   Copyright   : Copyright (C) 2009 John MacFarlane
-   License     : GNU GPL, version 2 or above
-
-   Maintainer  : John MacFarlane <jgm@berkeley@edu>
-   Stability   : alpha
-   Portability : portable
-
-Wrapper around pandoc that emulates Markdown.pl as closely as possible.
--}
-module Main where
-import System.Process
-import System.Environment ( getArgs )
--- Note: ghc >= 6.12 (base >=4.2) supports unicode through iconv
--- So we use System.IO.UTF8 only if we have an earlier version
-#if MIN_VERSION_base(4,2,0)
-#else
-import Prelude hiding ( putStr, putStrLn, writeFile, readFile, getContents )
-import System.IO.UTF8
-#endif
-import Control.Monad (forM_)
-
-main :: IO ()
-main = do
-    files <- getArgs
-    let runPandoc inp = readProcess "pandoc" ["--from", "markdown", "--to", "html", "--strict"] inp >>= putStrLn
-    if null files
-       then getContents >>= runPandoc
-       else forM_ files $ \f -> readFile f >>= runPandoc
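
The `hsmarkdown` section of the README in this patch relies on a program changing its behavior according to the name it was invoked under. The following is a minimal shell sketch of that idea only, not pandoc's actual implementation. The function name `mode_for_name` and the mode strings are hypothetical; the basename idiom is the same one the removed `html2markdown` script used to compute `THIS`.

```shell
#!/bin/sh
# Hypothetical sketch: choose a mode from the name a program was invoked as.
# This is NOT pandoc's code; it only illustrates dispatching on the basename.

mode_for_name () {
    # Strip any leading directory components, as in THIS=${0##*/}.
    name="${1##*/}"
    case "$name" in
        hsmarkdown) echo "strict" ;;   # invoked through a link named hsmarkdown
        *)          echo "default" ;;  # invoked under any other name
    esac
}

# Example calls with fixed paths; a real program would pass "$0" instead.
mode_for_name /usr/local/bin/hsmarkdown
mode_for_name pandoc
```

In a real script the argument would be `"$0"`, so that a symbolic link named `hsmarkdown` pointing at the program selects the strict mode without any command-line flag.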