-rw-r--r-- | Makefile | 2
-rw-r--r-- | README | 18
-rw-r--r-- | man/man1/html2markdown.1 (renamed from man/man1/web2markdown.1) | 20
-rw-r--r-- | man/man1/pandoc.1 | 4
-rw-r--r-- | src/wrappers/html2markdown.in (renamed from src/wrappers/web2markdown.in) | 4
-rw-r--r-- | web/demos.sh | 2
-rw-r--r-- | web/index.txt | 2
7 files changed, 26 insertions, 26 deletions
diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -25,7 +25,7 @@ EXECSBASE := $(shell sed -ne 's/^[Ee]xecutable:[[:space:]]*//p' $(CABAL).in)
 #-------------------------------------------------------------------------------
 # Install targets
 #-------------------------------------------------------------------------------
-WRAPPERS := web2markdown markdown2pdf
+WRAPPERS := html2markdown markdown2pdf
 # Add .exe extensions if we're running Windows/Cygwin.
 EXTENSION := $(shell uname | tr '[:upper:]' '[:lower:]' | \
              sed -ne 's/^cygwin.*$$/\.exe/p')
diff --git a/README b/README
--- a/README
+++ b/README
@@ -38,14 +38,14 @@ Requirements
 The `pandoc` program itself does not depend on any external libraries
 or programs.
 
-The wrapper script `web2markdown` requires
+The wrapper script `html2markdown` requires
 
 - `pandoc` (which must be in the PATH)
 - a POSIX-compliant shell (installed by default on all linux and unix
   systems, including Mac OS X, and in [Cygwin] for Windows),
 - `HTML Tidy`
 - `iconv` (for character encoding conversion). (If `iconv` is absent,
-  `web2markdown` will still work, but it will treat everything as UTF-8.)
+  `html2markdown` will still work, but it will treat everything as UTF-8.)
 
 [Cygwin]: http://www.cygwin.com/
 [HTML Tidy]: http://tidy.sourceforge.net/
@@ -117,7 +117,7 @@ But for simple documents it should be adequate.
 The `latex` and `html` readers are also limited in what they can do. Because
 the `html` reader is picky about the HTML it parses, it is recommended that
 you pipe HTML through [HTML Tidy] before sending it to `pandoc`, or use the
-`web2markdown` script described below.
+`html2markdown` script described below.
 
 If you don't specify a reader or writer explicitly, `pandoc` will
 try to determine the input and output format from the extensions of
@@ -151,10 +151,10 @@ The shell scripts (described below) automatically convert the input
 from the local encoding to UTF-8 before running them through `pandoc`,
 then convert the output back to the local encoding.
 
-`markdown2pdf` and `web2markdown`
-=================================
+`markdown2pdf` and `html2markdown`
+==================================
 
-Two shell scripts, `markdown2pdf` and `web2markdown`, are included in
+Two shell scripts, `markdown2pdf` and `html2markdown`, are included in
 the standard Pandoc installation. (They are not included in the Windows
 binary package, as they require a POSIX shell, but they may be used
 in Windows under Cygwin.)
@@ -175,19 +175,19 @@ in Windows under Cygwin.)
 
     If no input file is specified, input will be taken from STDIN.
 
-2. `web2markdown` grabs a web page from a file or URL and converts
+2. `html2markdown` grabs a web page from a file or URL and converts
     it to markdown-formatted text, using `tidy` and `pandoc`.
 
     Unless input is from STDIN, an attempt is made to determine the
     character encoding of the page from the "Content-type" meta tag.
     If this is not present, UTF-8 is assumed. Alternatively, a
     character encoding may be specified explicitly using the `-e` option.
 
-    `web2markdown` searches for an available program (`wget`, `curl`,
+    `html2markdown` searches for an available program (`wget`, `curl`,
     or a text-mode browser) to fetch the contents of a URL.
     Optionally, the `-g` command may be used to specify the command to be used:
 
-        web2markdown -g 'wget --user=foo --password=bar' mysite.com
+        html2markdown -g 'wget --user=foo --password=bar' mysite.com
 
 Command-line options
 ====================
diff --git a/man/man1/web2markdown.1 b/man/man1/html2markdown.1
index 242b50671..413feb115 100644
--- a/man/man1/web2markdown.1
+++ b/man/man1/html2markdown.1
@@ -1,22 +1,22 @@
-.TH WEB2MARKDOWN 1 "December 15, 2006" Pandoc "User Manuals"
+.TH HTML2MARKDOWN 1 "December 15, 2006" Pandoc "User Manuals"
 .SH NAME
-web2markdown \- converts HTML to markdown-formatted text
+html2markdown \- converts HTML to markdown-formatted text
 .SH SYNOPSIS
-\fBweb2markdown\fR [\fIoptions\fR] [\fIinput\-file\fR or \fIURL\fR]
+\fBhtml2markdown\fR [\fIoptions\fR] [\fIinput\-file\fR or \fIURL\fR]
 .SH DESCRIPTION
-\fBweb2markdown\fR converts \fIinput\-file\fR or \fIURL\fR (or text
+\fBhtml2markdown\fR converts \fIinput\-file\fR or \fIURL\fR (or text
 from STDIN) from HTML to markdown\-formatted plain text.
-If a URL is specified, \fBweb2markdown\fR uses an available program
+If a URL is specified, \fBhtml2markdown\fR uses an available program
 (e.g. wget, w3m, lynx or curl) to fetch its contents.
 Output is sent to STDOUT unless an output file is specified using
 the \fB\-o\fR option.
 .PP
-\fBweb2markdown\fR uses the character encoding specified in the
+\fBhtml2markdown\fR uses the character encoding specified in the
 "Content-type" meta tag. If this is not present, or if input comes
 from STDIN, UTF-8 is assumed. A character encoding may be specified
 explicitly using the \fB\-e\fR option.
 .PP
-\fBweb2markdown\fR is a wrapper for \fBpandoc\fR.
+\fBhtml2markdown\fR is a wrapper for \fBpandoc\fR.
 .SH OPTIONS
 .TP
 .B \-s, \-\-standalone
@@ -62,17 +62,17 @@ Assume the character encoding \fIencoding\fR in reading HTML.
 (Note: \fIencoding\fR will be passed to \fBiconv\fR; a list of
 available encodings may be obtained using `\fBiconv \-l\fR'.)
 If the \fB\-e\fR option is not specified and input is not from
-STDIN, \fBweb2markdown\fR will try to extract the character encoding
+STDIN, \fBhtml2markdown\fR will try to extract the character encoding
 from the "Content-type" meta tag. If no character encoding is
 specified in this way, or if input is from STDIN, UTF-8 will be
 assumed.
 .TP
 .B \-g \fIcommand\fR
 Use \fIcommand\fR to fetch the contents of a URL. (By default,
-\fBweb2markdown\fR searches for an available program or text-based
+\fBhtml2markdown\fR searches for an available program or text-based
 browser to fetch the contents of a URL.) For example:
 .IP
-web2markdown \-g 'wget \-\-user=foo \-\-password=bar' mysite.com
+html2markdown \-g 'wget \-\-user=foo \-\-password=bar' mysite.com
 
 .SH "SEE ALSO"
 \fBpandoc\fR(1),
diff --git a/man/man1/pandoc.1 b/man/man1/pandoc.1
index f6280f463..a955e9e8a 100644
--- a/man/man1/pandoc.1
+++ b/man/man1/pandoc.1
@@ -41,7 +41,7 @@ and output through \fBiconv\fR:
 .PP
 \fIPandoc\fR's HTML parser is not very forgiving. If your input is
 HTML, consider running it through \fBtidy\fR(1) before passing it
-to Pandoc. Or use \fBweb2markdown\fR(1), a wrapper around \fBpandoc\fR.
+to Pandoc. Or use \fBhtml2markdown\fR(1), a wrapper around \fBpandoc\fR.
 
 .SH OPTIONS
 .TP
@@ -151,7 +151,7 @@ Print version.
 Show usage message.
 
 .SH "SEE ALSO"
-\fBweb2markdown\fR(1),
+\fBhtml2markdown\fR(1),
 \fBmarkdown2pdf\fR(1).
 The
 .I README
diff --git a/src/wrappers/web2markdown.in b/src/wrappers/html2markdown.in
index 89e884c3d..740d69588 100644
--- a/src/wrappers/web2markdown.in
+++ b/src/wrappers/html2markdown.in
@@ -72,7 +72,7 @@ grabber=
 while [ $# -gt 0 ]; do
     case "$1" in
     -h|--help)
-        pandoc -h 2>&1 | sed -e 's/pandoc/web2markdown/' \
+        pandoc -h 2>&1 | sed -e 's/pandoc/html2markdown/' \
         -e '/^[[:space:]]*\(-f\|-t\|-S\|-N\|-m\|-i\|-c\|-T\|-D\|-d\)/,/./d'\
         1>&2
         err " -e ENCODING, --encoding=ENCODING"
@@ -81,7 +81,7 @@ while [ $# -gt 0 ]; do
         err " Specify command to be used to grab contents of URL"
         exit 0 ;;
     -v|--version)
-        pandoc -v 2>&1 | sed -e 's/pandoc/web2markdown/' 1>&2
+        pandoc -v 2>&1 | sed -e 's/pandoc/html2markdown/' 1>&2
         exit 0 ;;
     -e)
         shift
diff --git a/web/demos.sh b/web/demos.sh
index 6c6a2a698..bd87151d5 100644
--- a/web/demos.sh
+++ b/web/demos.sh
@@ -14,7 +14,7 @@ pandoc -s README.tex -o demo0.txt
 pandoc -s -w rst README -o demo0.txt
 pandoc -s README -o demo0.rtf
 pandoc -s -m -i -w s5 S5DEMO -o demo0.html
-web2markdown http://www.gnu.org/software/make/ -o demo0.txt
+html2markdown http://www.gnu.org/software/make/ -o demo0.txt
 markdown2pdf README -o demo0.pdf
 markdown2pdf -C myheader.tex README -o demo0.pdf'
 
diff --git a/web/index.txt b/web/index.txt
index 9fb86a4d9..024133487 100644
--- a/web/index.txt
+++ b/web/index.txt
@@ -35,7 +35,7 @@ you should extract from the zip archive and put somewhere in your PATH).
 See the included file `README-WINDOWS.txt` for instructions on using the
 program. Note: If you use [Cygwin], we recommend that you compile
 Pandoc from source. This will give you access to the
-wrapper scripts `markdown2pdf` and `web2markdown`, which are not
+wrapper scripts `markdown2pdf` and `html2markdown`, which are not
 included in the Windows binary package.
 
 [`@TARBALL_NAME@`]: http://pandoc.googlecode.com/files/@TARBALL_NAME@
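
The README text touched by this diff says that `html2markdown` searches for an available program (`wget`, `curl`, or a text-mode browser) to fetch a URL. The wrapper's actual lookup code is not part of this diff, so the following is only a minimal sketch of how such a search might look in POSIX shell. The helper name `first_available` and the candidate order are assumptions made for illustration, not taken from the script.

```shell
#!/bin/sh
# Hypothetical helper (not from the html2markdown source): print the first
# argument that `command -v` can resolve to a runnable command, and return
# non-zero if none of the arguments resolves.
first_available() {
    for prog in "$@"; do
        if command -v "$prog" >/dev/null 2>&1; then
            printf '%s\n' "$prog"
            return 0
        fi
    done
    return 1
}

# The candidate list and its order are assumptions for illustration.
if grabber=$(first_available wget curl lynx w3m); then
    echo "would fetch URLs with: $grabber"
else
    echo "no fetch program found" >&2
fi
```

A user-supplied command (as with the `-g` option described in the README) would presumably bypass a search like this, though how the real wrapper combines the two is not shown here.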