-rw-r--r--  README                    30
-rw-r--r--  debian/changelog           3
-rw-r--r--  man/man1/html2markdown.1  41
-rw-r--r--  man/man1/markdown2pdf.1   19
4 files changed, 45 insertions, 48 deletions
@@ -176,20 +176,32 @@ may be used in Windows under Cygwin.)
 	    markdown2pdf -o "My Book.pdf" chap1.txt chap2.txt chap3.txt
 
     If no input file is specified, input will be taken from STDIN.
+    All of `pandoc`'s options will work with `markdown2pdf` as well.
 
 2.  `html2markdown` grabs a web page from a file or URL and converts
     it to markdown-formatted text, using `tidy` and `pandoc`.
-    Unless input is from STDIN, an attempt is made to determine the
-    character encoding of the page from the "Content-type" meta tag.
-    If this is not present, UTF-8 is assumed.  Alternatively, a character
-    encoding may be specified explicitly using the `-e` option.
-    `html2markdown` searches for an available program (`wget`, `curl`,
-    or a text-mode browser) to fetch the contents of a URL.
-    Optionally, the `-g` command may be used to specify the command
-    to be used:
+    All of `pandoc`'s options will work with `html2markdown` as well.
+    In addition, the following special options may be used.
+    The special options must be separated from the `html2markdown`
+    command and any regular Pandoc options by the delimiter `--`:
 
-        html2markdown -g 'wget --user=foo --password=bar' mysite.com
+        html2markdown -o out.txt -- -e latin1 -g curl google.com
+
+    The `-e` or `--encoding` option specifies the character encoding
+    of the HTML input.  If this option is not specified, and input
+    is not from STDIN, `html2markdown` will attempt to determine the
+    page's character encoding from the "Content-type" meta tag.
+    If this is not present, UTF-8 is assumed.
+
+    The `-g` or `--grabber` option specifies the command to be used to
+    fetch the contents of a URL:
+
+        html2markdown -g 'curl --user foo:bar' www.mysite.com
+
+    If this option is not specified, `html2markdown` searches for an
+    available program (`wget`, `curl`, or a text-mode browser) to fetch
+    the contents of a URL.
 
 3.  `hsmarkdown` is designed to be used as a drop-in replacement for
     `Markdown.pl`.  It forces `pandoc` to convert from markdown to
diff --git a/debian/changelog b/debian/changelog
index a06c40579..8ce9acc47 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -210,9 +210,6 @@ pandoc (0.3) unstable; urgency=low
     + getopts shell builtin is used for portable option parsing.
     + Improved html2markdown's web grabber code, making it more robust,
       configurable and verbose.  Added '-e', '-g' options.
-      Possible use case:
-        # Use wget by setting timeout to 10 seconds and limit retries to 2.
-        html2markdown -g 'wget --timeout=10 --tries=2'
 
  -- Recai Oktaş <roktas@debian.org>  Fri, 05 Jan 2007 09:41:19 +0200
 
diff --git a/man/man1/html2markdown.1 b/man/man1/html2markdown.1
index 542d26852..78c27808e 100644
--- a/man/man1/html2markdown.1
+++ b/man/man1/html2markdown.1
@@ -2,7 +2,8 @@
 .SH NAME
 html2markdown \- converts HTML to markdown-formatted text
 .SH SYNOPSIS
-\fBhtml2markdown\fR [\fIoptions\fR] [\fIinput\-file\fR or \fIURL\fR]
+\fBhtml2markdown\fR [\fIpandoc\-options\fR]
+[\-\- \fIspecial\-options\fR] [\fIinput\-file\fR or \fIURL\fR]
 .SH DESCRIPTION
 \fBhtml2markdown\fR converts \fIinput\-file\fR or \fIURL\fR (or text
 from STDIN) from HTML to markdown\-formatted plain text.
@@ -14,10 +15,12 @@ option.
 \fBhtml2markdown\fR uses the character encoding specified in the
 "Content-type" meta tag.  If this is not present, or if input comes
 from STDIN, UTF-8 is assumed.  A character encoding may be specified
-explicitly using the \fB\-e\fR option.
-.PP
-\fBhtml2markdown\fR is a wrapper for \fBpandoc\fR.
+explicitly using the \fB\-e\fR special option.
 .SH OPTIONS
+.PP
+\fBhtml2markdown\fR is a wrapper for \fBpandoc\fR, so all of
+\fBpandoc\fR's options may be used.  See \fBpandoc\fR(1) for
+a complete list.  The following options are most relevant:
 .TP
 .B \-s, \-\-standalone
 Include title, author, and date information (if present) at the
@@ -26,12 +29,6 @@ top of markdown output.
 .B \-o FILE, \-\-output=FILE
 Write output to \fIFILE\fR instead of STDOUT.
 .TP
-.B \-p, \-\-preserve-tabs
-Preserve tabs instead of converting them to spaces.
-.TP
-.B \-\-tab-stop=\fITABSTOP\fB
-Specify tab stop (default is 4).
-.TP
 .B \-\-strict
 Use strict markdown syntax, with no extensions or variants.
 .TP
@@ -54,29 +51,29 @@ Use contents of \fIFILE\fR
 as the document header (overriding the default header, which can be
 printed using '\fBpandoc \-D markdown\fR').  Implies
 \fB-s\fR.
+.SH "SPECIAL OPTIONS"
+.PP
+In addition, the following special options may be used.  The special
+options must be separated from the \fBhtml2markdown\fR command and any
+regular \fBpandoc\fR options by the delimiter `\-\-', as in
+.IP
+.B html2markdown \-o foo.txt \-\- \-g 'curl \-u bar:baz' \-e latin1
+.B www.foo.com
 .TP
-.B \-v, \-\-version
-Print version.
-.TP
-.B \-h, \-\-help
-Show usage message.
-.TP
-.B \-e \fIencoding\fR
+.B \-e \fIencoding\fR, \-\-encoding=\fIencoding\fR
 Assume the character encoding \fIencoding\fR in reading HTML.
 (Note: \fIencoding\fR will be passed to \fBiconv\fR; a list of
 available encodings may be obtained using `\fBiconv \-l\fR'.)
-If the \fB\-e\fR option is not specified and input is not from
+If this option is not specified and input is not from
 STDIN, \fBhtml2markdown\fR will try to extract the character encoding
 from the "Content-type" meta tag.  If no character encoding is
 specified in this way, or if input is from STDIN, UTF-8 will be
 assumed.
 .TP
-.B \-g \fIcommand\fR
+.B \-g \fIcommand\fR, \-\-grabber=\fIcommand\fR
 Use \fIcommand\fR to fetch the contents of a URL.  (By default,
 \fBhtml2markdown\fR searches for an available program or text-based
-browser to fetch the contents of a URL.)  For example:
-.IP
-html2markdown \-g 'wget \-\-user=foo \-\-password=bar' mysite.com
+browser to fetch the contents of a URL.)
 .SH "SEE ALSO"
 \fBpandoc\fR(1),
diff --git a/man/man1/markdown2pdf.1 b/man/man1/markdown2pdf.1
index 4524c0ac2..3162742bb 100644
--- a/man/man1/markdown2pdf.1
+++ b/man/man1/markdown2pdf.1
@@ -23,19 +23,16 @@ output through \fBiconv\fR:
 \fBmarkdown2pdf\fR assumes that the 'unicode' and 'fancyvrb' packages
 are in latex's search path.  If these packages are not included in your
 latex setup, they can be obtained from <http://ctan.org>.
-.PP
-\fBmarkdown2pdf\fR is a wrapper around \fBpandoc\fR.
 .SH OPTIONS
+.PP
+\fBmarkdown2pdf\fR is a wrapper around \fBpandoc\fR, so all of
+\fBpandoc\fR's options can be used with \fBmarkdown2pdf\fR as well.
+See \fBpandoc\fR(1) for a complete list.
+The following options are most relevant:
 .TP
 .B \-o FILE, \-\-output=FILE
 Write output to \fIFILE\fR.
 .TP
-.B \-p, \-\-preserve-tabs
-Preserve tabs instead of converting them to spaces.
-.TP
-.B \-\-tab-stop=\fITABSTOP\fB
-Specify tab stop (default is 4).
-.TP
 .B \-\-strict
 Use strict markdown syntax, with no extensions or variants.
 .TP
@@ -57,12 +54,6 @@ Include (LaTeX) contents of \fIFILE\fR at the end of the document body.
 Use contents of \fIFILE\fR
 as the LaTeX document header (overriding the default header, which can be
 printed using '\fBpandoc \-D latex\fR').  Implies \fB-s\fR.
-.TP
-.B \-v, \-\-version
-Print version.
-.TP
-.B \-h, \-\-help
-Show usage message.
 .SH "SEE ALSO"
 \fBpandoc\fR(1),
 \fBpdflatex\fR(1)
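Note on the `--` delimiter documented in this change: the README and man page say that the special options must follow a `--` that separates them from the regular pandoc options. The sketch below illustrates one way a POSIX shell wrapper could split its arguments at that delimiter. It is not taken from the actual html2markdown script; the function name `split_args` and the variables `pandoc_opts` and `special_opts` are hypothetical, and the space-joined strings do not preserve quoting of arguments that contain spaces.

```shell
#!/bin/sh
# Hypothetical helper (not from html2markdown itself): split "$@" at the
# first `--`.  Arguments before it are collected as pandoc options;
# arguments after it (special options plus the input file or URL) are
# collected separately.  Joining with spaces is a simplification and
# loses the quoting of arguments that contain whitespace.
split_args() {
    pandoc_opts=""
    special_opts=""
    seen_delim=0
    for arg in "$@"; do
        if [ "$seen_delim" -eq 0 ] && [ "$arg" = "--" ]; then
            seen_delim=1
        elif [ "$seen_delim" -eq 0 ]; then
            pandoc_opts="${pandoc_opts:+$pandoc_opts }$arg"
        else
            special_opts="${special_opts:+$special_opts }$arg"
        fi
    done
}

# Arguments taken from the README example in this diff.
split_args -o out.txt -- -e latin1 -g curl google.com
echo "pandoc options:  $pandoc_opts"
echo "special options: $special_opts"
```

With the README's example arguments, `-o out.txt` lands in the first group and `-e latin1 -g curl google.com` in the second. A real wrapper would more likely keep the two groups as separate positional lists so that quoted grabber commands such as 'curl --user foo:bar' survive intact.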
