.TH HTML2MARKDOWN 1 "November 21, 2006" Pandoc "User Manuals"
.SH NAME
html2markdown \- converts HTML to markdown\-formatted text
.SH SYNOPSIS
\fBhtml2markdown\fR [\fIoptions\fR] [\fIinput\-file\fR or \fIURL\fR]
[\fB\-\-\fR] [\fIpandoc\-opts\fR]
.SH DESCRIPTION
\fBhtml2markdown\fR converts \fIinput\-file\fR or \fIURL\fR (or text
from STDIN) from HTML to markdown\-formatted plain text. 
If a URL is specified, \fBhtml2markdown\fR uses an available program
(e.g. wget, w3m, lynx or curl) to fetch its contents.  Output is sent
to STDOUT.
.PP
\fBhtml2markdown\fR is a wrapper for \fBpandoc\fR.
.SH OPTIONS
.TP
.B \-h
Show usage message.
.TP
.B \-e \fIencoding\fR
Assume the character encoding \fIencoding\fR when reading the HTML.
(Note: \fIencoding\fR will be passed to \fBiconv\fR; a list of
available encodings may be obtained using `\fBiconv \-l\fR'.)
If the \fB\-e\fR option is not specified, the encoding will be
determined as follows:  If input is from STDIN, the local encoding
will be assumed.  Otherwise, \fBhtml2markdown\fR will try to
extract the character encoding from the "Content\-type" meta tag.
If no character encoding is specified in this way, UTF\-8 will be
assumed for a URL argument, and the local encoding will be assumed
for a file argument.
.TP
.B \-g \fIcommand\fR
Use \fIcommand\fR to fetch the contents of a URL.  (By default,
\fBhtml2markdown\fR searches for an available program or text\-based
browser to fetch the contents of a URL.)  For example:
.IP
html2markdown \-g 'wget \-\-user=foo \-\-password=bar' mysite.com
.TP
.B \-n
Disable automatic fetching of contents when URLs are specified as
arguments.
.TP
.I pandoc\-opts
Any options appearing after \fIinput\-file\fR or \fIURL\fR on the
command line will be passed directly to \fBpandoc\fR.  If no
\fIinput\-file\fR or \fIURL\fR is specified, these options must
be preceded by ` \fB\-\-\fR '.  (In other cases, ` \fB\-\-\fR ' is
optional.)  See \fBpandoc\fR(1) for a list of options that may be used.
Example:
.IP
html2markdown input.txt \-\- \-R
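.SH EXAMPLES
The following invocations are illustrative sketches that combine the
options described above; the file name \fIpage.html\fR and the encoding
name are assumptions chosen for the example, not defaults.
.PP
Convert a local file, stating its character encoding explicitly:
.IP
html2markdown \-e ISO\-8859\-1 page.html
.PP
Convert HTML read from STDIN, passing \fB\-R\fR through to \fBpandoc\fR.
Because no \fIinput\-file\fR or \fIURL\fR is given, the \fBpandoc\fR
options are preceded by ` \fB\-\-\fR ':
.IP
html2markdown \-\- \-R < page.html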
.SH "SEE ALSO"
\fBpandoc\fR(1),
\fBmarkdown2html\fR(1),
\fBmarkdown2latex\fR(1),
\fBlatex2markdown\fR(1),
\fBmarkdown2pdf\fR(1),
\fBiconv\fR(1)
.SH AUTHOR
John MacFarlane and Recai Oktas