8 files changed, 114 insertions, 572 deletions
diff --git a/changelog b/changelog
index 6bb057bbc..88f086127 100644
--- a/changelog
+++ b/changelog
@@ -2,20 +2,6 @@ pandoc (1.7)
 
   [new features]
 
-  * New `textile` reader and writer.  Thanks to Paul Rivier for contributing
-    the `textile` reader, an almost complete implementation of the textile
-    syntax used by the ruby [RedCloth library](http://redcloth.org/textile).
-    Resolves Issue #51.
-
-  * New `org` writer, for Emacs Org-mode, contributed by Puneeth Chaganti.
-
-  * New `json` reader and writer, for reading and writing a JSON
-    representation of the native Pandoc AST.  These are much faster
-    than the `native` reader and writer, and should be used for
-    serializing Pandoc to text.  To convert between the JSON representation
-    and native Pandoc, use `encodeJSON` and `decodeJSON` from
-    `Text.JSON.Generic`.
-
   * Support for citations using Andrea Rossato's `citeproc-hs` 0.3.
     You can now write, for example,
 
@@ -37,6 +23,20 @@ pandoc (1.7)
     syntax, and in the LaTeX reader, using natbib or biblatex syntax.
     (Thanks to Nathan Gass for the natbib and biblatex support.)
 
+  * New `textile` reader and writer.  Thanks to Paul Rivier for contributing
+    the `textile` reader, an almost complete implementation of the textile
+    syntax used by the ruby [RedCloth library](http://redcloth.org/textile).
+    Resolves Issue #51.
+
+  * New `org` writer, for Emacs Org-mode, contributed by Puneeth Chaganti.
+
+  * New `json` reader and writer, for reading and writing a JSON
+    representation of the native Pandoc AST.  These are much faster
+    than the `native` reader and writer, and should be used for
+    serializing Pandoc to text.  To convert between the JSON representation
+    and native Pandoc, use `encodeJSON` and `decodeJSON` from
+    `Text.JSON.Generic`.
+
   * A new `--mathjax` option has been added for displaying
     math in HTML using MathJax.  Resolves issue #259.
 
@@ -68,11 +68,15 @@ pandoc (1.7)
   * Made `--smart` work in HTML, RST, and Textile readers, as well
     as markdown.
 
+  * Added `--html5` option for HTML5 output.
+
   * Added support for listings package in LaTeX reader
     (Puneeth Chaganti).
 
   * Added support for simple tables in the LaTeX reader.
 
+  * Added support for simple tables in the HTML reader.
+
   * Significant performance improvements in many readers and writers.
 
   [API and program changes]
@@ -109,6 +113,9 @@ pandoc (1.7)
     resulting HTML using `xss-sanitize`, which is based on pandoc's
     sanitization, but improved.
 
+  * Added support for `lang` in `html` tag in the HTML template,
+    so you can do `pandoc -s -V lang=es`, for example.
+
   * Added `Text.Pandoc.Pretty`. This is better suited for pandoc than the
     `pretty` package.  Changed all writers that used
     `Text.PrettyPrint.HughesPJ` to use `Text.Pandoc.Pretty` instead.
@@ -118,7 +125,7 @@ pandoc (1.7)
 
   * `Text.Pandoc.Shared`:
 
-    + Added `writerColumns` to `WriterOptions`.
+    + Added `writerColumns` and `writerHtml5` to `WriterOptions`.
     + Added `normalize`.
     + Removed unneeded prettyprinting functions:
       `wrapped`, `wrapIfNeeded`, `wrappedTeX`, `wrapTeXIfNeeded`, `hang'`,
@@ -186,10 +193,16 @@ pandoc (1.7)
   [Under-the-hood improvements]
 
   * Completely rewrote HTML reader using tagsoup as a lexer. The
-    new reader is faster and more accurate.
+    new reader is faster and more accurate.  Unlike the
+    old reader, it does not get bogged down on some input
+    (Issues #277, #255). And it handles namespaces in tags
+    (Issue #274).
 
   * Replaced `escapeStringAsXML` with a faster version.
 
+  * Simplified `Text.Pandoc.CharacterReferences` by using
+    entity lookup functions from TagSoup.
+
   * Remove duplications in documentation by generating the
     pandoc man page from README, using `MakeManPage.hs`.
 
@@ -218,6 +231,10 @@ pandoc (1.7)
     Now they are parsed as `Quoted` inlines, if `--smart` is specified.
     Resolves Issue #270.
 
+  * `Text.Pandoc.Parsing`: Fixed bug in grid table parser.
+    Spaces at end of line were not being stripped properly,
+    resulting in unintended LineBreaks.
+
   * Markdown reader:
 
     + Allow HTML comments as inline elements in markdown.
@@ -239,6 +256,11 @@ pandoc (1.7)
     + Allow spaces between '\begin' or '\end' and '{'.
     + Support \L and \l.
 
+  * LaTeX writer:
+
+    + Escape strings in \href{..}.
+    + In nonsimple tables, put cells in \parbox.
+
   * OpenDocument writer:  don't print raw TeX.
 
   * Markdown writer: Fixed bug in `Image`.  URI was getting unescaped twice!
@@ -658,7 +680,8 @@ pandoc (1.5)
     + Removed stLink, link template variable. Reason: we now always
       include hyperref in the template.
 
-  * Latex template:
+  * LaTeX template:
+
     + Only show \author if there are some.
     + Always include hyperref package. It is used not just for links but
       for toc, section heading bookmarks, footnotes, etc. Also added
diff --git a/pandoc.cabal b/pandoc.cabal
index 63a6f7fa9..b3394bc91 100644
--- a/pandoc.cabal
+++ b/pandoc.cabal
@@ -1,6 +1,6 @@
 Name:            pandoc
 Version:         1.7
-Cabal-Version:   >= 1.2
+Cabal-Version:   >= 1.6
 Build-Type:      Custom
 License:         GPL
 License-File:    COPYING
@@ -83,10 +83,22 @@ Extra-Source-Files:
                  tests/insert,
                  tests/lalune.jpg,
                  tests/movie.jpg,
+                 tests/biblio.bib,
+                 tests/chicago-author-date.csl,
+                 tests/ieee.csl,
+                 tests/mhra.csl,
                  tests/latex-reader.latex,
                  tests/latex-reader.native,
+                 tests/biblatex-citations.latex,
+                 tests/natbib-citations.latex,
+                 tests/textile-reader.textile,
+                 tests/textile-reader.native,
                  tests/markdown-reader-more.txt,
                  tests/markdown-reader-more.native,
+                 tests/markdown-citations.txt,
+                 tests/markdown-citations.chicago-author-date.txt,
+                 tests/markdown-citations.mhra.txt,
+                 tests/markdown-citations.ieee.txt,
                  tests/textile-reader.textile,
                  tests/rst-reader.native,
                  tests/rst-reader.rst,
@@ -106,6 +118,7 @@ Extra-Source-Files:
                  tests/tables.textile,
                  tests/tables.native,
                  tests/tables.opendocument,
+                 tests/tables.org,
                  tests/tables.texinfo,
                  tests/tables.rst,
                  tests/tables.rtf,
@@ -124,6 +137,7 @@ Extra-Source-Files:
                  tests/writer.textile,
                  tests/writer.native,
                  tests/writer.opendocument,
+                 tests/writer.org,
                  tests/writer.rst,
                  tests/writer.rtf,
                  tests/writer.texinfo,
diff --git a/relann1.7 b/relann1.7
deleted file mode 100644
index 024c87ed8..000000000
--- a/relann1.7
+++ /dev/null
@@ -1,265 +0,0 @@
-I'm pleased to announce the release of pandoc 1.7.
-
-As usual, a source tarball and Windows installer are available
-at <http://code.google.com/p/pandoc/downloads/list>.  You can
-also use 'cabal install' to get the latest version from HackageDB:
-
-    cabal update
-    cabal install pandoc
-
-Thanks to everyone who contributed by filing bug reports or contributing
-patches, and especially to Andrea Rossato, Nathan Gass, Paul Rivier, and
-Puneeth Chaganti for their major contributions to this version.
-
-New features
-------------
-
-  * New `textile` reader and writer.  Thanks to Paul Rivier for contributing
-    the `textile` reader, an almost complete implementation of the textile
-    syntax used by the ruby [RedCloth library](http://redcloth.org/textile).
-    Resolves Issue #51.
-
-  * New `org` writer, for Emacs Org-mode, contributed by Puneeth Chaganti.
-
-  * New `json` reader and writer, for reading and writing a JSON
-    representation of the native Pandoc AST.  These are much faster
-    than the `native` reader and writer, and should be used for
-    serializing Pandoc to text.  To convert between the JSON representation
-    and native Pandoc, use `encodeJSON` and `decodeJSON` from
-    `Text.JSON.Generic`.
-
-  * Support for citations using Andrea Rossato's `citeproc-hs` 0.3.
-    You can now write, for example,
-
-        Water is wet [see @doe99, pp. 33-35; also @smith04, ch. 1].
-
-    and, when you process your document using `pandoc`, specifying
-    a citation style using `--csl` and a bibliography using `--bibliography`,
-    the citation will be replaced by an appropriately formatted
-    citation, and a list of works cited will be added to the end
-    of the document.
-
-    This means that you can switch effortlessly between different citation
-    and bibliography styles, including footnote, numerical, and author-date
-    formats. The bibliography can be in any of the following formats: MODS,
-    BibTeX, BibLaTeX, RIS, EndNote, EndNote XML, ISI, MEDLINE, Copac, or JSON.
-    See the README for further details.
-
-    Citations are supported in the markdown reader, using a special
-    syntax, and in the LaTeX reader, using natbib or biblatex syntax.
-    (Thanks to Nathan Gass for the natbib and biblatex support.)
-
-  * A new `--mathjax` option has been added for displaying
-    math in HTML using MathJax.  Resolves issue #259.
-
-  * You can now define LaTeX macros in markdown documents, and pandoc
-    will apply them to TeX math.  For example,
-
-        \newcommand{\plus}[2]{#1 + #2}
-        $\plus{3}{4}$
-
-    yields `3+4`.  Since the macros are applied in the reader, they
-    will work in every output format, not just LaTeX.
-
-  * LaTeX macros can also be used in LaTeX documents (both in math
-    and in non-math contexts).
-
-  * Footnotes are now supported in the RST reader. (Note, however,
-    that pandoc ignores the numeral or symbol used in the note;
-    footnotes are put in an auto-numbered ordered list.)
-    Resolves issue #258.
-
-  * `markdown2pdf` now supports `--data-dir`.
-
-  * Improved prettyprinting in most formats.  Lines will be wrapped
-    more evenly and duplicate blank lines avoided.
-
-  * New `--columns` command-line option sets the column width for
-    line wrapping and relative width calculations for tables.
-
-  * Made `--smart` work in HTML, RST, and Textile readers, as well
-    as markdown.
-
-  * Added support for listings package in LaTeX reader
-    (Puneeth Chaganti).
-
-  * Added support for simple tables in the LaTeX reader.
-
-  * Significant performance improvements in many readers and writers.
-
-API and program changes
------------------------
-
-  * Moved `Text.Pandoc.Definition` from the `pandoc` package to a new
-    auxiliary package, `pandoc-types`. This will make it possible for other
-    programs to supply output in Pandoc format, without depending on the whole
-    pandoc package.
-
-  * Moved generic functions to `Text.Pandoc.Generic`. Deprecated
-    `processWith`, replacing it with two functions, `bottomUp` and `topDown`.
-    Removed previously deprecated functions `processPandoc` and `queryPandoc`.
-
-  * Added `Text.Pandoc.Builder`, for building `Pandoc` structures.
-
-  * `Text.Pandoc` now exports association lists `readers` and `writers`.
-
-  * Removed deprecated `-C/--custom-header` option.
-    Use `--template` instead.
-
-  * `--biblio-file` has been replaced by `--bibliography`.
-    `--biblio-format` has been removed; pandoc now guesses the format
-    from the file extension (see README).
-
-  * pandoc will treat an argument as a URI only if it has an
-    `http(s)` scheme.  Previously pandoc would treat some
-    Windows pathnames beginning with `C:/` as URIs.
-
-  * pandoc now adds a newline to the end of its output in fragment
-    mode (= not `--standalone`).
-
-  * The `--sanitize-html` option and the `stateSanitize` field in
-    `ParserState` have been removed. Sanitization is better done in the
-    resulting HTML using `xss-sanitize`, which is based on pandoc's
-    sanitization, but improved.
-
-  * Added `Text.Pandoc.Pretty`. This is better suited for pandoc than the
-    `pretty` package.  Changed all writers that used
-    `Text.PrettyPrint.HughesPJ` to use `Text.Pandoc.Pretty` instead.
-
-  * Removed `Text.Pandoc.Blocks`. `Text.Pandoc.Pretty` allows you to define
-    blocks and concatenate them, so a separate module is no longer needed.
-
-  * `Text.Pandoc.Shared`:
-
-    + Added `writerColumns` to `WriterOptions`.
-    + Added `normalize`.
-    + Removed unneeded prettyprinting functions:
-      `wrapped`, `wrapIfNeeded`, `wrappedTeX`, `wrapTeXIfNeeded`, `hang'`,
-      `BlockWrapper`, `wrappedBlocksToDoc`.
-    + Made `splitBy` take a test instead of an element.
-    + Added `findDataFile`, refactored `readDataFile`.
-    + Added `stringify`. Rewrote `inlineListToIdentifier` using `stringify`.
-    + Fixed `inlineListToIdentifier` to treat '\160' as ' '.
-
-  * `Text.Pandoc.Readers.HTML`:
-
-    + Removed `rawHtmlBlock`, `anyHtmlBlockTag`, `anyHtmlInlineTag`,
-      `anyHtmlTag`, `anyHtmlEndTag`, `htmlEndTag`, `extractTagType`,
-      `htmlBlockElement`, `htmlComment`
-    + Added `htmlTag`, `htmlInBalanced`, `isInlineTag`, `isBlockTag`,
-      `isTextTag`
-
-  * Moved `smartPunctuation` from `Text.Pandoc.Readers.Markdown`
-    to `Text.Pandoc.Readers.Parsing`, and parameterized it with
-    an inline parser.
-
-  * Ellipses are no longer allowed to contain spaces.
-    Previously we allowed '. . .', ' . . . ', etc.  This caused
-    too many complications, and removed author's flexibility in
-    combining ellipses with spaces and periods.
-
-  * Allow linebreaks in URLs (treat as spaces). Also, a string of
-    consecutive spaces or tabs is now parsed as a single space. If you have
-    multiple spaces in your URL, use `%20%20`.
-
-  * `Text.Pandoc.Parsing`:
-
-    + Removed `refsMatch`.
-    + Hid `Key` constructor.
-    + Removed custom `Ord` and `Eq` instances for `Key`.
-    + Added `toKey` and `fromKey` to convert between `Key` and `[Inline]`.
-    + Generalized type on `readWith`.
-
-  * Small change in calculation of relative widths of table columns.
-    If the size of the header > the specified column width, use
-    the header size as 100% for purposes of calculating
-    relative widths of columns.
-
-  * Markdown writer now uses some pandoc-specific features when `--strict`
-    is not specified: \ newline is used for a hard linebreak instead of
-    two spaces then a newline. And delimited code blocks are used when
-    there are attributes.
-
-  * HTML writer:  improved gladTeX output by setting ENV appropriately
-    for display or inline math (Jonathan Daugherty).
-
-  * LaTeX writer: Use `\paragraph`, `\subparagraph` for level 4,5 headers.
-
-  * LaTeX reader:
-
-    + `\label{foo}` and `\ref{foo}` now become `{foo}` instead of `(foo)`.
-    + `\index{}` commands are skipped.
-
-  * Added `fontsize` variable to default LaTeX template.
-    This makes it easy to set the font size using `markdown2pdf`:
-    `markdown2pdf -V fontsize=12pt input.txt`.
-
-  * The `COLUMNS` environment variable no longer has any effect.
-
-Under-the-hood improvements
----------------------------
-
-  * Completely rewrote HTML reader using tagsoup as a lexer. The
-    new reader is faster and more accurate.
-
-  * Replaced `escapeStringAsXML` with a faster version.
-
-  * Remove duplications in documentation by generating the
-    pandoc man page from README, using `MakeManPage.hs`.
-
-  * Improvements to testing framework:  Removed old `tests/RunTests.hs`.
-    `cabal test` now runs `test-pandoc`, which is built from
-    `src/test-pandoc.hs` when the `tests` Cabal flag is set.
-    This allows the testing framework to have its own dependencies.
-
-  * Added `Interact.hs` to make it easier to use ghci while developing.
-    `Interact.hs` loads `ghci` from the `src` directory, specifying
-    all the options needed to load pandoc modules (including
-    specific package dependencies, which it gets by parsing
-    dist/setup-config).
-
-  * Added `Benchmark.hs`, testing all readers + writers using criterion.
-
-  * Added `stats.sh`, to make it easier to collect and archive
-    benchmark and lines-of-code stats.
-
-Bug fixes
----------
-
-  * Filenames are encoded as UTF8.  Resolves Issue #252.
-
-  * Handle curly quotes better in `--smart` mode. Previously, curly quotes
-    were just parsed literally, leading to problems in some output formats.
-    Now they are parsed as `Quoted` inlines, if `--smart` is specified.
-    Resolves Issue #270.
-
-  * Markdown reader:
-
-    + Allow HTML comments as inline elements in markdown.
-      So, `aaa <!-- comment --> bbb` can be a single paragraph.
-    + Fixed superscripts with links: `^[link](/foo)^` gets
-      recognized as a superscripted link, not an inline note followed by
-      garbage.
-    + Fixed regression, making markdown reference keys case-insensitive again.
-      Resolves Issue #272.
-    + Properly handle abbreviations (like `Mr.`) at the end of a line.
-    + Better handling of intraword underscores, avoiding exponential
-      slowdowns in some cases.  Resolves Issue #182.
-
-  * LaTeX reader:
-
-    + Improved parsing of preamble.
-      Previously you'd get unexpected behavior on a document that
-      contained `\begin{document}` in, say, a verbatim block.
-    + Allow spaces between '\begin' or '\end' and '{'.
-    + Support \L and \l.
-
-  * OpenDocument writer:  don't print raw TeX.
-
-  * Markdown writer: Fixed bug in `Image`.  URI was getting unescaped twice!
-
-  * LaTeX and ConTeXt: Escape `[` and `]` as `{[}` and `{]}`.
-    This avoids unwanted interpretation as an optional argument.
-
-  * `:` now allowed in HTML tags. Resolves Issue #274.
-
diff --git a/src/Text/Pandoc.hs b/src/Text/Pandoc.hs
index 3532c1d4b..dd1b3892d 100644
--- a/src/Text/Pandoc.hs
+++ b/src/Text/Pandoc.hs
@@ -149,8 +149,9 @@ readers = [("native"       , \_ -> read)
           ,("markdown+lhs" , \st ->
                              readMarkdown st{ stateLiterateHaskell = True})
           ,("rst"          , readRST)
+          ,("rst+lhs"      , \st ->
+                             readRST st{ stateLiterateHaskell = True})
           ,("textile"      , readTextile) -- TODO : textile+lhs 
-          ,("rst+lhs"      , readRST)
           ,("html"         , readHtml)
           ,("latex"        , readLaTeX)
           ,("latex+lhs"    , \st ->
diff --git a/src/Text/Pandoc/CharacterReferences.hs b/src/Text/Pandoc/CharacterReferences.hs
index 8ac55fc61..8157d94d3 100644
--- a/src/Text/Pandoc/CharacterReferences.hs
+++ b/src/Text/Pandoc/CharacterReferences.hs
@@ -31,9 +31,9 @@ module Text.Pandoc.CharacterReferences (
                      characterReference,
                      decodeCharacterReferences,
                     ) where
-import Data.Char ( chr )
 import Text.ParserCombinators.Parsec
-import qualified Data.Map as Map
+import Text.HTML.TagSoup.Entity ( lookupNamedEntity, lookupNumericEntity )
+import Data.Maybe ( fromMaybe )
 
 -- | Parse character entity.
 characterReference :: GenParser Char st Char
@@ -47,18 +47,21 @@ numRef :: GenParser Char st Char
 numRef = do
   char '#'
   num <- hexNum <|> decNum
-  return $ chr $ num 
+  return $ fromMaybe '?' $ lookupNumericEntity num
 
-hexNum :: GenParser Char st Int 
-hexNum = oneOf "Xx" >> many1 hexDigit >>= return . read . (\xs -> '0':'x':xs)
+hexNum :: GenParser Char st [Char]
+hexNum = do
+  x <- oneOf "Xx"
+  num <- many1 hexDigit
+  return (x:num)
 
-decNum :: GenParser Char st Int 
-decNum = many1 digit >>= return . read
+decNum :: GenParser Char st [Char]
+decNum = many1 digit
 
 entity :: GenParser Char st Char
 entity = do
   body <- many1 alphaNum
-  return $ Map.findWithDefault '?' body entityTable
+  return $ fromMaybe '?' $ lookupNamedEntity body
 
 -- | Convert entities in a string to characters.
 decodeCharacterReferences :: String -> String
@@ -67,261 +70,3 @@ decodeCharacterReferences str =
 	Left err        -> error $ "\nError: " ++ show err
 	Right result    -> result
 
-entityTable :: Map.Map String Char
-entityTable = Map.fromList entityTableList
-
-entityTableList :: [(String, Char)]
-entityTableList =  [
-	("quot", chr 34),
-	("amp", chr 38),
-	("lt", chr 60),
-	("gt", chr 62),
-	("nbsp", chr 160),
-	("iexcl", chr 161),
-	("cent", chr 162),
-	("pound", chr 163),
-	("curren", chr 164),
-	("yen", chr 165),
-	("brvbar", chr 166),
-	("sect", chr 167),
-	("uml", chr 168),
-	("copy", chr 169),
-	("ordf", chr 170),
-	("laquo", chr 171),
-	("not", chr 172),
-	("shy", chr 173),
-	("reg", chr 174),
-	("macr", chr 175),
-	("deg", chr 176),
-	("plusmn", chr 177),
-	("sup2", chr 178),
-	("sup3", chr 179),
-	("acute", chr 180),
-	("micro", chr 181),
-	("para", chr 182),
-	("middot", chr 183),
-	("cedil", chr 184),
-	("sup1", chr 185),
-	("ordm", chr 186),
-	("raquo", chr 187),
-	("frac14", chr 188),
-	("frac12", chr 189),
-	("frac34", chr 190),
-	("iquest", chr 191),
-	("Agrave", chr 192),
-	("Aacute", chr 193),
-	("Acirc", chr 194),
-	("Atilde", chr 195),
-	("Auml", chr 196),
-	("Aring", chr 197),
-	("AElig", chr 198),
-	("Ccedil", chr 199),
-	("Egrave", chr 200),
-	("Eacute", chr 201),
-	("Ecirc", chr 202),
-	("Euml", chr 203),
-	("Igrave", chr 204),
-	("Iacute", chr 205),
-	("Icirc", chr 206),
-	("Iuml", chr 207),
-	("ETH", chr 208),
-	("Ntilde", chr 209),
-	("Ograve", chr 210),
-	("Oacute", chr 211),
-	("Ocirc", chr 212),
-	("Otilde", chr 213),
-	("Ouml", chr 214),
-	("times", chr 215),
-	("Oslash", chr 216),
-	("Ugrave", chr 217),
-	("Uacute", chr 218),
-	("Ucirc", chr 219),
-	("Uuml", chr 220),
-	("Yacute", chr 221),
-	("THORN", chr 222),
-	("szlig", chr 223),
-	("agrave", chr 224),
-	("aacute", chr 225),
-	("acirc", chr 226),
-	("atilde", chr 227),
-	("auml", chr 228),
-	("aring", chr 229),
-	("aelig", chr 230),
-	("ccedil", chr 231),
-	("egrave", chr 232),
-	("eacute", chr 233),
-	("ecirc", chr 234),
-	("euml", chr 235),
-	("igrave", chr 236),
-	("iacute", chr 237),
-	("icirc", chr 238),
-	("iuml", chr 239),
-	("eth", chr 240),
-	("ntilde", chr 241),
-	("ograve", chr 242),
-	("oacute", chr 243),
-	("ocirc", chr 244),
-	("otilde", chr 245),
-	("ouml", chr 246),
-	("divide", chr 247),
-	("oslash", chr 248),
-	("ugrave", chr 249),
-	("uacute", chr 250),
-	("ucirc", chr 251),
-	("uuml", chr 252),
-	("yacute", chr 253),
-	("thorn", chr 254),
-	("yuml", chr 255),
-	("OElig", chr 338),
-	("oelig", chr 339),
-	("Scaron", chr 352),
-	("scaron", chr 353),
-	("Yuml", chr 376),
-	("fnof", chr 402),
-	("circ", chr 710),
-	("tilde", chr 732),
-	("Alpha", chr 913),
-	("Beta", chr 914),
-	("Gamma", chr 915),
-	("Delta", chr 916),
-	("Epsilon", chr 917),
-	("Zeta", chr 918),
-	("Eta", chr 919),
-	("Theta", chr 920),
-	("Iota", chr 921),
-	("Kappa", chr 922),
-	("Lambda", chr 923),
-	("Mu", chr 924),
-	("Nu", chr 925),
-	("Xi", chr 926),
-	("Omicron", chr 927),
-	("Pi", chr 928),
-	("Rho", chr 929),
-	("Sigma", chr 931),
-	("Tau", chr 932),
-	("Upsilon", chr 933),
-	("Phi", chr 934),
-	("Chi", chr 935),
-	("Psi", chr 936),
-	("Omega", chr 937),
-	("alpha", chr 945),
-	("beta", chr 946),
-	("gamma", chr 947),
-	("delta", chr 948),
-	("epsilon", chr 949),
-	("zeta", chr 950),
-	("eta", chr 951),
-	("theta", chr 952),
-	("iota", chr 953),
-	("kappa", chr 954),
-	("lambda", chr 955),
-	("mu", chr 956),
-	("nu", chr 957),
-	("xi", chr 958),
-	("omicron", chr 959),
-	("pi", chr 960),
-	("rho", chr 961),
-	("sigmaf", chr 962),
-	("sigma", chr 963),
-	("tau", chr 964),
-	("upsilon", chr 965),
-	("phi", chr 966),
-	("chi", chr 967),
-	("psi", chr 968),
-	("omega", chr 969),
-	("thetasym", chr 977),
-	("upsih", chr 978),
-	("piv", chr 982),
-	("ensp", chr 8194),
-	("emsp", chr 8195),
-	("thinsp", chr 8201),
-	("zwnj", chr 8204),
-	("zwj", chr 8205),
-	("lrm", chr 8206),
-	("rlm", chr 8207),
-	("ndash", chr 8211),
-	("mdash", chr 8212),
-	("lsquo", chr 8216),
-	("rsquo", chr 8217),
-	("sbquo", chr 8218),
-	("ldquo", chr 8220),
-	("rdquo", chr 8221),
-	("bdquo", chr 8222),
-	("dagger", chr 8224),
-	("Dagger", chr 8225),
-	("bull", chr 8226),
-	("hellip", chr 8230),
-	("permil", chr 8240),
-	("prime", chr 8242),
-	("Prime", chr 8243),
-	("lsaquo", chr 8249),
-	("rsaquo", chr 8250),
-	("oline", chr 8254),
-	("frasl", chr 8260),
-	("euro", chr 8364),
-	("image", chr 8465),
-	("weierp", chr 8472),
-	("real", chr 8476),
-	("trade", chr 8482),
-	("alefsym", chr 8501),
-	("larr", chr 8592),
-	("uarr", chr 8593),
-	("rarr", chr 8594),
-	("darr", chr 8595),
-	("harr", chr 8596),
-	("crarr", chr 8629),
-	("lArr", chr 8656),
-	("uArr", chr 8657),
-	("rArr", chr 8658),
-	("dArr", chr 8659),
-	("hArr", chr 8660),
-	("forall", chr 8704),
-	("part", chr 8706),
-	("exist", chr 8707),
-	("empty", chr 8709),
-	("nabla", chr 8711),
-	("isin", chr 8712),
-	("notin", chr 8713),
-	("ni", chr 8715),
-	("prod", chr 8719),
-	("sum", chr 8721),
-	("minus", chr 8722),
-	("lowast", chr 8727),
-	("radic", chr 8730),
-	("prop", chr 8733),
-	("infin", chr 8734),
-	("ang", chr 8736),
-	("and", chr 8743),
-	("or", chr 8744),
-	("cap", chr 8745),
-	("cup", chr 8746),
-	("int", chr 8747),
-	("there4", chr 8756),
-	("sim", chr 8764),
-	("cong", chr 8773),
-	("asymp", chr 8776),
-	("ne", chr 8800),
-	("equiv", chr 8801),
-	("le", chr 8804),
-	("ge", chr 8805),
-	("sub", chr 8834),
-	("sup", chr 8835),
-	("nsub", chr 8836),
-	("sube", chr 8838),
-	("supe", chr 8839),
-	("oplus", chr 8853),
-	("otimes", chr 8855),
-	("perp", chr 8869),
-	("sdot", chr 8901),
-	("lceil", chr 8968),
-	("rceil", chr 8969),
-	("lfloor", chr 8970),
-	("rfloor", chr 8971),
-	("lang", chr 9001),
-	("rang", chr 9002),
-	("loz", chr 9674),
-	("spades", chr 9824),
-	("clubs", chr 9827),
-	("hearts", chr 9829),
-	("diams", chr 9830)
-	]
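Review note on the `numRef` change above: `hexNum` and `decNum` now return the raw digits (with the leading `x` or `X` kept for hex) and leave the conversion to a lookup that can fail, with `'?'` as the fallback. The sketch below shows that kind of total decoding using only `base`. `decodeNumericRef` and `toChar` are hypothetical names for illustration; this is not TagSoup's implementation, and the range check is an assumption about what a safe decoder should do.

```haskell
import Data.Char (chr, isDigit, isHexDigit)
import Numeric (readHex)

-- | Decode the text that follows '#' in a numeric character reference,
-- e.g. "65" or "x41". Hypothetical stand-in for illustration only; the
-- patch itself delegates this step to TagSoup's lookupNumericEntity.
decodeNumericRef :: String -> Maybe Char
decodeNumericRef (x:hex)
  | x `elem` "xX", not (null hex), all isHexDigit hex =
      case readHex hex of
        [(n, "")] -> toChar n
        _         -> Nothing
decodeNumericRef dec
  | not (null dec), all isDigit dec = toChar (read dec)
  | otherwise                       = Nothing

-- | Reject code points outside the Unicode range instead of crashing in 'chr'.
toChar :: Integer -> Maybe Char
toChar n
  | n >= 0 && n <= 0x10FFFF = Just (chr (fromInteger n))
  | otherwise               = Nothing

main :: IO ()
main = do
  print (decodeNumericRef "65")       -- Just 'A'
  print (decodeNumericRef "x41")      -- Just 'A'
  print (decodeNumericRef "x110000")  -- Nothing
```

Returning `Maybe Char` keeps the failure case explicit, so the caller can choose the fallback, as the patched `numRef` does with `fromMaybe '?'`.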
diff --git a/src/Text/Pandoc/Readers/HTML.hs b/src/Text/Pandoc/Readers/HTML.hs
index ae8f0438e..0cbdf72b0 100644
--- a/src/Text/Pandoc/Readers/HTML.hs
+++ b/src/Text/Pandoc/Readers/HTML.hs
@@ -78,14 +78,14 @@ parseBody :: TagParser [Block]
 parseBody = liftM concat $ manyTill block eof
 
 block :: TagParser [Block]
-block = optional pLocation >>
-        choice [
-              pPara
+block = choice
+            [ pPara
             , pHeader
             , pBlockQuote
             , pCodeBlock
             , pList
             , pHrule
+            , pSimpleTable
             , pPlain
             , pRawHtmlBlock
             ]
@@ -195,6 +195,27 @@ pHrule = do
   pSelfClosing (=="hr") (const True)
   return [HorizontalRule]
 
+pSimpleTable :: TagParser [Block]
+pSimpleTable = try $ do
+  TagOpen _ _ <- pSatisfy (~== TagOpen "table" [])
+  skipMany pBlank
+  head' <- option [] $ pInTags "th" pTd
+  rows <- many1 $ try $
+           skipMany pBlank >> pInTags "tr" pTd
+  skipMany pBlank
+  TagClose _ <- pSatisfy (~== TagClose "table")
+  let cols = maximum $ map length rows
+  let aligns = replicate cols AlignLeft
+  let widths = replicate cols 0
+  return [Table [] aligns widths head' rows]
+
+pTd :: TagParser [TableCell]
+pTd = try $ do
+  skipMany pBlank
+  res <- pInTags "td" pPlain
+  skipMany pBlank
+  return [res]
+
 pBlockQuote :: TagParser [Block]
 pBlockQuote = do
   contents <- pInTags "blockquote" block
@@ -235,9 +256,8 @@ pCodeBlock = try $ do
   return [CodeBlock attribs result]
 
 inline :: TagParser [Inline]
-inline = choice [
-             pLocation
-           , pTagText
+inline = choice
+           [ pTagText
            , pEmph
            , pStrong
            , pSuperscript
@@ -250,17 +270,19 @@ inline = choice [
            , pRawHtmlInline
            ]
 
-pLocation :: TagParser [a]
+pLocation :: TagParser ()
 pLocation = do
-  (TagPosition r c) <- pSatisfy isTagPosition
+  (TagPosition r c) <- pSat isTagPosition
   setPosition $ newPos "input" r c
-  return []
 
-pSatisfy :: (Tag String -> Bool) -> TagParser (Tag String)
-pSatisfy f = do
+pSat :: (Tag String -> Bool) -> TagParser (Tag String)
+pSat f = do
   pos <- getPosition
   token show (const pos) (\x -> if f x then Just x else Nothing) 
 
+pSatisfy :: (Tag String -> Bool) -> TagParser (Tag String)
+pSatisfy f = try $ optional pLocation >> pSat f
+
 pAnyTag :: TagParser (Tag String)
 pAnyTag = pSatisfy (const True)
 
@@ -268,7 +290,7 @@ pSelfClosing :: (String -> Bool) -> ([Attribute String] -> Bool)
              -> TagParser (Tag String)
 pSelfClosing f g = do
   open <- pSatisfy (tagOpen f g)
-  optional $ try $ pLocation >> pSatisfy (tagClose f)
+  optional $ pSatisfy (tagClose f)
   return open
 
 pEmph :: TagParser [Inline]
@@ -342,7 +364,6 @@ pInTags tagtype parser = try $ do
 
 pCloses :: String -> TagParser ()
 pCloses tagtype = try $ do
-  optional pLocation
   t <- lookAhead $ pSatisfy $ \tag -> isTagClose tag || isTagOpen tag
   case t of
        (TagClose t')  | t' == tagtype -> pAnyTag >> return ()
@@ -360,6 +381,11 @@ pTagText = try $ do
        Left _        -> fail $ "Could not parse `" ++ str ++ "'"
        Right result  -> return result
 
+pBlank :: TagParser ()
+pBlank = try $ do
+  (TagText str) <- pSatisfy isTagText
+  guard $ all isSpace str
+
 pTagContents :: GenParser Char ParserState Inline
 pTagContents =  pStr <|> pSpace <|> smartPunctuation pTagContents <|> pSymbol
 
@@ -433,10 +459,8 @@ _ `closes` "html" = False
 "a" `closes` "a" = True
 "li" `closes` "li" = True
 "th" `closes` t | t `elem` ["th","td"] = True
-"td" `closes` t | t `elem` ["th","td"] = True
 "tr" `closes` t | t `elem` ["th","td","tr"] = True
 "dt" `closes` t | t `elem` ["dt","dd"] = True
-"dd" `closes` t | t `elem` ["dt","dd"] = True
 "hr" `closes` "p" = True
 "p" `closes` "p" = True
 "meta" `closes` "meta" = True
diff --git a/src/Text/Pandoc/Writers/LaTeX.hs b/src/Text/Pandoc/Writers/LaTeX.hs
index fbf443a03..836e0f974 100644
--- a/src/Text/Pandoc/Writers/LaTeX.hs
+++ b/src/Text/Pandoc/Writers/LaTeX.hs
@@ -370,8 +370,8 @@ inlineToLaTeX (Link txt (src, _)) =
              do modify $ \s -> s{ stUrl = True }
                 return $ text $ "\\url{" ++ x ++ "}"
         _ -> do contents <- inlineListToLaTeX $ deVerb txt
-                return $ text ("\\href{" ++ src ++ "}{") <> contents <>
-                         char '}'
+                return $ text ("\\href{" ++ stringToLaTeX src ++ "}{") <>
+                         contents <> char '}'
 inlineToLaTeX (Image _ (source, _)) = do
   modify $ \s -> s{ stGraphics = True }
   return $ "\\includegraphics" <> braces (text source)
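Review note on the `\href` change above: the link target now goes through `stringToLaTeX` before it is written. A minimal sketch of the effect follows, covering only the two characters that the updated `tests/writer.latex` expectations show being escaped (`&` and `_`). `escapeHrefTarget` is a hypothetical helper for illustration; the writer's real `stringToLaTeX` is assumed to handle more characters than this.

```haskell
-- | Escape the characters that the updated tests/writer.latex shows being
-- escaped inside \href{...}. Hypothetical helper for illustration; the
-- patch itself reuses the writer's existing stringToLaTeX.
escapeHrefTarget :: String -> String
escapeHrefTarget = concatMap escapeChar
  where
    escapeChar '&' = "\\&"
    escapeChar '_' = "\\_"
    escapeChar c   = [c]

main :: IO ()
main = do
  putStrLn (escapeHrefTarget "http://example.com/?foo=1&bar=2")
  -- prints: http://example.com/?foo=1\&bar=2
  putStrLn (escapeHrefTarget "/url/with_underscore")
  -- prints: /url/with\_underscore
```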
diff --git a/tests/writer.latex b/tests/writer.latex
index 374815f63..eb4012749 100644
--- a/tests/writer.latex
+++ b/tests/writer.latex
@@ -581,7 +581,7 @@ spaces: a\^{}b c\^{}d, a\ensuremath{\sim}b c\ensuremath{\sim}d.
 `He said, ``I want to go.''\,' Were you alive in the 70's?
 
 Here is some quoted `\verb!code!' and a
-``\href{http://example.com/?foo=1&bar=2}{quoted link}''.
+``\href{http://example.com/?foo=1\&bar=2}{quoted link}''.
 
 Some dashes: one---two --- three---four --- five.
 
@@ -711,7 +711,7 @@ Just a \href{/url/}{URL}.
 
 \href{/url/}{URL and title}
 
-\href{/url/with_underscore}{with\_underscore}
+\href{/url/with\_underscore}{with\_underscore}
 
 \href{mailto:nobody@nowhere.net}{Email link}
 
@@ -746,15 +746,15 @@ Foo \href{/url/}{biz}.
 
 \subsection{With ampersands}
 
-Here's a \href{http://example.com/?foo=1&bar=2}{link with an ampersand in the
+Here's a \href{http://example.com/?foo=1\&bar=2}{link with an ampersand in the
 URL}.
 
 Here's a link with an amersand in the link text:
 \href{http://att.com/}{AT\&T}.
 
-Here's an \href{/script?foo=1&bar=2}{inline link}.
+Here's an \href{/script?foo=1\&bar=2}{inline link}.
 
-Here's an \href{/script?foo=1&bar=2}{inline link in pointy braces}.
+Here's an \href{/script?foo=1\&bar=2}{inline link in pointy braces}.
 
 \subsection{Autolinks}