author | John MacFarlane <jgm@berkeley.edu> | 2010-12-22 20:25:15 -0800 |
---|---|---|
committer | John MacFarlane <jgm@berkeley.edu> | 2010-12-30 13:55:40 -0800 |
commit | 904050fa36715e18522d80432a2666fcbaacd105 (patch) | |
tree | 4745876e797d400539dd80309d31c330a013e969 /s5 | |
parent | 220fe5fab89ce84fcb98f0430c4126281ca8362d (diff) | |
download | pandoc-904050fa36715e18522d80432a2666fcbaacd105.tar.gz |
New HTML reader using tagsoup as a lexer.
* The new reader is faster and more accurate.
* API changes for Text.Pandoc.Readers.HTML:
  - removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag,
    anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType,
    htmlBlockElement, htmlComment
  - added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag
* tagsoup is a new dependency.
* Text.Pandoc.Parsing: Generalized type on readWith.
* Benchmark.hs: Added length calculation to force full evaluation.
* Updated HTML reader tests.
* Updated markdown and textile readers to use the functions from
  the HTML reader.
* Note: The markdown reader now correctly handles some cases it did not
  before. For example:

      <hr/>

  is reproduced without adding a space.

      <script>
      a = '<b>';
      </script>

  is parsed correctly.
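
The two cases in the note above can be illustrated with a minimal sketch of a lexer-first approach. This is not the code from this commit and not tagsoup's implementation: the `Token` type, `lexHtml`, and `breakOn` are hypothetical names invented for the example, attributes are ignored, and treating the body of `<script>` as raw text is an assumption about one way a tokenizer can avoid misreading `'<b>'` as a tag.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical, simplified token type (not tagsoup's Tag type).
data Token = Open String | Close String | Text String
  deriving (Eq, Show)

-- Split a string at the first occurrence of the needle.
-- The needle, if found, stays at the front of the second component.
breakOn :: String -> String -> (String, String)
breakOn needle = go
  where
    go [] = ([], [])
    go s@(c:cs)
      | needle `isPrefixOf` s = ([], s)
      | otherwise             = let (a, b) = go cs in (c : a, b)

-- Lex a string into tokens. Attributes are not parsed: the tag name is
-- everything up to the first space or '/'. The body of a script element
-- is kept as a single Text token (assumption made for this sketch).
lexHtml :: String -> [Token]
lexHtml [] = []
lexHtml ('<':'/':rest) =
  let (name, rest') = break (== '>') rest
  in Close name : lexHtml (drop 1 rest')
lexHtml ('<':rest) =
  let (inside, rest') = break (== '>') rest
      name            = takeWhile (`notElem` " /") inside
      after           = drop 1 rest'
  in if name == "script"
       then let (raw, rest'') = breakOn "</script>" after
            in Open name : [Text raw | not (null raw)] ++ lexHtml rest''
       else Open name : lexHtml after
lexHtml s =
  let (txt, rest) = break (== '<') s
  in Text txt : lexHtml rest
```

In GHCi, `lexHtml "<hr/>"` evaluates to `[Open "hr"]`, and `lexHtml "<script>\na = '<b>';\n</script>"` evaluates to `[Open "script",Text "\na = '<b>';\n",Close "script"]`, so the `<b>` inside the script body never becomes a tag token.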
Diffstat (limited to 's5')
0 files changed, 0 insertions, 0 deletions