path: root/src/Text/Pandoc/Readers/HTML/Parsing.hs
Age | Commit message | Author | Files | Lines
2020-12-10 | HTML reader: retain attribute prefixes and avoid duplicates. | John MacFarlane | 1 | -7/+13
Previously we stripped attribute prefixes, reading `xml:lang` as `lang`, for example. This resulted in duplicate `lang` attributes when `xml:lang` and `lang` were both used. This commit causes the prefixes to be retained, and also avoids invalid duplicate attributes. Closes #6938.
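The difference between the two behaviors can be illustrated with a minimal sketch. This is not the code from the commit: the `Attr` type and the names `stripPrefixName` and `retainAndDedupe` are hypothetical, and keeping the first occurrence of a repeated name is an assumption made here for illustration.

```haskell
import Data.Function (on)
import Data.List (nubBy)

-- | An attribute as a (name, value) pair. Hypothetical type for this sketch.
type Attr = (String, String)

-- | The earlier behavior described above: drop everything up to and
-- including the first colon, so "xml:lang" becomes "lang".
stripPrefixName :: String -> String
stripPrefixName name =
  case break (== ':') name of
    (_, ':' : rest) -> rest
    _               -> name

-- | Leave attribute names untouched and drop any later attribute whose
-- name has already been seen. Keeping the first occurrence is an
-- assumption of this sketch, not something the commit message states.
retainAndDedupe :: [Attr] -> [Attr]
retainAndDedupe = nubBy ((==) `on` fst)
```

Loaded in GHCi, mapping `stripPrefixName` over the names in `[("xml:lang","en"),("lang","en")]` yields two `lang` entries, whereas `retainAndDedupe` leaves that list unchanged and only removes exact name repeats.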
2020-11-26 | HTML reader: improve support for table headers, footer, attributes | Albert Krewinkel | 1 | -11/+36
- `<tfoot>` elements are no longer added to the table body but used as table footer.
- Separate `<tbody>` elements are no longer combined into one.
- Attributes on `<thead>`, `<tbody>`, `<th>`/`<td>`, and `<tfoot>` elements are preserved.
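The section handling listed above can be sketched roughly as follows. The `Section` and `Table` types and the `buildTable` function are hypothetical simplifications, not pandoc's own types; attributes are not modeled, and concatenating repeated head or foot groups is an assumption of this sketch.

```haskell
-- | Which section of an HTML table a group of rows came from.
data Section = Head | Body | Foot
  deriving (Eq, Show)

-- | A simplified table: one header, any number of bodies, one footer.
-- Hypothetical type for illustration only.
data Table row = Table
  { tableHead   :: [row]
  , tableBodies :: [[row]]
  , tableFoot   :: [row]
  } deriving (Eq, Show)

-- | Sort section groups, given in document order, into head, bodies,
-- and foot. Each Body group stays a separate body, and Foot rows are
-- never appended to a body.
buildTable :: [(Section, [row])] -> Table row
buildTable groups = Table
  { tableHead   = concat [rows | (Head, rows) <- groups]
  , tableBodies = [rows | (Body, rows) <- groups]
  , tableFoot   = concat [rows | (Foot, rows) <- groups]
  }
```

With two `Body` groups in the input, `tableBodies` holds two separate row lists instead of one merged list, and the `Foot` rows end up only in `tableFoot`.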
2020-11-26 | HTML reader: allow finer grained options for tag omission | Albert Krewinkel | 1 | -5/+17
2020-11-24 | HTML reader: extract table parsing into separate module | Albert Krewinkel | 1 | -0/+26
2020-11-23 | HTML reader: extract submodules | Albert Krewinkel | 1 | -0/+156
Reducing module size should reduce memory use during compilation. This is preparatory work to tackle support for more table features.