path: root/src/Text/Pandoc/Readers/HTML/Parsing.hs
Age | Commit message | Author | Files | Lines
2021-08-10 | HTML reader: treat comments as blank when parsing. | John MacFarlane | 1 | -5/+7
This modifies `pBlank`. Previously comments could sometimes flummox the parser. Closes #7482.
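
The idea of treating comments as blank can be sketched over a simplified token stream. This is an illustration only, not pandoc's actual `pBlank`: the `Token` type and the helpers `isBlank` and `skipBlanks` are hypothetical names introduced here, and the real reader works on a different token representation.

```haskell
import Data.Char (isSpace)

-- Hypothetical, simplified token type (an assumption for this sketch).
data Token
  = TokOpen String
  | TokClose String
  | TokText String
  | TokComment String
  deriving (Eq, Show)

-- A token counts as blank if it is whitespace-only text or a comment.
-- Counting comments as blank is the behaviour the commit message describes.
isBlank :: Token -> Bool
isBlank (TokText s)    = all isSpace s
isBlank (TokComment _) = True
isBlank _              = False

-- Drop leading blank tokens, so a comment between two elements
-- no longer interrupts whatever is parsed next.
skipBlanks :: [Token] -> [Token]
skipBlanks = dropWhile isBlank
```

With this definition, a comment sitting between whitespace and an opening tag is skipped together with the whitespace instead of being handed to the next parser.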
2021-07-06 | HTML reader: add col, colgroup to 'closes' definitions | John MacFarlane | 1 | -1/+3
2021-01-08 | Update copyright notices for 2021 (#7012) | Albert Krewinkel | 1 | -1/+1
2020-12-10 | HTML reader: retain attribute prefixes and avoid duplicates. | John MacFarlane | 1 | -7/+13
Previously we stripped attribute prefixes, reading `xml:lang` as `lang`, for example. This resulted in two duplicate `lang` attributes when `xml:lang` and `lang` were both used. This commit causes the prefixes to be retained, and also avoids invalid duplicate attributes. Closes #6938.
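
The difference between the two behaviours can be sketched with plain key/value pairs. This is a minimal illustration, not the code from the commit: `Attr`, `stripPrefixName`, `oldAttrs`, and `newAttrs` are hypothetical names, `String` stands in for whatever text type the reader uses, and keeping the first of two duplicate keys is an assumption.

```haskell
import Data.Function (on)
import Data.List (nubBy)

-- Hypothetical attribute representation: (name, value).
type Attr = (String, String)

-- Old behaviour as described: drop everything up to the first colon,
-- so "xml:lang" becomes "lang".
stripPrefixName :: String -> String
stripPrefixName k = case break (== ':') k of
  (_, ':' : rest) -> rest
  _               -> k

-- Stripping prefixes can produce two attributes with the same name.
oldAttrs :: [Attr] -> [Attr]
oldAttrs = map (\(k, v) -> (stripPrefixName k, v))

-- New behaviour as described: keep names as written and drop later
-- attributes whose name was already seen (first-wins is an assumption).
newAttrs :: [Attr] -> [Attr]
newAttrs = nubBy ((==) `on` fst)
```

For an element carrying both `xml:lang` and `lang`, `oldAttrs` yields two `lang` entries, while `newAttrs` leaves the two distinct names untouched and removes only genuine repeats.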
2020-11-26 | HTML reader: improve support for table headers, footer, attributes | Albert Krewinkel | 1 | -11/+36
- `<tfoot>` elements are no longer added to the table body but used as table footer.
- Separate `<tbody>` elements are no longer combined into one.
- Attributes on `<thead>`, `<tbody>`, `<th>`/`<td>`, and `<tfoot>` elements are preserved.
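
The first two points above can be sketched as a grouping step over parsed table sections. This is an illustration under stated assumptions, not the reader's implementation: `Section`, `Row`, `Table`, and `assemble` are hypothetical names, rows are reduced to lists of cell strings, and attribute handling is left out.

```haskell
-- Hypothetical tag for the kind of section a group of rows came from.
data Section = THead | TBody | TFoot
  deriving (Eq, Show)

-- Simplified row: just the cell contents.
type Row = [String]

-- Simplified table with a dedicated footer and a list of bodies.
data Table = Table
  { tableHead   :: [Row]
  , tableBodies :: [[Row]]  -- one entry per <tbody>, kept separate
  , tableFoot   :: [Row]    -- <tfoot> rows, not appended to a body
  } deriving (Eq, Show)

-- Route each section's rows to head, bodies, or foot.
assemble :: [(Section, [Row])] -> Table
assemble secs = Table
  { tableHead   = concat [rs | (THead, rs) <- secs]
  , tableBodies = [rs | (TBody, rs) <- secs]
  , tableFoot   = concat [rs | (TFoot, rs) <- secs]
  }
```

Two `<tbody>` sections therefore end up as two entries in `tableBodies`, and footer rows stay out of the bodies entirely.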
2020-11-26 | HTML reader: allow finer grained options for tag omission | Albert Krewinkel | 1 | -5/+17
2020-11-24 | HTML reader: extract table parsing into separate module | Albert Krewinkel | 1 | -0/+26
2020-11-23 | HTML reader: extract submodules | Albert Krewinkel | 1 | -0/+156
Reducing module size should reduce memory use during compilation. This is preparatory work to tackle support for more table features.