From 8ca191604dcd13af27c11d2da225da646ebce6fc Mon Sep 17 00:00:00 2001
From: John MacFarlane
Date: Mon, 8 Feb 2021 23:35:19 -0800
Subject: Add new unexported module T.P.XMLParser.

This exports functions that use xml-conduit's parser to
produce an xml-light Element or [Content]. This allows
existing pandoc code to use a better parser without
much modification.

The new parser is used in all places where xml-light's
parser was previously used. Benchmarks show a significant
performance improvement in parsing XML-based formats
(especially ODT and FB2).

Note that the xml-light types use String, so the
conversion from xml-conduit types involves a lot
of extra allocation. It would be desirable to
avoid that in the future by gradually switching
to using xml-conduit directly. This can be done
module by module.

The new parser also reports errors, which we report
when possible.

A new constructor PandocXMLError has been added to
PandocError in T.P.Error [API change].

Closes #7091, which was the main stimulus.

These changes revealed the need for some changes
in the tests. The docbook-reader.docbook test
lacked definitions for the entities it used; these
have been added. And the docx golden tests have been
updated, because the new parser does not preserve
the order of attributes.

Add entity defs to docbook-reader.docbook.

Update golden tests for docx.
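
For readers unfamiliar with the two libraries, the conversion described above can be sketched roughly as follows. This is a minimal illustration, not the code in T.P.XMLParser: the names `parseToLight`, `convertElement`, `convertName` and `convertNode` are hypothetical, namespace and prefix handling is simplified, and comments and processing instructions are simply dropped. It assumes the `xml-conduit`, `xml-light`, `text` and `containers` packages are available.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Control.Exception as E
import qualified Data.Map as M
import Data.Maybe (mapMaybe)
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import qualified Text.XML as Conduit
import qualified Text.XML.Light as Light

-- Hypothetical entry point: parse with xml-conduit, return an
-- xml-light Element, and turn the parser's exception into a message.
parseToLight :: TL.Text -> Either T.Text Light.Element
parseToLight t =
  case Conduit.parseText Conduit.def t of
    Left err  -> Left (T.pack (E.displayException err))
    Right doc -> Right (convertElement (Conduit.documentRoot doc))

-- xml-conduit stores attributes in a Map, so M.toList yields them in
-- key order, not in the order they appeared in the source document.
convertElement :: Conduit.Element -> Light.Element
convertElement (Conduit.Element name attrMap nodes) =
  Light.Element (convertName name) attrs (mapMaybe convertNode nodes) Nothing
  where
    attrs = [ Light.Attr (convertName k) (T.unpack v)
            | (k, v) <- M.toList attrMap ]

-- Each Text field is unpacked to String; this is the extra
-- allocation the commit message refers to.
convertName :: Conduit.Name -> Light.QName
convertName (Conduit.Name local ns prefix) =
  Light.QName (T.unpack local) (T.unpack <$> ns) (T.unpack <$> prefix)

-- Simplification for this sketch: only elements and text are kept.
convertNode :: Conduit.Node -> Maybe Light.Content
convertNode (Conduit.NodeElement el) = Just (Light.Elem (convertElement el))
convertNode (Conduit.NodeContent txt) =
  Just (Light.Text (Light.CData Light.CDataText (T.unpack txt) Nothing))
convertNode _ = Nothing

main :: IO ()
main =
  case parseToLight "<doc b=\"2\" a=\"1\"><p>hi</p></doc>" of
    Left err -> putStrLn ("parse error: " ++ T.unpack err)
    Right el -> putStrLn (Light.showElement el)
```

Because xml-conduit's `Element` keeps its attributes in a `Map`, the original attribute order is not available after parsing, which is consistent with the note above about the docx golden tests.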
---
 test/docx/golden/block_quotes.docx | Bin 10092 -> 10098 bytes
 test/docx/golden/codeblock.docx | Bin 9944 -> 9950 bytes
 test/docx/golden/comments.docx | Bin 10279 -> 10285 bytes
 test/docx/golden/custom_style_no_reference.docx | Bin 10042 -> 10048 bytes
 test/docx/golden/custom_style_preserve.docx | Bin 10666 -> 10673 bytes
 test/docx/golden/custom_style_reference.docx | Bin 12434 -> 12434 bytes
 test/docx/golden/definition_list.docx | Bin 9941 -> 9947 bytes
 .../docx/golden/document-properties-short-desc.docx | Bin 9947 -> 9953 bytes
 test/docx/golden/document-properties.docx | Bin 10423 -> 10429 bytes
 test/docx/golden/headers.docx | Bin 10080 -> 10086 bytes
 test/docx/golden/image.docx | Bin 26758 -> 26764 bytes
 test/docx/golden/inline_code.docx | Bin 9880 -> 9886 bytes
 test/docx/golden/inline_formatting.docx | Bin 10060 -> 10066 bytes
 test/docx/golden/inline_images.docx | Bin 26816 -> 26822 bytes
 test/docx/golden/link_in_notes.docx | Bin 10101 -> 10107 bytes
 test/docx/golden/links.docx | Bin 10276 -> 10282 bytes
 test/docx/golden/lists.docx | Bin 10352 -> 10358 bytes
 test/docx/golden/lists_continuing.docx | Bin 10143 -> 10149 bytes
 test/docx/golden/lists_multiple_initial.docx | Bin 10232 -> 10238 bytes
 test/docx/golden/lists_restarting.docx | Bin 10144 -> 10150 bytes
 test/docx/golden/nested_anchors_in_header.docx | Bin 10239 -> 10245 bytes
 test/docx/golden/notes.docx | Bin 10046 -> 10052 bytes
 test/docx/golden/raw-blocks.docx | Bin 9980 -> 9986 bytes
 test/docx/golden/raw-bookmarks.docx | Bin 10115 -> 10121 bytes
 test/docx/golden/table_one_row.docx | Bin 9932 -> 9938 bytes
 test/docx/golden/table_with_list_cell.docx | Bin 10249 -> 10255 bytes
 test/docx/golden/tables.docx | Bin 10266 -> 10272 bytes
 test/docx/golden/track_changes_deletion.docx | Bin 9924 -> 9930 bytes
 test/docx/golden/track_changes_insertion.docx | Bin 9907 -> 9913 bytes
 test/docx/golden/track_changes_move.docx | Bin 9941 -> 9947 bytes
 .../golden/track_changes_scrubbed_metadata.docx | Bin 10053 -> 10059 bytes
 test/docx/golden/unicode.docx | Bin 9865 -> 9871 bytes
 test/docx/golden/verbatim_subsuper.docx | Bin 9913 -> 9919 bytes
 33 files changed, 0 insertions(+), 0 deletions(-)

(limited to 'test/docx')

diff --git a/test/docx/golden/block_quotes.docx b/test/docx/golden/block_quotes.docx
index 3e1bf16e7..d3b16d0f2 100644
Binary files a/test/docx/golden/block_quotes.docx and b/test/docx/golden/block_quotes.docx differ
diff --git a/test/docx/golden/codeblock.docx b/test/docx/golden/codeblock.docx
index 66f055063..6293ef493 100644
Binary files a/test/docx/golden/codeblock.docx and b/test/docx/golden/codeblock.docx differ
diff --git a/test/docx/golden/comments.docx b/test/docx/golden/comments.docx
index fb3a02a0a..4205a1516 100644
Binary files a/test/docx/golden/comments.docx and b/test/docx/golden/comments.docx differ
diff --git a/test/docx/golden/custom_style_no_reference.docx b/test/docx/golden/custom_style_no_reference.docx
index bc6c2702a..adb3f23db 100644
Binary files a/test/docx/golden/custom_style_no_reference.docx and b/test/docx/golden/custom_style_no_reference.docx differ
diff --git a/test/docx/golden/custom_style_preserve.docx b/test/docx/golden/custom_style_preserve.docx
index 8c555a5bd..92c8137fe 100644
Binary files a/test/docx/golden/custom_style_preserve.docx and b/test/docx/golden/custom_style_preserve.docx differ
diff --git a/test/docx/golden/custom_style_reference.docx b/test/docx/golden/custom_style_reference.docx
index 5f96cc911..f53470617 100644
Binary files a/test/docx/golden/custom_style_reference.docx and b/test/docx/golden/custom_style_reference.docx differ
diff --git a/test/docx/golden/definition_list.docx b/test/docx/golden/definition_list.docx
index c21b3a5b3..d6af90a72 100644
Binary files a/test/docx/golden/definition_list.docx and b/test/docx/golden/definition_list.docx differ
diff --git a/test/docx/golden/document-properties-short-desc.docx b/test/docx/golden/document-properties-short-desc.docx
index 92ce144e9..e18dbe853 100644
Binary files a/test/docx/golden/document-properties-short-desc.docx and b/test/docx/golden/document-properties-short-desc.docx differ
diff --git a/test/docx/golden/document-properties.docx b/test/docx/golden/document-properties.docx
index d21b67309..820299043 100644
Binary files a/test/docx/golden/document-properties.docx and b/test/docx/golden/document-properties.docx differ
diff --git a/test/docx/golden/headers.docx b/test/docx/golden/headers.docx
index 3558a47bf..ae0f41d12 100644
Binary files a/test/docx/golden/headers.docx and b/test/docx/golden/headers.docx differ
diff --git a/test/docx/golden/image.docx b/test/docx/golden/image.docx
index 606df92a3..94cd35dfa 100644
Binary files a/test/docx/golden/image.docx and b/test/docx/golden/image.docx differ
diff --git a/test/docx/golden/inline_code.docx b/test/docx/golden/inline_code.docx
index 759269cac..879f2a25b 100644
Binary files a/test/docx/golden/inline_code.docx and b/test/docx/golden/inline_code.docx differ
diff --git a/test/docx/golden/inline_formatting.docx b/test/docx/golden/inline_formatting.docx
index c37777080..93f86478f 100644
Binary files a/test/docx/golden/inline_formatting.docx and b/test/docx/golden/inline_formatting.docx differ
diff --git a/test/docx/golden/inline_images.docx b/test/docx/golden/inline_images.docx
index 9450b1a73..967d297f2 100644
Binary files a/test/docx/golden/inline_images.docx and b/test/docx/golden/inline_images.docx differ
diff --git a/test/docx/golden/link_in_notes.docx b/test/docx/golden/link_in_notes.docx
index 6f0b830e6..c5614e2fa 100644
Binary files a/test/docx/golden/link_in_notes.docx and b/test/docx/golden/link_in_notes.docx differ
diff --git a/test/docx/golden/links.docx b/test/docx/golden/links.docx
index e53889cfb..0f39a831f 100644
Binary files a/test/docx/golden/links.docx and b/test/docx/golden/links.docx differ
diff --git a/test/docx/golden/lists.docx b/test/docx/golden/lists.docx
index 5dbe298b7..07046f223 100644
Binary files a/test/docx/golden/lists.docx and b/test/docx/golden/lists.docx differ
diff --git a/test/docx/golden/lists_continuing.docx b/test/docx/golden/lists_continuing.docx
index 194181288..3656618e6 100644
Binary files a/test/docx/golden/lists_continuing.docx and b/test/docx/golden/lists_continuing.docx differ
diff --git a/test/docx/golden/lists_multiple_initial.docx b/test/docx/golden/lists_multiple_initial.docx
index 6e0b634f7..8798253d5 100644
Binary files a/test/docx/golden/lists_multiple_initial.docx and b/test/docx/golden/lists_multiple_initial.docx differ
diff --git a/test/docx/golden/lists_restarting.docx b/test/docx/golden/lists_restarting.docx
index 477178e77..0a24d1840 100644
Binary files a/test/docx/golden/lists_restarting.docx and b/test/docx/golden/lists_restarting.docx differ
diff --git a/test/docx/golden/nested_anchors_in_header.docx b/test/docx/golden/nested_anchors_in_header.docx
index 51110356e..52bb7a217 100644
Binary files a/test/docx/golden/nested_anchors_in_header.docx and b/test/docx/golden/nested_anchors_in_header.docx differ
diff --git a/test/docx/golden/notes.docx b/test/docx/golden/notes.docx
index b6206cdf5..182c06c64 100644
Binary files a/test/docx/golden/notes.docx and b/test/docx/golden/notes.docx differ
diff --git a/test/docx/golden/raw-blocks.docx b/test/docx/golden/raw-blocks.docx
index 07b576080..7b69a56a3 100644
Binary files a/test/docx/golden/raw-blocks.docx and b/test/docx/golden/raw-blocks.docx differ
diff --git a/test/docx/golden/raw-bookmarks.docx b/test/docx/golden/raw-bookmarks.docx
index d46095eb7..3d3a35701 100644
Binary files a/test/docx/golden/raw-bookmarks.docx and b/test/docx/golden/raw-bookmarks.docx differ
diff --git a/test/docx/golden/table_one_row.docx b/test/docx/golden/table_one_row.docx
index 7caba4e93..5ae37b406 100644
Binary files a/test/docx/golden/table_one_row.docx and b/test/docx/golden/table_one_row.docx differ
diff --git a/test/docx/golden/table_with_list_cell.docx b/test/docx/golden/table_with_list_cell.docx
index 6aaa6da61..c29aa6716 100644
Binary files a/test/docx/golden/table_with_list_cell.docx and b/test/docx/golden/table_with_list_cell.docx differ
diff --git a/test/docx/golden/tables.docx b/test/docx/golden/tables.docx
index 5746c5ad0..664493246 100644
Binary files a/test/docx/golden/tables.docx and b/test/docx/golden/tables.docx differ
diff --git a/test/docx/golden/track_changes_deletion.docx b/test/docx/golden/track_changes_deletion.docx
index 5f22dccc6..b6d15340e 100644
Binary files a/test/docx/golden/track_changes_deletion.docx and b/test/docx/golden/track_changes_deletion.docx differ
diff --git a/test/docx/golden/track_changes_insertion.docx b/test/docx/golden/track_changes_insertion.docx
index ab5c4f56d..f8e1092d2 100644
Binary files a/test/docx/golden/track_changes_insertion.docx and b/test/docx/golden/track_changes_insertion.docx differ
diff --git a/test/docx/golden/track_changes_move.docx b/test/docx/golden/track_changes_move.docx
index 085f33162..b4cda82f2 100644
Binary files a/test/docx/golden/track_changes_move.docx and b/test/docx/golden/track_changes_move.docx differ
diff --git a/test/docx/golden/track_changes_scrubbed_metadata.docx b/test/docx/golden/track_changes_scrubbed_metadata.docx
index 1ac86d5c8..ee222efa0 100644
Binary files a/test/docx/golden/track_changes_scrubbed_metadata.docx and b/test/docx/golden/track_changes_scrubbed_metadata.docx differ
diff --git a/test/docx/golden/unicode.docx b/test/docx/golden/unicode.docx
index c2c443b19..c6f8d9c96 100644
Binary files a/test/docx/golden/unicode.docx and b/test/docx/golden/unicode.docx differ
diff --git a/test/docx/golden/verbatim_subsuper.docx b/test/docx/golden/verbatim_subsuper.docx
index 5ea18d32e..ea8146690 100644
Binary files a/test/docx/golden/verbatim_subsuper.docx and b/test/docx/golden/verbatim_subsuper.docx differ
-- 
cgit v1.2.3