From 69d433d37a2b50b2d07f588603a6fbc03041c0af Mon Sep 17 00:00:00 2001
From: Jesse Rosenthal
Date: Thu, 21 Feb 2019 08:32:57 -0500
Subject: Docx reader: Start adding comment to combine module

This module is one of the most opaque parts of the docx reader: it
deals with the fact that runs have non-nesting formatting, so we have
to figure out the nesting on the fly as we combine them.

We start adding commenting, so new developers can understand and, if
necessary, modify this module. Specific function comments will be
added in the future, but this offers a global description of the
purpose of the module.
---
 src/Text/Pandoc/Readers/Docx/Combine.hs | 40 +++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/src/Text/Pandoc/Readers/Docx/Combine.hs b/src/Text/Pandoc/Readers/Docx/Combine.hs
index 2fba3394b..da40a80ea 100644
--- a/src/Text/Pandoc/Readers/Docx/Combine.hs
+++ b/src/Text/Pandoc/Readers/Docx/Combine.hs
@@ -14,6 +14,46 @@
 Flatten sequences of elements.
 -}
 
+
+{-
+The purpose of this module is to combine the formatting of separate
+runs, which have *non-nesting* formatting. Because the formatting
+doesn't nest, you can't actually tell the nesting order until you
+combine with the runs that follow.
+
+For example, say you have a something like `foo
+bar`. Then in ooxml, you'll get these two runs:
+
+~~~
+<w:r>
+ <w:rPr>
+  <w:b />
+  <w:i />
+ </w:rPr>
+ <w:t>Foo</w:t>
+</w:r>
+<w:r>
+ <w:rPr>
+  <w:i />
+ </w:rPr>
+ <w:t>Bar</w:t>
+</w:r>
+~~~
+
+Note that this is an ideal situation. In practice, it will probably be
+more---if, for example, the user turned italics
+off and then on.
+
+So, when you get the first run, which is marked as both bold and italic,
+you have no idea whether it's `Strong [Emph [Str "Foo"]]` or `Emph
+[Strong [Str "Foo"]]`.
+
+We combine two runs, then, by taking off the formatting that modifies an
+inline, seeing what is shared between them, and rebuilding an inline. We
+fold this to combine the inlines.
+
+-}
+
 module Text.Pandoc.Readers.Docx.Combine ( smushInlines
                                         , smushBlocks
                                         )
-- 
cgit v1.2.3
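
The combining step described in the new module comment (strip the formatting, find what two runs share, rebuild) can be illustrated with a small standalone sketch. This is not the code in `Combine.hs`: the `Format`, `Inline`, and `Run` types and the `wrapAll` and `combineRuns` functions below are hypothetical, simplified stand-ins for pandoc's real types, and the sketch handles only a single pair of runs rather than the fold over many inlines that the comment mentions.

```haskell
import Data.List (intersect, (\\))

-- Hypothetical, simplified stand-ins for pandoc's types (assumption:
-- only bold and italic exist, and a run carries plain text).
data Format = Bold | Italic
  deriving (Eq, Show)

data Inline
  = Str String
  | Strong [Inline]
  | Emph [Inline]
  deriving (Eq, Show)

-- A run as OOXML presents it: an unordered set of formats plus text.
data Run = Run [Format] String
  deriving (Eq, Show)

-- Wrap a list of inlines in a single formatting constructor.
wrap :: Format -> [Inline] -> [Inline]
wrap Bold   xs = [Strong xs]
wrap Italic xs = [Emph xs]

-- Wrap in several formats; the first format in the list ends up outermost.
wrapAll :: [Format] -> [Inline] -> [Inline]
wrapAll fs xs = foldr wrap xs fs

-- Combine two adjacent runs: formats shared by both go on the outside,
-- formats unique to one run stay wrapped around that run's text only.
combineRuns :: Run -> Run -> [Inline]
combineRuns (Run fs1 t1) (Run fs2 t2) =
  wrapAll shared
    (wrapAll (fs1 \\ shared) [Str t1] ++ wrapAll (fs2 \\ shared) [Str t2])
  where
    shared = fs1 `intersect` fs2
```

For the two runs in the example, `combineRuns (Run [Bold, Italic] "Foo") (Run [Italic] "Bar")` evaluates to `[Emph [Strong [Str "Foo"], Str "Bar"]]`: the shared italic becomes the outer wrapper, which settles the nesting order that the first run alone could not determine.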