path: root/src/Text
Age | Commit message | Author | Files | Lines
2014-08-18 | Merge pull request #1547 from jkr/styleparse | John MacFarlane | 2 | -36/+110
Docx reader: parsing styles
2014-08-18 | HTML reader: improved handling of tags that can be block or inline. | John MacFarlane | 1 | -5/+13
Previously a section like this would be enclosed in a paragraph, with RawInline for the video tags (since video is a tag that can be either block or inline):

    <video controls="controls">
      <source src="../videos/test.mp4" type="video/mp4" />
      <source src="../videos/test.webm" type="video/webm" />
      <p>
        The videos can not be played back on your system.<br/>
        Try viewing on Youtube (requires Internet connection):
        <a href="http://youtu.be/etE5urBps_w">Relative Velocity on Youtube</a>.
      </p>
    </video>

This change will cause the video and source tags to be parsed as RawBlock instead, giving better output. The general change is this: when we're parsing a "plain" sequence of inlines, we don't parse anything that COULD be a block-level tag.
2014-08-17 | Docx reader: whitespace fix. | Jesse Rosenthal | 1 | -6/+6
2014-08-17 | Docx reader: remove emph styles and strong styles list. | Jesse Rosenthal | 1 | -6/+0
We no longer need the explicit lists since we're deriving them from the ground up.
2014-08-17 | Docx reader: Add "Hyperlink" to blacklisted styles. | Jesse Rosenthal | 1 | -2/+2
This is the only one so far. We'll add others as they show up.
2014-08-17 | Docx reader: Use style resolver. | Jesse Rosenthal | 1 | -23/+9
We no longer check against explicit styles.
2014-08-17 | Docx Reader: Introduce function for resolving dependent run styles. | Jesse Rosenthal | 1 | -0/+31
We always favor an explicit positive or negative in a style in a descendant, and only turn to the ancestor if nothing is set. We also introduce an (empty) list of styles that are black-listed. We won't check them. (Think underlines in hyperlinks.)
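The lookup rule described in this commit message can be sketched as follows. This is a minimal Python illustration, not pandoc's Haskell code: `RunStyle`, `resolve_bold`, and the `blacklist` parameter are hypothetical names, a property is modelled as `True`, `False`, or `None` (not set), and the sketch assumes a blacklisted style is simply skipped while the walk continues to its ancestors.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class RunStyle:
    """Hypothetical run style: one tri-state property plus a parent pointer."""
    name: str
    bold: Optional[bool] = None          # None means "not set in this style"
    parent: Optional["RunStyle"] = None  # the style this one is based on


def resolve_bold(style: Optional[RunStyle],
                 blacklist: FrozenSet[str] = frozenset()) -> bool:
    """Walk from the descendant towards its ancestors.

    The first explicit value (on or off) wins; blacklisted styles are
    never consulted (an assumption about what "won't check them" means).
    """
    while style is not None:
        if style.name not in blacklist and style.bold is not None:
            return style.bold
        style = style.parent
    return False  # nothing was set anywhere in the chain
```

An explicit `False` in a descendant therefore overrides a `True` in its ancestor, which is why the property cannot be a plain boolean.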
2014-08-17 | Merge pull request #1536 from considerate/master | John MacFarlane | 1 | -0/+3
Add row width to tables in Docx XML
2014-08-17 | Merge pull request #1543 from jkr/superSubVert | John MacFarlane | 2 | -16/+17
Docx reader: Change behavior of Super/Subscript
2014-08-17 | Docx writer: Fixed regression, bungled list numbering. | John MacFarlane | 1 | -3/+10
In pandoc 1.13, all lists come out as basic ordered lists. This fixes that bad regression. Closes #1544.
2014-08-17 | Docx Parse: build a bottom-up style tree. | Jesse Rosenthal | 1 | -6/+31
Two points here: (1) We're going bottom-up, from styles not based on anything, to avoid circular dependencies or any other sort of maliciousness/incompetence. And (2) each style points to its parent. That way, we don't need the whole tree to pass a style over to Docx.hs.
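A rough sketch of the bottom-up construction, in Python rather than pandoc's Haskell. The function name `build_style_tree` and the dictionary representation (style name mapped to the name it is based on, `None` for a root) are assumptions made for illustration only.

```python
from typing import Dict, List, Optional


def build_style_tree(based_on: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Return a child -> parent map holding only styles reachable from a root.

    `based_on` maps each style name to the name it is based on
    (None for styles not based on anything).
    """
    tree: Dict[str, Optional[str]] = {}
    # Start from the styles that are not based on anything.
    frontier: List[str] = [name for name, parent in based_on.items()
                           if parent is None]
    while frontier:
        next_frontier: List[str] = []
        for name in frontier:
            tree[name] = based_on[name]
            # Styles whose parent has just been placed come next.
            next_frontier.extend(child for child, parent in based_on.items()
                                 if parent == name)
        frontier = next_frontier
    return tree
```

Styles that only refer to each other in a cycle are never reached from a root, so they are left out and the loop always terminates.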
2014-08-17 | Remove an unnecessary import. | Artyom Kazak | 1 | -1/+1
2014-08-17 | Update Reader.EPUB to use `MimeType`. | Artyom Kazak | 1 | -8/+7
2014-08-17 | MIME cleanup. | Artyom Kazak | 7 | -42/+59
* Create a type synonym for MIME type (instead of `String`).
* Add `getMimeTypeDef` function.
* Avoid recreating MIME type `Map`s every time.
* Move “Formula-...” case handling into `getMimeType`.
2014-08-17 | Alias string and runStyle to CharStyle type. | Jesse Rosenthal | 1 | -7/+10
2014-08-17 | Docx Style parser: Basic one now just takes a parent style. | Jesse Rosenthal | 1 | -13/+15
This will make it easier to build the style map from the bottom up (to avoid any infinite references).
2014-08-17 | Docx reader: work with new rStyle. | Jesse Rosenthal | 1 | -4/+4
Just discards info at the moment, so at least it works the same.
2014-08-17 | Parser: Framework for parsing styles. | Jesse Rosenthal | 1 | -11/+44
We want to be able to read user-defined styles. Eventually we'll be able to figure out styles in terms of inheritance as well. The actual cascading will happen in the docx reader.
2014-08-17 | Docx reader: Change behavior of Super/Subscript | Jesse Rosenthal | 2 | -16/+17
In docx, super- and subscript are attributes of Vertalign. It makes more sense to follow this, and have different possible values of Vertalign in runStyle. This is mainly a preparatory step for real style parsing, since it can distinguish between vertical align being explicitly turned off and it not being set. In addition, it makes parsing a bit clearer, and makes sure we don't do docx-impossible things like being simultaneously super and sub.
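The idea of a single vertical-alignment value that may also be unset could look roughly like this. It is a Python sketch, not the Haskell data type from the commit: `VertAlign` and `parse_vert_align` are hypothetical names. The three strings are the values WordprocessingML allows for the `w:val` attribute of `w:vertAlign`; treating an unknown value as unset is an assumption.

```python
from enum import Enum
from typing import Optional


class VertAlign(Enum):
    """The three vertical-alignment values a run can carry."""
    BASELINE = "baseline"        # explicitly turned off
    SUPERSCRIPT = "superscript"
    SUBSCRIPT = "subscript"


def parse_vert_align(val: Optional[str]) -> Optional[VertAlign]:
    """Map the attribute value to a VertAlign; None means "not set"."""
    if val is None:
        return None
    try:
        return VertAlign(val)
    except ValueError:
        return None  # assumption: unknown values are treated as unset
```

Because one field holds the value, a run cannot be superscript and subscript at once, and `BASELINE` stays distinguishable from `None`.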
2014-08-16 | HTML reader: Parse appropriately styled span as SmallCaps. | John MacFarlane | 1 | -1/+6
2014-08-17 | Simplify row width calculation. | Viktor Kronvall | 1 | -2/+2
2014-08-17 | Include row width in table rows. | Christoffer Ackelman | 1 | -0/+3
Added a property to all table rows where the sum of column widths is specified in pct (fraction of 5000).
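As a worked illustration of the unit: in WordprocessingML a width of type `pct` is given in fiftieths of a percent, so 5000 means 100%. The sketch below is Python, not the writer's Haskell, and assumes the column widths are available as fractions of the full width; `row_width_pct` is a hypothetical name.

```python
from typing import Sequence


def row_width_pct(col_widths: Sequence[float]) -> int:
    """Sum relative column widths (fractions of the full width)
    and express the total in fiftieths of a percent."""
    return round(sum(col_widths) * 5000)
```

For example, columns of 0.25, 0.25 and 0.5 give a row width of 5000, and two columns of 0.2 and 0.3 give 2500.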
2014-08-16 | Markdown writer: don't escape $, ^, ~ when extensions are deactivated. | John MacFarlane | 1 | -5/+16
`tex_math_dollars`, `superscript`, and `subscript` extensions, respectively. Closes #1127.
2014-08-16 | Docx reader: Remove unnecessary plural functions | Jesse Rosenthal | 1 | -11/+5
Functions like `runElemsToInlines` and `parPartsToInlines` are just defined in terms of concatenating and mapping their singular versions (e.g. `runElemToInlines`). Having two functions with almost identical names makes it easier to introduce errors. It's easy enough to just concat and map inline, and it makes it clearer what is going on in the code.
2014-08-16 | Docx reader: Fix bug in character styles. | Jesse Rosenthal | 1 | -2/+2
The style-handling cleanup introduced a bug here. There wasn't previously a test to catch it.
2014-08-16 | Rewrite Docx.hs and Reducible to use Builder. | Jesse Rosenthal | 2 | -415/+368
The big news here is a rewrite of Docx to use the builder functions. As opposed to previous attempts, we now see a significant speedup -- times are cut in half (or more) in a few informal tests. Reducible has also been rewritten. It can doubtless be simplified and clarified further. We can consider this, at the moment, a reference for correct behavior.
2014-08-14 | Markdown reader: Better handle quote characters in inline links. | John MacFarlane | 1 | -2/+4
This was previously failing to be recognized as a link:

    [Test](http://en.wikipedia.org/wiki/Ward's_method)

Closes #1534.
2014-08-14 | Make `raw_tex` extension non-default for textile reader, writer. | John MacFarlane | 2 | -3/+4
Enable `raw_tex` extension in textile writer. Closes #1532.
2014-08-13 | Merge pull request #1531 from jkr/morefonts | John MacFarlane | 1 | -2/+6
Docx reader: Interpret "Strong" and "Emphasis" run styles.
2014-08-13 | Fixed haddock comment. | John MacFarlane | 1 | -9/+7
2014-08-13 | Removed unneeded import. | John MacFarlane | 1 | -1/+0
2014-08-13 | Docx reader: Interpret "Strong" and Emphasis run styles. | Jesse Rosenthal | 1 | -2/+6
2014-08-12 | Removed unneeded CPP. | John MacFarlane | 1 | -4/+0
2014-08-13 | Docx: Reducible forgot about smallcaps | Jesse Rosenthal | 1 | -0/+2
2014-08-12 | Docx Reader: Trim line breaks from the beginning and end of Section | Jesse Rosenthal | 1 | -2/+10
Headers. We might also want to do this elsewhere (for pars, for example).
2014-08-12 | Docx: More robust handling of multiple bookmarks in header. | Jesse Rosenthal | 1 | -6/+8
2014-08-12 | Docx reader: Check for null-id'd anchors too. | Jesse Rosenthal | 1 | -1/+0
Otherwise they get left dangling in the document.
2014-08-12 | Docx reader: accept explicit "Italic" and "Bold" rStyles. | Jesse Rosenthal | 2 | -18/+31
Note that "Italic" can be on, and, from the last commit, `<w:i>` can be present, but be turned off. In that case, the turned-off tag takes precedence. So, we have to distinguish between something being off and something not being there. Hence, isItalic, isBold, isStrike, and isSmallCaps have become Maybes.
2014-08-12 | Docx reader: Add "BlockQuotation" to divs list. | Jesse Rosenthal | 1 | -1/+1
2014-08-12 | Docx Reader: Fix font style parsing. | Jesse Rosenthal | 1 | -12/+27
Before we just checked for the existence of a tag. Now, we make sure to check for its on/off value.
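The difference between "tag exists" and "tag is on" can be sketched as below. This is an illustrative Python version using the standard library, not the reader's Haskell code; the helper name `toggle` is hypothetical. It assumes the usual WordprocessingML convention that a toggle element with no `w:val` attribute is on, and that `0`, `false`, and `off` switch it off.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def toggle(rpr: ET.Element, tag: str) -> Optional[bool]:
    """Read a toggle property such as w:b or w:i from a w:rPr element.

    Returns None when the tag is absent, False when it is present but
    switched off, and True otherwise.
    """
    el = rpr.find(f"{{{W}}}{tag}")
    if el is None:
        return None
    val = el.get(f"{{{W}}}val")
    if val is None:
        return True  # a bare <w:b/> means "on"
    return val not in ("0", "false", "off")
```

Returning `None` for an absent tag keeps the three cases apart, which matches the later move of `isItalic`, `isBold`, and friends to `Maybe` values.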
2014-08-12 | Merge pull request #1527 from mpickering/juicypixels | John MacFarlane | 2 | -5/+31
Attempts to convert gif, tiff and bmp to png in pdf writer
2014-08-12 | Merge pull request #1528 from mpickering/epubtitlepage | John MacFarlane | 1 | -4/+10
EPUB Reader: Ignores titlepage attribute
2014-08-13 | LaTeX Writer: Added missing closing braces to hyperdef commands | Matthew Pickering | 1 | -2/+2
2014-08-13 | PDF Writer: Attempts to convert images to pdf renderable formats | Matthew Pickering | 1 | -3/+29
Now depends on the JuicyPixels library. Will attempt to convert an image (gif, tiff, bmp) to png when converting to pdf.
2014-08-12 | HTML writer: use 'uri' or 'email' class for autolinks. | John MacFarlane | 1 | -5/+8
This allows them to be styled specially. Closes #1501.
2014-08-12 | ConTeXt writer: improved autolink detection. | John MacFarlane | 1 | -1/+1
It previously failed in some cases with escaped special characters.
2014-08-12 | EPUB Reader: Ignore title pages | Matthew Pickering | 1 | -4/+10
2014-08-12 | DocBook: Support equations with mathml. | John MacFarlane | 1 | -4/+16
equation, informalequation, inlineequation and mml:math elements.
2014-08-12 | Merge pull request #1524 from jkr/dropCap3 | John MacFarlane | 2 | -3/+11
Docx reader: move dropcap combining logic to Reducible
2014-08-12 | Markdown reader: Improved parsing of indented code in list items. | John MacFarlane | 1 | -25/+42
Indented code at the beginning of a list item must be indented eight spaces from the margin (or from the edge of the container), or four spaces past the list marker, whichever is farther. Some examples in `tests/markdown-reader-more.txt`.
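One way to read the rule as arithmetic, sketched in Python rather than the reader's Haskell. The name `code_indent_threshold` is hypothetical, and the sketch assumes the list marker starts at the container edge and that "four spaces past the list marker" is measured from the end of the marker; the tests in `tests/markdown-reader-more.txt` remain the authority on the actual behaviour.

```python
def code_indent_threshold(marker_width: int) -> int:
    """Minimum indentation, counted from the container edge, for text at
    the start of a list item to count as indented code.

    `marker_width` is the number of columns the list marker occupies
    (assumption: the marker begins at the container edge).
    """
    return max(8, marker_width + 4)
```

Under these assumptions a one-character marker such as `-` needs eight columns of indentation, while a six-character marker such as `10000.` pushes the threshold out to ten.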