path: root/src/Text/Pandoc
Age | Commit message | Author | Files | Lines
2014-08-04 | Use `mapM_` instead of `() <$ mapM` in one place. | Artyom Kazak | 1 | -1/+1
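A minimal sketch of why the two spellings are interchangeable when only the effects matter. `checkPositive`, `validateOld` and `validateNew` are hypothetical names made up for illustration, not code from the commit.

```haskell
-- Hypothetical validation step: fails (Nothing) on non-positive numbers.
checkPositive :: Int -> Maybe Int
checkPositive n = if n > 0 then Just n else Nothing

-- Old style: mapM collects a list of results, which is then replaced by ().
validateOld :: [Int] -> Maybe ()
validateOld xs = () <$ mapM checkPositive xs

-- New style: mapM_ runs the same effects without collecting the results.
validateNew :: [Int] -> Maybe ()
validateNew = mapM_ checkPositive

main :: IO ()
main = do
  print (validateOld [1, 2, 3] == validateNew [1, 2, 3])
  print (validateOld [1, -2, 3] == validateNew [1, -2, 3])
```

Both versions succeed or fail on the same inputs; `mapM_` simply states the intent more directly.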
2014-08-04 | Merge branch 'epubend' of https://github.com/mpickering/pandoc into mpickering-epubend | John MacFarlane | 4 | -33/+501
Conflicts: pandoc.cabal
2014-08-03 | Correctly implement capitalisation. | Artyom Kazak | 3 | -15/+19
Using `map toUpper` to capitalise text is wrong, as e.g. “Straße” should be converted to “STRASSE”, which is 1 character longer. This commit adds a `capitalize` function and replaces 2 identical implementations in different modules (`toCaps` and `capitalize`) with it.
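The length problem described above can be sketched with base alone. `capitalizeSketch` is a hypothetical stand-in that special-cases only "ß"; it is not the `capitalize` function added by the commit, whose implementation is not shown here.

```haskell
import Data.Char (toUpper)

-- Per-character upper-casing: the result always has the input's length,
-- so it cannot turn one character into two.
naiveUpper :: String -> String
naiveUpper = map toUpper

-- Hypothetical sketch: let a character expand to several characters.
-- Only 'ß' is special-cased; a real implementation needs full case mapping.
capitalizeSketch :: String -> String
capitalizeSketch = concatMap up
  where
    up 'ß' = "SS"
    up c   = [toUpper c]

main :: IO ()
main = do
  print (length (naiveUpper "Straße"))          -- same length as the input
  print (capitalizeSketch "Straße" == "STRASSE")
```

One option for complete coverage is `Data.Text.toUpper` from the `text` package, which performs full case conversion and may return a longer string.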
2014-08-02 | SelfContained: Fixed determining of source URL from within CSS files. | John MacFarlane | 1 | -2/+9
(This fixes a bug introduced a couple commits back.)
2014-08-02 | fetchItem: improved mime type guessing. | John MacFarlane | 1 | -4/+5
Strip a fragment like `?#iefix` from the extension before doing the mime lookup.
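A rough sketch of the idea, assuming the extension is taken after cutting the reference at the first `?` or `#`. `cleanExtension`, `guessMime` and the three-entry table are hypothetical and only illustrate the lookup; they are not pandoc's code.

```haskell
import Data.Char (toLower)

-- Hypothetical helper: extension of a path, ignoring any query or fragment.
cleanExtension :: String -> String
cleanExtension ref =
  case break (== '.') (reverse base) of
    (revExt, _ : _) | '/' `notElem` revExt -> map toLower (reverse revExt)
    _                                      -> ""
  where
    -- Drop everything from the first '?' or '#' onwards.
    base = takeWhile (`notElem` "?#") ref

-- Tiny illustrative table; a real lookup would cover many more types.
guessMime :: String -> Maybe String
guessMime ref = lookup (cleanExtension ref)
  [ ("css", "text/css")
  , ("png", "image/png")
  , ("eot", "application/vnd.ms-fontobject")
  ]

main :: IO ()
main = do
  print (cleanExtension "fonts/icons.eot?#iefix")
  print (guessMime "fonts/icons.eot?#iefix")
```

Without the stripping step the extension would be `eot?#iefix`, which no mime table contains.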
2014-08-02 | Shared: fetchItem improvements. | John MacFarlane | 1 | -11/+12
* More consistent logic: absolute URIs are fetched from the net; other things are treated as relative URIs if sourceURL is a Just, otherwise as file paths.
* We escape characters that are not allowed in URIs before trying to parse them (e.g. '|', which often occurs in the wild).
* When treating relative paths as local file paths, we drop any fragment or query. This is useful e.g. when you've downloaded web fonts locally, but your source still contains the original relative URLs.

Together with the previous commit, this should close #1477.
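The three rules above can be sketched as a pure classification step. Everything here is a simplifying assumption: the `Source` type, `classify`, the prefix test for absolute URIs, the two-character escape table and the crude joining of relative references are all hypothetical. Real code would use a proper URI library such as Network.URI.

```haskell
import Data.List (dropWhileEnd, isPrefixOf)

-- Hypothetical result type: where the item should be fetched from.
data Source = Remote String | LocalFile FilePath
  deriving (Eq, Show)

-- Escape a few characters that are not allowed in URIs (illustrative only).
escapeForURI :: String -> String
escapeForURI = concatMap esc
  where
    esc '|' = "%7C"
    esc ' ' = "%20"
    esc c   = [c]

-- Crude absolute-URI test (assumption: only http and https count).
isAbsolute :: String -> Bool
isAbsolute s = any (`isPrefixOf` s) ["http://", "https://"]

classify :: Maybe String -> String -> Source
classify sourceURL ref
  | isAbsolute ref = Remote (escapeForURI ref)
  | otherwise =
      case sourceURL of
        -- Relative URI: resolve against the directory part of the base URL
        -- (assumes the base URL contains a path).
        Just base -> Remote (dropWhileEnd (/= '/') base ++ escapeForURI ref)
        -- No base URL: treat as a local path, dropping query and fragment.
        Nothing   -> LocalFile (takeWhile (`notElem` "?#") ref)

main :: IO ()
main = do
  print (classify Nothing "fonts/a.eot?#iefix")
  print (classify (Just "http://example.com/doc/index.html") "img/a|b.png")
```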
2014-08-02 | Text.Pandoc.SelfContained changes. | John MacFarlane | 2 | -59/+28
* mkSelfContained now takes just two arguments, WriterOptions and the string.
* It no longer looks in data files. This only made sense when we had copies of slidy and S5 code there.
* Shared.fetchItem' is used instead of the nearly duplicate getItem.
2014-08-01 | Docx Parser: Produce endnotes. | Jesse Rosenthal | 1 | -2/+2
The parser had been changing footnotes and endnotes into footnotes. This isn't a problem, because pandoc collapses them, but the parser should maintain as much of the docx structure as possible, and let the toplevel reader worry about how to translate it into Pandoc. (This would be an issue when, as is planned, the docx parser spins off into its own module.) The output is the same, so no test change is required.
2014-07-31 | Docx Reader: Single underlines are "emph" | Jesse Rosenthal | 1 | -1/+2
All other underlines are ignored.
2014-07-31 | EPUB Reader: Now uses the new MediaBag for images | Matthew Pickering | 1 | -20/+45
2014-07-31 | HTML Reader: Added ability to read MathML formatted <math> blocks | Matthew Pickering | 1 | -0/+16
2014-07-31 | HTML Reader: Added support for anchors on links and list items | Matthew Pickering | 1 | -4/+22
2014-07-31 | HTML Reader: Extended HTML Reader to recognise EPUB specific elements | Matthew Pickering | 1 | -28/+178
2014-07-31 | Options: Added option to turn on epub html extensions | Matthew Pickering | 1 | -0/+1
2014-07-31 | Except Compat: Updated to export more module functions | Matthew Pickering | 1 | -1/+11
2014-07-31 | EPUB Reader: Added EPUB reader | Matthew Pickering | 1 | -0/+248
2014-07-31 | New module, Text.Pandoc.MediaBag. | John MacFarlane | 6 | -81/+121
Moved `MediaBag` definition and functions from Shared: `lookupMedia`, `mediaDirectory`, `insertMedia`, `extractMediaBag`. Removed `emptyMediaBag`; use `mempty` instead, since `MediaBag` is a Monoid.
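A minimal sketch of why a separate "empty" constructor becomes redundant once the type is a Monoid. The `Bag` type and the `insertItem`/`lookupItem` helpers are hypothetical stand-ins built on `Data.Map` from the `containers` package; they are not pandoc's actual `MediaBag` definition.

```haskell
import qualified Data.Map as M

-- Hypothetical stand-in: path -> (mime type, contents).
newtype Bag = Bag (M.Map FilePath (String, String))
  deriving (Eq, Show)

-- Combining two bags keeps the left entry when both contain a path.
instance Semigroup Bag where
  Bag a <> Bag b = Bag (M.union a b)

-- mempty is the empty bag, so no dedicated emptyBag function is needed.
instance Monoid Bag where
  mempty = Bag M.empty

insertItem :: FilePath -> String -> String -> Bag -> Bag
insertItem fp mime contents (Bag m) = Bag (M.insert fp (mime, contents) m)

lookupItem :: FilePath -> Bag -> Maybe (String, String)
lookupItem fp (Bag m) = M.lookup fp m

main :: IO ()
main = do
  let bag = insertItem "media/a.png" "image/png" "<bytes>" mempty
  print (lookupItem "media/a.png" bag)
  print (lookupItem "media/a.png" (mempty :: Bag))
```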
2014-07-31 | Made MediaBag a newtype, and added mime type information to media. | John MacFarlane | 5 | -32/+85
Shared now exports functions for interacting with a MediaBag:
- `emptyMediaBag`
- `lookupMedia`
- `insertMedia`
- `mediaDirectory`
- `extractMediaBag`
2014-07-31 | Shared: Added function insertMedia which is an alias for M.insert | Matthew Pickering | 1 | -1/+8
2014-07-30 | Removed deprecated and no longer used readerStrict in ReaderOptions. | John MacFarlane | 1 | -2/+0
This is handled by readerExtensions now.
2014-07-30 | getT2TMeta: Take list of source files instead of single. | John MacFarlane | 1 | -7/+8
Get latest modification time.
2014-07-30 | Allow --self-contained to get content from MediaBag. | John MacFarlane | 2 | -26/+35
Added a parameter to makeSelfContained (API change).
2014-07-30 | RTF writer: Improved image embedding. | John MacFarlane | 1 | -1/+12
Use calculated sizes.
2014-07-30 | RTF writer: refactored image embedding, using fetchItem'. | John MacFarlane | 1 | -26/+21
2014-07-30 | PDF, Docx, EPUB, and ODT writers now automatically use MediaBag. | John MacFarlane | 5 | -14/+15
The MediaBag is threaded through from the reader, with no need to extract to files.
2014-07-30 | Shared: Added fetchItem', which searches a media bag too. | John MacFarlane | 1 | -0/+14
2014-07-30 | Moved MediaBag back from Shared to Options, to avoid module cycle. | John MacFarlane | 2 | -5/+12
2014-07-30 | Added writerMediaBag to WriterOptions. | John MacFarlane | 1 | -1/+3
2014-07-30 | Moved MediaBag from Shared to Options. | John MacFarlane | 2 | -11/+6
This will allow us to put a MediaBag in WriterOptions.
2014-07-30 | Moved withTempDir from PDF to Shared, export from Shared. | John MacFarlane | 2 | -11/+17
API change.
2014-07-30 | Shared: Make MediaBag available through Shared. | Jesse Rosenthal | 1 | -0/+11
2014-07-30 | Docx reader: Make docx reader put image data in MediaBag. | Jesse Rosenthal | 2 | -37/+30
Image data will now be put in a media bag map, which will be output along with the pandoc output.
2014-07-29 | Mediawiki writer: don't escape inside `<source>`. | John MacFarlane | 1 | -4/+8
Closes #1445. Escapes can still be used with `<code>` and `<pre>`.
2014-07-29 | Docx writer: Print subtitle from metadata if present. | John MacFarlane | 1 | -3/+9
Use Subtitle style. See #1451.
2014-07-29 | LaTeX writer: use \(..\) instead of $..$ for inline math. | John MacFarlane | 1 | -1/+1
Closes #1464.
2014-07-29 | Merge pull request #1463 from jkr/metadata | John MacFarlane | 1 | -11/+73
Make metadata out of styled pars
2014-07-29 | Docx reader: Make metavalues out of styled paragraphs. | Jesse Rosenthal | 1 | -11/+73
This will make paragraphs styled with `Author`, `Title`, `Subtitle`, `Date`, and `Abstract` into pandoc metavalues, rather than text. The implementation only takes those elements from the beginning of the document (ignoring empty paragraphs). Multiple paragraphs in the `Author` style will be made into a metaList, one paragraph per item. Hard linebreaks (shift-return) in the paragraph will be maintained, and can be used for institution, email, etc.
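The behaviour described above can be sketched as a split of the leading paragraphs. This is a simplified model under stated assumptions: `Par`, `splitMeta` and the use of plain strings for paragraph content are hypothetical, and hard linebreaks are not modelled. It is not the docx reader's actual code.

```haskell
import Data.Char (isSpace)
import qualified Data.Map as M

-- Hypothetical paragraph model: optional style name plus plain text.
data Par = Par (Maybe String) String
  deriving (Eq, Show)

metaStyles :: [String]
metaStyles = ["Author", "Title", "Subtitle", "Date", "Abstract"]

isMetaPar :: Par -> Bool
isMetaPar (Par (Just s) _) = s `elem` metaStyles
isMetaPar _                = False

isEmptyPar :: Par -> Bool
isEmptyPar (Par _ t) = all isSpace t

-- Take metadata only from the start of the document, skipping empty
-- paragraphs; repeated styles (e.g. several authors) become a list.
splitMeta :: [Par] -> (M.Map String [String], [Par])
splitMeta pars = (meta, body)
  where
    (leading, body) = span (\p -> isMetaPar p || isEmptyPar p) pars
    meta = M.fromListWith (flip (++))
             [ (s, [t]) | p@(Par (Just s) t) <- leading, not (isEmptyPar p) ]

main :: IO ()
main = print (splitMeta
  [ Par (Just "Title") "A Title"
  , Par Nothing ""
  , Par (Just "Author") "First Author"
  , Par (Just "Author") "Second Author"
  , Par Nothing "Body text."
  , Par (Just "Date") "styled, but not at the beginning"
  ])
```

In this model the final `Date` paragraph stays in the body because it appears after ordinary text.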
2014-07-27 | Parsing: Added isbn and pmid schemes | Matthew Pickering | 1 | -2/+2
2014-07-27 | Markdown writer: Separate adjacent lists of the same kind with comment. | John MacFarlane | 1 | -3/+9
Closes #1458.
2014-07-27 | Markdown writer: More improvements to 'plain' output, updated tests. | John MacFarlane | 1 | -21/+26
Math now appears in unicode if possible, without the distracting italics around identifiers. Blank lines around headers are more consistent. Footnotes appear in regular [n] style.
2014-07-27 | Text.Pandoc.Pretty: added blanklines. | John MacFarlane | 1 | -15/+17
This ensures a certain number of blanklines (and no more) in output.
2014-07-27 | Markdown writer: Better 'plain' output. | John MacFarlane | 1 | -83/+101
We now largely follow the style of Project Gutenberg. Emphasis is rendered with `_underscores_`, strong with ALL CAPS. The appearance of horizontal rules has changed (even in regular markdown) to a line across the whole page. Headings are rendered differently, using space to set them off.
2014-07-27 | Markdown writer: Update definition lists. | John MacFarlane | 1 | -2/+13
They now behave like the new reader does. The old behavior can be activated with the `compact_definition_lists` extension.
2014-07-26 | Docx writer: Added missing case from last commit. | John MacFarlane | 1 | -1/+1
2014-07-26 | Docx writer: include abstract with Abstract style. | John MacFarlane | 1 | -1/+8
Addresses docx part of #1451.
2014-07-26 | Merge pull request #1457 from mpickering/generalstate | John MacFarlane | 2 | -58/+114
Generalised more in Parsing.hs to enable the use of custom state
2014-07-27 | Added compatibility layer to support directory-1.1 | Matthew Pickering | 2 | -1/+22
2014-07-27 | Txt2Tags Reader: Added copyright information | Matthew Pickering | 1 | -0/+26
2014-07-27 | Txt2Tags Reader: Added recognition of macros | Matthew Pickering | 1 | -4/+18
2014-07-27 | Added txt2tags reader | Matthew Pickering | 1 | -0/+507
http://txt2tags.org/

There are two points which currently do not match the official implementation.

1. In the official implementation, lists cannot be nested like the following, but the reader would interpret this as a bullet list with the first item being a numbered list.

```
- + This is not a list
```

2. The specification describes how URIs automatically become links. Unfortunately, as is often the case, their definition of URI is not clear. I tried three solutions but was unsure about which to adopt.

* Using isURI from Network.URI: this matches far too many strings and is therefore unsuitable.
* Using uri from Text.Pandoc.Shared: this doesn't match all strings that the reference implementation matches.
* Trying to simulate the regex which is used in the native code.

I went with the third approach but it is not perfect; for example, trailing punctuation is captured in URLs.