path: root/src/Text/Pandoc/Readers/Odt
Age | Commit message | Author | Files | Lines
2021-03-15 | Use foldl' instead of foldl everywhere. | John MacFarlane | 3 | -6/+6
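For context, a minimal sketch of the difference this commit relies on. The function names `sumLazy` and `sumStrict` are illustrative only and do not come from the pandoc sources. `foldl` may build a chain of unevaluated thunks for the accumulator, while `foldl'` from `Data.List` forces the accumulator at each step.

```haskell
import Data.List (foldl')

-- Lazy left fold: the accumulator may pile up as unevaluated thunks
-- (how much depends on optimization settings).
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- Strict left fold: the accumulator is evaluated at every step,
-- so the fold runs in constant space.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0
```

Both functions return the same result; only the evaluation order and memory behavior differ.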
2021-02-16 | Rename Text.Pandoc.XMLParser -> Text.Pandoc.XML.Light... | John MacFarlane | 6 | -58/+48
..and add new definitions isomorphic to xml-light's, but with Text instead of String. This allows us to keep most of the code in existing readers that use xml-light, but avoid lots of unnecessary allocation.

We also add versions of the functions from xml-light's Text.XML.Light.Output and Text.XML.Light.Proc that operate on our modified XML types, and functions that convert xml-light types to our types (since some of our dependencies, like texmath, use xml-light).

Update golden tests for docx and pptx.

OOXML test: Use `showContent` instead of `ppContent` in `displayDiff`.

Docx: Do a manual traversal to unwrap sdt and smartTag. This is faster, and needed to pass the tests.

Benchmarks:

A = prior to 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
B = as of 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
C = this commit

| Reader  | A     | B     | C     |
| ------- | ----- | ----- | ----- |
| docbook | 18 ms | 12 ms | 10 ms |
| opml    | 65 ms | 62 ms | 35 ms |
| jats    | 15 ms | 11 ms | 9 ms  |
| docx    | 72 ms | 69 ms | 44 ms |
| odt     | 78 ms | 41 ms | 28 ms |
| epub    | 64 ms | 61 ms | 56 ms |
| fb2     | 14 ms | 5 ms  | 4 ms  |
2020-11-07 | Lint code in PRs and when committing to master (#6790) | Albert Krewinkel | 2 | -3/+3
* Remove unused LANGUAGE pragmata
* Apply HLint suggestions
* Configure HLint to ignore some warnings
* Lint code when committing to master
2020-10-14 | Fix typos in comments, doc strings, error messages, and tests | Albert Krewinkel | 1 | -1/+1
Typos reported by https://fossies.org/linux/test/pandoc-master.tar.gz/codespell.html See: #6738
2020-09-13 | Fix hlint suggestions, update hlint.yaml (#6680) | Christian Despres | 2 | -4/+5
* Fix hlint suggestions, update hlint.yaml
  Most suggestions were redundant brackets. Some required LambdaCase. The .hlint.yaml file had a small typo, and didn't ignore camelCase suggestions in certain modules.
2020-04-28 | Support new Underline element in readers and writers (#6277) | Vaibhav Sagar | 1 | -1/+1
Deprecate `underlineSpan` in Shared in favor of `Text.Pandoc.Builder.underline`.
2020-04-15 | Adapt to the newest Table type, fix some previous adaptation issues | despresc | 1 | -2/+2
- Writers.Native is now adapted to the new Table type.
- Inline captions should now be conditionally wrapped in a Plain, not a Para block.
- The toLegacyTable function now lives in Writers.Shared.
2020-04-15 | Implement the new Table type | despresc | 1 | -2/+2
2020-03-15 | Use implicit Prelude (#6187) | Albert Krewinkel | 10 | -20/+0
* Use implicit Prelude
  The previous behavior was introduced as a fix for #4464. It seems that this change alone did not fix the issue, and `stack ghci` and `cabal repl` only work with GHC 8.4.1 or newer, as no custom Prelude is loaded for these versions. Given this, it seems cleaner to revert to the implicit Prelude.
* PandocMonad: remove outdated check for base version
  Only base versions 4.9 and later are supported, the check for `MIN_VERSION_base(4,8,0)` is therefore unnecessary.
* Always use custom prelude
  Previously, the custom prelude was used only with older GHC versions, as a workaround for problems with ghci. The ghci problems are resolved by replacing package `base` with `base-noprelude`, allowing for consistent use of the custom prelude across all GHC versions.
2020-02-07 | Apply linter suggestions. Add fix_spacing to lint target in Makefile. | John MacFarlane | 4 | -8/+8
2020-02-07 | Resolve HLint warnings | Albert Krewinkel | 1 | -2/+0
All warnings are either fixed or, if more appropriate, HLint is configured to ignore them. HLint suggestions remain.
* Ignore "Use camelCase" warnings in Lua and legacy code
* Fix or ignore remaining HLint warnings
* Remove redundant brackets
* Remove redundant `return`s
* Remove redundant as-pattern
* Fuse mapM_/map
* Use `.` to shorten code
* Remove redundant `fmap`
* Remove unused LANGUAGE pragmas
* Hoist `not` in Text.Pandoc.App
* Use fewer imports for `Text.DocTemplates`
* Remove redundant `do`s
* Remove redundant `$`s
* Jira reader: remove unnecessary parentheses
2020-02-05 | Simplify an overcomplicated filtering function (#6115) | Joseph C. Sible | 1 | -1/+1
There's no need to use `catMaybes`, `uncurry`, `bool`, etc., just to get elements where the second element of a tuple is True.
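As an illustration of the kind of simplification described, here are two direct ways to keep the elements whose paired flag is True. This is a sketch, not necessarily the form used in the commit; the names `keepFlagged` and `keepFlagged'` are hypothetical.

```haskell
-- List comprehension: the pattern (x, True) only matches flagged pairs,
-- so unflagged pairs are skipped.
keepFlagged :: [(a, Bool)] -> [a]
keepFlagged pairs = [x | (x, True) <- pairs]

-- Equivalent point-free form: filter on the flag, then drop it.
keepFlagged' :: [(a, Bool)] -> [a]
keepFlagged' = map fst . filter snd
```

Neither version needs `catMaybes`, `uncurry`, or `bool`.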
2020-02-04 | Remove our bool function (#6116) | Joseph C. Sible | 2 | -9/+1
Data.Bool already provides a bool function identical to this one.
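For reference, a small sketch of `Data.Bool.bool`, which is a case analysis on `Bool` with the False branch first. The helper `describe` is a made-up example and not part of pandoc.

```haskell
import Data.Bool (bool)

-- bool f t p evaluates to f when p is False and to t when p is True.
describe :: Bool -> String
describe = bool "off" "on"
```

The argument order (False result, then True result) mirrors `maybe` and `either`, where the "lesser" case also comes first.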
2019-11-12 | Switch to new pandoc-types and use Text instead of String [API change]. | despresc | 3 | -35/+69
PR #5884.
+ Use pandoc-types 1.20 and texmath 0.12.
+ Text is now used instead of String, with a few exceptions.
+ In the MediaBag module, some of the types using Strings were switched to use FilePath instead (not Text).
+ In the Parsing module, new parsers `manyChar`, `many1Char`, `manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`, `manyUntilChar` have been added: these are like their unsuffixed counterparts but pack some or all of their output.
+ `glob` in Text.Pandoc.Class still takes String since it seems to be intended as an interface to Glob, which uses strings. It seems to be used only once in the package, in the EPUB writer, so that is not hard to change.
2019-06-20 | Improve the parsing of frames in ODT documents | blmage | 3 | -72/+125
2019-03-01 | Remove license boilerplate. | John MacFarlane | 11 | -206/+0
The haddock module header contains essentially the same information, so the boilerplate is redundant and just one more thing to get out of sync.
2018-12-17 | Replace read with safeRead. Closes #5162. | John MacFarlane | 1 | -6/+3
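The motivation can be sketched with the standard library alone: `read` throws an exception on malformed input, whereas a total parser returns a failure value. The example below uses `readMaybe` from `Text.Read` as a stand-in for pandoc's `safeRead`, and the names `parseCount` and `parseCountOr` are hypothetical.

```haskell
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- Total parse: Nothing instead of a runtime exception on bad input.
parseCount :: String -> Maybe Int
parseCount = readMaybe

-- Fall back to a default when the input cannot be parsed.
parseCountOr :: Int -> String -> Int
parseCountOr def = fromMaybe def . parseCount
```

With this shape, a malformed attribute value in an input document degrades to a default rather than crashing the reader.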
2018-11-22 | Hlint suggestions. | John MacFarlane | 1 | -2/+0
2018-11-11 | Text.Pandoc.Shared: add parameter to uniqueIdent, inlineListToIdentifier. | John MacFarlane | 1 | -2/+7
The parameter is Extensions. This allows these functions to be sensitive to the settings of `Ext_gfm_auto_identifiers` and `Ext_ascii_identifiers`. This allows us to use `uniqueIdent` in the CommonMark reader, replacing some custom code. It also means that `gfm_auto_identifiers` can now be used in all formats. Semantically, `gfm_auto_identifiers` is now a modifier of `auto_identifiers`; for identifiers to be set, `auto_identifiers` must be turned on, and then the type of identifier produced depends on whether `gfm_auto_identifiers` and `ascii_identifiers` are set. Closes #5057.
2018-08-14 | ODT reader: deal gracefully with missing `<office:font-face-decls/>`. | John MacFarlane | 1 | -1/+1
This allows pandoc to parse ODT documents produced by KDE's Calligra. Closes #4336.
2018-08-10 | Avoid non-exhaustive pattern match. | John MacFarlane | 1 | -3/+1
2018-07-02 | Spellcheck comments | Alexander Krotov | 5 | -7/+7
2018-03-18 | Use NoImplicitPrelude and explicitly import Prelude. | John MacFarlane | 11 | -4/+20
This seems to be necessary if we are to use our custom Prelude with ghci. Closes #4464.
2018-03-17 | hlint fixes. | John MacFarlane | 1 | -4/+4
2018-03-16 | Monoid/Semigroup cleanup relying on custom Prelude. | John MacFarlane | 2 | -14/+2
2018-03-16 | Semigroup instance for Styles in T.P.Readers.Odt.StyleReader. | John MacFarlane | 1 | -2/+12
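Background for this change: since base 4.11 (GHC 8.4), `Semigroup` is a superclass of `Monoid`, so any type with a `Monoid` instance also needs a `Semigroup` instance. The sketch below shows the general pattern on a hypothetical `StyleTable` type; it is a simplified stand-in and not the actual `Styles` definition from the ODT style reader.

```haskell
import qualified Data.Map as M

-- Hypothetical, simplified stand-in for a collection of named styles.
newtype StyleTable = StyleTable (M.Map String String)
  deriving (Eq, Show)

-- Combining two tables: M.union is left-biased, so on a name clash
-- the entry from the left table wins.
instance Semigroup StyleTable where
  StyleTable a <> StyleTable b = StyleTable (M.union a b)

-- With Semigroup in place, Monoid only needs the identity element;
-- mappend defaults to (<>).
instance Monoid StyleTable where
  mempty = StyleTable M.empty
```

The left bias is an assumption of this sketch; a real instance would pick whichever merge rule the format requires.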
2018-03-16 | Removed redundant import. | John MacFarlane | 1 | -3/+1
2018-03-13 | Require pandoc-types 1.17.4. | John MacFarlane | 1 | -3/+3
And a few tweaks related to the Semigroups/Monoid change. Closes #4448.
2018-01-19 | hlint code improvements. | John MacFarlane | 4 | -36/+32
2017-11-06 | Spellcheck comments | Alexander Krotov | 2 | -6/+6
2017-11-02 | hlint | Alexander Krotov | 2 | -5/+5
2017-10-29 | Source code reformatting. | John MacFarlane | 1 | -1/+0
2017-10-29 | More hlint fixes. | John MacFarlane | 1 | -1/+1
2017-10-27 | hlint suggestions. | John MacFarlane | 4 | -9/+8
2017-10-27 | Automatic reformatting by stylish-haskell. | John MacFarlane | 8 | -76/+75
2017-06-20 | Odt reader: replaced collectRights with rights from Data.Either. | John MacFarlane | 2 | -6/+2
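For reference, `rights` from `Data.Either` extracts every `Right` payload from a list and discards the `Left`s, which is what a hand-written helper of this kind typically does. The name `successes` below is illustrative only.

```haskell
import Data.Either (rights)

-- Keep the successful results and drop the failures, preserving order.
successes :: [Either String Int] -> [Int]
successes = rights
```

Its counterparts `lefts` and `partitionEithers` live in the same module.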
2017-05-31 | Odt reader: remove dead code | Albert Krewinkel | 7 | -902/+4
The ODT reader contained a lot of general code useful for working with arrows. However, many of these utils weren't used and are hence removed.
2017-01-27 | Shared: rename compactify', compactify'DL -> compactify, compactifyDL. | John MacFarlane | 1 | -2/+2
2017-01-25 | Revert "Added page breaks into Pandoc." | John MacFarlane | 2 | -36/+10
This reverts commit f02a12aff638fa2339192231b8f601bffdfe3e14.
2017-01-25 | Added page breaks into Pandoc. | Hubert Plociniczak | 2 | -10/+36
This requires an updated version of pandoc-types that introduces a PageBreak definition. Note that this initial commit only introduces ODT page breaks and distinguishes page breaks before the paragraph, after it, or both, as read from the style definition.
2016-11-26 | [odt] Infer table's caption from the paragraph (#3224) | hubertp-lshift | 1 | -6/+21
The ODT reader always produced empty captions for parsed tables. This commit
1) checks paragraphs that follow the table definition
2) treats specially a paragraph with a style named 'Table'
3) does some postprocessing of the paragraphs that combines tables followed immediately by captions
The ODT writer used the 'TableCaption' style name for the caption paragraph. This commit follows the OpenOffice approach, which allows appending captions to tables but uses a built-in style named 'Table' instead of 'TableCaption'. Any users of the odt format (both writer and reader) are therefore required to change the style's name to 'Table', if necessary.
2016-11-08 | Inline code when text has a special style | Hubert Plociniczak | 1 | -6/+20
When a piece of text has the text style 'Source_Text', we assume that it represents code that needs to be inlined. Adapted the ODT writer to reflect that change as well; previously it was just writing 'preformatted' text using a non-distinguishable font style. Code blocks are still not recognized by the ODT reader. That's a separate issue.
2016-11-01 | [odt] Infer tables' header props from rows (#3199) | hubertp-lshift | 1 | -2/+9
The ODT reader simply provided an empty header list, which meant that the contents of the whole table, even if not empty, were simply ignored. While we still do not infer headers, we at least have to provide default properties of columns.
2016-10-19 | Image with a caption needs special formatting | Hubert Plociniczak | 1 | -2/+6
The LaTeX writer only handles captions if the image's title is prefixed with 'fig:'.
2016-10-18 | Merge pull request #3166 from hubertp-lshift/bug/3134 | John MacFarlane | 1 | -3/+2
Issue 3143: Don't duplicate text for anchors
2016-10-18 | Merge pull request #3165 from hubertp-lshift/feature/odt-image | John MacFarlane | 1 | -6/+106
[odt] images parser
2016-10-18 | Issue 3143: Don't duplicate text for anchors | Hubert Plociniczak | 1 | -3/+2
When creating an anchor element we were adding its representation as well as the original content, leading to text duplication.
2016-10-17 | Minor refactoring | Hubert Plociniczak | 1 | -10/+6
2016-10-17 | Infer caption from the text following the img | Hubert Plociniczak | 1 | -20/+47
A frame can contain other frames with text boxes. This had not been considered before, which meant that the whole construction of images was broken in those cases. Also the captions were fixed/ignored.
2016-10-14 | Added tests and a corner case for starting number | Hubert Plociniczak | 1 | -0/+1
Review revealed that we didn't handle the case when the starting point is an empty string. While this is not a valid .odt file, we simply added a special case to deal with it. Also added tests for the new feature.