path: root/src/Text/Pandoc/Shared.hs
Age | Commit message | Author | Files | Lines
2021-10-11 | Remove splitSentences from T.P.Shared [API change]. | John MacFarlane | 1 | -28/+0
We used to attempt automatic sentence splitting in man and ms output, since sentence-ending periods need to be followed by two spaces or a newline in these formats. But it's difficult to do this reliably at the level of `[Inline]`.
2021-10-01 | Depend on pandoc-types 1.23, remove Null constructor on Block. | John MacFarlane | 1 | -1/+0
2021-05-09 | Change reader types, allowing better tracking of source positions. | John MacFarlane | 1 | -0/+1
Previously, when multiple file arguments were provided, pandoc simply concatenated them and passed the contents to the readers, which took a Text argument. As a result, the readers had no way of knowing which file was the source of any particular bit of text. This meant that we couldn't report accurate source positions on errors or include accurate source positions as attributes in the AST. More seriously, it meant that we couldn't resolve resource paths relative to the files containing them (see e.g. #5501, #6632, #6384, #3752). Add Text.Pandoc.Sources (exported module), with a `Sources` type and a `ToSources` class. A `Sources` wraps a list of `(SourcePos, Text)` pairs. [API change] A parsec `Stream` instance is provided for `Sources`. The module also exports versions of parsec's `satisfy` and other Char parsers that track source positions accurately from a `Sources` stream (or any instance of the new `UpdateSourcePos` class). Text.Pandoc.Parsing now exports these modified Char parsers instead of the ones parsec provides. Modified parsers to use a `Sources` as stream [API change]. The readers that previously took a `Text` argument have been modified to take any instance of `ToSources`. So, they may still be used with a `Text`, but they can also be used with a `Sources` object. In Text.Pandoc.Error, modified the constructor PandocParsecError to take a `Sources` rather than a `Text` as first argument, so parse error locations can be accurately reported. T.P.Error: showPos, do not print "-" as source name.
2021-04-17 | Move getLang from BCP47 -> T.P.Writers.Shared. | John MacFarlane | 1 | -1/+0
[API change]
2021-04-08 | Fix regression in grid tables for wide characters. | John MacFarlane | 1 | -5/+13
In the translation from String to Text, a char-width-sensitive splitAt' was dropped. This commit reinstates it. Closes #7214.
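A width-sensitive split of this kind can be sketched as follows. This is only an illustration, not pandoc's code: the names `charWidth` and `splitAtWidth` are hypothetical, and the width table is a crude assumption covering a few East Asian ranges.

```haskell
-- Crude stand-in for a real width function (an assumption for this sketch):
-- a few East Asian ranges count as two columns, everything else as one.
charWidth :: Char -> Int
charWidth c
  | inRange '\x1100' '\x115F' = 2   -- Hangul Jamo
  | inRange '\x2E80' '\xA4CF' = 2   -- CJK radicals through Yi (approximate)
  | inRange '\xAC00' '\xD7A3' = 2   -- Hangul syllables
  | inRange '\xF900' '\xFAFF' = 2   -- CJK compatibility ideographs
  | inRange '\xFF01' '\xFF60' = 2   -- fullwidth forms
  | otherwise = 1
  where inRange lo hi = c >= lo && c <= hi

-- Like splitAt, but the budget n is counted in display columns,
-- so each wide character uses up two units of it.
splitAtWidth :: Int -> String -> (String, String)
splitAtWidth _ [] = ([], [])
splitAtWidth n xs | n <= 0 = ([], xs)
splitAtWidth n (x:xs) = (x : ys, zs)
  where (ys, zs) = splitAtWidth (n - charWidth x) xs
```

With a budget of 4 columns, a plain `splitAt 4` would take four characters regardless of width, while the sketch stops after two wide characters.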
2021-03-21 | Simplify T.P.Asciify and export toAsciiText [API change]. | John MacFarlane | 1 | -2/+2
Instead of encoding a giant (and incomplete) map, we now just use unicode-transforms to normalize the text to a canonical decomposition, and manipulate the result. The new `toAsciiText` is equivalent to the old `T.pack . mapMaybe toAsciiChar . T.unpack` but should be faster.
2021-03-20 | T.P.Shared: remove `backslashEscapes`, `escapeStringUsing`. | John MacFarlane | 1 | -13/+0
[API change] These are inefficient association list lookups. Replace with more efficient functions in the writers that used them (with 10-25% performance improvements in haddock, org, rtf, texinfo writers).
2021-03-19 | T.P.Shared: Remove ToString, ToText typeclasses [API change]. | John MacFarlane | 1 | -20/+0
T.P.Parsing: revise type of readWithM so that it takes a Text rather than a polymorphic ToText value. These typeclasses were there to ease the transition from String to Text. They are no longer needed, and they may clash with more useful versions under the same name. This will require a bump to 2.13.
2021-03-15 | Use foldl' instead of foldl everywhere. | John MacFarlane | 1 | -2/+2
2021-03-12 | Simplify compactDL. | John MacFarlane | 1 | -13/+11
2021-03-05 | Shared: Change defaultUserDataDirs -> defaultUserDataDir. | John MacFarlane | 1 | -8/+12
Rationale: the manual says that the XDG data directory will be used if it exists, otherwise the legacy data directory. So we should just determine this and use this directory, rather than having a search path which could cause some things to be taken from one data directory and others from others. [API change]
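The rule described above (use the XDG data directory if it exists, otherwise the legacy one) might look roughly like this. It is a sketch under assumptions: `pickDataDir` and `userDataDir` are hypothetical names, and the directory name "pandoc" is assumed; only the `System.Directory` functions are real library calls.

```haskell
import System.Directory
  ( XdgDirectory (XdgData)
  , doesDirectoryExist
  , getAppUserDataDirectory
  , getXdgDirectory
  )

-- Pure decision rule: a single directory is chosen up front,
-- so resources are never mixed from two locations.
pickDataDir :: Bool -> FilePath -> FilePath -> FilePath
pickDataDir xdgExists xdgDir legacyDir
  | xdgExists = xdgDir
  | otherwise = legacyDir

-- Hypothetical wrapper that probes the file system and applies the rule.
userDataDir :: IO FilePath
userDataDir = do
  xdgDir <- getXdgDirectory XdgData "pandoc"
  legacyDir <- getAppUserDataDirectory "pandoc"
  xdgExists <- doesDirectoryExist xdgDir
  pure (pickDataDir xdgExists xdgDir legacyDir)
```

Keeping the decision in a pure function makes the rule easy to check without touching the file system.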
2021-02-20 | T.P.Shared: remove some obsolete functions [API change]. | John MacFarlane | 1 | -43/+1
Removed:
- `splitByIndices`
- `splitStringByIndicies`
- `substitute`
- `underlineSpan`
None of these are used elsewhere in the code base.
2021-02-18 | T.P.Shared: cleanup. | John MacFarlane | 1 | -11/+26
Cleaned up some functions and added deprecation pragmas to functions no longer used in the code base.
2021-02-13 | T.P.Shared: export `handleTaskListItem`. [API change] | Albert Krewinkel | 1 | -0/+1
2021-01-08 | Update copyright notices for 2021 (#7012) | Albert Krewinkel | 1 | -1/+1
2020-09-13 | Fix hlint suggestions, update hlint.yaml (#6680) | Christian Despres | 1 | -6/+4
* Fix hlint suggestions, update hlint.yaml
  Most suggestions were redundant brackets. Some required LambdaCase. The .hlint.yaml file had a small typo, and didn't ignore camelCase suggestions in certain modules.
2020-05-05 | Shared.makeSections: omit number attribute when unnumbered class... | John MacFarlane | 1 | -1/+2
...is present. Previously the attribute was included but given an empty value, and this caused the table of contents creation functions in T.P.Writers.Shared to think these items had numbers, which meant that they were included in the TOC even if the `unlisted` class was used. Closes #6339.
2020-04-28 | Support new Underline element in readers and writers (#6277) | Vaibhav Sagar | 1 | -2/+3
Deprecate `underlineSpan` in Shared in favor of `Text.Pandoc.Builder.underline`.
2020-04-17 | Merge pull request #6224 from despresc/better-tables | John MacFarlane | 1 | -4/+8
2020-04-16 | Shared: renderTags': use self-closing tag for col element. | John MacFarlane | 1 | -1/+1
Closes #6295.
2020-04-15 | Adapt to the newest Table type, fix some previous adaptation issues | despresc | 1 | -27/+4
- Writers.Native is now adapted to the new Table type.
- Inline captions should now be conditionally wrapped in a Plain, not a Para block.
- The toLegacyTable function now lives in Writers.Shared.
2020-04-15 | Remove the onlySimpleCellBodies function from Shared | despresc | 1 | -13/+2
2020-04-15 | Implement the new Table type | despresc | 1 | -5/+43
2020-03-15 | Use implicit Prelude (#6187) | Albert Krewinkel | 1 | -2/+0
* Use implicit Prelude
  The previous behavior was introduced as a fix for #4464. It seems that this change alone did not fix the issue, and `stack ghci` and `cabal repl` only work with GHC 8.4.1 or newer, as no custom Prelude is loaded for these versions. Given this, it seems cleaner to revert to the implicit Prelude.
* PandocMonad: remove outdated check for base version
  Only base versions 4.9 and later are supported, the check for `MIN_VERSION_base(4,8,0)` is therefore unnecessary.
* Always use custom prelude
  Previously, the custom prelude was used only with older GHC versions, as a workaround for problems with ghci. The ghci problems are resolved by replacing package `base` with `base-noprelude`, allowing for consistent use of the custom prelude across all GHC versions.
2020-03-13 | Update copyright year (#6186) | Albert Krewinkel | 1 | -1/+1
* Update copyright year
* Copyright: add notes for Lua and Jira modules
2020-02-08 | Factor out a findM function (#6125) | Joseph C. Sible | 1 | -0/+9
This adds a new function to the API: Text.Pandoc.Shared.findM.
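For readers unfamiliar with the idiom, a monadic find can be sketched like this. The list-specialised signature below is an assumption made for simplicity and may differ from the one exported by Text.Pandoc.Shared.

```haskell
-- Return the first element for which the monadic predicate holds.
-- Effects for later elements are never run once a match is found.
findM :: Monad m => (a -> m Bool) -> [a] -> m (Maybe a)
findM p = foldr step (pure Nothing)
  where
    step x rest = do
      found <- p x
      if found then pure (Just x) else rest
```

A typical use is probing candidates with an effectful check, for example `findM doesFileExist candidatePaths` with `doesFileExist` from `System.Directory`.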
2020-02-07 | Apply linter suggestions. Add fix_spacing to lint target in Makefile. | John MacFarlane | 1 | -9/+9
2020-02-07 | Resolve HLint warnings | Albert Krewinkel | 1 | -2/+0
All warnings are either fixed or, if more appropriate, HLint is configured to ignore them. HLint suggestions remain.
* Ignore "Use camelCase" warnings in Lua and legacy code
* Fix or ignore remaining HLint warnings
* Remove redundant brackets
* Remove redundant `return`s
* Remove redundant as-pattern
* Fuse mapM_/map
* Use `.` to shorten code
* Remove redundant `fmap`
* Remove unused LANGUAGE pragmas
* Hoist `not` in Text.Pandoc.App
* Use fewer imports for `Text.DocTemplates`
* Remove redundant `do`s
* Remove redundant `$`s
* Jira reader: remove unnecessary parentheses
2020-02-07 | Various minor cleanups and refactoring (#6117) | Joseph C. Sible | 1 | -4/+3
* Use concatMap instead of reimplementing it
* Replace an unnecessary multi-way if with a regular if
* Use sortOn instead of sortBy and comparing
* Use guards instead of lots of indents for if and else
* Remove redundant do blocks
* Extract common functions from both branches of maybe
  Whenever both the Nothing and the Just branch of maybe do the same function, do that function on the result of maybe instead.
* Use fmap instead of reimplementing it from maybe
* Use negative forms instead of negating the positive forms
* Use mapMaybe instead of mapping and then using catMaybes
* Use zipWith instead of mapping over the result of zip
* Use unwords instead of reimplementing it
* Use <$ instead of <$> and const
* Replace case of Bool with if and else
* Use find instead of listToMaybe and filter
* Use zipWithM instead of mapM and zip
* Inline lambda wrappers into the real functions
* We get zipWithM from Text.Pandoc.Writers.Shared
* Use maybe instead of fromMaybe and fmap
  I'm not sure how this one slipped past me.
* Increase a bit of indentation
2019-12-17 | Improved makeSections so we don't get doubled attributes. | John MacFarlane | 1 | -13/+17
Closes #5986.
2019-12-07 | Fix --toc-depth regression in 2.8. | John MacFarlane | 1 | -6/+6
Closes #5967.
2019-12-05 | Roll back part of `--shift-heading-level-by` change. | John MacFarlane | 1 | -6/+0
With positive heading shifts, starting in 2.8 this option caused metadata titles to be removed and changed to regular headings. This behavior is incompatible with the old behavior of `--base-header-level` and breaks old workflows, so with this commit we are rolling back this change. Now, there is an asymmetry in positive and negative heading level shifts:
+ With positive shifts, the metadata title stays the same and does not get changed to a heading in the body.
+ With negative shifts, a heading can be converted into the metadata title.
I think this is a desirable combination of features, despite the asymmetry. One might, e.g., want to have a document with level-1 section headings, but render it to HTML with level-2 headings, retaining the metadata title (which pandoc will render as a level-1 heading with the default template). Closes #5957. Revises #5615.
2019-12-05 | Fix makeSections so it doesn't turn column divs into sections. | John MacFarlane | 1 | -1/+3
2019-11-12 | Switch to new pandoc-types and use Text instead of String [API change]. | despresc | 1 | -118/+175
PR #5884.
+ Use pandoc-types 1.20 and texmath 0.12.
+ Text is now used instead of String, with a few exceptions.
+ In the MediaBag module, some of the types using Strings were switched to use FilePath instead (not Text).
+ In the Parsing module, new parsers `manyChar`, `many1Char`, `manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`, `manyUntilChar` have been added: these are like their unsuffixed counterparts but pack some or all of their output.
+ `glob` in Text.Pandoc.Class still takes String since it seems to be intended as an interface to Glob, which uses strings. It seems to be used only once in the package, in the EPUB writer, so that is not hard to change.
2019-11-11 | Change the implementation of `htmlSpanLikeElements` and implement `<dfn>` (#5882) | Florian Beeres | 1 | -1/+1
* Add HTML Reader support for `<dfn>`, parsing this as a Span with class `dfn`.
* Change `htmlSpanLikeElements` implementation to retain classes, attributes and inline content.
2019-10-29 | Shared.makeSections: better behavior in some corner cases. | John MacFarlane | 1 | -3/+7
When a div surrounds multiple sections at the same level, or a section of higher level followed by one of lower level, then we just leave it as a div and create a new div for the section. Closes #5846, closes #5761.
2019-10-28 | Shared: improve isTight. | John MacFarlane | 1 | -1/+1
If a list has an empty item, this should not count against its being a tight list. Closes #5857.
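The rule can be illustrated with a toy block type. This is a sketch, not pandoc's definition: the `Block` type here is a simplified stand-in for the one in pandoc-types, and `isTightList` is written only to show the empty-item case.

```haskell
-- Simplified stand-in for pandoc's Block type (an assumption for this sketch).
data Block = Plain String | Para String
  deriving (Eq, Show)

-- A list is tight when every item is either empty or starts with a Plain
-- block; an empty item no longer forces the list to be loose.
isTightList :: [[Block]] -> Bool
isTightList = all (\item -> null item || firstIsPlain item)
  where
    firstIsPlain (Plain _ : _) = True
    firstIsPlain _ = False
```

Under this rule a list with items `[Plain "a"]`, `[]`, `[Plain "b"]` counts as tight, while any item beginning with a `Para` makes it loose.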
2019-10-16 | Add support for reading & writing <mark> elements | Florian B | 1 | -1/+1
Parse <mark> elements from HTML as HTML span like elements, with a single class matching the tag name `mark`. Mark elements are rendered to HTML using the native <mark> element. Fixes https://github.com/jgm/pandoc/issues/5797.
2019-10-15 | Add support for reading and writing <kbd> elements | Daniele D'Orazio | 1 | -0/+6
* Text.Pandoc.Shared: export `htmlSpanLikeElements` [API change] This commit also introduces a mapping of HTML span like elements that are internally represented as a Span with a single class, but that are converted back to the original element by the html writer. As of now, only the kbd element is handled this way. Ideally these elements should be handled as plain AST values, but since that would be a breaking change with a large impact, we revert to this stop-gap solution. Fixes https://github.com/jgm/pandoc/issues/5796.
2019-10-11 | Fix `gfm_auto_identifiers` behavior with emojis. | John MacFarlane | 1 | -1/+8
Closes #5813. Note that we also now use emoji names for emojis when `ascii_identifiers` is enabled.
2019-10-07 | Shared.camelCaseToHyphenated: handle ABCDef = abc-def. | John MacFarlane | 1 | -2/+8
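The ABCDef case can be sketched on plain Strings as follows. This is an illustration only: the String-based signature is an assumption, and the function in Text.Pandoc.Shared may be written differently.

```haskell
import Data.Char (isLower, isUpper, toLower)

-- Convert camelCase to hyphenated lower case.
camelCaseToHyphenated :: String -> String
camelCaseToHyphenated [] = ""
-- lower followed by upper: fooBar -> foo-bar
camelCaseToHyphenated (a:b:rest)
  | isLower a, isUpper b =
      a : '-' : toLower b : camelCaseToHyphenated rest
-- end of an upper-case run followed by a word: ABCDef -> abc-def
camelCaseToHyphenated (a:b:c:rest)
  | isUpper a, isUpper b, isLower c =
      toLower a : '-' : toLower b : camelCaseToHyphenated (c : rest)
-- anything else is just lower-cased
camelCaseToHyphenated (a:rest) = toLower a : camelCaseToHyphenated rest
```

The second clause is what treats a run of capitals such as `ABC` as one word whose last letter starts the next word.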
2019-09-19 | EPUB writer: improve splitting into chapters. | John MacFarlane | 1 | -3/+6
+ Use makeSections from T.P.Shared. This deals better with embedded divs. (Closes #5761.)
+ Remove chapter-title class from chapter h1, for now. (Reverts one change made earlier; we may revisit this in light of #5749.)
+ Avoid issuing warning multiple times when title not set (see #5760).
2019-09-10 | Add --shift-heading-level-by option. | John MacFarlane | 1 | -4/+17
Deprecate --base-heading-level. The new option does everything the old one does, but also allows negative shifts. It also promotes the document metadata (if not null) to a level-1 heading with a +1 shift, and demotes an initial level-1 heading to document metadata with a -1 shift. This supports converting documents that use an initial level-1 heading for the document title. Closes #5615.
2019-09-08 | Replace Element and makeHierarchical with makeSections. | John MacFarlane | 1 | -61/+66
Text.Pandoc.Shared:
+ Remove `Element` type [API change]
+ Remove `makeHierarchicalize` [API change]
+ Add `makeSections` [API change]
+ Export `deLink` [API change]
Now that we have Divs, we can use them to represent the structure of sections, and we don't need a special Element type. `makeSections` reorganizes a block list, adding Divs with class `section` around sections, and adding numbering if needed. This change also fixes some longstanding issues recognizing section structure when the document contains Divs. Closes #3057, see also #997. All writers have been changed to use `makeSections`. Note that in the process we have reverted the change c1d058aeb1c6a331a2cc22786ffaab17f7118ccd made in response to #5168, which I'm not completely sure was a good idea. Lua modules have also been adjusted accordingly. Existing lua filters that use `hierarchicalize` will need to be rewritten to use `make_sections`.
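The basic idea of turning a flat block list into nested sections can be sketched with a toy type. Everything here is hypothetical: `Block`, `Section`, and `toySections` are illustrative stand-ins, and the sketch ignores numbering, attributes, and existing Divs, all of which the real `makeSections` has to handle.

```haskell
-- Toy block type; Section plays the role of a Div with class "section".
data Block
  = Header Int String
  | Para String
  | Section Int String [Block]
  deriving (Eq, Show)

-- Wrap each header and everything up to the next header of the same or
-- a higher level (smaller or equal number) in a Section, recursively.
toySections :: [Block] -> [Block]
toySections [] = []
toySections (Header lvl title : rest) =
  let (inside, after) = break (endsSection lvl) rest
  in Section lvl title (toySections inside) : toySections after
toySections (b : rest) = b : toySections rest

-- A header at the same or a higher level closes the current section.
endsSection :: Int -> Block -> Bool
endsSection lvl (Header l _) = l <= lvl
endsSection _ _ = False
```

For the flat input `one` (level 1), a paragraph, `two` (level 2), a paragraph, `three` (level 1), the sketch yields a section `one` containing the first paragraph and a nested section `two`, followed by a sibling section `three`.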
2019-09-08 | Revert changes to hierarchicalizeWithIds. | John MacFarlane | 1 | -21/+5
Revert "hierarchicalize: ensure that sections get ids..." This reverts commit 212406a61d027d85712705e626954e0486a2bc34. Revert "Improve detection of headings in Divs by hierarchicalize." This reverts commit 6e2cfd6c97b1b8657f1f3e2b66090a2c3ba8d887. Revert "Shared.hierarchicalize: improve handling of div and section structure." This reverts commit 345b33762eb4cc6d57d74c76c4757a6166ee5c13.
2019-09-06 | hierarchicalize: ensure that sections get ids... | John MacFarlane | 1 | -6/+10
even if they're in divs. Improves #3057.
2019-09-06 | Improve detection of headings in Divs by hierarchicalize. | John MacFarlane | 1 | -1/+2
The structure
```
<h1>one</h1>
<div>
<h1>two</h1>
</div>
```
should create two coordinate sections, not a section with a subsection. Now it does. Extends #3057.
2019-09-05 | Shared.hierarchicalize: improve handling of div and section structure. | John MacFarlane | 1 | -4/+15
Previously Divs were opaque to hierarchicalize, so headings inside divs didn't get into the table of contents, for example (#3057). Now hierarchicalize treats Divs as sections when appropriate. For example, these structures both yield a section and a subsection:
``` html
<div>
<h1>one</h1>
<div>
<h2>two</h2>
</div>
</div>
```
``` html
<div>
<h1>one</h1>
<div>
<h1>two</h1>
</div>
</div>
```
Note that
``` html
<h1>one</h1>
<div>
<h2>two</h2>
</div>
<h1>three</h1>
```
gets parsed as the structure
    one
      two
    three
which may not always be desirable. Closes #3057.
2019-08-25 | Use new doctemplates, doclayout. | John MacFarlane | 1 | -1/+1
+ Remove Text.Pandoc.Pretty; use doclayout instead. [API change]
+ Text.Pandoc.Writers.Shared: remove metaToJSON, metaToJSON' [API change].
+ Text.Pandoc.Writers.Shared: modify `addVariablesToContext`, `defField`, `setField`, `getField`, `resetField` to work with Context rather than JSON values. [API change]
+ Text.Pandoc.Writers.Shared: export new function `endsWithPlain` [API change].
+ Use new templates and doclayout in writers.
+ Use Doc-based templates in all writers.
+ Adjust three tests for minor template rendering differences.
+ Added indentation to body in docbook4, docbook5 templates.
The main impact of this change is better reflowing of content interpolated into templates. Previously, interpolated variables were rendered independently and interpolated as strings, which could lead to overly long lines. Now the templates are interpolated as Doc values which may include breaking spaces, and reflowing occurs after template interpolation rather than before.
2019-07-16 | Make filterIpynbOutput strip ANSI escapes from code in output... | John MacFarlane | 1 | -1/+10
for non-ipynb formats, when the default "best" option is used with --ipynb-output. The escape sequences cause problems in many formats, including LaTeX. Closes #5633.
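Stripping such escape sequences can be sketched as below. This is a minimal illustration, not the code from this commit: `stripAnsi` is a hypothetical name, and the sketch only handles CSI sequences (ESC, `[`, parameters, one final byte), which is an assumption about what the output contains.

```haskell
-- Remove CSI escape sequences such as "\ESC[31m" from a string.
-- Other kinds of escape sequences are left untouched in this sketch.
stripAnsi :: String -> String
stripAnsi ('\ESC' : '[' : rest) =
  -- skip parameter bytes, then drop the single final byte
  stripAnsi (drop 1 (dropWhile (not . isFinal) rest))
  where
    isFinal c = c >= '@' && c <= '~'
stripAnsi (c : rest) = c : stripAnsi rest
stripAnsi [] = []
```

Applied to coloured terminal output, this leaves only the visible text, which is what formats like LaTeX can safely include.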