Age | Commit message (Collapse) | Author | Files | Lines |
|
..and add new definitions isomorphic to xml-light's, but with
Text instead of String. This allows us to keep most of the code in
existing readers that use xml-light, but avoid lots of unnecessary
allocation.
We also add versions of the functions from xml-light's
Text.XML.Light.Output and Text.XML.Light.Proc that operate
on our modified XML types, and functions that convert
xml-light types to our types (since some of our dependencies,
like texmath, use xml-light).
Update golden tests for docx and pptx.
OOXML test: Use `showContent` instead of `ppContent` in `displayDiff`.
Docx: Do a manual traversal to unwrap sdt and smartTag.
This is faster, and needed to pass the tests.
Benchmarks:
A = prior to 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
B = as of 8ca191604dcd13af27c11d2da225da646ebce6fc (Feb 8)
C = this commit
| Reader | A | B | C |
| ------- | ----- | ------ | ----- |
| docbook | 18 ms | 12 ms | 10 ms |
| opml | 65 ms | 62 ms | 35 ms |
| jats | 15 ms | 11 ms | 9 ms |
| docx | 72 ms | 69 ms | 44 ms |
| odt | 78 ms | 41 ms | 28 ms |
| epub | 64 ms | 61 ms | 56 ms |
| fb2 | 14 ms | 5 ms | 4 ms |
|
|
Setting SOURCE_DATE_EPOCH will allow reproducible builds.
Partially addresses #7093. This does not suffice to fully enable
reproducible builds in EPUB, since a unique id is generated for each
build.
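The general `SOURCE_DATE_EPOCH` convention can be sketched as below. This is an illustration in Python, not pandoc's Haskell code; the function name `build_timestamp` and its `environ` parameter are invented for the example.

```python
import os
from datetime import datetime, timezone


def build_timestamp(environ=None):
    """Return the timestamp to embed in generated files.

    If SOURCE_DATE_EPOCH holds an integer number of seconds since the
    Unix epoch, use it so that repeated builds produce the same value.
    Otherwise fall back to the current time.
    """
    env = os.environ if environ is None else environ
    raw = env.get("SOURCE_DATE_EPOCH")
    if raw is not None:
        try:
            return datetime.fromtimestamp(int(raw), tz=timezone.utc)
        except (ValueError, OverflowError, OSError):
            # Unparseable or out-of-range value: ignore it (an assumption;
            # a real tool might prefer to report an error instead).
            pass
    return datetime.now(timezone.utc)
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.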
|
|
Instead of hard-coding the border and header cell vertical alignment,
we now let this be determined by the Table style, making use of
Word's "conditional formatting" for the table's first row.
For headerless tables, we use the tblLook element to tell Word
not to apply conditional first-row formatting.
Closes #7008.
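A minimal sketch of the idea, in Python rather than pandoc's Haskell: the only thing that changes between a table with a header and a headerless one is the `w:firstRow` flag on `tblLook`. The helper name `tbl_look` is hypothetical, and a real writer would likely emit further attributes on this element.

```python
def tbl_look(has_header_row):
    """Build a w:tblLook element as a string.

    w:firstRow="1" asks Word to apply the table style's conditional
    first-row formatting; "0" switches it off for headerless tables.
    Only this one attribute is shown; others are omitted on purpose.
    """
    first_row = "1" if has_header_row else "0"
    return '<w:tblLook w:firstRow="{}"/>'.format(first_row)
```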
|
|
|
|
|
|
|
|
If we have a paragraph then a bookmarkEnd, we don't need to
insert the empty paragraph (and in fact it alters the spacing).
Closes #6983.
|
|
Previously we got unreadable content, because docx seems
to want a `<w:p>` element (even an empty one) at the end of
every table cell. Closes #6983.
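The rule described here, combined with the later bookmarkEnd refinement, might look like the following sketch. It is written in Python for illustration only; cell contents are modelled as a plain list of tag names, and `ensure_trailing_paragraph` is a made-up name.

```python
def ensure_trailing_paragraph(children):
    """Return the cell's children, adding an empty w:p only if needed.

    A cell is fine as it stands when its last element, ignoring any
    trailing w:bookmarkEnd elements, is already a paragraph.
    """
    i = len(children)
    # Skip over bookmarkEnd elements at the end of the cell.
    while i > 0 and children[i - 1] == "w:bookmarkEnd":
        i -= 1
    if i > 0 and children[i - 1] == "w:p":
        return list(children)
    return list(children) + ["w:p"]
```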
|
|
Closes: #6933
|
|
|
|
Previously bold and italics didn't work properly in LTR
text. This commit causes the w:bCs and w:iCs attributes
to be used, in addition to w:b and w:i, for bold and
italics respectively.
Closes #6911.
|
|
Fix appearance of bullets/numbered lists (the first level is slightly
indented to the right instead of right on the margin).
New golden files have been tested using Word 2010 on Windows 10.
|
|
For security reasons, some legal firms delete the date from comments and
tracked changes.
* Make date optional (Maybe) in tracked changes and comments datatypes
* Add tests
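As a rough illustration of what an optional date means for code that reads tracked changes, here is a Python sketch. The `ChangeInfo` class and `change_info_from_attrs` helper are hypothetical stand-ins, not pandoc's datatypes; attributes are modelled as a dict keyed by `w:id`, `w:author` and `w:date`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeInfo:
    change_id: str
    author: str
    # None when the document has had its dates stripped.
    date: Optional[str] = None


def change_info_from_attrs(attrs):
    """Build a ChangeInfo from an element's attributes.

    The date is looked up with .get so that a missing w:date yields
    None instead of making the whole change unreadable.
    """
    return ChangeInfo(
        change_id=attrs["w:id"],
        author=attrs["w:author"],
        date=attrs.get("w:date"),
    )
```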
|
|
If the first element of a bulleted or ordered list is another list,
then that first item will disappear if the target format is docx. This
changes the docx writer so that it prepends an empty string for those
cases. With this, no items will disappear.
Closes #5948.
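The fix can be pictured with a small Python sketch. Blocks are modelled as tuples whose first element is the block type, and the empty string is represented as an empty `Plain` block; both choices and the name `pad_list_item` are assumptions made for the example.

```python
def pad_list_item(blocks):
    """Prepend an empty text block if a list item starts with a list.

    Without the padding, the item has no text of its own for the
    bullet or number to attach to.
    """
    if blocks and blocks[0][0] in ("BulletList", "OrderedList"):
        return [("Plain", "")] + list(blocks)
    return list(blocks)
```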
|
|
This deprecates the use of the external pandoc-citeproc
filter; citation processing is now built in to pandoc.
* Add dependency on citeproc library.
* Add Text.Pandoc.Citeproc module (and some associated unexported
modules under Text.Pandoc.Citeproc). Exports `processCitations`.
[API change]
* Add data files needed for Text.Pandoc.Citeproc: default.csl
in the data directory, and a citeproc directory that is just
used at compile-time. Note that we've added file-embed as a mandatory
rather than a conditional dependency, because of the biblatex
localization files. We might eventually want to use readDataFile
for this, but it would take some code reorganization.
* Text.Pandoc.Logging: Add `CiteprocWarning` to `LogMessage` and use it
in `processCitations`. [API change]
* Add tests from the pandoc-citeproc package as command tests (including
some tests pandoc-citeproc did not pass).
* Remove instructions for building pandoc-citeproc from CI and
release binary build instructions. We will no longer distribute
pandoc-citeproc.
* Markdown reader: tweak abbreviation support. Don't insert a
nonbreaking space after a potential abbreviation if it comes right before
a note or citation. This messes up several things, including citeproc's
moving of note citations.
* Add `csljson` as an input and output format. This allows pandoc
to convert between `csljson` and other bibliography formats,
and to generate formatted versions of CSL JSON bibliographies.
* Add module Text.Pandoc.Writers.CslJson, exporting `writeCslJson`. [API
change]
* Add module Text.Pandoc.Readers.CslJson, exporting `readCslJson`. [API
change]
* Add `bibtex`, `biblatex` as input formats. This allows pandoc
to convert between BibLaTeX and BibTeX and other bibliography formats,
and to generate formatted versions of BibTeX/BibLaTeX bibliographies.
* Add module Text.Pandoc.Readers.BibTeX, exporting `readBibTeX` and
`readBibLaTeX`. [API change]
* Make "standalone" implicit if output format is a bibliography format.
This is needed because pandoc readers for bibliography formats put
the bibliographic information in the `references` field of metadata;
and unless standalone is specified, metadata gets ignored.
(TODO: This needs improvement. We should trigger standalone for the
reader when the input format is bibliographic, and for the writer
when the output format is markdown.)
* Carry over `citationNoteNum` to `citationNoteNumber`. This was just
ignored in pandoc-citeproc.
* Text.Pandoc.Filter: Add `CiteprocFilter` constructor to Filter.
[API change] This runs the processCitations transformation.
We need to treat it like a filter so it can be placed
in the sequence of filter runs (after some, before others).
In FromYAML, this is parsed from `citeproc` or `{type: citeproc}`,
so this special filter may be specified either way in a defaults file
(or by `citeproc: true`, though this gives no control of positioning
relative to other filters). TODO: we need to add something to the
manual section on defaults files for this.
* Add deprecation warning if `pandoc-citeproc` filter is used.
* Add `--citeproc/-C` option to trigger citation processing.
This behaves like a filter and will be positioned
relative to filters as they appear on the command line.
* Rewrote the manual on citations, adding a dedicated Citations
section which also includes some information formerly found in
the pandoc-citeproc man page.
* Look for CSL styles in the `csl` subdirectory of the pandoc user data
directory. This changes the old pandoc-citeproc behavior, which looked
in `~/.csl`. Users can simply symlink `~/.csl` to the `csl`
subdirectory of their pandoc user data directory if they want
the old behavior.
* Add support for CSL bibliography entry formatting to LaTeX, HTML,
Ms writers. Added CSL-related CSS to styles.html.
|
|
* Fix hlint suggestions, update hlint.yaml
Most suggestions were redundant brackets. Some required
LambdaCase.
The .hlint.yaml file had a small typo, and didn't ignore camelCase
suggestions in certain modules.
|
|
|
|
Word combines adjacent tables, so to prevent this we insert
an empty paragraph between two adjacent tables.
Closes #4315.
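A sketch of the separator logic, in Python for illustration. Blocks are modelled as tuples whose first element is the block type, the empty paragraph is written as `("Para", "")`, and `separate_tables` is a hypothetical name.

```python
def separate_tables(blocks):
    """Insert an empty paragraph between directly adjacent tables."""
    out = []
    for block in blocks:
        # Compare with the previous *original* block, which is the last
        # thing appended so far unless a separator was just added.
        if out and out[-1][0] == "Table" and block[0] == "Table":
            out.append(("Para", ""))
        out.append(block)
    return out
```

Three tables in a row get two separators, since each inserted paragraph sits between one adjacent pair.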
|
|
Closes #1413.
|
|
This change will not have any effect with the default style.
However, it enables users to use a style (via a reference.docx)
that turns on row and/or column bands.
Closes #6371.
|
|
Deprecate `underlineSpan` in Shared in favor of `Text.Pandoc.Builder.underline`.
|
|
- Writers.Native is now adapted to the new Table type.
- Inline captions should now be conditionally wrapped in a Plain, not
a Para block.
- The toLegacyTable function now lives in Writers.Shared.
|
|
|
|
* Use <|> to simplify the Semigroup instance
* Use map instead of reimplementing it
* Simplify isValidChar
* Remove an unnecessary nested do block
* Simplify pgContentWidth
* Simplify addLang
* Simplify newStyles
* Avoid an unnecessary fmap in headerFooterEntries
* Remove unnecessary monadicity from mkNumbering and mkAbstractNum
* Use randomRs instead of constantly messing with the RNG state
* Lift common functions out of ifs
* Hoist not
* Clarify withTextPropM and withParaPropM
|
|
* Avoid fmapping when we're just binding right after anyway
* Clean up unnecessary fmaps in the LaTeX reader
|
|
This should speed up recompilation after changes in `Text.Pandoc.Class`,
as the number of modules affected by a change will be smaller in
general. It also offers faster insights into the parts of `T.P.Class`
used within a module.
|
|
* Use implicit Prelude
The previous behavior was introduced as a fix for #4464. It seems that
this change alone did not fix the issue, and `stack ghci` and `cabal
repl` only work with GHC 8.4.1 or newer, as no custom Prelude is loaded
for these versions. Given this, it seems cleaner to revert to the
implicit Prelude.
* PandocMonad: remove outdated check for base version
Only base versions 4.9 and later are supported, so the check for
`MIN_VERSION_base(4,8,0)` is unnecessary.
* Always use custom prelude
Previously, the custom prelude was used only with older GHC versions, as
a workaround for problems with ghci. The ghci problems are resolved by
replacing package `base` with `base-noprelude`, allowing for consistent
use of the custom prelude across all GHC versions.
|
|
* Update copyright year
* Copyright: add notes for Lua and Jira modules
|
|
Starting in 2.8, the docx writer no longer distinguishes
between tight and loose lists, since the Compact style is
omitted.
This is a side-effect of the fix to #5670, as explained
in the changelog:
+ Preserve built-in styles in DOCX with custom style (Ben Steinberg,
#5670). This change prevents custom styles on divs and spans
from overriding styles on certain elements inside them, like
headings, blockquotes, and links. On those elements, the
"native" style is required for the element to display correctly.
This change also allows nesting of custom styles; in order to do so,
it removes the default "Compact" style applied to Plain blocks,
except when inside a table.
This patch fixes the problem by extending the exception currently
offered to Plain blocks inside tables to Plain blocks inside list
items.
Closes #6072.
|
|
PR #5884.
+ Use pandoc-types 1.20 and texmath 0.12.
+ Text is now used instead of String, with a few exceptions.
+ In the MediaBag module, some of the types using Strings
were switched to use FilePath instead (not Text).
+ In the Parsing module, new parsers `manyChar`, `many1Char`,
`manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`,
`manyUntilChar` have been added: these are like their
unsuffixed counterparts but pack some or all of their output.
+ `glob` in Text.Pandoc.Class still takes String since it seems
to be intended as an interface to Glob, which uses strings.
It seems to be used only once in the package, in the EPUB writer,
so that is not hard to change.
|
|
|
|
|
|
|
|
|
|
* [Docx Parser] Move style-parsing-specific code to a new module
* [Docx Writer] Re-use Readers.Docx.Parse.Styles for StyleMap
* [Docx Writer] Move Readers.Docx.StyleMap to Writers.Docx.StyleMap
It's never used outside of writer code, so it makes more sense to scope it under writers really.
|
|
Styles that this change affects: paragraph styles: Author, Abstract,
Compact, Figure, Captioned Figure, Image Caption, First Paragraph,
Source Code, Table Caption, Definition, Definition Term; character
styles: Verbatim Char, token styles (those with names ending in Tok)
|
|
Reduce code duplication, remove redundant brackets
|
|
This commit prevents custom styles on divs and spans from overriding
styles on certain elements inside them, like headings, blockquotes,
and links. On those elements, the "native" style is required for the
element to display correctly. This change also allows nesting of
custom styles; in order to do so, it removes the default "Compact"
style applied to Plain blocks, except when inside a table.
|
|
Text.Pandoc.Shared:
+ Remove `Element` type [API change]
+ Remove `hierarchicalize` [API change]
+ Add `makeSections` [API change]
+ Export `deLink` [API change]
Now that we have Divs, we can use them to represent the structure
of sections, and we don't need a special Element type.
`makeSections` reorganizes a block list, adding Divs with
class `section` around sections, and adding numbering
if needed.
This change also fixes some longstanding issues recognizing
section structure when the document contains Divs.
Closes #3057, see also #997.
All writers have been changed to use `makeSections`.
Note that in the process we have reverted the change
c1d058aeb1c6a331a2cc22786ffaab17f7118ccd
made in response to #5168, which I'm not completely
sure was a good idea.
Lua modules have also been adjusted accordingly.
Existing lua filters that use `hierarchicalize` will
need to be rewritten to use `make_sections`.
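The grouping that `makeSections` performs can be approximated by the Python sketch below. It is not the Haskell implementation: blocks are modelled as tuples, section Divs as `("Div", "section", contents)`, numbering is left out, and `make_sections` here is only an illustrative stand-in.

```python
def make_sections(blocks):
    """Group a flat block list into nested section Divs.

    Each header opens a Div that contains the header itself and all
    following blocks up to the next header of the same or a higher
    level. Deeper headers open nested Divs.
    """
    root = []
    # Stack of (header level, list receiving that section's content).
    # The level-0 entry is the document root and is never popped,
    # because header levels start at 1.
    stack = [(0, root)]
    for block in blocks:
        if block[0] == "Header":
            level = block[1]
            # Close every open section at this level or deeper.
            while stack[-1][0] >= level:
                stack.pop()
            div = ("Div", "section", [block])
            stack[-1][1].append(div)
            stack.append((level, div[2]))
        else:
            stack[-1][1].append(block)
    return root
```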
|
|
|
|
Workaround for Word Online shortcoming. Fixes #5645.
Also, make list para properties go first.
This reordering of properties shouldn't be necessary but
it seems Word Online does not understand the docx correctly otherwise.
|
|
We previously added the attribute `type="textWrapping"`, but
this causes problems on Word Online.
Closes #5377.
|
|
...in numbering.xml. This caused pandoc-produced docx files to
be uneditable using Word Online.
The problem was that recent versions of reference.docx include
samples of various kinds of text, including lists. The
numbering elements for these were getting copied over to
the new docx, where they clashed with the autogenerated
elements produced by pandoc. This didn't confuse Desktop
Word, but it did confuse Word Online.
Closes #5358.
|
|
The haddock module header contains essentially the
same information, so the boilerplate is redundant and
just one more thing to get out of sync.
|
|
Quite a few modules were missing copyright notices.
This commit adds copyright notices everywhere via haddock module
headers. The old license boilerplate comment is redundant with this and has
been removed.
Update copyright years to 2019.
Closes #4592.
|
|
* docx writer: support custom properties. Solves the writer part of #3024.
Also supports additional core properties: `subject`, `lang`, `category`,
`description`.
* odt writer: improve standard properties, including the following core properties:
`generator` (Pandoc/VERSION), `description`, `subject`, `keywords`,
`initial-creator` (from authors), `creation-date` (actual creation date).
Also fix date.
* pptx writer: support custom properties. Also supports additional core
properties: `subject`, `category`, `description`.
* Includes golden tests.
* MANUAL: document metadata support for docx, odt, pptx writers
|
|
closes #5180
|
|
Word has a 40 character limit for bookmark names. In
addition, bookmarks must begin with a letter. Since
pandoc's auto-generated identifiers may not respect
these constraints, some internal links did not work.
With this change, pandoc uses a bookmark name based
on the SHA1 hash of the identifier when the identifier
isn't a legal bookmark name.
Closes #5091.
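A Python sketch of the approach, for illustration only. The name `to_bookmark_name` and the detail of replacing the first hex digit with `X` (so the result starts with a letter and stays within 40 characters) are assumptions made for the example, not a statement of how pandoc builds the name.

```python
import hashlib


def to_bookmark_name(identifier):
    """Return a name that satisfies Word's bookmark constraints.

    Identifiers that already start with a letter and are at most
    40 characters long are kept. Anything else is replaced by a name
    derived from the SHA1 hash of the identifier.
    """
    if identifier and identifier[0].isalpha() and len(identifier) <= 40:
        return identifier
    digest = hashlib.sha1(identifier.encode("utf-8")).hexdigest()
    # A SHA1 hex digest is 40 characters and may start with a digit,
    # so swap its first character for a letter (assumed scheme).
    return "X" + digest[1:]
```

Because the hash depends only on the identifier, the link target and the link itself map to the same bookmark name.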
|
|
|
|
This was a mismatch between pandoc's docx, epub, latex, and markdown
writers and the behavior of pandoc-citeproc, which actually looks
for a div with id 'refs' rather than one with class 'references'.
|