path: root/src/Text
Age | Commit message | Author | Files | Lines
2021-10-22 | Lua: marshal Inline elements as userdata | Albert Krewinkel | 2 | -63/+345
This includes the following user-facing changes:

- Deprecated inline constructors are removed. These are `DoubleQuoted`, `SingleQuoted`, `DisplayMath`, and `InlineMath`.
- Attr values are no longer normalized when assigned to an Inline element property.
- It's no longer possible to access parts of Inline elements via numerical indexes. E.g., `pandoc.Span('test')[2]` used to give `pandoc.Str 'test'`, but yields `nil` now. This was undocumented behavior not intended to be used in user scripts. Use named properties instead.
- Accessing `.c` to get a JSON-like tuple of all components no longer works. This was undocumented behavior.
- Only known properties can be set on an element value. Trying to set a different property will now raise an error.
2021-10-22 | Lua: marshal Attr values as userdata | Albert Krewinkel | 4 | -14/+233
- Adds a new `pandoc.AttributeList()` constructor, which creates the associative attribute list that is used as the third component of `Attr` values. Values of this type can often be passed to constructors instead of `Attr` values.
- `AttributeList` values can no longer be indexed numerically.
2021-10-22 | Lua: marshal Pandoc values as userdata | Albert Krewinkel | 2 | -11/+36
2021-10-22 | Switch to hslua-2.0 | Albert Krewinkel | 24 | -1187/+1095
The new HsLua version takes a somewhat different approach to marshalling and unmarshalling, relying less on typeclasses and more on specialized types. This allows for better performance and improved error messages. Furthermore, new abstractions make it possible to document the code and the exposed functions.
2021-10-21 | Move splitStrWhen to T.P.Citeproc.Util. | John MacFarlane | 3 | -23/+15
Previously there were two copies, in BibTeX and Locator.
2021-10-21 | SelfContained: fix bug that caused everything to be made a data uri. | John MacFarlane | 1 | -12/+12
All the code we needed to put most styles and scripts into inline style and script tags was there, but because of the order of pattern matching, it was never being called. Putting the catch-all clause at the end fixes the bug. Closes #7635, closes #7367. See also #3423.
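The effect of clause order can be sketched in Python. This is only an illustration, not pandoc's Haskell code: the tag dictionaries, predicate names, and action labels below are all hypothetical.

```python
# Hypothetical predicates over a tag represented as a plain dict.
def is_script(tag):
    return tag.get("name") == "script"

def is_stylesheet(tag):
    return tag.get("rel") == "stylesheet"

def has_url(tag):
    # Catch-all: anything that points at an external resource.
    return "src" in tag or "href" in tag

# Buggy order: the catch-all comes first, so the specific clauses never run.
BUGGY = [(has_url, "data-uri"),
         (is_script, "inline-script"),
         (is_stylesheet, "inline-style")]

# Fixed order: specific clauses first, catch-all last.
FIXED = [(is_script, "inline-script"),
         (is_stylesheet, "inline-style"),
         (has_url, "data-uri")]

def convert(tag, clauses):
    """Return the action of the first clause whose predicate matches."""
    for matches, action in clauses:
        if matches(tag):
            return action
    return "unchanged"

script = {"name": "script", "src": "app.js"}
print(convert(script, BUGGY))  # the catch-all wins
print(convert(script, FIXED))  # the script clause wins
```

As with Haskell pattern matching, the first matching clause wins, so a catch-all placed early shadows everything after it.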
2021-10-20 | Markdown reader: don't parse links or bracketed spans as citations. | John MacFarlane | 1 | -2/+4
Previously pandoc would parse `[link to (@a)](url)` as a citation; similarly `[(@a)]{#ident}`. This is undesirable. One should be able to use example references in citations, and even if `@a` is not defined as an example reference, `[@a](url)` should be a link containing an author-in-text citation rather than a normal citation followed by literal `(url)`. Closes #7632.
2021-10-19 | FormatHeuristics: remove `.tei.xml` extension for TEI. | John MacFarlane | 1 | -1/+0
As noted in #7630, this never worked, because `takeExtension` only returns `.xml`. So it won't be missed if we remove it. Closes #7630.
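Python's `os.path.splitext` has the same last-extension-only behaviour as Haskell's `takeExtension`, so the problem can be reproduced in a few lines. The lookup table and the function name here are illustrative assumptions, not pandoc's actual format table.

```python
import os.path

# Illustrative table: the ".tei.xml" key can never be reached, because
# splitting a path only ever yields its final extension.
EXTENSIONS = {".xml": "docbook", ".tei.xml": "tei"}

def format_from_path(path, table=EXTENSIONS):
    """Guess a format from the last extension of the path."""
    ext = os.path.splitext(path)[1].lower()
    return table.get(ext)

print(os.path.splitext("book.tei.xml")[1])  # the last extension only
print(format_from_path("book.tei.xml"))     # the ".tei.xml" entry is never hit
```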
2021-10-18 | Docx reader: fix handling of empty fields | Milan Bracke | 1 | -0/+4
Some fields only have an instrText and no content. Pandoc didn't understand these, which caused other fields to be misunderstood because it seemed like a field was still open when it wasn't.
2021-10-18 | Docx parser: implement PAGEREF fields | Milan Bracke | 2 | -0/+26
These fields, often used in tables of contents, can be a hyperlink.
2021-10-18 | Docx reader: fix handling of nested fields | Milan Bracke | 2 | -115/+150
Fields delimited by fldChar elements can contain other fields. Before, the nested fields would be ignored, except for the end, which would be considered the end of the parent field. To fix this issue, fields needed to be considered containing ParParts instead of Runs, since a Run can't represent complex enough structures. This also impacted Hyperlinks since they can originate from a field.
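The nesting problem can be pictured with a small stack-based parser. This Python sketch is not the Docx reader's implementation: the flat token list, with the strings "begin" and "end" standing in for fldChar markers, is a simplifying assumption.

```python
def parse_fields(tokens):
    """Group a flat token stream into nested lists, one list per field.

    "begin" opens a field and "end" closes the innermost open field;
    every other token is content of whichever field is currently open.
    """
    root = []
    stack = [root]
    for tok in tokens:
        if tok == "begin":
            field = []
            stack[-1].append(field)  # the new field nests inside the open one
            stack.append(field)
        elif tok == "end":
            if len(stack) == 1:
                raise ValueError("field end without a matching begin")
            stack.pop()              # closes only the innermost field
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unclosed field")
    return root

print(parse_fields(["a", "begin", "b", "begin", "c", "end", "d", "end", "e"]))
```

Without the stack, the first "end" would be taken as the end of the outer field, which is the behaviour the commit describes as the bug.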
2021-10-17 | pptx: Line up continuation paragraphs | Emily Bourke | 2 | -10/+93
This commit changes the `marL` and `indent` values used for plain paragraphs and numbered lists, and changes the spacing defined in the reference doc master for bulleted lists.

For paragraphs, there is now a left-indent taken from the `otherStyle` in the master. For numbered lists, the number is positioned where the text would be if this were a plain paragraph, and the text is indented to the next level. This means that continuation paragraphs line up nicely with numbered lists.

It also /mostly/ matches the observed PowerPoint behaviour when inserting paragraphs and numbered lists: the only difference is that PowerPoint was using a different margin value for the first level numbered lists – I’ve changed this to match the other levels, as I don’t think it makes the spacing unappealing and it allows continuation paragraphs at any level to line up.

With bulleted lists, I’m keeping the observed PowerPoint behaviour of specifying only a level, letting `marL` and `indent` be automatically taken from `bodyStyle`. To that end, this commit changes the `bodyStyle` spacing in the master of the default reference doc, to:

- line up the text of the first paragraph in each bullet with any continuation paragraphs
- line up nested bullet markers in any continuation paragraphs with the first paragraph, matching lists and plain paragraphs

This does mean the continuation paragraphs still won’t line up for anyone using their own reference doc where they haven’t matched the `otherStyle` and `bodyStyle` indent levels, but I think people in that situation will be able to troubleshoot.
2021-10-17 | pptx: Remove outdated comment | Emily Bourke | 1 | -3/+0
I recently removed the field this comment refers to, but missed the comment.
2021-10-17 | pptx: Fix list level numbering | Emily Bourke | 1 | -14/+17
In PowerPoint, the content of a top-level list is at the same level as the content of a top-level paragraph – the only difference is that a list style has been applied.

At the moment, the pptx writer increments the paragraph level on each list, turning what should be top-level lists into second-level lists. This commit changes that logic, only incrementing the paragraph level on continuation paragraphs of lists.

- Fixes https://github.com/jgm/pandoc/issues/4828
- Fixes https://github.com/jgm/pandoc/issues/4663
2021-10-14 | asciidoc writer: translate numberLines attribute to linenums switch | Samuel Tardieu | 1 | -2/+5
Asciidoctor allows requesting line numbering on code blocks by using a switch on the `source` block, such as in:
```
[source%linenums,haskell]
----
some Haskell code here
----
```
2021-10-14 | DocBook reader: honor linenumbering attribute | Samuel Tardieu | 1 | -0/+1
The DocBook `linenumbering="numbered"` attribute on code blocks maps to "numberLines" internally.
2021-10-14 | Remove redundant $ | Samuel Tardieu | 1 | -1/+1
Found by hlint 3.3.1
2021-10-13 | Fix markdown parsing bug for math in bracketed spans and links. | John MacFarlane | 1 | -0/+1
This affects math with unbalanced brackets (e.g. `$(0,1]$`) inside links, images, bracketed spans. Closes #7623.
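Why unbalanced brackets in math matter can be seen in a toy bracket matcher. This Python sketch is not the Markdown reader's parser: it assumes math is delimited only by single `$` characters and ignores escapes, and the function name is made up for the example.

```python
def find_link_text_end(s):
    """Return the index of the ']' closing the '[' at s[0], or -1.

    Brackets that appear inside $...$ math are skipped, so an
    unbalanced ']' in math does not end the link text early.
    """
    depth = 0
    in_math = False
    for i, ch in enumerate(s):
        if ch == "$":
            in_math = not in_math
        elif in_math:
            continue  # ignore everything inside math, including brackets
        elif ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                return i
    return -1

s = "[see $(0,1]$ here](url)"
end = find_link_text_end(s)
print(s[:end + 1])
```

A matcher that did not track math would stop at the `]` inside `$(0,1]$` and cut the link text short.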
2021-10-12 | Revert "Depend on pandoc-types 1.23, remove Null constructor on Block." | John MacFarlane | 30 | -1/+39
This reverts commit fb0d6c7cb63a791fa72becf21ed493282e65ea91.
2021-10-11 | T.P.Writers.Shared: remove 'breakable'... | John MacFarlane | 1 | -18/+0
which was introduced in the cherry-pick'd commit that added splitSentences, but isn't needed here. (It is for the nospace branch.)
2021-10-11 | T.P.Writers.Shared: Export splitSentences as a Doc Text transform. | John MacFarlane | 3 | -16/+61
[API change] Use this in man/ms.
2021-10-11 | Remove splitSentences from T.P.Shared [API change]. | John MacFarlane | 3 | -34/+4
We used to attempt automatic sentence splitting in man and ms output, since sentence-ending periods need to be followed by two spaces or a newline in these formats. But it's difficult to do this reliably at the level of `[Inline]`.
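The idea of working on rendered text rather than on `[Inline]` can be sketched roughly in Python. This is not pandoc's `splitSentences`: the function below and its capital-letter heuristic are assumptions made for illustration, and the heuristic will misjudge some abbreviations.

```python
import re

def split_sentences(text):
    """Put each sentence on its own line.

    A break is inserted where sentence-ending punctuation is followed
    by spaces and then an uppercase letter (a deliberately naive test).
    """
    return re.sub(r"(?<=[.!?]) +(?=[A-Z])", "\n", text)

print(split_sentences("It works. Next one. e.g. this stays"))
```

Ending each sentence with a newline satisfies the man and ms requirement without having to emit two spaces after the period.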
2021-10-11 | Fix warning | John MacFarlane | 1 | -1/+1
2021-10-11 | LaTeX reader: Implement siunitx v3 commands. | John MacFarlane | 1 | -1/+5
We support `\unit`, `\qty`, `\qtyrange`, and `\qtylist` as synonyms of `\si`, `\SI`, `\SIrange`, and `\SIlist`. Closes #7614.
2021-10-10 | Avoid blockquote when parent style has more indent | Milan Bracke | 3 | -53/+66
When a paragraph has an indentation different from the parent (named) style, it used to be considered a blockquote. But this only makes sense when the paragraph has more indentation. So this commit adds a check for the indentation of the parent style.
2021-10-10 | LaTeX reader: Properly handle `\^` followed by group closing. | John MacFarlane | 1 | -3/+3
Closes #7615.
2021-10-10 | Translations: don't depend on the fact that Aeson Object is... | John MacFarlane | 1 | -3/+2
implemented internally as a HashMap. This is no longer public as of aeson 2.0.0.0.
2021-10-06 | Don't prepend `file://` to `--syntax-definition` on Windows. | John MacFarlane | 1 | -8/+2
This was a fix for a problem in skylighting, but this problem doesn't exist now that we've moved from HXT to xml-conduit. Cf. #6374.
2021-10-05 | Avoid bad wraps in markdown writer at the Doc Text level. | John MacFarlane | 1 | -22/+23
Previously we tried to do this at the Inline list level, but it makes more sense to intervene on breaking spaces at the Doc Text level.
2021-10-04 | Powerpoint writer: consolidate text runs when possible. | John MacFarlane | 2 | -4/+9
This slims down the output files by avoiding unnecessary text run elements. Updated golden tests.
2021-10-04 | Revert "Powerpoint writer: consolidate text run nodes." | John MacFarlane | 1 | -9/+1
This reverts commit 62f83aa48633af477913bde6f615fe9f8793901a. This was already being done, it seems. I misidentified the problem; it is really with `Str ""` nodes.
2021-10-04 | Powerpoint writer: consolidate text run nodes. | John MacFarlane | 1 | -1/+9
This should reduce the size of the generated files.
2021-10-01 | Depend on pandoc-types 1.23, remove Null constructor on Block. | John MacFarlane | 30 | -39/+1
2021-09-30 | epub: Add EPUB3 subject metadata (authority/term) | nuew | 1 | -10/+31
This adds the ability to specify EPUB 3 `authority` and `term` specific refinements to the `subject` tag. Specifying a plain `subject` tag in metadata will function as before.
2021-09-30 | Add `footnotes` to default `gfm` extensions. | John MacFarlane | 1 | -0/+1
Now that `gfm` supports footnotes. https://github.blog/changelog/2021-09-30-footnotes-now-supported-in-markdown-fields/
2021-09-30 | Docx reader: Add placeholder for word diagram | Ezwal | 2 | -0/+17
2021-09-29 | EPUB writer: treat epub:type "frontispiece" as front matter. | John MacFarlane | 1 | -1/+1
This allows you to include a frontispiece using
```
![](yourimage.jpg) etc.
```
Closes #7600.
2021-09-28 | Switch from pretty-simple to pretty-show for native output. | John MacFarlane | 1 | -12/+8
Update tests. Reason: it turns out that the native output generated by pretty-simple isn't always readable by the native reader. According to https://github.com/cdepillabout/pretty-simple/issues/99 it is not a design goal of the library that the rendered values be readable using 'read'. This makes it unsuitable for our purposes. pretty-show is a bit slower and it uses 4-space indents (non-configurable), but it doesn't have this serious drawback.
2021-09-27 | Better implementation of splitStrWhen | John MacFarlane | 1 | -12/+7
2021-09-26 | RST writer: properly handle anchors to ids... | John MacFarlane | 1 | -1/+6
with spaces or leading underscore. In these cases we need the quoted form, e.g.
```
.. _`foo bar`:
.. _`_foo`:
```
Side note: rST will "normalize" these identifiers anyway, ignoring the underscore: https://docutils.sourceforge.io/docs/ref/rst/directives.html#identifier-normalization

Closes #7593.
2021-09-23 | BibTeX parser: fix expansion of special strings in series... | John MacFarlane | 1 | -7/+10
e.g. `newseries` or `library`. Expansion should not happen when these strings are protected in braces, or when they're capitalized. Closes #7591.
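The two exceptions can be expressed as a small lookup rule. This Python sketch is not the BibTeX parser's code; the function name is hypothetical and the expansion strings are placeholders rather than pandoc's actual localized terms.

```python
# Placeholder expansions, keyed by the exact lowercase special string.
SPECIAL_SERIES = {"newseries": "new series", "library": "library"}

def expand_series(raw):
    """Expand a special series string unless it is protected or capitalized."""
    if raw.startswith("{") and raw.endswith("}"):
        # Protected in braces: drop the braces and keep the text literally.
        return raw[1:-1]
    # The lookup is case-sensitive, so "Library" or "Newseries" pass through.
    return SPECIAL_SERIES.get(raw, raw)

print(expand_series("newseries"))
print(expand_series("{newseries}"))
print(expand_series("Library"))
```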
2021-09-23 | HTML reader: handle empty tbody element in table. | John MacFarlane | 1 | -5/+8
Closes #7589.
2021-09-23 | HTML writer: render `\ref` and `\eqref` as inline math... | John MacFarlane | 1 | -8/+11
not display. See #7589.
2021-09-22 | HTML writer: pass through `\ref` and `\eqref`... | John MacFarlane | 1 | -2/+10
if MathJax is used. Closes #7587.
2021-09-22 | HTML writer: pass through inline math environments with KaTeX. | John MacFarlane | 1 | -0/+1
2021-09-21 | Use pretty-simple to format native output. | John MacFarlane | 1 | -73/+15
Previously we used our own homespun formatting. But this produces over-long lines that aren't ideal for diffs in tests. Easier to use something off-the-shelf and standard. Closes #7580. Performance is slower by about a factor of 10, but this isn't really a problem because native isn't suitable as a serialization format. (For serialization you should use json, because the reader is so much faster than native.)
2021-09-19 | LaTeX reader: Recognize that `\vadjust` sometimes takes "pre". | John MacFarlane | 1 | -0/+7
Closes #7531.
2021-09-19 | Ignore (and gobble parameters of) CSLReferences environment. | John MacFarlane | 1 | -0/+1
Otherwise we get the parameters as numbers in the output. Closes #7531.
2021-09-19 | Use babel, not polyglossia, with xelatex. | John MacFarlane | 3 | -102/+13
Previously polyglossia worked better with xelatex, but that is no longer the case, so we simplify the code so that babel is used with all latex engines. This involves a change to the default LaTeX template.
2021-09-19 | Markdown writer: use `underline` class rather than `ul` for underline. | John MacFarlane | 1 | -1/+1
This only affects output with bracketed_spans enabled. The markdown reader parses spans with either `.ul` or `.underline` as Underline elements, but we're moving towards preferring the latter.