path: root/src/Text/Pandoc
Age | Commit message | Author | Files | Lines
2021-10-28 | Lua: increase strictness when getting attribute keys | Albert Krewinkel | 1 | -2/+2
2021-10-27 | Lua: re-add `t` and `tag` property to Attr values | Albert Krewinkel | 1 | -0/+4
Removal of these properties from Attr values was a regression.
2021-10-27 | Markdown writer: Be sure to quote special values in YAML metadata. | John MacFarlane | 1 | -3/+13
E.g. "Y", "yes", which are now (with yaml library) considered boolean values, as well as "null". This fixes a bug with roundtripping markdown -> markdown:
```
---
foo: "true"
...
```
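A minimal Python sketch of the quoting rule described in this commit. The set of special strings is an assumption assembled from the values this log mentions (`Y`, `yes`, `true`, `null`, and similar); it is not the list used in pandoc's Haskell source, and `yaml_scalar` is a hypothetical name.

```python
# Illustrative only: strings that a YAML 1.1 style parser may read as
# booleans or null. This set is an assumption, not pandoc's actual list.
SPECIAL = {"y", "n", "yes", "no", "on", "off", "true", "false", "null"}


def yaml_scalar(value: str) -> str:
    """Return value as a YAML scalar, double-quoting it if it would
    otherwise be parsed as a boolean or null."""
    if value.lower() in SPECIAL:
        return '"' + value + '"'
    return value


print("foo: " + yaml_scalar("true"))   # foo: "true"
print("bar: " + yaml_scalar("hello"))  # bar: hello
```

Without the quoting step, a string value such as `true` would come back as a boolean on the next read, which is the roundtrip failure the commit message describes.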
2021-10-27 | Change JSON encodings of some types. | John MacFarlane | 3 | -44/+56
- For LineEnding use lowercase constructors, e.g. `crlf`, `native`. This was the original intent, but there was a bug in the implementation.
- For HTMLSlideVariant use lowercase constructors.
- For ReaderOptions use e.g. `default-image-extension` instead of `readerDefaultImageExtension` for field names.
- For Extension, use e.g. `tex_math_dollars` instead of `Ext_tex_math_dollars` as constructor.
- For Extensions, use an array of Extensions, instead of an object wrapping the tag `Extensions` and an integer. (The representation is not supposed to be part of the public API.)
- For Opt, use field names like `tab-stop` instead of `optTabStop`.
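The renaming pattern in this list can be sketched in Python as follows. This only illustrates the mapping between the old and new names quoted above; the helper names `kebab` and `extension_name` are hypothetical, and pandoc's own encoders are written in Haskell and may be implemented quite differently.

```python
import re


def kebab(name: str, prefix: str) -> str:
    """Drop a record-field prefix and convert camelCase to kebab-case."""
    if name.startswith(prefix):
        name = name[len(prefix):]
    # Insert a hyphen before every capital letter except the first one.
    return re.sub(r"(?<!^)(?=[A-Z])", "-", name).lower()


def extension_name(constructor: str) -> str:
    """Drop the `Ext_` prefix from an Extension constructor name."""
    prefix = "Ext_"
    if constructor.startswith(prefix):
        return constructor[len(prefix):]
    return constructor


print(kebab("readerDefaultImageExtension", "reader"))  # default-image-extension
print(kebab("optTabStop", "opt"))                      # tab-stop
print(extension_name("Ext_tex_math_dollars"))          # tex_math_dollars
```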
2021-10-27 | Switch back from HsYAML to yaml. | John MacFarlane | 9 | -419/+334
Reasons:
- Performance: HsYAML is around 20 times slower in parsing large YAML bibliographies (#6084).
- An issue was submitted to HsYAML, but it hasn't gotten any attention. HsYAML seems borderline unmaintained; it hasn't had a commit in over a year.
- Unfortunately this goes back on our attempts to free ourselves from C dependencies (#4535). But I don't see a better alternative until a better pure Haskell parser is available.
Closes #6084.
Notes:
- We've removed the FromYAML instances for all types that had them, since this is a HsYAML-specific typeclass [API change]. (The yaml package just uses From/ToJSON.)
- Unlike HsYAML (in the configuration we were using), yaml parses 'Y', 'N', 'Yes', 'No', 'On', 'Off' as boolean values. Users may need to quote these when they are meant to be interpreted as strings. Similarly, 'null' is parsed as a YAML null value (and will be treated as an empty string by pandoc rather than the string 'null'). Quoting it will force it to be interpreted as a string.
- Some tests had to be adjusted accordingly.
- Pandoc now behaves better when the YAML metadata contains escaping errors: instead of just falling back on treating the section as a table, it raises a YAML parsing error.
2021-10-27 | Lua: fix `pandoc.utils.stringify` regression | Albert Krewinkel | 1 | -1/+1
The `pandoc.utils.stringify` function returned empty strings when called with a string argument.
2021-10-26 | Fix a copy/paste bug in Lua marshalling code. | John MacFarlane | 1 | -1/+1
This bug caused changes to link properties in Lua filters to turn the links into images! Closes #7639.
2021-10-26 | Lua: marshal SimpleTable values as userdata objects | Albert Krewinkel | 3 | -46/+58
2021-10-26 | Lua: generate constants in module pandoc programmatically | Albert Krewinkel | 1 | -0/+17
2021-10-26 | Lua: marshal ListAttributes values as userdata objects | Albert Krewinkel | 6 | -15/+80
2021-10-26 | Lua: marshal Block values as userdata objects | Albert Krewinkel | 4 | -146/+461
Properties of Block values are marshalled lazily, which generally improves performance considerably. Script users may also notice the following differences:
- Block element properties can no longer be accessed by numerical indexing of the `.c` field. The `.c` property now serves as an alias for `.content`, so some filters that used this undocumented method for property access may continue to work, while others will need to be updated to use proper property names.
- The marshalled Block elements now have a `show` method, and a `__tostring` metamethod. Both return the Haskell string representation of the element.
- Block values now have the Lua type `userdata` instead of `table`.
2021-10-25 | Lua: marshal Citation values as userdata objects | Albert Krewinkel | 3 | -16/+53
2021-10-23 | Lua: convert IOErrors to PandocErrors in pandoc.pipe function | Albert Krewinkel | 1 | -0/+2
Fixes: #7523
2021-10-22 | Org reader: allow an initial :PROPERTIES: drawer to add to metadata. | John MacFarlane | 1 | -2/+10
Closes #7520.
2021-10-22 | Use simpleFigure in Readers. | Aner Lucero | 25 | -110/+93
2021-10-22 | Lua: marshal Version values as userdata | Albert Krewinkel | 6 | -125/+12
2021-10-22 | Lua: marshal Inline elements as userdata | Albert Krewinkel | 2 | -63/+345
This includes the following user-facing changes:
- Deprecated inline constructors are removed. These are `DoubleQuoted`, `SingleQuoted`, `DisplayMath`, and `InlineMath`.
- Attr values are no longer normalized when assigned to an Inline element property.
- It's no longer possible to access parts of Inline elements via numerical indexes. E.g., `pandoc.Span('test')[2]` used to give `pandoc.Str 'test'`, but yields `nil` now. This was undocumented behavior not intended to be used in user scripts. Use named properties instead.
- Accessing `.c` to get a JSON-like tuple of all components no longer works. This was undocumented behavior.
- Only known properties can be set on an element value. Trying to set a different property will now raise an error.
2021-10-22 | Lua: marshal Attr values as userdata | Albert Krewinkel | 4 | -14/+233
- Adds a new `pandoc.AttributeList()` constructor, which creates the associative attribute list that is used as the third component of `Attr` values. Values of this type can often be passed to constructors instead of `Attr` values.
- `AttributeList` values can no longer be indexed numerically.
2021-10-22 | Lua: marshal Pandoc values as userdata | Albert Krewinkel | 2 | -11/+36
2021-10-22 | Switch to hslua-2.0 | Albert Krewinkel | 24 | -1187/+1095
The new HsLua version takes a somewhat different approach to marshalling and unmarshalling, relying less on typeclasses and more on specialized types. This allows for better performance and improved error messages. Furthermore, new abstractions make it possible to document the code and the exposed functions.
2021-10-21 | Move splitStrWhen to T.P.Citeproc.Util. | John MacFarlane | 3 | -23/+15
Previously there were two copies, in BibTeX and Locator.
2021-10-21 | SelfContained: fix bug that caused everything to be made a data uri. | John MacFarlane | 1 | -12/+12
All the code we needed to put most styles and scripts into inline style and script tags was there, but because of the order of pattern matching, it was never being called. Putting the catch-all clause at the end fixes the bug. Closes #7635, closes #7367. See also #3423.
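The effect of clause order can be shown with a small Python sketch. The tag names and result strings here are hypothetical stand-ins, and the real code uses Haskell pattern matching; the sketch only demonstrates why a catch-all clause placed first makes the later, more specific clauses unreachable.

```python
def first_match(clauses, tag):
    """Return the result of the first clause whose predicate accepts tag,
    mimicking top-to-bottom pattern matching."""
    for matches, result in clauses:
        if matches(tag):
            return result
    raise ValueError(tag)


# Hypothetical clauses: one catch-all and two specific cases.
catch_all = (lambda tag: True, "data URI")
stylesheet = (lambda tag: tag == "stylesheet", "inline <style>")
script = (lambda tag: tag == "script", "inline <script>")

buggy = [catch_all, stylesheet, script]   # catch-all shadows the rest
fixed = [stylesheet, script, catch_all]   # catch-all tried last

print(first_match(buggy, "script"))  # data URI
print(first_match(fixed, "script"))  # inline <script>
print(first_match(fixed, "image"))   # data URI
```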
2021-10-20 | Markdown reader: don't parse links or bracketed spans as citations. | John MacFarlane | 1 | -2/+4
Previously pandoc would parse `[link to (@a)](url)` as a citation; similarly `[(@a)]{#ident}`. This is undesirable. One should be able to use example references in citations, and even if `@a` is not defined as an example reference, `[@a](url)` should be a link containing an author-in-text citation rather than a normal citation followed by literal `(url)`. Closes #7632.
2021-10-19 | FormatHeuristics: remove `.tei.xml` extension for TEI. | John MacFarlane | 1 | -1/+0
As noted in #7630, this never worked, because `takeExtension` only returns `.xml`. So it won't be missed if we remove it. Closes #7630.
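Python's `os.path.splitext` treats a double extension the same way the commit describes for `takeExtension`, so it can be used to illustrate why the entry never matched. The lookup table and the function `format_from_path` below are hypothetical and exist only for this sketch.

```python
import os.path

# Hypothetical lookup table containing the entry that could never match.
FORMATS = {".tei.xml": "tei"}


def format_from_path(path: str):
    """Look up a format by the final extension of path only."""
    _, ext = os.path.splitext(path)  # only the last extension is returned
    return FORMATS.get(ext)


print(os.path.splitext("article.tei.xml"))  # ('article.tei', '.xml')
print(format_from_path("article.tei.xml"))  # None
```

Because the key `.tei.xml` is never produced by the extension function, removing it changes no behavior.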
2021-10-18 | Docx reader: fix handling of empty fields | Milan Bracke | 1 | -0/+4
Some fields have only an instrText and no content. Pandoc didn't understand these, causing other fields to be misunderstood because it seemed like a field was still open when it wasn't.
2021-10-18 | Docx parser: implement PAGEREF fields | Milan Bracke | 2 | -0/+26
These fields, often used in tables of contents, can be a hyperlink.
2021-10-18 | Docx reader: fix handling of nested fields | Milan Bracke | 2 | -115/+150
Fields delimited by fldChar elements can contain other fields. Before, the nested fields would be ignored, except for their end marker, which would be taken as the end of the parent field. To fix this issue, fields now have to be treated as containing ParParts instead of Runs, since a Run can't represent sufficiently complex structures. This also affected Hyperlinks, since they can originate from a field.
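A stack-based parser is one common way to pair begin and end markers correctly when fields nest. The Python sketch below uses a made-up event format (`("begin",)`, `("end",)`, `("text", ...)`) rather than real fldChar elements, and it is not the docx reader's actual data model.

```python
def parse_fields(events):
    """Build a nested structure from a flat list of field events.

    Each event is ("begin",), ("end",) or ("text", str). A field is
    represented as a list whose items are strings or nested fields.
    """
    root = []
    stack = [root]
    for event in events:
        kind = event[0]
        if kind == "begin":
            field = []
            stack[-1].append(field)   # attach to the enclosing field
            stack.append(field)       # and make it the current one
        elif kind == "end":
            if len(stack) == 1:
                raise ValueError("end marker without matching begin")
            stack.pop()               # closes only the innermost field
        else:
            stack[-1].append(event[1])
    if len(stack) != 1:
        raise ValueError("unclosed field")
    return root


events = [("begin",), ("text", "outer"), ("begin",), ("text", "inner"),
          ("end",), ("text", "tail"), ("end",)]
print(parse_fields(events))  # [['outer', ['inner'], 'tail']]
```

With a flat model, the first end marker would have closed the outer field and left `tail` outside it, which is the misbehavior the commit fixes.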
2021-10-17 | pptx: Line up continuation paragraphs | Emily Bourke | 2 | -10/+93
This commit changes the `marL` and `indent` values used for plain paragraphs and numbered lists, and changes the spacing defined in the reference doc master for bulleted lists.

For paragraphs, there is now a left-indent taken from the `otherStyle` in the master. For numbered lists, the number is positioned where the text would be if this were a plain paragraph, and the text is indented to the next level. This means that continuation paragraphs line up nicely with numbered lists. It also /mostly/ matches the observed PowerPoint behaviour when inserting paragraphs and numbered lists: the only difference is that PowerPoint was using a different margin value for the first level numbered lists – I’ve changed this to match the other levels, as I don’t think it makes the spacing unappealing and it allows continuation paragraphs at any level to line up.

With bulleted lists, I’m keeping the observed PowerPoint behaviour of specifying only a level, letting `marL` and `indent` be automatically taken from `bodyStyle`. To that end, this commit changes the `bodyStyle` spacing in the master of the default reference doc, to:
- line up the text of the first paragraph in each bullet with any continuation paragraphs
- line up nested bullet markers in any continuation paragraphs with the first paragraph, matching lists and plain paragraphs

This does mean the continuation paragraphs still won’t line up for anyone using their own reference doc where they haven’t matched the `otherStyle` and `bodyStyle` indent levels, but I think people in that situation will be able to troubleshoot.
2021-10-17 | pptx: Remove outdated comment | Emily Bourke | 1 | -3/+0
I removed the field this comment refers to recently, missed the comment.
2021-10-17 | pptx: Fix list level numbering | Emily Bourke | 1 | -14/+17
In PowerPoint, the content of a top-level list is at the same level as the content of a top-level paragraph – the only difference is that a list style has been applied. At the moment, the pptx writer increments the paragraph level on each list, turning what should be top-level lists into second-level lists. This commit changes that logic, only incrementing the paragraph level on continuation paragraphs of lists.
- Fixes https://github.com/jgm/pandoc/issues/4828
- Fixes https://github.com/jgm/pandoc/issues/4663
2021-10-14 | asciidoc writer: translate numberLines attribute to linesnum switch | Samuel Tardieu | 1 | -2/+5
AsciiDoctor allows line numbering to be requested on code blocks by using a switch on the `source` block, such as in:
```
[source%linesnum,haskell]
----
some Haskell code here
----
```
2021-10-14 | DocBook reader: honor linenumbering attribute | Samuel Tardieu | 1 | -0/+1
The DocBook `linenumbering="numbered"` attribute on code blocks maps to "numberLines" internally.
2021-10-14 | Remove redundant $ | Samuel Tardieu | 1 | -1/+1
Found by hlint 3.3.1
2021-10-13 | Fix markdown parsing bug for math in bracketed spans and links. | John MacFarlane | 1 | -0/+1
This affects math with unbalanced brackets (e.g. `$(0,1]$`) inside links, images, and bracketed spans. Closes #7623.
2021-10-12 | Revert "Depend on pandoc-types 1.23, remove Null constructor on Block." | John MacFarlane | 30 | -1/+39
This reverts commit fb0d6c7cb63a791fa72becf21ed493282e65ea91.
2021-10-11 | T.P.Writers.Shared: remove 'breakable'... | John MacFarlane | 1 | -18/+0
which was introduced in the cherry-pick'd commit that added splitSentences, but isn't needed here. (It is for the nospace branch.)
2021-10-11 | T.P.Writers.Shared: Export splitSentences as a Doc Text transform. | John MacFarlane | 3 | -16/+61
[API change] Use this in man/ms.
2021-10-11 | Remove splitSentences from T.P.Shared [API change]. | John MacFarlane | 3 | -34/+4
We used to attempt automatic sentence splitting in man and ms output, since sentence-ending periods need to be followed by two spaces or a newline in these formats. But it's difficult to do this reliably at the level of `[Inline]`.
2021-10-11 | Fix warning | John MacFarlane | 1 | -1/+1
2021-10-11 | LaTeX reader: Implement siunitx v3 commands. | John MacFarlane | 1 | -1/+5
We support `\unit`, `\qty`, `\qtyrange`, and `\qtylist` as synonyms of `\si`, `\SI`, `\SIrange`, and `\SIlist`. Closes #7614.
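The synonym relationship stated above amounts to a small lookup table. A minimal Python sketch, assuming the hypothetical names `SIUNITX_V3_TO_V2` and `canonical_command`; the LaTeX reader itself is written in Haskell and may dispatch these commands differently:

```python
# Mapping from siunitx v3 command names to the older names they mirror,
# taken from the pairs listed in the commit message.
SIUNITX_V3_TO_V2 = {
    "unit": "si",
    "qty": "SI",
    "qtyrange": "SIrange",
    "qtylist": "SIlist",
}


def canonical_command(name: str) -> str:
    """Map a v3 command name to its older synonym; leave others unchanged."""
    return SIUNITX_V3_TO_V2.get(name, name)


print(canonical_command("qty"))   # SI
print(canonical_command("emph"))  # emph
```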
2021-10-10 | Avoid blockquote when parent style has more indent | Milan Bracke | 3 | -53/+66
When a paragraph has an indentation different from the parent (named) style, it used to be considered a blockquote. But this only makes sense when the paragraph has more indentation. So this commit adds a check for the indentation of the parent style.
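The change in the comparison can be sketched in Python as below. This reduces the check to two numbers for illustration; the function names and the indent values are hypothetical, and the reader's real logic considers more of the paragraph and style properties than this.

```python
def is_blockquote_old(par_indent: int, parent_indent: int) -> bool:
    """Old rule as described: any difference in indentation counted."""
    return par_indent != parent_indent


def is_blockquote_new(par_indent: int, parent_indent: int) -> bool:
    """New rule as described: only deeper indentation counts."""
    return par_indent > parent_indent


# Paragraph indented less than its parent style (values are made up).
print(is_blockquote_old(360, 720))  # True
print(is_blockquote_new(360, 720))  # False
```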
2021-10-10 | LaTeX reader: Properly handle `\^` followed by group closing. | John MacFarlane | 1 | -3/+3
Closes #7615.
2021-10-10 | Translations: don't depend on the fact that Aeson Object is... | John MacFarlane | 1 | -3/+2
implemented internally as a HashMap. This is no longer public as of aeson 2.0.0.0.
2021-10-06 | Don't prepend `file://` to `--syntax-definition` on Windows. | John MacFarlane | 1 | -8/+2
This was a fix for a problem in skylighting, but this problem doesn't exist now that we've moved from HXT to xml-conduit. Cf. #6374.
2021-10-05 | Avoid bad wraps in markdown writer at the Doc Text level. | John MacFarlane | 1 | -22/+23
Previously we tried to do this at the Inline list level, but it makes more sense to intervene on breaking spaces at the Doc Text level.
2021-10-04 | Powerpoint writer: consolidate text runs when possible. | John MacFarlane | 2 | -4/+9
This slims down the output files by avoiding unnecessary text run elements. Updated golden tests.
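One way to picture run consolidation is merging adjacent runs that carry identical formatting. The Python sketch below models a run as a hypothetical `(properties, text)` pair; it is not the writer's actual representation, and the conditions under which the writer merges runs are not stated in this log.

```python
from itertools import groupby


def consolidate(runs):
    """Merge adjacent (props, text) runs that share the same props."""
    merged = []
    for props, group in groupby(runs, key=lambda run: run[0]):
        merged.append((props, "".join(text for _, text in group)))
    return merged


runs = [("bold", "Hel"), ("bold", "lo"), ("plain", " world")]
print(consolidate(runs))  # [('bold', 'Hello'), ('plain', ' world')]
```

Fewer run elements in the output is what makes the generated file smaller.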
2021-10-04 | Revert "Powerpoint writer: consolidate text run nodes." | John MacFarlane | 1 | -9/+1
This reverts commit 62f83aa48633af477913bde6f615fe9f8793901a. This was already being done, it seems. I misidentified the problem; it is really with `Str ""` nodes.
2021-10-04 | Powerpoint writer: consolidate text run nodes. | John MacFarlane | 1 | -1/+9
This should reduce the size of the generated files.
2021-10-01 | Depend on pandoc-types 1.23, remove Null constructor on Block. | John MacFarlane | 30 | -39/+1
2021-09-30 | epub: Add EPUB3 subject metadata (authority/term) | nuew | 1 | -10/+31
This adds the ability to specify EPUB 3 `authority` and `term` specific refinements to the `subject` tag. Specifying a plain `subject` tag in metadata will function as before.