path: root/src/Text/Pandoc
Age  Commit message  Author  Files  Lines
2022-01-16LaTeX: parse thebibliography (HEAD, master)Igor Pashev2-0/+40
2021-12-29LaTeX->HTML: Automatically generate the TOCIgor Pashev1-3/+14
2021-12-29Merge https://github.com/jgm/pandocIgor Pashev130-4498/+7194
2021-12-28Use `splitDirectories` instead of `splitPath`.John MacFarlane2-2/+2
We were using `splitPath` in two places in the code where `splitDirectories` should have been used. This caused a test for `..` in paths in `extractMedia` to fail, so that images with `..` in the path name could be extracted outside the directory specified by `extractMedia`. It also caused a test for `media` in resource paths to fail in the docx reader.
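The difference between the two functions can be sketched as follows. This is a Python model, not pandoc's Haskell code: `split_path`, `split_directories`, and `has_parent_component` are hypothetical stand-ins that imitate how the `filepath` library's `splitPath` keeps trailing separators while `splitDirectories` drops them, and they only cover relative POSIX-style paths.

```python
import re

def split_path(path):
    """Imitate Haskell's splitPath: each component keeps its trailing '/'."""
    return re.findall(r"[^/]+/*|/+", path)

def split_directories(path):
    """Imitate Haskell's splitDirectories: components without separators."""
    return [part for part in path.split("/") if part]

def has_parent_component(splitter, path):
    """True if the chosen splitter exposes a literal '..' component."""
    return ".." in splitter(path)

path = "media/../secret.png"
# With split_path the component is "../", so a test for ".." misses it.
print(split_path(path), has_parent_component(split_path, path))
# With split_directories the component is exactly "..", so the test fires.
print(split_directories(path), has_parent_component(split_directories, path))
```

A check written against the output of the first function silently passes paths that the second one would reject, which matches the failure mode described above.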
2021-12-28OpenDocument writer: fix vertical align bug with display math.John MacFarlane1-1/+1
Previously some displayed formulas would be floated above a preceding text line. This is fixed by setting vertical-rel to 'text' rather than 'paragraph-content'. Closes #7777.
2021-12-25Lua: improve handling of empty caption, body by `from_simple_table`Albert Krewinkel1-2/+2
Create truly empty table caption and body when these are empty in the simple table. Fixes: #7776
2021-12-22RTF writer: properly handle images in data URIs.John MacFarlane1-2/+3
See #7771.
2021-12-22HTML writer: make line breaks more consistent.John MacFarlane1-60/+59
- With `--wrap=none`, we now output line breaks between block-level elements. Previously they were omitted entirely, so the whole document was on one line, unless there were literal line breaks in pre sections. This makes the HTML writer's behavior more consistent with that of other writers.
- Put newline after `<dd>`.
- Put newlines after block-level elements in footnote section.
2021-12-22Add text wrapping to HTML output.John MacFarlane2-49/+197
Previously the HTML writer was exceptional in not being sensitive to the `--wrap` option. With this change `--wrap` now works for HTML. The default (as with other formats) is automatic wrapping to 72 columns. A new internal module, T.P.Writers.Blaze, exports `layoutMarkup`. This converts a blaze Html structure into a doclayout Doc Text. In addition, we now add a line break between an `img` tag and the associated `figcaption`. Note: Output is never wrapped in `writeHtmlStringForEPUB`. This accords with previous behavior since previously the HTML writer was insensitive to `--wrap` settings. There's no real need to wrap HTML inside a zipped container. Note that the contents of script, textarea, and pre tags are always laid out with the `flush` combinator, so that unwanted spaces won't be introduced if these occur in an indented context in a template. Closes #7764.
2021-12-21Lua: simplify code of pandoc.utils.stringifyAlbert Krewinkel1-40/+22
Minor behavior change: plain strings nested in tables are now included in the result string.
2021-12-21Lua: simplify and deprecate function `pandoc.utils.equals`Albert Krewinkel1-3/+3
The function is no longer required for element comparisons; it is now an alias for the `==` operator.
2021-12-21Lua: add new library function `pandoc.utils.type`.Albert Krewinkel1-0/+12
The function behaves like the default `type` function from Lua's standard library, but is aware of pandoc userdata types. A typical use-case would be to determine the type of a metadata value.
2021-12-21Lua: fix return types of `blocks_to_inlines`, `make_sections`Albert Krewinkel1-2/+2
Ensures the returned lists have the correct type (`Inlines` and `Blocks`, respectively).
2021-12-20Lua: use more natural representation for Reference valuesAlbert Krewinkel1-6/+12
Omit `false` boolean values, push integers as numbers.
2021-12-19Custom writer: assign default Pandoc object to global PANDOC_DOCUMENTAlbert Krewinkel1-9/+3
The default Pandoc object is now non-strict, i.e., only the parts of the document that are accessed will be marshaled to Lua. A special type is no longer necessary. This change also makes it possible to use the global variable with library functions such as `pandoc.utils.references`, or to inspect the document contents with `walk()`.
2021-12-19Add a writer for Markua 0.10 (#7729)binaarinen7-70/+217
Markua is a markdown variant used by Leanpub. More information about Markua can be found at https://leanpub.com/markua/read. Adds a new exported function `writeMarkua` from T.P.Writers.Markdown. [API change] Closes #1871. Co-authored by Tim Wisotzki and Samuel Lemmenmeier.
2021-12-19JATS writer: keep quotes in element-citationsAlbert Krewinkel1-5/+5
The JATS writer was losing quotes in element-citations, as it uses the `T.P.Citeproc.getReferences` function to get references. That function replaces `Quoted` elements with spans. That transformation is required in `T.P.Citeproc.processCitations`, so it has been moved there.
2021-12-19Lua: fixup, should have been part of previous commitAlbert Krewinkel1-0/+3
2021-12-18Citeproc: avoid adding comma before an author-in-text citation...John MacFarlane1-8/+14
...in a note if it begins with a title (no author). Closes #7761.
2021-12-17Lua: add function `pandoc.utils.references`Albert Krewinkel2-0/+115
List with all cited references of a document. Closes: #7752
2021-12-15T.P.Citeproc: do not export getStyle, getCiteprocLang.John MacFarlane1-2/+0
This commit undoes the API changes noted in ea77f2e6f653d5b570109fa208dc427d99f95b51. They are no longer needed, and we should avoid unnecessary API changes.
2021-12-14Org writer: use the citation locator list from the org source code...John MacFarlane1-13/+61
which is not localized, instead of getting locators from the localized CSL stylesheet as we did before.
2021-12-14Org reader: parse official org-cite citations.John MacFarlane1-39/+160
We also support the older org-ref style as a fallback. We no longer support the "markdown-style" citations. See #7329.
2021-12-14Markdown writer: avoid extra space before citation suffix...John MacFarlane1-2/+4
if it already starts with a space.
2021-12-14Markdown writer: ensure semicolon btw locator and next citation...John MacFarlane1-1/+5
when an author-in-text citation has a locator and following citations.
2021-12-14Org reader: remove support for "Berkeley style" citations.John MacFarlane1-145/+42
See #7329.
2021-12-13Org writer: add tests for org-cite citations, and improve support.John MacFarlane1-4/+28
2021-12-13Markdown reader: fix parsing of "bare locators"...John MacFarlane1-1/+1
...after author-in-text citations. Previously `@item [p. 12; @item2]` was incorrectly parsed as three citations rather than two. This is now fixed by ensuring that `prefix` doesn't gobble any semicolons.
2021-12-13Citeproc changes:John MacFarlane2-41/+53
T.P.Citeproc exports `getCiteprocLang` and `getStyle` [API change]. T.P.Citeproc.Locator now exports `toLocatorMap`, `LocatorInfo`, and `LocatorMap`. The type of `parseLocator` has changed, so it now takes a `LocatorMap` rather than a `Locale` as parameter, and returns a `LocatorInfo` instead of a tuple.
2021-12-12Org writer: preliminary support for new org-cite syntax.John MacFarlane1-1/+21
See #7329. This could use some tests.
2021-12-11fix(IpynbOutput)!: rank always favors output formatKolen Cheung1-7/+7
Previously, both the `fmt == f` case and Image had a rank of 1. As a result, e.g. in an ipynb to html conversion, if both html and image outputs existed, the image was actually preferred. This commit changes this, so that `fmt == f` always has the highest rank and ranks never collide. This is achieved by keeping rank 1 for the `fmt == f` case and increasing every other rank by 1.
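The ranking rule can be sketched roughly as below. This is a Python model under stated assumptions, not the Haskell code of the ipynb reader: the MIME types and the numbers in `OLD_FALLBACK_RANK` are hypothetical, and only the rule itself (exact format match gets rank 1, every other rank is shifted up by 1) comes from the commit message.

```python
# Hypothetical pre-change ranks for outputs that do not match the target format.
OLD_FALLBACK_RANK = {"image/png": 1, "text/html": 2, "text/plain": 3}

def rank(mime, target):
    """Lower is better; an exact match with the target always wins."""
    if mime == target:
        return 1
    # Every non-matching rank is increased by 1, so nothing can tie with a match.
    return OLD_FALLBACK_RANK.get(mime, 4) + 1

def pick_output(bundle, target):
    """Choose the MIME type of the best-ranked entry in an output bundle."""
    return min(bundle, key=lambda mime: rank(mime, target))

bundle = {"image/png": "<binary>", "text/html": "<b>hi</b>", "text/plain": "hi"}
print(pick_output(bundle, "text/html"))   # the html output now beats the image
print(pick_output(bundle, "text/latex"))  # no match, so the image is the fallback
```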
2021-12-11Custom reader: ensure old Readers continue to workAlbert Krewinkel2-16/+48
Retry conversion by passing a string instead of sources when the `Reader` fails with a message that hints at an outdated function. A deprecation notice is reported in that case.
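The retry logic might look roughly like this. It is a Python model only: the real code is Haskell calling into Lua, and the `Sources` class, the `looks_outdated` heuristic, and the warning text are all hypothetical. In particular, pandoc inspects the error message, while this sketch inspects the exception type.

```python
import warnings

class Sources(list):
    """Hypothetical stand-in: (name, text) pairs that stringify to the joined text."""
    def __str__(self):
        return "".join(text for _name, text in self)

def looks_outdated(error):
    # Assumption: a string-based reader fails with an attribute or type error
    # when it is handed a Sources list instead of a string.
    return isinstance(error, (AttributeError, TypeError))

def run_reader(reader, sources):
    """Call the reader with sources; fall back to a plain string if it looks old."""
    try:
        return reader(sources)
    except Exception as error:
        if not looks_outdated(error):
            raise
        warnings.warn("Reader expects a string; this is deprecated", DeprecationWarning)
        return reader(str(sources))

def old_reader(text):
    return text.upper()  # fails on a list, works on a string

print(run_reader(old_reader, Sources([("a.txt", "a"), ("b.txt", "b")])))
```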
2021-12-11Custom reader: pass list of sources instead of concatenated textAlbert Krewinkel3-6/+55
The first argument passed to Lua `Reader` functions is no longer a plain string but a richer data structure. The structure can easily be converted to a string by applying `tostring`, but is also a list whose elements each hold the *text* and *name* of one input source as properties of those names. A small example is added to the custom reader documentation, showcasing its use in a reader that creates a syntax-highlighted code block for each source code file passed as input. Existing readers must be updated.
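The shape of the new argument can be modeled as below. This is a Python analogy, not the Lua API: `Source`, `Sources`, and `reader` are hypothetical names, and the assumption that stringifying concatenates the source texts is inferred from the commit title rather than confirmed here. The dict returned per source merely stands in for a pandoc code block.

```python
from dataclasses import dataclass

@dataclass
class Source:
    name: str  # file name of the input
    text: str  # contents of that input

class Sources(list):
    """A list of Source values that can also be used as one string."""
    def __str__(self):
        # Assumption: the string form is the concatenation of all texts.
        return "".join(source.text for source in self)

def reader(sources):
    """Build one code-block-like record per input file, classed by extension."""
    return [
        {"t": "CodeBlock",
         "classes": [source.name.rsplit(".", 1)[-1]],
         "text": source.text}
        for source in sources
    ]

sources = Sources([Source("a.lua", "print(1)\n"), Source("b.py", "x = 2\n")])
print(str(sources))
print(reader(sources))
```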
2021-12-10Switch to released pandoc-lua-marshal-0.1.2Albert Krewinkel1-0/+1
Cell values are now marshaled as userdata objects; a constructor function for table cells is provided as `pandoc.Cell`.
2021-12-09ipynb writer: handle cell output with raw block of markdown (#7563)Kolen Cheung1-0/+2
Write RawBlock of markdown in code-cell output. #7561 makes the ipynb reader read code-cell output with mime "text/markdown" into a RawBlock of markdown. This commit makes the ipynb writer write this RawBlock of markdown back inside a code-cell output with the same mime, preserving this information in a round trip. Add tests of the ability of the ipynb reader (#7561) and the ipynb writer (#7563) to handle a "text/markdown" mime type in a code-cell output.
2021-12-09Lua: update to latest pandoc-lua-marshal (0.1.1)Albert Krewinkel3-423/+26
- `walk` methods are added to `Block` and `Inline` values; the methods are similar to `pandoc.utils.walk_block` and `pandoc.utils.walk_inline`, but apply the filter also to the element itself, and therefore return a list of elements instead of a single element.
- Functions of name `Doc` are no longer accepted as alternatives for `Pandoc` filter functions. This functionality was undocumented.
2021-12-08Ipynb writer: ensure deterministic order of keys.John MacFarlane1-1/+1
2021-12-07Revert "Markdown reader: Improve inlinesInBalancedBrackets."John MacFarlane1-12/+20
This reverts commit fa83246d7de8527bbf59dfac9636a42ede185194.
2021-12-06Ipynb reader & writer: properly handle cell "id".John MacFarlane2-24/+56
This is passed through if it exists (in Nb4); otherwise the writer will add a random one so that cells all have an "id". Closes #7728.
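A minimal sketch of the pass-through-or-generate rule, in Python rather than pandoc's Haskell. The function name `ensure_cell_ids` and the id scheme (eight hex characters from a UUID) are assumptions for illustration; the commit only says that a random id is added when none exists.

```python
import uuid

def ensure_cell_ids(cells):
    """Return copies of the cells in which every cell has an "id"."""
    result = []
    for cell in cells:
        cell = dict(cell)  # copy, so the caller's cells are not modified
        if "id" not in cell:
            # Hypothetical id scheme: any random, reasonably unique string.
            cell["id"] = uuid.uuid4().hex[:8]
        result.append(cell)
    return result

cells = [{"cell_type": "code", "id": "abc123"}, {"cell_type": "markdown"}]
print(ensure_cell_ids(cells))
```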
2021-12-06Ms writer: properly encode strings for PDF contents.John MacFarlane1-2/+19
Closes #7731.
2021-12-05Commonmark writer: allow ')' delimiters on ordered lists.John MacFarlane1-1/+6
2021-12-03Improve Markdown writer escaping.John MacFarlane1-18/+19
This fixes escaping for '#' in particular. Closes #7726.
2021-11-30Markdown reader: don't allow `^` at beginning of link or image label.John MacFarlane1-2/+1
This is reserved for footnotes. Fixes a regression introduced by 0a93acf. Closes #7723.
2021-11-29Lua: remove `pandoc.utils.text` (#7720)Albert Krewinkel1-8/+0
The new `pandoc.Inlines` function behaves identically on string input, but allows other Inlines-like arguments as well. The `pandoc.utils.text` function could be written as

    function pandoc.utils.text (x)
      assert(type(x) == 'string')
      return pandoc.Inlines(x)
    end
2021-11-28Lua: add constructors `pandoc.Blocks` and `pandoc.Inlines`Albert Krewinkel1-0/+2
The functions convert their argument into a list of Block and Inline values, respectively.
2021-11-27Lua: use package pandoc-lua-marshal (#7719)Albert Krewinkel24-1729/+207
The marshaling functions for pandoc's AST are extracted into a separate package. The package comes with a number of changes:
- Pandoc's List module was rewritten in C, thereby improving error messages.
- Lists of `Block` and `Inline` elements are marshaled using the new list types `Blocks` and `Inlines`, respectively. These types currently behave identically to the generic List type, but give better error messages. This also opens up the possibility of adding element-specific methods to these lists in the future.
- Elements of type `MetaValue` are no longer pushed as values which have `.t` and `.tag` properties. This was already true for `MetaString` and `MetaBool` values, which are still marshaled as Lua strings and booleans, respectively. Affected values:
  + `MetaBlocks` values are marshaled as a `Blocks` list;
  + `MetaInlines` values are marshaled as an `Inlines` list;
  + `MetaList` values are marshaled as generic pandoc `List`s;
  + `MetaMap` values are marshaled as plain tables and no longer given any metatable.
- The test suite for marshaled objects and their constructors has been extended and improved.
- A bug in Citation objects, where setting a citation's suffix modified its prefix, has been fixed.
2021-11-24LaTeX reader: Fix semantics of `\ref`.John MacFarlane1-5/+3
We were including the ams environment type in addition to the number. This is proper behavior for `\cref` but not for `\ref`. To support `\cref` we need to store the environment label separately.
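The distinction can be illustrated with a small lookup table that stores the environment label separately from the number. This is a Python sketch under assumptions: `LabelInfo`, `LABELS`, `ref`, and `cref` are hypothetical names, and the `cref` function only shows what storing the two fields separately would make possible, not what the LaTeX reader currently does.

```python
from typing import NamedTuple

class LabelInfo(NamedTuple):
    env: str     # environment label, e.g. "Theorem"
    number: str  # the number assigned to the labeled element

# Hypothetical label table built while reading the document.
LABELS = {"thm:main": LabelInfo(env="Theorem", number="1.2")}

def ref(label):
    """\\ref semantics: the number only."""
    info = LABELS.get(label)
    return info.number if info else "??"

def cref(label):
    """\\cref-like semantics: environment label plus number."""
    info = LABELS.get(label)
    return f"{info.env} {info.number}" if info else "??"

print(ref("thm:main"))
print(cref("thm:main"))
```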
2021-11-24LaTeX reader: improve references.John MacFarlane4-5/+27
- Resolve references to theorem environments.
- Remove Span caused by "label" in figure, table, and theorem environments; this had an id that duplicated the environments' id. See #813.
2021-11-24LaTeX reader: omit visible content for `\label{...}`.John MacFarlane1-2/+1
Previously we included the text of the label in square brackets, but this is undesirable in many cases. See discussion in <https://github.com/jgm/pandoc/issues/813#issuecomment-978232426>.
2021-11-24HTML reader: parse attributes on links and images.John MacFarlane2-11/+10
Closes #6970.