path: root/src/Text
Age | Commit message | Author | Files | Lines
2021-12-22 | Add text wrapping to HTML output. | John MacFarlane | 2 | -49/+197
Previously the HTML writer was exceptional in not being sensitive to the `--wrap` option. With this change `--wrap` now works for HTML. The default (as with other formats) is automatic wrapping to 72 columns.
A new internal module, T.P.Writers.Blaze, exports `layoutMarkup`. This converts a blaze Html structure into a doclayout Doc Text.
In addition, we now add a line break between an `img` tag and the associated `figcaption`.
Note: Output is never wrapped in `writeHtmlStringForEPUB`. This accords with previous behavior since previously the HTML writer was insensitive to `--wrap` settings. There's no real need to wrap HTML inside a zipped container.
Note that the contents of script, textarea, and pre tags are always laid out with the `flush` combinator, so that unwanted spaces won't be introduced if these occur in an indented context in a template.
Closes #7764.
2021-12-21 | Lua: simplify code of pandoc.utils.stringify | Albert Krewinkel | 1 | -40/+22
Minor behavior change: plain strings nested in tables are now included in the result string.
2021-12-21 | Lua: simplify and deprecate function `pandoc.utils.equals` | Albert Krewinkel | 1 | -3/+3
The function is no longer required for element comparisons; it is now an alias for the `==` operator.
2021-12-21 | Lua: add new library function `pandoc.utils.type`. | Albert Krewinkel | 1 | -0/+12
The function behaves like the default `type` function from Lua's standard library, but is aware of pandoc userdata types. A typical use-case would be to determine the type of a metadata value.
2021-12-21 | Lua: fix return types of `blocks_to_inlines`, `make_sections` | Albert Krewinkel | 1 | -2/+2
Ensures the returned lists have the correct type (`Inlines` and `Blocks`, respectively).
2021-12-20 | Lua: use more natural representation for Reference values | Albert Krewinkel | 1 | -6/+12
Omit `false` boolean values, push integers as numbers.
2021-12-19 | Custom writer: assign default Pandoc object to global PANDOC_DOCUMENT | Albert Krewinkel | 1 | -9/+3
The default Pandoc object is now non-strict, i.e., only the parts of the document that are accessed will be marshaled to Lua. A special type is no longer necessary. This change also makes it possible to use the global variable with library functions such as `pandoc.utils.references`, or to inspect the document contents with `walk()`.
2021-12-19 | Add a writer for Markua 0.10 (#7729) | binaarinen | 7 | -70/+217
Markua is a markdown variant used by Leanpub. More information about Markua can be found at https://leanpub.com/markua/read. Adds a new exported function `writeMarkua` from T.P.Writers.Markdown. [API change] Closes #1871. Co-authored by Tim Wisotzki and Samuel Lemmenmeier.
2021-12-19 | JATS writer: keep quotes in element-citations | Albert Krewinkel | 1 | -5/+5
The JATS writer was losing quotes in element-citations, as it uses the `T.P.Citeproc.getReferences` function to get references. That function replaces `Quoted` elements with spans. That transformation is required in `T.P.Citeproc.processCitations`, so it has been moved there.
2021-12-19 | Lua: fixup, should have been part of previous commit | Albert Krewinkel | 1 | -0/+3
2021-12-18 | Citeproc: avoid adding comma before an author-in-text citation... | John MacFarlane | 1 | -8/+14
...in a note if it begins with a title (no author). Closes #7761.
2021-12-17 | Lua: add function `pandoc.utils.references` | Albert Krewinkel | 2 | -0/+115
List with all cited references of a document. Closes: #7752
2021-12-15 | T.P.Citeproc: do not export getStyle, getCiteprocLang. | John MacFarlane | 1 | -2/+0
This commit undoes the API changes noted in ea77f2e6f653d5b570109fa208dc427d99f95b51. They are no longer needed, and we should avoid unnecessary API changes.
2021-12-14 | Org writer: use the citation locator list from the org source code... | John MacFarlane | 1 | -13/+61
which is not localized, instead of getting locators from the localized CSL stylesheet as we did before.
2021-12-14 | Org reader: parse official org-cite citations. | John MacFarlane | 1 | -39/+160
We also support the older org-ref style as a fallback. We no longer support the "markdown-style" citations. See #7329.
2021-12-14 | Markdown writer: avoid extra space before citation suffix... | John MacFarlane | 1 | -2/+4
if it already starts with a space.
2021-12-14 | Markdown writer: ensure semicolon btw locator and next citation... | John MacFarlane | 1 | -1/+5
when an author-in-text citation has a locator and following citations.
2021-12-14 | Org reader: remove support for "Berkeley style" citations. | John MacFarlane | 1 | -145/+42
See #7329.
2021-12-13 | Org writer: add tests for org-cite citations, and improve support. | John MacFarlane | 1 | -4/+28
2021-12-13 | Markdown reader: fix parsing of "bare locators"... | John MacFarlane | 1 | -1/+1
...after author-in-text citations. Previously `@item [p. 12; @item2]` was incorrectly parsed as three citations rather than two. This is now fixed by ensuring that `prefix` doesn't gobble any semicolons.
2021-12-13 | Citeproc changes: | John MacFarlane | 2 | -41/+53
T.P.Citeproc exports `getCiteprocLang` and `getStyle` [API change]. T.P.Citeproc.Locator now exports `toLocatorMap`, `LocatorInfo`, and `LocatorMap`. The type of `parseLocator` has changed, so it now takes a `LocatorMap` rather than a `Locale` as parameter, and returns a `LocatorInfo` instead of a tuple.
2021-12-12 | Org writer: preliminary support for new org-cite syntax. | John MacFarlane | 1 | -1/+21
See #7329. This could use some tests.
2021-12-11 | fix(IpynbOutput)!: rank always favors output format | Kolen Cheung | 1 | -7/+7
Previously, both the `fmt == f` case and Image had a rank of 1. As a result, e.g. in an ipynb to html conversion, if both html and image outputs existed, the image was actually preferred. This commit changes this, so that `fmt == f` always has the highest rank and ranks never collide. This is achieved by keeping rank 1 for the `fmt == f` case and increasing every other rank by 1.
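The ranking rule can be modeled in a few lines. This is a minimal Python sketch, not pandoc's Haskell code: the function names `rank` and `pick_output`, the fallback MIME types, and their relative order are all assumptions made for illustration. Only the rule "target format gets rank 1, everything else starts at 2" comes from the commit message.

```python
def rank(mime, target_mime):
    """Lower rank wins; the target format always gets rank 1."""
    if mime == target_mime:
        return 1
    # Hypothetical fallback order. Every non-target rank starts at 2,
    # so nothing can tie with the target format.
    fallback = ["image/png", "image/jpeg", "text/html", "text/plain"]
    if mime in fallback:
        return 2 + fallback.index(mime)
    return 2 + len(fallback)


def pick_output(available, target_mime):
    """Choose the best-ranked MIME type among the available outputs."""
    return min(available, key=lambda mime: rank(mime, target_mime))
```

With this model, a cell offering both an image and html is rendered as html when html is the target, while the image still wins over plain text.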
2021-12-11 | Custom reader: ensure old Readers continue to work | Albert Krewinkel | 2 | -16/+48
Retry conversion by passing a string instead of sources when the `Reader` fails with a message that hints at an outdated function. A deprecation notice is reported in that case.
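The retry logic can be sketched as follows. This is a Python model under stated assumptions, not the actual implementation: `run_reader`, the `hint` string, the use of `TypeError`, and the shape of the source records are hypothetical. The commit only says that the failure message must hint at an outdated function and that a deprecation notice is reported.

```python
def run_reader(reader, sources, warn, hint="string expected"):
    """Call `reader` with the source list; fall back to a plain string.

    `hint` is a hypothetical marker for "this reader expects the old
    string argument".
    """
    try:
        return reader(sources)
    except TypeError as err:
        if hint not in str(err):
            raise  # unrelated failure: do not mask it
        warn("Reader expects a string; passing sources is now preferred.")
        # Rebuild the old-style argument by concatenating the source texts.
        text = "".join(src["text"] for src in sources)
        return reader(text)
```

Checking the message before retrying keeps genuine reader bugs visible instead of silently running the reader twice.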
2021-12-11 | Custom reader: pass list of sources instead of concatenated text | Albert Krewinkel | 3 | -6/+55
The first argument passed to Lua `Reader` functions is no longer a plain string but a richer data structure. The structure can easily be converted to a string by applying `tostring`, but it is also a list whose elements each hold the *text* and *name* of an input source as properties of those names. A small example is added to the custom reader documentation, showcasing its use in a reader that creates a syntax-highlighted code block for each source code file passed as input. Existing readers must be updated.
2021-12-10 | Switch to released pandoc-lua-marshal-0.1.2 | Albert Krewinkel | 1 | -0/+1
Cell values are now marshaled as userdata objects; a constructor function for table cells is provided as `pandoc.Cell`.
2021-12-09 | ipynb writer: handle cell output with raw block of markdown (#7563) | Kolen Cheung | 1 | -0/+2
Write RawBlock of markdown in code-cell output. #7561 makes the ipynb reader read code-cell output with mime "text/markdown" into a RawBlock of markdown. This commit makes the ipynb writer write this RawBlock of markdown back inside a code-cell output with the same mime, preserving this information in a round trip. Add tests of the ability of the ipynb reader (#7561) and ipynb writer (#7563) to handle a "text/markdown" mime type in a code-cell output.
2021-12-09 | Lua: update to latest pandoc-lua-marshal (0.1.1) | Albert Krewinkel | 3 | -423/+26
- `walk` methods are added to `Block` and `Inline` values; the methods are similar to `pandoc.utils.walk_block` and `pandoc.utils.walk_inline`, but apply the filter to the element itself as well, and therefore return a list of elements instead of a single element.
- Functions of name `Doc` are no longer accepted as alternatives for `Pandoc` filter functions. This functionality was undocumented.
2021-12-08 | Ipynb writer: ensure deterministic order of keys. | John MacFarlane | 1 | -1/+1
2021-12-07 | Revert "Markdown reader: Improve inlinesInBalancedBrackets." | John MacFarlane | 1 | -12/+20
This reverts commit fa83246d7de8527bbf59dfac9636a42ede185194.
2021-12-06 | Ipynb reader & writer: properly handle cell "id". | John MacFarlane | 2 | -24/+56
This is passed through if it exists (in Nb4); otherwise the writer will add a random one so that cells all have an "id". Closes #7728.
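The pass-through-or-generate rule can be illustrated with a short Python sketch. It is a model only: `ensure_cell_ids`, the dictionary representation of cells, and the eight-character hexadecimal id format are assumptions, not what pandoc's Haskell writer emits.

```python
import uuid


def ensure_cell_ids(cells):
    """Return copies of `cells` in which every cell has an "id"."""
    result = []
    for cell in cells:
        cell = dict(cell)  # copy, so the caller's cells stay untouched
        if "id" not in cell:
            # Hypothetical id format: first 8 hex digits of a random UUID.
            cell["id"] = uuid.uuid4().hex[:8]
        result.append(cell)
    return result
```

Existing ids survive a round trip unchanged; only cells that arrive without one receive a fresh random value.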
2021-12-06 | Ms writer: properly encode strings for PDF contents. | John MacFarlane | 1 | -2/+19
Closes #7731.
2021-12-05 | Commonmark writer: allow ')' delimiters on ordered lists. | John MacFarlane | 1 | -1/+6
2021-12-03 | Improve Markdown writer escaping. | John MacFarlane | 1 | -18/+19
This fixes escaping for '#' in particular. Closes #7726.
2021-11-30 | Markdown reader: don't allow `^` at beginning of link or image label. | John MacFarlane | 1 | -2/+1
This is reserved for footnotes. Fixes a regression introduced by 0a93acf. Closes #7723.
2021-11-29 | Lua: remove `pandoc.utils.text` (#7720) | Albert Krewinkel | 1 | -8/+0
The new `pandoc.Inlines` function behaves identically on string input, but allows other Inlines-like arguments as well. The `pandoc.utils.text` function could be written as
    function pandoc.utils.text (x)
      assert(type(x) == 'string')
      return pandoc.Inlines(x)
    end
2021-11-28 | Lua: add constructors `pandoc.Blocks` and `pandoc.Inlines` | Albert Krewinkel | 1 | -0/+2
The functions convert their argument into a list of Block and Inline values, respectively.
2021-11-27 | Lua: use package pandoc-lua-marshal (#7719) | Albert Krewinkel | 24 | -1729/+207
The marshaling functions for pandoc's AST are extracted into a separate package. The package comes with a number of changes:
- Pandoc's List module was rewritten in C, thereby improving error messages.
- Lists of `Block` and `Inline` elements are marshaled using the new list types `Blocks` and `Inlines`, respectively. These types currently behave identically to the generic List type, but give better error messages. This also opens up the possibility of adding element-specific methods to these lists in the future.
- Elements of type `MetaValue` are no longer pushed as values which have `.t` and `.tag` properties. This was already true for `MetaString` and `MetaBool` values, which are still marshaled as Lua strings and booleans, respectively. Affected values:
  + `MetaBlocks` values are marshaled as a `Blocks` list;
  + `MetaInlines` values are marshaled as an `Inlines` list;
  + `MetaList` values are marshaled as generic pandoc `List`s;
  + `MetaMap` values are marshaled as plain tables and no longer given any metatable.
- The test suite for marshaled objects and their constructors has been extended and improved.
- A bug in Citation objects, where setting a citation's suffix modified its prefix, has been fixed.
2021-11-24 | LaTeX reader: Fix semantics of `\ref`. | John MacFarlane | 1 | -5/+3
We were including the ams environment type in addition to the number. This is proper behavior for `\cref` but not for `\ref`. To support `\cref` we need to store the environment label separately.
2021-11-24 | LaTeX reader: improve references. | John MacFarlane | 4 | -5/+27
- Resolve references to theorem environments.
- Remove Span caused by "label" in figure, table, and theorem environments; this had an id that duplicated the environment's id.
See #813.
2021-11-24 | LaTeX reader: omit visible content for `\label{...}`. | John MacFarlane | 1 | -2/+1
Previously we included the text of the label in square brackets, but this is undesirable in many cases. See discussion in <https://github.com/jgm/pandoc/issues/813#issuecomment-978232426>.
2021-11-24 | HTML reader: parse attributes on links and images. | John MacFarlane | 2 | -11/+10
Closes #6970.
2021-11-24 | Lua: allow single elements as singleton MetaBlocks/MetaInlines | Albert Krewinkel | 1 | -0/+3
Single elements should always be treated as singleton lists in the Lua subsystem.
2021-11-23 | Improve detection of pipe table line widths. | John MacFarlane | 1 | -14/+18
Fixed calculation of maximum column widths in pipe tables. It is now based on the length of the markdown line, rather than a "stringified" version of the parsed line. This should be more predictable for users. In addition, we take into account double-wide characters such as emojis. Closes #7713.
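Measuring a raw line while counting double-wide characters as two columns can be sketched like this. The Python model below is an illustration, not pandoc's code: `display_width` is a hypothetical name, and using the Unicode East Asian Width property to detect wide characters is an assumption about how such a measurement might be done.

```python
import unicodedata


def display_width(line):
    """Count terminal columns, treating wide/fullwidth characters as 2."""
    width = 0
    for ch in line:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no column of their own
        wide = unicodedata.east_asian_width(ch) in ("W", "F")
        width += 2 if wide else 1
    return width
```

Because the measurement runs on the markdown line itself, markup characters such as backticks and pipes count toward the width, which is what makes the result predictable from the source text.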
2021-11-23 | Lua: add function `pandoc.utils.text` (#7710) | Albert Krewinkel | 2 | -2/+11
The function converts a string to `Inlines`, treating interword spaces as `Space`s or `SoftBreak`s. If you want a `Str` with literal spaces, use `pandoc.Str`. Closes: #7709
2021-11-23 | Lua: split strings into words when treating them as Inline list (#7712) | Albert Krewinkel | 1 | -4/+7
Using a Lua string where a list of inlines is expected will cause the string to be split into words, converting spaces and tabs into `pandoc.Space()` elements and newlines into `pandoc.SoftBreak()` elements. The previous behavior was to treat the string `s` as `{pandoc.Str(s)}`. The old behavior can be recovered by wrapping the string into a table `{s}`.
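The splitting rule can be modeled outside of Lua. The Python sketch below is a simplification under stated assumptions: `string_to_inlines` is a hypothetical name, inline elements are represented as plain tuples, and no attempt is made to collapse whitespace that surrounds a newline.

```python
import re


def string_to_inlines(s):
    """Model: runs of spaces/tabs -> Space, newline -> SoftBreak, else Str."""
    inlines = []
    for token in re.findall(r"\n|[ \t]+|[^ \t\n]+", s):
        if token == "\n":
            inlines.append(("SoftBreak",))
        elif token[0] in " \t":
            inlines.append(("Space",))
        else:
            inlines.append(("Str", token))
    return inlines
```

For comparison, the previous behavior corresponds to returning the single element `[("Str", s)]` regardless of the whitespace inside `s`.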
2021-11-22 | Add .yml to Citeproc formatFromExtension (#7706) | Jörn Krenzer | 1 | -0/+1
Make Citeproc recognize files with .yml extension (in addition to .yaml) as YAML bibliographies. Closes #7707.
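An extension lookup of this kind can be sketched as a small table. This Python model is illustrative only: `format_from_extension` and `BIB_FORMATS` are hypothetical names, and every entry other than the `.yaml`/`.yml` pair is an assumption added to make the table look realistic.

```python
import os

# Only the ".yaml"/".yml" pair comes from the commit message; the other
# entries are assumptions for illustration.
BIB_FORMATS = {
    ".yaml": "yaml",
    ".yml": "yaml",
    ".json": "json",
    ".bib": "biblatex",
}


def format_from_extension(path):
    """Return the bibliography format for `path`, or None if unknown."""
    ext = os.path.splitext(path)[1].lower()
    return BIB_FORMATS.get(ext)
```

Mapping both spellings to the same value means later code never has to distinguish between them.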
2021-11-21 | yamlBsToRefs: allow multiple YAML documents. | John MacFarlane | 1 | -2/+2
Some people use `---` as the end delimiter in YAML bibliography files, which causes the `yaml` library to emit an error unless we explicitly allow multiple YAML documents (and just consider the first). In T.P.Readers.Metadata
2021-11-20 | Capture `alt-text` in JATS figures (#7703) | Albert Krewinkel | 1 | -2/+13
Co-authored-by: Aner Lucero <4rgento@gmail.com>
2021-11-19 | Lua: fix global module loading (#7701) | Albert Krewinkel | 1 | -7/+27