pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2021-11-04	Allow `plain` to be used in raw attribute syntax.	John MacFarlane	2	-2/+4

2021-10-29	Docx writer: move ": " out of the caption bookmark.	Tristan Stenner	2	-6/+4
	This is needed so that native references to the figure are included as "As seen in Figure X, it is..." instead of "As seen in [Figure: , it is..."
2021-10-27	Markdown writer: Be sure to quote special values in YAML metadata.	John MacFarlane	1	-3/+13
	E.g. "Y", "yes", which are now (with yaml library) considered boolean values, as well as "null". This fixes a bug with roundtripping markdown -> markdown:

	```
	---
	foo: "true"
	...
	```
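The quoting rule described in this commit can be sketched as follows. This is a minimal illustration in Python, not pandoc's Haskell implementation; the function names and the full contents of `SPECIAL_YAML_VALUES` are assumptions (the commit itself names only "Y", "yes", and "null", and shows "true" in its example).

```python
# Hypothetical set of plain scalars that a YAML parser may read back as
# booleans or null. Only "Y", "yes", "null" and "true" come from the commit
# message; the remaining entries are assumptions for illustration.
SPECIAL_YAML_VALUES = {"y", "yes", "n", "no", "true", "false", "on", "off", "null", "~"}


def yaml_scalar(value: str) -> str:
    """Return `value` as a YAML scalar, double-quoting it when it would
    otherwise be read back as a boolean or null instead of a string."""
    if value.lower() in SPECIAL_YAML_VALUES:
        return '"' + value + '"'
    return value


def metadata_line(key: str, value: str) -> str:
    """Render one `key: value` line of a YAML metadata block."""
    return f"{key}: {yaml_scalar(value)}"
```

With this sketch, `metadata_line("foo", "true")` keeps the quotes, so the value survives a markdown -> markdown round trip as a string.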
2021-10-22	Use simpleFigure in Readers.	Aner Lucero	19	-66/+43

2021-10-22	Switch to hslua-2.0	Albert Krewinkel	1	-72/+82
	The new HsLua version takes a somewhat different approach to marshalling and unmarshalling, relying less on typeclasses and more on specialized types. This allows for better performance and improved error messages. Furthermore, new abstractions make it possible to document the code and the exposed functions.
2021-10-17	pptx: Line up continuation paragraphs	Emily Bourke	2	-10/+93
	This commit changes the `marL` and `indent` values used for plain paragraphs and numbered lists, and changes the spacing defined in the reference doc master for bulleted lists.

	For paragraphs, there is now a left-indent taken from the `otherStyle` in the master. For numbered lists, the number is positioned where the text would be if this were a plain paragraph, and the text is indented to the next level. This means that continuation paragraphs line up nicely with numbered lists.

	It also /mostly/ matches the observed PowerPoint behaviour when inserting paragraphs and numbered lists: the only difference is that PowerPoint was using a different margin value for the first level numbered lists – I’ve changed this to match the other levels, as I don’t think it makes the spacing unappealing and it allows continuation paragraphs at any level to line up.

	With bulleted lists, I’m keeping the observed PowerPoint behaviour of specifying only a level, letting `marL` and `indent` be automatically taken from `bodyStyle`. To that end, this commit changes the `bodyStyle` spacing in the master of the default reference doc, to:

	- line up the text of the first paragraph in each bullet with any continuation paragraphs
	- line up nested bullet markers in any continuation paragraphs with the first paragraph, matching lists and plain paragraphs

	This does mean the continuation paragraphs still won’t line up for anyone using their own reference doc where they haven’t matched the `otherStyle` and `bodyStyle` indent levels, but I think people in that situation will be able to troubleshoot.
2021-10-17	pptx: Remove outdated comment	Emily Bourke	1	-3/+0
	I recently removed the field this comment refers to, but missed the comment.
2021-10-17	pptx: Fix list level numbering	Emily Bourke	1	-14/+17
	In PowerPoint, the content of a top-level list is at the same level as the content of a top-level paragraph – the only difference is that a list style has been applied. At the moment, the pptx writer increments the paragraph level on each list, turning what should be top-level lists into second-level lists. This commit changes that logic, only incrementing the paragraph level on continuation paragraphs of lists.

	- Fixes https://github.com/jgm/pandoc/issues/4828
	- Fixes https://github.com/jgm/pandoc/issues/4663
2021-10-14	asciidoc writer: translate numberLines attribute to linesnum switch	Samuel Tardieu	1	-2/+5
	AsciiDoctor allows line numbering to be requested on code blocks by using a switch on the `source` block, such as in:

	```
	[source%linesnum,haskell]
	----
	some Haskell code here
	----
	```
2021-10-12	Revert "Depend on pandoc-types 1.23, remove Null constructor on Block."	John MacFarlane	28	-1/+36
	This reverts commit fb0d6c7cb63a791fa72becf21ed493282e65ea91.
2021-10-11	T.P.Writers.Shared: remove 'breakable'...	John MacFarlane	1	-18/+0
	which was introduced in the cherry-pick'd commit that added splitSentences, but isn't needed here. (It is for the nospace branch.)
2021-10-11	T.P.Writers.Shared: Export splitSentences as a Doc Text transform.	John MacFarlane	3	-16/+61
	[API change] Use this in man/ms.
2021-10-11	Remove splitSentences from T.P.Shared [API change].	John MacFarlane	2	-6/+4
	We used to attempt automatic sentence splitting in man and ms output, since sentence-ending periods need to be followed by two spaces or a newline in these formats. But it's difficult to do this reliably at the level of `[Inline]`.
2021-10-05	Avoid bad wraps in markdown writer at the Doc Text level.	John MacFarlane	1	-22/+23
	Previously we tried to do this at the Inline list level, but it makes more sense to intervene on breaking spaces at the Doc Text level.
2021-10-04	Powerpoint writer: consolidate text runs when possible.	John MacFarlane	2	-4/+9
	This slims down the output files by avoiding unnecessary text run elements. Updated golden tests.
2021-10-04	Revert "Powerpoint writer: consolidate text run nodes."	John MacFarlane	1	-9/+1
	This reverts commit 62f83aa48633af477913bde6f615fe9f8793901a. This was already being done, it seems. I misidentified the problem; it is really with `Str ""` nodes.
2021-10-04	Powerpoint writer: consolidate text run nodes.	John MacFarlane	1	-1/+9
	This should reduce the size of the generated files.
2021-10-01	Depend on pandoc-types 1.23, remove Null constructor on Block.	John MacFarlane	28	-36/+1

2021-09-30	epub: Add EPUB3 subject metadata (authority/term)	nuew	1	-10/+31
	This adds the ability to specify EPUB 3 `authority` and `term` specific refinements to the `subject` tag. Specifying a plain `subject` tag in metadata will function as before.
2021-09-29	EPUB writer: treat epub:type "frontispiece" as front matter.	John MacFarlane	1	-1/+1
	This allows you to include a frontispiece using ``` ![](yourimage.jpg) etc. ``` Closes #7600.
2021-09-28	Switch from pretty-simple to pretty-show for native output.	John MacFarlane	1	-12/+8
	Update tests. Reason: it turns out that the native output generated by pretty-simple isn't always readable by the native reader. According to https://github.com/cdepillabout/pretty-simple/issues/99 it is not a design goal of the library that the rendered values be readable using 'read'. This makes it unsuitable for our purposes. pretty-show is a bit slower and it uses 4-space indents (non-configurable), but it doesn't have this serious drawback.
2021-09-26	RST writer: properly handle anchors to ids...	John MacFarlane	1	-1/+6
	with spaces or leading underscore. In these cases we need the quoted form, e.g.

	```
	.. _`foo bar`:
	.. _`_foo`:
	```

	Side note: rST will "normalize" these identifiers anyway, ignoring the underscore: https://docutils.sourceforge.io/docs/ref/rst/directives.html#identifier-normalization

	Closes #7593.
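As a rough illustration of the quoting rule in this commit (a Python sketch, not the RST writer's Haskell code; `rst_target` is a hypothetical name), an anchor is backtick-quoted only when its id contains a space or starts with an underscore:

```python
def rst_target(identifier: str) -> str:
    """Render an rST hyperlink target line for `identifier`.

    Ids containing a space or starting with an underscore are wrapped
    in backticks; all other ids are emitted unquoted.
    """
    needs_quoting = " " in identifier or identifier.startswith("_")
    name = f"`{identifier}`" if needs_quoting else identifier
    return f".. _{name}:"
```

For example, `rst_target("foo bar")` and `rst_target("_foo")` produce the two quoted forms shown in the commit message, while `rst_target("foo")` stays unquoted.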
2021-09-23	HTML writer: render `\ref` and `\eqref` as inline math...	John MacFarlane	1	-8/+11
	not display. See #7589.
2021-09-22	HTML writer: pass through `\ref` and `\eqref`...	John MacFarlane	1	-2/+10
	if MathJax is used. Closes #7587.
2021-09-22	HTML writer: pass through inline math environments with KaTeX.	John MacFarlane	1	-0/+1

2021-09-21	Use pretty-simple to format native output.	John MacFarlane	1	-73/+15
	Previously we used our own homespun formatting. But this produces over-long lines that aren't ideal for diffs in tests. Easier to use something off-the-shelf and standard. Closes #7580. Performance is slower by about a factor of 10, but this isn't really a problem because native isn't suitable as a serialization format. (For serialization you should use json, because the reader is so much faster than native.)
2021-09-19	Use babel, not polyglossia, with xelatex.	John MacFarlane	3	-102/+13
	Previously polyglossia worked better with xelatex, but that is no longer the case, so we simplify the code so that babel is used with all latex engines. This involves a change to the default LaTeX template.
2021-09-19	Markdown writer: use `underline` class rather than `ul` for underline.	John MacFarlane	1	-1/+1
	This only affects output with bracketed_spans enabled. The markdown reader parses spans with either `.ul` or `.underline` as Underline elements, but we're moving towards preferring the latter.
2021-09-18	pptx-footers: Replace fixed dates with yaml date	Emily Bourke	2	-8/+44

2021-09-18	pptx: Support footers in the reference doc	Emily Bourke	1	-15/+88
	In PowerPoint, it’s possible to specify footers across all slides, containing a date (optionally automatically updated to today’s date), the slide number (optionally starting from a higher number than 1), and static text. There’s also an option to hide the footer on the title slide.

	Before this commit, none of that footer content was pulled through from the reference doc: this commit supports all the functionality listed above.

	There is one behaviour which may not be immediately obvious: if the reference doc specifies a fixed date (i.e. not automatically updating), and there’s a date specified in the metadata for the document, the footer date is replaced by the metadata date.

	- Include date, slide number, and static footer content from reference doc
	- Respect “slide number starts from” option
	- Respect “Don’t show on title slide” option
	- Add tests
2021-09-17	Org writer: don't indent contents of code blocks.	John MacFarlane	1	-1/+1
	We previously indented them by two spaces, following a common convention. Since the convention is fading, and the indentation is inconvenient for copy/paste, we are discontinuing this practice. Closes #5440.
2021-09-17	Update list of supported source languages in org writer.	John MacFarlane	1	-12/+44
	See #5440.
2021-09-16	pptx: Support specifying slide background images	Emily Bourke	2	-36/+110
	In the reveal-js output, it’s possible to use reveal’s `data-background-image` class on a slide’s title to specify a background image for the slide. With this commit, it’s possible to use `background-image` in the same way for pptx output. Only the “stretch” mode is supported, and the background image is centred around the slide in the image’s larger axis, matching the observed default behaviour of PowerPoint.

	- Support `background-image` per slide.
	- Add tests.
	- Update manual.
2021-09-16	HTML writer: set "hash" to True by default (for reveal.js).	John MacFarlane	1	-1/+1
	Closes #7574. See #6968 where the motivation for setting "hash" to True is explained.
2021-09-15	pptx: Add support for incremental lists	Emily Bourke	2	-162/+455
	- Support -i option
	- Support incremental/nonincremental divs
	- Support older block quote syntax
	- Add tests

	One thing not clear from the manual is what should happen when the input uses a combination of these things. For example, what should the following produce?

	```md
	::: {.incremental .nonincremental}
	- are
	- these
	- incremental?
	:::

	::: incremental
	::::: nonincremental
	- or
	- these?
	:::::
	:::

	::: nonincremental
	> - how
	> - about
	> - these?
	:::
	```

	In this commit I’ve taken the following approach, matching the observed behaviour for beamer and reveal.js output:

	- if a div has both classes, incremental wins
	- the innermost incremental/nonincremental div is the one which takes effect
	- a block quote containing a list as its first element inverts whether the list is incremental, whether or not the quote is inside an incremental/non-incremental div

	I’ve added some tests to verify this behaviour.

	This commit closes issue #5689 (https://github.com/jgm/pandoc/issues/5689).
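The three precedence rules above can be sketched as a small function. This is an illustrative Python model, not the pptx writer's code; the function name, the representation of enclosing divs as a list of class sets, and the `default` parameter (standing in for the `-i` option) are assumptions.

```python
from typing import List, Set


def list_is_incremental(default: bool,
                        div_classes: List[Set[str]],
                        in_block_quote: bool) -> bool:
    """Decide whether a list is rendered incrementally.

    `default` models the -i option; `div_classes` holds the class sets of
    the enclosing divs, outermost first; `in_block_quote` is True when the
    list is the first element of a block quote.
    """
    incremental = default
    # Walk from the innermost div outwards: the first div carrying either
    # class decides, and "incremental" wins if a div has both classes.
    for classes in reversed(div_classes):
        if "incremental" in classes:
            incremental = True
            break
        if "nonincremental" in classes:
            incremental = False
            break
    # A block quote inverts the result, inside a div or not.
    if in_block_quote:
        incremental = not incremental
    return incremental
```

Applied to the three lists in the example input with `default=False`, this model yields incremental, non-incremental, and incremental respectively.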
2021-09-13	pptx: Fix logic for choosing Comparison layout	Emily Bourke	1	-4/+5
	There was a mistake in the logic used to choose between the Comparison and Two Content layouts: if one column contained only non-text (an image or a table) and the other contained only text, the Comparison layout was chosen instead of the desired Two Content layout. This commit fixes that logic:

	> If either column contains text followed by non-text, use Comparison. Otherwise, use Two Content.

	It also adds a test asserting this behaviour.
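The corrected rule can be modelled roughly as below. This is a Python sketch under stated assumptions, not the writer's Haskell logic: each column is represented as a list of `"text"` / `"non-text"` markers, both function names are hypothetical, and "text followed by non-text" is read as "some text item appears before some non-text item".

```python
from typing import List


def has_text_then_non_text(column: List[str]) -> bool:
    """True if some "text" item is later followed by a non-text item."""
    seen_text = False
    for kind in column:
        if kind == "text":
            seen_text = True
        elif seen_text:
            # A non-text item (image or table) after earlier text.
            return True
    return False


def choose_layout(left: List[str], right: List[str]) -> str:
    """Pick between the Comparison and Two Content layouts."""
    if has_text_then_non_text(left) or has_text_then_non_text(right):
        return "Comparison"
    return "Two Content"
```

In this model the case the commit fixes, `choose_layout(["non-text"], ["text"])`, gives "Two Content", since neither column has text followed by non-text.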
2021-09-12	Docx writer: make id used in native_numbering predictable.	John MacFarlane	1	-3/+6
	If the image has the id IMAGEID, then we use the id ref_IMAGEID for the figure number. Closes #7551.

	This allows one to create a filter that adds a figure number with figure name, e.g.

	<w:fldSimple w:instr=" REF ref_superfig "><w:r><w:t>Figure X</w:t></w:r></w:fldSimple>

	For this to work, the figure number id must be predictable from the image id. If images lack an id, an id of the form `ref_fig1` is used.
2021-09-10	fix!(ipynb writer): improve round trip identity	Kolen Cheung	1	-2/+2
	for raw cell output

	BREAKING CHANGE: The Jupyter ecosystem, including nbconvert, lab and notebook, deviated from their own spec in nbformat, where they used the key `raw_mimetype` instead of `format`. Moreover, the mime-type of rst used in Jupyter deviated from that suggested by https://docutils.sourceforge.io/FAQ.html and is defined as `text/restructuredtext` when chosen from "Raw NBConvert Format" in Jupyter. So while this is backward-compatible, it should match real-world usage better, hence improving the round-trip "identity" in raw cells. See #229, jupyter/nbformat#229.
2021-09-10	feat(ipynb writer): add more Jupyter's "Raw NBConvert Format"	Kolen Cheung	1	-0/+7
	Adds more formats that Jupyter's "Raw NBConvert Format" uses natively (asciidoc), and maps more formats to text/html whenever it makes sense.
2021-09-10	pptx: Copy embedded fonts from reference doc	Emily Bourke	1	-0/+1
	We already copy the relationships and elements in presentation.xml for embedded fonts, so at the moment using a reference doc with embedded fonts is broken, producing a pptx that PowerPoint says needs repairing. This commit copies the fonts over, which I believe is all that’s needed to work correctly with reference docs with embedded fonts.
2021-09-10	pptx: Fix presentation rel numbering	Emily Bourke	1	-63/+131
	Before now, the numbering of rIds was inconsistent when making the presentation XML and when making the presentation relationships XML. For the relationships, the slides were inserted into the rId order after the first master, and everything else was moved up out of the way. However, this change was then missed in the presentation XML, I think because `envSlideOffset` was never set.

	The result was that any slide masters after the first would have the wrong rIds in the presentation XML, clashing with the slides, which would lead PowerPoint to view produced files as corrupt. As well, other relationships (like embedded fonts) would have their rId changed in the relationships XML but not in the presentation XML.

	This commit:

	- Removes `envSlideOffset` in favour of directly passed function arguments
	- Inserts the slides into the rId order after all masters rather than after the first
	- Updates any other rIds in presentation.xml that need to be changed
2021-09-10	pptx: Include all themes in output archive	Emily Bourke	1	-4/+2
	- Accept test changes: they’re adding the second theme (for all tests not containing speaker notes), or changing its position in the XML (for the ones containing speaker notes).
2021-09-10	pptx: Don’t add relationships unnecessarily	Emily Bourke	1	-5/+14
	Before now, for any layouts added to the output from the default reference doc, the relationships were unconditionally added to the output. However, if there was already a layout in slideMaster1 at the same index then that results in duplicate relationships. This commit checks first, and only adds the relationship if it doesn’t already exist.
2021-09-10	pptx: Fix capitalisation of notesMasterId	Emily Bourke	1	-1/+1
	I don’t think this has caused any problems, but before now it’s been "NotesMasterId", which is incorrect according to [ECMA-376]. [ECMA-376]: https://www.ecma-international.org/publications-and-standards/standards/ecma-376/
2021-09-10	Support `--reference-location` for HTML output (#7461)	Francesco Mazzoli	1	-32/+89
	The HTML writer now supports `EndOfBlock`, `EndOfSection`, and `EndOfDocument` for reference locations. EPUB and HTML slide show formats are also affected by this change. This works similarly to the markdown writer, but takes special care to skip section divs when determining the block level. The change also takes care not to modify the output if `EndOfDocument` is used.
2021-09-01	pptx: Add support for more layouts	Emily Bourke	2	-55/+313
	Until now, the pptx writer only supported four slide layouts: “Title Slide” (used for the automatically generated metadata slide), “Section Header” (used for headings above the slide level), “Two Column” (used when there’s a columns div containing at least two column divs), and “Title and Content” (used for all other slides).

	This commit adds support for three more layouts: Comparison, Content with Caption, and Blank.

	- Support “Comparison” slide layout

	  This layout is used when a slide contains at least two columns, at least one of which contains some text followed by some non-text (e.g. an image or table). The text in each column is inserted into the “body” placeholder for that column, and the non-text is inserted into the ObjType placeholder. Any extra content after the non-text is overlaid on top of the preceding content, rather than dropping it completely (as currently happens for the two-column layout).

	  + Accept straightforward test changes

	    Adding the new layout means the “-deleted-layouts” tests have an additional layout added to the master and master rels.

	  + Add new tests for the comparison layout
	  + Add new tests to pandoc.cabal

	- Support “Content with Caption” slide layout

	  This layout is used when a slide’s body contains some text, followed by non-text (e.g. an image or a table). Before now, in this case the image or table would break onto a new slide: to get that output again, users can add a horizontal rule before the image or table.

	  + Accept straightforward tests

	    The “-deleted-layouts” tests all have an extra layout and relationship in the master for the Content with Caption layout.

	  + Accept remove-empty-slides test

	    Empty slides are still removed, but the Content with Caption layout is now used.

	  + Change slide-level-0/h1-h2-with-text description

	    This test now triggers the content with caption layout, giving a different (but still correct) result.

	  + Add new tests for the new layout
	  + Add new tests to the cabal file

	- Support “Blank” slide layout

	  This layout is used when a slide contains only blank content (e.g. non-breaking spaces). No content is inserted into any placeholders in the layout.

	  Fixes #5097.

	  + Accept straightforward test changes

	    Blank layout now copied over from reference doc as well, when layouts have been deleted.

	  + Add some new tests

	    A slide should use the blank layout if:

	    - It contains only speaker notes
	    - It contains only an empty heading with a body of nbsps
	    - It contains only a heading containing only nbsps

	- Change ContentType -> Placeholder

	  This type was starting to have a constructor for each placeholder on each slide (e.g. `ComparisonUpperLeftContent`). I’ve changed it instead to identify a placeholder by type and index, as I think that’s clearer and less redundant.

	- Describe layout-choosing logic in manual
2021-08-29	Improve asciidoc escaping for `--` in URLs. Closes #7529.	John MacFarlane	1	-3/+11

2021-08-28	Remove unneeded import.	John MacFarlane	1	-1/+1

2021-08-28	Docx writer: handle SVG images.	John MacFarlane	1	-3/+38
	This change has several parts:

	- In Text.Pandoc.App, if the writer is docx, we fill the media bag and attempt to convert any SVG images to PNG, adding these to the media bag. The PNG backups have the same filenames as the SVG images, but with an added .png extension. If the conversion cannot be done (e.g. because rsvg-convert is not present), a warning is emitted.
	- In Text.Pandoc.Writers.Docx, we now use Word 2016's syntax for including SVG images. If a PNG fallback is present in the media bag, we include a link to that too.

	It would be helpful if someone with an old Word version could test to see that the documents we produce can be opened and viewed with the PNG fallbacks. If not, then perhaps we can eliminate the slightly complex code for producing these fallbacks.

	Closes #4058.
2021-08-27	pptx: Make first heading title if slide level is 0	Emily Bourke	1	-24/+29
	Before this commit, the pptx writer adds a slide break before any table, “columns” div, or paragraph starting with an image, unless the only thing before it on the same slide is a heading at the slide level. In that case, the item and heading are kept on the same slide, and the heading is used as the slide title (inserted into the layout’s “title” placeholder). However, if the slide level is set to 0 (as was recently enabled) this makes it impossible to have a slide with a title which contains any of those items in its body. This commit changes this behaviour: now if the slide level is 0, then items will be kept with a heading of any level, if the heading’s the only thing before the item on the same slide.