pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2021-09-13	pptx: Fix logic for choosing Comparison layout	Emily Bourke	11	-2/+18
	There was a mistake in the logic used to choose between the Comparison and Two Content layouts: if one column contained only non-text (an image or a table) and the other contained only text, the Comparison layout was chosen instead of the desired Two Content layout. This commit fixes that logic: > If either column contains text followed by non-text, use Comparison. Otherwise, use Two Content. It also adds a test asserting this behaviour.
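A minimal sketch of the rule quoted above, assuming each column is modelled as a list of block kinds (`"text"` or `"nontext"`). The function names and the list-of-strings model are hypothetical and are not taken from the pptx writer; "followed by" is read here as "followed at any later position".

```python
from typing import List


def has_text_then_nontext(column: List[str]) -> bool:
    """Return True if a text block is followed (at any later position) by non-text."""
    seen_text = False
    for kind in column:
        if kind == "text":
            seen_text = True
        elif seen_text:
            # A non-text block appearing after some text.
            return True
    return False


def choose_two_column_layout(left: List[str], right: List[str]) -> str:
    """Comparison if either column has text followed by non-text, else Two Content."""
    if has_text_then_nontext(left) or has_text_then_nontext(right):
        return "Comparison"
    return "Two Content"
```

Under this model, a slide with an image in one column and only text in the other yields "Two Content", which is the case the commit says was previously handled wrongly.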
2021-09-12	Docx writer: make id used in native_numbering predictable.	John MacFarlane	1	-0/+0
	If the image has the id IMAGEID, then we use the id ref_IMAGEID for the figure number. Closes #7551. This allows one to create a filter that adds a figure number with figure name, e.g. <w:fldSimple w:instr=" REF ref_superfig "><w:r><w:t>Figure X</w:t></w:r></w:fldSimple> For this to be possible it must be possible to predict the figure number id from the image id. If images lack an id, an id of the form `ref_fig1` is used.
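A rough sketch of how a filter author might predict the id and build the field shown in the message. The helper names are hypothetical, and the assumption that the fallback suffix is the figure's running number is inferred only from the `ref_fig1` example.

```python
from typing import Optional
from xml.sax.saxutils import escape


def figure_ref_id(image_id: Optional[str], figure_number: int) -> str:
    """Predict the figure-number id: ref_IMAGEID, or ref_figN when the image has no id."""
    if image_id:
        return f"ref_{image_id}"
    # Assumption: N is the figure's running number in the document.
    return f"ref_fig{figure_number}"


def figure_ref_field(image_id: str, placeholder: str = "Figure X") -> str:
    """Build a raw OOXML REF field that points at the predicted id."""
    ref = figure_ref_id(image_id, 0)
    return (
        f'<w:fldSimple w:instr=" REF {ref} ">'
        f"<w:r><w:t>{escape(placeholder)}</w:t></w:r></w:fldSimple>"
    )
```

A filter could emit the returned string as raw `openxml` inline content; Word then replaces the placeholder text when the fields are updated.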
2021-09-10	pptx: Include all themes in output archive	Emily Bourke	144	-0/+0
	- Accept test changes: they’re adding the second theme (for all tests not containing speaker notes), or changing its position in the XML (for the ones containing speaker notes).
2021-09-10	pptx: Fix capitalisation of notesMasterId	Emily Bourke	20	-0/+0
	I don’t think this has caused any problems, but before now it’s been "NotesMasterId", which is incorrect according to [ECMA-376]. [ECMA-376]: https://www.ecma-international.org/publications-and-standards/standards/ecma-376/
2021-09-10	Fix command test for #7557.	John MacFarlane	1	-1/+1

2021-09-10	Org reader: don't parse a list as first item in a list item.	John MacFarlane	1	-0/+7
	Closes #7557.
2021-09-10	Support `--reference-location` for HTML output (#7461)	Francesco Mazzoli	5	-5/+78
	The HTML writer now supports `EndOfBlock`, `EndOfSection`, and `EndOfDocument` for reference locations. EPUB and HTML slide show formats are also affected by this change. This works similarly to the markdown writer, but takes special care to skip section divs when determining the block level. The change also takes care not to modify the output if `EndOfDocument` is used.
2021-09-04	RTF reader: better handling of `\*` and bookmarks.	John MacFarlane	1	-1/+1
	We now ensure that groups starting with `\*` never cause text to be added to the document. In addition, bookmarks now create a span between the start and end of the bookmark, rather than an empty span.
2021-09-01	pptx: Add support for more layouts	Emily Bourke	87	-2/+196
	Until now, the pptx writer only supported four slide layouts: “Title Slide” (used for the automatically generated metadata slide), “Section Header” (used for headings above the slide level), “Two Column” (used when there’s a columns div containing at least two column divs), and “Title and Content” (used for all other slides). This commit adds support for three more layouts: Comparison, Content with Caption, and Blank.
	- Support “Comparison” slide layout
	  This layout is used when a slide contains at least two columns, at least one of which contains some text followed by some non-text (e.g. an image or table). The text in each column is inserted into the “body” placeholder for that column, and the non-text is inserted into the ObjType placeholder. Any extra content after the non-text is overlaid on top of the preceding content, rather than being dropped completely (as currently happens for the two-column layout).
	  + Accept straightforward test changes
	    Adding the new layout means the “-deleted-layouts” tests have an additional layout added to the master and master rels.
	  + Add new tests for the comparison layout
	  + Add new tests to pandoc.cabal
	- Support “Content with Caption” slide layout
	  This layout is used when a slide’s body contains some text, followed by non-text (e.g. an image or a table). Before now, in this case the image or table would break onto a new slide: to get that output again, users can add a horizontal rule before the image or table.
	  + Accept straightforward tests
	    The “-deleted-layouts” tests all have an extra layout and relationship in the master for the Content with Caption layout.
	  + Accept remove-empty-slides test
	    Empty slides are still removed, but the Content with Caption layout is now used.
	  + Change slide-level-0/h1-h2-with-text description
	    This test now triggers the content with caption layout, giving a different (but still correct) result.
	  + Add new tests for the new layout
	  + Add new tests to the cabal file
	- Support “Blank” slide layout
	  This layout is used when a slide contains only blank content (e.g. non-breaking spaces). No content is inserted into any placeholders in the layout. Fixes #5097.
	  + Accept straightforward test changes
	    Blank layout now copied over from reference doc as well, when layouts have been deleted.
	  + Add some new tests
	    A slide should use the blank layout if:
	    - It contains only speaker notes
	    - It contains only an empty heading with a body of nbsps
	    - It contains only a heading containing only nbsps
	- Change ContentType -> Placeholder
	  This type was starting to have a constructor for each placeholder on each slide (e.g. `ComparisonUpperLeftContent`). I’ve changed it instead to identify a placeholder by type and index, as I think that’s clearer and less redundant.
	- Describe layout-choosing logic in manual
2021-09-01	pptx: Restructure tests	Emily Bourke	125	-57/+57
	- Use dashes consistently rather than underscores - Make a folder for each set of tests - List test files explicitly (Cabal doesn’t support ** until version 2.4)
2021-08-29	Improve asciidoc escaping for `--` in URLs. Closes #7529.	John MacFarlane	1	-0/+7

2021-08-27	pptx: Make first heading title if slide level is 0	Emily Bourke	21	-0/+56
	Before this commit, the pptx writer adds a slide break before any table, “columns” div, or paragraph starting with an image, unless the only thing before it on the same slide is a heading at the slide level. In that case, the item and heading are kept on the same slide, and the heading is used as the slide title (inserted into the layout’s “title” placeholder). However, if the slide level is set to 0 (as was recently enabled) this makes it impossible to have a slide with a title which contains any of those items in its body. This commit changes this behaviour: now if the slide level is 0, then items will be kept with a heading of any level, if the heading’s the only thing before the item on the same slide.
2021-08-27	Ensure we have unique ids for wp:docPr and pic:cNvPr elements.	John MacFarlane	2	-0/+0
	This will, I hope, fix #7527 and #7503.
2021-08-24	Fix test for #7521.	John MacFarlane	1	-2/+2

2021-08-23	Markdown reader: fix interaction of --strip-comments and list	John MacFarlane	1	-0/+11
	parsing. Use of `--strip-comments` was causing tight lists to be rendered as loose (as if the comment were a blank line). Closes #7521.
2021-08-21	LaTeX-parser: restrict \endinput to current file	Simon Schuster	2	-0/+17

2021-08-20	RST reader: Fix `:literal:` includes.	John MacFarlane	1	-1/+1
	These should create code blocks, not insert raw RST. Closes #7513.
2021-08-18	pptx: Include image title in description	Emily Bourke	8	-0/+0
	The image title (i.e. `![alt text](link "title")`) was previously ignored when writing to pptx. This commit includes it in PowerPoint's description of the image, along with the link (which was already included). Fixes #7352.
2021-08-17	Revise citeproc code to fit new citeproc 0.5 API.	John MacFarlane	4	-8/+8
	Linkification of URLs in the bibliography is now done in the citeproc library, depending on the setting of an option. We set that option depending on the value of the metadata field `link-bibliography` (defaulting to true, for consistency with earlier behavior, though the new behavior includes the CSL draft recommendation of hyperlinking the title or the whole entry if a DOI, PMID, PMCID, or URL field is present but not explicitly rendered). These changes implement the following recommendations from the draft CSL v1.0.2 spec (Appendix VI):
	> The CSL syntax does not have support for configuration of links.
	> However, processors should include links on bibliographic references,
	> using the following rules:
	>
	> If the bibliography entry for an item renders any of the following
	> identifiers, the identifier should be anchored as a link, with the
	> target of the link as follows:
	>
	> - url: output as is
	> - doi: prepend with "`https://doi.org/`"
	> - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`"
	> - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`"
	>
	> If the identifier is rendered as a URI, include rendered URI components
	> (e.g. "`https://doi.org/`") in the link anchor. Do not include any other
	> affix text in the link anchor (e.g. "Available from: ", "doi: ", "PMID: ").
	>
	> If the bibliography entry for an item does not render any of
	> the above identifiers, then set the anchor of the link as the item
	> title. If title is not rendered, then set the anchor of the link as the
	> full bibliography entry for the item. Set the target of the link as one
	> of the following, in order of priority:
	>
	> - doi: prepend with "`https://doi.org/`"
	> - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`"
	> - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`"
	> - url: output as is
	>
	> If the item data does not include any of the above identifiers, do not
	> include a link.
	>
	> Citation processors should include an option flag for calling
	> applications to disable bibliography linking behavior.
	Thanks to Benjamin Bray for getting this all working.
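The fallback rule in the quoted spec text (used when no identifier is rendered in the entry) can be sketched as follows. This illustrates only the stated priority order and is not the citeproc library's code; the dictionary-based item and the function name are hypothetical.

```python
from typing import Dict, Optional

# URL prefixes as given in the quoted draft CSL text.
PREFIXES = {
    "doi": "https://doi.org/",
    "pmcid": "https://www.ncbi.nlm.nih.gov/pmc/articles/",
    "pmid": "https://www.ncbi.nlm.nih.gov/pubmed/",
    "url": "",  # "output as is"
}

# Priority order for the fallback link target.
FALLBACK_PRIORITY = ["doi", "pmcid", "pmid", "url"]


def fallback_link_target(item: Dict[str, str]) -> Optional[str]:
    """Return the link target for an entry that renders no identifier, or None."""
    for field in FALLBACK_PRIORITY:
        value = item.get(field)
        if value:
            return PREFIXES[field] + value
    return None  # no identifier present: do not include a link
```

The anchor for this target would be the rendered title, or the whole entry if the title is not rendered, as the quoted text describes.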
2021-08-17	OOXML tests: silence warnings.	John MacFarlane	1	-0/+1
	These can make the test output confusing, making people think tests are failing when they're passing.
2021-08-17	pptx: Select layouts from reference doc by name	Emily Bourke	45	-7/+24
	Until now, users had to make sure that their reference doc contains layouts in a specific order: the first four layouts in the file had to have a specific structure, or else pandoc would error (or sometimes successfully produce a pptx file, which PowerPoint would then fail to open). This commit changes the layout selection to use the layout names rather than order: users must make sure their reference doc contains four layouts with specific names, and if a layout with the right name isn’t found pandoc will output a warning and use the corresponding layout from the default reference doc as a fallback. I believe the use of names rather than order will be clearer to users, and the clearer errors will help them troubleshoot when things go wrong. - Add tests for moved layouts - Add tests for deleted layouts - Add newly included layouts to slideMaster1.xml to fix tests
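As an illustration of the lookup-by-name-with-fallback behaviour described above, here is a small sketch. It assumes layouts are held in dictionaries keyed by layout name; the function and parameter names are hypothetical and the warning text is invented for the example.

```python
import warnings
from typing import Dict


def select_layout(name: str,
                  reference_layouts: Dict[str, str],
                  default_layouts: Dict[str, str]) -> str:
    """Pick a layout by name, falling back to the default reference doc with a warning."""
    if name in reference_layouts:
        return reference_layouts[name]
    # Layout missing from the user's reference doc: warn and use the default one.
    warnings.warn(f"Layout {name!r} not found in reference doc; using default.")
    return default_layouts[name]
```

Because the lookup is keyed by name, reordering layouts in the reference doc has no effect on which one is chosen.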
2021-08-17	Don’t compare cdLine in OOXML golden tests	Emily Bourke	1	-1/+0
	The `cdLine` field gives the line of the file some CData was found on. I don’t think this is a difference that should fail these golden tests, as the XML should still be parsable if nothing else has changed.
2021-08-17	Provide more detailed XML diff in tests	Emily Bourke	1	-21/+51
	I had some failing tests and couldn’t tell what was different in the XML. Updating the comparison to return what’s different made it easier to figure out what was wrong, and I think will be helpful for others in future.
2021-08-15	Multimarkdown sub- and superscripts (#5512) (#7188)	OCzarnecki	1	-0/+48
	Added an extension `short_subsuperscripts` which modifies the behavior of `subscript` and `superscript`, allowing subscripts or superscripts containing only alphanumerics to end with a space character (eg. `x^2 = 4` or `H~2 is combustible`). This improves support for multimarkdown. Closes #5512. Add `Ext_short_subsuperscripts` constructor to `Extension` [API change]. This is enabled by default for `markdown_mmd`.
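To make the short form concrete, here is a toy regex-based sketch that recognises a `^` or `~` followed by alphanumerics and terminated by whitespace or end of input. It is not pandoc's parser, it renders to HTML tags purely for illustration, and it deliberately leaves the delimited `^...^` and `~...~` forms alone.

```python
import re

# Marker, then one or more alphanumerics, then whitespace or end of string.
SHORT_SUP = re.compile(r"\^([A-Za-z0-9]+)(?=\s|$)")
SHORT_SUB = re.compile(r"~([A-Za-z0-9]+)(?=\s|$)")


def render_short_subsuperscripts(text: str) -> str:
    """Rewrite short-form super/subscripts as <sup>/<sub> (illustration only)."""
    text = SHORT_SUP.sub(r"<sup>\1</sup>", text)
    text = SHORT_SUB.sub(r"<sub>\1</sub>", text)
    return text
```

With the two examples from the commit message, `x^2 = 4` becomes `x<sup>2</sup> = 4` and `H~2 is combustible` becomes `H<sub>2</sub> is combustible`.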
2021-08-15	Make docx writer sensitive to `native_numbering` extension.	John MacFarlane	1	-1/+2
	Figure and table numbers are now only included if `native_numbering` is enabled. (By default it is disabled.) This is a behavior change with respect to 2.14.1, but the behavior is that of previous versions. The change was necessary to avoid incompatibilities between pandoc's native numbering and third-party cross reference filters like pandoc-crossref. Closes #7499.
2021-08-15	Remove misleading description from command/citeproc-87 test.	John MacFarlane	1	-5/+2

2021-08-13	Convert Quoted in bib entries to special Spans...	John MacFarlane	1	-0/+42
	before passing them off to citeproc. This ensures that we get proper localization and flipflopping if, e.g., quotes are used in titles. Closes jgm/citeproc#87.
2021-08-13	Citeproc: avoid odd handling of quotes.	John MacFarlane	1	-0/+16
	citeproc changes allow us to ignore Quoted elements; citeproc now uses its own method for representing quoted things, and only localizes and flipflops quotes it adds itself. See #87. The one thing left to do is to convert Quoted elements in bibliography databases (esp. titles) to `Span ("",["csl-quoted"],[])` before passing them to citeproc, IF the localized quotes for the quote type match the standard inverted commas.
2021-08-13	Fix raw LaTeX injection issue (LaTeX writer).	John MacFarlane	1	-0/+37
	Using a code block containing `\end{verbatim}`, one could inject raw TeX into a LaTeX document even when `raw_tex` is disabled. Thanks to Augustin Laville for noticing the bug. Closes #7497.
2021-08-12	Various sample.lua editorial fixes. (#7493)	William Lupton	1	-8/+7
	These address most of the items mentioned in #7487. There's also a table caption fix (the caption wasn't escaped).
2021-08-11	LaTeX reader: Support `\global` before `\def`, `\let`, etc.	John MacFarlane	1	-0/+12
	See #7494.
2021-08-11	Fix scope for LaTeX macros.	John MacFarlane	1	-0/+50
	They should by default scope over the group in which they are defined (except `\gdef` and `\xdef`, which are global). In addition, environments must be treated as groups. We handle this by making sMacros in the LaTeX parser state a STACK of macro tables. Opening a group adds a table to the stack, closing one removes one. Only the top of the stack is queried. This commit adds a parameter for scope to the Macro constructor (not exported). Closes #7494.
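The stack-of-tables idea can be sketched in a few lines. The message does not say how outer definitions stay visible inside a group or how global definitions are stored, so two points here are assumptions: opening a group pushes a copy of the current top table, and a global definition is written into every table on the stack. The class and method names are hypothetical.

```python
from typing import Dict, List, Optional


class MacroStack:
    """Toy model of group-scoped macro tables."""

    def __init__(self) -> None:
        self.tables: List[Dict[str, str]] = [{}]  # bottom table = document level

    def open_group(self) -> None:
        # Assumption: the new table starts as a copy of the current top,
        # so outer macros remain visible while only the top is queried.
        self.tables.append(dict(self.tables[-1]))

    def close_group(self) -> None:
        if len(self.tables) > 1:
            self.tables.pop()

    def define(self, name: str, body: str, global_: bool = False) -> None:
        # Assumption: global definitions (\gdef, \xdef) are written to every table.
        targets = self.tables if global_ else self.tables[-1:]
        for table in targets:
            table[name] = body

    def lookup(self, name: str) -> Optional[str]:
        # Only the top of the stack is queried.
        return self.tables[-1].get(name)
```

A local definition made inside a group disappears when the group closes, while a global one survives.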
2021-08-11	LaTeX reader: improve handling of plain TeX macro primitives.	John MacFarlane	1	-1/+37
	- Fixed semantics for `\let`. - Implement `\edef`, `\gdef`, and `\xdef`. - Add comment noting that currently `\def` and `\edef` set global macros (so are equivalent to `\gdef` and `\xdef`). This should be fixed by scoping macro definitions to groups, in a future commit. Closes #7474.
2021-08-10	Tests.Helpers: export testGolden and use it in RTF reader.	John MacFarlane	2	-13/+27
	This gives a diff output on failure.
2021-08-10	HTML reader: treat comments as blank when parsing.	John MacFarlane	1	-0/+47
	This modifies pBlank. Previously comments could sometimes flummox the parser. Closes #7482.
2021-08-10	Add test for #7488.	John MacFarlane	3	-0/+447

2021-08-10	Add RTF reader.	John MacFarlane	27	-4/+951
	- `rtf` is now supported as an input format as well as output. - New module Text.Pandoc.Readers.RTF (exporting `readRTF`). [API change] Closes #3982.
2021-08-04	RTF writer: emit \outlinelevel for section headings.	John MacFarlane	1	-31/+31

2021-08-03	LaTeX table writer: Increase column width precision (#7466)	Peter Fabinski	5	-20/+20
	In some cases, the rounding performed by the LaTeX table writer would introduce visible overrun outside the text area. This adds two more decimal places to the width values.
2021-08-01	RTF writer: omit `\bin` in `\pict`.	John MacFarlane	1	-2/+2
	According to the spec, this is not needed or wanted when the data is in hexadecimal format, as it is here.
2021-08-01	RTF template: specify font family for fixed-width font f1.	John MacFarlane	1	-1/+1
	According to the spec, this is mandatory.
2021-07-11	DocBook reader: add support for citerefentry (#7437)	Jan Tojnar	2	-0/+4
	citerefentry was originally intended for referring to UNIX manual pages, either part of the same DocBook document as a refentry element or external, hence the manvolnum element. These days, refentry is more general; for example, the element documentation pages linked below are each a refentry. As per the Processing expectations section of citerefentry, the element is supposed to be a hyperlink to a refentry (when in the same document), but pandoc does not support the refentry tag at the moment, so that is moot. https://tdg.docbook.org/tdg/5.1/citerefentry.html https://tdg.docbook.org/tdg/5.1/manvolnum.html https://tdg.docbook.org/tdg/5.1/refentry.html This roughly corresponds to a `manpage` role in rST syntax, which produces a `Code` AST node with attributes `.interpreted-text role=manpage`, but that does not fit the DocBook parser. https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#role-manpage
2021-07-11	Improved parsing of raw LaTeX from Text streams (rawLaTeXParser).	John MacFarlane	1	-0/+15
	We now use source positions from the token stream to tell us how much of the text stream to consume. Getting this to work required a few other changes to make token source positions accurate. Closes #7434.
2021-07-09	RST reader: fix regression with code includes.	John MacFarlane	2	-0/+17
	With the recent changes to include infrastructure, included code blocks were getting an extra newline. Closes #7436. Added regression test.
2021-07-06	Recognize data-external when reading HTML img tags (#7429)	Michael Hoffmann	1	-0/+6
	Preserve all attributes in img tags. If attributes have a `data-` prefix, it will be stripped. In particular, this preserves a `data-external` attribute as an `external` attribute in the pandoc AST.
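The prefix-stripping step can be illustrated as below. This is a sketch over a plain list of key/value pairs rather than the HTML reader's own data types, and the function name is hypothetical.

```python
from typing import List, Tuple


def normalize_img_attributes(attrs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep every attribute, dropping a leading 'data-' from attribute names."""
    result = []
    for key, value in attrs:
        if key.startswith("data-"):
            key = key[len("data-"):]
        result.append((key, value))
    return result
```

So an `img` tag carrying `data-external="1"` ends up with an `external` attribute whose value is `1`.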
2021-07-05	Add command test for #7394.	John MacFarlane	1	-0/+85
	And fix a small bug in handling of citations in notes, which led to commas at the end of sentences in some cases.
2021-07-05	document-css: reset overflow-wrap on code blocks	Mauro Bieg	4	-4/+8
	fixes #7423
2021-07-03	Revert "LaTeX template: move title, author, date up to top of preamble."	John MacFarlane	4	-13/+13
	This reverts commit cc088687b4013c2b8b744eb337ed04fc63f315f2 and PR #7295. This fixes issues people had when using LaTeX commands defined later in the preamble (or in some cases UTF-8 text) in the title or author fields. Closes #7422.
2021-07-02	HTML5 writer, remove aria-hidden when explicit alt text is provided.	Aner Lucero	1	-1/+1

2021-06-29	Docx writer: Add table numbering for captioned tables.	John MacFarlane	2	-1/+3
	The numbers are added using fields, so that Word can create a list of tables that will update automatically.