pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2021-08-28	Docx writer: handle SVG images.	John MacFarlane	2	-6/+56
	This change has several parts: - In Text.Pandoc.App, if the writer is docx, we fill the media bag and attempt to convert any SVG images to PNG, adding these to the media bag. The PNG backups have the same filenames as the SVG images, but with an added .png extension. If the conversion cannot be done (e.g. because rsvg-convert is not present), a warning is emitted. - In Text.Pandoc.Writers.Docx, we now use Word 2016's syntax for including SVG images. If a PNG fallback is present in the media bag, we include a link to that too. It would be helpful if someone with an old Word version could test to see that the documents we produce can be opened and viewed with the PNG fallbacks. If not, then perhaps we can eliminate the slightly complex code for producing these fallbacks. Closes #4058.
2021-08-27	Image: Generalize svgToPng to MonadIO.	John MacFarlane	1	-4/+5

2021-08-27	Add haddock for dpi parameter.	John MacFarlane	1	-1/+1

2021-08-27	T.P.Image: svgToPng, change first parameter from WriterOptions to Int.	John MacFarlane	1	-4/+4
	The information we need is just a DPI, so why require more?
2021-08-27	pptx: Make first heading title if slide level is 0	Emily Bourke	1	-24/+29
	Before this commit, the pptx writer adds a slide break before any table, “columns” div, or paragraph starting with an image, unless the only thing before it on the same slide is a heading at the slide level. In that case, the item and heading are kept on the same slide, and the heading is used as the slide title (inserted into the layout’s “title” placeholder). However, if the slide level is set to 0 (as was recently enabled) this makes it impossible to have a slide with a title which contains any of those items in its body. This commit changes this behaviour: now if the slide level is 0, then items will be kept with a heading of any level, if the heading’s the only thing before the item on the same slide.
2021-08-27	Ensure we have unique ids for wp:docPr and pic:cNvPr elements.	John MacFarlane	1	-9/+11
	This will, I hope, fix #7527 and #7503.
2021-08-24	Comment out unused module.	John MacFarlane	1	-1/+1

2021-08-24	Reorganize App to make it easier to limit IO in main loop.	John MacFarlane	1	-85/+100
	Previously we used liftIO fairly liberally. The code has been restructured to avoid this. A small behavior change is that pandoc will now fall back to latin1 encoding for inputs that can't be read as UTF-8. This is what it did previously for content fetched from the web and not marked as to content type. It makes sense to do the same for local files.
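
The latin1 fallback described above can be sketched as follows. This is a Python illustration of the idea only, not pandoc's Haskell code; the function name `decode_input` is hypothetical.

```python
def decode_input(raw: bytes) -> str:
    """Decode input bytes as UTF-8, falling back to latin1.

    Hypothetical helper: it mirrors the behavior described in the
    commit message, not pandoc's actual implementation.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin1 maps every byte to a code point, so this cannot fail.
        return raw.decode("latin-1")
```

Because latin1 assigns a character to every byte value, the fallback always succeeds, although non-latin1 input may come out as the wrong characters.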
2021-08-24	Text.Pandoc.Class: add readStdinStrict method to PandocMonad.	John MacFarlane	5	-0/+17
	[API change]
2021-08-24	Class: Generalize type of extractMedia.	John MacFarlane	1	-1/+1
	It was uselessly restricted to PandocIO, instead of any instance of PandocMonad and MonadIO. [API change]
2021-08-24	T.P.App.OutputSettings: Generalize some types...	John MacFarlane	2	-7/+6
	so we can run this with any instance of PandocMonad and MonadIO, not just PandocIO.
2021-08-24	Text.Pandoc.Filter: Generalize type of applyFilters...	John MacFarlane	3	-9/+91
	from PandocIO to any instance of MonadIO and PandocMonad. [API change]
2021-08-24	PDF: generalize type of makePDF...	John MacFarlane	1	-40/+55
	instead of PandocIO, it can be used in any instance of PandocMonad, MonadIO, and MonadMask. [API change]
2021-08-24	Lua subsystem and custom writers: generalize types from PandocIO...	John MacFarlane	3	-8/+8
	to any instance of PandocMonad and MonadIO. This involves an API change, since the type of runLua is now (PandocMonad m, MonadIO m) => Lua a -> m (Either PandocError a)
2021-08-23	Markdown reader: fix interaction of --strip-comments and list	John MacFarlane	1	-1/+1
	parsing. Use of `--strip-comments` was causing tight lists to be rendered as loose (as if the comment were a blank line). Closes #7521.
2021-08-22	Clean up PDF module.	John MacFarlane	1	-59/+49
	Previously we had to run runIOorExplode inside withTempDir. Now that PandocIO is an instance of MonadMask, this is no longer necessary.
2021-08-22	PandocIO: derive MonadCatch, MonadThrow, MonadMask.	John MacFarlane	1	-0/+4
	This will allow us to use withTempDir.
2021-08-22	App: Move output-file writing out of PandocMonad action.	John MacFarlane	1	-29/+29

2021-08-21	LaTeX-parser: restrict \endinput to current file	Simon Schuster	2	-1/+9

2021-08-20	RST reader: Fix `:literal:` includes.	John MacFarlane	1	-5/+2
	These should create code blocks, not insert raw RST. Closes #7513.
2021-08-19	Improve docx reader's robustness in extracting images.	John MacFarlane	1	-5/+6
	The docx reader made a couple assumptions about how docx containers were laid out that were not always true, with the result that some images in documents did not get found/extracted. Closes #7511.
2021-08-18	pptx: Include image title in description	Emily Bourke	2	-12/+19
	The image title (i.e. `![alt text](link "title")`) was previously ignored when writing to pptx. This commit includes it in PowerPoint's description of the image, along with the link (which was already included). Fixes #7352.
2021-08-17	Revise citeproc code to fit new citeproc 0.5 API.	John MacFarlane	3	-43/+13
	Linkification of URLs in the bibliography is now done in the citeproc library, depending on the setting of an option. We set that option depending on the value of the metadata field `link-bibliography` (defaulting to true, for consistency with earlier behavior, though the new behavior includes the CSL draft recommendation of hyperlinking the title or the whole entry if a DOI, PMID, PMCID, or URL field is present but not explicitly rendered). These changes implement the following recommendations from the draft CSL v1.0.2 spec (Appendix VI):
	> The CSL syntax does not have support for configuration of links. However, processors should include links on bibliographic references, using the following rules:
	>
	> If the bibliography entry for an item renders any of the following identifiers, the identifier should be anchored as a link, with the target of the link as follows:
	>
	> - url: output as is
	> - doi: prepend with "`https://doi.org/`"
	> - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`"
	> - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`"
	>
	> If the identifier is rendered as a URI, include rendered URI components (e.g. "`https://doi.org/`") in the link anchor. Do not include any other affix text in the link anchor (e.g. "Available from: ", "doi: ", "PMID: ").
	>
	> If the bibliography entry for an item does not render any of the above identifiers, then set the anchor of the link as the item title. If title is not rendered, then set the anchor of the link as the full bibliography entry for the item. Set the target of the link as one of the following, in order of priority:
	>
	> - doi: prepend with "`https://doi.org/`"
	> - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`"
	> - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`"
	> - url: output as is
	>
	> If the item data does not include any of the above identifiers, do not include a link.
	>
	> Citation processors should include an option flag for calling applications to disable bibliography linking behavior.
	Thanks to Benjamin Bray for getting this all working.
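
The link-target rules quoted from the draft CSL spec can be sketched in a few lines. This is a Python illustration, not the citeproc library's Haskell code; the names `PREFIXES`, `FALLBACK_PRIORITY`, `link_target`, and `fallback_link` are hypothetical, and an item is assumed to be a plain dictionary of field values.

```python
from typing import Optional

# URL prefixes taken from the quoted draft CSL recommendation.
PREFIXES = {
    "url": "",  # output as is
    "doi": "https://doi.org/",
    "pmid": "https://www.ncbi.nlm.nih.gov/pubmed/",
    "pmcid": "https://www.ncbi.nlm.nih.gov/pmc/articles/",
}

# Priority order used when no identifier is rendered in the entry.
FALLBACK_PRIORITY = ["doi", "pmcid", "pmid", "url"]


def link_target(field: str, value: str) -> str:
    """Build the link target for a rendered identifier."""
    return PREFIXES[field] + value


def fallback_link(item: dict) -> Optional[str]:
    """Pick the target for the title/entry link, or None if no identifier exists."""
    for field in FALLBACK_PRIORITY:
        value = item.get(field)
        if value:
            return link_target(field, value)
    return None
```

For example, an item carrying both a `url` and a `doi` gets the DOI link when neither identifier is rendered, because `doi` comes first in the priority list.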
2021-08-17	Rename TemplateWarning -> PowerpointTemplateWarning.	John MacFarlane	2	-6/+7
	@undergroundquizscene - I think TemplateWarning is apt to be confusing, since this actually doesn't have anything to do with what we call 'templates' in pandoc. Hence the change to a powerpoint-specific name.
2021-08-17	pptx: Select layouts from reference doc by name	Emily Bourke	1	-19/+206
	Until now, users had to make sure that their reference doc contains layouts in a specific order: the first four layouts in the file had to have a specific structure, or else pandoc would error (or sometimes successfully produce a pptx file, which PowerPoint would then fail to open). This commit changes the layout selection to use the layout names rather than order: users must make sure their reference doc contains four layouts with specific names, and if a layout with the right name isn’t found pandoc will output a warning and use the corresponding layout from the default reference doc as a fallback. I believe the use of names rather than order will be clearer to users, and the clearer errors will help them troubleshoot when things go wrong. - Add tests for moved layouts - Add tests for deleted layouts - Add newly included layouts to slideMaster1.xml to fix tests
2021-08-17	Add TemplateWarning log message type [API change]	Emily Bourke	1	-0/+6
	This is a general warning to use for messages about templates.
2021-08-17	Escape backslashes in haddock comments (#7505)	Emily Bourke	1	-4/+4
	Any literal backslash needs to be escaped: these are currently showing up as “‘r’” instead of “‘\r’”. Co-authored-by: Emily Bourke <undergroundquizscene@protonmail.com>
2021-08-16	Fix bug in last commit due to removal of take1WhileP.	John MacFarlane	1	-2/+2

2021-08-15	Multimarkdown sub- and superscripts (#5512) (#7188)	OCzarnecki	2	-15/+20
	Added an extension `short_subsuperscripts` which modifies the behavior of `subscript` and `superscript`, allowing subscripts or superscripts containing only alphanumerics to end with a space character (e.g. `x^2 = 4` or `H~2 is combustible`). This improves support for multimarkdown. Closes #5512. Add `Ext_short_subsuperscripts` constructor to `Extension` [API change]. This is enabled by default for `markdown_mmd`.
2021-08-15	Make docx writer sensitive to `native_numbering` extension.	John MacFarlane	3	-12/+22
	Figure and table numbers are now only included if `native_numbering` is enabled. (By default it is disabled.) This is a behavior change with respect to 2.14.1, but the behavior is that of previous versions. The change was necessary to avoid incompatibilities between pandoc's native numbering and third-party cross reference filters like pandoc-crossref. Closes #7499.
2021-08-13	Convert Quoted in bib entries to special Spans...	John MacFarlane	1	-1/+3
	before passing them off to citeproc. This ensures that we get proper localization and flipflopping if, e.g., quotes are used in titles. Closes jgm/citeproc#87.
2021-08-13	Citeproc: avoid odd handling of quotes.	John MacFarlane	1	-1/+6
	citeproc changes allow us to ignore Quoted elements; citeproc now uses its own method for representing quoted things, and only localizes and flipflops quotes it adds itself. See #87. The one thing left to do is to convert Quoted elements in bibliography databases (esp. titles) to `Span ("",["csl-quoted"],[])` before passing them to citeproc, IF the localized quotes for the quote type match the standard inverted commas.
2021-08-13	Removed quote localization from citeproc processing.	John MacFarlane	1	-20/+1
	This is now done in citeproc itself.
2021-08-13	Fix raw LaTeX injection issue (LaTeX writer).	John MacFarlane	1	-5/+10
	Using a code block containing `\end{verbatim}`, one could inject raw TeX into a LaTeX document even when `raw_tex` is disabled. Thanks to Augustin Laville for noticing the bug. Closes #7497.
2021-08-13	LaTeX reader: proper implicit grouping around environment macros.	John MacFarlane	1	-1/+2

2021-08-12	Use Prelude from base-compat for ghc 8.4 too.	John MacFarlane	1	-5/+1
	We were having trouble building on ghc 8.4 because of the lack of a Foldable instance for (Alt Maybe) in base < 4.12. Mystery: for some reason our builds were failing for gitit but not in the pandoc CI.
2021-08-11	Try fixing compile error on older ghcs.	John MacFarlane	1	-1/+5
	See https://github.com/jgm/gitit/runs/3308381697
2021-08-11	Fix some lint issues.	John MacFarlane	2	-6/+5

2021-08-11	LaTeX reader: Support `\global` before `\def`, `\let`, etc.	John MacFarlane	1	-2/+10
	See #7494.
2021-08-11	Fix scope for LaTeX macros.	John MacFarlane	3	-55/+100
	They should by default scope over the group in which they are defined (except `\gdef` and `\xdef`, which are global). In addition, environments must be treated as groups. We handle this by making sMacros in the LaTeX parser state a STACK of macro tables. Opening a group adds a table to the stack, closing one removes one. Only the top of the stack is queried. This commit adds a parameter for scope to the Macro constructor (not exported). Closes #7494.
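
The stack-of-tables idea can be sketched as below. This is a Python illustration rather than the LaTeX reader's Haskell code; the class `MacroStack` and its method names are hypothetical. It assumes that a new group starts with a copy of the enclosing table, which is one way to keep outer definitions visible while querying only the top of the stack, and that global definitions are written into every table.

```python
class MacroStack:
    """A stack of macro tables; the last element is the innermost group."""

    def __init__(self):
        self.tables = [{}]  # the bottom table holds top-level definitions

    def open_group(self):
        # Assumption: the new group inherits a copy of the current table.
        self.tables.append(dict(self.tables[-1]))

    def close_group(self):
        # Local definitions vanish with the group; never pop the bottom table.
        if len(self.tables) > 1:
            self.tables.pop()

    def define(self, name, body, global_scope=False):
        # Global definitions (as with \gdef, \xdef) go into every table.
        targets = self.tables if global_scope else [self.tables[-1]]
        for table in targets:
            table[name] = body

    def lookup(self, name):
        # Only the top of the stack is queried.
        return self.tables[-1].get(name)
```

With this layout, a `\def` inside a group shadows an outer definition until the group closes, while a global definition survives the pop because it was also written to the enclosing tables.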
2021-08-11	LaTeX reader: improve handling of plain TeX macro primitives.	John MacFarlane	2	-6/+29
	- Fixed semantics for `\let`. - Implement `\edef`, `\gdef`, and `\xdef`. - Add comment noting that currently `\def` and `\edef` set global macros (so are equivalent to `\gdef` and `\xdef`). This should be fixed by scoping macro definitions to groups, in a future commit. Closes #7474.
2021-08-10	HTML reader: treat comments as blank when parsing.	John MacFarlane	1	-5/+7
	This modifies pBlank. Previously comments could sometimes flummox the parser. Closes #7482.
2021-08-10	Fix RTF table parsing bug that created undesired nested tables.	John MacFarlane	1	-1/+1
	Closes #7488.
2021-08-10	Add RTF reader.	John MacFarlane	2	-0/+1336
	- `rtf` is now supported as an input format as well as output. - New module Text.Pandoc.Readers.RTF (exporting `readRTF`). [API change] Closes #3982.
2021-08-08	Allow `--slide-level=0`.	John MacFarlane	1	-2/+2
	When the slide level is set to 0, headings won't be used at all in splitting the document into slides. Horizontal rules must be used to separate slides. Closes #7476.
2021-08-04	RTF writer: emit \outlinelevel for section headings.	John MacFarlane	1	-1/+2

2021-08-03	Stop using the HTTP package. (#7456)	mt_caret	6	-11/+29
	We only depend on the urlEncode function in the package, which is also provided by http-types. The HTTP package also depends on the network package, which has difficulty building on ghcjs. Add internal module Text.Pandoc.Network.HTTP, exporting `urlEncode`.
2021-08-03	LaTeX table writer: Increase column width precision (#7466)	Peter Fabinski	1	-1/+1
	In some cases, the rounding performed by the LaTeX table writer would introduce visible overrun outside the text area. This adds two more decimal places to the width values.
2021-08-01	RTF writer: omit `\bin` in `\pict`.	John MacFarlane	1	-1/+1
	According to the spec, this is not needed or wanted when the data is in hexadecimal format, as it is here.
2021-07-29	parseFromString: preserve at least the source directory.	John MacFarlane	1	-1/+1
	Previously we just set the source name to "chunk" when parsing from strings, to avoid misleading source positions. This had the side effect that `rebase_relative_paths` would break inside sections that were parsed as strings. So, now we use "ORIGINAL_SOURCE_PATH_chunk" instead of just "chunk". Closes #7464.