pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2021-08-16	Fix bug in last commit due to removal of take1WhileP.	John MacFarlane	1	-2/+2

2021-08-15	Multimarkdown sub- and superscripts (#5512) (#7188)	OCzarnecki	2	-15/+20
	Added an extension `short_subsuperscripts` which modifies the behavior of `subscript` and `superscript`, allowing subscripts or superscripts containing only alphanumerics to end with a space character (e.g. `x^2 = 4` or `H~2 is combustible`). This improves support for multimarkdown. Closes #5512. Add `Ext_short_subsuperscripts` constructor to `Extension` [API change]. This is enabled by default for `markdown_mmd`.
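The behavior described above can be approximated with a small sketch. This is not pandoc's parser: the regular expression, the function name `render_short_scripts`, and the HTML-style output are assumptions made only to illustrate how an alphanumeric run after `^` or `~` can be closed by whitespace instead of a matching delimiter.

```python
import re

# Hypothetical pattern: '^' or '~', then a run of alphanumerics, ended by
# whitespace or end of input rather than by a closing delimiter.
SHORT_SCRIPT = re.compile(r"([\^~])([A-Za-z0-9]+)(?=\s|$)")


def render_short_scripts(text: str) -> str:
    """Replace short-form scripts with <sup>/<sub> tags (illustration only)."""

    def repl(match: re.Match) -> str:
        tag = "sup" if match.group(1) == "^" else "sub"
        return f"<{tag}>{match.group(2)}</{tag}>"

    return SHORT_SCRIPT.sub(repl, text)


if __name__ == "__main__":
    print(render_short_scripts("x^2 = 4"))
    print(render_short_scripts("H~2 is combustible"))
```

The two inputs are the examples quoted in the commit message; everything else in the sketch is illustrative.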
2021-08-15	Make docx writer sensitive to `native_numbering` extension.	John MacFarlane	3	-12/+22
	Figure and table numbers are now only included if `native_numbering` is enabled. (By default it is disabled.) This is a behavior change with respect to 2.14.1, but the behavior is that of previous versions. The change was necessary to avoid incompatibilities between pandoc's native numbering and third-party cross reference filters like pandoc-crossref. Closes #7499.
2021-08-13	Convert Quoted in bib entries to special Spans...	John MacFarlane	1	-1/+3
	before passing them off to citeproc. This ensures that we get proper localization and flipflopping if, e.g., quotes are used in titles. Closes jgm/citeproc#87.
2021-08-13	Citeproc: avoid odd handling of quotes.	John MacFarlane	1	-1/+6
	citeproc changes allow us to ignore Quoted elements; citeproc now uses its own method for representing quoted things, and only localizes and flipflops quotes it adds itself. See #87. The one thing left to do is to convert Quoted elements in bibliography databases (esp. titles) to `Span ("",["csl-quoted"],[])` before passing them to citeproc, IF the localized quotes for the quote type match the standard inverted commas.
2021-08-13	Removed quote localization from citeproc processing.	John MacFarlane	1	-20/+1
	This is now done in citeproc itself.
2021-08-13	Fix raw LaTeX injection issue (LaTeX writer).	John MacFarlane	1	-5/+10
	Using a code block containing `\end{verbatim}`, one could inject raw TeX into a LaTeX document even when `raw_tex` is disabled. Thanks to Augustin Laville for noticing the bug. Closes #7497.
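The shape of this issue can be sketched as follows. The functions `naive_verbatim` and `can_escape_verbatim` are hypothetical and do not reproduce the LaTeX writer or its actual fix; they only show why wrapping untrusted code in a `verbatim` environment without inspecting it is unsafe.

```python
END_MARKER = r"\end{verbatim}"


def naive_verbatim(code: str) -> str:
    """Wrap code in a verbatim environment without inspecting it (unsafe)."""
    return "\\begin{verbatim}\n" + code + "\n" + END_MARKER


def can_escape_verbatim(code: str) -> bool:
    """True if the code contains the text that would end the environment early."""
    return END_MARKER in code


if __name__ == "__main__":
    # Hypothetical payload: everything after the first end marker would be
    # interpreted by TeX instead of being typeset literally.
    payload = "harmless\n\\end{verbatim}\n\\input{secret}\n\\begin{verbatim}"
    print(can_escape_verbatim(payload))
    print(naive_verbatim(payload))
```

A writer that detects this case can then switch to a rendering strategy that cannot be terminated by the code's own content; which strategy pandoc chose is not stated in the commit message.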
2021-08-13	LaTeX reader: proper implicit grouping around environment macros.	John MacFarlane	1	-1/+2

2021-08-12	Use Prelude from base-compat for ghc 8.4 too.	John MacFarlane	1	-5/+1
	We were having trouble building on ghc 8.4 because of the lack of a Foldable instance for (Alt Maybe) in base < 4.12. Mystery: for some reason our builds were failing for gitit but not in the pandoc CI.
2021-08-11	Try fixing compile error on older ghcs.	John MacFarlane	1	-1/+5
	See https://github.com/jgm/gitit/runs/3308381697
2021-08-11	Fix some lint issues.	John MacFarlane	2	-6/+5

2021-08-11	LaTeX reader: Support `\global` before `\def`, `\let`, etc.	John MacFarlane	1	-2/+10
	See #7494.
2021-08-11	Fix scope for LaTeX macros.	John MacFarlane	3	-55/+100
	They should by default scope over the group in which they are defined (except `\gdef` and `\xdef`, which are global). In addition, environments must be treated as groups. We handle this by making sMacros in the LaTeX parser state a STACK of macro tables. Opening a group adds a table to the stack, closing one removes one. Only the top of the stack is queried. This commit adds a parameter for scope to the Macro constructor (not exported). Closes #7494.
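The stack-of-tables idea can be sketched in a few lines. The class `MacroScopes` and its methods are hypothetical names, not pandoc's code. The sketch assumes that a newly opened group starts with a copy of the current top table, which is one way to make "only the top of the stack is queried" work, and that a global definition is written into every table on the stack.

```python
class MacroScopes:
    """A stack of macro tables; the last element is the innermost group."""

    def __init__(self):
        self.stack = [{}]

    def open_group(self):
        # Assumption: the new table inherits the definitions visible so far.
        self.stack.append(dict(self.stack[-1]))

    def close_group(self):
        # Never pop the outermost (document-level) table.
        if len(self.stack) > 1:
            self.stack.pop()

    def define(self, name, body, global_scope=False):
        # A global definition (like \gdef) must survive closing any group,
        # so it is written to every table; a local one only to the top.
        tables = self.stack if global_scope else self.stack[-1:]
        for table in tables:
            table[name] = body

    def lookup(self, name):
        # Only the top of the stack is queried.
        return self.stack[-1].get(name)


if __name__ == "__main__":
    scopes = MacroScopes()
    scopes.define(r"\foo", "outer")
    scopes.open_group()
    scopes.define(r"\foo", "inner")
    scopes.define(r"\bar", "global", global_scope=True)
    print(scopes.lookup(r"\foo"))
    scopes.close_group()
    print(scopes.lookup(r"\foo"), scopes.lookup(r"\bar"))
```

Copying the table on group entry trades some memory for a lookup that never has to walk the stack.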
2021-08-11	LaTeX reader: improve handling of plain TeX macro primitives.	John MacFarlane	2	-6/+29
	- Fixed semantics for `\let`. - Implement `\edef`, `\gdef`, and `\xdef`. - Add comment noting that currently `\def` and `\edef` set global macros (so are equivalent to `\gdef` and `\xdef`). This should be fixed by scoping macro definitions to groups, in a future commit. Closes #7474.
2021-08-10	HTML reader: treat comments as blank when parsing.	John MacFarlane	1	-5/+7
	This modifies pBlank. Previously comments could sometimes flummox the parser. Closes #7482.
2021-08-10	Fix RTF table parsing bug that created undesired nested tables.	John MacFarlane	1	-1/+1
	Closes #7488.
2021-08-10	Add RTF reader.	John MacFarlane	2	-0/+1336
	- `rtf` is now supported as an input format as well as output. - New module Text.Pandoc.Readers.RTF (exporting `readRTF`). [API change] Closes #3982.
2021-08-08	Allow `--slide-level=0`.	John MacFarlane	1	-2/+2
	When the slide level is set to 0, headings won't be used at all in splitting the document into slides. Horizontal rules must be used to separate slides. Closes #7476.
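A minimal sketch of the splitting rule at slide level 0, assuming blocks are modeled as plain tuples whose first element is the block type. The function `split_slides` and the tuple encoding are illustrative, not pandoc's internal representation.

```python
def split_slides(blocks):
    """Split a flat list of blocks into slides at horizontal rules only."""
    slides, current = [], []
    for block in blocks:
        if block[0] == "HorizontalRule":
            # A rule ends the current slide; headings are ordinary content.
            slides.append(current)
            current = []
        else:
            current.append(block)
    slides.append(current)
    # Drop empty slides produced by leading, trailing, or repeated rules.
    return [slide for slide in slides if slide]


if __name__ == "__main__":
    document = [
        ("Header", 1, "Intro"),
        ("Para", "First slide text"),
        ("Header", 2, "Still the first slide"),
        ("HorizontalRule",),
        ("Para", "Second slide text"),
    ]
    for number, slide in enumerate(split_slides(document), start=1):
        print(number, slide)
```

Both headings in the example stay on the first slide, because only the horizontal rule starts a new one.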
2021-08-04	RTF writer: emit \outlinelevel for section headings.	John MacFarlane	1	-1/+2

2021-08-03	Stop using the HTTP package. (#7456)	mt_caret	6	-11/+29
	We only depend on the urlEncode function in the package, which is also provided by http-types. The HTTP package also depends on the network package, which has difficulty building on ghcjs. Add internal module Text.Pandoc.Network.HTTP, exporting `urlEncode`.
2021-08-03	LaTeX table writer: Increase column width precision (#7466)	Peter Fabinski	1	-1/+1
	In some cases, the rounding performed by the LaTeX table writer would introduce visible overrun outside the text area. This adds two more decimal places to the width values.
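The effect of the extra precision can be checked with a little arithmetic. The sketch below assumes six equal columns and compares rounding to two decimal places with rounding to four. The commit message says only that two more places were added, so the starting precision of two is an assumption, as is the helper name `total_width`.

```python
def total_width(fractions, places):
    """Sum column widths after rounding each one to the given precision."""
    return sum(round(fraction, places) for fraction in fractions)


if __name__ == "__main__":
    six_columns = [1 / 6] * 6
    for places in (2, 4):
        overrun = total_width(six_columns, places) - 1.0
        print(f"{places} decimal places: overrun = {overrun:.4f} of the text width")
```

With two places each column rounds up to 0.17, so the table is about 2% wider than the text area; with four places each column is 0.1667 and the excess drops to about 0.02%.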
2021-08-01	RTF writer: omit `\bin` in `\pict`.	John MacFarlane	1	-1/+1
	According to the spec, this is not needed or wanted when the data is in hexadecimal format, as it is here.
2021-07-29	parseFromString: preserve at least the source directory.	John MacFarlane	1	-1/+1
	Previously we just set the source name to "chunk" when parsing from strings, to avoid misleading source positions. This had the side effect that `rebase_relative_paths` would break inside sections that were parsed as strings. So, now we use "ORIGINAL_SOURCE_PATH_chunk" instead of just "chunk". Closes #7464.
2021-07-22	LaTeX writer: Use ulem for underline.	John MacFarlane	1	-1/+3
	ulem is conditionally included already when the `strikeout` variable is set, so we set this when there is underlined text, and use `\uline` instead of `\underline`. This fixes wrapping for underlined text. Closes #7351.
2021-07-22	MIME: use image/x-xcf instead of application/x-xcf.	John MacFarlane	1	-1/+1
	Closes #7454.
2021-07-17	LaTeX reader: avoid trailing hyphen in translating languages.	John MacFarlane	1	-2/+2
	Previously `\foreignlanguage{english}` turned into `<span lang="en-">`. The same issue affected Arabic. Closes #7447.
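A tag like `en-` is what results when a language code and an empty region are joined unconditionally. The sketch below shows one way to avoid that; it is an assumption about the general shape of the problem, not a copy of pandoc's language-tag rendering, and `render_lang_tag` is a hypothetical name.

```python
def render_lang_tag(language, region=""):
    """Join the non-empty parts of a language tag with hyphens."""
    parts = [language, region]
    # Skipping empty components prevents a dangling hyphen such as "en-".
    return "-".join(part for part in parts if part)


if __name__ == "__main__":
    print(render_lang_tag("en"))
    print(render_lang_tag("en", "US"))
```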
2021-07-16	DocBook reader: handle images with imageobjectco elements.	John MacFarlane	1	-3/+3
	Closes #7440.
2021-07-16	LaTeX reader: Support `\cline` in LaTeX tables.	John MacFarlane	1	-0/+1
	Closes #7442.
2021-07-16	PDF: Fix svgIn path error.	John MacFarlane	1	-1/+1
	We were duplicating the temp directory; this didn't show up on macOS or linux because there we use absolute paths for the temp directory. Closes #7431.
2021-07-11	DocBook reader: add support for citerefentry (#7437)	Jan Tojnar	1	-1/+5
	Originally intended for referring to UNIX manual pages, either part of the same DocBook document as a refentry element, or external – hence the manvolnum element. These days, refentry is more general; for example, the element documentation pages linked below are each a refentry. As per the Processing expectations section of citerefentry, the element is supposed to be a hyperlink to a refentry (when in the same document), but pandoc does not support the refentry tag at the moment, so that is moot. https://tdg.docbook.org/tdg/5.1/citerefentry.html https://tdg.docbook.org/tdg/5.1/manvolnum.html https://tdg.docbook.org/tdg/5.1/refentry.html This roughly corresponds to a `manpage` role in rST syntax, which produces a `Code` AST node with attributes `.interpreted-text role=manpage`, but that does not fit the DocBook parser. https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#role-manpage
2021-07-11	Improved parsing of raw LaTeX from Text streams (rawLaTeXParser).	John MacFarlane	2	-11/+37
	We now use source positions from the token stream to tell us how much of the text stream to consume. Getting this to work required a few other changes to make token source positions accurate. Closes #7434.
2021-07-09	Always use / when adding directory to image path with extractMedia.	John MacFarlane	1	-1/+1
	Even on Windows. May help with #7431.
2021-07-09	RST reader: fix regression with code includes.	John MacFarlane	1	-1/+5
	With the recent changes to include infrastructure, included code blocks were getting an extra newline. Closes #7436. Added regression test.
2021-07-07	Don't incorporate externally linked images in EPUB documents (#7430)	Michael Hoffmann	1	-1/+2
	Just like it is possible to avoid incorporating an image in EPUB by passing `data-external="1"` to a raw HTML snippet, this makes the same possible for native Images, by looking for an associated `external` attribute.
2021-07-06	Recognize data-external when reading HTML img tags (#7429)	Michael Hoffmann	1	-8/+3
	Preserve all attributes in img tags. If attributes have a `data-` prefix, it will be stripped. In particular, this preserves a `data-external` attribute as an `external` attribute in the pandoc AST.
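The attribute handling described above can be sketched as a simple key rewrite. The function `normalize_img_attributes` and the list-of-pairs representation are assumptions for illustration; the real reader also treats attributes such as `src`, `alt`, and `id` specially, which this sketch does not model.

```python
def normalize_img_attributes(attributes):
    """Keep every attribute, stripping a leading 'data-' from its name."""
    prefix = "data-"
    normalized = []
    for name, value in attributes:
        if name.startswith(prefix):
            name = name[len(prefix):]
        normalized.append((name, value))
    return normalized


if __name__ == "__main__":
    html_attributes = [("width", "300"), ("data-external", "1")]
    print(normalize_img_attributes(html_attributes))
```

In this model, `data-external="1"` on an `img` tag ends up as an `external` attribute with value `1`.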
2021-07-06	T.P.PDF, convertImage: normalize paths.	John MacFarlane	1	-3/+3
	This will avoid paths on Windows with mixed path separators, which may cause problems with SVG conversion. See #7431.
2021-07-06	Markdown reader: don't try to read contents in self-closing HTML tag.	John MacFarlane	1	-1/+4
	Previously we had problems parsing raw HTML with self-closing tags like `<col/>`. The problem was that pandoc would look for a closing tag to close the markdown contents, but the closing tag had, in effect, already been parsed by `htmlTag`. This fixes the issue described in <https://groups.google.com/d/msgid/pandoc-discuss/297bc662-7841-4423-bcbb-534e99bbba09n%40googlegroups.com>.
2021-07-06	HTML reader: add col, colgroup to 'closes' definitions	John MacFarlane	1	-1/+3

2021-07-05	Add command test for #7394.	John MacFarlane	1	-0/+1
	And fix a small bug in handling of citations in notes, which led to commas at the end of sentences in some cases.
2021-07-05	Citeproc: cleanup and efficiency improvement in deNote.	John MacFarlane	1	-15/+21

2021-07-05	Revamp note citation handling.	John MacFarlane	1	-14/+30
	Use latest citeproc, which uses a Span with a class rather than a Note for notes. This helps us distinguish between user notes and citation notes. Don't put citations at the beginning of a note in parentheses. (Closes #7394.)
2021-07-02	HTML5 writer, remove aria-hidden when explicit alt text is provided.	Aner Lucero	1	-4/+7

2021-06-29	Docx writer: Add table numbering for captioned tables.	John MacFarlane	2	-3/+30
	The numbers are added using fields, so that Word can create a list of tables that will update automatically.
2021-06-29	Docx writer: Fixed a couple bugs in Figure numbering.	John MacFarlane	1	-4/+3

2021-06-29	Docx writer: support figure numbers.	John MacFarlane	2	-3/+21
	These are set up in such a way that they will work with Word's automatic table of figures. Closes #7392.
2021-06-29	Remove duplicated alt text in HTML output.	Aner Lucero	1	-2/+3

2021-06-28	Improve punctuation moving with `--citeproc`.	John MacFarlane	1	-14/+15
	Previously, using `--citeproc` could cause punctuation to move in quotes even when there are no citations. This has been changed; now, punctuation moving is limited to citations. In addition, we only move footnotes around punctuation if the style is a note style, even if `notes-after-punctuation` is `true`.
2021-06-28	Allow `$` characters in bibtex keys.	John MacFarlane	1	-1/+1
	Closes #7409.
2021-06-28	Text.Pandoc.Error: fix line calculations in reporting parsec errors.	John MacFarlane	1	-3/+3
	Also remove a spurious initial newline in the error report.
2021-06-28	Set proper initial source name in parsing BibTeX.	John MacFarlane	1	-1/+3
	(For better error messages.)