pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2019-09-22	Make `plain` output plainer.	John MacFarlane	3	-91/+39
	Previously we used the following Project Gutenberg conventions for plain output: extra space before and after level 1 and 2 headings; all-caps for strong emphasis (`LIKE THIS`); and underscores surrounding regular emphasis (`_like this_`). This commit makes `plain` output plainer. Strong and Emph inlines are rendered without special formatting. Headings are also rendered without special formatting, and with only one blank line following. To restore the former behavior, use `-t plain+gutenberg`. API change: Add `Ext_gutenberg` constructor to `Extension`. See #5741.
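The difference between the two modes can be sketched in Python. This is an illustrative model only, not pandoc's Haskell writer; the function name and the `kind` strings are hypothetical.

```python
def render_inline(kind, text, gutenberg=False):
    """Render an emphasized span for plain-text output.

    kind is "strong" or "emph" (hypothetical labels for Strong and Emph).
    """
    if not gutenberg:
        # New default: no special formatting at all.
        return text
    if kind == "strong":
        # Gutenberg convention: all-caps for strong emphasis.
        return text.upper()
    if kind == "emph":
        # Gutenberg convention: underscores around regular emphasis.
        return "_" + text + "_"
    return text
```

With `gutenberg=True` the sketch mirrors `-t plain+gutenberg`; with the default it mirrors the new `plain` behavior.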
2019-09-21	[Docx Reader] Update tests	Nikolay Yakimov	6	-7/+7
	Notice this commit updates lists.docx. The old test file contained references to "ListParagraph" style, which should never leak outside of pandoc, so I'm not sure what that was supposed to test for exactly.
2019-09-21	[Docx Reader] Use style names, not ids, for assigning semantic meaning	Nikolay Yakimov	5	-0/+19
	Motivating issues: #5523, #5052, #5074. Style name comparisons are case-insensitive, since those are case-insensitive in Word. w:styleId will be used as the style name if w:name is missing (this should only happen for malformed docx and is kept as a fallback to avoid failing altogether on malformed documents). Block quote detection code moved from Docx.Parser to Readers.Docx. Code styles, i.e. "Source Code" and "Verbatim Char", now honor style inheritance. Docx Reader now honors the "Compact" style (used in Pandoc-generated docx). The side effect is that the "Compact" style no longer shows up in docx+styles output. Styles inherited from "Compact" will still show up. Removed obsolete list-item style from divsToKeep. That didn't really do anything for a while now. Add newtypes to differentiate between style names, ids, and different style types (that is, paragraph and character styles). Since docx style names can have spaces in them, and pandoc-markdown classes can't, wherever a style name is used as a class name, spaces are replaced with ASCII dashes `-`. Get rid of extraneous intermediate types carrying styleId information. Instead, styleId is saved with other style data. Use RunStyle for inline style definitions only (lacking styleId and styleName); for Character Styles use the CharStyle type (which is basically RunStyle with styleId and styleName bolted onto it).
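Two of the rules above (case-insensitive style name comparison, and spaces replaced with dashes when a style name becomes a class name) can be sketched in Python. The helper names are hypothetical and this is not the reader's actual code.

```python
def same_style(name_a, name_b):
    """Compare two docx style names the way the message describes:
    case-insensitively."""
    return name_a.casefold() == name_b.casefold()


def style_to_class(style_name):
    """Turn a docx style name into a class name by replacing spaces
    with ASCII dashes. Only the space replacement comes from the commit
    message; leaving the letter case untouched is an assumption."""
    return style_name.replace(" ", "-")
```

For example, `style_to_class("Source Code")` yields a class name with a dash in place of the space.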
2019-09-20	Preserve built-in styles in DOCX with custom style (#5670)	Ben Steinberg	4	-0/+19
	This commit prevents custom styles on divs and spans from overriding styles on certain elements inside them, like headings, blockquotes, and links. On those elements, the "native" style is required for the element to display correctly. This change also allows nesting of custom styles; in order to do so, it removes the default "Compact" style applied to Plain blocks, except when inside a table.
2019-09-19	Remove admonition-title remnants.	John MacFarlane	1	-1/+1
	Completes 8e01ccb41dde8a5e6123f5b0746c36f240576047
2019-09-15	Lua filters: allow passing of HTML-like tables instead of Attr (#5750)	Albert Krewinkel	1	-0/+41
	Attr values can now be given as normal Lua tables; this can be used as a convenient alternative to define Attr values, instead of constructing values with `pandoc.Attr`. Identifiers are taken from the id field; classes must be given as space-separated words in the class field. All remaining fields are included as misc attributes. With this change, the following lines now create equal elements: `pandoc.Span('test', {id = 'test', class = 'a b', check = 1})` and `pandoc.Span('test', pandoc.Attr('test', {'a','b'}, {check = 1}))`. This also works when using the attr setter: `local span = pandoc.Span 'text'` followed by `span.attr = {id = 'test', class = 'a b', check = 1}`. Furthermore, the attributes field of AST elements can now be a plain key-value table even when using the `attributes` accessor: `local span = pandoc.Span 'test'` followed by `span.attributes = {check = 1}` now works as expected. Closes: #5744
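The table-to-Attr mapping described above can be modeled in Python as follows. This is a sketch of the rule only: the function name is hypothetical, and converting attribute values to strings is an assumption.

```python
def attr_from_table(table):
    """Split an HTML-like table into (identifier, classes, attributes)."""
    rest = dict(table)
    # The identifier is taken from the "id" field.
    identifier = str(rest.pop("id", ""))
    # Classes are given as space-separated words in the "class" field.
    classes = str(rest.pop("class", "")).split()
    # All remaining fields become misc attributes
    # (stringifying the values is an assumption of this sketch).
    attributes = {key: str(value) for key, value in rest.items()}
    return identifier, classes, attributes
```

Applied to `{"id": "test", "class": "a b", "check": 1}`, the sketch separates the identifier, the two classes, and the single remaining attribute.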
2019-09-15	Revert "FB2 reader test: better diagnostics on failure."	John MacFarlane	1	-28/+1
	This reverts commit c65af7d1a2f35cbfd1235df2960f7156d38e8f92.
2019-09-15	FB2 reader test: better diagnostics on failure.	John MacFarlane	1	-1/+28

2019-09-14	FB2 reader test: Another attempt to fix test failure on GitHub CI.	John MacFarlane	1	-4/+5

2019-09-13	Revert "FB2 reader test: filter CRs."	John MacFarlane	1	-2/+2
	This reverts commit e35147d715a737bb854e0c527243f77d970d1b86.
2019-09-13	FB2 reader test: filter CRs.	John MacFarlane	1	-2/+2
	This may help with the test failure on GitHub CI. https://github.com/jgm/pandoc/commit/b59e6d03762becd5c9d767463ce7ba5062a1b4a0/checks
2019-09-10	Add --shift-heading-level-by option.	John MacFarlane	1	-0/+33
	Deprecate --base-heading-level. The new option does everything the old one does, but also allows negative shifts. It also promotes the document metadata (if not null) to a level-1 heading with a +1 shift, and demotes an initial level-1 heading to document metadata with a -1 shift. This supports converting documents that use an initial level-1 heading for the document title. Closes #5615.
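A rough Python model of the shift, covering the +1 and -1 cases the message describes. The block representation (`("Header", level, text)` and `("Para", text)` tuples) and the function name are hypothetical. Two details are assumptions not stated in the message: a heading shifted below level 1 becomes a paragraph, and the promoted title is removed from the metadata.

```python
def shift_heading_levels(title, blocks, shift):
    """Return (title, blocks) after shifting heading levels by `shift`."""
    blocks = list(blocks)
    # -1 shift: an initial level-1 heading is demoted to the document title.
    if shift == -1 and blocks and blocks[0][:2] == ("Header", 1):
        title, blocks = blocks[0][2], blocks[1:]
    shifted = []
    for block in blocks:
        if block[0] == "Header":
            level = block[1] + shift
            if level >= 1:
                shifted.append(("Header", level, block[2]))
            else:
                # Assumption: no heading level below 1, so fall back to a paragraph.
                shifted.append(("Para", block[2]))
        else:
            shifted.append(block)
    # +1 shift: a non-null title is promoted to a level-1 heading.
    if shift == 1 and title is not None:
        shifted.insert(0, ("Header", 1, title))
        title = None  # assumption: the title moves rather than being copied
    return title, shifted
```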
2019-09-09	LaTeX reader: Fix parsing of optional arguments that contain braced text.	John MacFarlane	1	-0/+9
	Closes #5740.
2019-09-08	Org reader: modify handling of example blocks. (#5717)	Brian Leung	1	-0/+60
	* Org reader: allow the `-i` switch to ignore leading spaces. * Org reader: handle awkwardly-aligned code blocks within lists. Code blocks in Org lists must have their #+BEGIN_ aligned in a reasonable way, but their other components can be positioned otherwise.
2019-09-08	Replace Element and makeHierarchical with makeSections.	John MacFarlane	7	-22/+7
	Text.Pandoc.Shared: + Remove `Element` type [API change] + Remove `makeHierarchicalize` [API change] + Add `makeSections` [API change] + Export `deLink` [API change] Now that we have Divs, we can use them to represent the structure of sections, and we don't need a special Element type. `makeSections` reorganizes a block list, adding Divs with class `section` around sections, and adding numbering if needed. This change also fixes some longstanding issues recognizing section structure when the document contains Divs. Closes #3057, see also #997. All writers have been changed to use `makeSections`. Note that in the process we have reverted the change c1d058aeb1c6a331a2cc22786ffaab17f7118ccd made in response to #5168, which I'm not completely sure was a good idea. Lua modules have also been adjusted accordingly. Existing lua filters that use `hierarchicalize` will need to be rewritten to use `make_sections`.
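The idea behind `makeSections` (wrapping each heading and the blocks that belong to it in a Div with class `section`) can be sketched in Python. The tuple-based block representation is hypothetical, and the sketch leaves out numbering, identifiers, and the handling of Divs already present in the input, so it is a simplified model rather than the function's real behavior.

```python
def make_sections(blocks):
    """Group a flat block list into nested ("Div", ["section"], children) nodes.

    Blocks are ("Header", level, text) or any other tuple such as ("Para", text).
    """
    def parse(index, min_level):
        result = []
        while index < len(blocks):
            block = blocks[index]
            if block[0] == "Header":
                level = block[1]
                if level < min_level:
                    # This heading closes the enclosing section.
                    break
                # Everything up to the next heading of the same or a
                # higher rank belongs to this section.
                children, index = parse(index + 1, level + 1)
                result.append(("Div", ["section"], [block] + children))
            else:
                result.append(block)
                index += 1
        return result, index

    sections, _ = parse(0, 1)
    return sections
```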
2019-09-08	Revert changes to hierarchicalizeWithIds.	John MacFarlane	1	-56/+0
	Revert "hierarchicalize: ensure that sections get ids..." This reverts commit 212406a61d027d85712705e626954e0486a2bc34. Revert "Improve detection of headings in Divs by hierarchicalize." This reverts commit 6e2cfd6c97b1b8657f1f3e2b66090a2c3ba8d887. Revert "Shared.hierarchicalize: improve handling of div and section structure." This reverts commit 345b33762eb4cc6d57d74c76c4757a6166ee5c13.
2019-09-06	hierarchicalize: ensure that sections get ids...	John MacFarlane	1	-5/+5
	even if they're in divs. Improves #3057.
2019-09-06	Improve detection of headings in Divs by hierarchicalize.	John MacFarlane	1	-5/+7
	The structure ``` <h1>one</h1> <div> <h1>two</h1> </div> ``` should create two coordinate sections, not a section with a subsection. Now it does. Extends #3057.
2019-09-05	Shared.hierarchicalize: improve handling of div and section structure.	John MacFarlane	1	-0/+54
	Previously Divs were opaque to hierarchicalize, so headings inside divs didn't get into the table of contents, for example (#3057). Now hierarchicalize treats Divs as sections when appropriate. For example, these structures both yield a section and a subsection: ``` html <div> <h1>one</h1> <div> <h2>two</h2> </div> </div> ``` ``` html <div> <h1>one</h1> <div> <h1>two</h1> </div> </div> ``` Note that ``` html <h1>one</h1> <div> <h2>two</h2> </div> <h1>three</h1> ``` gets parsed as the structure one two three which may not always be desirable. Closes #3057.
2019-09-05	Add div.hanging-indent CSS to HTML templates.	John MacFarlane	7	-0/+7

2019-09-05	Add partial styles.html in HTML5 template.	John MacFarlane	7	-156/+152
	Avoid duplication in HTML templates by using styles.html partial. Change indentation of styles in template.
2019-09-04	asciidoc writer: don't include `+` in code blocks for regular asciidoc.	John MacFarlane	1	-6/+6
	This is asciidoctor-specific. Amends 98ee6ca289ad7117b7336a57bcfc6f4b54463f4e.
2019-09-04	Roff readers: better parsing of groups.	John MacFarlane	2	-1/+8
	We now allow groups where the closing `\\}` isn't at the beginning of a line. Closes #5410.
2019-09-03	XML: change toEntities to emit numerical hex character references.	John MacFarlane	1	-3/+3
	Previously decimal references were used. But Polyglot Markup prefers hex. See #5718. This affects the output of pandoc with `--ascii`.
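The change in reference style can be illustrated with a small Python model of the two encodings. The function names are hypothetical, and the use of uppercase hex digits is an assumption of this sketch.

```python
def to_decimal_entities(text):
    """Old behavior: non-ASCII characters become decimal references."""
    return "".join(c if ord(c) < 128 else "&#{};".format(ord(c)) for c in text)


def to_hex_entities(text):
    """New behavior: non-ASCII characters become hex references.

    Uppercase hex digits are an assumption of this sketch.
    """
    return "".join(c if ord(c) < 128 else "&#x{:X};".format(ord(c)) for c in text)
```

Both forms denote the same character; only the notation of the code point differs.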
2019-09-02	LaTeX reader: don't try to parse includes if raw_tex is set.	John MacFarlane	1	-5/+2
	When the `raw_tex` extension is set, we just carry through `\usepackage`, `\input`, etc. verbatim as raw LaTeX. Closes #5673.
2019-09-02	HTML writer: use numeric character references with `--ascii`.	John MacFarlane	1	-1/+1
	Previously we used named character references with html5 output. But these aren't valid XML, and we aim to produce html5 that is also valid XHTML (polyglot markup). (This is also needed for epub3.) Closes #5718.
2019-09-02	LaTeX reader: properly handle optional arguments for macros.	John MacFarlane	1	-0/+8
	Closes #5682.
2019-08-27	LaTeX reader: fix `\\` in `\parbox` inside a table cell.	John MacFarlane	1	-0/+13
	Closes #5711.
2019-08-27	Markdown reader: Headers: don't parse content over newline boundary.	John MacFarlane	1	-0/+13
	Closes #5714.
2019-08-27	PowerPoint writer: Start numbering at appropriate numbers.	Jesse Rosenthal	4	-0/+13
	Starting numbers for ordered lists were previously ignored. Now we specify the number if it is something other than 1. Closes: #5709
2019-08-26	Add test for issue #5708.	John MacFarlane	1	-0/+12

2019-08-25	Use new doctemplates, doclayout.	John MacFarlane	19	-2372/+2361
	+ Remove Text.Pandoc.Pretty; use doclayout instead. [API change] + Text.Pandoc.Writers.Shared: remove metaToJSON, metaToJSON' [API change]. + Text.Pandoc.Writers.Shared: modify `addVariablesToContext`, `defField`, `setField`, `getField`, `resetField` to work with Context rather than JSON values. [API change] + Text.Pandoc.Writers.Shared: export new function `endsWithPlain` [API change]. + Use new templates and doclayout in writers. + Use Doc-based templates in all writers. + Adjust three tests for minor template rendering differences. + Added indentation to body in docbook4, docbook5 templates. The main impact of this change is better reflowing of content interpolated into templates. Previously, interpolated variables were rendered independently and interpolated as strings, which could lead to overly long lines. Now the templates are interpolated as Doc values which may include breaking spaces, and reflowing occurs after template interpolation rather than before.
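Why the order of reflowing and interpolation matters can be shown with a loose Python analogy using the standard `textwrap` module. This does not use doclayout or doctemplates, and the function names are hypothetical.

```python
import textwrap


def fill_then_interpolate(template, value, width):
    """Old order: wrap the value on its own, then paste it into the template."""
    return template.format(body=textwrap.fill(value, width))


def interpolate_then_fill(template, value, width):
    """New order: paste the value in first, then wrap the whole line."""
    return textwrap.fill(template.format(body=value), width)
```

In the first function the text already present in the template is not counted when wrapping, so the first output line can exceed the width. In the second, every line respects it.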
2019-08-24	Change optMetadataFile type from Maybe to List (#5702)	Owen McGrath	3	-0/+9
	Changed optMetadataFile from `Maybe FilePath` to `[FilePath]`. This allows for multiple YAML metadata files to be added. The new default value has been changed from `Nothing` to `[]`. To account for this change in `Text.Pandoc.App`, `metaDataFromFile` now operates on two `mapM` calls (for `readFileLazy` and `yamlToMeta`) and a fold. Added a test (command/5700.md) which tests this functionality and updated MANUAL.txt, as per the contributing guidelines. With the current behavior, using `foldr1 (<>)`, values within files specified first will be used over those in later files. (If the reverse of this behavior would be preferred, it should be fixed by changing foldr1 to foldl1.)
2019-08-23	Add test for #5690.	John MacFarlane	1	-0/+16

2019-08-23	Ensure proper nesting when we have long ordered list markers.	John MacFarlane	1	-0/+11
	Closes #5705.
2019-08-16	Lua: traverse nested blocks and inlines in correct order	Albert Krewinkel	1	-0/+38
	Traversal methods are updated to use the new Walk module such that sequences with nested Inline (or Block) elements are traversed in the order in which they appear in the linearized document. Fixes: #5667
2019-08-14	LaTeX reader: improve withRaw so it can handle cases where...	John MacFarlane	1	-0/+9
	the token string is modified by a parser (e.g. accent when it only takes part of a Word token). Closes #5686. Still not ideal, because we get the whole `\t0BAR` and not just `\t0` as a raw latex inline command. But I'm willing to let this be an edge case, since you can easily work around this by inserting a space, braces, or raw attribute. The important thing is that we no longer drop the rest of the document after a raw latex inline command that gobbles only part of a Word token!
2019-08-14	Rename test for 5685 -> 5684 (typo in last commit).	John MacFarlane	1	-0/+0
	Closes #5684. (Note that #5685 is NOT closed by previous commit.)
2019-08-14	Add thin space when needed in LaTeX quote ligatures.	John MacFarlane	1	-0/+6
	Closes #5685.
2019-08-11	JIRA writer: Remove escapeStringForJira for code blocks	Jan-Otto Kröpke	1	-16/+16

2019-07-28	Update muse template to handle multiple authors better.	John MacFarlane	1	-1/+1

2019-07-28	Use doctemplates 0.3, change type of writerTemplate.	John MacFarlane	7	-10/+21
	* Require recent doctemplates. It is more flexible and supports partials. * Changed type of writerTemplate to Maybe Template instead of Maybe String. * Remove code from the LaTeX, Docbook, and JATS writers that looked in the template for strings to determine whether it is a book or an article, or whether csquotes is used. This was always kludgy and unreliable. To use csquotes for LaTeX, set `csquotes` in your variables or metadata. It is no longer sufficient to put `\usepackage{csquotes}` in your template or header includes. To specify a book style, use the `documentclass` variable or `--top-level-division`. * Change template code to use new API for doctemplates.
2019-07-24	HTML writer: ensure TeX formulas are rendered correctly (#5658)	Philip Pesca	1	-1/+1
	The web service passed in to `--webtex` may render formulas using inline or display style by default. Prefixing formulas with the appropriate command ensures they are rendered correctly. This is a followup to the discussion in #5656.
2019-07-23	HTML writer: render inline formulas correctly with --webtex (#5656)	Philip Pesca	1	-0/+14
	We add `\textstyle` to the beginning of the formula to ensure it will be rendered in inline style. Closes #5655.
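A Python sketch of how a formula might be prefixed before being sent to the web service. The function name and the base URL in the usage note are hypothetical. `\textstyle` for inline formulas comes from the message above; using `\displaystyle` for display formulas is an assumption.

```python
from urllib.parse import quote


def webtex_url(base_url, formula, display):
    """Build an image URL for a TeX formula, forcing the rendering style."""
    # Inline formulas get \textstyle; \displaystyle for display math
    # is an assumption of this sketch.
    prefix = "\\displaystyle " if display else "\\textstyle "
    return base_url + quote(prefix + formula)
```

For instance, `webtex_url("https://example.org/render?", "x^2", False)` produces a URL whose query begins with the percent-encoded `\textstyle` command.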
2019-07-22	Fix error introduced in change to test for 4669.	John MacFarlane	1	-1/+1

2019-07-22	LaTeX reader: support tex `\tt` command.	John MacFarlane	2	-1/+7
	Closes #5654.
2019-07-22	Org reader: accept ATTR_LATEX in block attributes	Albert Krewinkel	1	-0/+7
	Attributes for LaTeX output are accepted as valid block attributes; however, their values are ignored. Fixes: #5648
2019-07-20	LaTeX writer: fix line breaks at start of paragraph.	John MacFarlane	1	-0/+18
	Previously we just omitted these. Now we render them using `\hfill\break` instead of `\\`. This is a revision of a PR by @sabine (#5591) who should be credited with the idea. Closes #3324.
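The rule can be modeled in Python. A `\\` with no preceding text in the paragraph makes LaTeX complain that there is no line to end, which is why a different command is needed there. The inline representation (plain strings plus a `"LineBreak"` marker) and the exact newline placement are assumptions of this sketch.

```python
def render_paragraph(inlines):
    """Render a list of strings and "LineBreak" markers as LaTeX."""
    out = []
    seen_text = False
    for inline in inlines:
        if inline == "LineBreak":
            # At the start of a paragraph, use \hfill\break instead of \\.
            out.append("\\\\\n" if seen_text else "\\hfill\\break\n")
        else:
            out.append(inline)
            seen_text = True
    return "".join(out)
```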
2019-07-20	LaTeX reader: search for image with list of extensions...	John MacFarlane	1	-0/+6
	like latex does, if an extension is not provided. Closes #4933.
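A Python sketch of the lookup. The extension list and the function name are assumptions for illustration; the message does not say which extensions the reader tries or in what order.

```python
import os

# Hypothetical candidate list; the real list and order may differ.
DEFAULT_EXTENSIONS = [".pdf", ".png", ".jpg", ".jpeg", ".eps"]


def resolve_image(path, exists=os.path.exists, extensions=DEFAULT_EXTENSIONS):
    """Return the first existing `path + ext` if `path` has no extension."""
    if os.path.splitext(path)[1]:
        # An extension was provided: use the path as given.
        return path
    for ext in extensions:
        candidate = path + ext
        if exists(candidate):
            return candidate
    # Nothing found: fall back to the original path.
    return path
```

The `exists` parameter is injectable so the lookup can be exercised without touching the file system.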
2019-07-19	Markdown: Ensure that expanded latex macros end with space if original did.	John MacFarlane	1	-0/+9
	Closes #4442.