pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2021-10-27	Switch back from HsYAML to yaml.	John MacFarlane	1	-6/+4
	Reasons: - Performance: HsYAML is around 20 times slower in parsing large YAML bibliographies (#6084). - An issue was submitted to HsYAML, but it hasn't gotten any attention. HsYAML seems borderline unmaintained; it hasn't had a commit in over a year. - Unfortunately this goes back on our attempts to free ourselves from C dependencies (#4535). But I don't see a better alternative until a better pure Haskell parser is available. Closes #6084. Notes: - We've removed the FromYAML instances for all types that had them, since this is a HsYAML-specific typeclass [API change]. (The yaml package just uses From/ToJSON.) - Unlike HsYAML (in the configuration we were using), yaml parses 'Y', 'N', 'Yes', 'No', 'On', 'Off' as boolean values. Users may need to quote these when they are meant to be interpreted as strings. Similarly, 'null' is parsed as a YAML null value (and will be treated as an empty string by pandoc rather than the string 'null'). Quoting it will force it to be interpreted as a string. - Some tests had to be adjusted accordingly. - Pandoc now behaves better when the YAML metadata contains escaping errors: instead of just falling back on treating the section as a table, it raises a YAML parsing error.
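The quoting note above can be sketched in Python. This is a minimal illustration, not pandoc's or the yaml library's code: the helper names `needs_quoting` and `yaml_field` are hypothetical, and the word list contains only the values the commit message names (the library may recognize more spellings).

```python
# Plain scalars that, per the commit message above, yaml reads as booleans
# or null rather than as strings. Illustrative list only (an assumption).
NON_STRING_SCALARS = {"Y", "N", "Yes", "No", "On", "Off", "null"}


def needs_quoting(value: str) -> bool:
    """Return True if a metadata value meant as a string should be quoted."""
    return value in NON_STRING_SCALARS


def yaml_field(key: str, value: str) -> str:
    """Render one `key: value` metadata line, quoting risky plain scalars."""
    if needs_quoting(value):
        return f'{key}: "{value}"'
    return f"{key}: {value}"


print(yaml_field("answer", "No"))     # quoted so it stays the string No
print(yaml_field("title", "Report"))  # left as a plain scalar
```

Quoting only the risky values keeps existing metadata blocks unchanged wherever the two parsers already agree.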
2021-10-22	Use simpleFigure in Readers.	Aner Lucero	1	-14/+13

2021-10-20	Markdown reader: don't parse links or bracketed spans as citations.	John MacFarlane	1	-2/+4
	Previously pandoc would parse `[link to (@a)](url)` as a citation; similarly `[(@a)]{#ident}`. This is undesirable. One should be able to use example references in citations, and even if `@a` is not defined as an example reference, `[@a](url)` should be a link containing an author-in-text citation rather than a normal citation followed by literal `(url)`. Closes #7632.
2021-10-13	Fix markdown parsing bug for math in bracketed spans and links.	John MacFarlane	1	-0/+1
	This affects math with unbalanced brackets (e.g. `$(0,1]$`) inside links, images, bracketed spans. Closes #7623.
2021-09-17	Fix linter warning.	John MacFarlane	1	-4/+3

2021-09-16	Fix code blocks using `--preserve-tabs`.	John MacFarlane	1	-1/+7
	Previously they did not behave as the equivalent input with spaces would. Closes #7573.
2021-08-23	Markdown reader: fix interaction of --strip-comments and list	John MacFarlane	1	-1/+1
	parsing. Use of `--strip-comments` was causing tight lists to be rendered as loose (as if the comment were a blank line). Closes #7521.
2021-08-16	Fix bug in last commit due to removal of take1WhileP.	John MacFarlane	1	-2/+2

2021-08-15	Multimarkdown sub- and superscripts (#5512) (#7188)	OCzarnecki	1	-8/+16
	Added an extension `short_subsuperscripts` which modifies the behavior of `subscript` and `superscript`, allowing subscripts or superscripts containing only alphanumerics to end with a space character (e.g. `x^2 = 4` or `H~2 is combustible`). This improves support for multimarkdown. Closes #5512. Add `Ext_short_subsuperscripts` constructor to `Extension` [API change]. This is enabled by default for `markdown_mmd`.
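A rough Python sketch of the short form described above, assuming HTML-style output for readability. The function name and regular expressions are hypothetical and cover only the space-terminated alphanumeric case; they do not reproduce pandoc's parser or the regular delimited `^...^` and `~...~` forms.

```python
import re

# A marker followed by alphanumerics only, ended by whitespace (not consumed).
_SHORT_SUPERSCRIPT = re.compile(r"\^([A-Za-z0-9]+)(?=\s)")
_SHORT_SUBSCRIPT = re.compile(r"~([A-Za-z0-9]+)(?=\s)")


def render_short_subsuperscripts(text: str) -> str:
    """Rewrite space-terminated short super/subscripts as <sup>/<sub> tags."""
    text = _SHORT_SUPERSCRIPT.sub(r"<sup>\1</sup>", text)
    return _SHORT_SUBSCRIPT.sub(r"<sub>\1</sub>", text)


print(render_short_subsuperscripts("x^2 = 4"))
print(render_short_subsuperscripts("H~2 is combustible"))
```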
2021-07-06	Markdown reader: don't try to read contents in self-closing HTML tag.	John MacFarlane	1	-1/+4
	Previously we had problems parsing raw HTML with self-closing tags like `<col/>`. The problem was that pandoc would look for a closing tag to close the markdown contents, but the closing tag had, in effect, already been parsed by `htmlTag`. This fixes the issue described in <https://groups.google.com/d/msgid/pandoc-discuss/297bc662-7841-4423-bcbb-534e99bbba09n%40googlegroups.com>.
2021-06-01	Markdown reader: fix pipe table regression in 2.11.4.	John MacFarlane	1	-1/+1
	Previously pipe tables with empty headers (that is, a header line with all empty cells) would be rendered as headerless tables. This broke in 2.11.4. The fix here is to produce an AST with an empty table head when a pipe table has all empty header cells. Closes #7343.
2021-05-29	Markdown reader: in rebasePaths, check for both Windows and Posix	John MacFarlane	1	-4/+5
	absolute paths. Previously Windows pandoc was treating `/foo/bar.jpg` as non-absolute.
2021-05-29	In rebasePath, check for absolute paths two ways.	John MacFarlane	1	-1/+4
	isAbsolute from FilePath doesn't return True on Windows for paths beginning with `/`, so we check that separately.
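The "two ways" check can be illustrated with a small Python sketch. It is not the Haskell code from the commit: `is_absolute_any` and the regular expression are assumptions chosen to show the idea of accepting a path as absolute when either the Posix or the Windows convention says so.

```python
import re

# Drive-letter paths such as C:\foo or C:/foo, and UNC paths such as \\host\share.
_WINDOWS_ABSOLUTE = re.compile(r"^(?:[A-Za-z]:[\\/]|\\\\)")


def is_absolute_any(path: str) -> bool:
    """Treat a path as absolute if either the Posix or the Windows test passes."""
    posix_absolute = path.startswith("/")
    windows_absolute = bool(_WINDOWS_ABSOLUTE.match(path))
    return posix_absolute or windows_absolute


print(is_absolute_any("/foo/bar.jpg"))    # leading slash counts on every platform
print(is_absolute_any("images/bar.jpg"))  # plain relative path
```

Checking both conventions explicitly avoids depending on which platform the binary was built for.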
2021-05-27	rebase_relative_paths: leave empty paths unchanged.	John MacFarlane	1	-1/+1

2021-05-27	rebase_relative_paths extension: don't change fragment paths.	John MacFarlane	1	-1/+2
	We don't want a pure fragment path to be rewritten, since these are used for cross-referencing.
2021-05-27	Modify rebase_reference_links treatment of reference links/images.	John MacFarlane	1	-5/+4
	The directory is based on the file containing the link reference, not the file containing the link, if these differ.
2021-05-27	Add `rebase_relative_paths` extension.	John MacFarlane	1	-7/+29
	- Add manual entry for (non-default) extension `rebase_relative_paths`. - Add constructor `Ext_rebase_relative_paths` to `Extension` in Text.Pandoc.Extensions [API change]. When enabled, this extension rewrites relative image and link paths by prepending the (relative) directory of the containing file. - Make Markdown reader sensitive to the new extension. - Add tests for #3752. Closes #3752. NB. currently the extension applies to markdown and associated readers but not commonmark/gfm.
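The rewriting rule can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not pandoc's implementation: the function name is hypothetical, Posix-style separators are assumed, and the URL test (a leading URI scheme) is an assumption. The empty-path and fragment cases follow the later `rebase_relative_paths` commits listed above.

```python
import posixpath
import re

# Assumption: anything starting with a URI scheme such as "https:" is a URL.
# A Windows drive prefix like "C:" also matches and is likewise left alone.
_URL_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def rebase_relative_path(containing_file: str, target: str) -> str:
    """Prepend the directory of containing_file to a relative link target."""
    if target == "" or target.startswith("#"):
        return target  # empty paths and pure fragments stay unchanged
    if target.startswith("/") or _URL_SCHEME.match(target):
        return target  # absolute paths and URLs stay unchanged
    directory = posixpath.dirname(containing_file)
    if directory == "":
        return target  # file in the working directory: nothing to prepend
    return posixpath.join(directory, target)


print(rebase_relative_path("chap1/text.md", "spider.png"))
print(rebase_relative_path("chap1/text.md", "#intro"))
```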
2021-05-13	Implement curly-brace syntax for Markdown citation keys.	John MacFarlane	1	-3/+3
	The change provides a way to use citation keys that contain special characters not usable with the standard citation key syntax. Example: `@{foo_bar{x}'}` for the key `foo_bar{x}`. Closes #6026. The change requires adding a new parameter to the `citeKey` parser from Text.Pandoc.Parsing [API change]. Markdown reader: recognize @{..} syntax for citations. Markdown writer: use @{..} syntax for citations when needed. Update manual with curly-brace syntax for citations.
2021-05-12	Fix source position reporting for YAML bibliographies.	John MacFarlane	1	-2/+0
	Closes #7273.
2021-05-09	Change reader types, allowing better tracking of source positions.	John MacFarlane	1	-22/+27
	Previously, when multiple file arguments were provided, pandoc simply concatenated them and passed the contents to the readers, which took a Text argument. As a result, the readers had no way of knowing which file was the source of any particular bit of text. This meant that we couldn't report accurate source positions on errors or include accurate source positions as attributes in the AST. More seriously, it meant that we couldn't resolve resource paths relative to the files containing them (see e.g. #5501, #6632, #6384, #3752). Add Text.Pandoc.Sources (exported module), with a `Sources` type and a `ToSources` class. A `Sources` wraps a list of `(SourcePos, Text)` pairs. [API change] A parsec `Stream` instance is provided for `Sources`. The module also exports versions of parsec's `satisfy` and other Char parsers that track source positions accurately from a `Sources` stream (or any instance of the new `UpdateSourcePos` class). Text.Pandoc.Parsing now exports these modified Char parsers instead of the ones parsec provides. Modified parsers to use a `Sources` as stream [API change]. The readers that previously took a `Text` argument have been modified to take any instance of `ToSources`. So, they may still be used with a `Text`, but they can also be used with a `Sources` object. In Text.Pandoc.Error, modified the constructor PandocParsecError to take a `Sources` rather than a `Text` as first argument, so parse error locations can be accurately reported. T.P.Error: showPos, do not print "-" as source name.
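The motivation for `Sources` can be illustrated with a Python analogue. This is a sketch of the idea only: the Python names `SourcePos`, `to_sources`, and `position_of` are hypothetical stand-ins, not the Haskell API of Text.Pandoc.Sources. Each chunk of input keeps its own starting position, so a location inside it can be reported against the right file instead of against one concatenated string.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SourcePos:
    name: str    # source file name
    line: int    # 1-based line number
    column: int  # 1-based column number


# Analogue of a list of (SourcePos, Text) pairs: each chunk knows where it starts.
Sources = List[Tuple[SourcePos, str]]


def to_sources(files: List[Tuple[str, str]]) -> Sources:
    """Wrap (file name, contents) pairs, starting each chunk at line 1, column 1."""
    return [(SourcePos(name, 1, 1), text) for name, text in files]


def position_of(sources: Sources, chunk_index: int, offset: int) -> SourcePos:
    """Return the position of the character at `offset` within one chunk."""
    start, text = sources[chunk_index]
    line, column = start.line, start.column
    for ch in text[:offset]:
        if ch == "\n":
            line, column = line + 1, 1
        else:
            column += 1
    return SourcePos(start.name, line, column)


sources = to_sources([("a.md", "one\ntwo\n"), ("b.md", "three\nfour")])
print(position_of(sources, 1, 6))  # position of the "f" in "four", in b.md
```

With plain concatenation the same character would only be known as line 4 of an anonymous input, and the file name would be lost.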
2021-04-28	Smarter smart quotes.	John MacFarlane	1	-8/+10
	Treat a leading " with no closing " as a left curly quote. This supports the practice, in fiction, of continuing paragraphs quoting the same speaker without an end quote. It also helps with quotes that break over lines in line blocks. Closes #7216.
2021-04-18	Use MetaInlines not MetaBlocks for multimarkdown metadata fields.	John MacFarlane	1	-1/+1
	This gives better results in converting to e.g. pandoc markdown. Ref: <https://groups.google.com/d/msgid/pandoc-discuss/9728d1f4-040e-4392-aa04-148f648a8dfdn%40googlegroups.com>
2021-03-20	Move yamlMetaBlock from Markdown reader to T.P.Readers.Metadata.	John MacFarlane	1	-22/+2

2021-03-20	Markdown reader: export `yamlMetaBlock`.	John MacFarlane	1	-17/+23
	[API change] This will allow us to parse YAML metadata blocks in other readers, potentially.
2021-03-20	Text.Pandoc.Parsing: remove F type synonym.	John MacFarlane	1	-0/+2
	Muse and Org were defining their own F anyway, with their own state. We therefore move this definition to the Markdown reader.
2021-03-19	Protect partial uses of maximum with NonEmpty.	John MacFarlane	1	-1/+2

2021-03-17	Fix regression with `tex_math_backslash` in Markdown reader.	John MacFarlane	1	-1/+1
	Added regression test. Closes #7155.
2021-03-15	Use foldl' instead of foldl everywhere.	John MacFarlane	1	-3/+3

2021-03-04	Revert "Revert "Relax `--abbreviations` rules so that a period isn't required.""	John MacFarlane	1	-1/+1
	This reverts commit 916ce4d51121e0529b938fda71f37e947882abe5. I was confused in thinking it wouldn't work.
2021-03-04	Revert "Relax `--abbreviations` rules so that a period isn't required."	John MacFarlane	1	-1/+1
	This reverts commit e461b7dd45f717f3317216c7d3207a1d24bf1c85. Ill-advised change. This doesn't work because we parse strings in chunks.
2021-03-04	Relax `--abbreviations` rules so that a period isn't required.	John MacFarlane	1	-1/+1
	Partially addresses #7124.
2021-02-28	Fix bug in last commit.	John MacFarlane	1	-1/+1

2021-02-28	Markdown reader efficiency improvements.	John MacFarlane	1	-182/+208
	Benchmarks show that these make the reader 13-17% faster, depending on extensions.
2021-02-06	Markdown reader: improved handling of mmd link attributes in references.	John MacFarlane	1	-0/+2
	Previously they only worked for links that had titles. Closes #7080.
2021-01-16	Revert "Markdown reader: support GitHub wiki's internal links (#2923) (#6458)"	John MacFarlane	1	-25/+0
	This reverts commit 6efd3460a776620fdb93812daa4f6831e6c332ce. Since this extension is designed to be used with GitHub markdown (gfm), we need to implement the parser as a commonmark extension (commonmark-extensions), rather than in pandoc's markdown reader. When that is done, we can add it here.
2021-01-16	Markdown reader: support GitHub wiki's internal links (#2923) (#6458)	Gautier DI FOLCO	1	-0/+25
	Changes overview: * Add a `Ext_markdown_github_wikilink` constructor to `Extension` [API change]. * Add the parser `githubWikiLink` in `Text.Pandoc.Readers.Markdown` * Add tests.
2021-01-08	Update copyright notices for 2021 (#7012)	Albert Krewinkel	1	-1/+1

2020-11-17	Markdown reader: fix regression with example list references.	John MacFarlane	1	-1/+5
	This affects example list references followed by dashes. Introduced by commit b8d17f7. Closes #6855.
2020-11-15	Markdown reader: fix detection of locators following in-text citations.	John MacFarlane	1	-27/+30
	Previously, if we had `@foo [p. 33; @bar]`, the `p. 33` would be incorrectly parsed as a prefix of `@bar` rather than a suffix of `@foo`.
2020-11-14	Markdown reader: don't increment stateNoteNumber for example refs.	John MacFarlane	1	-0/+12
	Background: syntactically, references to example list items can't be distinguished from citations; we only know which they are after we've parsed the whole document (and this is resolved in the `runF` stage). This means that pandoc's calculation of `citationNoteNum` can sometimes be wrong when there are example list references. This commit partially addresses #6836, but only for the case where the example list references refer to list items defined previously in the document.
2020-11-07	Lint code in PRs and when committing to master (#6790)	Albert Krewinkel	1	-1/+1
	* Remove unused LANGUAGE pragmata * Apply HLint suggestions * Configure HLint to ignore some warnings * Lint code when committing to master
2020-10-07	Raise informative errors when YAML metadata parsing fails.	John MacFarlane	1	-2/+14
	Closes #6730. Previously the command would succeed, returning empty metadata, with no errors or warnings. API changes: - Remove now unused CouldNotParseYamlMetadata constructor for LogMessage (T.P.Logging). - Add 'Maybe FilePath' parameter to yamlToMeta in T.P.Readers.Markdown.
2020-10-05	Fixed regression in last commit.	John MacFarlane	1	-1/+1
	Parsing of YAML bibliographies was broken; this fixes it.
2020-10-05	Add yamlToRefs, yamlBsToRefs.	John MacFarlane	1	-2/+25
	T.P.Readers.Markdown now exports yamlToRefs. [API change] T.P.Readers.Metadata exports yamlBsToRefs. [API change] These allow specifying an id filter so we parse only references that are used in the document. Improves timing with a 3M yaml references file from 36s to 17s.
2020-09-21	Add built-in citation support using new citeproc library.	John MacFarlane	1	-0/+1
	This deprecates the use of the external pandoc-citeproc filter; citation processing is now built in to pandoc. * Add dependency on citeproc library. * Add Text.Pandoc.Citeproc module (and some associated unexported modules under Text.Pandoc.Citeproc). Exports `processCitations`. [API change] * Add data files needed for Text.Pandoc.Citeproc: default.csl in the data directory, and a citeproc directory that is just used at compile-time. Note that we've added file-embed as a mandatory rather than a conditional dependency, because of the biblatex localization files. We might eventually want to use readDataFile for this, but it would take some code reorganization. * Text.Pandoc.Logging: Add `CiteprocWarning` to `LogMessage` and use it in `processCitations`. [API change] * Add tests from the pandoc-citeproc package as command tests (including some tests pandoc-citeproc did not pass). * Remove instructions for building pandoc-citeproc from CI and release binary build instructions. We will no longer distribute pandoc-citeproc. * Markdown reader: tweak abbreviation support. Don't insert a nonbreaking space after a potential abbreviation if it comes right before a note or citation. This messes up several things, including citeproc's moving of note citations. * Add `csljson` as an input and output format. This allows pandoc to convert between `csljson` and other bibliography formats, and to generate formatted versions of CSL JSON bibliographies. * Add module Text.Pandoc.Writers.CslJson, exporting `writeCslJson`. [API change] * Add module Text.Pandoc.Readers.CslJson, exporting `readCslJson`. [API change] * Added `bibtex`, `biblatex` as input formats. This allows pandoc to convert between BibLaTeX and BibTeX and other bibliography formats, and to generate formatted versions of BibTeX/BibLaTeX bibliographies. * Add module Text.Pandoc.Readers.BibTeX, exporting `readBibTeX` and `readBibLaTeX`. [API change] * Make "standalone" implicit if output format is a bibliography format. This is needed because pandoc readers for bibliography formats put the bibliographic information in the `references` field of metadata; and unless standalone is specified, metadata gets ignored. (TODO: This needs improvement. We should trigger standalone for the reader when the input format is bibliographic, and for the writer when the output format is markdown.) * Carry over `citationNoteNum` to `citationNoteNumber`. This was just ignored in pandoc-citeproc. * Text.Pandoc.Filter: Add `CiteprocFilter` constructor to Filter. [API change] This runs the processCitations transformation. We need to treat it like a filter so it can be placed in the sequence of filter runs (after some, before others). In FromYAML, this is parsed from `citeproc` or `{type: citeproc}`, so this special filter may be specified either way in a defaults file (or by `citeproc: true`, though this gives no control of positioning relative to other filters). TODO: we need to add something to the manual section on defaults files for this. * Add deprecation warning if `pandoc-citeproc` filter is used. * Add `--citeproc/-C` option to trigger citation processing. This behaves like a filter and will be positioned relative to filters as they appear on the command line. * Rewrote the manual on citations, adding a dedicated Citations section which also includes some information formerly found in the pandoc-citeproc man page. * Look for CSL styles in the `csl` subdirectory of the pandoc user data directory. This changes the old pandoc-citeproc behavior, which looked in `~/.csl`. Users can simply symlink `~/.csl` to the `csl` subdirectory of their pandoc user data directory if they want the old behavior. * Add support for CSL bibliography entry formatting to LaTeX, HTML, Ms writers. Added CSL-related CSS to styles.html.
2020-09-21	Markdown reader: Set citationNoteNum accurately in citations.	John MacFarlane	1	-5/+26
	This also changes stateLastNoteNumber -> stateNoteNumber.
2020-09-19	Change deprecated Builder.isNull to null.	John MacFarlane	1	-2/+2

2020-09-13	Fix hlint suggestions, update hlint.yaml (#6680)	Christian Despres	1	-6/+6
	* Fix hlint suggestions, update hlint.yaml Most suggestions were redundant brackets. Some required LambdaCase. The .hlint.yaml file had a small typo, and didn't ignore camelCase suggestions in certain modules.
2020-06-29	Clean up T.P.R.Metadata	Nikolay Yakimov	1	-5/+2

2020-06-29	Handle errors in yamlToMeta	Nikolay Yakimov	1	-3/+1