path: root/src/Text/Pandoc/Citeproc.hs
Age | Commit message | Author | Files | Lines
2021-10-27Switch back from HsYAML to yaml.John MacFarlane1-1/+1
Reasons:
- Performance: HsYAML is around 20 times slower in parsing large YAML bibliographies (#6084).
- An issue was submitted to HsYAML, but it hasn't gotten any attention. HsYAML seems borderline unmaintained; it hasn't had a commit in over a year.
- Unfortunately this goes back on our attempts to free ourselves from C dependencies (#4535). But I don't see a better alternative until a better pure Haskell parser is available.
Closes #6084.
Notes:
- We've removed the FromYAML instances for all types that had them, since this is a HsYAML-specific typeclass [API change]. (The yaml package just uses From/ToJSON.)
- Unlike HsYAML (in the configuration we were using), yaml parses 'Y', 'N', 'Yes', 'No', 'On', 'Off' as boolean values. Users may need to quote these when they are meant to be interpreted as strings. Similarly, 'null' is parsed as a YAML null value (and will be treated as an empty string by pandoc rather than the string 'null'). Quoting it will force it to be interpreted as a string.
- Some tests had to be adjusted accordingly.
- Pandoc now behaves better when the YAML metadata contains escaping errors: instead of just falling back on treating the section as a table, it raises a YAML parsing error.
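The quoting advice in the notes above can be sketched as follows. This is a minimal illustration, not pandoc code: `reinterpreted` and `quoteIfNeeded` are hypothetical names, and the list contains only the scalars named in the commit message, which may not be the yaml package's complete rule set.

```haskell
-- Scalars the commit message names as no longer read as plain strings.
-- Assumption: this list comes from the message only; the yaml package
-- may treat further spellings the same way.
reinterpreted :: [String]
reinterpreted = ["Y", "N", "Yes", "No", "On", "Off", "null"]

-- Wrap a scalar in single quotes when it would otherwise be parsed as a
-- boolean or null; leave every other scalar unchanged.
quoteIfNeeded :: String -> String
quoteIfNeeded s
  | s `elem` reinterpreted = "'" ++ s ++ "'"
  | otherwise              = s
```

A metadata value such as `No` would be written as `'No'` to keep it a string, while `Norway` needs no quoting under this sketch.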
2021-08-17Revise citeproc code to fit new citeproc 0.5 API.John MacFarlane1-37/+6
Linkification of URLs in the bibliography is now done in the citeproc library, depending on the setting of an option. We set that option depending on the value of the metadata field `link-bibliography` (defaulting to true, for consistency with earlier behavior, though the new behavior includes the CSL draft recommendation of hyperlinking the title or the whole entry if a DOI, PMID, PMCID, or URL field is present but not explicitly rendered).
These changes implement the following recommendations from the draft CSL v1.0.2 spec (Appendix VI):
> The CSL syntax does not have support for configuration of links.
> However, processors should include links on bibliographic references,
> using the following rules:
>
> If the bibliography entry for an item renders any of the following
> identifiers, the identifier should be anchored as a link, with the
> target of the link as follows:
>
> - url: output as is
> - doi: prepend with "`https://doi.org/`"
> - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`"
> - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`"
>
> If the identifier is rendered as a URI, include rendered URI components
> (e.g. "`https://doi.org/`") in the link anchor. Do not include any other
> affix text in the link anchor (e.g. "Available from: ", "doi: ", "PMID: ").
>
> If the bibliography entry for an item does not render any of
> the above identifiers, then set the anchor of the link as the item
> title. If title is not rendered, then set the anchor of the link as the
> full bibliography entry for the item. Set the target of the link as one
> of the following, in order of priority:
>
> - doi: prepend with "`https://doi.org/`"
> - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`"
> - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`"
> - url: output as is
>
> If the item data does not include any of the above identifiers, do not
> include a link.
>
> Citation processors should include an option flag for calling
> applications to disable bibliography linking behavior.
Thanks to Benjamin Bray for getting this all working.
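The priority order in the quoted rules, for an entry that renders none of its identifiers, can be sketched as a small pure function. This is an illustration only and not the citeproc library's code: the `Ids` record and `fallbackTarget` are hypothetical names, and treating an empty field as absent is an assumption.

```haskell
import Data.Maybe (listToMaybe, mapMaybe)

-- Hypothetical record holding the identifier fields of one item.
data Ids = Ids
  { idDoi   :: Maybe String
  , idPmcid :: Maybe String
  , idPmid  :: Maybe String
  , idUrl   :: Maybe String
  }

-- Link target for an entry that renders none of its identifiers, in the
-- priority order given in the quoted rules: doi, pmcid, pmid, url.
fallbackTarget :: Ids -> Maybe String
fallbackTarget ids = listToMaybe (mapMaybe build candidates)
  where
    candidates =
      [ ("https://doi.org/", idDoi ids)
      , ("https://www.ncbi.nlm.nih.gov/pmc/articles/", idPmcid ids)
      , ("https://www.ncbi.nlm.nih.gov/pubmed/", idPmid ids)
      , ("", idUrl ids)
      ]
    -- Assumption: an empty value counts as absent and never yields a link.
    build (prefix, Just v) | not (null v) = Just (prefix ++ v)
    build _ = Nothing
```

An item with both a PMID and a URL would link to the PubMed address; an item with no identifiers yields `Nothing`, which corresponds to "do not include a link".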
2021-08-13Convert Quoted in bib entries to special Spans...John MacFarlane1-1/+3
before passing them off to citeproc. This ensures that we get proper localization and flipflopping if, e.g., quotes are used in titles. Closes jgm/citeproc#87.
2021-08-13Citeproc: avoid odd handling of quotes.John MacFarlane1-1/+6
citeproc changes allow us to ignore Quoted elements; citeproc now uses its own method for representing quoted things, and only localizes and flipflops quotes it adds itself. See #87. The one thing left to do is to convert Quoted elements in bibliography databases (esp. titles) to `Span ("",["csl-quoted"],[])` before passing them to citeproc, IF the localized quotes for the quote type match the standard inverted commas.
2021-08-13Removed quote localization from citeproc processing.John MacFarlane1-20/+1
This is now done in citeproc itself.
2021-07-05Add command test for #7394.John MacFarlane1-0/+1
And fix a small bug in handling of citations in notes, which led to commas at the end of sentences in some cases.
2021-07-05Citeproc: cleanup and efficiency improvement in deNote.John MacFarlane1-15/+21
2021-07-05Revamp note citation handling.John MacFarlane1-14/+30
Use latest citeproc, which uses a Span with a class rather than a Note for notes. This helps us distinguish between user notes and citation notes. Don't put citations at the beginning of a note in parentheses. (Closes #7394.)
2021-06-28Improve punctuation moving with `--citeproc`.John MacFarlane1-14/+15
Previously, using `--citeproc` could cause punctuation to move in quotes even when there are no citations. This has been changed; now, punctuation moving is limited to citations. In addition, we only move footnotes around punctuation if the style is a note style, even if `notes-after-punctuation` is `true`.
2021-06-12Fix regression in citeproc processing.John MacFarlane1-1/+3
If inline references are used (in the metadata `references` field), we should still only include in the bibliography items that are actually cited -- unless `nocite` is used. Closes #7376.
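The selection rule described above can be sketched as a filter over reference ids. This is a simplified model rather than the code in `Text.Pandoc.Citeproc`: `bibliographyItems` is a hypothetical name, references are modelled as `(id, payload)` pairs, and the `"*"` wildcard stands for a `nocite` entry that requests every reference.

```haskell
-- Keep only the references that are cited in the text or listed in
-- nocite. Assumption: a "*" id in nocite means "include everything".
bibliographyItems :: [String] -> [String] -> [(String, a)] -> [(String, a)]
bibliographyItems cited nocite refs
  | "*" `elem` nocite = refs
  | otherwise         = filter wanted refs
  where
    wanted (refId, _) = refId `elem` cited || refId `elem` nocite
```

With three inline references and one citation, only the cited item (plus anything named in `nocite`) reaches the bibliography.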
2021-06-08Citeproc: avoid duplicate classes and attributes on refs div.John MacFarlane1-2/+2
2021-05-17Citeproc: ensure that CSL-related attributes are passed on...John MacFarlane1-1/+1
...to a Div with id 'refs'. Previously we just left the attributes of such a Div alone, which meant that style options like entry-spacing had no effect there.
2021-04-17Use document's lang for the lang parameter of citeproc...John MacFarlane1-2/+1
even if it differs from localeLanguage. (It is designed to be possible to override the locale language, and this is especially useful when one wants to use the unicode extension syntax, e.g. fr-u-kb.)
2021-04-17Remove Text.Pandoc.BCP47 module.John MacFarlane1-8/+2
[API change] Use Lang from UnicodeCollation.Lang instead. This is a richer implementation of BCP 47.
2021-03-12Citeproc: apply fixLinks correctly.John MacFarlane1-5/+5
This is code that incorporates a prefix like `https://doi.org/` into a following link when appropriate. But it didn't work because we were walking with a `[Inline] -> [Inline]` function on an `Inlines`. Changed the point of application of `fixLink` to resolve the issue. Closes #7130.
2021-02-26Fix/update URLs and use HTTP**S** where possible (#7122)Salim B1-1/+1
2021-01-21Text.Pandoc.Citeproc: use finer grained importsAlbert Krewinkel1-18/+21
This allows the module to be imported in writers without causing a circular dependency.
2021-01-10T.P.Citeproc: factor out and export `getStyle`.John MacFarlane1-45/+55
2021-01-10T.P.Citeproc: factor out getLang.John MacFarlane1-8/+15
2021-01-10T.P.Citeproc: refactor and export `getReferences`.John MacFarlane1-28/+51
See #7016.
2020-12-24Citeproc: fix handling of empty URL variables (`DOI`, etc.).John MacFarlane1-1/+3
The `linkifyVariables` function was changing these to links which then got treated as non-empty by citeproc, leading to wrong results (e.g. ignoring nonempty URL when empty DOI is present). Addresses part 2 of jgm/citeproc#41.
2020-12-16Fix citeproc regression with duplicate references.John MacFarlane1-1/+2
- Use dev version of citeproc, which handles duplicate ids better, preferring the last one in the list and discarding the rest.
- Ensure that inline citations take priority over external ones.
See jgm/citeproc#36. This restores the behavior of pandoc-citeproc.
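The "last one wins" rule for duplicate ids can be sketched with base library functions. This is an illustration, not the citeproc implementation: `keepLast` and `mergeRefs` are hypothetical names, and placing inline references after external ones so that they take priority is an assumption about how the two lists might be combined.

```haskell
import Data.Function (on)
import Data.List (nubBy)

-- Keep only the last occurrence of each key. nubBy keeps the first
-- occurrence, so it is applied to the reversed list.
keepLast :: Eq k => (a -> k) -> [a] -> [a]
keepLast key = reverse . nubBy ((==) `on` key) . reverse

-- Assumption: appending inline references after external ones lets the
-- "last one wins" rule give inline references priority.
mergeRefs :: Eq k => [(k, v)] -> [(k, v)] -> [(k, v)]
mergeRefs external inline = keepLast fst (external ++ inline)
```

For example, merging an external entry `("doe2020", "external")` with an inline entry `("doe2020", "inline")` leaves only the inline one.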
2020-12-15Use fetchItem to get external bibliography.John MacFarlane1-8/+7
This means that:
- a URL may be provided, and pandoc will fetch the resource.
- Pandoc will search the resource path for the bibliography if it is not found relative to the working directory.
Closes #6940.
2020-12-15Allow both inline and external references to be usedJohn MacFarlane1-14/+15
with `--citeproc`. This fixes a regression, since pandoc-citeproc allowed these to be combined. Closes #6951.
2020-12-02Citeproc: ensure that BCP47 lang codes can be used.John MacFarlane1-2/+17
We ignore the variants and just use the base lang code and country code when passing off to citeproc.
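Reducing a BCP 47 tag to its base language and country code might look like the sketch below. It is a simplification written for illustration, not the code from this commit: `splitOn` and `baseLang` are hypothetical names, the region is taken to be the first two-letter or three-digit subtag, and scanning stops at the first single-character subtag so that extension keys are not mistaken for regions.

```haskell
import Data.Char (isAlpha, isDigit, toLower, toUpper)
import Data.Maybe (listToMaybe)

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn c rest

-- Return the lowercased primary language subtag and, if present, the
-- uppercased region subtag. Script and variant subtags are ignored.
baseLang :: String -> (String, Maybe String)
baseLang tag = case splitOn '-' tag of
  []            -> ("", Nothing)
  (lang : rest) -> (map toLower lang, region rest)
  where
    -- Only look at subtags before the first singleton (extension marker).
    region = fmap (map toUpper) . listToMaybe . filter isRegion
           . takeWhile ((> 1) . length)
    isRegion s = (length s == 2 && all isAlpha s)
              || (length s == 3 && all isDigit s)
```

Under these assumptions `de-DE-1996` reduces to `("de", Just "DE")` and `zh-Hant-TW` to `("zh", Just "TW")`.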
2020-11-25Fix truncation of `[Citation]` list in `Cite` inside footnotes...John MacFarlane1-2/+2
This affected author-in-text citations in footnotes. It didn't cause problems for the printed output, but for filters that expected the citation id and other information. Closes #6890.
2020-11-13Improve period suppression algorithm for citations in notes...John MacFarlane1-1/+22
in note citation styles. See #6835.
2020-11-07Lint code in PRs and when committing to master (#6790)Albert Krewinkel1-5/+1
* Remove unused LANGUAGE pragmata
* Apply HLint suggestions
* Configure HLint to ignore some warnings
* Lint code when committing to master
2020-11-05Citeproc: improve punctuation in in-text note citations.John MacFarlane1-8/+15
Previously in-text note citations inside a footnote would sometimes have the final period stripped, even if it was needed (e.g. on the end of 'ibid'). See #6813.
2020-11-04Simplified idpred in citeproc.John MacFarlane1-2/+1
2020-11-01Citeproc: use comma for in-text citations inside footnotes.John MacFarlane1-8/+18
When an author-in-text citation like `@foo` occurs in a footnote, we now render it with: `AUTHOR NAME + COMMA + SPACE + REST`. Previously we rendered: `AUTHOR NAME + SPACE + "(" + REST + ")"`. This gives better results. Note that normal citations are still rendered in parentheses.
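The two renderings compared above can be written out as string functions. This is only a sketch of the output shape; `renderOld` and `renderNew` are hypothetical names, and the real code operates on pandoc inlines rather than strings.

```haskell
-- Previous rendering of an author-in-text citation inside a footnote:
-- AUTHOR NAME + SPACE + "(" + REST + ")".
renderOld :: String -> String -> String
renderOld author rest = author ++ " (" ++ rest ++ ")"

-- New rendering inside a footnote:
-- AUTHOR NAME + COMMA + SPACE + REST.
renderNew :: String -> String -> String
renderNew author rest = author ++ ", " ++ rest
```

For an author `Doe` and remainder `Title, 15`, the old form gives `Doe (Title, 15)` and the new form gives `Doe, Title, 15`.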
2020-11-01Improve deNote.John MacFarlane1-4/+5
2020-10-29Use new citeproc; do note capitalization here, not in citeproc.John MacFarlane1-2/+11
2020-10-27Remove obsolete commentJohn MacFarlane1-1/+0
2020-10-27Citeproc: properly handle `csl` field with `data:` URI.John MacFarlane1-1/+1
This is used with the JATS writer, so this fixes a regression in pandoc 2.11 with JATS output and citeproc. Closes #6783.
2020-10-26Add PandocBibliographyError and use it in parsing bibliographies.John MacFarlane1-5/+7
This ensures that bibliography parsing errors generate messages that include the bibliography file name -- otherwise it can be quite mysterious where it is coming from. [API change] New PandocBibliographyError constructor on PandocError type.
2020-10-21citeproc - improved removal of final period...John MacFarlane1-5/+8
...in citations inside notes in note-based styles. These citations are put in parentheses, but the final period must be removed. See jgm/citeproc#20
2020-10-14Fix typos in comments, doc strings, error messages, and testsAlbert Krewinkel1-4/+1
Typos reported by https://fossies.org/linux/test/pandoc-master.tar.gz/codespell.html See: #6738
2020-10-09In fetching parent of dependent CSL style, first...John MacFarlane1-1/+5
look locally, and only do an HTTP request if it's not found locally.
2020-10-07Raise informative errors when YAML metadata parsing fails.John MacFarlane1-2/+4
Closes #6730. Previously the command would succeed, returning empty metadata, with no errors or warnings.
API changes:
- Remove now unused CouldNotParseYamlMetadata constructor for LogMessage (T.P.Logging).
- Add 'Maybe FilePath' parameter to yamlToMeta in T.P.Readers.Markdown.
2020-10-07Cleaner solution to #6723.John MacFarlane1-4/+4
2020-10-07Fix URL prefixes in citations also when they occur in notes.John MacFarlane1-3/+3
Update chicago-fullnote-bibliography.csl and adjust tests. Closes #6723.
2020-10-06Incorporate `https://doi.org/` prefix added by CSL style...John MacFarlane1-1/+12
...into linked DOI, and similarly for other URLs linked in the bibliography. We want to avoid having a URL in which only the latter part is linked. Closes #6723.
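The idea of absorbing a URL prefix into the link that follows it can be shown on a toy inline type. The `Inline` type here is a deliberately reduced stand-in for the one in pandoc-types (a `Link` carries only a label and a target), and `fixLinks` is a hypothetical reconstruction of the behaviour described, not the function in this commit.

```haskell
-- Toy inline model (assumption): plain text, or a link with label and target.
data Inline
  = Str String
  | Link String String
  deriving (Eq, Show)

-- When a text run directly before a link, joined with the link label,
-- equals the link target, fold the text into the link so that the whole
-- URL is clickable rather than only its latter part.
fixLinks :: [Inline] -> [Inline]
fixLinks (Str pre : Link label target : rest)
  | not (null pre) && pre ++ label == target
  = Link target target : fixLinks rest
fixLinks (x : xs) = x : fixLinks xs
fixLinks []       = []
```

A style that emits the text `https://doi.org/` followed by a link labelled with the bare DOI would, under this sketch, end up with a single link whose label is the full URL.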
2020-10-06Fix URL for "short DOIs" in citations. See #6723.John MacFarlane1-1/+6
Short DOIs begin 10/abcd and should be links to `https://doi.org/abcd` (omitting the `10/`).
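The short DOI rule can be sketched in a few lines. `doiUrl` is a hypothetical name used for illustration; only the `10/` handling described in the commit message is modelled.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- Build the link target for a DOI. A short DOI starts with "10/" and
-- links to the part after that prefix; any other DOI is used unchanged.
doiUrl :: String -> String
doiUrl doi = "https://doi.org/" ++ fromMaybe doi (stripPrefix "10/" doi)
```

A regular DOI such as `10.1000/xyz` is unaffected because it begins with `10.` rather than `10/`.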
2020-10-05Fixed regression in last commit.John MacFarlane1-1/+1
Parsing of YAML bibliographies was broken; this fixes it.
2020-10-05Removed the idpred from metaValueToReference.John MacFarlane1-3/+2
This isn't really necessary; we do filtering at other points now.
2020-10-05Add yamlToRefs, yamlBsToRefs.John MacFarlane1-7/+5
T.P.Readers.Markdown now exports yamlToRefs. [API change] T.P.Readers.Metadata exports yamlBsToRefs. [API change] These allow specifying an id filter so we parse only references that are used in the document. Improves timing with a 3M yaml references file from 36s to 17s.
2020-10-05Improve searching for CSL files...John MacFarlane1-6/+15
...and CSL abbreviation files. Use resource path to search in both USERDATADIR/csl and USERDATADIR/csl/dependent. Also, add .csl or .json extension as needed, so you can just do --csl zoology.
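The search described above can be sketched as a function that lists candidate paths in order. This is an illustration under several assumptions: `ensureExtension` and `cslCandidates` are hypothetical names, paths use `/` as the separator, a name counts as having an extension if its last component contains a dot, and the resource path is tried before the two user data directories.

```haskell
-- Append the given extension unless the file name already has one.
ensureExtension :: String -> FilePath -> FilePath
ensureExtension ext path
  | '.' `elem` fileName = path
  | otherwise           = path ++ "." ++ ext
  where
    -- Last path component, found by scanning back to the final '/'.
    fileName = reverse (takeWhile (/= '/') (reverse path))

-- Candidate locations for a CSL style, in the order they would be tried.
cslCandidates :: [FilePath] -> FilePath -> FilePath -> [FilePath]
cslCandidates resourcePath userDataDir name =
  [ dir ++ "/" ++ file | dir <- dirs ]
  where
    file = ensureExtension "csl" name
    dirs = resourcePath
        ++ [userDataDir ++ "/csl", userDataDir ++ "/csl/dependent"]
```

With a resource path of `["."]` and a user data directory of `/data`, the name `zoology` expands to `./zoology.csl`, `/data/csl/zoology.csl`, and `/data/csl/dependent/zoology.csl`. For abbreviation files the same helper would be called with `"json"`.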
2020-10-05Use yamlToMeta for yaml bibliographyJohn MacFarlane1-5/+4
This speeds up parsing of external yaml bibliographies considerably (in one test 36s -> 17s).
2020-10-05Add filtering to metaValueToReference, and check other-ids field too.John MacFarlane1-4/+5