path: root/src/Text/Pandoc/Citeproc
author     John MacFarlane <jgm@berkeley.edu>    2020-09-06 16:25:16 -0700
committer  John MacFarlane <jgm@berkeley.edu>    2020-09-21 10:15:50 -0700
commit     e0984a43a99231e72c02a0a716c8d0315de9abdf (patch)
tree       8531ef58c2470d372ff2427a6ae09a6284461471 /src/Text/Pandoc/Citeproc
parent     89c577befb78b32a0884b6092e0415c0dcadab72 (diff)
download   pandoc-e0984a43a99231e72c02a0a716c8d0315de9abdf.tar.gz
Add built-in citation support using new citeproc library.
This deprecates the use of the external pandoc-citeproc filter; citation
processing is now built into pandoc.

* Add dependency on citeproc library.

* Add Text.Pandoc.Citeproc module (and some associated unexported modules
  under Text.Pandoc.Citeproc). Exports `processCitations`. [API change]

* Add data files needed for Text.Pandoc.Citeproc: default.csl in the data
  directory, and a citeproc directory that is just used at compile-time.
  Note that we've added file-embed as a mandatory rather than a conditional
  dependency, because of the biblatex localization files. We might
  eventually want to use readDataFile for this, but it would take some code
  reorganization.

* Text.Pandoc.Logging: Add `CiteprocWarning` to `LogMessage` and use it in
  `processCitations`. [API change]

* Add tests from the pandoc-citeproc package as command tests (including
  some tests pandoc-citeproc did not pass).

* Remove instructions for building pandoc-citeproc from CI and release
  binary build instructions. We will no longer distribute pandoc-citeproc.

* Markdown reader: tweak abbreviation support. Don't insert a nonbreaking
  space after a potential abbreviation if it comes right before a note or
  citation. This messes up several things, including citeproc's moving of
  note citations.

* Add `csljson` as an input and output format. This allows pandoc to
  convert between `csljson` and other bibliography formats, and to generate
  formatted versions of CSL JSON bibliographies.

* Add module Text.Pandoc.Writers.CslJson, exporting `writeCslJson`.
  [API change]

* Add module Text.Pandoc.Readers.CslJson, exporting `readCslJson`.
  [API change]

* Add `bibtex` and `biblatex` as input formats. This allows pandoc to
  convert between BibLaTeX and BibTeX and other bibliography formats, and
  to generate formatted versions of BibTeX/BibLaTeX bibliographies.

* Add module Text.Pandoc.Readers.BibTeX, exporting `readBibTeX` and
  `readBibLaTeX`. [API change]

* Make "standalone" implicit if the output format is a bibliography format.
  This is needed because pandoc readers for bibliography formats put the
  bibliographic information in the `references` field of metadata; and
  unless standalone is specified, metadata gets ignored. (TODO: This needs
  improvement. We should trigger standalone for the reader when the input
  format is bibliographic, and for the writer when the output format is
  markdown.)

* Carry over `citationNoteNum` to `citationNoteNumber`. This was just
  ignored in pandoc-citeproc.

* Text.Pandoc.Filter: Add `CiteprocFilter` constructor to Filter.
  [API change] This runs the processCitations transformation. We need to
  treat it like a filter so it can be placed in the sequence of filter runs
  (after some, before others). In FromYAML, this is parsed from `citeproc`
  or `{type: citeproc}`, so this special filter may be specified either way
  in a defaults file (or by `citeproc: true`, though this gives no control
  of positioning relative to other filters). TODO: we need to add something
  to the manual section on defaults files for this.

* Add a deprecation warning if the `pandoc-citeproc` filter is used.

* Add `--citeproc/-C` option to trigger citation processing. This behaves
  like a filter and will be positioned relative to other filters as they
  appear on the command line.

* Rewrite the manual on citations, adding a dedicated Citations section
  which also includes some information formerly found in the
  pandoc-citeproc man page.

* Look for CSL styles in the `csl` subdirectory of the pandoc user data
  directory. This changes the old pandoc-citeproc behavior, which looked in
  `~/.csl`. Users can simply symlink `~/.csl` to the `csl` subdirectory of
  their pandoc user data directory if they want the old behavior.

* Add support for CSL bibliography entry formatting to the LaTeX, HTML, and
  Ms writers. Add CSL-related CSS to styles.html.
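
As a quick illustration of the user-facing pieces described above (an
editorial sketch, not part of the commit; file and filter names are
placeholders), citation processing can be triggered from the command line,
positioned among other filters in a defaults file, or used to convert
between bibliography formats:

    # command line: process citations while converting a document
    pandoc --citeproc paper.md -o paper.html    # equivalently: pandoc -C ...

    # defaults file: control where citeproc runs relative to other filters
    filters:
      - cleanup.lua        # placeholder Lua filter
      - citeproc           # may also be written as {type: citeproc}
    # (citeproc: true also works, but gives no control over positioning)

    # convert a BibLaTeX database to CSL JSON with the new reader/writer
    pandoc refs.bib -f biblatex -t csljson -o refs.json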
Diffstat (limited to 'src/Text/Pandoc/Citeproc')
-rw-r--r--   src/Text/Pandoc/Citeproc/BibTeX.hs      1237
-rw-r--r--   src/Text/Pandoc/Citeproc/CslJson.hs       37
-rw-r--r--   src/Text/Pandoc/Citeproc/Data.hs          31
-rw-r--r--   src/Text/Pandoc/Citeproc/Locator.hs      279
-rw-r--r--   src/Text/Pandoc/Citeproc/MetaValue.hs    252
-rw-r--r--   src/Text/Pandoc/Citeproc/Util.hs          70
6 files changed, 1906 insertions, 0 deletions
diff --git a/src/Text/Pandoc/Citeproc/BibTeX.hs b/src/Text/Pandoc/Citeproc/BibTeX.hs
new file mode 100644
index 000000000..5919fee77
--- /dev/null
+++ b/src/Text/Pandoc/Citeproc/BibTeX.hs
@@ -0,0 +1,1237 @@
+{-# LANGUAGE ViewPatterns #-}
+{-# LANGUAGE OverloadedStrings #-}
+{-# LANGUAGE ScopedTypeVariables #-}
+{-# LANGUAGE FlexibleContexts #-}
+{-# OPTIONS_GHC -fno-warn-unused-do-bind #-}
+-----------------------------------------------------------------------------
+-- |
+-- Module      : Text.Pandoc.Citeproc.BibTeX
+-- Copyright   : (c) John MacFarlane
+-- License     : BSD-style (see LICENSE)
+--
+-- Maintainer  : John MacFarlane <fiddlosopher@gmail.com>
+-- Stability   : unstable
+-- Portability : unportable
+--
+-----------------------------------------------------------------------------
+
+module Text.Pandoc.Citeproc.BibTeX
+ ( Variant(..)
+ , readBibtexString
+ )
+ where
+
+import Text.Pandoc.Definition
+import Text.Pandoc.Builder as B
+import Text.Pandoc.Readers.LaTeX (readLaTeX)
+import Text.Pandoc.Extensions (Extension(..), extensionsFromList)
+import Text.Pandoc.Options (ReaderOptions(..))
+import Text.Pandoc.Class (runPure)
+import Text.Pandoc.Error (PandocError)
+import Text.Pandoc.Shared (stringify)
+import qualified Text.Pandoc.Walk as Walk
+import Citeproc.Types
+import Citeproc.CaseTransform (withSentenceCase)
+import Citeproc.Pandoc (caseTransform)
+import Text.Pandoc.Citeproc.Util (toIETF)
+import Text.Pandoc.Citeproc.Data (biblatexStringMap)
+import Data.Default
+import Data.Text (Text)
+import qualified Data.Text as T
+import qualified Data.Map as Map
+import Data.Maybe
+import Text.Parsec hiding (State, many, (<|>))
+import Control.Applicative
+import Data.List.Split (splitOn, splitWhen, wordsBy)
+import Control.Monad.RWS hiding ((<>))
+import qualified Data.Sequence as Seq
+import Data.Char (isAlphaNum, isDigit, isLetter,
+ isUpper, toLower, toUpper,
+ isLower, isPunctuation)
+import Data.List (foldl', intercalate)
+import Safe (readMay)
+
+data Variant = Bibtex | Biblatex
+ deriving (Show, Eq, Ord)
+
+-- | Parse BibTeX or BibLaTeX into a list of 'Reference's.
+readBibtexString :: Variant -- ^ bibtex or biblatex
+ -> Locale -- ^ Locale
+ -> (Text -> Bool) -- ^ Filter on citation ids
+ -> Text -- ^ bibtex/biblatex text
+ -> Either ParseError [Reference Inlines]
+readBibtexString variant locale idpred contents = do
+ case runParser (((resolveCrossRefs variant <$> bibEntries) <* eof) >>=
+ mapM (itemToReference locale variant) .
+ filter (idpred . identifier))
+ (fromMaybe defaultLang $ localeLanguage locale, Map.empty)
+ "" contents of
+ Left err -> Left err
+ Right xs -> return xs
+
+defaultLang :: Lang
+defaultLang = Lang "en" (Just "US")
+
+-- a map of bibtex "string" macros
+type StringMap = Map.Map Text Text
+
+type BibParser = Parsec Text (Lang, StringMap)
+
+data Item = Item{ identifier :: Text
+ , sourcePos :: SourcePos
+ , entryType :: Text
+ , fields :: Map.Map Text Text
+ }
+ deriving (Show, Ord, Eq)
+
+itemToReference :: Locale -> Variant -> Item -> BibParser (Reference Inlines)
+itemToReference locale variant item = do
+ setPosition (sourcePos item)
+ bib item $ do
+ let lang = fromMaybe defaultLang $ localeLanguage locale
+ modify $ \st -> st{ localeLang = lang,
+ untitlecase = case lang of
+ (Lang "en" _) -> True
+ _ -> False }
+
+ id' <- asks identifier
+ otherIds <- (Just <$> getRawField "ids")
+ <|> return Nothing
+ (reftype, genre) <- getTypeAndGenre
+ -- hyphenation:
+ let getLangId = do
+ langid <- T.strip . T.toLower <$> getRawField "langid"
+ idopts <- T.strip . T.toLower . stringify <$>
+ getField "langidopts" <|> return ""
+ case (langid, idopts) of
+ ("english","variant=british") -> return "british"
+ ("english","variant=american") -> return "american"
+ ("english","variant=us") -> return "american"
+ ("english","variant=usmax") -> return "american"
+ ("english","variant=uk") -> return "british"
+ ("english","variant=australian") -> return "australian"
+ ("english","variant=newzealand") -> return "newzealand"
+ (x,_) -> return x
+ hyphenation <- (Just . toIETF . T.toLower <$>
+ (getLangId <|> getRawField "hyphenation"))
+ <|> return Nothing
+ modify $ \s -> s{ untitlecase = untitlecase s &&
+ case hyphenation of
+ Just x -> "en-" `T.isPrefixOf` x
+ _ -> True }
+
+
+ opts <- (parseOptions <$> getRawField "options") <|> return []
+
+ et <- asks entryType
+
+ -- titles
+ let isArticle = et `elem`
+ ["article", "periodical", "suppperiodical", "review"]
+ let isPeriodical = et == "periodical"
+ let isChapterlike = et `elem`
+ ["inbook","incollection","inproceedings","inreference","bookinbook"]
+
+ let getFieldMaybe f = (Just <$> getField f) <|> return Nothing
+
+ -- names
+ let getNameList' f = Just <$>
+ getNameList (("bibtex", case variant of
+ Bibtex -> "true"
+ Biblatex -> "false") : opts) f
+
+ author' <- getNameList' "author" <|> return Nothing
+ containerAuthor' <- getNameList' "bookauthor" <|> return Nothing
+ translator' <- getNameList' "translator" <|> return Nothing
+ editortype <- getRawField "editortype" <|> return mempty
+ editor'' <- getNameList' "editor" <|> return Nothing
+ director'' <- getNameList' "director" <|> return Nothing
+ let (editor', director') = case editortype of
+ "director" -> (Nothing, editor'')
+ _ -> (editor'', director'')
+ -- FIXME: add same for editora, editorb, editorc
+
+ -- dates
+ issued' <- (Just <$> (getDate "date" <|> getOldDate mempty)) <|>
+ return Nothing
+ eventDate' <- (Just <$> (getDate "eventdate" <|> getOldDate "event")) <|>
+ return Nothing
+ origDate' <- (Just <$> (getDate "origdate" <|> getOldDate "orig")) <|>
+ return Nothing
+ accessed' <- (Just <$> (getDate "urldate" <|> getOldDate "url")) <|>
+ return Nothing
+
+ -- locators
+ pages' <- getFieldMaybe "pages"
+ volume' <- getFieldMaybe "volume"
+ part' <- getFieldMaybe "part"
+ volumes' <- getFieldMaybe "volumes"
+ pagetotal' <- getFieldMaybe "pagetotal"
+ chapter' <- getFieldMaybe "chapter"
+ edition' <- getFieldMaybe "edition"
+ version' <- getFieldMaybe "version"
+ (number', collectionNumber', issue') <-
+ (getField "number" >>= \x ->
+ if et `elem` ["book","collection","proceedings","reference",
+ "mvbook","mvcollection","mvproceedings", "mvreference",
+ "bookinbook","inbook", "incollection","inproceedings",
+ "inreference", "suppbook","suppcollection"]
+ then return (Nothing, Just x, Nothing)
+ else if isArticle
+ then (getField "issue" >>= \y ->
+ return (Nothing, Nothing, Just $ concatWith ',' [x,y]))
+ <|> return (Nothing, Nothing, Just x)
+ else return (Just x, Nothing, Nothing))
+ <|> return (Nothing, Nothing, Nothing)
+
+ -- titles
+ hasMaintitle <- (True <$ getRawField "maintitle") <|> return False
+
+ title' <- Just <$>
+ ((guard isPeriodical >> getTitle "issuetitle")
+ <|> (guard hasMaintitle >>
+ guard (not isChapterlike) >>
+ getTitle "maintitle")
+ <|> getTitle "title")
+ <|> return Nothing
+
+ subtitle' <- (guard isPeriodical >> getTitle "issuesubtitle")
+ <|> (guard hasMaintitle >>
+ guard (not isChapterlike) >>
+ getTitle "mainsubtitle")
+ <|> getTitle "subtitle"
+ <|> return mempty
+ titleaddon' <- (guard hasMaintitle >>
+ guard (not isChapterlike) >>
+ getTitle "maintitleaddon")
+ <|> getTitle "titleaddon"
+ <|> return mempty
+
+ volumeTitle' <- Just <$>
+ ((guard hasMaintitle >>
+ guard (not isChapterlike) >>
+ getTitle "title")
+ <|> (guard hasMaintitle >>
+ guard isChapterlike >>
+ getTitle "booktitle"))
+ <|> return Nothing
+ volumeSubtitle' <- (guard hasMaintitle >>
+ guard (not isChapterlike) >>
+ getTitle "subtitle")
+ <|> (guard hasMaintitle >>
+ guard isChapterlike >>
+ getTitle "booksubtitle")
+ <|> return mempty
+ volumeTitleAddon' <- (guard hasMaintitle >>
+ guard (not isChapterlike) >>
+ getTitle "titleaddon")
+ <|> (guard hasMaintitle >>
+ guard isChapterlike >>
+ getTitle "booktitleaddon")
+ <|> return mempty
+
+ containerTitle' <- Just <$>
+ ((guard isPeriodical >> getPeriodicalTitle "title")
+ <|> (guard isChapterlike >> getTitle "maintitle")
+ <|> (guard isChapterlike >> getTitle "booktitle")
+ <|> getPeriodicalTitle "journaltitle"
+ <|> getPeriodicalTitle "journal")
+ <|> return Nothing
+ containerSubtitle' <- (guard isPeriodical >> getPeriodicalTitle "subtitle")
+ <|> (guard isChapterlike >> getTitle "mainsubtitle")
+ <|> (guard isChapterlike >> getTitle "booksubtitle")
+ <|> getPeriodicalTitle "journalsubtitle"
+ <|> return mempty
+ containerTitleAddon' <- (guard isPeriodical >>
+ getPeriodicalTitle "titleaddon")
+ <|> (guard isChapterlike >>
+ getTitle "maintitleaddon")
+ <|> (guard isChapterlike >>
+ getTitle "booktitleaddon")
+ <|> return mempty
+ containerTitleShort' <- Just <$>
+ ((guard isPeriodical >>
+ guard (not hasMaintitle) >>
+ getField "shorttitle")
+ <|> getPeriodicalTitle "shortjournal")
+ <|> return Nothing
+
+ -- change numerical series title to e.g. 'series 3'
+ let fixSeriesTitle [Str xs] | isNumber xs =
+ [Str (ordinalize locale xs), Space, Str (resolveKey' lang "jourser")]
+ fixSeriesTitle xs = xs
+ seriesTitle' <- (Just . B.fromList . fixSeriesTitle .
+ B.toList . resolveKey lang <$>
+ getTitle "series") <|>
+ return Nothing
+ shortTitle' <- Just <$>
+ ((guard (not hasMaintitle || isChapterlike) >>
+ getTitle "shorttitle")
+ <|> if (subtitle' /= mempty || titleaddon' /= mempty) &&
+ not hasMaintitle
+ then getShortTitle False "title"
+ else getShortTitle True "title")
+ <|> return Nothing
+
+ eventTitle' <- Just <$> getTitle "eventtitle" <|> return Nothing
+ origTitle' <- Just <$> getTitle "origtitle" <|> return Nothing
+
+ -- publisher
+ pubfields <- mapM (\f -> Just `fmap`
+ (if variant == Bibtex || f == "howpublished"
+ then getField f
+ else getLiteralList' f)
+ <|> return Nothing)
+ ["school","institution","organization", "howpublished","publisher"]
+ let publisher' = case catMaybes pubfields of
+ [] -> Nothing
+ xs -> Just $ concatWith ';' xs
+ origpublisher' <- (Just <$> getField "origpublisher") <|> return Nothing
+
+ -- places
+ venue' <- (Just <$> getField "venue") <|> return Nothing
+ address' <- Just <$>
+ (if variant == Bibtex
+ then getField "address"
+ else getLiteralList' "address"
+ <|> (guard (et /= "patent") >>
+ getLiteralList' "location"))
+ <|> return Nothing
+ origLocation' <- Just <$>
+ (if variant == Bibtex
+ then getField "origlocation"
+ else getLiteralList' "origlocation")
+ <|> return Nothing
+ jurisdiction' <- if reftype == "patent"
+ then Just <$>
+ (concatWith ';' . map (resolveKey lang) <$>
+ getLiteralList "location") <|> return Nothing
+ else return Nothing
+
+ -- url, doi, isbn, etc.:
+ -- note that with eprinttype = arxiv, we take eprint to be a partial url
+ -- archivePrefix is an alias for eprinttype
+ url' <- (guard (et == "online" || lookup "url" opts /= Just "false")
+ >> Just <$> getRawField "url")
+ <|> (do etype <- getRawField "eprinttype"
+ eprint <- getRawField "eprint"
+ let baseUrl =
+ case T.toLower etype of
+ "arxiv" -> "http://arxiv.org/abs/"
+ "jstor" -> "http://www.jstor.org/stable/"
+ "pubmed" -> "http://www.ncbi.nlm.nih.gov/pubmed/"
+ "googlebooks" -> "http://books.google.com?id="
+ _ -> ""
+ if T.null baseUrl
+ then mzero
+ else return $ Just $ baseUrl <> eprint)
+ <|> return Nothing
+ doi' <- (guard (lookup "doi" opts /= Just "false") >>
+ Just <$> getRawField "doi")
+ <|> return Nothing
+ isbn' <- Just <$> getRawField "isbn" <|> return Nothing
+ issn' <- Just <$> getRawField "issn" <|> return Nothing
+ pmid' <- Just <$> getRawField "pmid" <|> return Nothing
+ pmcid' <- Just <$> getRawField "pmcid" <|> return Nothing
+ callNumber' <- Just <$> getRawField "library" <|> return Nothing
+
+ -- notes
+ annotation' <- Just <$>
+ (getField "annotation" <|> getField "annote")
+ <|> return Nothing
+ abstract' <- Just <$> getField "abstract" <|> return Nothing
+ keywords' <- Just <$> getField "keywords" <|> return Nothing
+ note' <- if et == "periodical"
+ then return Nothing
+ else Just <$> getField "note" <|> return Nothing
+ addendum' <- if variant == Bibtex
+ then return Nothing
+ else Just <$> getField "addendum"
+ <|> return Nothing
+ pubstate' <- ( (Just . resolveKey lang <$> getField "pubstate")
+ <|> case dateLiteral <$> issued' of
+ Just (Just "forthcoming") ->
+ return $ Just $ B.str "forthcoming"
+ _ -> return Nothing
+ )
+
+
+
+
+ let addField (_, Nothing) = id
+ addField (f, Just x) = Map.insert f x
+ let vars = foldr addField mempty
+ [ ("other-ids", TextVal <$> otherIds)
+ , ("genre", TextVal <$> genre)
+ , ("language", TextVal <$> hyphenation)
+ -- dates
+ , ("accessed", DateVal <$> accessed')
+ , ("event-date", DateVal <$> eventDate')
+ , ("issued", DateVal <$> issued')
+ , ("original-date", DateVal <$> origDate')
+ -- names
+ , ("author", NamesVal <$> author')
+ , ("editor", NamesVal <$> editor')
+ , ("translator", NamesVal <$> translator')
+ , ("director", NamesVal <$> director')
+ , ("container-author", NamesVal <$> containerAuthor')
+ -- locators
+ , ("page", FancyVal . Walk.walk convertEnDash <$> pages')
+ , ("number-of-pages", FancyVal <$> pagetotal')
+ , ("volume", case (volume', part') of
+ (Nothing, Nothing) -> Nothing
+ (Just v, Nothing) -> Just $ FancyVal v
+ (Nothing, Just p) -> Just $ FancyVal p
+ (Just v, Just p) ->
+ Just $ FancyVal $ v <> B.str "." <> p)
+ , ("number-of-volumes", FancyVal <$> volumes')
+ , ("chapter-number", FancyVal <$> chapter')
+ , ("edition", FancyVal <$> edition')
+ , ("version", FancyVal <$> version')
+ , ("number", FancyVal <$> number')
+ , ("collection-number", FancyVal <$> collectionNumber')
+ , ("issue", FancyVal <$> issue')
+ -- title
+ , ("original-title", FancyVal <$> origTitle')
+ , ("event", FancyVal <$> eventTitle')
+ , ("title", case title' of
+ Just t -> Just $ FancyVal $
+ concatWith '.' [
+ concatWith ':' [t, subtitle']
+ , titleaddon' ]
+ Nothing -> Nothing)
+ , ("volume-title",
+ case volumeTitle' of
+ Just t -> Just $ FancyVal $
+ concatWith '.' [
+ concatWith ':' [t, volumeSubtitle']
+ , volumeTitleAddon' ]
+ Nothing -> Nothing)
+ , ("container-title",
+ case containerTitle' of
+ Just t -> Just $ FancyVal $
+ concatWith '.' [
+ concatWith ':' [t,
+ containerSubtitle']
+ , containerTitleAddon' ]
+ Nothing -> Nothing)
+ , ("container-title-short", FancyVal <$> containerTitleShort')
+ , ("collection-title", FancyVal <$> seriesTitle')
+ , ("title-short", FancyVal <$> shortTitle')
+ -- publisher
+ , ("publisher", FancyVal <$> publisher')
+ , ("original-publisher", FancyVal <$> origpublisher')
+ -- places
+ , ("jurisdiction", FancyVal <$> jurisdiction')
+ , ("event-place", FancyVal <$> venue')
+ , ("publisher-place", FancyVal <$> address')
+ , ("original-publisher-place", FancyVal <$> origLocation')
+ -- urls
+ , ("url", TextVal <$> url')
+ , ("doi", TextVal <$> doi')
+ , ("isbn", TextVal <$> isbn')
+ , ("issn", TextVal <$> issn')
+ , ("pmcid", TextVal <$> pmcid')
+ , ("pmid", TextVal <$> pmid')
+ , ("call-number", TextVal <$> callNumber')
+ -- notes
+ , ("note", case catMaybes [note', addendum'] of
+ [] -> Nothing
+ xs -> return $ FancyVal $ concatWith '.' xs)
+ , ("annote", FancyVal <$> annotation')
+ , ("abstract", FancyVal <$> abstract')
+ , ("keyword", FancyVal <$> keywords')
+ , ("status", FancyVal <$> pubstate')
+ ]
+ return $ Reference
+ { referenceId = ItemId id'
+ , referenceType = reftype
+ , referenceDisambiguation = Nothing
+ , referenceVariables = vars }
+
+
+bib :: Item -> Bib a -> BibParser a
+bib entry m = fst <$> evalRWST m entry (BibState True (Lang "en" (Just "US")))
+
+resolveCrossRefs :: Variant -> [Item] -> [Item]
+resolveCrossRefs variant entries =
+ map (resolveCrossRef variant entries) entries
+
+resolveCrossRef :: Variant -> [Item] -> Item -> Item
+resolveCrossRef variant entries entry =
+ Map.foldrWithKey go entry (fields entry)
+ where go key val entry' =
+ if key == "crossref" || key == "xdata"
+ then entry'{ fields = fields entry' <>
+ Map.fromList (getXrefFields variant
+ entry entries val) }
+ else entry'
+
+getXrefFields :: Variant -> Item -> [Item] -> Text -> [(Text, Text)]
+getXrefFields variant baseEntry entries keys = do
+ let keys' = splitKeys keys
+ xrefEntry <- [e | e <- entries, identifier e `elem` keys']
+ (k, v) <- Map.toList $ fields xrefEntry
+ if k == "crossref" || k == "xdata"
+ then do
+ xs <- mapM (getXrefFields variant baseEntry entries)
+ (splitKeys v)
+ (x, y) <- xs
+ guard $ isNothing $ Map.lookup x $ fields xrefEntry
+ return (x, y)
+ else do
+ k' <- case variant of
+ Bibtex -> return k
+ Biblatex -> transformKey
+ (entryType xrefEntry) (entryType baseEntry) k
+ guard $ isNothing $ Map.lookup k' $ fields baseEntry
+ return (k',v)
+
+
+
+data BibState = BibState{
+ untitlecase :: Bool
+ , localeLang :: Lang
+ }
+
+type Bib = RWST Item () BibState BibParser
+
+blocksToInlines :: [Block] -> Inlines
+blocksToInlines bs =
+ case bs of
+ [Plain xs] -> B.fromList xs
+ [Para xs] -> B.fromList xs
+ _ -> B.fromList $ Walk.query (:[]) bs
+
+adjustSpans :: Lang -> Inline -> Inline
+adjustSpans lang (RawInline (Format "latex") s)
+ | s == "\\hyphen" || s == "\\hyphen " = Str "-"
+ | otherwise = parseRawLaTeX lang s
+adjustSpans _ SoftBreak = Space
+adjustSpans _ x = x
+
+parseRawLaTeX :: Lang -> Text -> Inline
+parseRawLaTeX lang t@(T.stripPrefix "\\" -> Just xs) =
+ case parseLaTeX lang contents of
+ Right [Para ys] -> f command ys
+ Right [Plain ys] -> f command ys
+ Right [] -> f command []
+ _ -> RawInline (Format "latex") t
+ where (command', contents') = T.break (\c -> c =='{' || c =='\\') xs
+ command = T.strip command'
+ contents = T.drop 1 $ T.dropEnd 1 contents'
+ f "mkbibquote" ils = Span nullAttr [Quoted DoubleQuote ils]
+ f "mkbibemph" ils = Span nullAttr [Emph ils]
+ f "mkbibitalic" ils = Span nullAttr [Emph ils]
+ f "mkbibbold" ils = Span nullAttr [Strong ils]
+ f "mkbibparens" ils = Span nullAttr $
+ [Str "("] ++ ils ++ [Str ")"]
+ f "mkbibbrackets" ils = Span nullAttr $
+ [Str "["] ++ ils ++ [Str "]"]
+ -- ... both should be nestable & should work in year fields
+ f "autocap" ils = Span nullAttr ils
+ -- TODO: should work in year fields
+ f "textnormal" ils = Span ("",["nodecor"],[]) ils
+ f "bibstring" [Str s] = Str $ resolveKey' lang s
+ f "adddot" [] = Str "."
+ f "adddotspace" [] = Span nullAttr [Str ".", Space]
+ f "addabbrvspace" [] = Space
+ f _ ils = Span nullAttr ils
+parseRawLaTeX _ t = RawInline (Format "latex") t
+
+latex' :: Text -> Bib [Block]
+latex' t = do
+ lang <- gets localeLang
+ case parseLaTeX lang t of
+ Left _ -> mzero
+ Right bs -> return bs
+
+parseLaTeX :: Lang -> Text -> Either PandocError [Block]
+parseLaTeX lang t =
+ case runPure (readLaTeX
+ def{ readerExtensions =
+ extensionsFromList [Ext_raw_tex, Ext_smart] } t) of
+ Left e -> Left e
+ Right (Pandoc _ bs) -> Right $ Walk.walk (adjustSpans lang) bs
+
+latex :: Text -> Bib Inlines
+latex = fmap blocksToInlines . latex' . T.strip
+
+type Options = [(Text, Text)]
+
+parseOptions :: Text -> Options
+parseOptions = map breakOpt . T.splitOn ","
+ where breakOpt x = case T.break (=='=') x of
+ (w,v) -> (T.toLower $ T.strip w,
+ T.toLower $ T.strip $ T.drop 1 v)
+
+bibEntries :: BibParser [Item]
+bibEntries = do
+ skipMany nonEntry
+ many (bibItem <* skipMany nonEntry)
+ where nonEntry = bibSkip <|>
+ try (char '@' >>
+ (bibComment <|> bibPreamble <|> bibString))
+
+bibSkip :: BibParser ()
+bibSkip = skipMany1 (satisfy (/='@'))
+
+bibComment :: BibParser ()
+bibComment = do
+ cistring "comment"
+ spaces
+ void inBraces <|> bibSkip <|> return ()
+
+bibPreamble :: BibParser ()
+bibPreamble = do
+ cistring "preamble"
+ spaces
+ void inBraces
+
+bibString :: BibParser ()
+bibString = do
+ cistring "string"
+ spaces
+ char '{'
+ spaces
+ (k,v) <- entField
+ char '}'
+ updateState (\(l,m) -> (l, Map.insert k v m))
+ return ()
+
+inBraces :: BibParser Text
+inBraces = try $ do
+ char '{'
+ res <- manyTill
+ ( (T.pack <$> many1 (noneOf "{}\\"))
+ <|> (char '\\' >> ( (char '{' >> return "\\{")
+ <|> (char '}' >> return "\\}")
+ <|> return "\\"))
+ <|> (braced <$> inBraces)
+ ) (char '}')
+ return $ T.concat res
+
+braced :: Text -> Text
+braced = T.cons '{' . flip T.snoc '}'
+
+inQuotes :: BibParser Text
+inQuotes = do
+ char '"'
+ T.concat <$> manyTill
+ ( (T.pack <$> many1 (noneOf "\"\\{"))
+ <|> (char '\\' >> T.cons '\\' . T.singleton <$> anyChar)
+ <|> braced <$> inBraces
+ ) (char '"')
+
+fieldName :: BibParser Text
+fieldName = resolveAlias . T.toLower . T.pack
+ <$> many1 (letter <|> digit <|> oneOf "-_:+")
+
+isBibtexKeyChar :: Char -> Bool
+isBibtexKeyChar c =
+ isAlphaNum c || c `elem` (".:;?!`'()/*@_+=-[]*&" :: [Char])
+
+bibItem :: BibParser Item
+bibItem = do
+ char '@'
+ pos <- getPosition
+ enttype <- map toLower <$> many1 letter
+ spaces
+ char '{'
+ spaces
+ entid <- many1 (satisfy isBibtexKeyChar)
+ spaces
+ char ','
+ spaces
+ entfields <- entField `sepEndBy` (char ',' >> spaces)
+ spaces
+ char '}'
+ return $ Item (T.pack entid) pos (T.pack enttype) (Map.fromList entfields)
+
+entField :: BibParser (Text, Text)
+entField = do
+ k <- fieldName
+ spaces
+ char '='
+ spaces
+ vs <- (expandString <|> inQuotes <|> inBraces <|> rawWord) `sepBy`
+ try (spaces >> char '#' >> spaces)
+ spaces
+ return (k, T.concat vs)
+
+resolveAlias :: Text -> Text
+resolveAlias "archiveprefix" = "eprinttype"
+resolveAlias "primaryclass" = "eprintclass"
+resolveAlias s = s
+
+rawWord :: BibParser Text
+rawWord = T.pack <$> many1 alphaNum
+
+expandString :: BibParser Text
+expandString = do
+ k <- fieldName
+ (lang, strs) <- getState
+ case Map.lookup k strs of
+ Just v -> return v
+ Nothing -> return $ resolveKey' lang k
+
+cistring :: Text -> BibParser Text
+cistring s = try (go s)
+ where go t = case T.uncons t of
+ Nothing -> return ""
+ Just (c,cs) -> do
+ x <- char (toLower c) <|> char (toUpper c)
+ xs <- go cs
+ return (T.cons x xs)
+
+splitKeys :: Text -> [Text]
+splitKeys = filter (not . T.null) . T.split (\c -> c == ' ' || c == ',')
+
+-- Biblatex Localization Keys (see Biblatex manual)
+-- Currently we only map a subset likely to be used in Biblatex *databases*
+-- (in fields such as `type`, and via `\bibstring{}` commands).
+
+parseMonth :: Text -> Maybe Int
+parseMonth s =
+ case T.toLower s of
+ "jan" -> Just 1
+ "feb" -> Just 2
+ "mar" -> Just 3
+ "apr" -> Just 4
+ "may" -> Just 5
+ "jun" -> Just 6
+ "jul" -> Just 7
+ "aug" -> Just 8
+ "sep" -> Just 9
+ "oct" -> Just 10
+ "nov" -> Just 11
+ "dec" -> Just 12
+ _ -> readMay (T.unpack s)
+
+notFound :: Text -> Bib a
+notFound f = Prelude.fail $ T.unpack f ++ " not found"
+
+getField :: Text -> Bib Inlines
+getField f = do
+ fs <- asks fields
+ case Map.lookup f fs of
+ Just x -> latex x
+ Nothing -> notFound f
+
+
+getPeriodicalTitle :: Text -> Bib Inlines
+getPeriodicalTitle f = do
+ ils <- getField f
+ return ils
+
+protectCase :: (Inlines -> Inlines) -> (Inlines -> Inlines)
+protectCase f = Walk.walk unprotect . f . Walk.walk protect
+ where
+ protect (Span ("",[],[]) xs) = Span ("",["nocase"],[]) xs
+ protect x = x
+ unprotect (Span ("",["nocase"],[]) xs)
+ | hasLowercaseWord xs = Span ("",["nocase"],[]) xs
+ | otherwise = Span ("",[],[]) xs
+ unprotect x = x
+ hasLowercaseWord = any startsWithLowercase . splitStrWhen isPunctuation
+ startsWithLowercase (Str (T.uncons -> Just (x,_))) = isLower x
+ startsWithLowercase _ = False
+
+unTitlecase :: Maybe Lang -> Inlines -> Inlines
+unTitlecase mblang = protectCase (caseTransform (withSentenceCase mblang))
+
+getTitle :: Text -> Bib Inlines
+getTitle f = do
+ ils <- getField f
+ utc <- gets untitlecase
+ lang <- gets localeLang
+ let processTitle = if utc then unTitlecase (Just lang) else id
+ return $ processTitle ils
+
+getShortTitle :: Bool -> Text -> Bib Inlines
+getShortTitle requireColon f = do
+ ils <- splitStrWhen (==':') . B.toList <$> getTitle f
+ if not requireColon || containsColon ils
+ then return $ B.fromList $ upToColon ils
+ else return mempty
+
+containsColon :: [Inline] -> Bool
+containsColon xs = Str ":" `elem` xs
+
+upToColon :: [Inline] -> [Inline]
+upToColon xs = takeWhile (/= Str ":") xs
+
+isNumber :: Text -> Bool
+isNumber t = case T.uncons t of
+ Just ('-', ds) -> T.all isDigit ds
+ Just _ -> T.all isDigit t
+ Nothing -> False
+
+getDate :: Text -> Bib Date
+getDate f = do
+ -- the ~ can be used for approx dates, but the latex reader
+ -- parses this as a nonbreaking space, so we need to convert it back!
+ let nbspToTilde '\160' = '~'
+ nbspToTilde c = c
+ mbd <- rawDateEDTF . T.map nbspToTilde <$> getRawField f
+ case mbd of
+ Nothing -> Prelude.fail "expected date"
+ Just d -> return d
+
+-- A negative (BC) year might be written with -- or --- in bibtex; the
+-- LaTeX reader turns these into en/em dashes, so convert a leading dash
+-- back to an ASCII '-' so the year can be read as a negative number:
+fixLeadingDash :: Text -> Text
+fixLeadingDash t = case T.uncons t of
+ Just (c, ds) | (c == '–' || c == '—') && firstIsDigit ds -> T.cons '-' ds
+ _ -> t
+ where firstIsDigit = maybe False (isDigit . fst) . T.uncons
+
+getOldDate :: Text -> Bib Date
+getOldDate prefix = do
+ year' <- (readMay . T.unpack . fixLeadingDash . stringify
+ <$> getField (prefix <> "year")) <|> return Nothing
+ month' <- (parseMonth <$> getRawField (prefix <> "month"))
+ <|> return Nothing
+ day' <- (readMay . T.unpack <$> getRawField (prefix <> "day"))
+ <|> return Nothing
+ endyear' <- (readMay . T.unpack . fixLeadingDash . stringify
+ <$> getField (prefix <> "endyear")) <|> return Nothing
+ endmonth' <- (parseMonth . stringify
+ <$> getField (prefix <> "endmonth")) <|> return Nothing
+ endday' <- (readMay . T.unpack . stringify <$>
+ getField (prefix <> "endday")) <|> return Nothing
+ let toDateParts (y', m', d') =
+ DateParts $
+ case y' of
+ Nothing -> []
+ Just y ->
+ case m' of
+ Nothing -> [y]
+ Just m ->
+ case d' of
+ Nothing -> [y,m]
+ Just d -> [y,m,d]
+ let dateparts = filter (\x -> x /= DateParts [])
+ $ map toDateParts [(year',month',day'),
+ (endyear',endmonth',endday')]
+ literal <- if null dateparts
+ then Just <$> getRawField (prefix <> "year")
+ else return Nothing
+ return $
+ Date { dateParts = dateparts
+ , dateCirca = False
+ , dateSeason = Nothing
+ , dateLiteral = literal }
+
+getRawField :: Text -> Bib Text
+getRawField f =
+ (stringify <$> getField f)
+ <|> do fs <- asks fields
+ case Map.lookup f fs of
+ Just x -> return x
+ Nothing -> notFound f
+
+getLiteralList :: Text -> Bib [Inlines]
+getLiteralList f = do
+ fs <- asks fields
+ case Map.lookup f fs of
+ Just x -> latex' x >>= toLiteralList
+ Nothing -> notFound f
+
+-- separates items with semicolons
+getLiteralList' :: Text -> Bib Inlines
+getLiteralList' f = do
+ fs <- asks fields
+ case Map.lookup f fs of
+ Just x -> do
+ x' <- latex' x
+ case x' of
+ [Para xs] ->
+ return $ B.fromList
+ $ intercalate [Str ";", Space]
+ $ splitByAnd xs
+ [Plain xs] ->
+ return $ B.fromList
+ $ intercalate [Str ";", Space]
+ $ splitByAnd xs
+ _ -> mzero
+ Nothing -> notFound f
+
+splitByAnd :: [Inline] -> [[Inline]]
+splitByAnd = splitOn [Space, Str "and", Space]
+
+toLiteralList :: [Block] -> Bib [Inlines]
+toLiteralList [Para xs] =
+ return $ map B.fromList $ splitByAnd xs
+toLiteralList [Plain xs] = toLiteralList [Para xs]
+toLiteralList _ = mzero
+
+concatWith :: Char -> [Inlines] -> Inlines
+concatWith sep = foldl' go mempty
+ where go :: Inlines -> Inlines -> Inlines
+ go accum s
+ | s == mempty = accum
+ | otherwise =
+ case Seq.viewr (B.unMany accum) of
+ Seq.EmptyR -> s
+ _ Seq.:> Str x
+ | not (T.null x) &&
+ T.last x `elem` ("!?.,:;" :: String)
+ -> accum <> B.space <> s
+ _ -> accum <> B.str (T.singleton sep) <>
+ B.space <> s
+
+
+getNameList :: Options -> Text -> Bib [Name]
+getNameList opts f = do
+ fs <- asks fields
+ case Map.lookup f fs of
+ Just x -> latexNames opts x
+ Nothing -> notFound f
+
+toNameList :: Options -> [Block] -> Bib [Name]
+toNameList opts [Para xs] =
+ filter (/= emptyName) <$> mapM (toName opts . addSpaceAfterPeriod)
+ (splitByAnd xs)
+toNameList opts [Plain xs] = toNameList opts [Para xs]
+toNameList _ _ = mzero
+
+latexNames :: Options -> Text -> Bib [Name]
+latexNames opts t = latex' (T.strip t) >>= toNameList opts
+
+-- see issue 392 for motivation. We want to treat
+-- "J.G. Smith" and "J. G. Smith" the same.
+addSpaceAfterPeriod :: [Inline] -> [Inline]
+addSpaceAfterPeriod = go . splitStrWhen (=='.')
+ where
+ go [] = []
+ go (Str (T.unpack -> [c]):Str ".":Str (T.unpack -> [d]):xs)
+ | isLetter d
+ , isLetter c
+ , isUpper c
+ , isUpper d
+ = Str (T.singleton c):Str ".":Space:go (Str (T.singleton d):xs)
+ go (x:xs) = x:go xs
+
+emptyName :: Name
+emptyName =
+ Name { nameFamily = Nothing
+ , nameGiven = Nothing
+ , nameDroppingParticle = Nothing
+ , nameNonDroppingParticle = Nothing
+ , nameSuffix = Nothing
+ , nameLiteral = Nothing
+ , nameCommaSuffix = False
+ , nameStaticOrdering = False
+ }
+
+toName :: Options -> [Inline] -> Bib Name
+toName _ [Str "others"] =
+ return emptyName{ nameLiteral = Just "others" }
+toName _ [Span ("",[],[]) ils] = -- corporate author
+ return emptyName{ nameLiteral = Just $ stringify ils }
+ -- extended BibLaTeX name format - see #266
+toName _ ils@(Str ys:_) | T.any (== '=') ys = do
+ let commaParts = splitWhen (== Str ",")
+ . splitStrWhen (\c -> c == ',' || c == '=' || c == '\160')
+ $ ils
+ let addPart ag (Str "given" : Str "=" : xs) =
+ ag{ nameGiven = case nameGiven ag of
+ Nothing -> Just $ stringify xs
+ Just t -> Just $ t <> " " <> stringify xs }
+ addPart ag (Str "family" : Str "=" : xs) =
+ ag{ nameFamily = Just $ stringify xs }
+ addPart ag (Str "prefix" : Str "=" : xs) =
+ ag{ nameDroppingParticle = Just $ stringify xs }
+ addPart ag (Str "useprefix" : Str "=" : Str "true" : _) =
+ ag{ nameNonDroppingParticle = nameDroppingParticle ag
+ , nameDroppingParticle = Nothing }
+ addPart ag (Str "suffix" : Str "=" : xs) =
+ ag{ nameSuffix = Just $ stringify xs }
+ addPart ag (Space : xs) = addPart ag xs
+ addPart ag _ = ag
+ return $ foldl' addPart emptyName commaParts
+-- First von Last
+-- von Last, First
+-- von Last, Jr, First
+-- NOTE: biblatex and bibtex differ on:
+-- Drummond de Andrade, Carlos
+-- bibtex takes "Drummond de" as the von;
+-- biblatex takes the whole as a last name.
+-- See https://github.com/plk/biblatex/issues/236
+-- Here we implement the more sensible biblatex behavior.
+toName opts ils = do
+ let useprefix = optionSet "useprefix" opts
+ let usecomma = optionSet "juniorcomma" opts
+ let bibtex = optionSet "bibtex" opts
+ let words' = wordsBy (\x -> x == Space || x == Str "\160")
+ let commaParts = map words' $ splitWhen (== Str ",")
+ $ splitStrWhen
+ (\c -> c == ',' || c == '\160') ils
+ let (first, vonlast, jr) =
+ case commaParts of
+ --- First is the longest sequence of white-space separated
+ -- words starting with an uppercase and that is not the
+ -- whole string. von is the longest sequence of whitespace
+ -- separated words whose last word starts with lower case
+ -- and that is not the whole string.
+ [fvl] -> let (caps', rest') = span isCapitalized fvl
+ in if null rest' && not (null caps')
+ then (init caps', [last caps'], [])
+ else (caps', rest', [])
+ [vl,f] -> (f, vl, [])
+ (vl:j:f:_) -> (f, vl, j )
+ [] -> ([], [], [])
+
+ let (von, lastname) =
+ if bibtex
+ then case span isCapitalized $ reverse vonlast of
+ ([],w:ws) -> (reverse ws, [w])
+ (vs, ws) -> (reverse ws, reverse vs)
+ else case break isCapitalized vonlast of
+ (vs@(_:_), []) -> (init vs, [last vs])
+ (vs, ws) -> (vs, ws)
+ let prefix = T.unwords $ map stringify von
+ let family = T.unwords $ map stringify lastname
+ let suffix = T.unwords $ map stringify jr
+ let given = T.unwords $ map stringify first
+ return
+ Name { nameFamily = if T.null family
+ then Nothing
+ else Just family
+ , nameGiven = if T.null given
+ then Nothing
+ else Just given
+ , nameDroppingParticle = if useprefix || T.null prefix
+ then Nothing
+ else Just prefix
+ , nameNonDroppingParticle = if useprefix && not (T.null prefix)
+ then Just prefix
+ else Nothing
+ , nameSuffix = if T.null suffix
+ then Nothing
+ else Just suffix
+ , nameLiteral = Nothing
+ , nameCommaSuffix = usecomma
+ , nameStaticOrdering = False
+ }
+
+splitStrWhen :: (Char -> Bool) -> [Inline] -> [Inline]
+splitStrWhen _ [] = []
+splitStrWhen p (Str xs : ys) = map Str (go xs) ++ splitStrWhen p ys
+ where go s =
+ let (w,z) = T.break p s
+ in if T.null z
+ then if T.null w
+ then []
+ else [w]
+ else if T.null w
+ then (T.take 1 z : go (T.drop 1 z))
+ else (w : T.take 1 z : go (T.drop 1 z))
+splitStrWhen p (x : ys) = x : splitStrWhen p ys
+
+ordinalize :: Locale -> Text -> Text
+ordinalize locale n =
+ let terms = localeTerms locale
+ pad0 t = case T.length t of
+ 0 -> "00"
+ 1 -> "0" <> t
+ _ -> t
+ in case Map.lookup ("ordinal-" <> pad0 n) terms <|>
+ Map.lookup "ordinal" terms of
+ Nothing -> n
+ Just [] -> n
+ Just (t:_) -> n <> snd t
+
+isCapitalized :: [Inline] -> Bool
+isCapitalized (Str (T.uncons -> Just (c,cs)) : rest)
+ | isUpper c = True
+ | isDigit c = isCapitalized (Str cs : rest)
+ | otherwise = False
+isCapitalized (_:rest) = isCapitalized rest
+isCapitalized [] = True
+
+optionSet :: Text -> Options -> Bool
+optionSet key opts = case lookup key opts of
+ Just "true" -> True
+ Just s -> s == mempty
+ _ -> False
+
+getTypeAndGenre :: Bib (Text, Maybe Text)
+getTypeAndGenre = do
+ lang <- gets localeLang
+ et <- asks entryType
+ guard $ et /= "xdata"
+ reftype' <- resolveKey' lang <$> getRawField "type"
+ <|> return mempty
+ st <- getRawField "entrysubtype" <|> return mempty
+ isEvent <- (True <$ (getRawField "eventdate"
+ <|> getRawField "eventtitle"
+ <|> getRawField "venue")) <|> return False
+ let reftype =
+ case et of
+ "article"
+ | st == "magazine" -> "article-magazine"
+ | st == "newspaper" -> "article-newspaper"
+ | otherwise -> "article-journal"
+ "book" -> "book"
+ "booklet" -> "pamphlet"
+ "bookinbook" -> "chapter"
+ "collection" -> "book"
+ "dataset" -> "dataset"
+ "electronic" -> "webpage"
+ "inbook" -> "chapter"
+ "incollection" -> "chapter"
+ "inreference" -> "entry-encyclopedia"
+ "inproceedings" -> "paper-conference"
+ "manual" -> "book"
+ "mastersthesis" -> "thesis"
+ "misc" -> ""
+ "mvbook" -> "book"
+ "mvcollection" -> "book"
+ "mvproceedings" -> "book"
+ "mvreference" -> "book"
+ "online" -> "webpage"
+ "patent" -> "patent"
+ "periodical"
+ | st == "magazine" -> "article-magazine"
+ | st == "newspaper" -> "article-newspaper"
+ | otherwise -> "article-journal"
+ "phdthesis" -> "thesis"
+ "proceedings" -> "book"
+ "reference" -> "book"
+ "report" -> "report"
+ "software" -> "book" -- no "software" type in CSL
+ "suppbook" -> "chapter"
+ "suppcollection" -> "chapter"
+ "suppperiodical"
+ | st == "magazine" -> "article-magazine"
+ | st == "newspaper" -> "article-newspaper"
+ | otherwise -> "article-journal"
+ "techreport" -> "report"
+ "thesis" -> "thesis"
+ "unpublished" -> if isEvent then "speech" else "manuscript"
+ "www" -> "webpage"
+ -- biblatex, "unsupported"
+ "artwork" -> "graphic"
+ "audio" -> "song" -- for audio *recordings*
+ "commentary" -> "book"
+ "image" -> "graphic" -- or "figure" ?
+ "jurisdiction" -> "legal_case"
+ "legislation" -> "legislation" -- or "bill" ?
+ "legal" -> "treaty"
+ "letter" -> "personal_communication"
+ "movie" -> "motion_picture"
+ "music" -> "song" -- for musical *recordings*
+ "performance" -> "speech"
+ "review" -> "review" -- or "review-book" ?
+ "standard" -> "legislation"
+ "video" -> "motion_picture"
+ -- biblatex-apa:
+ "data" -> "dataset"
+ "letters" -> "personal_communication"
+ "newsarticle" -> "article-newspaper"
+ _ -> ""
+
+ let refgenre =
+ case et of
+ "mastersthesis" -> if T.null reftype'
+ then Just $ resolveKey' lang "mathesis"
+ else Just reftype'
+ "phdthesis" -> if T.null reftype'
+ then Just $ resolveKey' lang "phdthesis"
+ else Just reftype'
+ _ -> if T.null reftype'
+ then Nothing
+ else Just reftype'
+ return (reftype, refgenre)
+
+
+-- transformKey source target key
+-- derived from Appendix C of bibtex manual
+transformKey :: Text -> Text -> Text -> [Text]
+transformKey _ _ "ids" = []
+transformKey _ _ "crossref" = []
+transformKey _ _ "xref" = []
+transformKey _ _ "entryset" = []
+transformKey _ _ "entrysubtype" = []
+transformKey _ _ "execute" = []
+transformKey _ _ "label" = []
+transformKey _ _ "options" = []
+transformKey _ _ "presort" = []
+transformKey _ _ "related" = []
+transformKey _ _ "relatedoptions" = []
+transformKey _ _ "relatedstring" = []
+transformKey _ _ "relatedtype" = []
+transformKey _ _ "shorthand" = []
+transformKey _ _ "shorthandintro" = []
+transformKey _ _ "sortkey" = []
+transformKey x y "author"
+ | x `elem` ["mvbook", "book"] &&
+ y `elem` ["inbook", "bookinbook", "suppbook"] = ["bookauthor", "author"]
+-- note: this next clause is not in the biblatex manual, but it makes
+-- sense in the context of CSL conversion:
+transformKey x y "author"
+ | x == "mvbook" && y == "book" = ["bookauthor", "author"]
+transformKey "mvbook" y z
+ | y `elem` ["book", "inbook", "bookinbook", "suppbook"] = standardTrans z
+transformKey x y z
+ | x `elem` ["mvcollection", "mvreference"] &&
+ y `elem` ["collection", "reference", "incollection", "inreference",
+ "suppcollection"] = standardTrans z
+transformKey "mvproceedings" y z
+ | y `elem` ["proceedings", "inproceedings"] = standardTrans z
+transformKey "book" y z
+ | y `elem` ["inbook", "bookinbook", "suppbook"] = bookTrans z
+transformKey x y z
+ | x `elem` ["collection", "reference"] &&
+ y `elem` ["incollection", "inreference", "suppcollection"] = bookTrans z
+transformKey "proceedings" "inproceedings" z = bookTrans z
+transformKey "periodical" y z
+ | y `elem` ["article", "suppperiodical"] =
+ case z of
+ "title" -> ["journaltitle"]
+ "subtitle" -> ["journalsubtitle"]
+ "shorttitle" -> []
+ "sorttitle" -> []
+ "indextitle" -> []
+ "indexsorttitle" -> []
+ _ -> [z]
+transformKey _ _ x = [x]
+
+standardTrans :: Text -> [Text]
+standardTrans z =
+ case z of
+ "title" -> ["maintitle"]
+ "subtitle" -> ["mainsubtitle"]
+ "titleaddon" -> ["maintitleaddon"]
+ "shorttitle" -> []
+ "sorttitle" -> []
+ "indextitle" -> []
+ "indexsorttitle" -> []
+ _ -> [z]
+
+bookTrans :: Text -> [Text]
+bookTrans z =
+ case z of
+ "title" -> ["booktitle"]
+ "subtitle" -> ["booksubtitle"]
+ "titleaddon" -> ["booktitleaddon"]
+ "shorttitle" -> []
+ "sorttitle" -> []
+ "indextitle" -> []
+ "indexsorttitle" -> []
+ _ -> [z]
+
+resolveKey :: Lang -> Inlines -> Inlines
+resolveKey lang ils = Walk.walk go ils
+ where go (Str s) = Str $ resolveKey' lang s
+ go x = x
+
+resolveKey' :: Lang -> Text -> Text
+resolveKey' lang@(Lang l _) k =
+ case Map.lookup l biblatexStringMap >>= Map.lookup (T.toLower k) of
+ Nothing -> k
+ Just (x, _) -> either (const k) stringify $ parseLaTeX lang x
+
+convertEnDash :: Inline -> Inline
+convertEnDash (Str s) = Str (T.map (\c -> if c == '–' then '-' else c) s)
+convertEnDash x = x
diff --git a/src/Text/Pandoc/Citeproc/CslJson.hs b/src/Text/Pandoc/Citeproc/CslJson.hs
new file mode 100644
index 000000000..862af5188
--- /dev/null
+++ b/src/Text/Pandoc/Citeproc/CslJson.hs
@@ -0,0 +1,37 @@
+{-# LANGUAGE OverloadedStrings #-}
+module Text.Pandoc.Citeproc.CslJson
+ ( cslJsonToReferences )
+where
+
+import Citeproc.CslJson
+import Citeproc.Types
+import Control.Monad.Identity (runIdentity)
+import Data.Aeson (eitherDecodeStrict')
+import Data.ByteString (ByteString)
+import Text.Pandoc.Builder as B
+import Data.Text (Text)
+
+fromCslJson :: CslJson Text -> Inlines
+fromCslJson (CslText t) = B.text t
+fromCslJson CslEmpty = mempty
+fromCslJson (CslConcat x y) = fromCslJson x <> fromCslJson y
+fromCslJson (CslQuoted x) = B.doubleQuoted (fromCslJson x)
+fromCslJson (CslItalic x) = B.emph (fromCslJson x)
+fromCslJson (CslNormal x) = fromCslJson x -- TODO?
+fromCslJson (CslBold x) = B.strong (fromCslJson x)
+fromCslJson (CslUnderline x) = B.underline (fromCslJson x)
+fromCslJson (CslNoDecoration x) =
+ B.spanWith ("",["nodecoration"],[]) (fromCslJson x)
+fromCslJson (CslSmallCaps x) = B.smallcaps (fromCslJson x)
+fromCslJson (CslBaseline x) = fromCslJson x
+fromCslJson (CslSub x) = B.subscript (fromCslJson x)
+fromCslJson (CslSup x) = B.superscript (fromCslJson x)
+fromCslJson (CslNoCase x) = B.spanWith ("",["nocase"],[]) (fromCslJson x)
+fromCslJson (CslDiv t x) = B.spanWith ("",["csl-" <> t],[]) (fromCslJson x)
+
+cslJsonToReferences :: ByteString -> Either String [Reference Inlines]
+cslJsonToReferences raw =
+ case eitherDecodeStrict' raw of
+ Left e -> Left e
+ Right cslrefs -> Right $
+ map (runIdentity . traverse (return . fromCslJson)) cslrefs
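
For orientation (an illustrative sketch, not code from this commit):
cslJsonToReferences turns raw CSL JSON bytes into citeproc Reference values
carrying pandoc Inlines. Something like the following could drive it; note
that the commit message describes the Text.Pandoc.Citeproc.* helpers as
unexported modules, so in practice this code would live inside pandoc
itself (external callers would go through readCslJson/processCitations).
The file name is a placeholder.

    import qualified Data.ByteString as BS
    import qualified Data.Text.IO as TIO
    import Citeproc.Types (Reference (..), ItemId (..))
    import Text.Pandoc.Citeproc.CslJson (cslJsonToReferences)

    main :: IO ()
    main = do
      raw <- BS.readFile "refs.json"   -- placeholder CSL JSON bibliography
      case cslJsonToReferences raw of
        Left err   -> putStrLn ("parse error: " ++ err)
        Right refs ->
          -- print the id of each parsed reference
          mapM_ (\r -> let ItemId t = referenceId r in TIO.putStrLn t) refs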
diff --git a/src/Text/Pandoc/Citeproc/Data.hs b/src/Text/Pandoc/Citeproc/Data.hs
new file mode 100644
index 000000000..dfdaf2598
--- /dev/null
+++ b/src/Text/Pandoc/Citeproc/Data.hs
@@ -0,0 +1,31 @@
+{-# LANGUAGE OverloadedStrings #-}
+{-# LANGUAGE TemplateHaskell #-}
+module Text.Pandoc.Citeproc.Data
+ (biblatexStringMap)
+where
+import Data.FileEmbed
+import Data.ByteString (ByteString)
+import qualified Data.Map as M
+import qualified Data.Text.Encoding as TE
+import qualified Data.Text as T
+import Data.Text (Text)
+import Text.Pandoc.Citeproc.Util (toIETF)
+import Citeproc (Lang(..), parseLang)
+
+biblatexLocalizations :: [(FilePath, ByteString)]
+biblatexLocalizations = $(embedDir "citeproc/biblatex-localization")
+
+-- biblatex localization keys, from files at
+-- http://github.com/plk/biblatex/tree/master/tex/latex/biblatex/lbx
+biblatexStringMap :: M.Map Text (M.Map Text (Text, Text))
+biblatexStringMap = foldr go mempty biblatexLocalizations
+ where
+ go (fp, bs) =
+ let Lang lang _ = parseLang (toIETF $ T.takeWhile (/= '.') $ T.pack fp)
+ ls = T.lines $ TE.decodeUtf8 bs
+ in if length ls > 4
+ then M.insert lang (toStringMap $ map (T.splitOn "|") ls)
+ else id
+ toStringMap = foldr go' mempty
+ go' [term, x, y] = M.insert term (x, y)
+ go' _ = id
diff --git a/src/Text/Pandoc/Citeproc/Locator.hs b/src/Text/Pandoc/Citeproc/Locator.hs
new file mode 100644
index 000000000..dba762c02
--- /dev/null
+++ b/src/Text/Pandoc/Citeproc/Locator.hs
@@ -0,0 +1,279 @@
+{-# LANGUAGE TupleSections #-}
+{-# LANGUAGE ViewPatterns #-}
+{-# LANGUAGE OverloadedStrings #-}
+module Text.Pandoc.Citeproc.Locator
+ ( parseLocator )
+where
+import Citeproc.Types
+import Data.Text (Text)
+import qualified Data.Text as T
+import Text.Parsec
+import Text.Pandoc.Definition
+import Text.Pandoc.Parsing (romanNumeral)
+import Text.Pandoc.Shared (stringify)
+import Control.Monad (mzero)
+import qualified Data.Map as M
+import Data.Char (isSpace, isPunctuation, isDigit)
+
+parseLocator :: Locale -> [Inline] -> (Maybe (Text, Text), [Inline])
+parseLocator locale inp =
+ case parse (pLocatorWords (toLocatorMap locale)) "suffix" $ splitInp inp of
+ Right r -> r
+ Left _ -> (Nothing, inp)
+
+splitInp :: [Inline] -> [Inline]
+splitInp = splitStrWhen (\c -> isSpace c || (isPunctuation c && c /= ':'))
+
+--
+-- Locator parsing
+--
+
+type LocatorParser = Parsec [Inline] ()
+
+pLocatorWords :: LocatorMap
+ -> LocatorParser (Maybe (Text, Text), [Inline])
+pLocatorWords locMap = do
+ optional $ pMatchChar "," (== ',')
+ optional pSpace
+ (la, lo) <- pLocatorDelimited locMap <|> pLocatorIntegrated locMap
+ s <- getInput -- rest is suffix
+ -- need to trim, otherwise "p. 9" and "9" will have 'different' locators later on
+ -- i.e. the first one will be " 9"
+ return $
+ if T.null la && T.null lo
+ then (Nothing, s)
+ else (Just (la, T.strip lo), s)
+
+pLocatorDelimited :: LocatorMap -> LocatorParser (Text, Text)
+pLocatorDelimited locMap = try $ do
+ _ <- pMatchChar "{" (== '{')
+ skipMany pSpace -- gobble pre-spaces so label doesn't try to include them
+ (la, _) <- pLocatorLabelDelimited locMap
+ -- we only care about balancing {} and [] (because of the outer [] scope);
+ -- the rest can be anything
+ let inner = do { t <- anyToken; return (True, stringify t) }
+ gs <- many (pBalancedBraces [('{','}'), ('[',']')] inner)
+ _ <- pMatchChar "}" (== '}')
+ let lo = T.concat $ map snd gs
+ return (la, lo)
+
+pLocatorLabelDelimited :: LocatorMap -> LocatorParser (Text, Bool)
+pLocatorLabelDelimited locMap
+ = pLocatorLabel' locMap lim <|> return ("page", True)
+ where
+ lim = stringify <$> anyToken
+
+pLocatorIntegrated :: LocatorMap -> LocatorParser (Text, Text)
+pLocatorIntegrated locMap = try $ do
+ (la, wasImplicit) <- pLocatorLabelIntegrated locMap
+ -- if we got the label implicitly, we have presupposed the first one is
+ -- going to have a digit, so guarantee that. You _can_ have p. (a)
+ -- because you specified it.
+ let modifier = if wasImplicit
+ then requireDigits
+ else requireRomansOrDigits
+ g <- try $ pLocatorWordIntegrated (not wasImplicit) >>= modifier
+ gs <- many (try $ pLocatorWordIntegrated False >>= modifier)
+ let lo = T.concat (g:gs)
+ return (la, lo)
+
+pLocatorLabelIntegrated :: LocatorMap -> LocatorParser (Text, Bool)
+pLocatorLabelIntegrated locMap
+ = pLocatorLabel' locMap lim <|> (lookAhead digital >> return ("page", True))
+ where
+ lim = try $ pLocatorWordIntegrated True >>= requireRomansOrDigits
+ digital = try $ pLocatorWordIntegrated True >>= requireDigits
+
+pLocatorLabel' :: LocatorMap -> LocatorParser Text
+ -> LocatorParser (Text, Bool)
+pLocatorLabel' locMap lim = go ""
+ where
+ -- grow the match string until we hit the end
+ -- trying to find the largest match for a label
+ go acc = try $ do
+ -- advance at least one token each time
+ -- the pathological case is "p.3"
+ t <- anyToken
+ ts <- manyTill anyToken (try $ lookAhead lim)
+ let s = acc <> stringify (t:ts)
+ case M.lookup (T.strip s) locMap of
+ -- try to find a longer one, or return this one
+ Just l -> go s <|> return (l, False)
+ Nothing -> go s
+
+-- hard requirement for a locator to have some real digits in it
+requireDigits :: (Bool, Text) -> LocatorParser Text
+requireDigits (_, s) = if not (T.any isDigit s)
+ then Prelude.fail "requireDigits"
+ else return s
+
+-- soft requirement for a sequence with some roman or arabic parts
+-- (a)(iv) -- because iv is roman
+-- 1(a) -- because 1 is an actual digit
+-- NOT: a, (a)-(b), hello, (some text in brackets)
+requireRomansOrDigits :: (Bool, Text) -> LocatorParser Text
+requireRomansOrDigits (d, s) = if not d
+ then Prelude.fail "requireRomansOrDigits"
+ else return s
+
+pLocatorWordIntegrated :: Bool -> LocatorParser (Bool, Text)
+pLocatorWordIntegrated isFirst = try $ do
+ punct <- if isFirst
+ then return ""
+ else (stringify <$> pLocatorSep) <|> return ""
+ sp <- option "" (pSpace >> return " ")
+ (dig, s) <- pBalancedBraces [('(',')'), ('[',']'), ('{','}')] pPageSeq
+ return (dig, punct <> sp <> s)
+
+-- we want to capture: 123, 123A, C22, XVII, 33-44, 22-33; 22-11
+-- 34(1), 34A(A), 34(1)(i)(i), (1)(a)
+-- [17], [17]-[18], '591 [84]'
+-- (because CSL cannot pull out individual pages/sections
+-- to wrap in braces on a per-style basis)
+pBalancedBraces :: [(Char, Char)]
+ -> LocatorParser (Bool, Text)
+ -> LocatorParser (Bool, Text)
+pBalancedBraces braces p = try $ do
+ ss <- many1 surround
+ return $ anyWereDigitLike ss
+ where
+ except = notFollowedBy pBraces >> p
+ -- outer and inner
+ surround = foldl (\a (open, close) -> sur open close except <|> a)
+ except
+ braces
+
+ isc c = stringify <$> pMatchChar [c] (== c)
+
+ sur c c' m = try $ do
+ (d, mid) <- between (isc c) (isc c') (option (False, "") m)
+ return (d, T.cons c . flip T.snoc c' $ mid)
+
+ flattened = concatMap (\(o, c) -> [o, c]) braces
+ pBraces = pMatchChar "braces" (`elem` flattened)
+
+
+-- YES 1, 1.2, 1.2.3
+-- NO 1., 1.2. a.6
+-- can't use sepBy because we want to leave trailing .s
+pPageSeq :: LocatorParser (Bool, Text)
+pPageSeq = oneDotTwo <|> withPeriod
+ where
+ oneDotTwo = do
+ u <- pPageUnit
+ us <- many withPeriod
+ return $ anyWereDigitLike (u:us)
+ withPeriod = try $ do
+ -- .2
+ p <- pMatchChar "." (== '.')
+ u <- try pPageUnit
+ return (fst u, stringify p <> snd u)
+
+anyWereDigitLike :: [(Bool, Text)] -> (Bool, Text)
+anyWereDigitLike as = (any fst as, T.concat $ map snd as)
+
+pPageUnit :: LocatorParser (Bool, Text)
+pPageUnit = roman <|> plainUnit
+ where
+ -- roman is a 'digit'
+ roman = (True,) <$> pRoman
+ plainUnit = do
+ ts <- many1 (notFollowedBy pSpace >>
+ notFollowedBy pLocatorPunct >>
+ anyToken)
+ let s = stringify ts
+ -- otherwise look for actual digits or -s
+ return (T.any isDigit s, s)
+
+pRoman :: LocatorParser Text
+pRoman = try $ do
+ tok <- anyToken
+ case tok of
+ Str t -> case parse (romanNumeral True *> eof)
+ "roman numeral" (T.toUpper t) of
+ Left _ -> mzero
+ Right () -> return t
+ _ -> mzero
+
+pLocatorPunct :: LocatorParser Inline
+pLocatorPunct = pMatchChar "punctuation" isLocatorPunct
+
+pLocatorSep :: LocatorParser Inline
+pLocatorSep = pMatchChar "locator separator" isLocatorSep
+
+pMatchChar :: String -> (Char -> Bool) -> LocatorParser Inline
+pMatchChar msg f = satisfyTok f' <?> msg
+ where
+ f' (Str (T.unpack -> [c])) = f c
+ f' _ = False
+
+pSpace :: LocatorParser Inline
+pSpace = satisfyTok (\t -> isSpacey t || t == Str "\160") <?> "space"
+
+satisfyTok :: (Inline -> Bool) -> LocatorParser Inline
+satisfyTok f = tokenPrim show (\sp _ _ -> sp) (\tok -> if f tok
+ then Just tok
+ else Nothing)
+
+isSpacey :: Inline -> Bool
+isSpacey Space = True
+isSpacey SoftBreak = True
+isSpacey _ = False
+
+isLocatorPunct :: Char -> Bool
+isLocatorPunct '-' = False -- page range
+isLocatorPunct '–' = False -- page range, en dash
+isLocatorPunct ':' = False -- vol:page-range hack
+isLocatorPunct c = isPunctuation c -- includes [{()}]
+
+isLocatorSep :: Char -> Bool
+isLocatorSep ',' = True
+isLocatorSep ';' = True
+isLocatorSep _ = False
+
+splitStrWhen :: (Char -> Bool) -> [Inline] -> [Inline]
+splitStrWhen _ [] = []
+splitStrWhen p (Str xs : ys) = go (T.unpack xs) ++ splitStrWhen p ys
+ where
+ go [] = []
+ go s = case break p s of
+ ([],[]) -> []
+ (zs,[]) -> [Str $ T.pack zs]
+ ([],w:ws) -> Str (T.singleton w) : go ws
+ (zs,w:ws) -> Str (T.pack zs) : Str (T.singleton w) : go ws
+splitStrWhen p (x : ys) = x : splitStrWhen p ys
+
+--
+-- Locator Map
+--
+
+type LocatorMap = M.Map Text Text
+
+toLocatorMap :: Locale -> LocatorMap
+toLocatorMap locale =
+ foldr go mempty locatorTerms
+ where
+ go tname locmap =
+ case M.lookup tname (localeTerms locale) of
+ Nothing -> locmap
+ Just ts -> foldr (\x -> M.insert (snd x) tname) locmap ts
+
+locatorTerms :: [Text]
+locatorTerms =
+ [ "book"
+ , "chapter"
+ , "column"
+ , "figure"
+ , "folio"
+ , "issue"
+ , "line"
+ , "note"
+ , "opus"
+ , "page"
+ , "number-of-pages"
+ , "paragraph"
+ , "part"
+ , "section"
+ , "sub verbo"
+ , "verse"
+ , "volume" ]
diff --git a/src/Text/Pandoc/Citeproc/MetaValue.hs b/src/Text/Pandoc/Citeproc/MetaValue.hs
new file mode 100644
index 000000000..53b14f904
--- /dev/null
+++ b/src/Text/Pandoc/Citeproc/MetaValue.hs
@@ -0,0 +1,252 @@
+{-# LANGUAGE OverloadedStrings #-}
+module Text.Pandoc.Citeproc.MetaValue
+ ( referenceToMetaValue
+ , metaValueToReference
+ , metaValueToText
+ , metaValueToPath
+ )
+where
+
+import Citeproc.Types
+import Text.Pandoc.Definition
+import Text.Pandoc.Builder as B
+import Text.Pandoc.Walk (query)
+import Text.Pandoc.Shared (stringify)
+import Data.Maybe
+import Safe
+import qualified Data.Set as Set
+import qualified Data.Map as M
+import qualified Data.Text as T
+import Data.Text (Text)
+import Text.Printf (printf)
+import Control.Applicative ((<|>))
+
+metaValueToText :: MetaValue -> Maybe Text
+metaValueToText (MetaString t) = Just t
+metaValueToText (MetaInlines ils) = Just $ stringify ils
+metaValueToText (MetaBlocks bls) = Just $ stringify bls
+metaValueToText (MetaList xs) = T.unwords <$> mapM metaValueToText xs
+metaValueToText _ = Nothing
+
+metaValueToPath :: MetaValue -> Maybe FilePath
+metaValueToPath = fmap T.unpack . metaValueToText
+
+metaValueToBool :: MetaValue -> Maybe Bool
+metaValueToBool (MetaBool b) = Just b
+metaValueToBool (MetaString "true") = Just True
+metaValueToBool (MetaString "false") = Just False
+metaValueToBool (MetaInlines ils) =
+ metaValueToBool (MetaString (stringify ils))
+metaValueToBool _ = Nothing
+
+referenceToMetaValue :: Reference Inlines -> MetaValue
+referenceToMetaValue ref =
+ let ItemId id' = referenceId ref
+ type' = referenceType ref
+ in MetaMap $ M.insert "id" (MetaString id')
+ $ M.insert "type" (MetaString type')
+ $ M.map valToMetaValue
+ $ M.mapKeys fromVariable
+ $ referenceVariables ref
+
+
+valToMetaValue :: Val Inlines -> MetaValue
+valToMetaValue (TextVal t) = MetaString t
+valToMetaValue (FancyVal ils) = MetaInlines (B.toList ils)
+valToMetaValue (NumVal n) = MetaString (T.pack $ show n)
+valToMetaValue (NamesVal ns) = MetaList $ map nameToMetaValue ns
+valToMetaValue (DateVal d) = dateToMetaValue d
+
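+-- | Convert a structured name to a MetaMap, inserting only the fields
+-- that are actually present.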
+nameToMetaValue :: Name -> MetaValue
+nameToMetaValue name =
+ MetaMap $
+ (maybe id (M.insert "family" . MetaString) (nameFamily name)) .
+ (maybe id (M.insert "given" . MetaString) (nameGiven name)) .
+ (maybe id (M.insert "dropping-particle" . MetaString)
+ (nameDroppingParticle name)) .
+ (maybe id (M.insert "non-dropping-particle" . MetaString)
+ (nameNonDroppingParticle name)) .
+ (maybe id (M.insert "suffix" . MetaString) (nameSuffix name)) .
+ (maybe id (M.insert "literal" . MetaString) (nameLiteral name)) .
+ (if nameCommaSuffix name
+ then M.insert "comma-suffix" (MetaBool True)
+ else id) .
+ (if nameStaticOrdering name
+ then M.insert "static-ordering" (MetaBool True)
+ else id)
+ $ mempty
+
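+-- | Render a Date as an EDTF string, e.g. DateParts [2020,9,6] becomes
+-- "2020-09-06"; a trailing "~" marks a circa date, and years outside the
+-- four-digit range get a "y" prefix.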
+dateToMetaValue :: Date -> MetaValue
+dateToMetaValue date =
+ MetaString $
+ (case dateLiteral date of
+ Just l -> l
+ Nothing -> T.intercalate "/" $ map datePartsToEDTF $ dateParts date)
+ <> (if dateCirca date then "~" else "")
+ where
+ datePartsToEDTF (DateParts dps) =
+ T.pack $
+ (case dps of
+ (y:_) | y > 9999 || y < -10000 -> ('y':)
+ _ -> id) $
+ case dps of
+ (y:m:d:_)
+ | y < -1 -> printf "%05d-%02d-%02d" (y+1) m d
+ | otherwise -> printf "%04d-%02d-%02d" y m d
+ (y:m:[])
+ | y < -1 -> printf "%05d-%02d" (y+1) m
+ | otherwise -> printf "%04d-%02d" y m
+ (y:[])
+ | y < -1 -> printf "%05d" (y+1)
+ | otherwise -> printf "%04d" y
+ _ -> mempty
+
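+-- | Read a Reference back out of a MetaMap; returns Nothing if the map
+-- has no usable "id" field.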
+metaValueToReference :: MetaValue -> Maybe (Reference Inlines)
+metaValueToReference (MetaMap m) = do
+ let m' = M.mapKeys normalizeKey m
+ id' <- M.lookup "id" m' >>= metaValueToText
+ type' <- (M.lookup "type" m' >>= metaValueToText) <|> pure ""
+ let m'' = M.delete "id" $ M.delete "type" m'
+ let vars = M.mapKeys toVariable $ M.mapWithKey metaValueToVal m''
+ return $ Reference { referenceId = ItemId id'
+ , referenceType = type'
+ , referenceDisambiguation = Nothing
+ , referenceVariables = vars }
+metaValueToReference _ = Nothing
+
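+-- | Convert a metadata value to a citeproc Val, dispatching on the
+-- variable name: date variables become DateVal, name variables NamesVal,
+-- and everything else plain text or formatted inlines.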
+metaValueToVal :: Text -> MetaValue -> Val Inlines
+metaValueToVal k v
+ | k `Set.member` dateVariables
+ = DateVal $ metaValueToDate v
+ | k `Set.member` nameVariables
+ = NamesVal $ metaValueToNames v
+ | k == "other-ids"
+ = TextVal $ fromMaybe mempty $ metaValueToText v
+ -- will create space-separated list
+ | otherwise =
+ case v of
+ MetaString t -> TextVal t
+ MetaInlines ils -> FancyVal (B.fromList ils)
+ MetaBlocks bs -> FancyVal (B.fromList $ query id bs)
+ MetaBool b -> TextVal (if b then "true" else "false")
+ MetaList _ -> TextVal mempty
+ MetaMap _ -> TextVal mempty
+
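+-- | Accept a date given as a map (date-parts/circa/season/literal), as a
+-- bare list of date-parts, or as a raw EDTF string.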
+metaValueToDate :: MetaValue -> Date
+metaValueToDate (MetaMap m) =
+ Date
+ { dateParts = dateparts
+ , dateCirca = circa
+ , dateSeason = season
+ , dateLiteral = literal }
+ where
+ dateparts = case M.lookup "date-parts" m of
+ Just (MetaList xs) ->
+ mapMaybe metaValueToDateParts xs
+ Just _ -> []
+ Nothing ->
+ maybe [] (:[]) $ metaValueToDateParts (MetaMap m)
+ circa = fromMaybe False $
+ M.lookup "circa" m >>= metaValueToBool
+ season = M.lookup "season" m >>= metaValueToInt
+ literal = M.lookup "literal" m >>= metaValueToText
+metaValueToDate (MetaList xs) =
+ Date{ dateParts = mapMaybe metaValueToDateParts xs
+ , dateCirca = False
+ , dateSeason = Nothing
+ , dateLiteral = Nothing }
+metaValueToDate x =
+ fromMaybe emptyDate $ metaValueToText x >>= rawDateEDTF
+
+
+metaValueToInt :: MetaValue -> Maybe Int
+metaValueToInt x = metaValueToText x >>= readMay . T.unpack
+
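+-- | A date may be given as a list of integers or as a year/month/day map;
+-- a "season" is encoded as month 20 + n, following CSL/EDTF usage.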
+metaValueToDateParts :: MetaValue -> Maybe DateParts
+metaValueToDateParts (MetaList xs) =
+ Just $ DateParts $ map (fromMaybe 0 . metaValueToInt) xs
+metaValueToDateParts (MetaMap m) =
+ case (M.lookup "year" m >>= metaValueToInt,
+ ((M.lookup "month" m >>= metaValueToInt)
+ <|>
+ ((+ 20) <$> (M.lookup "season" m >>= metaValueToInt))),
+ M.lookup "day" m >>= metaValueToInt) of
+ (Just y, Just mo, Just d) -> Just $ DateParts [y, mo, d]
+ (Just y, Just mo, Nothing) -> Just $ DateParts [y, mo]
+ (Just y, Nothing, _) -> Just $ DateParts [y]
+ _ -> Nothing
+metaValueToDateParts _ = Nothing
+
+emptyDate :: Date
+emptyDate = Date { dateParts = []
+ , dateCirca = False
+ , dateSeason = Nothing
+ , dateLiteral = Nothing }
+
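+-- | A names field may hold a list of names or a single name.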
+metaValueToNames :: MetaValue -> [Name]
+metaValueToNames (MetaList xs) = mapMaybe metaValueToName xs
+metaValueToNames x = maybeToList $ metaValueToName x
+
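+-- | A name may be a map of CSL name fields or a plain string, which is
+-- treated as a literal name.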
+metaValueToName :: MetaValue -> Maybe Name
+metaValueToName (MetaMap m) = extractParticles <$>
+ Just Name
+ { nameFamily = family
+ , nameGiven = given
+ , nameDroppingParticle = dropping
+ , nameNonDroppingParticle = nondropping
+ , nameSuffix = suffix
+ , nameCommaSuffix = commasuffix
+ , nameStaticOrdering = staticordering
+ , nameLiteral = literal
+ }
+ where
+ family = M.lookup "family" m >>= metaValueToText
+ given = M.lookup "given" m >>= metaValueToText
+ dropping = M.lookup "dropping-particle" m
+ >>= metaValueToText
+ nondropping = M.lookup "non-dropping-particle" m
+ >>= metaValueToText
+ suffix = M.lookup "suffix" m >>= metaValueToText
+ commasuffix = fromMaybe False $
+ M.lookup "comma-suffix" m >>= metaValueToBool
+ staticordering = fromMaybe False $
+ M.lookup "static-ordering" m >>= metaValueToBool
+ literal = M.lookup "literal" m >>= metaValueToText
+metaValueToName x = extractParticles <$>
+ case metaValueToText x of
+ Nothing -> Nothing
+ Just lit -> Just Name
+ { nameFamily = Nothing
+ , nameGiven = Nothing
+ , nameDroppingParticle = Nothing
+ , nameNonDroppingParticle = Nothing
+ , nameSuffix = Nothing
+ , nameCommaSuffix = False
+ , nameStaticOrdering = False
+ , nameLiteral = Just lit }
+
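+-- | Metadata keys whose values are parsed as dates.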
+dateVariables :: Set.Set Text
+dateVariables = Set.fromList
+ [ "accessed", "container", "event-date", "issued",
+ "original-date", "submitted" ]
+
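+-- | Metadata keys whose values are parsed as lists of names.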
+nameVariables :: Set.Set Text
+nameVariables = Set.fromList
+ [ "author", "collection-editor", "composer",
+ "container-author", "director", "editor",
+ "editorial-director", "illustrator",
+ "interviewer", "original-author",
+ "recipient", "reviewed-author",
+ "translator" ]
+
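+-- | Lowercase keys, but keep conventionally uppercase CSL variables
+-- (DOI, ISBN, ...) in their canonical form.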
+normalizeKey :: Text -> Text
+normalizeKey k =
+ case T.toLower k of
+ "doi" -> "DOI"
+ "isbn" -> "ISBN"
+ "issn" -> "ISSN"
+ "pmcid" -> "PMCID"
+ "pmid" -> "PMID"
+ "url" -> "URL"
+ x -> x
+
diff --git a/src/Text/Pandoc/Citeproc/Util.hs b/src/Text/Pandoc/Citeproc/Util.hs
new file mode 100644
index 000000000..6d8e01bc9
--- /dev/null
+++ b/src/Text/Pandoc/Citeproc/Util.hs
@@ -0,0 +1,70 @@
+{-# LANGUAGE FlexibleContexts #-}
+{-# LANGUAGE OverloadedStrings #-}
+module Text.Pandoc.Citeproc.Util
+ ( toIETF )
+where
+import Data.Text (Text)
+
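+-- | Map babel/polyglossia language names to the IETF tags used by CSL
+-- locales; unrecognized names are passed through unchanged.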
+toIETF :: Text -> Text
+toIETF "english" = "en-US" -- "en-EN" unavailable in CSL
+toIETF "usenglish" = "en-US"
+toIETF "american" = "en-US"
+toIETF "british" = "en-GB"
+toIETF "ukenglish" = "en-GB"
+toIETF "canadian" = "en-US" -- "en-CA" unavailable in CSL
+toIETF "australian" = "en-GB" -- "en-AU" unavailable in CSL
+toIETF "newzealand" = "en-GB" -- "en-NZ" unavailable in CSL
+toIETF "afrikaans" = "af-ZA"
+toIETF "arabic" = "ar"
+toIETF "basque" = "eu"
+toIETF "bulgarian" = "bg-BG"
+toIETF "catalan" = "ca-AD"
+toIETF "croatian" = "hr-HR"
+toIETF "czech" = "cs-CZ"
+toIETF "danish" = "da-DK"
+toIETF "dutch" = "nl-NL"
+toIETF "estonian" = "et-EE"
+toIETF "finnish" = "fi-FI"
+toIETF "canadien" = "fr-CA"
+toIETF "acadian" = "fr-CA"
+toIETF "french" = "fr-FR"
+toIETF "francais" = "fr-FR"
+toIETF "austrian" = "de-AT"
+toIETF "naustrian" = "de-AT"
+toIETF "german" = "de-DE"
+toIETF "germanb" = "de-DE"
+toIETF "ngerman" = "de-DE"
+toIETF "greek" = "el-GR"
+toIETF "polutonikogreek" = "el-GR"
+toIETF "hebrew" = "he-IL"
+toIETF "hungarian" = "hu-HU"
+toIETF "icelandic" = "is-IS"
+toIETF "italian" = "it-IT"
+toIETF "japanese" = "ja-JP"
+toIETF "latvian" = "lv-LV"
+toIETF "lithuanian" = "lt-LT"
+toIETF "magyar" = "hu-HU"
+toIETF "mongolian" = "mn-MN"
+toIETF "norsk" = "nb-NO"
+toIETF "nynorsk" = "nn-NO"
+toIETF "farsi" = "fa-IR"
+toIETF "polish" = "pl-PL"
+toIETF "brazil" = "pt-BR"
+toIETF "brazilian" = "pt-BR"
+toIETF "portugues" = "pt-PT"
+toIETF "portuguese" = "pt-PT"
+toIETF "romanian" = "ro-RO"
+toIETF "russian" = "ru-RU"
+toIETF "serbian" = "sr-RS"
+toIETF "serbianc" = "sr-RS"
+toIETF "slovak" = "sk-SK"
+toIETF "slovene" = "sl-SL"
+toIETF "spanish" = "es-ES"
+toIETF "swedish" = "sv-SE"
+toIETF "thai" = "th-TH"
+toIETF "turkish" = "tr-TR"
+toIETF "ukrainian" = "uk-UA"
+toIETF "vietnamese" = "vi-VN"
+toIETF "latin" = "la"
+toIETF x = x
+