path: root/src/Text/Pandoc/Citeproc
author     John MacFarlane <jgm@berkeley.edu>    2020-09-06 16:25:16 -0700
committer  John MacFarlane <jgm@berkeley.edu>    2020-09-21 10:15:50 -0700
commit     e0984a43a99231e72c02a0a716c8d0315de9abdf (patch)
tree       8531ef58c2470d372ff2427a6ae09a6284461471 /src/Text/Pandoc/Citeproc
parent     89c577befb78b32a0884b6092e0415c0dcadab72 (diff)
download   pandoc-e0984a43a99231e72c02a0a716c8d0315de9abdf.tar.gz
Add built-in citation support using new citeproc library.
This deprecates the use of the external pandoc-citeproc filter; citation
processing is now built into pandoc.

* Add dependency on citeproc library.

* Add Text.Pandoc.Citeproc module (and some associated unexported modules
  under Text.Pandoc.Citeproc). Exports `processCitations`. [API change]

* Add data files needed for Text.Pandoc.Citeproc: default.csl in the data
  directory, and a citeproc directory that is just used at compile-time.
  Note that we've added file-embed as a mandatory rather than a conditional
  dependency, because of the biblatex localization files. We might
  eventually want to use readDataFile for this, but it would take some code
  reorganization.

* Text.Pandoc.Logging: Add `CiteprocWarning` to `LogMessage` and use it in
  `processCitations`. [API change]

* Add tests from the pandoc-citeproc package as command tests (including
  some tests pandoc-citeproc did not pass).

* Remove instructions for building pandoc-citeproc from CI and release
  binary build instructions. We will no longer distribute pandoc-citeproc.

* Markdown reader: tweak abbreviation support. Don't insert a nonbreaking
  space after a potential abbreviation if it comes right before a note or
  citation. This messes up several things, including citeproc's moving of
  note citations.

* Add `csljson` as an input and output format. This allows pandoc to
  convert between `csljson` and other bibliography formats, and to generate
  formatted versions of CSL JSON bibliographies.

* Add module Text.Pandoc.Writers.CslJson, exporting `writeCslJson`.
  [API change]

* Add module Text.Pandoc.Readers.CslJson, exporting `readCslJson`.
  [API change]

* Add `bibtex` and `biblatex` as input formats. This allows pandoc to
  convert between BibLaTeX and BibTeX and other bibliography formats, and
  to generate formatted versions of BibTeX/BibLaTeX bibliographies.

* Add module Text.Pandoc.Readers.BibTeX, exporting `readBibTeX` and
  `readBibLaTeX`. [API change]

* Make "standalone" implicit if the output format is a bibliography format.
  This is needed because pandoc readers for bibliography formats put the
  bibliographic information in the `references` field of metadata; and
  unless standalone is specified, metadata gets ignored. (TODO: This needs
  improvement. We should trigger standalone for the reader when the input
  format is bibliographic, and for the writer when the output format is
  markdown.)

* Carry over `citationNoteNum` to `citationNoteNumber`. This was just
  ignored in pandoc-citeproc.

* Text.Pandoc.Filter: Add `CiteprocFilter` constructor to Filter.
  [API change] This runs the processCitations transformation. We need to
  treat it like a filter so it can be placed in the sequence of filter runs
  (after some, before others). In FromYAML, this is parsed from `citeproc`
  or `{type: citeproc}`, so this special filter may be specified either way
  in a defaults file (or by `citeproc: true`, though this gives no control
  of positioning relative to other filters). TODO: we need to add something
  to the manual section on defaults files for this.

* Add a deprecation warning if the `pandoc-citeproc` filter is used.

* Add `--citeproc/-C` option to trigger citation processing. This behaves
  like a filter and will be positioned relative to other filters as they
  appear on the command line.

* Rewrite the manual on citations, adding a dedicated Citations section
  which also includes some information formerly found in the
  pandoc-citeproc man page.

* Look for CSL styles in the `csl` subdirectory of the pandoc user data
  directory. This changes the old pandoc-citeproc behavior, which looked in
  `~/.csl`. Users can simply symlink `~/.csl` to the `csl` subdirectory of
  their pandoc user data directory if they want the old behavior.

* Add support for CSL bibliography entry formatting to the LaTeX, HTML, and
  Ms writers. Add CSL-related CSS to styles.html.
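
As a quick illustration of the user-facing pieces described above (an
editorial sketch, not part of the commit; file and filter names are
placeholders), citation processing can be triggered from the command line,
positioned among other filters in a defaults file, or used to convert
between bibliography formats:

    # command line: process citations while converting a document
    pandoc --citeproc paper.md -o paper.html    # equivalently: pandoc -C ...

    # defaults file: control where citeproc runs relative to other filters
    filters:
      - cleanup.lua        # placeholder Lua filter
      - citeproc           # may also be written as {type: citeproc}
    # (citeproc: true also works, but gives no control over positioning)

    # convert a BibLaTeX database to CSL JSON with the new reader/writer
    pandoc refs.bib -f biblatex -t csljson -o refs.json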
Diffstat (limited to 'src/Text/Pandoc/Citeproc')
-rw-r--r--   src/Text/Pandoc/Citeproc/BibTeX.hs      1237
-rw-r--r--   src/Text/Pandoc/Citeproc/CslJson.hs       37
-rw-r--r--   src/Text/Pandoc/Citeproc/Data.hs          31
-rw-r--r--   src/Text/Pandoc/Citeproc/Locator.hs      279
-rw-r--r--   src/Text/Pandoc/Citeproc/MetaValue.hs    252
-rw-r--r--   src/Text/Pandoc/Citeproc/Util.hs          70
6 files changed, 1906 insertions, 0 deletions
diff --git a/src/Text/Pandoc/Citeproc/BibTeX.hs b/src/Text/Pandoc/Citeproc/BibTeX.hs
new file mode 100644
index 000000000..5919fee77
--- /dev/null
+++ b/src/Text/Pandoc/Citeproc/BibTeX.hs
@@ -0,0 +1,1237 @@
+{-# LANGUAGE ViewPatterns #-}
+{-# LANGUAGE OverloadedStrings #-}
+{-# LANGUAGE ScopedTypeVariables #-}
+{-# LANGUAGE FlexibleContexts #-}
+{-# OPTIONS_GHC -fno-warn-unused-do-bind #-}
+-----------------------------------------------------------------------------
+-- |
+-- Module      : Text.Pandoc.Citeproc.BibTeX
+-- Copyright   : (c) John MacFarlane
+-- License     : BSD-style (see LICENSE)
+--
+-- Maintainer  : John MacFarlane <fiddlosopher@gmail.com>
+-- Stability   : unstable
+-- Portability : unportable
+--
+-----------------------------------------------------------------------------
+
+module Text.Pandoc.Citeproc.BibTeX
+ ( Variant(..)
+ , readBibtexString
+ )
+ where
+
+import Text.Pandoc.Definition
+import Text.Pandoc.Builder as B
+import Text.Pandoc.Readers.LaTeX (readLaTeX)
+import Text.Pandoc.Extensions (Extension(..), extensionsFromList)
+import Text.Pandoc.Options (ReaderOptions(..))
+import Text.Pandoc.Class (runPure)
+import Text.Pandoc.Error (PandocError)
+import Text.Pandoc.Shared (stringify)
+import qualified Text.Pandoc.Walk as Walk
+import Citeproc.Types
+import Citeproc.CaseTransform (withSentenceCase)
+import Citeproc.Pandoc (caseTransform)
+import Text.Pandoc.Citeproc.Util (toIETF)
+import Text.Pandoc.Citeproc.Data (biblatexStringMap)
+import Data.Default
+import Data.Text (Text)
+import qualified Data.Text as T
+import qualified Data.Map as Map
+import Data.Maybe
+import Text.Parsec hiding (State, many, (<|>))
+import Control.Applicative
+import Data.List.Split (splitOn, splitWhen, wordsBy)
+import Control.Monad.RWS hiding ((<>))
+import qualified Data.Sequence as Seq
+import Data.Char (isAlphaNum, isDigit, isLetter,
+ isUpper, toLower, toUpper,
+ isLower, isPunctuation)
+import Data.List (foldl', intercalate)
+import Safe (readMay)
+
+data Variant = Bibtex | Biblatex
+ deriving (Show, Eq, Ord)
+
+-- | Parse BibTeX or BibLaTeX into a list of 'Reference's.
+readBibtexString :: Variant -- ^ bibtex or biblatex
+ -> Locale -- ^ Locale
+ -> (Text -> Bool) -- ^ Filter on citation ids
+ -> Text -- ^ bibtex/biblatex text
+ -> Either ParseError [Reference Inlines]
+readBibtexString variant locale idpred contents = do
+ case runParser (((resolveCrossRefs variant <$> bibEntries) <* eof) >>=
+ mapM (itemToReference locale variant) .
+ filter (idpred . identifier))
+ (fromMaybe defaultLang $ localeLanguage locale, Map.empty)
+ "" contents of
+ Left err -> Left err
+ Right xs -> return xs
+
+defaultLang :: Lang
+defaultLang = Lang "en" (Just "US")
+
+-- a map of bibtex "string" macros
+type StringMap = Map.Map Text Text
+
+type BibParser = Parsec Text (Lang, StringMap)
+
+data Item = Item{ identifier :: Text
+ , sourcePos :: SourcePos
+ , entryType :: Text
+ , fields :: Map.Map Text Text
+ }
+ deriving (Show, Ord, Eq)
+
+itemToReference :: Locale -> Variant -> Item -> BibParser (Reference Inlines)
+itemToReference locale variant item = do
+ setPosition (sourcePos item)
+ bib item $ do
+ let lang = fromMaybe defaultLang $ localeLanguage locale
+ modify $ \st -> st{ localeLang = lang,
+ untitlecase = case lang of
+ (Lang "en" _) -> True
+ _ -> False }
+
+ id' <- asks identifier
+ otherIds <- (Just <$> getRawField "ids")
+ <|> return Nothing
+ (reftype, genre) <- getTypeAndGenre
+ -- hyphenation:
+ let getLangId = do
+ langid <- T.strip . T.toLower <$> getRawField "langid"
+ idopts <- T.strip . T.toLower . stringify <$>
+ getField "langidopts" <|> return ""
+ case (langid, idopts) of
+ ("english","variant=british") -> return "british"
+ ("english","variant=american") -> return "american"
+ ("english","variant=us") -> return "american"
+ ("english","variant=usmax") -> return "american"
+ ("english","variant=uk") -> return "british"
+ ("english","variant=australian") -> return "australian"
+ ("english","variant=newzealand") -> return "newzealand"
+ (x,_) -> return x
+ hyphenation <- (Just . toIETF . T.toLower <$>
+ (getLangId <|> getRawField "hyphenation"))
+ <|> return Nothing
+ modify $ \s -> s{ untitlecase = untitlecase s &&
+ case hyphenation of
+ Just x -> "en-" `T.isPrefixOf` x
+ _ -> True }
+
+
+ opts <- (parseOptions <$> getRawField "options") <|> return []
+
+ et <- asks entryType
+
+ -- titles
+ let isArticle = et `elem`
+ ["article", "periodical", "suppperiodical", "review"]
+ let isPeriodical = et == "periodical"
+ let isChapterlike = et `elem`
+ ["inbook","incollection","inproceedings","inreference","bookinbook"]
+
+ let getFieldMaybe f = (Just <$> getField f) <|> return Nothing
+
+ -- names
+ let getNameList' f = Just <$>
+ getNameList (("bibtex", case variant of
+ Bibtex -> "true"
+ Biblatex -> "false") : opts) f
+
+ author' <- getNameList' "author" <|> return Nothing
+ containerAuthor' <- getNameList' "bookauthor" <|> return Nothing
+ translator' <- getNameList' "translator" <|> return Nothing
+ editortype <- getRawField "editortype" <|> return mempty
+ editor'' <- getNameList' "editor" <|> return Nothing
+ director'' <- getNameList' "director" <|> return Nothing
+ let (editor', director') = case editortype of
+ "director" -> (Nothing, editor'')
+ _ -> (editor'', director'')
+ -- FIXME: add same for editora, editorb, editorc
+
+ -- dates
+ issued' <- (Just <$> (getDate "date" <|> getOldDate mempty)) <|>
+ return Nothing
+ eventDate' <- (Just <$> (getDate "eventdate" <|> getOldDate "event")) <|>
+ return Nothing
+ origDate' <- (Just <$> (getDate "origdate" <|> getOldDate "orig")) <|>
+ return Nothing
+ accessed' <- (Just <$> (getDate "urldate" <|> getOldDate "url")) <|>
+ return Nothing
+
+ -- locators
+ pages' <- getFieldMaybe "pages"
+ volume' <- getFieldMaybe "volume"
+ part' <- getFieldMaybe "part"
+ volumes' <- getFieldMaybe "volumes"
+ pagetotal' <- getFieldMaybe "pagetotal"
+ chapter' <- getFieldMaybe "chapter"
+ edition' <- getFieldMaybe "edition"
+ version' <- getFieldMaybe "version"
+ (number', collectionNumber', issue') <-
+ (getField "number" >>= \x ->
+ if et `elem` ["book","collection","proceedings","reference",
+ "mvbook","mvcollection","mvproceedings", "mvreference",
+ "bookinbook","inbook", "incollection","inproceedings",
+ "inreference", "suppbook","suppcollection"]
+ then return (Nothing, Just x, Nothing)
+ else if isArticle
+ then (getField "issue" >>= \y ->
+ return (Nothing, Nothing, Just $ concatWith ',' [x,y]))
+ <|> return (Nothing, Nothing, Just x)
+ else return (Just x, Nothing, Nothing))
+ <|> return (Nothing, Nothing, Nothing)
+
+ -- titles
+ hasMaintitle <- (True <$ getRawField "maintitle") <|> return False
+
+ title' <- Just <$>
+ ((guard isPeriodical >> getTitle "issuetitle")
+ <|> (guard hasMaintitle >>
+ guard (not isChapterlike) >>
+ getTitle "maintitle")
+ <|> getTitle "title")
+ <|> return Nothing
+
+ subtitle' <- (guard isPeriodical >> getTitle "issuesubtitle")
+ <|> (guard hasMaintitle >>
+ guard (not isChapterlike) >>
+ getTitle "mainsubtitle")
+ <|> getTitle "subtitle"
+ <|> return mempty
+ titleaddon' <- (guard hasMaintitle >>
+ guard (not isChapterlike) >>
+ getTitle "maintitleaddon")
+ <|> getTitle "titleaddon"
+ <|> return mempty
+
+ volumeTitle' <- Just <$>
+ ((guard hasMaintitle >>
+ guard (not isChapterlike) >>
+ getTitle "title")
+ <|> (guard hasMaintitle >>
+ guard isChapterlike >>
+ getTitle "booktitle"))
+ <|> return Nothing
+ volumeSubtitle' <- (guard hasMaintitle >>
+ guard (not isChapterlike) >>
+ getTitle "subtitle")
+ <|> (guard hasMaintitle >>
+ guard isChapterlike >>
+ getTitle "booksubtitle")
+ <|> return mempty
+ volumeTitleAddon' <- (guard hasMaintitle >>
+ guard (not isChapterlike) >>
+ getTitle "titleaddon")
+ <|> (guard hasMaintitle >>
+ guard isChapterlike >>
+ getTitle "booktitleaddon")
+ <|> return mempty
+
+ containerTitle' <- Just <$>
+ ((guard isPeriodical >> getPeriodicalTitle "title")
+ <|> (guard isChapterlike >> getTitle "maintitle")
+ <|> (guard isChapterlike >> getTitle "booktitle")
+ <|> getPeriodicalTitle "journaltitle"
+ <|> getPeriodicalTitle "journal")
+ <|> return Nothing
+ containerSubtitle' <- (guard isPeriodical >> getPeriodicalTitle "subtitle")
+ <|> (guard isChapterlike >> getTitle "mainsubtitle")
+ <|> (guard isChapterlike >> getTitle "booksubtitle")
+ <|> getPeriodicalTitle "journalsubtitle"
+ <|> return mempty
+ containerTitleAddon' <- (guard isPeriodical >>
+ getPeriodicalTitle "titleaddon")
+ <|> (guard isChapterlike >>
+ getTitle "maintitleaddon")
+ <|> (guard isChapterlike >>
+ getTitle "booktitleaddon")
+ <|> return mempty
+ containerTitleShort' <- Just <$>
+ ((guard isPeriodical >>
+ guard (not hasMaintitle) >>
+ getField "shorttitle")
+ <|> getPeriodicalTitle "shortjournal")
+ <|> return Nothing
+
+ -- change numerical series title to e.g. 'series 3'
+ let fixSeriesTitle [Str xs] | isNumber xs =
+ [Str (ordinalize locale xs), Space, Str (resolveKey' lang "jourser")]
+ fixSeriesTitle xs = xs
+ seriesTitle' <- (Just . B.fromList . fixSeriesTitle .
+ B.toList . resolveKey lang <$>
+ getTitle "series") <|>
+ return Nothing
+ shortTitle' <- Just <$>
+ ((guard (not hasMaintitle || isChapterlike) >>
+ getTitle "shorttitle")
+ <|> if (subtitle' /= mempty || titleaddon' /= mempty) &&
+ not hasMaintitle
+ then getShortTitle False "title"
+ else getShortTitle True "title")
+ <|> return Nothing
+
+ eventTitle' <- Just <$> getTitle "eventtitle" <|> return Nothing
+ origTitle' <- Just <$> getTitle "origtitle" <|> return Nothing
+
+ -- publisher
+ pubfields <- mapM (\f -> Just `fmap`
+ (if variant == Bibtex || f == "howpublished"
+ then getField f
+ else getLiteralList' f)
+ <|> return Nothing)
+ ["school","institution","organization", "howpublished","publisher"]
+ let publisher' = case catMaybes pubfields of
+ [] -> Nothing
+ xs -> Just $ concatWith ';' xs
+ origpublisher' <- (Just <$> getField "origpublisher") <|> return Nothing
+
+ -- places
+ venue' <- (Just <$> getField "venue") <|> return Nothing
+ address' <- Just <$>
+ (if variant == Bibtex
+ then getField "address"
+ else getLiteralList' "address"
+ <|> (guard (et /= "patent") >>
+ getLiteralList' "location"))
+ <|> return Nothing
+ origLocation' <- Just <$>
+ (if variant == Bibtex
+ then getField "origlocation"
+ else getLiteralList' "origlocation")
+ <|> return Nothing
+ jurisdiction' <- if reftype == "patent"
+ then Just <$>
+ (concatWith ';' . map (resolveKey lang) <$>
+ getLiteralList "location") <|> return Nothing
+ else return Nothing
+
+ -- url, doi, isbn, etc.:
+ -- note that with eprinttype = arxiv, we take eprint to be a partial url
+ -- archivePrefix is an alias for eprinttype
+ url' <- (guard (et == "online" || lookup "url" opts /= Just "false")
+ >> Just <$> getRawField "url")
+ <|> (do etype <- getRawField "eprinttype"
+ eprint <- getRawField "eprint"
+ let baseUrl =
+ case T.toLower etype of
+ "arxiv" -> "http://arxiv.org/abs/"
+ "jstor" -> "http://www.jstor.org/stable/"
+ "pubmed" -> "http://www.ncbi.nlm.nih.gov/pubmed/"
+ "googlebooks" -> "http://books.google.com?id="
+ _ -> ""
+ if T.null baseUrl
+ then mzero
+ else return $ Just $ baseUrl <> eprint)
+ <|> return Nothing
+ doi' <- (guard (lookup "doi" opts /= Just "false") >>
+ Just <$> getRawField "doi")
+ <|> return Nothing
+ isbn' <- Just <$> getRawField "isbn" <|> return Nothing
+ issn' <- Just <$> getRawField "issn" <|> return Nothing
+ pmid' <- Just <$> getRawField "pmid" <|> return Nothing
+ pmcid' <- Just <$> getRawField "pmcid" <|> return Nothing
+ callNumber' <- Just <$> getRawField "library" <|> return Nothing
+
+ -- notes
+ annotation' <- Just <$>
+ (getField "annotation" <|> getField "annote")
+ <|> return Nothing
+ abstract' <- Just <$> getField "abstract" <|> return Nothing
+ keywords' <- Just <$> getField "keywords" <|> return Nothing
+ note' <- if et == "periodical"
+ then return Nothing
+ else Just <$> getField "note" <|> return Nothing
+ addendum' <- if variant == Bibtex
+ then return Nothing
+ else Just <$> getField "addendum"
+ <|> return Nothing
+ pubstate' <- ( (Just . resolveKey lang <$> getField "pubstate")
+ <|> case dateLiteral <$> issued' of
+ Just (Just "forthcoming") ->
+ return $ Just $ B.str "forthcoming"
+ _ -> return Nothing
+ )
+
+
+
+
+ let addField (_, Nothing) = id
+ addField (f, Just x) = Map.insert f x
+ let vars = foldr addField mempty
+ [ ("other-ids", TextVal <$> otherIds)
+ , ("genre", TextVal <$> genre)
+ , ("language", TextVal <$> hyphenation)
+ -- dates
+ , ("accessed", DateVal <$> accessed')
+ , ("event-date", DateVal <$> eventDate')
+ , ("issued", DateVal <$> issued')
+ , ("original-date", DateVal <$> origDate')
+ -- names
+ , ("author", NamesVal <$> author')
+ , ("editor", NamesVal <$> editor')
+ , ("translator", NamesVal <$> translator')
+ , ("director", NamesVal <$> director')
+ , ("container-author", NamesVal <$> containerAuthor')
+ -- locators
+ , ("page", FancyVal . Walk.walk convertEnDash <$> pages')
+ , ("number-of-pages", FancyVal <$> pagetotal')
+ , ("volume", case (volume', part') of
+ (Nothing, Nothing) -> Nothing
+ (Just v, Nothing) -> Just $ FancyVal v
+ (Nothing, Just p) -> Just $ FancyVal p
+ (Just v, Just p) ->
+ Just $ FancyVal $ v <> B.str "." <> p)
+ , ("number-of-volumes", FancyVal <$> volumes')
+ , ("chapter-number", FancyVal <$> chapter')
+ , ("edition", FancyVal <$> edition')
+ , ("version", FancyVal <$> version')
+ , ("number", FancyVal <$> number')
+ , ("collection-number", FancyVal <$> collectionNumber')
+ , ("issue", FancyVal <$> issue')
+ -- title
+ , ("original-title", FancyVal <$> origTitle')
+ , ("event", FancyVal <$> eventTitle')
+ , ("title", case title' of
+ Just t -> Just $ FancyVal $
+ concatWith '.' [
+ concatWith ':' [t, subtitle']
+ , titleaddon' ]
+ Nothing -> Nothing)
+ , ("volume-title",
+ case volumeTitle' of
+ Just t -> Just $ FancyVal $
+ concatWith '.' [
+ concatWith ':' [t, volumeSubtitle']
+ , volumeTitleAddon' ]
+ Nothing -> Nothing)
+ , ("container-title",
+ case containerTitle' of
+ Just t -> Just $ FancyVal $
+ concatWith '.' [
+ concatWith ':' [t,
+ containerSubtitle']
+ , containerTitleAddon' ]
+ Nothing -> Nothing)
+ , ("container-title-short", FancyVal <$> containerTitleShort')
+ , ("collection-title", FancyVal <$> seriesTitle')
+ , ("title-short", FancyVal <$> shortTitle')
+ -- publisher
+ , ("publisher", FancyVal <$> publisher')
+ , ("original-publisher", FancyVal <$> origpublisher')
+ -- places
+ , ("jurisdiction", FancyVal <$> jurisdiction')
+ , ("event-place", FancyVal <$> venue')
+ , ("publisher-place", FancyVal <$> address')
+ , ("original-publisher-place", FancyVal <$> origLocation')
+ -- urls
+ , ("url", TextVal <$> url')
+ , ("doi", TextVal <$> doi')
+ , ("isbn", TextVal <$> isbn')
+ , ("issn", TextVal <$> issn')
+ , ("pmcid", TextVal <$> pmcid')
+ , ("pmid", TextVal <$> pmid')
+ , ("call-number", TextVal <$> callNumber')
+ -- notes
+ , ("note", case catMaybes [note', addendum'] of
+ [] -> Nothing
+ xs -> return $ FancyVal $ concatWith '.' xs)
+ , ("annote", FancyVal <$> annotation')
+ , ("abstract", FancyVal <$> abstract')
+ , ("keyword", FancyVal <$> keywords')
+ , ("status", FancyVal <$> pubstate')
+ ]
+ return $ Reference
+ { referenceId = ItemId id'
+ , referenceType = reftype
+ , referenceDisambiguation = Nothing
+ , referenceVariables = vars }
+
+
+bib :: Item -> Bib a -> BibParser a
+bib entry m = fst <$> evalRWST m entry (BibState True (Lang "en" (Just "US")))
+
+resolveCrossRefs :: Variant -> [Item] -> [Item]
+resolveCrossRefs variant entries =
+ map (resolveCrossRef variant entries) entries
+
+resolveCrossRef :: Variant -> [Item] -> Item -> Item
+resolveCrossRef variant entries entry =
+ Map.foldrWithKey go entry (fields entry)
+ where go key val entry' =
+ if key == "crossref" || key == "xdata"
+ then entry'{ fields = fields entry' <>
+ Map.fromList (getXrefFields variant
+ entry entries val) }
+ else entry'
+
+getXrefFields :: Variant -> Item -> [Item] -> Text -> [(Text, Text)]
+getXrefFields variant baseEntry entries keys = do
+ let keys' = splitKeys keys
+ xrefEntry <- [e | e <- entries, identifier e `elem` keys']
+ (k, v) <- Map.toList $ fields xrefEntry
+ if k == "crossref" || k == "xdata"
+ then do
+ xs <- mapM (getXrefFields variant baseEntry entries)
+ (splitKeys v)
+ (x, y) <- xs
+ guard $ isNothing $ Map.lookup x $ fields xrefEntry
+ return (x, y)
+ else do
+ k' <- case variant of
+ Bibtex -> return k
+ Biblatex -> transformKey
+ (entryType xrefEntry) (entryType baseEntry) k
+ guard $ isNothing $ Map.lookup k' $ fields baseEntry
+ return (k',v)
+
+
+
+data BibState = BibState{
+ untitlecase :: Bool
+ , localeLang :: Lang
+ }
+
+type Bib = RWST Item () BibState BibParser
+
+blocksToInlines :: [Block] -> Inlines
+blocksToInlines bs =
+ case bs of
+ [Plain xs] -> B.fromList xs
+ [Para xs] -> B.fromList xs
+ _ -> B.fromList $ Walk.query (:[]) bs
+
+adjustSpans :: Lang -> Inline -> Inline
+adjustSpans lang (RawInline (Format "latex") s)
+ | s == "\\hyphen" || s == "\\hyphen " = Str "-"
+ | otherwise = parseRawLaTeX lang s
+adjustSpans _ SoftBreak = Space
+adjustSpans _ x = x
+
+parseRawLaTeX :: Lang -> Text -> Inline
+parseRawLaTeX lang t@(T.stripPrefix "\\" -> Just xs) =
+ case parseLaTeX lang contents of
+ Right [Para ys] -> f command ys
+ Right [Plain ys] -> f command ys
+ Right [] -> f command []
+ _ -> RawInline (Format "latex") t
+ where (command', contents') = T.break (\c -> c =='{' || c =='\\') xs
+ command = T.strip command'
+ contents = T.drop 1 $ T.dropEnd 1 contents'
+ f "mkbibquote" ils = Span nullAttr [Quoted DoubleQuote ils]
+ f "mkbibemph" ils = Span nullAttr [Emph ils]
+ f "mkbibitalic" ils = Span nullAttr [Emph ils]
+ f "mkbibbold" ils = Span nullAttr [Strong ils]
+ f "mkbibparens" ils = Span nullAttr $
+ [Str "("] ++ ils ++ [Str ")"]
+ f "mkbibbrackets" ils = Span nullAttr $
+ [Str "["] ++ ils ++ [Str "]"]
+ -- ... both should be nestable & should work in year fields
+ f "autocap" ils = Span nullAttr ils
+ -- TODO: should work in year fields
+ f "textnormal" ils = Span ("",["nodecor"],[]) ils
+ f "bibstring" [Str s] = Str $ resolveKey' lang s
+ f "adddot" [] = Str "."
+ f "adddotspace" [] = Span nullAttr [Str ".", Space]
+ f "addabbrvspace" [] = Space
+ f _ ils = Span nullAttr ils
+parseRawLaTeX _ t = RawInline (Format "latex") t
+
+latex' :: Text -> Bib [Block]
+latex' t = do
+ lang <- gets localeLang
+ case parseLaTeX lang t of
+ Left _ -> mzero
+ Right bs -> return bs
+
+parseLaTeX :: Lang -> Text -> Either PandocError [Block]
+parseLaTeX lang t =
+ case runPure (readLaTeX
+ def{ readerExtensions =
+ extensionsFromList [Ext_raw_tex, Ext_smart] } t) of
+ Left e -> Left e
+ Right (Pandoc _ bs) -> Right $ Walk.walk (adjustSpans lang) bs
+
+latex :: Text -> Bib Inlines
+latex = fmap blocksToInlines . latex' . T.strip
+
+type Options = [(Text, Text)]
+
+parseOptions :: Text -> Options
+parseOptions = map breakOpt . T.splitOn ","
+ where breakOpt x = case T.break (=='=') x of
+ (w,v) -> (T.toLower $ T.strip w,
+ T.toLower $ T.strip $ T.drop 1 v)
+
+bibEntries :: BibParser [Item]
+bibEntries = do
+ skipMany nonEntry
+ many (bibItem <* skipMany nonEntry)
+ where nonEntry = bibSkip <|>
+ try (char '@' >>
+ (bibComment <|> bibPreamble <|> bibString))
+
+bibSkip :: BibParser ()
+bibSkip = skipMany1 (satisfy (/='@'))
+
+bibComment :: BibParser ()
+bibComment = do
+ cistring "comment"
+ spaces
+ void inBraces <|> bibSkip <|> return ()
+
+bibPreamble :: BibParser ()
+bibPreamble = do
+ cistring "preamble"
+ spaces
+ void inBraces
+
+bibString :: BibParser ()
+bibString = do
+ cistring "string"
+ spaces
+ char '{'
+ spaces
+ (k,v) <- entField
+ char '}'
+ updateState (\(l,m) -> (l, Map.insert k v m))
+ return ()
+
+inBraces :: BibParser Text
+inBraces = try $ do
+ char '{'
+ res <- manyTill
+ ( (T.pack <$> many1 (noneOf "{}\\"))
+ <|> (char '\\' >> ( (char '{' >> return "\\{")
+ <|> (char '}' >> return "\\}")
+ <|> return "\\"))
+ <|> (braced <$> inBraces)
+ ) (char '}')
+ return $ T.concat res
+
+braced :: Text -> Text
+braced = T.cons '{' . flip T.snoc '}'
+
+inQuotes :: BibParser Text
+inQuotes = do
+ char '"'
+ T.concat <$> manyTill
+ ( (T.pack <$> many1 (noneOf "\"\\{"))
+ <|> (char '\\' >> T.cons '\\' . T.singleton <$> anyChar)
+ <|> braced <$> inBraces
+ ) (char '"')
+
+fieldName :: BibParser Text
+fieldName = resolveAlias . T.toLower . T.pack
+ <$> many1 (letter <|> digit <|> oneOf "-_:+")
+
+isBibtexKeyChar :: Char -> Bool
+isBibtexKeyChar c =
+ isAlphaNum c || c `elem` (".:;?!`'()/*@_+=-[]*&" :: [Char])
+
+bibItem :: BibParser Item
+bibItem = do
+ char '@'
+ pos <- getPosition
+ enttype <- map toLower <$> many1 letter
+ spaces
+ char '{'
+ spaces
+ entid <- many1 (satisfy isBibtexKeyChar)
+ spaces
+ char ','
+ spaces
+ entfields <- entField `sepEndBy` (char ',' >> spaces)
+ spaces
+ char '}'
+ return $ Item (T.pack entid) pos (T.pack enttype) (Map.fromList entfields)
+
+entField :: BibParser (Text, Text)
+entField = do
+ k <- fieldName
+ spaces
+ char '='
+ spaces
+ vs <- (expandString <|> inQuotes <|> inBraces <|> rawWord) `sepBy`
+ try (spaces >> char '#' >> spaces)
+ spaces
+ return (k, T.concat vs)
+
+resolveAlias :: Text -> Text
+resolveAlias "archiveprefix" = "eprinttype"
+resolveAlias "primaryclass" = "eprintclass"
+resolveAlias s = s
+
+rawWord :: BibParser Text
+rawWord = T.pack <$> many1 alphaNum
+
+expandString :: BibParser Text
+expandString = do
+ k <- fieldName
+ (lang, strs) <- getState
+ case Map.lookup k strs of
+ Just v -> return v
+ Nothing -> return $ resolveKey' lang k
+
+cistring :: Text -> BibParser Text
+cistring s = try (go s)
+ where go t = case T.uncons t of
+ Nothing -> return ""
+ Just (c,cs) -> do
+ x <- char (toLower c) <|> char (toUpper c)
+ xs <- go cs
+ return (T.cons x xs)
+
+splitKeys :: Text -> [Text]
+splitKeys = filter (not . T.null) . T.split (\c -> c == ' ' || c == ',')
+
+-- Biblatex Localization Keys (see Biblatex manual)
+-- Currently we only map a subset likely to be used in Biblatex *databases*
+-- (in fields such as `type`, and via `\bibstring{}` commands).
+
+parseMonth :: Text -> Maybe Int
+parseMonth s =
+ case T.toLower s of
+ "jan" -> Just 1
+ "feb" -> Just 2
+ "mar" -> Just 3
+ "apr" -> Just 4
+ "may" -> Just 5
+ "jun" -> Just 6
+ "jul" -> Just 7
+ "aug" -> Just 8
+ "sep" -> Just 9
+ "oct" -> Just 10
+ "nov" -> Just 11
+ "dec" -> Just 12
+ _ -> readMay (T.unpack s)
+
+notFound :: Text -> Bib a
+notFound f = Prelude.fail $ T.unpack f ++ " not found"
+
+getField :: Text -> Bib Inlines
+getField f = do
+ fs <- asks fields
+ case Map.lookup f fs of
+ Just x -> latex x
+ Nothing -> notFound f
+
+
+getPeriodicalTitle :: Text -> Bib Inlines
+getPeriodicalTitle f = do
+ ils <- getField f
+ return ils
+
+protectCase :: (Inlines -> Inlines) -> (Inlines -> Inlines)
+protectCase f = Walk.walk unprotect . f . Walk.walk protect
+ where
+ protect (Span ("",[],[]) xs) = Span ("",["nocase"],[]) xs
+ protect x = x
+ unprotect (Span ("",["nocase"],[]) xs)
+ | hasLowercaseWord xs = Span ("",["nocase"],[]) xs
+ | otherwise = Span ("",[],[]) xs
+ unprotect x = x
+ hasLowercaseWord = any startsWithLowercase . splitStrWhen isPunctuation
+ startsWithLowercase (Str (T.uncons -> Just (x,_))) = isLower x
+ startsWithLowercase _ = False
+
+unTitlecase :: Maybe Lang -> Inlines -> Inlines
+unTitlecase mblang = protectCase (caseTransform (withSentenceCase mblang))
+
+getTitle :: Text -> Bib Inlines
+getTitle f = do
+ ils <- getField f
+ utc <- gets untitlecase
+ lang <- gets localeLang
+ let processTitle = if utc then unTitlecase (Just lang) else id
+ return $ processTitle ils
+
+getShortTitle :: Bool -> Text -> Bib Inlines
+getShortTitle requireColon f = do
+ ils <- splitStrWhen (==':') . B.toList <$> getTitle f
+ if not requireColon || containsColon ils
+ then return $ B.fromList $ upToColon ils
+ else return mempty
+
+containsColon :: [Inline] -> Bool
+containsColon xs = Str ":" `elem` xs
+
+upToColon :: [Inline] -> [Inline]
+upToColon xs = takeWhile (/= Str ":") xs
+
+isNumber :: Text -> Bool
+isNumber t = case T.uncons t of
+ Just ('-', ds) -> T.all isDigit ds
+ Just _ -> T.all isDigit t
+ Nothing -> False
+
+getDate :: Text -> Bib Date
+getDate f = do
+ -- the ~ can be used for approx dates, but the latex reader
+ -- parses this as a nonbreaking space, so we need to convert it back!
+ let nbspToTilde '\160' = '~'
+ nbspToTilde c = c
+ mbd <- rawDateEDTF . T.map nbspToTilde <$> getRawField f
+ case mbd of
+ Nothing -> Prelude.fail "expected date"
+ Just d -> return d
+
+-- A negative (BC) year might be written with -- or --- in bibtex; the
+-- LaTeX reader turns these into en/em dashes, so convert a leading dash
+-- back to an ASCII '-' so the year can be read as a negative number:
+fixLeadingDash :: Text -> Text
+fixLeadingDash t = case T.uncons t of
+ Just (c, ds) | (c == '–' || c == '—') && firstIsDigit ds -> T.cons '-' ds
+ _ -> t
+ where firstIsDigit = maybe False (isDigit . fst) . T.uncons
+
+getOldDate :: Text -> Bib Date
+getOldDate prefix = do
+ year' <- (readMay . T.unpack . fixLeadingDash . stringify
+ <$> getField (prefix <> "year")) <|> return Nothing
+ month' <- (parseMonth <$> getRawField (prefix <> "month"))
+ <|> return Nothing
+ day' <- (readMay . T.unpack <$> getRawField (prefix <> "day"))
+ <|> return Nothing
+ endyear' <- (readMay . T.unpack . fixLeadingDash . stringify
+ <$> getField (prefix <> "endyear")) <|> return Nothing
+ endmonth' <- (parseMonth . stringify
+ <$> getField (prefix <> "endmonth")) <|> return Nothing
+ endday' <- (readMay . T.unpack . stringify <$>
+ getField (prefix <> "endday")) <|> return Nothing
+ let toDateParts (y', m', d') =
+ DateParts $
+ case y' of
+ Nothing -> []
+ Just y ->
+ case m' of
+ Nothing -> [y]
+ Just m ->
+ case d' of
+ Nothing -> [y,m]
+ Just d -> [y,m,d]
+ let dateparts = filter (\x -> x /= DateParts [])
+ $ map toDateParts [(year',month',day'),
+ (endyear',endmonth',endday')]
+ literal <- if null dateparts
+ then Just <$> getRawField (prefix <> "year")
+ else return Nothing
+ return $
+ Date { dateParts = dateparts
+ , dateCirca = False
+ , dateSeason = Nothing
+ , dateLiteral = literal }
+
+getRawField :: Text -> Bib Text
+getRawField f =
+ (stringify <$> getField f)
+ <|> do fs <- asks fields
+ case Map.lookup f fs of
+ Just x -> return x
+ Nothing -> notFound f
+
+getLiteralList :: Text -> Bib [Inlines]
+getLiteralList f = do
+ fs <- asks fields
+ case Map.lookup f fs of
+ Just x -> latex' x >>= toLiteralList
+ Nothing -> notFound f
+
+-- separates items with semicolons
+getLiteralList' :: Text -> Bib Inlines
+getLiteralList' f = do
+ fs <- asks fields
+ case Map.lookup f fs of
+ Just x -> do
+ x' <- latex' x
+ case x' of
+ [Para xs] ->
+ return $ B.fromList
+ $ intercalate [Str ";", Space]
+ $ splitByAnd xs
+ [Plain xs] ->
+ return $ B.fromList
+ $ intercalate [Str ";", Space]
+ $ splitByAnd xs
+ _ -> mzero
+ Nothing -> notFound f
+
+splitByAnd :: [Inline] -> [[Inline]]
+splitByAnd = splitOn [Space, Str "and", Space]
+
+toLiteralList :: [Block] -> Bib [Inlines]
+toLiteralList [Para xs] =
+ return $ map B.fromList $ splitByAnd xs
+toLiteralList [Plain xs] = toLiteralList [Para xs]
+toLiteralList _ = mzero
+
+concatWith :: Char -> [Inlines] -> Inlines
+concatWith sep = foldl' go mempty
+ where go :: Inlines -> Inlines -> Inlines
+ go accum s
+ | s == mempty = accum
+ | otherwise =
+ case Seq.viewr (B.unMany accum) of
+ Seq.EmptyR -> s
+ _ Seq.:> Str x
+ | not (T.null x) &&
+ T.last x `elem` ("!?.,:;" :: String)
+ -> accum <> B.space <> s
+ _ -> accum <> B.str (T.singleton sep) <>
+ B.space <> s
+
+
+getNameList :: Options -> Text -> Bib [Name]
+getNameList opts f = do
+ fs <- asks fields
+ case Map.lookup f fs of
+ Just x -> latexNames opts x
+ Nothing -> notFound f
+
+toNameList :: Options -> [Block] -> Bib [Name]
+toNameList opts [Para xs] =
+ filter (/= emptyName) <$> mapM (toName opts . addSpaceAfterPeriod)
+ (splitByAnd xs)
+toNameList opts [Plain xs] = toNameList opts [Para xs]
+toNameList _ _ = mzero
+
+latexNames :: Options -> Text -> Bib [Name]
+latexNames opts t = latex' (T.strip t) >>= toNameList opts
+
+-- see issue 392 for motivation. We want to treat
+-- "J.G. Smith" and "J. G. Smith" the same.
+addSpaceAfterPeriod :: [Inline] -> [Inline]
+addSpaceAfterPeriod = go . splitStrWhen (=='.')
+ where
+ go [] = []
+ go (Str (T.unpack -> [c]):Str ".":Str (T.unpack -> [d]):xs)
+ | isLetter d
+ , isLetter c
+ , isUpper c
+ , isUpper d
+ = Str (T.singleton c):Str ".":Space:go (Str (T.singleton d):xs)
+ go (x:xs) = x:go xs
+
+emptyName :: Name
+emptyName =
+ Name { nameFamily = Nothing
+ , nameGiven = Nothing
+ , nameDroppingParticle = Nothing
+ , nameNonDroppingParticle = Nothing
+ , nameSuffix = Nothing
+ , nameLiteral = Nothing
+ , nameCommaSuffix = False
+ , nameStaticOrdering = False
+ }
+
+toName :: Options -> [Inline] -> Bib Name
+toName _ [Str "others"] =
+ return emptyName{ nameLiteral = Just "others" }
+toName _ [Span ("",[],[]) ils] = -- corporate author
+ return emptyName{ nameLiteral = Just $ stringify ils }
+ -- extended BibLaTeX name format - see #266
+toName _ ils@(Str ys:_) | T.any (== '=') ys = do
+ let commaParts = splitWhen (== Str ",")
+ . splitStrWhen (\c -> c == ',' || c == '=' || c == '\160')
+ $ ils
+ let addPart ag (Str "given" : Str "=" : xs) =
+ ag{ nameGiven = case nameGiven ag of
+ Nothing -> Just $ stringify xs
+ Just t -> Just $ t <> " " <> stringify xs }
+ addPart ag (Str "family" : Str "=" : xs) =
+ ag{ nameFamily = Just $ stringify xs }
+ addPart ag (Str "prefix" : Str "=" : xs) =
+ ag{ nameDroppingParticle = Just $ stringify xs }
+ addPart ag (Str "useprefix" : Str "=" : Str "true" : _) =
+ ag{ nameNonDroppingParticle = nameDroppingParticle ag
+ , nameDroppingParticle = Nothing }
+ addPart ag (Str "suffix" : Str "=" : xs) =
+ ag{ nameSuffix = Just $ stringify xs }
+ addPart ag (Space : xs) = addPart ag xs
+ addPart ag _ = ag
+ return $ foldl' addPart emptyName commaParts
+-- First von Last
+-- von Last, First
+-- von Last, Jr, First
+-- NOTE: biblatex and bibtex differ on:
+-- Drummond de Andrade, Carlos
+-- bibtex takes "Drummond de" as the von;
+-- biblatex takes the whole as a last name.
+-- See https://github.com/plk/biblatex/issues/236
+-- Here we implement the more sensible biblatex behavior.
+toName opts ils = do
+ let useprefix = optionSet "useprefix" opts
+ let usecomma = optionSet "juniorcomma" opts
+ let bibtex = optionSet "bibtex" opts
+ let words' = wordsBy (\x -> x == Space || x == Str "\160")
+ let commaParts = map words' $ splitWhen (== Str ",")
+ $ splitStrWhen
+ (\c -> c == ',' || c == '\160') ils
+ let (first, vonlast, jr) =
+ case commaParts of
+ --- First is the longest sequence of white-space separated
+ -- words starting with an uppercase and that is not the
+ -- whole string. von is the longest sequence of whitespace
+ -- separated words whose last word starts with lower case
+ -- and that is not the whole string.
+ [fvl] -> let (caps', rest') = span isCapitalized fvl
+ in if null rest' && not (null caps')
+ then (init caps', [last caps'], [])
+ else (caps', rest', [])
+ [vl,f] -> (f, vl, [])
+ (vl:j:f:_) -> (f, vl, j )
+ [] -> ([], [], [])
+
+ let (von, lastname) =
+ if bibtex
+ then case span isCapitalized $ reverse vonlast of
+ ([],w:ws) -> (reverse ws, [w])
+ (vs, ws) -> (reverse ws, reverse vs)
+ else case break isCapitalized vonlast of
+ (vs@(_:_), []) -> (init vs, [last vs])
+ (vs, ws) -> (vs, ws)
+ let prefix = T.unwords $ map stringify von
+ let family = T.unwords $ map stringify lastname
+ let suffix = T.unwords $ map stringify jr
+ let given = T.unwords $ map stringify first
+ return
+ Name { nameFamily = if T.null family
+ then Nothing
+ else Just family
+ , nameGiven = if T.null given
+ then Nothing
+ else Just given
+ , nameDroppingParticle = if useprefix || T.null prefix
+ then Nothing
+ else Just prefix
+ , nameNonDroppingParticle = if useprefix && not (T.null prefix)
+ then Just prefix
+ else Nothing
+ , nameSuffix = if T.null suffix
+ then Nothing
+ else Just suffix
+ , nameLiteral = Nothing
+ , nameCommaSuffix = usecomma
+ , nameStaticOrdering = False
+ }
+
+splitStrWhen :: (Char -> Bool) -> [Inline] -> [Inline]
+splitStrWhen _ [] = []
+splitStrWhen p (Str xs : ys) = map Str (go xs) ++ splitStrWhen p ys
+ where go s =
+ let (w,z) = T.break p s
+ in if T.null z
+ then if T.null w
+ then []
+ else [w]
+ else if T.null w
+ then (T.take 1 z : go (T.drop 1 z))
+ else (w : T.take 1 z : go (T.drop 1 z))
+splitStrWhen p (x : ys) = x : splitStrWhen p ys
+
+ordinalize :: Locale -> Text -> Text
+ordinalize locale n =
+ let terms = localeTerms locale
+ pad0 t = case T.length t of
+ 0 -> "00"
+ 1 -> "0" <> t
+ _ -> t
+ in case Map.lookup ("ordinal-" <> pad0 n) terms <|>
+ Map.lookup "ordinal" terms of
+ Nothing -> n
+ Just [] -> n
+ Just (t:_) -> n <> snd t
+
+isCapitalized :: [Inline] -> Bool
+isCapitalized (Str (T.uncons -> Just (c,cs)) : rest)
+ | isUpper c = True
+ | isDigit c = isCapitalized (Str cs : rest)
+ | otherwise = False
+isCapitalized (_:rest) = isCapitalized rest
+isCapitalized [] = True
+
+optionSet :: Text -> Options -> Bool
+optionSet key opts = case lookup key opts of
+ Just "true" -> True
+ Just s -> s == mempty
+ _ -> False
+
+getTypeAndGenre :: Bib (Text, Maybe Text)
+getTypeAndGenre = do
+ lang <- gets localeLang
+ et <- asks entryType
+ guard $ et /= "xdata"
+ reftype' <- resolveKey' lang <$> getRawField "type"
+ <|> return mempty
+ st <- getRawField "entrysubtype" <|> return mempty
+ isEvent <- (True <$ (getRawField "eventdate"
+ <|> getRawField "eventtitle"
+ <|> getRawField "venue")) <|> return False
+ let reftype =
+ case et of
+ "article"
+ | st == "magazine" -> "article-magazine"
+ | st == "newspaper" -> "article-newspaper"
+ | otherwise -> "article-journal"
+ "book" -> "book"
+ "booklet" -> "pamphlet"
+ "bookinbook" -> "chapter"
+ "collection" -> "book"
+ "dataset" -> "dataset"
+ "electronic" -> "webpage"
+ "inbook" -> "chapter"
+ "incollection" -> "chapter"
+ "inreference" -> "entry-encyclopedia"
+ "inproceedings" -> "paper-conference"
+ "manual" -> "book"
+ "mastersthesis" -> "thesis"
+ "misc" -> ""
+ "mvbook" -> "book"
+ "mvcollection" -> "book"
+ "mvproceedings" -> "book"
+ "mvreference" -> "book"
+ "online" -> "webpage"
+ "patent" -> "patent"
+ "periodical"
+ | st == "magazine" -> "article-magazine"
+ | st == "newspaper" -> "article-newspaper"
+ | otherwise -> "article-journal"
+ "phdthesis" -> "thesis"
+ "proceedings" -> "book"
+ "reference" -> "book"
+ "report" -> "report"
+ "software" -> "book" -- no "software" type in CSL
+ "suppbook" -> "chapter"
+ "suppcollection" -> "chapter"
+ "suppperiodical"
+ | st == "magazine" -> "article-magazine"
+ | st == "newspaper" -> "article-newspaper"
+ | otherwise -> "article-journal"
+ "techreport" -> "report"
+ "thesis" -> "thesis"
+ "unpublished" -> if isEvent then "speech" else "manuscript"
+ "www" -> "webpage"
+ -- biblatex, "unsupported"
+ "artwork" -> "graphic"
+ "audio" -> "song" -- for audio *recordings*
+ "commentary" -> "book"
+ "image" -> "graphic" -- or "figure" ?
+ "jurisdiction" -> "legal_case"
+ "legislation" -> "legislation" -- or "bill" ?
+ "legal" -> "treaty"
+ "letter" -> "personal_communication"
+ "movie" -> "motion_picture"
+ "music" -> "song" -- for musical *recordings*
+ "performance" -> "speech"
+ "review" -> "review" -- or "review-book" ?
+ "standard" -> "legislation"
+ "video" -> "motion_picture"
+ -- biblatex-apa:
+ "data" -> "dataset"
+ "letters" -> "personal_communication"
+ "newsarticle" -> "article-newspaper"
+ _ -> ""
+
+ let refgenre =
+ case et of
+ "mastersthesis" -> if T.null reftype'
+ then Just $ resolveKey' lang "mathesis"
+ else Just reftype'
+ "phdthesis" -> if T.null reftype'
+ then Just $ resolveKey' lang "phdthesis"
+ else Just reftype'
+ _ -> if T.null reftype'
+ then Nothing
+ else Just reftype'
+ return (reftype, refgenre)
+
+
+-- transformKey source target key
+-- derived from Appendix C of bibtex manual
+transformKey :: Text -> Text -> Text -> [Text]
+transformKey _ _ "ids" = []
+transformKey _ _ "crossref" = []
+transformKey _ _ "xref" = []
+transformKey _ _ "entryset" = []
+transformKey _ _ "entrysubtype" = []
+transformKey _ _ "execute" = []
+transformKey _ _ "label" = []
+transformKey _ _ "options" = []
+transformKey _ _ "presort" = []
+transformKey _ _ "related" = []
+transformKey _ _ "relatedoptions" = []
+transformKey _ _ "relatedstring" = []
+transformKey _ _ "relatedtype" = []
+transformKey _ _ "shorthand" = []
+transformKey _ _ "shorthandintro" = []
+transformKey _ _ "sortkey" = []
+transformKey x y "author"
+ | x `elem` ["mvbook", "book"] &&
+ y `elem` ["inbook", "bookinbook", "suppbook"] = ["bookauthor", "author"]
+-- note: this next clause is not in the biblatex manual, but it makes
+-- sense in the context of CSL conversion:
+transformKey x y "author"
+ | x == "mvbook" && y == "book" = ["bookauthor", "author"]
+transformKey "mvbook" y z
+ | y `elem` ["book", "inbook", "bookinbook", "suppbook"] = standardTrans z
+transformKey x y z
+ | x `elem` ["mvcollection", "mvreference"] &&
+ y `elem` ["collection", "reference", "incollection", "inreference",
+ "suppcollection"] = standardTrans z
+transformKey "mvproceedings" y z
+ | y `elem` ["proceedings", "inproceedings"] = standardTrans z
+transformKey "book" y z
+ | y `elem` ["inbook", "bookinbook", "suppbook"] = bookTrans z
+transformKey x y z
+ | x `elem` ["collection", "reference"] &&
+ y `elem` ["incollection", "inreference", "suppcollection"] = bookTrans z
+transformKey "proceedings" "inproceedings" z = bookTrans z
+transformKey "periodical" y z
+ | y `elem` ["article", "suppperiodical"] =
+ case z of
+ "title" -> ["journaltitle"]
+ "subtitle" -> ["journalsubtitle"]
+ "shorttitle" -> []
+ "sorttitle" -> []
+ "indextitle" -> []
+ "indexsorttitle" -> []
+ _ -> [z]
+transformKey _ _ x = [x]
+
+standardTrans :: Text -> [Text]
+standardTrans z =
+ case z of
+ "title" -> ["maintitle"]
+ "subtitle" -> ["mainsubtitle"]
+ "titleaddon" -> ["maintitleaddon"]
+ "shorttitle" -> []
+ "sorttitle" -> []
+ "indextitle" -> []
+ "indexsorttitle" -> []
+ _ -> [z]
+
+bookTrans :: Text -> [Text]
+bookTrans z =
+ case z of
+ "title" -> ["booktitle"]
+ "subtitle" -> ["booksubtitle"]
+ "titleaddon" -> ["booktitleaddon"]
+ "shorttitle" -> []
+ "sorttitle" -> []
+ "indextitle" -> []
+ "indexsorttitle" -> []
+ _ -> [z]
+
+resolveKey :: Lang -> Inlines -> Inlines
+resolveKey lang ils = Walk.walk go ils
+ where go (Str s) = Str $ resolveKey' lang s
+ go x = x
+
+resolveKey' :: Lang -> Text -> Text
+resolveKey' lang@(Lang l _) k =
+ case Map.lookup l biblatexStringMap >>= Map.lookup (T.toLower k) of
+ Nothing -> k
+ Just (x, _) -> either (const k) stringify $ parseLaTeX lang x
+
+convertEnDash :: Inline -> Inline
+convertEnDash (Str s) = Str (T.map (\c -> if c == '–' then '-' else c) s)
+convertEnDash x = x
diff --git a/src/Text/Pandoc/Citeproc/CslJson.hs b/src/Text/Pandoc/Citeproc/CslJson.hs
new file mode 100644
index 000000000..862af5188
--- /dev/null
+++ b/src/Text/Pandoc/Citeproc/CslJson.hs
@@ -0,0 +1,37 @@
+{-# LANGUAGE OverloadedStrings #-}
+module Text.Pandoc.Citeproc.CslJson
+ ( cslJsonToReferences )
+where
+
+import Citeproc.CslJson
+import Citeproc.Types
+import Control.Monad.Identity (runIdentity)
+import Data.Aeson (eitherDecodeStrict')
+import Data.ByteString (ByteString)
+import Text.Pandoc.Builder as B
+import Data.Text (Text)
+
+fromCslJson :: CslJson Text -> Inlines
+fromCslJson (CslText t) = B.text t
+fromCslJson CslEmpty = mempty
+fromCslJson (CslConcat x y) = fromCslJson x <> fromCslJson y
+fromCslJson (CslQuoted x) = B.doubleQuoted (fromCslJson x)
+fromCslJson (CslItalic x) = B.emph (fromCslJson x)
+fromCslJson (CslNormal x) = fromCslJson x -- TODO?
+fromCslJson (CslBold x) = B.strong (fromCslJson x)
+fromCslJson (CslUnderline x) = B.underline (fromCslJson x)
+fromCslJson (CslNoDecoration x) =
+ B.spanWith ("",["nodecoration"],[]) (fromCslJson x)
+fromCslJson (CslSmallCaps x) = B.smallcaps (fromCslJson x)
+fromCslJson (CslBaseline x) = fromCslJson x
+fromCslJson (CslSub x) = B.subscript (fromCslJson x)
+fromCslJson (CslSup x) = B.superscript (fromCslJson x)
+fromCslJson (CslNoCase x) = B.spanWith ("",["nocase"],[]) (fromCslJson x)
+fromCslJson (CslDiv t x) = B.spanWith ("",["csl-" <> t],[]) (fromCslJson x)
+
+cslJsonToReferences :: ByteString -> Either String [Reference Inlines]
+cslJsonToReferences raw =
+ case eitherDecodeStrict' raw of
+ Left e -> Left e
+ Right cslrefs -> Right $
+ map (runIdentity . traverse (return . fromCslJson)) cslrefs
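
For orientation (an illustrative sketch, not code from this commit):
cslJsonToReferences turns raw CSL JSON bytes into citeproc Reference values
carrying pandoc Inlines. Something like the following could drive it; note
that the commit message describes the Text.Pandoc.Citeproc.* helpers as
unexported modules, so in practice this code would live inside pandoc
itself (external callers would go through readCslJson/processCitations).
The file name is a placeholder.

    import qualified Data.ByteString as BS
    import qualified Data.Text.IO as TIO
    import Citeproc.Types (Reference (..), ItemId (..))
    import Text.Pandoc.Citeproc.CslJson (cslJsonToReferences)

    main :: IO ()
    main = do
      raw <- BS.readFile "refs.json"   -- placeholder CSL JSON bibliography
      case cslJsonToReferences raw of
        Left err   -> putStrLn ("parse error: " ++ err)
        Right refs ->
          -- print the id of each parsed reference
          mapM_ (\r -> let ItemId t = referenceId r in TIO.putStrLn t) refs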
diff --git a/src/Text/Pandoc/Citeproc/Data.hs b/src/Text/Pandoc/Citeproc/Data.hs
new file mode 100644
index 000000000..dfdaf2598
--- /dev/null
+++ b/src/Text/Pandoc/Citeproc/Data.hs
@@ -0,0 +1,31 @@
+{-# LANGUAGE OverloadedStrings #-}
+{-# LANGUAGE TemplateHaskell #-}
+module Text.Pandoc.Citeproc.Data
+ (biblatexStringMap)
+where
+import Data.FileEmbed
+import Data.ByteString (ByteString)
+import qualified Data.Map as M
+import qualified Data.Text.Encoding as TE
+import qualified Data.Text as T
+import Data.Text (Text)
+import Text.Pandoc.Citeproc.Util (toIETF)
+import Citeproc (Lang(..), parseLang)
+
+biblatexLocalizations :: [(FilePath, ByteString)]
+biblatexLocalizations = $(embedDir "citeproc/biblatex-localization")
+
+-- biblatex localization keys, from files at
+-- http://github.com/plk/biblatex/tree/master/tex/latex/biblatex/lbx
+biblatexStringMap :: M.Map Text (M.Map Text (Text, Text))
+biblatexStringMap = foldr go mempty biblatexLocalizations
+ where
+ go (fp, bs) =
+ let Lang lang _ = parseLang (toIETF $ T.takeWhile (/= '.') $ T.pack fp)
+ ls = T.lines $ TE.decodeUtf8 bs
+ in if length ls > 4
+ then M.insert lang (toStringMap $ map (T.splitOn "|") ls)
+ else id
+ toStringMap = foldr go' mempty
+ go' [term, x, y] = M.insert term (x, y)
+ go' _ = id
diff --git a/src/Text/Pandoc/Citeproc/Locator.hs b/src/Text/Pandoc/Citeproc/Locator.hs
new file mode 100644
index 000000000..dba762c02
--- /dev/null
+++ b/src/Text/Pandoc/Citeproc/Locator.hs
@@ -0,0 +1,279 @@
+{-# LANGUAGE TupleSections #-}
+{-# LANGUAGE ViewPatterns #-}
+{-# LANGUAGE OverloadedStrings #-}
+module Text.Pandoc.Citeproc.Locator
+ ( parseLocator )
+where
+import Citeproc.Types
+import Data.Text (Text)
+import qualified Data.Text as T
+import Text.Parsec
+import Text.Pandoc.Definition
+import Text.Pandoc.Parsing (romanNumeral)
+import Text.Pandoc.Shared (stringify)
+import Control.Monad (mzero)
+import qualified Data.Map as M
+import Data.Char (isSpace, isPunctuation, isDigit)
+
+parseLocator :: Locale -> [Inline] -> (Maybe (Text, Text), [Inline])
+parseLocator locale inp =
+ case parse (pLocatorWords (toLocatorMap locale)) "suffix" $ splitInp inp of
+ Right r -> r
+ Left _ -> (Nothing, inp)
+
+splitInp :: [Inline] -> [Inline]
+splitInp = splitStrWhen (\c -> isSpace c || (isPunctuation c && c /= ':'))
+
+--
+-- Locator parsing
+--
+
+type LocatorParser = Parsec [Inline] ()
+
+pLocatorWords :: LocatorMap
+ -> LocatorParser (Maybe (Text, Text), [Inline])
+pLocatorWords locMap = do
+ optional $ pMatchChar "," (== ',')
+ optional pSpace
+ (la, lo) <- pLocatorDelimited locMap <|> pLocatorIntegrated locMap
+ s <- getInput -- rest is suffix
+ -- need to trim, otherwise "p. 9" and "9" will have 'different' locators later on
+ -- i.e. the first one will be " 9"
+ return $
+ if T.null la && T.null lo
+ then (Nothing, s)
+ else (Just (la, T.strip lo), s)
+
+pLocatorDelimited :: LocatorMap -> LocatorParser (Text, Text)
+pLocatorDelimited locMap = try $ do
+ _ <- pMatchChar "{" (== '{')
+ skipMany pSpace -- gobble pre-spaces so label doesn't try to include them
+ (la, _) <- pLocatorLabelDelimited locMap
+ -- we only care about balancing {} and [] (because of the outer [] scope);
+ -- the rest can be anything
+ let inner = do { t <- anyToken; return (True, stringify t) }
+ gs <- many (pBalancedBraces [('{','}'), ('[',']')] inner)
+ _ <- pMatchChar "}" (== '}')
+ let lo = T.concat $ map snd gs
+ return (la, lo)
+
+pLocatorLabelDelimited :: LocatorMap -> LocatorParser (Text, Bool)
+pLocatorLabelDelimited locMap
+ = pLocatorLabel' locMap lim <|> return ("page", True)
+ where
+ lim = stringify <$> anyToken
+
+pLocatorIntegrated :: LocatorMap -> LocatorParser (Text, Text)
+pLocatorIntegrated locMap = try $ do
+ (la, wasImplicit) <- pLocatorLabelIntegrated locMap
+ -- if we got the label implicitly, we have presupposed the first one is
+ -- going to have a digit, so guarantee that. You _can_ have p. (a)
+ -- because you specified it.
+ let modifier = if wasImplicit
+ then requireDigits
+ else requireRomansOrDigits
+ g <- try $ pLocatorWordIntegrated (not wasImplicit) >>= modifier
+ gs <- many (try $ pLocatorWordIntegrated False >>= modifier)
+ let lo = T.concat (g:gs)
+ return (la, lo)
+
+pLocatorLabelIntegrated :: LocatorMap -> LocatorParser (Text, Bool)
+pLocatorLabelIntegrated locMap
+ = pLocatorLabel' locMap lim <|> (lookAhead digital >> return ("page", True))
+ where
+ lim = try $ pLocatorWordIntegrated True >>= requireRomansOrDigits
+ digital = try $ pLocatorWordIntegrated True >>= requireDigits
+
+pLocatorLabel' :: LocatorMap -> LocatorParser Text
+ -> LocatorParser (Text, Bool)
+pLocatorLabel' locMap lim = go ""
+ where
+ -- grow the match string until we hit the end
+ -- trying to find the largest match for a label
+ go acc = try $ do
+ -- advance at least one token each time
+ -- the pathological case is "p.3"
+ t <- anyToken
+ ts <- manyTill anyToken (try $ lookAhead lim)
+ let s = acc <> stringify (t:ts)
+ case M.lookup (T.strip s) locMap of
+ -- try to find a longer one, or return this one
+ Just l -> go s <|> return (l, False)
+ Nothing -> go s
+
+-- hard requirement for a locator to have some real digits in it
+requireDigits :: (Bool, Text) -> LocatorParser Text
+requireDigits (_, s) = if not (T.any isDigit s)
+ then Prelude.fail "requireDigits"
+ else return s
+
+-- soft requirement for a sequence with some roman or arabic parts
+-- (a)(iv) -- because iv is roman
+-- 1(a) -- because 1 is an actual digit
+-- NOT: a, (a)-(b), hello, (some text in brackets)
+requireRomansOrDigits :: (Bool, Text) -> LocatorParser Text
+requireRomansOrDigits (d, s) = if not d
+ then Prelude.fail "requireRomansOrDigits"
+ else return s
+
+pLocatorWordIntegrated :: Bool -> LocatorParser (Bool, Text)
+pLocatorWordIntegrated isFirst = try $ do
+ punct <- if isFirst
+ then return ""
+ else (stringify <$> pLocatorSep) <|> return ""
+ sp <- option "" (pSpace >> return " ")
+ (dig, s) <- pBalancedBraces [('(',')'), ('[',']'), ('{','}')] pPageSeq
+ return (dig, punct <> sp <> s)
+
+-- we want to capture: 123, 123A, C22, XVII, 33-44, 22-33; 22-11
+-- 34(1), 34A(A), 34(1)(i)(i), (1)(a)
+-- [17], [17]-[18], '591 [84]'
+-- (because CSL cannot pull out individual pages/sections
+-- to wrap in braces on a per-style basis)
+pBalancedBraces :: [(Char, Char)]
+ -> LocatorParser (Bool, Text)
+ -> LocatorParser (Bool, Text)
+pBalancedBraces braces p = try $ do
+ ss <- many1 surround
+ return $ anyWereDigitLike ss
+ where
+ except = notFollowedBy pBraces >> p
+ -- outer and inner
+ surround = foldl (\a (open, close) -> sur open close except <|> a)
+ except
+ braces
+
+ isc c = stringify <$> pMatchChar [c] (== c)
+
+ sur c c' m = try $ do
+ (d, mid) <- between (isc c) (isc c') (option (False, "") m)
+ return (d, T.cons c . flip T.snoc c' $ mid)
+
+ flattened = concatMap (\(o, c) -> [o, c]) braces
+ pBraces = pMatchChar "braces" (`elem` flattened)
+
+
+-- YES 1, 1.2, 1.2.3
+-- NO 1., 1.2. a.6
+-- can't use sepBy because we want to leave trailing .s
+pPageSeq :: LocatorParser (Bool, Text)
+pPageSeq = oneDotTwo <|> withPeriod
+ where
+ oneDotTwo = do
+ u <- pPageUnit
+ us <- many withPeriod
+ return $ anyWereDigitLike (u:us)
+ withPeriod = try $ do
+ -- .2
+ p <- pMatchChar "." (== '.')
+ u <- try pPageUnit
+ return (fst u, stringify p <> snd u)
+
+anyWereDigitLike :: [(Bool, Text)] -> (Bool, Text)
+anyWereDigitLike as = (any fst as, T.concat $ map snd as)
+
+pPageUnit :: LocatorParser (Bool, Text)
+pPageUnit = roman <|> plainUnit
+ where
+ -- roman is a 'digit'
+ roman = (True,) <$> pRoman
+ plainUnit = do
+ ts <- many1 (notFollowedBy pSpace >>
+ notFollowedBy pLocatorPunct >>
+ anyToken)
+ let s = stringify ts
+ -- otherwise look for actual digits or -s
+ return (T.any isDigit s, s)
+
+pRoman :: LocatorParser Text
+pRoman = try $ do
+ tok <- anyToken
+ case tok of
+ Str t -> case parse (romanNumeral True *> eof)
+ "roman numeral" (T.toUpper t) of
+ Left _ -> mzero
+ Right () -> return t
+ _ -> mzero
+
+pLocatorPunct :: LocatorParser Inline
+pLocatorPunct = pMatchChar "punctuation" isLocatorPunct
+
+pLocatorSep :: LocatorParser Inline
+pLocatorSep = pMatchChar "locator separator" isLocatorSep
+
+pMatchChar :: String -> (Char -> Bool) -> LocatorParser Inline
+pMatchChar msg f = satisfyTok f' <?> msg
+ where
+ f' (Str (T.unpack -> [c])) = f c
+ f' _ = False
+
+pSpace :: LocatorParser Inline
+pSpace = satisfyTok (\t -> isSpacey t || t == Str "\160") <?> "space"
+
+satisfyTok :: (Inline -> Bool) -> LocatorParser Inline
+satisfyTok f = tokenPrim show (\sp _ _ -> sp) (\tok -> if f tok
+ then Just tok
+ else Nothing)
+
+isSpacey :: Inline -> Bool
+isSpacey Space = True
+isSpacey SoftBreak = True
+isSpacey _ = False
+
+isLocatorPunct :: Char -> Bool
+isLocatorPunct '-' = False -- page range
+isLocatorPunct '–' = False -- page range, en dash
+isLocatorPunct ':' = False -- vol:page-range hack
+isLocatorPunct c = isPunctuation c -- includes [{()}]
+
+isLocatorSep :: Char -> Bool
+isLocatorSep ',' = True
+isLocatorSep ';' = True
+isLocatorSep _ = False
+
+splitStrWhen :: (Char -> Bool) -> [Inline] -> [Inline]
+splitStrWhen _ [] = []
+splitStrWhen p (Str xs : ys) = go (T.unpack xs) ++ splitStrWhen p ys
+ where
+ go [] = []
+ go s = case break p s of
+ ([],[]) -> []
+ (zs,[]) -> [Str $ T.pack zs]
+ ([],w:ws) -> Str (T.singleton w) : go ws
+ (zs,w:ws) -> Str (T.pack zs) : Str (T.singleton w) : go ws
+splitStrWhen p (x : ys) = x : splitStrWhen p ys
+
+--
+-- Locator Map
+--
+
+type LocatorMap = M.Map Text Text
+
+toLocatorMap :: Locale -> LocatorMap
+toLocatorMap locale =
+ foldr go mempty locatorTerms
+ where
+ go tname locmap =
+ case M.lookup tname (localeTerms locale) of
+ Nothing -> locmap
+ Just ts -> foldr (\x -> M.insert (snd x) tname) locmap ts
+
+locatorTerms :: [Text]
+locatorTerms =
+ [ "book"
+ , "chapter"
+ , "column"
+ , "figure"
+ , "folio"
+ , "issue"
+ , "line"
+ , "note"
+ , "opus"
+ , "page"
+ , "number-of-pages"
+ , "paragraph"
+ , "part"
+ , "section"
+ , "sub verbo"
+ , "verse"
+ , "volume" ]
diff --git a/src/Text/Pandoc/Citeproc/MetaValue.hs b/src/Text/Pandoc/Citeproc/MetaValue.hs
new file mode 100644
index 000000000..53b14f904
--- /dev/null
+++ b/src/Text/Pandoc/Citeproc/MetaValue.hs
@@ -0,0 +1,252 @@
+{-# LANGUAGE OverloadedStrings #-}
+module Text.Pandoc.Citeproc.MetaValue
+ ( referenceToMetaValue
+ , metaValueToReference
+ , metaValueToText
+ , metaValueToPath
+ )
+where
+
+import Citeproc.Types
+import Text.Pandoc.Definition
+import Text.Pandoc.Builder as B
+import Text.Pandoc.Walk (query)
+import Text.Pandoc.Shared (stringify)
+import Data.Maybe
+import Safe
+import qualified Data.Set as Set
+import qualified Data.Map as M
+import qualified Data.Text as T
+import Data.Text (Text)
+import Text.Printf (printf)
+import Control.Applicative ((<|>))
+
+metaValueToText :: MetaValue -> Maybe Text
+metaValueToText (MetaString t) = Just t
+metaValueToText (MetaInlines ils) = Just $ stringify ils
+metaValueToText (MetaBlocks bls) = Just $ stringify bls
+metaValueToText (MetaList xs) = T.unwords <$> mapM metaValueToText xs
+metaValueToText _ = Nothing
+
+metaValueToPath :: MetaValue -> Maybe FilePath
+metaValueToPath = fmap T.unpack . metaValueToText
+
+metaValueToBool :: MetaValue -> Maybe Bool
+metaValueToBool (MetaBool b) = Just b
+metaValueToBool (MetaString "true") = Just True
+metaValueToBool (MetaString "false") = Just False
+metaValueToBool (MetaInlines ils) =
+ metaValueToBool (MetaString (stringify ils))
+metaValueToBool _ = Nothing
+
+referenceToMetaValue :: Reference Inlines -> MetaValue
+referenceToMetaValue ref =
+ let ItemId id' = referenceId ref
+ type' = referenceType ref
+ in MetaMap $ M.insert "id" (MetaString id')
+ $ M.insert "type" (MetaString type')
+ $ M.map valToMetaValue
+ $ M.mapKeys fromVariable
+ $ referenceVariables ref
+
+
+valToMetaValue :: Val Inlines -> MetaValue
+valToMetaValue (TextVal t) = MetaString t
+valToMetaValue (FancyVal ils) = MetaInlines (B.toList ils)
+valToMetaValue (NumVal n) = MetaString (T.pack $ show n)
+valToMetaValue (NamesVal ns) = MetaList $ map nameToMetaValue ns
+valToMetaValue (DateVal d) = dateToMetaValue d
+
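+-- | Convert a structured name to a MetaMap, inserting only the fields
+-- that are actually present.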
+nameToMetaValue :: Name -> MetaValue
+nameToMetaValue name =
+ MetaMap $
+ (maybe id (M.insert "family" . MetaString) (nameFamily name)) .
+ (maybe id (M.insert "given" . MetaString) (nameGiven name)) .
+ (maybe id (M.insert "dropping-particle" . MetaString)
+ (nameDroppingParticle name)) .
+ (maybe id (M.insert "non-dropping-particle" . MetaString)
+ (nameNonDroppingParticle name)) .
+ (maybe id (M.insert "suffix" . MetaString) (nameSuffix name)) .
+ (maybe id (M.insert "literal" . MetaString) (nameLiteral name)) .
+ (if nameCommaSuffix name
+ then M.insert "comma-suffix" (MetaBool True)
+ else id) .
+ (if nameStaticOrdering name
+ then M.insert "static-ordering" (MetaBool True)
+ else id)
+ $ mempty
+
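+-- | Render a Date as an EDTF string, e.g. DateParts [2020,9,6] becomes
+-- "2020-09-06"; a trailing "~" marks a circa date, and years outside the
+-- four-digit range get a "y" prefix.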
+dateToMetaValue :: Date -> MetaValue
+dateToMetaValue date =
+ MetaString $
+ (case dateLiteral date of
+ Just l -> l
+ Nothing -> T.intercalate "/" $ map datePartsToEDTF $ dateParts date)
+ <> (if dateCirca date then "~" else "")
+ where
+ datePartsToEDTF (DateParts dps) =
+ T.pack $
+ (case dps of
+ (y:_) | y > 9999 || y < -10000 -> ('y':)
+ _ -> id) $
+ case dps of
+ (y:m:d:_)
+ | y < -1 -> printf "%05d-%02d-%02d" (y+1) m d
+ | otherwise -> printf "%04d-%02d-%02d" y m d
+ (y:m:[])
+ | y < -1 -> printf "%05d-%02d" (y+1) m
+ | otherwise -> printf "%04d-%02d" y m
+ (y:[])
+ | y < -1 -> printf "%05d" (y+1)
+ | otherwise -> printf "%04d" y
+ _ -> mempty
+
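+-- | Read a Reference back out of a MetaMap; returns Nothing if the map
+-- has no usable "id" field.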
+metaValueToReference :: MetaValue -> Maybe (Reference Inlines)
+metaValueToReference (MetaMap m) = do
+ let m' = M.mapKeys normalizeKey m
+ id' <- M.lookup "id" m' >>= metaValueToText
+ type' <- (M.lookup "type" m' >>= metaValueToText) <|> pure ""
+ let m'' = M.delete "id" $ M.delete "type" m'
+ let vars = M.mapKeys toVariable $ M.mapWithKey metaValueToVal m''
+ return $ Reference { referenceId = ItemId id'
+ , referenceType = type'
+ , referenceDisambiguation = Nothing
+ , referenceVariables = vars }
+metaValueToReference _ = Nothing
+
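+-- | Convert a metadata value to a citeproc Val, dispatching on the
+-- variable name: date variables become DateVal, name variables NamesVal,
+-- and everything else plain text or formatted inlines.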
+metaValueToVal :: Text -> MetaValue -> Val Inlines
+metaValueToVal k v
+ | k `Set.member` dateVariables
+ = DateVal $ metaValueToDate v
+ | k `Set.member` nameVariables
+ = NamesVal $ metaValueToNames v
+ | k == "other-ids"
+ = TextVal $ fromMaybe mempty $ metaValueToText v
+ -- will create space-separated list
+ | otherwise =
+ case v of
+ MetaString t -> TextVal t
+ MetaInlines ils -> FancyVal (B.fromList ils)
+ MetaBlocks bs -> FancyVal (B.fromList $ query id bs)
+ MetaBool b -> TextVal (if b then "true" else "false")
+ MetaList _ -> TextVal mempty
+ MetaMap _ -> TextVal mempty
+
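+-- | Accept a date given as a map (date-parts/circa/season/literal), as a
+-- bare list of date-parts, or as a raw EDTF string.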
+metaValueToDate :: MetaValue -> Date
+metaValueToDate (MetaMap m) =
+ Date
+ { dateParts = dateparts
+ , dateCirca = circa
+ , dateSeason = season
+ , dateLiteral = literal }
+ where
+ dateparts = case M.lookup "date-parts" m of
+ Just (MetaList xs) ->
+ mapMaybe metaValueToDateParts xs
+ Just _ -> []
+ Nothing ->
+ maybe [] (:[]) $ metaValueToDateParts (MetaMap m)
+ circa = fromMaybe False $
+ M.lookup "circa" m >>= metaValueToBool
+ season = M.lookup "season" m >>= metaValueToInt
+ literal = M.lookup "literal" m >>= metaValueToText
+metaValueToDate (MetaList xs) =
+ Date{ dateParts = mapMaybe metaValueToDateParts xs
+ , dateCirca = False
+ , dateSeason = Nothing
+ , dateLiteral = Nothing }
+metaValueToDate x =
+ fromMaybe emptyDate $ metaValueToText x >>= rawDateEDTF
+
+
+metaValueToInt :: MetaValue -> Maybe Int
+metaValueToInt x = metaValueToText x >>= readMay . T.unpack
+
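+-- | A date may be given as a list of integers or as a year/month/day map;
+-- a "season" is encoded as month 20 + n, following CSL/EDTF usage.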
+metaValueToDateParts :: MetaValue -> Maybe DateParts
+metaValueToDateParts (MetaList xs) =
+ Just $ DateParts $ map (fromMaybe 0 . metaValueToInt) xs
+metaValueToDateParts (MetaMap m) =
+ case (M.lookup "year" m >>= metaValueToInt,
+ ((M.lookup "month" m >>= metaValueToInt)
+ <|>
+ ((+ 20) <$> (M.lookup "season" m >>= metaValueToInt))),
+ M.lookup "day" m >>= metaValueToInt) of
+ (Just y, Just mo, Just d) -> Just $ DateParts [y, mo, d]
+ (Just y, Just mo, Nothing) -> Just $ DateParts [y, mo]
+ (Just y, Nothing, _) -> Just $ DateParts [y]
+ _ -> Nothing
+metaValueToDateParts _ = Nothing
+
+emptyDate :: Date
+emptyDate = Date { dateParts = []
+ , dateCirca = False
+ , dateSeason = Nothing
+ , dateLiteral = Nothing }
+
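+-- | A names field may hold a list of names or a single name.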
+metaValueToNames :: MetaValue -> [Name]
+metaValueToNames (MetaList xs) = mapMaybe metaValueToName xs
+metaValueToNames x = maybeToList $ metaValueToName x
+
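+-- | A name may be a map of CSL name fields or a plain string, which is
+-- treated as a literal name.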
+metaValueToName :: MetaValue -> Maybe Name
+metaValueToName (MetaMap m) = extractParticles <$>
+ Just Name
+ { nameFamily = family
+ , nameGiven = given
+ , nameDroppingParticle = dropping
+ , nameNonDroppingParticle = nondropping
+ , nameSuffix = suffix
+ , nameCommaSuffix = commasuffix
+ , nameStaticOrdering = staticordering
+ , nameLiteral = literal
+ }
+ where
+ family = M.lookup "family" m >>= metaValueToText
+ given = M.lookup "given" m >>= metaValueToText
+ dropping = M.lookup "dropping-particle" m
+ >>= metaValueToText
+ nondropping = M.lookup "non-dropping-particle" m
+ >>= metaValueToText
+ suffix = M.lookup "suffix" m >>= metaValueToText
+ commasuffix = fromMaybe False $
+ M.lookup "comma-suffix" m >>= metaValueToBool
+ staticordering = fromMaybe False $
+ M.lookup "static-ordering" m >>= metaValueToBool
+ literal = M.lookup "literal" m >>= metaValueToText
+metaValueToName x = extractParticles <$>
+ case metaValueToText x of
+ Nothing -> Nothing
+ Just lit -> Just Name
+ { nameFamily = Nothing
+ , nameGiven = Nothing
+ , nameDroppingParticle = Nothing
+ , nameNonDroppingParticle = Nothing
+ , nameSuffix = Nothing
+ , nameCommaSuffix = False
+ , nameStaticOrdering = False
+ , nameLiteral = Just lit }
+
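+-- | Metadata keys whose values are parsed as dates.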
+dateVariables :: Set.Set Text
+dateVariables = Set.fromList
+ [ "accessed", "container", "event-date", "issued",
+ "original-date", "submitted" ]
+
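+-- | Metadata keys whose values are parsed as lists of names.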
+nameVariables :: Set.Set Text
+nameVariables = Set.fromList
+ [ "author", "collection-editor", "composer",
+ "container-author", "director", "editor",
+ "editorial-director", "illustrator",
+ "interviewer", "original-author",
+ "recipient", "reviewed-author",
+ "translator" ]
+
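+-- | Lowercase keys, but keep conventionally uppercase CSL variables
+-- (DOI, ISBN, ...) in their canonical form.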
+normalizeKey :: Text -> Text
+normalizeKey k =
+ case T.toLower k of
+ "doi" -> "DOI"
+ "isbn" -> "ISBN"
+ "issn" -> "ISSN"
+ "pmcid" -> "PMCID"
+ "pmid" -> "PMID"
+ "url" -> "URL"
+ x -> x
+
diff --git a/src/Text/Pandoc/Citeproc/Util.hs b/src/Text/Pandoc/Citeproc/Util.hs
new file mode 100644
index 000000000..6d8e01bc9
--- /dev/null
+++ b/src/Text/Pandoc/Citeproc/Util.hs
@@ -0,0 +1,70 @@
+{-# LANGUAGE FlexibleContexts #-}
+{-# LANGUAGE OverloadedStrings #-}
+module Text.Pandoc.Citeproc.Util
+ ( toIETF )
+where
+import Data.Text (Text)
+
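+-- | Map babel/polyglossia language names to the IETF tags used by CSL
+-- locales; unrecognized names are passed through unchanged.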
+toIETF :: Text -> Text
+toIETF "english" = "en-US" -- "en-EN" unavailable in CSL
+toIETF "usenglish" = "en-US"
+toIETF "american" = "en-US"
+toIETF "british" = "en-GB"
+toIETF "ukenglish" = "en-GB"
+toIETF "canadian" = "en-US" -- "en-CA" unavailable in CSL
+toIETF "australian" = "en-GB" -- "en-AU" unavailable in CSL
+toIETF "newzealand" = "en-GB" -- "en-NZ" unavailable in CSL
+toIETF "afrikaans" = "af-ZA"
+toIETF "arabic" = "ar"
+toIETF "basque" = "eu"
+toIETF "bulgarian" = "bg-BG"
+toIETF "catalan" = "ca-AD"
+toIETF "croatian" = "hr-HR"
+toIETF "czech" = "cs-CZ"
+toIETF "danish" = "da-DK"
+toIETF "dutch" = "nl-NL"
+toIETF "estonian" = "et-EE"
+toIETF "finnish" = "fi-FI"
+toIETF "canadien" = "fr-CA"
+toIETF "acadian" = "fr-CA"
+toIETF "french" = "fr-FR"
+toIETF "francais" = "fr-FR"
+toIETF "austrian" = "de-AT"
+toIETF "naustrian" = "de-AT"
+toIETF "german" = "de-DE"
+toIETF "germanb" = "de-DE"
+toIETF "ngerman" = "de-DE"
+toIETF "greek" = "el-GR"
+toIETF "polutonikogreek" = "el-GR"
+toIETF "hebrew" = "he-IL"
+toIETF "hungarian" = "hu-HU"
+toIETF "icelandic" = "is-IS"
+toIETF "italian" = "it-IT"
+toIETF "japanese" = "ja-JP"
+toIETF "latvian" = "lv-LV"
+toIETF "lithuanian" = "lt-LT"
+toIETF "magyar" = "hu-HU"
+toIETF "mongolian" = "mn-MN"
+toIETF "norsk" = "nb-NO"
+toIETF "nynorsk" = "nn-NO"
+toIETF "farsi" = "fa-IR"
+toIETF "polish" = "pl-PL"
+toIETF "brazil" = "pt-BR"
+toIETF "brazilian" = "pt-BR"
+toIETF "portugues" = "pt-PT"
+toIETF "portuguese" = "pt-PT"
+toIETF "romanian" = "ro-RO"
+toIETF "russian" = "ru-RU"
+toIETF "serbian" = "sr-RS"
+toIETF "serbianc" = "sr-RS"
+toIETF "slovak" = "sk-SK"
+toIETF "slovene" = "sl-SL"
+toIETF "spanish" = "es-ES"
+toIETF "swedish" = "sv-SE"
+toIETF "thai" = "th-TH"
+toIETF "turkish" = "tr-TR"
+toIETF "ukrainian" = "uk-UA"
+toIETF "vietnamese" = "vi-VN"
+toIETF "latin" = "la"
+toIETF x = x
+