From fd99fe4d7ee5459d32505ef14f4453839cfcefab Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Tue, 17 Aug 2021 15:34:23 -0700 Subject: Revise citeproc code to fit new citeproc 0.5 API. Linkification of URLs in the bibliography is now done in the citeproc library, depending on the setting of an option. We set that option depending on the value of the metadata field `link-bibliography` (defaulting to true, for consistency with earlier behavior, though the new behavior includes the CSL draft recommendation of hyperlinking the title or the whole entry if a DOI, PMID, PMCID, or URL field is present but not explicitly rendered). These changes implement the following recommendations from the draft CSL v1.0.2 spec (Appendix VI): > The CSL syntax does not have support for configuration of links. > However, processors should include links on bibliographic references, > using the following rules: > If the bibliography entry for an item renders any of the following > identifiers, the identifier should be anchored as a link, with the > target of the link as follows: > - url: output as is > - doi: prepend with "`https://doi.org/`" > - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`" > - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`" > If the identifier is rendered as a URI, include rendered URI components > (e.g. "`https://doi.org/`") in the link anchor. Do not include any other > affix text in the link anchor (e.g. "Available from: ", "doi: ", "PMID: "). > If the bibliography entry for an item does not render any of > the above identifiers, then set the anchor of the link as the item > title. If title is not rendered, then set the anchor of the link as the > full bibliography entry for the item. Set the target of the link as one > of the following, in order of priority: > > - doi: prepend with "`https://doi.org/`" > - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`" > - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`" > - url: output as is > > If the item data does not include any of the above identifiers, do not > include a link. > > Citation processors should include an option flag for calling > applications to disable bibliography linking behavior. Thanks to Benjamin Bray for getting this all working. --- test/command/pandoc-citeproc-327.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'test/command/pandoc-citeproc-327.md') diff --git a/test/command/pandoc-citeproc-327.md b/test/command/pandoc-citeproc-327.md index 92b29bdf8..5cfd59576 100644 --- a/test/command/pandoc-citeproc-327.md +++ b/test/command/pandoc-citeproc-327.md @@ -44,8 +44,8 @@ I referenced something here^\[1\]^ ::: {#refs .references .csl-bib-body} ::: {#ref-LiLiaoDongWanHaiYuDiQiDongWuCiJiShengChanLiYanJiuJiShengJingGuaYiXingPingJie2017 .csl-entry} [\[1\] ]{.csl-left-margin}[李轶平, 于旭光, 孙明, 等. -辽东湾海域底栖动物次级生产力研究及生境适宜性评价\[J\]. 水产科学, -2017(06): 728--734.]{.csl-right-inline} +[辽东湾海域底栖动物次级生产力研究及生境适宜性评价](http://kns.cnki.net/kns/detail/detail.aspx?QueryID=4&CurRec=4&recid=&FileName=CHAN201706006&DbName=CJFDLAST2018&DbCode=CJFQ&yx=Y&pr=&URLID=21.1110.S.20171129.1725.006)\[J\]. +水产科学, 2017(06): 728--734.]{.csl-right-inline} ::: ::: ``` -- cgit v1.2.3 From d226a35c0ac6485c75f083ce3b25ada1d623f45f Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Mon, 25 Oct 2021 08:48:18 -0700 Subject: Switch back from HsYAML to yaml. Reasons: - Performance: HsYAML is around 20 times slower in parsing large YAML bibliographies (#6084). - An issue was submitted to HsYAML, but it hasn't gotten any attention. HsYAML seems borderline unmaintained; it hasn't had a commit in over a year. - Unfortunately this goes back on our attempts to free ourselves from C dependencies (#4535). But I don't see a better alternative until a better pure Haskell parser is available. Closes #6084. Notes: - We've removed the FromYAML instances for all types that had them, since this is a HsYAML-specific typeclass [API change]. (The yaml package just uses From/ToJSON.) - Unlike HsYAML (in the configuration we were using), yaml parses 'Y', 'N', 'Yes', 'No', 'On', 'Off' as boolean values. Users may need to quote these when they are meant to be interpreted as strings. Similarly, 'null' is parsed as a YAML null value (and will be treated as an empty string by pandoc rather than the string 'null'). Quoting it will force it to be interpreted as a string. - Some tests had to be adjusted accordingly. - Pandoc now behaves better when the YAML metadata contains escaping errors: instead of just falling back on treating the section as a table, it raises a YAML parsing error. --- pandoc.cabal | 10 +- src/Text/Pandoc/App.hs | 2 +- src/Text/Pandoc/App/Opt.hs | 296 ++++++++++++++++------------------- src/Text/Pandoc/Citeproc.hs | 2 +- src/Text/Pandoc/Filter.hs | 20 ++- src/Text/Pandoc/Logging.hs | 8 - src/Text/Pandoc/Options.hs | 217 +++++++++++++------------ src/Text/Pandoc/Readers/Markdown.hs | 10 +- src/Text/Pandoc/Readers/Metadata.hs | 168 +++++++------------- src/Text/Pandoc/Translations.hs | 30 +--- test/command/4819.md | 4 +- test/command/6741.md | 8 +- test/command/pandoc-citeproc-312.md | 2 +- test/command/pandoc-citeproc-327.md | 2 +- test/command/yaml-metadata-blocks.md | 4 +- 15 files changed, 348 insertions(+), 435 deletions(-) (limited to 'test/command/pandoc-citeproc-327.md') diff --git a/pandoc.cabal b/pandoc.cabal index 458e69ab6..f442bc18c 100644 --- a/pandoc.cabal +++ b/pandoc.cabal @@ -522,7 +522,6 @@ common common-executable library import: common-options build-depends: Glob >= 0.7 && < 0.11, - HsYAML >= 0.2 && < 0.3, JuicyPixels >= 3.1.6.1 && < 3.4, SHA >= 1.6 && < 1.7, aeson >= 0.7 && < 2.1, @@ -567,6 +566,8 @@ library network-uri >= 2.6 && < 2.8, pandoc-types >= 1.22.1 && < 1.23, parsec >= 3.1 && < 3.2, + pretty >= 1.1 && < 1.2, + pretty-show >= 1.10 && < 1.11, process >= 1.2.3 && < 1.7, random >= 1 && < 1.3, safe >= 0.3.18 && < 0.4, @@ -581,14 +582,13 @@ library text >= 1.1.1.0 && < 1.3, text-conversions >= 0.3 && < 0.4, time >= 1.5 && < 1.13, + unicode-collation >= 0.1.1 && < 0.2, unicode-transforms >= 0.3 && < 0.4, xml >= 1.3.12 && < 1.4, xml-conduit >= 1.9.1.1 && < 1.10, - unicode-collation >= 0.1.1 && < 0.2, + yaml >= 0.11 && < 0.12, zip-archive >= 0.2.3.4 && < 0.5, - zlib >= 0.5 && < 0.7, - pretty-show >= 1.10 && < 1.11, - pretty >= 1.1 && < 1.2 + zlib >= 0.5 && < 0.7 if os(windows) && arch(i386) build-depends: basement >= 0.0.10, foundation >= 0.0.23 diff --git a/src/Text/Pandoc/App.hs b/src/Text/Pandoc/App.hs index b639a97b7..20e647456 100644 --- a/src/Text/Pandoc/App.hs +++ b/src/Text/Pandoc/App.hs @@ -217,7 +217,7 @@ convertWithOpts opts = do case optMetadataFiles opts of [] -> return mempty paths -> mconcat <$> - mapM (\path -> do raw <- readFileLazy path + mapM (\path -> do raw <- readFileStrict path yamlToMeta readerOpts (Just path) raw) paths let transforms = (case optShiftHeadingLevelBy opts of diff --git a/src/Text/Pandoc/App/Opt.hs b/src/Text/Pandoc/App/Opt.hs index 48eb15fdf..93953d53f 100644 --- a/src/Text/Pandoc/App/Opt.hs +++ b/src/Text/Pandoc/App/Opt.hs @@ -40,7 +40,7 @@ import Text.Pandoc.Options (TopLevelDivision (TopLevelDefault), ReferenceLocation (EndOfDocument), ObfuscationMethod (NoObfuscation), CiteMethod (Citeproc)) -import Text.Pandoc.Class (readFileLazy, fileExists, setVerbosity, report, +import Text.Pandoc.Class (readFileStrict, fileExists, setVerbosity, report, PandocMonad(lookupEnv), getUserDataDir) import Text.Pandoc.Error (PandocError (PandocParseError, PandocSomeError)) import Text.Pandoc.Shared (camelCaseStrToHyphenated, defaultUserDataDir, @@ -54,22 +54,14 @@ import Data.Default (def) import qualified Data.Text as T import qualified Data.Map as M import Text.Pandoc.Definition (Meta(..), MetaValue(..)) -import Data.Aeson (defaultOptions, Options(..)) +import Data.Aeson (defaultOptions, Options(..), Result(..), fromJSON) import Data.Aeson.TH (deriveJSON) import Control.Applicative ((<|>)) -import Data.YAML +import Data.Yaml -- | The type of line-endings to be used when writing plain-text. data LineEnding = LF | CRLF | Native deriving (Show, Generic) -instance FromYAML LineEnding where - parseYAML = withStr "LineEnding" $ \t -> - case T.toLower t of - "lf" -> return LF - "crlf" -> return CRLF - "native" -> return Native - _ -> fail $ "Unknown line ending type " ++ show t - -- | How to handle output blocks in ipynb. data IpynbOutput = IpynbOutputAll @@ -77,14 +69,6 @@ data IpynbOutput = | IpynbOutputBest deriving (Show, Generic) -instance FromYAML IpynbOutput where - parseYAML = withStr "LineEnding" $ \t -> - case t of - "none" -> return IpynbOutputNone - "all" -> return IpynbOutputAll - "best" -> return IpynbOutputBest - _ -> fail $ "Unknown ipynb output type " ++ show t - -- | Data structure for command line options. data Opt = Opt { optTabStop :: Int -- ^ Number of spaces per tab @@ -163,9 +147,12 @@ data Opt = Opt , optSandbox :: Bool } deriving (Generic, Show) -instance FromYAML (Opt -> Opt) where - parseYAML (Mapping _ _ m) = chain doOpt (M.toList m) - parseYAML n = failAtNode n "Expected a mapping" +instance FromJSON (Opt -> Opt) where + parseJSON (Object m) = + case fromJSON (Object m) of + Error err' -> fail err' + Success (m' :: M.Map Text Value) -> chain doOpt (M.toList m') + parseJSON _ = fail "Expected a mapping" data DefaultsState = DefaultsState { @@ -174,22 +161,21 @@ data DefaultsState = DefaultsState } deriving (Show) instance (PandocMonad m, MonadIO m) - => FromYAML (Opt -> StateT DefaultsState m Opt) where - parseYAML (Mapping _ _ m) = do - let opts = M.mapKeys toText m - dataDir <- case M.lookup "data-dir" opts of - Nothing -> return Nothing - Just v -> Just . unpack <$> parseYAML v - f <- parseOptions (M.toList m) - case M.lookup "defaults" opts of - Just v -> do - g <- parseDefaults v dataDir - return $ g >=> f >=> resolveVarsInOpt - Nothing -> return $ f >=> resolveVarsInOpt - where - toText (Scalar _ (SStr s)) = s - toText _ = "" - parseYAML n = failAtNode n "Expected a mapping" + => FromJSON (Opt -> StateT DefaultsState m Opt) where + parseJSON (Object o) = + case fromJSON (Object o) of + Error err' -> fail err' + Success (opts :: M.Map Text Value) -> do + dataDir <- case M.lookup "data-dir" opts of + Nothing -> return Nothing + Just v -> Just . unpack <$> parseJSON v + f <- parseOptions (M.toList opts) + case M.lookup "defaults" opts of + Just v -> do + g <- parseDefaults v dataDir + return $ g >=> f >=> resolveVarsInOpt + Nothing -> return $ f >=> resolveVarsInOpt + parseJSON _ = fail "Expected a mapping" resolveVarsInOpt :: forall m. (PandocMonad m, MonadIO m) => Opt -> StateT DefaultsState m Opt @@ -303,7 +289,7 @@ resolveVarsInOpt parseDefaults :: (PandocMonad m, MonadIO m) - => Node Pos + => Value -> Maybe FilePath -> Parser (Opt -> StateT DefaultsState m Opt) parseDefaults n dataDir = parseDefsNames n >>= \ds -> return $ \o -> do @@ -322,11 +308,11 @@ parseDefaults n dataDir = parseDefsNames n >>= \ds -> return $ \o -> do "Error: Circular defaults file reference in " ++ "'" ++ defsParent ++ "'" else foldM applyDefaults o defsChildren - where parseDefsNames x = (parseYAML x >>= \xs -> return $ map unpack xs) - <|> (parseYAML x >>= \x' -> return [unpack x']) + where parseDefsNames x = (parseJSON x >>= \xs -> return $ map unpack xs) + <|> (parseJSON x >>= \x' -> return [unpack x']) parseOptions :: Monad m - => [(Node Pos, Node Pos)] + => [(Text, Value)] -> Parser (Opt -> StateT DefaultsState m Opt) parseOptions ns = do f <- chain doOpt' ns @@ -336,269 +322,267 @@ chain :: Monad m => (a -> m (b -> b)) -> [a] -> m (b -> b) chain f = foldM g id where g o n = f n >>= \o' -> return $ o' . o -doOpt' :: (Node Pos, Node Pos) -> Parser (Opt -> Opt) -doOpt' (k',v) = do - k <- parseStringKey k' +doOpt' :: (Text, Value) -> Parser (Opt -> Opt) +doOpt' (k,v) = do case k of "defaults" -> return id - _ -> doOpt (k',v) + _ -> doOpt (k,v) -doOpt :: (Node Pos, Node Pos) -> Parser (Opt -> Opt) -doOpt (k',v) = do - k <- parseStringKey k' +doOpt :: (Text, Value) -> Parser (Opt -> Opt) +doOpt (k,v) = do case k of "tab-stop" -> - parseYAML v >>= \x -> return (\o -> o{ optTabStop = x }) + parseJSON v >>= \x -> return (\o -> o{ optTabStop = x }) "preserve-tabs" -> - parseYAML v >>= \x -> return (\o -> o{ optPreserveTabs = x }) + parseJSON v >>= \x -> return (\o -> o{ optPreserveTabs = x }) "standalone" -> - parseYAML v >>= \x -> return (\o -> o{ optStandalone = x }) + parseJSON v >>= \x -> return (\o -> o{ optStandalone = x }) "table-of-contents" -> - parseYAML v >>= \x -> return (\o -> o{ optTableOfContents = x }) + parseJSON v >>= \x -> return (\o -> o{ optTableOfContents = x }) "toc" -> - parseYAML v >>= \x -> return (\o -> o{ optTableOfContents = x }) + parseJSON v >>= \x -> return (\o -> o{ optTableOfContents = x }) "from" -> - parseYAML v >>= \x -> return (\o -> o{ optFrom = x }) + parseJSON v >>= \x -> return (\o -> o{ optFrom = x }) "reader" -> - parseYAML v >>= \x -> return (\o -> o{ optFrom = x }) + parseJSON v >>= \x -> return (\o -> o{ optFrom = x }) "to" -> - parseYAML v >>= \x -> return (\o -> o{ optTo = x }) + parseJSON v >>= \x -> return (\o -> o{ optTo = x }) "writer" -> - parseYAML v >>= \x -> return (\o -> o{ optTo = x }) + parseJSON v >>= \x -> return (\o -> o{ optTo = x }) "shift-heading-level-by" -> - parseYAML v >>= \x -> return (\o -> o{ optShiftHeadingLevelBy = x }) + parseJSON v >>= \x -> return (\o -> o{ optShiftHeadingLevelBy = x }) "template" -> - parseYAML v >>= \x -> return (\o -> o{ optTemplate = unpack <$> x }) + parseJSON v >>= \x -> return (\o -> o{ optTemplate = unpack <$> x }) "variables" -> - parseYAML v >>= \x -> return (\o -> o{ optVariables = + parseJSON v >>= \x -> return (\o -> o{ optVariables = x <> optVariables o }) -- Note: x comes first because <> for Context is left-biased union -- and we want to favor later default files. See #5988. "metadata" -> yamlToMeta v >>= \x -> return (\o -> o{ optMetadata = optMetadata o <> x }) "metadata-files" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optMetadataFiles = optMetadataFiles o <> map unpack x }) "metadata-file" -> -- allow either a list or a single value - (parseYAML v >>= \x -> return (\o -> o{ optMetadataFiles = + (parseJSON v >>= \x -> return (\o -> o{ optMetadataFiles = optMetadataFiles o <> map unpack x })) <|> - (parseYAML v >>= \x -> + (parseJSON v >>= \x -> return (\o -> o{ optMetadataFiles = optMetadataFiles o <>[unpack x] })) "output-file" -> - parseYAML v >>= \x -> return (\o -> o{ optOutputFile = unpack <$> x }) + parseJSON v >>= \x -> return (\o -> o{ optOutputFile = unpack <$> x }) "input-files" -> - parseYAML v >>= \x -> return (\o -> o{ optInputFiles = + parseJSON v >>= \x -> return (\o -> o{ optInputFiles = optInputFiles o <> (map unpack <$> x) }) "input-file" -> -- allow either a list or a single value - (parseYAML v >>= \x -> return (\o -> o{ optInputFiles = + (parseJSON v >>= \x -> return (\o -> o{ optInputFiles = optInputFiles o <> (map unpack <$> x) })) <|> - (parseYAML v >>= \x -> return (\o -> o{ optInputFiles = + (parseJSON v >>= \x -> return (\o -> o{ optInputFiles = optInputFiles o <> ((\z -> [unpack z]) <$> x) })) "number-sections" -> - parseYAML v >>= \x -> return (\o -> o{ optNumberSections = x }) + parseJSON v >>= \x -> return (\o -> o{ optNumberSections = x }) "number-offset" -> - parseYAML v >>= \x -> return (\o -> o{ optNumberOffset = x }) + parseJSON v >>= \x -> return (\o -> o{ optNumberOffset = x }) "section-divs" -> - parseYAML v >>= \x -> return (\o -> o{ optSectionDivs = x }) + parseJSON v >>= \x -> return (\o -> o{ optSectionDivs = x }) "incremental" -> - parseYAML v >>= \x -> return (\o -> o{ optIncremental = x }) + parseJSON v >>= \x -> return (\o -> o{ optIncremental = x }) "self-contained" -> - parseYAML v >>= \x -> return (\o -> o{ optSelfContained = x }) + parseJSON v >>= \x -> return (\o -> o{ optSelfContained = x }) "html-q-tags" -> - parseYAML v >>= \x -> return (\o -> o{ optHtmlQTags = x }) + parseJSON v >>= \x -> return (\o -> o{ optHtmlQTags = x }) "highlight-style" -> - parseYAML v >>= \x -> return (\o -> o{ optHighlightStyle = x }) + parseJSON v >>= \x -> return (\o -> o{ optHighlightStyle = x }) "syntax-definition" -> - (parseYAML v >>= \x -> + (parseJSON v >>= \x -> return (\o -> o{ optSyntaxDefinitions = optSyntaxDefinitions o <> map unpack x })) <|> - (parseYAML v >>= \x -> + (parseJSON v >>= \x -> return (\o -> o{ optSyntaxDefinitions = optSyntaxDefinitions o <> [unpack x] })) "syntax-definitions" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optSyntaxDefinitions = optSyntaxDefinitions o <> map unpack x }) "top-level-division" -> - parseYAML v >>= \x -> return (\o -> o{ optTopLevelDivision = x }) + parseJSON v >>= \x -> return (\o -> o{ optTopLevelDivision = x }) "html-math-method" -> - parseYAML v >>= \x -> return (\o -> o{ optHTMLMathMethod = x }) + parseJSON v >>= \x -> return (\o -> o{ optHTMLMathMethod = x }) "abbreviations" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optAbbreviations = unpack <$> x }) "reference-doc" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optReferenceDoc = unpack <$> x }) "epub-subdirectory" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optEpubSubdirectory = unpack x }) "epub-metadata" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optEpubMetadata = unpack <$> x }) "epub-fonts" -> - parseYAML v >>= \x -> return (\o -> o{ optEpubFonts = optEpubFonts o <> + parseJSON v >>= \x -> return (\o -> o{ optEpubFonts = optEpubFonts o <> map unpack x }) "epub-chapter-level" -> - parseYAML v >>= \x -> return (\o -> o{ optEpubChapterLevel = x }) + parseJSON v >>= \x -> return (\o -> o{ optEpubChapterLevel = x }) "epub-cover-image" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optEpubCoverImage = unpack <$> x }) "toc-depth" -> - parseYAML v >>= \x -> return (\o -> o{ optTOCDepth = x }) + parseJSON v >>= \x -> return (\o -> o{ optTOCDepth = x }) "dump-args" -> - parseYAML v >>= \x -> return (\o -> o{ optDumpArgs = x }) + parseJSON v >>= \x -> return (\o -> o{ optDumpArgs = x }) "ignore-args" -> - parseYAML v >>= \x -> return (\o -> o{ optIgnoreArgs = x }) + parseJSON v >>= \x -> return (\o -> o{ optIgnoreArgs = x }) "verbosity" -> - parseYAML v >>= \x -> return (\o -> o{ optVerbosity = x }) + parseJSON v >>= \x -> return (\o -> o{ optVerbosity = x }) "trace" -> - parseYAML v >>= \x -> return (\o -> o{ optTrace = x }) + parseJSON v >>= \x -> return (\o -> o{ optTrace = x }) "log-file" -> - parseYAML v >>= \x -> return (\o -> o{ optLogFile = unpack <$> x }) + parseJSON v >>= \x -> return (\o -> o{ optLogFile = unpack <$> x }) "fail-if-warnings" -> - parseYAML v >>= \x -> return (\o -> o{ optFailIfWarnings = x }) + parseJSON v >>= \x -> return (\o -> o{ optFailIfWarnings = x }) "reference-links" -> - parseYAML v >>= \x -> return (\o -> o{ optReferenceLinks = x }) + parseJSON v >>= \x -> return (\o -> o{ optReferenceLinks = x }) "reference-location" -> - parseYAML v >>= \x -> return (\o -> o{ optReferenceLocation = x }) + parseJSON v >>= \x -> return (\o -> o{ optReferenceLocation = x }) "dpi" -> - parseYAML v >>= \x -> return (\o -> o{ optDpi = x }) + parseJSON v >>= \x -> return (\o -> o{ optDpi = x }) "wrap" -> - parseYAML v >>= \x -> return (\o -> o{ optWrap = x }) + parseJSON v >>= \x -> return (\o -> o{ optWrap = x }) "columns" -> - parseYAML v >>= \x -> return (\o -> o{ optColumns = x }) + parseJSON v >>= \x -> return (\o -> o{ optColumns = x }) "filters" -> - parseYAML v >>= \x -> return (\o -> o{ optFilters = optFilters o <> x }) + parseJSON v >>= \x -> return (\o -> o{ optFilters = optFilters o <> x }) "citeproc" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> if x then return (\o -> o{ optFilters = CiteprocFilter : optFilters o }) else return id "email-obfuscation" -> - parseYAML v >>= \x -> return (\o -> o{ optEmailObfuscation = x }) + parseJSON v >>= \x -> return (\o -> o{ optEmailObfuscation = x }) "identifier-prefix" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optIdentifierPrefix = x }) "strip-empty-paragraphs" -> - parseYAML v >>= \x -> return (\o -> o{ optStripEmptyParagraphs = x }) + parseJSON v >>= \x -> return (\o -> o{ optStripEmptyParagraphs = x }) "indented-code-classes" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optIndentedCodeClasses = x }) "data-dir" -> - parseYAML v >>= \x -> return (\o -> o{ optDataDir = unpack <$> x }) + parseJSON v >>= \x -> return (\o -> o{ optDataDir = unpack <$> x }) "cite-method" -> - parseYAML v >>= \x -> return (\o -> o{ optCiteMethod = x }) + parseJSON v >>= \x -> return (\o -> o{ optCiteMethod = x }) "listings" -> - parseYAML v >>= \x -> return (\o -> o{ optListings = x }) + parseJSON v >>= \x -> return (\o -> o{ optListings = x }) "pdf-engine" -> - parseYAML v >>= \x -> return (\o -> o{ optPdfEngine = unpack <$> x }) + parseJSON v >>= \x -> return (\o -> o{ optPdfEngine = unpack <$> x }) "pdf-engine-opts" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optPdfEngineOpts = map unpack x }) "pdf-engine-opt" -> - (parseYAML v >>= \x -> + (parseJSON v >>= \x -> return (\o -> o{ optPdfEngineOpts = map unpack x })) <|> - (parseYAML v >>= \x -> + (parseJSON v >>= \x -> return (\o -> o{ optPdfEngineOpts = [unpack x] })) "slide-level" -> - parseYAML v >>= \x -> return (\o -> o{ optSlideLevel = x }) + parseJSON v >>= \x -> return (\o -> o{ optSlideLevel = x }) "atx-headers" -> - parseYAML v >>= \x -> return (\o -> o{ optSetextHeaders = not x }) + parseJSON v >>= \x -> return (\o -> o{ optSetextHeaders = not x }) "markdown-headings" -> - parseYAML v >>= \x -> return (\o -> + parseJSON v >>= \x -> return (\o -> case T.toLower x of "atx" -> o{ optSetextHeaders = False } "setext" -> o{ optSetextHeaders = True } _ -> o) "ascii" -> - parseYAML v >>= \x -> return (\o -> o{ optAscii = x }) + parseJSON v >>= \x -> return (\o -> o{ optAscii = x }) "default-image-extension" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optDefaultImageExtension = x }) "extract-media" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optExtractMedia = unpack <$> x }) "track-changes" -> - parseYAML v >>= \x -> return (\o -> o{ optTrackChanges = x }) + parseJSON v >>= \x -> return (\o -> o{ optTrackChanges = x }) "file-scope" -> - parseYAML v >>= \x -> return (\o -> o{ optFileScope = x }) + parseJSON v >>= \x -> return (\o -> o{ optFileScope = x }) "title-prefix" -> - parseYAML v >>= \x -> return (\o -> o{ optTitlePrefix = x, + parseJSON v >>= \x -> return (\o -> o{ optTitlePrefix = x, optStandalone = True }) "css" -> - (parseYAML v >>= \x -> return (\o -> o{ optCss = optCss o <> + (parseJSON v >>= \x -> return (\o -> o{ optCss = optCss o <> map unpack x })) <|> - (parseYAML v >>= \x -> return (\o -> o{ optCss = optCss o <> + (parseJSON v >>= \x -> return (\o -> o{ optCss = optCss o <> [unpack x] })) "bibliography" -> - (parseYAML v >>= \x -> return (\o -> + (parseJSON v >>= \x -> return (\o -> o{ optBibliography = optBibliography o <> map unpack x })) <|> - (parseYAML v >>= \x -> return (\o -> + (parseJSON v >>= \x -> return (\o -> o{ optBibliography = optBibliography o <> [unpack x] })) "csl" -> - parseYAML v >>= \x -> return (\o -> o{ optCSL = unpack <$> x }) + parseJSON v >>= \x -> return (\o -> o{ optCSL = unpack <$> x }) "citation-abbreviations" -> - parseYAML v >>= \x -> return (\o -> o{ optCitationAbbreviations = + parseJSON v >>= \x -> return (\o -> o{ optCitationAbbreviations = unpack <$> x }) "ipynb-output" -> - parseYAML v >>= \x -> return (\o -> o{ optIpynbOutput = x }) + parseJSON v >>= \x -> return (\o -> o{ optIpynbOutput = x }) "include-before-body" -> - (parseYAML v >>= \x -> + (parseJSON v >>= \x -> return (\o -> o{ optIncludeBeforeBody = optIncludeBeforeBody o <> map unpack x })) <|> - (parseYAML v >>= \x -> + (parseJSON v >>= \x -> return (\o -> o{ optIncludeBeforeBody = optIncludeBeforeBody o <> [unpack x] })) "include-after-body" -> - (parseYAML v >>= \x -> + (parseJSON v >>= \x -> return (\o -> o{ optIncludeAfterBody = optIncludeAfterBody o <> map unpack x })) <|> - (parseYAML v >>= \x -> + (parseJSON v >>= \x -> return (\o -> o{ optIncludeAfterBody = optIncludeAfterBody o <> [unpack x] })) "include-in-header" -> - (parseYAML v >>= \x -> + (parseJSON v >>= \x -> return (\o -> o{ optIncludeInHeader = optIncludeInHeader o <> map unpack x })) <|> - (parseYAML v >>= \x -> + (parseJSON v >>= \x -> return (\o -> o{ optIncludeInHeader = optIncludeInHeader o <> [unpack x] })) "resource-path" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optResourcePath = map unpack x <> optResourcePath o }) "request-headers" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optRequestHeaders = x }) "no-check-certificate" -> - parseYAML v >>= \x -> + parseJSON v >>= \x -> return (\o -> o{ optNoCheckCertificate = x }) "eol" -> - parseYAML v >>= \x -> return (\o -> o{ optEol = x }) + parseJSON v >>= \x -> return (\o -> o{ optEol = x }) "strip-comments" -> - parseYAML v >>= \x -> return (\o -> o { optStripComments = x }) + parseJSON v >>= \x -> return (\o -> o { optStripComments = x }) "sandbox" -> - parseYAML v >>= \x -> return (\o -> o { optSandbox = x }) - _ -> failAtNode k' $ "Unknown option " ++ show k + parseJSON v >>= \x -> return (\o -> o { optSandbox = x }) + _ -> fail $ "Unknown option " ++ show k -- | Defaults for command-line options. defaultOpts :: Opt @@ -679,18 +663,12 @@ defaultOpts = Opt , optSandbox = False } -parseStringKey :: Node Pos -> Parser Text -parseStringKey k = case k of - Scalar _ (SStr t) -> return t - Scalar _ _ -> failAtNode k "Non-string key" - _ -> failAtNode k "Non-scalar key" - -yamlToMeta :: Node Pos -> Parser Meta -yamlToMeta (Mapping _ _ m) = - either (fail . show) return $ runEverything (yamlMap pMetaString m) - where - pMetaString = pure . MetaString <$> P.manyChar P.anyChar - runEverything p = +yamlToMeta :: Value -> Parser Meta +yamlToMeta (Object o) = + either (fail . show) return $ runEverything (yamlMap pMetaString o) + where + pMetaString = pure . MetaString <$> P.manyChar P.anyChar + runEverything p = runPure (P.readWithM p (def :: P.ParserState) ("" :: Text)) >>= fmap (Meta . flip P.runF def) yamlToMeta _ = return mempty @@ -703,14 +681,12 @@ applyDefaults :: (PandocMonad m, MonadIO m) applyDefaults opt file = do setVerbosity $ optVerbosity opt modify $ \defsState -> defsState{ curDefaults = Just file } - inp <- readFileLazy file - case decode1 inp of + inp <- readFileStrict file + case decodeEither' inp of Right f -> f opt - Left (errpos, errmsg) -> throwError $ - PandocParseError $ T.pack $ - "Error parsing " ++ file ++ " line " ++ - show (posLine errpos) ++ " column " ++ - show (posColumn errpos) ++ ":\n" ++ errmsg + Left err' -> throwError $ + PandocParseError + $ T.pack $ Data.Yaml.prettyPrintParseException err' fullDefaultsPath :: (PandocMonad m, MonadIO m) => Maybe FilePath diff --git a/src/Text/Pandoc/Citeproc.hs b/src/Text/Pandoc/Citeproc.hs index 9a649402e..9f110330e 100644 --- a/src/Text/Pandoc/Citeproc.hs +++ b/src/Text/Pandoc/Citeproc.hs @@ -268,7 +268,7 @@ getRefs locale format idpred mbfp raw = do rs <- yamlToRefs idpred def{ readerExtensions = pandocExtensions } (T.unpack <$> mbfp) - (L.fromStrict raw) + raw return $ mapMaybe metaValueToReference rs -- assumes we walk in same order as query diff --git a/src/Text/Pandoc/Filter.hs b/src/Text/Pandoc/Filter.hs index c2f522109..84015ed92 100644 --- a/src/Text/Pandoc/Filter.hs +++ b/src/Text/Pandoc/Filter.hs @@ -19,7 +19,7 @@ module Text.Pandoc.Filter ) where import System.CPUTime (getCPUTime) -import Data.Aeson.TH (deriveJSON, defaultOptions) +import Data.Aeson import GHC.Generics (Generic) import Text.Pandoc.Class (report, getVerbosity, PandocMonad) import Text.Pandoc.Definition (Pandoc) @@ -29,7 +29,6 @@ import Text.Pandoc.Citeproc (processCitations) import qualified Text.Pandoc.Filter.JSON as JSONFilter import qualified Text.Pandoc.Filter.Lua as LuaFilter import qualified Text.Pandoc.Filter.Path as Path -import Data.YAML import qualified Data.Text as T import System.FilePath (takeExtension) import Control.Applicative ((<|>)) @@ -42,9 +41,9 @@ data Filter = LuaFilter FilePath | CiteprocFilter -- built-in citeproc deriving (Show, Generic) -instance FromYAML Filter where - parseYAML node = - (withMap "Filter" $ \m -> do +instance FromJSON Filter where + parseJSON node = + (withObject "Filter" $ \m -> do ty <- m .: "type" fp <- m .:? "path" let missingPath = fail $ "Expected 'path' for filter of type " ++ show ty @@ -55,7 +54,7 @@ instance FromYAML Filter where "json" -> filterWithPath JSONFilter fp _ -> fail $ "Unknown filter type " ++ show (ty :: T.Text)) node <|> - (withStr "Filter" $ \t -> do + (withText "Filter" $ \t -> do let fp = T.unpack t if fp == "citeproc" then return CiteprocFilter @@ -64,6 +63,13 @@ instance FromYAML Filter where ".lua" -> LuaFilter fp _ -> JSONFilter fp) node +instance ToJSON Filter where + toJSON CiteprocFilter = object [ "type" .= String "citeproc" ] + toJSON (LuaFilter fp) = object [ "type" .= String "lua", + "path" .= String (T.pack fp) ] + toJSON (JSONFilter fp) = object [ "type" .= String "json", + "path" .= String (T.pack fp) ] + -- | Modify the given document using a filter. applyFilters :: (PandocMonad m, MonadIO m) => ReaderOptions @@ -96,5 +102,3 @@ expandFilterPath :: (PandocMonad m, MonadIO m) => Filter -> m Filter expandFilterPath (LuaFilter fp) = LuaFilter <$> Path.expandFilterPath fp expandFilterPath (JSONFilter fp) = JSONFilter <$> Path.expandFilterPath fp expandFilterPath CiteprocFilter = return CiteprocFilter - -$(deriveJSON defaultOptions ''Filter) diff --git a/src/Text/Pandoc/Logging.hs b/src/Text/Pandoc/Logging.hs index bb2fb5d36..2268f29f7 100644 --- a/src/Text/Pandoc/Logging.hs +++ b/src/Text/Pandoc/Logging.hs @@ -24,7 +24,6 @@ module Text.Pandoc.Logging ( ) where import Control.Monad (mzero) -import Data.YAML (withStr, FromYAML(..)) import Data.Aeson import Data.Aeson.Encode.Pretty (Config (..), defConfig, encodePretty', keyOrder) @@ -53,13 +52,6 @@ instance FromJSON Verbosity where _ -> mzero parseJSON _ = mzero -instance FromYAML Verbosity where - parseYAML = withStr "Verbosity" $ \case - "ERROR" -> return ERROR - "WARNING" -> return WARNING - "INFO" -> return INFO - _ -> mzero - data LogMessage = SkippedContent Text SourcePos | IgnoredElement Text diff --git a/src/Text/Pandoc/Options.hs b/src/Text/Pandoc/Options.hs index 85d9aa103..ba18377c2 100644 --- a/src/Text/Pandoc/Options.hs +++ b/src/Text/Pandoc/Options.hs @@ -34,7 +34,6 @@ module Text.Pandoc.Options ( module Text.Pandoc.Extensions , defaultKaTeXURL ) where import Control.Applicative ((<|>)) -import Data.Char (toLower) import Data.Maybe (fromMaybe) import Data.Data (Data) import Data.Default @@ -46,10 +45,9 @@ import Skylighting (SyntaxMap, defaultSyntaxMap) import Text.DocTemplates (Context(..), Template) import Text.Pandoc.Extensions import Text.Pandoc.Highlighting (Style, pygments) -import Text.Pandoc.Shared (camelCaseStrToHyphenated) -import Data.Aeson.TH (deriveJSON, defaultOptions, Options(..), - SumEncoding(..)) -import Data.YAML +import Text.Pandoc.UTF8 (toStringLazy) +import Data.Aeson.TH (deriveJSON) +import Data.Aeson class HasSyntaxExtensions a where getExtensions :: a -> Extensions @@ -106,9 +104,9 @@ data HTMLMathMethod = PlainMath | KaTeX Text -- url of KaTeX files deriving (Show, Read, Eq, Data, Typeable, Generic) -instance FromYAML HTMLMathMethod where - parseYAML node = - (withMap "HTMLMathMethod" $ \m -> do +instance FromJSON HTMLMathMethod where + parseJSON node = + (withObject "HTMLMathMethod" $ \m -> do method <- m .: "method" mburl <- m .:? "url" case method :: Text of @@ -121,28 +119,48 @@ instance FromYAML HTMLMathMethod where "katex" -> return $ KaTeX $ fromMaybe defaultKaTeXURL mburl _ -> fail $ "Unknown HTML math method " ++ show method) node - <|> (withStr "HTMLMathMethod" $ \method -> - case method of - "plain" -> return PlainMath - "webtex" -> return $ WebTeX "" - "gladtex" -> return GladTeX - "mathml" -> return MathML - "mathjax" -> return $ MathJax defaultMathJaxURL - "katex" -> return $ KaTeX defaultKaTeXURL - _ -> fail $ "Unknown HTML math method " ++ show method) node + <|> (case node of + String "plain" -> return PlainMath + String "webtex" -> return $ WebTeX "" + String "gladtex" -> return GladTeX + String "mathml" -> return MathML + String "mathjax" -> return $ MathJax defaultMathJaxURL + String "katex" -> return $ KaTeX defaultKaTeXURL + _ -> fail $ "Unknown HTML math method " <> + toStringLazy (encode node)) + +instance ToJSON HTMLMathMethod where + toJSON PlainMath = String "plain" + toJSON (WebTeX "") = String "webtex" + toJSON (WebTeX url) = object ["method" .= String "webtex", + "url" .= String url] + toJSON GladTeX = String "gladtex" + toJSON MathML = String "mathml" + toJSON (MathJax "") = String "mathjax" + toJSON (MathJax url) = object ["method" .= String "mathjax", + "url" .= String url] + toJSON (KaTeX "") = String "katex" + toJSON (KaTeX url) = object ["method" .= String "katex", + "url" .= String url] data CiteMethod = Citeproc -- use citeproc to render them | Natbib -- output natbib cite commands | Biblatex -- output biblatex cite commands deriving (Show, Read, Eq, Data, Typeable, Generic) -instance FromYAML CiteMethod where - parseYAML = withStr "Citeproc" $ \t -> - case t of - "citeproc" -> return Citeproc - "natbib" -> return Natbib - "biblatex" -> return Biblatex - _ -> fail $ "Unknown citation method " ++ show t +instance FromJSON CiteMethod where + parseJSON v = + case v of + String "citeproc" -> return Citeproc + String "natbib" -> return Natbib + String "biblatex" -> return Biblatex + _ -> fail $ "Unknown citation method: " <> + toStringLazy (encode v) + +instance ToJSON CiteMethod where + toJSON Citeproc = String "citeproc" + toJSON Natbib = String "natbib" + toJSON Biblatex = String "biblatex" -- | Methods for obfuscating email addresses in HTML. data ObfuscationMethod = NoObfuscation @@ -150,13 +168,18 @@ data ObfuscationMethod = NoObfuscation | JavascriptObfuscation deriving (Show, Read, Eq, Data, Typeable, Generic) -instance FromYAML ObfuscationMethod where - parseYAML = withStr "Citeproc" $ \t -> - case t of - "none" -> return NoObfuscation - "references" -> return ReferenceObfuscation - "javascript" -> return JavascriptObfuscation - _ -> fail $ "Unknown obfuscation method " ++ show t +instance FromJSON ObfuscationMethod where + parseJSON v = + case v of + String "none" -> return NoObfuscation + String "references" -> return ReferenceObfuscation + String "javascript" -> return JavascriptObfuscation + _ -> fail $ "Unknown obfuscation method " ++ toStringLazy (encode v) + +instance ToJSON ObfuscationMethod where + toJSON NoObfuscation = String "none" + toJSON ReferenceObfuscation = String "references" + toJSON JavascriptObfuscation = String "javascript" -- | Varieties of HTML slide shows. data HTMLSlideVariant = S5Slides @@ -173,13 +196,22 @@ data TrackChanges = AcceptChanges | AllChanges deriving (Show, Read, Eq, Data, Typeable, Generic) -instance FromYAML TrackChanges where - parseYAML = withStr "TrackChanges" $ \t -> - case t of - "accept" -> return AcceptChanges - "reject" -> return RejectChanges - "all" -> return AllChanges - _ -> fail $ "Unknown track changes method " ++ show t +-- update in doc/filters.md if this changes: +instance FromJSON TrackChanges where + parseJSON v = + case v of + String "accept" -> return AcceptChanges + String "reject" -> return RejectChanges + String "all" -> return AllChanges + String "accept-changes" -> return AcceptChanges + String "reject-changes" -> return RejectChanges + String "all-changes" -> return AllChanges + _ -> fail $ "Unknown track changes method " <> toStringLazy (encode v) + +instance ToJSON TrackChanges where + toJSON AcceptChanges = String "accept-changes" + toJSON RejectChanges = String "reject-changes" + toJSON AllChanges = String "all-changes" -- | Options for wrapping text in the output. data WrapOption = WrapAuto -- ^ Automatically wrap to width @@ -187,14 +219,21 @@ data WrapOption = WrapAuto -- ^ Automatically wrap to width | WrapPreserve -- ^ Preserve wrapping of input source deriving (Show, Read, Eq, Data, Typeable, Generic) -instance FromYAML WrapOption where - parseYAML = withStr "WrapOption" $ \t -> - case t of - "auto" -> return WrapAuto - "none" -> return WrapNone - "preserve" -> return WrapPreserve - _ -> fail $ "Unknown wrap method " ++ show t - +instance FromJSON WrapOption where + parseJSON v = + case v of + String "auto" -> return WrapAuto + String "wrap-auto" -> return WrapAuto + String "none" -> return WrapNone + String "wrap-none" -> return WrapNone + String "preserve" -> return WrapPreserve + String "wrap-preserve" -> return WrapPreserve + _ -> fail $ "Unknown wrap method " <> toStringLazy (encode v) + +instance ToJSON WrapOption where + toJSON WrapAuto = "wrap-auto" + toJSON WrapNone = "wrap-none" + toJSON WrapPreserve = "wrap-preserve" -- | Options defining the type of top-level headers. data TopLevelDivision = TopLevelPart -- ^ Top-level headers become parts @@ -204,15 +243,24 @@ data TopLevelDivision = TopLevelPart -- ^ Top-level headers become parts -- heuristics deriving (Show, Read, Eq, Data, Typeable, Generic) -instance FromYAML TopLevelDivision where - parseYAML = withStr "TopLevelDivision" $ \t -> - case t of - "part" -> return TopLevelPart - "chapter" -> return TopLevelChapter - "section" -> return TopLevelSection - "default" -> return TopLevelDefault - _ -> fail $ "Unknown top level division " ++ show t - +instance FromJSON TopLevelDivision where + parseJSON v = + case v of + String "part" -> return TopLevelPart + String "top-level-part" -> return TopLevelPart + String "chapter" -> return TopLevelChapter + String "top-level-chapter" -> return TopLevelChapter + String "section" -> return TopLevelSection + String "top-level-section" -> return TopLevelSection + String "default" -> return TopLevelDefault + String "top-level-default" -> return TopLevelDefault + _ -> fail $ "Unknown top level division " <> toStringLazy (encode v) + +instance ToJSON TopLevelDivision where + toJSON TopLevelPart = "top-level-part" + toJSON TopLevelChapter = "top-level-chapter" + toJSON TopLevelSection = "top-level-section" + toJSON TopLevelDefault = "top-level-default" -- | Locations for footnotes and references in markdown output data ReferenceLocation = EndOfBlock -- ^ End of block @@ -220,14 +268,21 @@ data ReferenceLocation = EndOfBlock -- ^ End of block | EndOfDocument -- ^ at end of document deriving (Show, Read, Eq, Data, Typeable, Generic) -instance FromYAML ReferenceLocation where - parseYAML = withStr "ReferenceLocation" $ \t -> - case t of - "block" -> return EndOfBlock - "section" -> return EndOfSection - "document" -> return EndOfDocument - _ -> fail $ "Unknown reference location " ++ show t - +instance FromJSON ReferenceLocation where + parseJSON v = + case v of + String "block" -> return EndOfBlock + String "end-of-block" -> return EndOfBlock + String "section" -> return EndOfSection + String "end-of-section" -> return EndOfSection + String "document" -> return EndOfDocument + String "end-of-document" -> return EndOfDocument + _ -> fail $ "Unknown reference location " <> toStringLazy (encode v) + +instance ToJSON ReferenceLocation where + toJSON EndOfBlock = "end-of-block" + toJSON EndOfSection = "end-of-section" + toJSON EndOfDocument = "end-of-document" -- | Options for writers data WriterOptions = WriterOptions @@ -315,43 +370,7 @@ defaultMathJaxURL = "https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml-full.j defaultKaTeXURL :: Text defaultKaTeXURL = "https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.11.1/" --- Update documentation in doc/filters.md if this is changed. -$(deriveJSON defaultOptions{ constructorTagModifier = - camelCaseStrToHyphenated - } ''TrackChanges) - -$(deriveJSON defaultOptions{ constructorTagModifier = - camelCaseStrToHyphenated - } ''WrapOption) - -$(deriveJSON defaultOptions{ constructorTagModifier = - camelCaseStrToHyphenated . drop 8 - } ''TopLevelDivision) - -$(deriveJSON defaultOptions{ constructorTagModifier = - camelCaseStrToHyphenated - } ''ReferenceLocation) - -- Update documentation in doc/filters.md if this is changed. $(deriveJSON defaultOptions ''ReaderOptions) -$(deriveJSON defaultOptions{ - constructorTagModifier = map toLower, - sumEncoding = TaggedObject{ - tagFieldName = "method", - contentsFieldName = "url" } - } ''HTMLMathMethod) - -$(deriveJSON defaultOptions{ constructorTagModifier = - camelCaseStrToHyphenated - } ''CiteMethod) - -$(deriveJSON defaultOptions{ constructorTagModifier = - \case - "NoObfuscation" -> "none" - "ReferenceObfuscation" -> "references" - "JavascriptObfuscation" -> "javascript" - _ -> "none" - } ''ObfuscationMethod) - $(deriveJSON defaultOptions ''HTMLSlideVariant) diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs index e7ab8efb4..a73c0cba3 100644 --- a/src/Text/Pandoc/Readers/Markdown.hs +++ b/src/Text/Pandoc/Readers/Markdown.hs @@ -28,7 +28,7 @@ import Data.Maybe import qualified Data.Set as Set import Data.Text (Text) import qualified Data.Text as T -import qualified Data.ByteString.Lazy as BL +import qualified Data.ByteString as BS import System.FilePath (addExtension, takeExtension, takeDirectory) import qualified System.FilePath.Windows as Windows import qualified System.FilePath.Posix as Posix @@ -72,14 +72,12 @@ readMarkdown opts s = do yamlToMeta :: PandocMonad m => ReaderOptions -> Maybe FilePath - -> BL.ByteString + -> BS.ByteString -> m Meta yamlToMeta opts mbfp bstr = do let parser = do oldPos <- getPosition - case mbfp of - Nothing -> return () - Just fp -> setPosition $ initialPos fp + setPosition $ initialPos (fromMaybe "" mbfp) meta <- yamlBsToMeta (fmap B.toMetaValue <$> parseBlocks) bstr setPosition oldPos return $ runF meta defaultParserState @@ -95,7 +93,7 @@ yamlToRefs :: PandocMonad m => (Text -> Bool) -> ReaderOptions -> Maybe FilePath - -> BL.ByteString + -> BS.ByteString -> m [MetaValue] yamlToRefs idpred opts mbfp bstr = do let parser = do diff --git a/src/Text/Pandoc/Readers/Metadata.hs b/src/Text/Pandoc/Readers/Metadata.hs index cbc523b25..534a7645b 100644 --- a/src/Text/Pandoc/Readers/Metadata.hs +++ b/src/Text/Pandoc/Readers/Metadata.hs @@ -17,102 +17,61 @@ module Text.Pandoc.Readers.Metadata ( yamlMetaBlock, yamlMap ) where -import Control.Monad + import Control.Monad.Except (throwError) -import qualified Data.ByteString.Lazy as BL +import qualified Data.ByteString as B import qualified Data.Map as M -import Data.Maybe import Data.Text (Text) import qualified Data.Text as T -import qualified Data.YAML as YAML -import qualified Data.YAML.Event as YE +import qualified Data.Yaml as Yaml +import Data.Aeson (Value(..), Object, Result(..), fromJSON, (.:?), withObject) +import Data.Aeson.Types (parse) +import Text.Pandoc.Shared (tshow) import Text.Pandoc.Class.PandocMonad (PandocMonad (..)) -import Text.Pandoc.Definition +import Text.Pandoc.Definition hiding (Null) import Text.Pandoc.Error -import Text.Pandoc.Parsing hiding (tableWith) -import Text.Pandoc.Shared -import qualified Data.Text.Lazy as TL +import Text.Pandoc.Parsing hiding (tableWith, parse) + + import qualified Text.Pandoc.UTF8 as UTF8 yamlBsToMeta :: (PandocMonad m, HasLastStrPosition st) => ParserT Sources st m (Future st MetaValue) - -> BL.ByteString + -> B.ByteString -> ParserT Sources st m (Future st Meta) yamlBsToMeta pMetaValue bstr = do - case YAML.decodeNode' YAML.failsafeSchemaResolver False False bstr of - Right (YAML.Doc (YAML.Mapping _ _ o):_) - -> fmap Meta <$> yamlMap pMetaValue o - Right [] -> return . return $ mempty - Right [YAML.Doc (YAML.Scalar _ YAML.SNull)] - -> return . return $ mempty - -- the following is what we get from a comment: - Right [YAML.Doc (YAML.Scalar _ (YAML.SUnknown _ ""))] - -> return . return $ mempty + case Yaml.decodeAllEither' bstr of + Right (Object o:_) -> fmap Meta <$> yamlMap pMetaValue o + Right [Null] -> return . return $ mempty Right _ -> Prelude.fail "expected YAML object" - Left (yamlpos, err') - -> do pos <- getPosition - setPosition $ incSourceLine - (setSourceColumn pos (YE.posColumn yamlpos)) - (YE.posLine yamlpos - 1) - Prelude.fail err' - -fakePos :: YAML.Pos -fakePos = YAML.Pos (-1) (-1) 1 0 - -lookupYAML :: Text - -> YAML.Node YE.Pos - -> Maybe (YAML.Node YE.Pos) -lookupYAML t (YAML.Mapping _ _ m) = - M.lookup (YAML.Scalar fakePos (YAML.SUnknown YE.untagged t)) m - `mplus` - M.lookup (YAML.Scalar fakePos (YAML.SStr t)) m -lookupYAML _ _ = Nothing + Left err' -> do + throwError $ PandocParseError + $ T.pack $ Yaml.prettyPrintParseException err' -- Returns filtered list of references. yamlBsToRefs :: (PandocMonad m, HasLastStrPosition st) => ParserT Sources st m (Future st MetaValue) -> (Text -> Bool) -- ^ Filter for id - -> BL.ByteString + -> B.ByteString -> ParserT Sources st m (Future st [MetaValue]) yamlBsToRefs pMetaValue idpred bstr = - case YAML.decodeNode' YAML.failsafeSchemaResolver False False bstr of - Right (YAML.Doc o@YAML.Mapping{}:_) - -> case lookupYAML "references" o of - Just (YAML.Sequence _ _ ns) -> do - let g n = case lookupYAML "id" n of - Just n' -> - case nodeToKey n' of - Nothing -> False - Just t -> idpred t || - case lookupYAML "other-ids" n of - Just (YAML.Sequence _ _ ns') -> - let ts' = mapMaybe nodeToKey ns' - in any idpred ts' - _ -> False - Nothing -> False - sequence <$> - mapM (yamlToMetaValue pMetaValue) (filter g ns) - Just _ -> - Prelude.fail "expecting sequence in 'references' field" - Nothing -> - Prelude.fail "expecting 'references' field" - - Right [] -> return . return $ mempty - Right [YAML.Doc (YAML.Scalar _ YAML.SNull)] - -> return . return $ mempty - Right _ -> Prelude.fail "expecting YAML object" - Left (yamlpos, err') - -> do pos <- getPosition - setPosition $ incSourceLine - (setSourceColumn pos (YE.posColumn yamlpos)) - (YE.posLine yamlpos - 1) - Prelude.fail err' - - -nodeToKey :: YAML.Node YE.Pos -> Maybe Text -nodeToKey (YAML.Scalar _ (YAML.SStr t)) = Just t -nodeToKey (YAML.Scalar _ (YAML.SUnknown _ t)) = Just t -nodeToKey _ = Nothing + case Yaml.decodeEither' bstr of + Right (Object m) -> do + let isSelected (String t) = idpred t + isSelected _ = False + let hasSelectedId (Object o) = + case parse (withObject "ref" (.:? "id")) (Object o) of + Success (Just id') -> isSelected id' + _ -> False + hasSelectedId _ = False + case parse (withObject "metadata" (.:? "references")) (Object m) of + Success (Just refs) -> sequence <$> + mapM (yamlToMetaValue pMetaValue) (filter hasSelectedId refs) + _ -> return $ return [] + Right _ -> return . return $ [] + Left err' -> do + throwError $ PandocParseError + $ T.pack $ Yaml.prettyPrintParseException err' normalizeMetaValue :: (PandocMonad m, HasLastStrPosition st) => ParserT Sources st m (Future st MetaValue) @@ -133,47 +92,36 @@ normalizeMetaValue pMetaValue x = isSpaceChar '\t' = True isSpaceChar _ = False -checkBoolean :: Text -> Maybe Bool -checkBoolean t - | t == T.pack "true" || t == T.pack "True" || t == T.pack "TRUE" = Just True - | t == T.pack "false" || t == T.pack "False" || t == T.pack "FALSE" = Just False - | otherwise = Nothing - yamlToMetaValue :: (PandocMonad m, HasLastStrPosition st) => ParserT Sources st m (Future st MetaValue) - -> YAML.Node YE.Pos + -> Value -> ParserT Sources st m (Future st MetaValue) -yamlToMetaValue pMetaValue (YAML.Scalar _ x) = - case x of - YAML.SStr t -> normalizeMetaValue pMetaValue t - YAML.SBool b -> return $ return $ MetaBool b - YAML.SFloat d -> return $ return $ MetaString $ tshow d - YAML.SInt i -> return $ return $ MetaString $ tshow i - YAML.SUnknown _ t -> - case checkBoolean t of - Just b -> return $ return $ MetaBool b - Nothing -> normalizeMetaValue pMetaValue t - YAML.SNull -> return $ return $ MetaString "" - -yamlToMetaValue pMetaValue (YAML.Sequence _ _ xs) = - fmap MetaList . sequence - <$> mapM (yamlToMetaValue pMetaValue) xs -yamlToMetaValue pMetaValue (YAML.Mapping _ _ o) = - fmap MetaMap <$> yamlMap pMetaValue o -yamlToMetaValue _ _ = return $ return $ MetaString "" +yamlToMetaValue pMetaValue v = + case v of + String t -> normalizeMetaValue pMetaValue t + Bool b -> return $ return $ MetaBool b + Number d -> normalizeMetaValue pMetaValue $ + case fromJSON v of + Success (x :: Int) -> tshow x + _ -> tshow d + Null -> return $ return $ MetaString "" + Array{} -> do + case fromJSON v of + Error err' -> throwError $ PandocParseError $ T.pack err' + Success xs -> fmap MetaList . sequence <$> + mapM (yamlToMetaValue pMetaValue) xs + Object o -> fmap MetaMap <$> yamlMap pMetaValue o yamlMap :: (PandocMonad m, HasLastStrPosition st) => ParserT Sources st m (Future st MetaValue) - -> M.Map (YAML.Node YE.Pos) (YAML.Node YE.Pos) + -> Object -> ParserT Sources st m (Future st (M.Map Text MetaValue)) yamlMap pMetaValue o = do - kvs <- forM (M.toList o) $ \(key, v) -> do - k <- maybe (throwError $ PandocParseError - "Non-string key in YAML mapping") - return $ nodeToKey key - return (k, v) - let kvs' = filter (not . ignorable . fst) kvs - fmap M.fromList . sequence <$> mapM toMeta kvs' + case fromJSON (Object o) of + Error err' -> throwError $ PandocParseError $ T.pack err' + Success (m' :: M.Map Text Value) -> do + let kvs = filter (not . ignorable . fst) $ M.toList m' + fmap M.fromList . sequence <$> mapM toMeta kvs where ignorable t = "_" `T.isSuffixOf` t toMeta (k, v) = do @@ -194,7 +142,7 @@ yamlMetaBlock parser = try $ do -- by including --- and ..., we allow yaml blocks with just comments: let rawYaml = T.unlines ("---" : (rawYamlLines ++ ["..."])) optional blanklines - yamlBsToMeta parser $ UTF8.fromTextLazy $ TL.fromStrict rawYaml + yamlBsToMeta parser $ UTF8.fromText rawYaml stopLine :: Monad m => ParserT Sources st m () stopLine = try $ (string "---" <|> string "...") >> blankline >> return () diff --git a/src/Text/Pandoc/Translations.hs b/src/Text/Pandoc/Translations.hs index 17d56f262..b0476a0ab 100644 --- a/src/Text/Pandoc/Translations.hs +++ b/src/Text/Pandoc/Translations.hs @@ -33,10 +33,11 @@ import Data.Aeson.Types (Value(..), FromJSON(..)) import qualified Data.Aeson.Types as Aeson import qualified Data.Map as M import qualified Data.Text as T -import qualified Data.YAML as YAML +import qualified Data.Yaml as Yaml import GHC.Generics (Generic) import Text.Pandoc.Shared (safeRead) import qualified Text.Pandoc.UTF8 as UTF8 +import Data.Yaml (prettyPrintParseException) data Term = Abstract @@ -73,14 +74,6 @@ instance FromJSON Term where show t parseJSON invalid = Aeson.typeMismatch "Term" invalid -instance YAML.FromYAML Term where - parseYAML (YAML.Scalar _ (YAML.SStr t)) = - case safeRead t of - Just t' -> pure t' - Nothing -> Prelude.fail $ "Invalid Term name " ++ - show t - parseYAML invalid = YAML.typeMismatch "Term" invalid - instance FromJSON Translations where parseJSON o@(Object{}) = do xs <- parseJSON o >>= mapM addItem . M.toList @@ -94,27 +87,12 @@ instance FromJSON Translations where inv -> Aeson.typeMismatch "String" inv parseJSON invalid = Aeson.typeMismatch "Translations" invalid -instance YAML.FromYAML Translations where - parseYAML = YAML.withMap "Translations" $ - \tr -> Translations .M.fromList <$> mapM addItem (M.toList tr) - where addItem (n@(YAML.Scalar _ (YAML.SStr k)), v) = - case safeRead k of - Nothing -> YAML.typeMismatch "Term" n - Just t -> - case v of - (YAML.Scalar _ (YAML.SStr s)) -> - return (t, T.strip s) - n' -> YAML.typeMismatch "String" n' - addItem (n, _) = YAML.typeMismatch "String" n - lookupTerm :: Term -> Translations -> Maybe T.Text lookupTerm t (Translations tm) = M.lookup t tm readTranslations :: T.Text -> Either T.Text Translations readTranslations s = - case YAML.decodeStrict $ UTF8.fromText s of - Left (pos,err') -> Left $ T.pack $ err' ++ - " (line " ++ show (YAML.posLine pos) ++ " column " ++ - show (YAML.posColumn pos) ++ ")" + case Yaml.decodeAllEither' $ UTF8.fromText s of + Left err' -> Left $ T.pack $ prettyPrintParseException err' Right (t:_) -> Right t Right [] -> Left "empty YAML document" diff --git a/test/command/4819.md b/test/command/4819.md index c7cd9a90a..5caa5c6de 100644 --- a/test/command/4819.md +++ b/test/command/4819.md @@ -49,9 +49,7 @@ foo: no ... ^D Pandoc - Meta - { unMeta = fromList [ ( "foo" , MetaInlines [ Str "no" ] ) ] - } + Meta { unMeta = fromList [ ( "foo" , MetaBool False ) ] } [] ``` diff --git a/test/command/6741.md b/test/command/6741.md index 5fcc0700e..7764055a1 100644 --- a/test/command/6741.md +++ b/test/command/6741.md @@ -35,13 +35,13 @@ references: type: article-journal - author: - family: Suzuki - given: Y + given: Y. - family: Minami - given: T + given: T. - family: Laeng - given: B + given: B. - family: Nakauchi - given: S + given: S. container-title: Acta Psychologica DOI: 10.1016/j.actpsy.2019.102882 id: suzuki2019 diff --git a/test/command/pandoc-citeproc-312.md b/test/command/pandoc-citeproc-312.md index dfc4fe25c..20a9b759b 100644 --- a/test/command/pandoc-citeproc-312.md +++ b/test/command/pandoc-citeproc-312.md @@ -6,7 +6,7 @@ nocite: '@*' references: - author: - literal: NN - id: Y + id: 'Y' issued: - year: 1950 title: 'Date: Year' diff --git a/test/command/pandoc-citeproc-327.md b/test/command/pandoc-citeproc-327.md index 5cfd59576..2ea3a247e 100644 --- a/test/command/pandoc-citeproc-327.md +++ b/test/command/pandoc-citeproc-327.md @@ -45,7 +45,7 @@ I referenced something here^\[1\]^ ::: {#ref-LiLiaoDongWanHaiYuDiQiDongWuCiJiShengChanLiYanJiuJiShengJingGuaYiXingPingJie2017 .csl-entry} [\[1\] ]{.csl-left-margin}[李轶平, 于旭光, 孙明, 等. [辽东湾海域底栖动物次级生产力研究及生境适宜性评价](http://kns.cnki.net/kns/detail/detail.aspx?QueryID=4&CurRec=4&recid=&FileName=CHAN201706006&DbName=CJFDLAST2018&DbCode=CJFQ&yx=Y&pr=&URLID=21.1110.S.20171129.1725.006)\[J\]. -水产科学, 2017(06): 728--734.]{.csl-right-inline} +水产科学, 2017(6): 728--734.]{.csl-right-inline} ::: ::: ``` diff --git a/test/command/yaml-metadata-blocks.md b/test/command/yaml-metadata-blocks.md index d483618d3..d92994b03 100644 --- a/test/command/yaml-metadata-blocks.md +++ b/test/command/yaml-metadata-blocks.md @@ -51,11 +51,11 @@ Pandoc , ( "float" , MetaInlines [ Str "2.5" ] ) , ( "int" , MetaInlines [ Str "8" ] ) , ( "more" , MetaBool False ) - , ( "nothing" , MetaInlines [ Str "null" ] ) + , ( "nothing" , MetaString "" ) , ( "scientific" , MetaInlines [ Str "3.7e-5" ] ) ]) ) - , ( "nothing" , MetaInlines [ Str "null" ] ) + , ( "nothing" , MetaString "" ) , ( "scientific" , MetaInlines [ Str "3.7e-5" ] ) ] } -- cgit v1.2.3