From 6e45607f9948f45b2e94f54b4825b667ca0d5441 Mon Sep 17 00:00:00 2001
From: John MacFarlane
Date: Sat, 1 May 2021 13:17:45 -0700
Subject: Change reader types, allowing better tracking of source positions.

Previously, when multiple file arguments were provided, pandoc
simply concatenated them and passed the contents to the readers,
which took a Text argument.

As a result, the readers had no way of knowing which file
was the source of any particular bit of text. This meant that
we couldn't report accurate source positions on errors or
include accurate source positions as attributes in the AST.
More seriously, it meant that we couldn't resolve resource
paths relative to the files containing them
(see e.g. #5501, #6632, #6384, #3752).

Add Text.Pandoc.Sources (exported module), with a `Sources` type
and a `ToSources` class. A `Sources` wraps a list of `(SourcePos,
Text)` pairs. [API change] A parsec `Stream` instance is provided for
`Sources`. The module also exports versions of parsec's `satisfy` and
other Char parsers that track source positions accurately from a
`Sources` stream (or any instance of the new `UpdateSourcePos` class).

Text.Pandoc.Parsing now exports these modified Char parsers instead of
the ones parsec provides. Modified parsers to use a `Sources` as stream
[API change].

The readers that previously took a `Text` argument have been
modified to take any instance of `ToSources`. So, they may still
be used with a `Text`, but they can also be used with a `Sources`
object.

In Text.Pandoc.Error, modified the constructor PandocParsecError
to take a `Sources` rather than a `Text` as first argument,
so parse error locations can be accurately reported.

T.P.Error: showPos, do not print "-" as source name.
---
 src/Text/Pandoc/Citeproc/BibTeX.hs | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

(limited to 'src/Text/Pandoc/Citeproc')

diff --git a/src/Text/Pandoc/Citeproc/BibTeX.hs b/src/Text/Pandoc/Citeproc/BibTeX.hs
index 510e56f9c..f6833000c 100644
--- a/src/Text/Pandoc/Citeproc/BibTeX.hs
+++ b/src/Text/Pandoc/Citeproc/BibTeX.hs
@@ -59,10 +59,11 @@ data Variant = Bibtex | Biblatex
   deriving (Show, Eq, Ord)
 
 -- | Parse BibTeX or BibLaTeX into a list of 'Reference's.
-readBibtexString :: Variant           -- ^ bibtex or biblatex
+readBibtexString :: ToSources a
+                 => Variant           -- ^ bibtex or biblatex
                  -> Locale            -- ^ Locale
                  -> (Text -> Bool)    -- ^ Filter on citation ids
-                 -> Text              -- ^ bibtex/biblatex text
+                 -> a                 -- ^ bibtex/biblatex text
                  -> Either ParseError [Reference Inlines]
 readBibtexString variant locale idpred contents = do
   case runParser (((resolveCrossRefs variant <$> bibEntries) <* eof) >>=
@@ -70,7 +71,7 @@ readBibtexString variant locale idpred contents = do
                       filter (\item -> idpred (identifier item) &&
                                         entryType item /= "xdata"))
            (fromMaybe defaultLang $ localeLanguage locale, Map.empty)
-           "" contents of
+           "" (toSources contents) of
           Left err -> Left err
           Right xs -> return xs
 
@@ -339,7 +340,7 @@ defaultLang = Lang "en" Nothing (Just "US") [] [] []
 -- a map of bibtex "string" macros
 type StringMap = Map.Map Text Text
 
-type BibParser = Parser Text (Lang, StringMap)
+type BibParser = Parser Sources (Lang, StringMap)
 
 data Item = Item{ identifier :: Text
                 , sourcePos  :: SourcePos
@@ -804,7 +805,7 @@ bibEntries = do
        (bibComment <|> bibPreamble <|> bibString))
 
 bibSkip :: BibParser ()
-bibSkip = () <$ take1WhileP (/='@')
+bibSkip = skipMany1 (satisfy (/='@'))
 
 bibComment :: BibParser ()
 bibComment = do
@@ -829,6 +830,9 @@ bibString = do
   updateState (\(l,m) -> (l, Map.insert k v m))
   return ()
 
+take1WhileP :: Monad m => (Char -> Bool) -> ParserT Sources u m Text
+take1WhileP f = T.pack <$> many1 (satisfy f)
+
 inBraces :: BibParser Text
 inBraces = do
   char '{'
-- 
cgit v1.2.3
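
Note for readers of this patch: the commit message describes readers that accept "any instance of `ToSources`", which is why `readBibtexString` can still be called with a plain `Text`. The sketch below illustrates that pattern in isolation. It is not pandoc's code: the `Sources` type and `ToSources` class here are simplified stand-ins written for this note (the real ones live in Text.Pandoc.Sources and may differ in detail), and `describeInputs` is a hypothetical function standing in for a reader. It assumes only the `text` and `parsec` packages.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Text (Text)
import qualified Data.Text as T
import Text.Parsec.Pos (SourcePos, initialPos, sourceName)

-- Simplified stand-in (an assumption for illustration): a list of
-- (starting position, contents) pairs, one pair per input file.
newtype Sources = Sources { unSources :: [(SourcePos, Text)] }

-- Anything that can be turned into Sources.
class ToSources a where
  toSources :: a -> Sources

-- A bare Text has no file name, so it gets an empty source name.
instance ToSources Text where
  toSources t = Sources [(initialPos "", t)]

-- A Sources value is passed through unchanged.
instance ToSources Sources where
  toSources = id

-- A list of (file name, contents) pairs keeps one position per file.
instance ToSources [(FilePath, Text)] where
  toSources = Sources . map (\(fp, t) -> (initialPos fp, t))

-- Hypothetical "reader": it accepts any ToSources instance and can
-- still tell which file each chunk of text came from.
describeInputs :: ToSources a => a -> [String]
describeInputs =
  map (\(pos, t) -> show (sourceName pos) ++ ": "
                    ++ show (T.length t) ++ " characters")
    . unSources
    . toSources

main :: IO ()
main = do
  -- Old-style call with a single Text value.
  mapM_ putStrLn (describeInputs ("@book{a}" :: Text))
  -- New-style call that preserves per-file names.
  mapM_ putStrLn
    (describeInputs [ ("refs1.bib" :: FilePath, "@book{a}" :: Text)
                    , ("refs2.bib", "@article{b}") ])
```

The point of the class constraint is backward compatibility: existing callers that pass `Text` keep compiling, while callers that know their file names can pass them along so that positions refer to the right file.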