author | John MacFarlane <jgm@berkeley.edu> | 2010-11-18 13:22:20 -0800 |
---|---|---|
committer | John MacFarlane <jgm@berkeley.edu> | 2010-11-18 13:22:20 -0800 |
commit | f3bb3c1ff1c85ea3bc9132b4c890905a9af20c3a (patch) | |
tree | 0d3d0167c4298a0896cdba1f7902d00908cfbddf | |
parent | aaf7de0ddaea292ba4e869a6f0fa5adaaf02b813 (diff) | |
Markdown citation parser improvements and test updates.
Now we handle a suffix after a bare locator, e.g.
@item1 [p. 30, suffix]
The suffix now includes any punctuation that introduces it.
A few tests fail because of problems with citeproc (extra space
before the suffix, missing space after comma separating multiple
page ranges in the locator).
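
The locator/suffix split described above can be sketched outside Parsec with plain string functions. This is only an illustration of the rule, not the code in the patch: `isStop`, `locatorWord` (as a pure function) and `splitLocator` are hypothetical names, backslash escapes are ignored, and the suffix is returned as a raw string rather than as parsed inlines.

```haskell
import Data.Char (isDigit, isSpace)

-- Characters that end a locator word. This mirrors the "];, \t\n" set
-- used by the patched parser, but ignores backslash escapes.
isStop :: Char -> Bool
isStop c = isSpace c || c `elem` "];,"

-- Read one follow-on locator word: skip leading whitespace, take a run
-- of non-stop characters, and accept it only if it contains a digit.
-- Returns the word and the unconsumed input.
locatorWord :: String -> Maybe (String, String)
locatorWord s
  | not (null wd) && any isDigit wd = Just (wd, rest)
  | otherwise                       = Nothing
  where
    (wd, rest) = break isStop (dropWhile isSpace s)

-- Split the text that follows a citation key into (locator, raw suffix).
-- The first word is accepted unconditionally; a comma stays in the
-- locator only when a digit-bearing word follows it.
splitLocator :: String -> (String, String)
splitLocator s
  | null w    = ("", s)
  | otherwise = go [w] rest
  where
    (w, rest) = break isStop (dropWhile isSpace s)
    go acc r = case r of
      (',' : r') | Just _ <- locatorWord r' -> go ("," : acc) r'
      _ | Just (wd, r') <- locatorWord r    -> go (wd : acc) r'
        | otherwise                         -> (unwords (reverse acc), r)
```

Under these assumptions, `splitLocator "p. 30, suffix"` gives `("p. 30", ", suffix")`, so the suffix keeps the comma that introduces it. `splitLocator "pp. 33, 35-37, and nowhere else"` gives `("pp. 33 , 35-37", ", and nowhere else")`, because the pieces are joined with `unwords` as in the patch, and `splitLocator ", and nowhere else"` gives an empty locator with the whole input as suffix.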
-rw-r--r-- | src/Text/Pandoc/Readers/Markdown.hs | 28 |
-rw-r--r-- | tests/markdown-citations.plain | 8 |
-rw-r--r-- | tests/markdown-citations.txt | 4 |
3 files changed, 24 insertions, 16 deletions
diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs
index b0aab9c70..851cf25e7 100644
--- a/src/Text/Pandoc/Readers/Markdown.hs
+++ b/src/Text/Pandoc/Readers/Markdown.hs
@@ -1348,13 +1348,12 @@ bareloc :: Citation -> GenParser Char ParserState [Citation]
 bareloc c = try $ do
   spnl
   char '['
-  spnl
   loc <- locator
-  spnl
+  suff <- suffix
   rest <- option [] $ try $ char ';' >> citeList
   spnl
   char ']'
-  return $ c{ citationLocator = loc } : rest
+  return $ c{ citationLocator = loc, citationSuffix = suff } : rest
 
 normalCite :: GenParser Char ParserState [Citation]
 normalCite = try $ do
@@ -1376,28 +1375,31 @@ citeKey = try $ do
 locator :: GenParser Char st String
 locator = try $ do
   spnl
-  w <- many1 (noneOf " \t\n;]")
-  spnl
-  ws <- many locatorWord
+  w <- many1 (noneOf " \t\n;,]")
+  ws <- many (locatorWord <|> locatorComma)
   return $ unwords $ w:ws
 
 locatorWord :: GenParser Char st String
 locatorWord = try $ do
-  wd <- many1 $ (try $ char '\\' >> oneOf "]; \t\n") <|> noneOf "]; \t\n"
   spnl
-  if any isDigit wd
-     then return wd
-     else pzero
+  wd <- many1 $ (try $ char '\\' >> oneOf "];, \t\n") <|> noneOf "];, \t\n"
+  guard $ any isDigit wd
+  return wd
+
+locatorComma :: GenParser Char st String
+locatorComma = try $ do
+  char ','
+  lookAhead $ locatorWord
+  return ","
 
 suffix :: GenParser Char ParserState [Inline]
 suffix = try $ do
-  char ','
   spnl
   liftM normalizeSpaces $ many $ notFollowedBy (oneOf ";]") >> inline
 
 prefix :: GenParser Char ParserState [Inline]
 prefix = liftM normalizeSpaces $
-  manyTill inline (lookAhead citeKey)
+  manyTill inline (char ']' <|> liftM (const ']') (lookAhead citeKey))
 
 citeList :: GenParser Char ParserState [Citation]
 citeList = sepBy1 citation (try $ char ';' >> spnl)
@@ -1407,7 +1409,7 @@ citation = try $ do
   pref <- prefix
   (suppress_author, key) <- citeKey
   loc <- option "" $ try $ blankSpace >> locator
-  suff <- option [] suffix
+  suff <- suffix
   return $ Citation{ citationId = key
                    , citationPrefix = pref
                    , citationSuffix = suff
diff --git a/tests/markdown-citations.plain b/tests/markdown-citations.plain
index b809842be..dd5d23efc 100644
--- a/tests/markdown-citations.plain
+++ b/tests/markdown-citations.plain
@@ -5,12 +5,14 @@ Pandoc with citeproc-hs
 
 @nonexistent
 
-Doe (2005) says blah. Doe (2005, 30) says blah. Doe
-(2005; 2006, 30; see also Doe and Roe 2007) says blah.
+Doe (2005) says blah. Doe (2005, 30) says blah. Doe (2005, 30, suffix)
+says blah. Doe (2005; 2006, 30; see also Doe and Roe 2007) says blah.
 
 In a note.[^1] A citation group
 (see Doe 2005, 34-35; also Doe and Roe 2007, chap. 3). Another one
-(see Doe 2005, 34-35). And another one in a note.[^2]
+(see Doe 2005, 34-35). And another one in a note.[^2] Citation with
+a suffix and locator (Doe 2005, 33, 35-37, and nowhere else).
+Citation with suffix only (Doe 2005, and nowhere else).
 
 Now some modifiers.[^3]
 
diff --git a/tests/markdown-citations.txt b/tests/markdown-citations.txt
index 9840832ce..c54a41304 100644
--- a/tests/markdown-citations.txt
+++ b/tests/markdown-citations.txt
@@ -6,11 +6,15 @@
 
 @item1 says blah.
 @item1 [p. 30] says blah.
+@item1 [p. 30, with suffix] says blah.
 @item1 [-@item2 p. 30; see also @item3] says blah.
 
 In a note.[^1]
 A citation group [see @item1 p. 34-35; also @item3 chap. 3].
 Another one [see @item1 p. 34-35]. And another one in a note.[^2]
+Citation with a suffix and locator [@item1 pp. 33, 35-37,
+and nowhere else]. Citation with suffix only
+[@item1, and nowhere else].
 
 Now some modifiers.[^3]
 