path: root/src/Text/Pandoc/Readers/Markdown.hs
Age | Commit message | Author | Files | Lines
2011-11-09 | Markdown citations: don't strip off initial space in locator. | John MacFarlane | 1 | -1/+5
Previously `[@item1 and nowhere else]` yielded the locator ", and nowhere else", or, with the new citeproc-hs, "and nowhere else". Now it yields " and nowhere else".
2011-11-06 | Markdown reader: allow punctuation only internally in cite keys. | John MacFarlane | 1 | -1/+2
The characters '.',':',';','$','<','>','~','#','-','_' can be used only between two letters or digits in a citation key. This means that '@item1.' will be parsed as a citation, 'item1', followed by a period, instead of a citation 'item1.', as was the case previously. Thanks to David Sanson for alerting us to the problem.
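The rule above can be sketched as a small standalone function. This is a minimal illustration and not the parser pandoc uses: `splitCiteKey` and `internalPunct` are hypothetical names, and the requirement that a key start with a letter or digit is an assumption made for the sketch.

```haskell
import Data.Char (isAlphaNum)

-- Punctuation that may appear inside a key, taken from the commit message.
internalPunct :: String
internalPunct = ".:;$<>~#-_"

-- | Split the text that follows '@' into (key, remaining input).
-- Hypothetical helper: a punctuation character is consumed only when
-- it is immediately followed by a letter or digit.
splitCiteKey :: String -> Maybe (String, String)
splitCiteKey (c:cs) | isAlphaNum c = Just (go [c] cs)
  where
    go acc (p:x:xs) | p `elem` internalPunct && isAlphaNum x = go (x:p:acc) xs
    go acc (x:xs)   | isAlphaNum x = go (x:acc) xs
    go acc rest     = (reverse acc, rest)
splitCiteKey _ = Nothing
```

Under these assumptions, `splitCiteKey "item1."` gives `Just ("item1", ".")`, so the trailing period is left for the surrounding text, while `splitCiteKey "doe:2002; more"` keeps the internal colon and gives `Just ("doe:2002", "; more")`.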
2011-07-30 | Added PRAGMA needed for ghc 6.12. | John MacFarlane | 1 | -0/+1
2011-07-30 | Removed applicative stuff in Markdown reader. | John MacFarlane | 1 | -16/+16
It requires parsec 3, and currently pandoc can build with parsec 2.
2011-07-30 | Markdown reader: Improved emph/strong parsing. | John MacFarlane | 1 | -13/+34
Ported code from pandoc2. Now all tests pass.
2011-05-22 | Forbid ()s in citation item keys. | John MacFarlane | 1 | -1/+1
Resolves Issue #304: problems with (@item1; @item2) because the final paren was being parsed as part of the item key.
2011-04-20 | Disallow notes within notes in reST and markdown. | John MacFarlane | 1 | -1/+8
These previously caused infinite looping and stack overflows. For example:

    [^1]

    [^1]: See [^1]

Note references are allowed in reST notes, so this isn't a full implementation of reST. That can come later. For now we need to prevent the stack overflows. Partially resolves Issue #297.
2011-03-02 | Markdown+lhs reader: Require space after inverse bird tracks. | John MacFarlane | 1 | -1/+3
The point of the change is to allow HTML tags to be used freely at the left margin of a markdown+lhs document. Thanks to Conal Elliott for the suggestion.
2011-02-01 | Markdown reader: Simplified and corrected footnote block parser. | John MacFarlane | 1 | -7/+10
2011-01-31 | Improved fix to markdown noteBlock parser. | John MacFarlane | 1 | -1/+1
The last patch did not handle cases with > 4 spaces. Also added a more general test case.
2011-01-31 | Markdown reader: Fixed whitespace footnote bug (Jesse Rosenthal). | John MacFarlane | 1 | -1/+2
The problem was in input like this: [^1]: note not in note. Also added a test case for this.
2011-01-29 | Markdown reader tables: Fixed bug in alignments. | John MacFarlane | 1 | -4/+5
Previously pandoc got confused by blank rows in the header.
2011-01-26 | Add support for attributes in inline Code. | John MacFarlane | 1 | -2/+3
Additional related changes:
* URLs in Code in autolinks now use class "url".
* Require highlighting-kate 0.2.8.2, which omits the final <br/> tag, essential for inline code.
2011-01-26 | Markdown reader: Don't parse latex/context environments as inline. | John MacFarlane | 1 | -9/+15
2011-01-26 | Distinguish latex & context environments; blank line after in writers. | John MacFarlane | 1 | -3/+4
2011-01-26 | Bumped version to 1.8; depend on pandoc-types 1.8. | John MacFarlane | 1 | -10/+10
The old TeX, HtmlInline and RawHtml elements have been removed and replaced by generic RawInline and RawBlock elements. All modules updated to use the new raw elements.
2011-01-22 | Markdown reader: slight speedup by moving whitespace parser. | John MacFarlane | 1 | -2/+2
2011-01-19 | Replaced more noneOf/oneOf parsers. | John MacFarlane | 1 | -5/+11
2011-01-19 | Replaced uses of oneOf with more efficient parsers. | John MacFarlane | 1 | -12/+19
This speeds up the markdown reader.
2011-01-04 | Markdown reader: Removed unneeded definitions. | John MacFarlane | 1 | -10/+8
specialChars, strChar, specialCharsMinusLt.
2011-01-04 | Moved 'macro' and 'applyMacros'' from markdown reader to Parsing. | John MacFarlane | 1 | -24/+0
2011-01-01 | Fixed regression in markdown reader. | John MacFarlane | 1 | -3/+3
'(_hi_)' was being parsed with literal underscores (no emphasis). The fix: the 'str' parser now only parses alphanumerics and embedded underscores. All other symbols are handled by the 'symbol' parser. This has a slight effect on the AST, since you'll get [Str "hi",Str ":"] instead of [Str "hi:"]. But there should not be a visible effect in any of the writers. Thanks to gwern for pointing out the regression.
2010-12-30 | New HTML reader using tagsoup as a lexer. | John MacFarlane | 1 | -28/+27
* The new reader is faster and more accurate.
* API changes for Text.Pandoc.Readers.HTML:
  - removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag, anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType, htmlBlockElement, htmlComment
  - added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag
* tagsoup is a new dependency.
* Text.Pandoc.Parsing: Generalized type on readWith.
* Benchmark.hs: Added length calculation to force full evaluation.
* Updated HTML reader tests.
* Updated markdown and textile readers to use the functions from the HTML reader.
* Note: The markdown reader now correctly handles some cases it did not before. For example: <hr/> is reproduced without adding a space. <script> a = '<b>'; </script> is parsed correctly.
2010-12-24 | Use functions from Text.Pandoc.Generic instead of processWith(M). | John MacFarlane | 1 | -1/+2
2010-12-14 | Fixed regression in parsing _emph_ | John MacFarlane | 1 | -1/+1
There was a bug in parsing '_emph_, ...': when followed by a comma, underscore emphasis did not register. (Thanks to gwern for pointing this out.) This bug was introduced by the change in c66921f2acea456af527b93e2daa1d8594798642
2010-12-13 | Moved special handling of punctuation in suffix out of markdown reader. | Nathan Gass | 1 | -7/+2
This allows different writers to handle punctuation in the suffix differently.
2010-12-13 | Markdown reader: Further fix to abbrevs. | John MacFarlane | 1 | -1/+1
2010-12-13 | Markdown reader: Fixed abbrev handler to allow abbrev at end of line. | John MacFarlane | 1 | -2/+2
E.g., Mr. Frank.
2010-12-13 | Markdown reader: Fixed referenceKey parser to allow space after newline. | John MacFarlane | 1 | -2/+1
2010-12-13 | Markdown reader: Fixed regression in reference key parser. | John MacFarlane | 1 | -0/+1
* The recent change allowing spaces and newlines in the URL caused problems when reference keys are stacked up without blank lines between. This is now fixed.
* Added test.
2010-12-12 | Markdown reader: fix superscripts with links. | John MacFarlane | 1 | -1/+1
Moved inlineNote parser after superscript parser, so ^[link](/foo)^ gets recognized as a superscripted link, not an inline note followed by garbage. Thanks to Conal Elliott for pointing out the problem.
2010-12-10 | Markdown reader: small cosmetic code improvements. | John MacFarlane | 1 | -8/+6
2010-12-10 | Removed HTML sanitization. | John MacFarlane | 1 | -11/+5
This is better done on the resulting HTML; use the xss-sanitize library for this. xss-sanitize is based on pandoc's sanitization, but improves it.
- Removed stateSanitize from ParserState.
- Removed --sanitize-html option.
2010-12-10 | Markdown reader: Allow linebreaks in URLs (treat as spaces). | John MacFarlane | 1 | -6/+21
Also, a string of consecutive spaces or tabs is now parsed as a single space. If you have multiple spaces in your URL, use %20%20.
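The whitespace handling described above can be illustrated with a short sketch. It is not the reader's actual code: `collapseUrlSpace` and `isUrlSpace` are hypothetical names, and treating a line break followed by indentation as a single run is an assumption.

```haskell
import Data.Function (on)
import Data.List (groupBy)

-- Characters treated as URL whitespace in this sketch (assumption:
-- space, tab, and newline only).
isUrlSpace :: Char -> Bool
isUrlSpace c = c `elem` " \t\n"

-- | Replace every run of spaces, tabs, and line breaks with one space.
-- Hypothetical helper, written only to illustrate the rule.
collapseUrlSpace :: String -> String
collapseUrlSpace = concatMap squash . groupBy ((==) `on` isUrlSpace)
  where
    squash run@(c:_) | isUrlSpace c = " "
                     | otherwise    = run
    squash []        = []
```

With this sketch, `collapseUrlSpace "/my\n  url \t here"` yields `"/my url here"`, while percent-encoded spaces such as `%20%20` pass through unchanged, which is why the commit message recommends them for URLs that really contain multiple spaces.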
2010-12-10 | Markdown reader: Rewrote para parser for better efficiency. | John MacFarlane | 1 | -10/+8
This change avoids repeated parsing of inline lists for 'plain' blocks.
2010-12-08 | Markdown reader: minor footnote changes. | John MacFarlane | 1 | -2/+3
Don't skipNonindentSpaces in noteMarker, since it's also used in the inline note parser.
2010-12-07 | Smart punctuation: recognize entities. | John MacFarlane | 1 | -1/+1
Now &ldquo;Hi&rdquo; gets parsed as a Quoted DoubleQuote inline.
2010-12-07 | Markdown reader: Moved smartPunctuation parser, for slight speed bump. | John MacFarlane | 1 | -1/+1
2010-12-07 | Moved smartPunctuation from Markdown to Parsing. | John MacFarlane | 1 | -89/+2
+ Parameterized smartPunctuation on an inline parser.
+ Handle smartPunctuation in Textile reader.
2010-12-06 | Markdown reader: better handling of intraword _. | John MacFarlane | 1 | -3/+5
The 'str' parser now reads internal _'s as part of the string. This prevents pandoc from getting started looking for an emphasized block, which can cause exponential slowdowns in some cases. Resolves Issue #182.
2010-12-06 | Markdown reader: handle curly quotes better. | John MacFarlane | 1 | -15/+14
Previously, curly quotes were just parsed literally, leading to problems in some output formats. Now they are parsed as Quoted inlines, if --smart is specified. Resolves Issue #270.
2010-12-05 | Fix regression: markdown references should be case-insensitive. | John MacFarlane | 1 | -2/+2
This broke when we added the Key type. We had assumed that the custom case-insensitive Ord instance would ensure case-insensitive matching, but that is not how Data.Map works.
* Added a test case for case-insensitivity in markdown-reader-more;
* removed old refsMatch from Text.Pandoc.Parsing module;
* hid the 'Key' constructor;
* dropped the custom Ord and Eq instances, deriving instead;
* added fromKey and toKey to convert between Keys and Inline lists;
* toKey ensures that keys are case-insensitive, since this is the only way the API provides to construct a Key.
Resolves Issue #272.
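The design described in this commit (derived instances plus a normalizing constructor function) can be sketched as follows. This is a simplified illustration, not pandoc's code: the real Key wraps Inline lists, whereas this sketch uses a plain String, and `KeyTable`, `lookupRef`, and `exampleTable` are hypothetical names.

```haskell
import Data.Char (toLower)
import qualified Data.Map as M

-- In a real module the Key constructor would be left out of the
-- export list, so toKey is the only way to build a Key.
newtype Key = Key String deriving (Eq, Ord, Show)

-- | Normalize case at construction time, so the derived Eq and Ord
-- instances compare already-normalized values.
toKey :: String -> Key
toKey = Key . map toLower

fromKey :: Key -> String
fromKey (Key s) = s

-- Hypothetical reference table mapping keys to URLs.
type KeyTable = M.Map Key String

lookupRef :: String -> KeyTable -> Maybe String
lookupRef label = M.lookup (toKey label)

exampleTable :: KeyTable
exampleTable = M.fromList [(toKey "Foo", "/url")]
```

Here `lookupRef "FOO" exampleTable` finds the entry stored under "Foo", because both labels pass through `toKey` before they reach Data.Map.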
2010-12-03 | Merge branch 'citeproc' into master. | John MacFarlane | 1 | -37/+92
Conflicts: src/Text/Pandoc/Definition.hs
2010-12-03 | punctuation handling, and more html-specific handling | paul.rivier | 1 | -1/+2
2010-11-28 | Merge branch 'master' into citeproc | John MacFarlane | 1 | -0/+3
2010-11-28 | Markdown parser performance improvement. | John MacFarlane | 1 | -0/+3
Do a quick lookahead to make sure what follows looks like a setext header before parsing any Inlines. This gives a 15% performance boost in one benchmark. Many thanks to knieriem for finding the problem (in peg-markdown): https://github.com/jgm/peg-markdown/issues/issue/3
2010-11-26 | Markdown suffix parser fix. | John MacFarlane | 1 | -2/+7
If suffix doesn't begin with punctuation, include opening comma and space in result. Previously, @item [only a suffix] would result in something like Doe (2002only a suffix) because there was no opening delimiter.
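As a rough illustration of the rule above, the following sketch prepends a comma and space when a suffix does not start with punctuation. `normalizeSuffix` is a hypothetical name, the use of `isPunctuation` as the test is an assumption, and leaving an empty suffix untouched is also an assumption.

```haskell
import Data.Char (isPunctuation)

-- | Give a citation suffix an opening delimiter if it lacks one.
-- Hypothetical helper, not the parser's actual code.
normalizeSuffix :: String -> String
normalizeSuffix s@(c:_) | isPunctuation c = s
normalizeSuffix ""      = ""
normalizeSuffix s       = ", " ++ s
```

Under these assumptions, `normalizeSuffix "only a suffix"` gives `", only a suffix"`, which avoids output like "Doe (2002only a suffix)", and a suffix such as `", p. 3"` is returned unchanged.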
2010-11-26 | Split locator and suffix in Biblio rather than Markdown parser. | John MacFarlane | 1 | -36/+2
Patch from Nathan Gass.
2010-11-22 | Check biblio for all citations, not just textual. | John MacFarlane | 1 | -5/+5
2010-11-18 | Markdown citation parser: small refactoring for clarity. | John MacFarlane | 1 | -1/+5