pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2012-02-04	Complete rewrite of LaTeX reader.	John MacFarlane	1	-4/+20
	* The new reader is more robust, accurate, and extensible. It is still quite incomplete, but it should be easier now to add features. * Text.Pandoc.Parsing: Added withRaw combinator. * Markdown reader: do escapedChar before raw latex inline. Otherwise we capture commands like \{. * Fixed latex citation tests for new citeproc. * Handle \include{} commands in latex. This is done in pandoc.hs, not the (pure) latex reader. But the reader exports the needed function, handleIncludes. * Moved err and warn from pandoc.hs to Shared. * Fixed tests - raw tex should sometimes have trailing space. * Updated lhs-test for highlighting-kate changes.
2012-01-27	Fixed table parsing with wide or combining characters.	John MacFarlane	1	-1/+1
	Closes #348. Closes #108.
2012-01-01	New treatment of dashes in --smart mode.	John MacFarlane	1	-5/+29
	* `---` is always em-dash, `--` is always en-dash. * pandoc no longer tries to guess when `-` should be en-dash. * A new option, `--old-dashes`, is provided for legacy documents. Rationale: The rules for en-dash are too complex and language-dependent for a guesser to work reliably. This change gives users greater control. The alternative of using unicode isn't very good, since unicode em- and en- dashes are barely distinguishable in a monospace font.
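The dash rule above is simple enough to sketch in a few lines. This is only an illustration of the rule as the commit message states it, not pandoc's parser; the name `smartDashes` is hypothetical and the sketch works on plain strings rather than on pandoc's inline elements.

```haskell
-- Illustrative sketch only; smartDashes is a hypothetical name, not pandoc API.
-- Longest match first, so "---" is never read as "--" followed by "-".
smartDashes :: String -> String
smartDashes ('-':'-':'-':rest) = '\8212' : smartDashes rest  -- "---" -> em-dash (U+2014)
smartDashes ('-':'-':rest)     = '\8211' : smartDashes rest  -- "--"  -> en-dash (U+2013)
smartDashes (c:rest)           = c : smartDashes rest        -- a lone "-" is never guessed at
smartDashes []                 = []
```

Under this rule a single hyphen always stays a hyphen, which is the behavior the commit describes for the new default.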
2011-12-29	Better smart quote parsing.	John MacFarlane	1	-1/+7
	* Added stateLastStrPos to ParserState. This lets us keep track of whether we're parsing the position immediately after a 'str'. If we encounter a ' in such a location, it must be an apostrophe, and can't be a single quote start. * Set this in the markdown, textile, html, and rst str parsers. * Closes #360.
2011-12-27	Replaced Apostrophe, Ellipses, EmDash, EnDash w/ unicode strings.	John MacFarlane	1	-6/+6

2011-12-27	Pretty: return Str with unicode instead of Apostrophe.	John MacFarlane	1	-1/+1

2011-12-05	Parsing: Removed charsInBalanced', added param to charsInBalanced.	John MacFarlane	1	-20/+13
	The extra parameter is a character parser. This is needed for proper handling of escapes, etc.
2011-12-05	Parsing: Changed type of escaped to return Char	John MacFarlane	1	-5/+2

2011-07-30	Added nonspaceChar to Text.Pandoc.Parsing.	John MacFarlane	1	-0/+5

2011-07-25	Smart quotes: handle '...hi' properly.	John MacFarlane	1	-1/+2
	Also added test case.
2011-07-23	Properly handle characters in the 128..159 range.	John MacFarlane	1	-7/+7
	These aren't valid in HTML, but many HTML files produced by Windows tools contain them. We substitute correct unicode characters.
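A minimal sketch of such a substitution follows. It assumes the intended replacements are the characters that the Windows-1252 encoding assigns to those byte values, which the commit message does not state explicitly; `fixC1` is a hypothetical name and only part of the table is shown.

```haskell
import Data.Char (ord)

-- Hypothetical helper, not pandoc's code. Assumes Windows-1252 semantics for
-- code points 128..159; unlisted characters pass through unchanged.
fixC1 :: Char -> Char
fixC1 c = case ord c of
  128 -> '\8364'  -- euro sign (U+20AC)
  133 -> '\8230'  -- horizontal ellipsis (U+2026)
  145 -> '\8216'  -- left single quotation mark (U+2018)
  146 -> '\8217'  -- right single quotation mark (U+2019)
  147 -> '\8220'  -- left double quotation mark (U+201C)
  148 -> '\8221'  -- right double quotation mark (U+201D)
  149 -> '\8226'  -- bullet (U+2022)
  150 -> '\8211'  -- en dash (U+2013)
  151 -> '\8212'  -- em dash (U+2014)
  153 -> '\8482'  -- trade mark sign (U+2122)
  _   -> c
```

Applying `map fixC1` to a decoded string would turn a stray character 147 into a proper left double quotation mark.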
2011-04-29	Revert "Parsing: Use new type aliases, PandocParser, GeneralParser."	John MacFarlane	1	-123/+118
	This reverts commit ec5410bc4e9d228b7dc0123061d80f9addf825bf.
2011-04-29	Parsing: Use new type aliases, PandocParser, GeneralParser.	John MacFarlane	1	-118/+123
	This should make it easier to change the types later.
2011-03-18	Changed uri parser so it doesn't include trailing punctuation.	John MacFarlane	1	-3/+19
	So, in RST, 'http://google.com.' should be parsed as a link to 'http://google.com' followed by a period. The parser is smart enough to recognize balanced parentheses, as often occur in wikipedia links: 'http://foo.bar/baz_(bam)'. Also added ()s to RST specialChars, so '(http://google.com)' will be parsed as a link in parens. Added test cases. Resolves Issue #291.
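The behavior described here can be approximated on plain strings as below. This is a sketch under assumptions, not the Parsec-based uri parser from the commit: `splitTrailing` is a hypothetical name, and the set of characters treated as trailing punctuation is chosen for illustration.

```haskell
-- Hypothetical helper: split a URL candidate into the link target and the
-- trailing punctuation that should stay outside the link.
splitTrailing :: String -> (String, String)
splitTrailing s = go s ""
  where
    go url trail
      | null url                          = (url, trail)
      | last url `elem` ".,;:!?"          = go (init url) (last url : trail)
      | last url == ')' && unbalanced url = go (init url) (')' : trail)
      | otherwise                         = (url, trail)
    -- More ')' than '(' means the final ')' has no partner inside the URL.
    unbalanced u = count ')' u > count '(' u
    count c = length . filter (== c)
```

With this sketch, `splitTrailing "http://google.com."` separates the final period from the link, while `splitTrailing "http://foo.bar/baz_(bam)"` keeps the balanced closing parenthesis as part of the URL.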
2011-01-26	Add support for attributes in inline Code.	John MacFarlane	1	-1/+1
	Additional related changes: * URLs in Code in autolinks now use class "url". * Require highlighting-kate 0.2.8.2, which omits the final <br/> tag, essential for inline code.
2011-01-26	Bumped version to 1.8; depend on pandoc-types 1.8.	John MacFarlane	1	-7/+6
	The old TeX, HtmlInline and RawHtml elements have been removed and replaced by generic RawInline and RawBlock elements. All modules updated to use the new raw elements.
2011-01-19	More small parser rewrites for small performance gains.	John MacFarlane	1	-9/+11

2011-01-19	Parsing: Rewrote spaceChar for significant speedup in readers.	John MacFarlane	1	-1/+1

2011-01-14	Parsing: Fixed bug in grid table parser.	John MacFarlane	1	-5/+5
	Spaces at end of line were not being stripped properly, resulting in unintended LineBreaks.
2011-01-05	Fixed macro parsing.	John MacFarlane	1	-8/+10

2011-01-04	Moved 'macro' and 'applyMacros'' from markdown reader to Parsing.	John MacFarlane	1	-2/+27

2010-12-30	New HTML reader using tagsoup as a lexer.	John MacFarlane	1	-3/+3
	* The new reader is faster and more accurate. * API changes for Text.Pandoc.Readers.HTML: - removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag, anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType, htmlBlockElement, htmlComment - added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag * tagsoup is a new dependency. * Text.Pandoc.Parsing: Generalized type on readWith. * Benchmark.hs: Added length calculation to force full evaluation. * Updated HTML reader tests. * Updated markdown and textile readers to use the functions from the HTML reader. * Note: The markdown reader now correctly handles some cases it did not before. For example: <hr/> is reproduced without adding a space. <script> a = '<b>'; </script> is parsed correctly.
2010-12-24	Use functions from Text.Pandoc.Generic instead of processWith(M).	John MacFarlane	1	-1/+2

2010-12-17	Added new prettyprinting module.	John MacFarlane	1	-2/+3
	* Added Text.Pandoc.Pretty. This is better suited for pandoc than the 'pretty' package. One advantage is that we now get proper wrapping; Emph [Inline] is no longer treated as a big unwrappable unit. Previously we only got breaks for spaces at the "outer level." We can also more easily avoid doubled blank lines. Performance is significantly better as well. * Removed Text.Pandoc.Blocks. Text.Pandoc.Pretty allows you to define blocks and concatenate them. * Modified markdown, RST, org writers to use Text.Pandoc.Pretty instead of Text.PrettyPrint.HughesPJ. * Text.Pandoc.Shared: Added writerColumns to WriterOptions. * Markdown, RST, Org writers now break text at writerColumns. * Added --columns command-line option, which sets stColumns and writerColumns. * Table parsing: If the size of the header > stColumns, use the header size as 100% for purposes of calculating relative widths of columns.
2010-12-10	Removed HTML sanitization.	John MacFarlane	1	-2/+0
	This is better done on the resulting HTML; use the xss-sanitize library for this. xss-sanitize is based on pandoc's sanitization, but improves it. - Removed stateSanitize from ParserState. - Removed --sanitize-html option.
2010-12-07	Smart punctuation: recognize entities.	John MacFarlane	1	-8/+22
	Now &ldquo;Hi&rdquo; gets parsed as a Quoted DoubleQuote inline.
2010-12-07	Smart punctuation: don't allow ellipses containing spaces.	John MacFarlane	1	-1/+1
	Previously we allowed '. . .', ' . . . ', etc. This caused too many complications, and took away authors' flexibility in combining ellipses with spaces and periods.
2010-12-07	Moved smartPunctuation from Markdown to Parsing.	John MacFarlane	1	-3/+92
	+ Parameterized smartPunctuation on an inline parser. + Handle smartPunctuation in Textile reader.
2010-12-05	Fix regression: markdown references should be case-insensitive.	John MacFarlane	1	-38/+17
	This broke when we added the Key type. We had assumed that the custom case-insensitive Ord instance would ensure case-insensitive matching, but that is not how Data.Map works. * Added a test case for case-insensitivity in markdown-reader-more * Removed old refsMatch from Text.Pandoc.Parsing module; * hid the 'Key' constructor; * dropped the custom Ord and Eq instances, deriving instead; * added fromKey and toKey to convert between Keys and Inline lists; * toKey ensures that keys are case-insensitive, since this is the only way the API provides to construct a Key. Resolves Issue #272.
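The design described in this entry (derived instances plus a normalizing constructor) can be sketched as follows. It is a simplified stand-in: the real Key wraps a list of Inlines, whereas this version wraps a String, and `lookupRef` is a hypothetical helper added only to show a lookup.

```haskell
import Data.Char (toLower)
import qualified Data.Map as M

-- Simplified stand-in for the Key type (String instead of [Inline]).
-- In a real module the Key constructor would be left out of the export
-- list, so that toKey is the only way to build a Key.
newtype Key = Key String deriving (Eq, Ord, Show)

-- Normalize case on construction, so the derived Eq and Ord instances
-- already compare keys case-insensitively.
toKey :: String -> Key
toKey = Key . map toLower

fromKey :: Key -> String
fromKey (Key s) = s

-- Hypothetical helper: look up a reference label in a map of link targets.
lookupRef :: String -> M.Map Key String -> Maybe String
lookupRef label = M.lookup (toKey label)
```

Because normalization happens once, at construction, the map never has to rely on a hand-written comparison: `lookupRef "FOO"` finds an entry stored under `toKey "Foo"`.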
2010-11-06	Removed CITEPROC CPP conditionals from library code.	John MacFarlane	1	-4/+0
	By Cabal policy, the API should not change depending on flags.
2010-10-26	Process LaTeX macros in markdown, and apply to TeX math.	John MacFarlane	1	-2/+7
	Example: \newcommand{\plus}[2]{#1 + #2} $\plus{3}{4}$ yields: 3+4
2010-07-13	Parse \chapter{} in latex.	John MacFarlane	1	-2/+4
	+ Added stateHasChapters to ParserState. + If a \chapter command is encountered, this is set to True and subsequent \section commands (etc.) will be bumped up one level.
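The level bump can be illustrated with a small state-threading sketch. All names here are hypothetical, the command list is limited to three sectioning commands, and giving \chapter itself level 1 is an assumption; the boolean accumulator plays the role that stateHasChapters plays in ParserState.

```haskell
import Data.List (mapAccumL)

-- Hypothetical sketch: compute header levels for a sequence of LaTeX
-- sectioning command names, threading a "has chapters" flag through.
headerLevels :: [String] -> [Maybe Int]
headerLevels = snd . mapAccumL step False
  where
    -- Seeing "chapter" sets the flag; level 1 for it is an assumption.
    step _ "chapter" = (True, Just 1)
    -- Other commands keep the flag and are bumped once it is set.
    step hasChapters cmd = (hasChapters, bump hasChapters <$> lookup cmd base)
    base = [("section", 1), ("subsection", 2), ("subsubsection", 3)]
    bump hasChapters n = if hasChapters then n + 1 else n
```

In this sketch a \section that appears before any \chapter keeps level 1, while one that appears afterwards gets level 2, matching the word "subsequent" in the commit message.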
2010-07-11	Merge branch 'atlists'. Added auto-numbered example lists.	John MacFarlane	1	-5/+27

2010-07-06	Allow language-neutral table captions.	John MacFarlane	1	-1/+4
	+ Captions may now begin simply with ':', instead of 'Table:' + Captions may now appear either above or below the table. + Resolves Issue #227.
2010-07-05	More refactoring of grid table code.	John MacFarlane	1	-8/+60

2010-07-05	Minor reformatting.	John MacFarlane	1	-2/+4

2010-07-05	Moved generic grid table functions from RST reader -> Parsing.	John MacFarlane	1	-3/+85
	Here they can be used by the Markdown reader as well.
2010-07-05	Moved parsing functions from Text.Pandoc.Shared to new module.	John MacFarlane	1	-0/+537
	+ Text.Pandoc.Parsing