path: root/src/Text
Age | Commit message | Author | Files | Lines
2011-01-04 | Moved 'macro' and 'applyMacros'' from markdown reader to Parsing. | John MacFarlane | 2 | -26/+27
2011-01-01 | Fixed regression in markdown reader. | John MacFarlane | 1 | -3/+3
'(_hi_)' was being parsed with literal underscores (no emphasis). The fix: the 'str' parser now only parses alphanumerics and embedded underscores. All other symbols are handled by the 'symbol' parser. This has a slight effect on the AST, since you'll get [Str "hi",Str ":"] instead of [Str "hi:"]. But there should not be a visible effect in any of the writers. Thanks to gwern for pointing out the regression.
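Note: a minimal Haskell sketch of the split described above, using only base. `Token`, `strRun`, and `tokenize` are hypothetical names made up for this illustration and are not pandoc's actual 'str' and 'symbol' parsers; whitespace is treated as just another symbol here.

```haskell
import Data.Char (isAlphaNum)

-- Hypothetical token type for illustration; not pandoc's Inline.
data Token = StrTok String | SymTok Char
  deriving (Eq, Show)

-- Consume one 'str' run. It is only entered at an alphanumeric, and it
-- accepts an underscore only when another alphanumeric follows it
-- (an "embedded" underscore).
strRun :: String -> (String, String)
strRun (c:cs)
  | isAlphaNum c = let (run, rest) = strRun cs in (c : run, rest)
strRun ('_':c:cs)
  | isAlphaNum c = let (run, rest) = strRun (c:cs) in ('_' : run, rest)
strRun rest = ([], rest)

-- Everything that does not start a 'str' run becomes a one-character symbol.
tokenize :: String -> [Token]
tokenize [] = []
tokenize s@(c:cs)
  | isAlphaNum c = let (run, rest) = strRun s in StrTok run : tokenize rest
  | otherwise    = SymTok c : tokenize cs

main :: IO ()
main = do
  -- [SymTok '(',SymTok '_',StrTok "hi",SymTok '_',SymTok ')']
  print (tokenize "(_hi_)")
  -- [StrTok "foo_bar",SymTok ':']
  print (tokenize "foo_bar:")
```

Because the underscores around "hi" are left as separate symbols, a later emphasis parser gets a chance to see them, while the underscore inside "foo_bar" stays part of the word.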
2011-01-01 | Updated copyright notices. | John MacFarlane | 1 | -1/+1
2010-12-30 | LaTeX reader: Allow ignored comments after \end{document}. | John MacFarlane | 1 | -3/+1
2010-12-30 | HTML reader: Fixed some parsing bugs. | John MacFarlane | 1 | -22/+28
2010-12-30 | Added support for listings package code blocks and inline code. | Puneeth Chaganti | 1 | -2/+9
2010-12-30 | Textile reader: Slight speed improvement. | John MacFarlane | 1 | -5/+5
2010-12-30 | New HTML reader using tagsoup as a lexer. | John MacFarlane | 4 | -628/+424
* The new reader is faster and more accurate.
* API changes for Text.Pandoc.Readers.HTML:
  - removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag, anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType, htmlBlockElement, htmlComment
  - added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag
* tagsoup is a new dependency.
* Text.Pandoc.Parsing: Generalized type on readWith.
* Benchmark.hs: Added length calculation to force full evaluation.
* Updated HTML reader tests.
* Updated markdown and textile readers to use the functions from the HTML reader.
* Note: The markdown reader now correctly handles some cases it did not before. For example: <hr/> is reproduced without adding a space. <script> a = '<b>'; </script> is parsed correctly.
2010-12-26 | normalize: Don't reduce [Space] to []. | John MacFarlane | 1 | -4/+1
2010-12-26 | Improved 'normalize'. | John MacFarlane | 1 | -41/+44
Now normalizeInlines is split into consolidateInlines and removeEmptyInlines. We need to remove empties before consolidating.
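Note: a small sketch of why the order matters, using a simplified stand-in for the Inline type (only three constructors, which is an assumption). `removeEmpty` and `consolidate` are illustrative names written for this note, not the actual removeEmptyInlines and consolidateInlines.

```haskell
-- Simplified stand-in for pandoc's Inline type (assumption: three constructors).
data Inline = Str String | Space | Emph [Inline]
  deriving (Eq, Show)

-- Drop empty strings and emphasis that ends up with no content.
removeEmpty :: [Inline] -> [Inline]
removeEmpty = concatMap go
  where
    go (Str "")  = []
    go (Emph xs) = case removeEmpty xs of
                     []  -> []
                     xs' -> [Emph xs']
    go x         = [x]

-- Merge adjacent Str elements, recursing into Emph.
consolidate :: [Inline] -> [Inline]
consolidate (Str a : Str b : rest) = consolidate (Str (a ++ b) : rest)
consolidate (Emph xs : rest)       = Emph (consolidate xs) : consolidate rest
consolidate (x : rest)             = x : consolidate rest
consolidate []                     = []

main :: IO ()
main = do
  let input = [Str "a", Emph [], Str "b"]
  -- Removing empties first lets the two strings meet: [Str "ab"]
  print (consolidate (removeEmpty input))
  -- Consolidating first leaves them apart: [Str "a",Str "b"]
  print (removeEmpty (consolidate input))
```

In the second ordering the empty Emph still separates the two strings when consolidation runs, so they are never merged.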
2010-12-26 | Markdown writer: Fixed bug in Image. | John MacFarlane | 1 | -1/+1
URI was getting unescaped twice!
2010-12-25 | Improved normalize. | John MacFarlane | 1 | -0/+15
2010-12-24 | Use functions from Text.Pandoc.Generic instead of processWith(M). | John MacFarlane | 9 | -17/+28
2010-12-22 | HTML reader: Simplified parsing of <script> sections. | John MacFarlane | 1 | -24/+1
I had previously assumed that we needed to ignore </script> occurring in a string literal or JavaScript comment. It turns out, though, that browsers aren't that smart.
2010-12-22 | Made --smart work with HTML reader. | John MacFarlane | 1 | -4/+13
It did not work before, because - and quotes were gobbled up by the str parser.
2010-12-22 | RST reader: Added unicode quote characters to specialChars. | John MacFarlane | 1 | -1/+1
(So they can trigger Quoted environments.)
2010-12-22 | RST reader: recouped speed loss due to addition of --smart. | John MacFarlane | 1 | -4/+4
This was achieved by rearranging the parsers in inline. Benchmarks went from 500ms to 307ms -- not quite back to the 279ms we had in 1.6, before supporting smart punctuation and footnotes, but close.
2010-12-22 | ODT writer: Don't wrap text in opendocument. | John MacFarlane | 1 | -1/+1
2010-12-22 | Removed all dependencies on 'pretty' package. | John MacFarlane | 1 | -4/+0
2010-12-22 | Texinfo writer: Updated to use Pretty. | John MacFarlane | 1 | -56/+37
2010-12-22 | Shared: Removed unneeded prettyprinting functions: | John MacFarlane | 1 | -75/+0
wrapped, wrapIfNeeded, wrappedTeX, wrapTeXIfNeeded, hang'.
2010-12-22 | Shared: Removed BlockWrapper, wrappedBlocksToDoc. | John MacFarlane | 1 | -13/+1
These are no longer needed with the new Pretty module.
2010-12-22 | Pretty: Added quote, doubleQuote. | John MacFarlane | 1 | -0/+10
2010-12-22 | Man writer: updated to use Pretty. | John MacFarlane | 1 | -18/+22
2010-12-21 | OpenDocument writer: Updated to use Pretty. | John MacFarlane | 1 | -8/+12
2010-12-21 | XML: don't use breaking spaces in attribute lists. | John MacFarlane | 1 | -4/+5
2010-12-21 | Docbook writer: Updated to use Pretty. | John MacFarlane | 1 | -21/+20
2010-12-21 | Pretty: don't print a breaking space before a newline. | John MacFarlane | 1 | -0/+4
2010-12-21 | Shared: Made splitBy take a test instead of an element. | John MacFarlane | 4 | -9/+9
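Note: for illustration, a splitBy that takes a test rather than an element might look like the following. This is a sketch written for this note, not necessarily the code in Text.Pandoc.Shared; in particular, collapsing runs of separators is an assumption.

```haskell
import Data.Char (isSpace)

-- Split a list wherever the predicate holds, dropping the separators.
-- Consecutive separators are collapsed (an assumption of this sketch).
splitBy :: (a -> Bool) -> [a] -> [[a]]
splitBy _ [] = []
splitBy isSep xs =
  let (chunk, rest) = break isSep xs
  in  chunk : splitBy isSep (dropWhile isSep rest)

main :: IO ()
main = do
  -- The old element-based behaviour is recovered with an equality test:
  -- ["a","b","c"]
  print (splitBy (== ',') "a,b,,c")
  -- A predicate also covers whole classes of separators:
  -- ["one","two","three"]
  print (splitBy isSpace "one two\tthree")
```

Taking a predicate is the more general interface: callers that used to pass an element can pass `(== x)` instead.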
2010-12-21 | XML: Replaced escapeStringAsXML with a faster version. | John MacFarlane | 1 | -9/+1
Benchmarked with criterion, it's about 8x faster than the old version. This speeds up docbook, opendocument, and html writers.
2010-12-20 | Markdown writer: use \ for newline instead of two spaces at eol. | John MacFarlane | 1 | -1/+4
(Unless --strict.)
2010-12-20 | Markdown writer: Use delimited code block if there are attributes. | John MacFarlane | 1 | -2/+21
(Unless in strict mode.)
2010-12-20 | Plain writer: set stateStrictMarkdown automatically. | John MacFarlane | 1 | -3/+4
2010-12-20 | ConTeXt writer: Updated to use Text.Pandoc.Pretty. | John MacFarlane | 1 | -73/+71
2010-12-20 | Renamed 'enclosed' to 'inside'. | John MacFarlane | 1 | -7/+7
This avoids conflict with 'enclosed' in Text.Pandoc.Parsing.
2010-12-19 | Pretty: Fixed parens. | John MacFarlane | 1 | -1/+1
2010-12-19 | Pretty: Added enclosed, parens. | John MacFarlane | 1 | -2/+13
2010-12-19 | LaTeX writer: A bit of code polish. | John MacFarlane | 1 | -29/+28
2010-12-19 | LaTeX writer: Modified to use Pretty. | John MacFarlane | 1 | -34/+30
Improved footnote formatting, removed spurious blank lines.
2010-12-19 | Shared: Use stringify to simplify inlineListToIdentifier. | John MacFarlane | 1 | -28/+11
2010-12-19 | Pretty: Added braces and brackets. | John MacFarlane | 1 | -0/+9
2010-12-18 | LaTeX writer: Use \paragraph, \subparagraph for level 4,5 headers. | John MacFarlane | 1 | -9/+10
2010-12-17 | Added new prettyprinting module. | John MacFarlane | 7 | -451/+708
* Added Text.Pandoc.Pretty. This is better suited for pandoc than the 'pretty' package. One advantage is that we now get proper wrapping; Emph [Inline] is no longer treated as a big unwrappable unit. Previously we only got breaks for spaces at the "outer level." We can also more easily avoid doubled blank lines. Performance is significantly better as well.
* Removed Text.Pandoc.Blocks. Text.Pandoc.Pretty allows you to define blocks and concatenate them.
* Modified markdown, RST, org readers to use Text.Pandoc.Pretty instead of Text.PrettyPrint.HughesPJ.
* Text.Pandoc.Shared: Added writerColumns to WriterOptions.
* Markdown, RST, Org writers now break text at writerColumns.
* Added --columns command-line option, which sets stColumns and writerColumns.
* Table parsing: If the size of the header > stColumns, use the header size as 100% for purposes of calculating relative widths of columns.
2010-12-15 | HTML reader: allow : in tags. | John MacFarlane | 1 | -2/+6
Resolves Issue #274.
2010-12-15 | Use top-level header at end as bibliography title for natbib and biblatex output. | Nathan Gass | 1 | -4/+13
2010-12-15 | Remove punctuation at start of suffix for natbib and biblatex output. | Nathan Gass | 1 | -2/+6
This is necessary as the latex citation commands include their own punctuation, which resulted in doubled commas for markdown documents where citeproc output works correctly.
2010-12-15 | Support multiple bibliography files with natbib and biblatex output. | Nathan Gass | 2 | -3/+4
2010-12-14 | Added 'normalize' to Text.Pandoc.Shared. | John MacFarlane | 1 | -1/+53
2010-12-14 | Fixed preamble parsing in LaTeX reader. | John MacFarlane | 1 | -2/+8
2010-12-14 | Fixed regression in parsing _emph_ | John MacFarlane | 1 | -1/+1
There was a bug in parsing '_emph_, ...': when followed by a comma, underscore emphasis did not register. (Thanks to gwern for pointing this out.) This bug was introduced by the change in c66921f2acea456af527b93e2daa1d8594798642