Age | Commit message | Author | Files | Lines |
|
git-svn-id: https://pandoc.googlecode.com/svn/trunk@550 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
in cleaner, faster code, and it makes it easier to use Pandoc in
other projects, like wikis, that use Text.XHtml. Two functions
are now provided, writeHtml and writeHtmlString: the former outputs
an Html structure, the latter a rendered string. The S5 writer is
also changed, in parallel ways (writeS5, writeS5String). The Html
header is now written programmatically, so it has been removed from
the 'headers' directory. The S5 header is still needed, but the
doctype and some of the meta declarations have been removed, since
they are written programmatically. The INSTALL file and cabalize
have been updated to reflect the new dependency on the xhtml package.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@549 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
(160) as "&nbsp;", since otherwise it is hard to distinguish
from a regular space. (Addresses Issue #3.)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@541 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
git-svn-id: https://pandoc.googlecode.com/svn/trunk@520 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
It is no longer needed now that all entities are processed in the markdown
and HTML readers. All calls to stringToSGML have been replaced by calls
to encodeEntities.
+ Since inTag's attribute handling already encodes entities,
calls to encodeEntities are no longer needed for attribute values, so
they've been removed.
+ The HTML and Markdown readers now call decodeEntities on all raw
strings (e.g. authors, dates, link titles), to ensure that no unprocessed
entities are included in the native representation of the document.
(In the HTML reader, most of this work is done by a change in
extractAttributeName.)
+ The result is a small speed improvement (around 5% on my benchmark)
and cleaner code.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@519 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
Str inline in Docbook and HTML writers, since now these
strings should not contain literal entity references.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@518 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
Now these are stored as a '"' character, not as '&quot;'.
The function escapeLinkTitle in the Markdown writer is
unnecessary and was removed. Tests modified accordingly.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@517 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
above 128 in HTML and Docbook output, we now just use unicode. After all,
we're declaring UTF-8 content in the header. This makes the HTML and
docbook files produced by pandoc much more readable and editable.
Changes to Entities.hs:
+ Removed specialCharToEntity
+ Added escapeSGMLChar (which just escapes the basic four, <>&")
+ Modified encodeEntities and stringToSGML to use escapeSGMLChar
+ Removed encodeEntitiesNumerical
+ Rewrote encodeEntities for better performance
+ Rewrote stringToSGML for better performance
git-svn-id: https://pandoc.googlecode.com/svn/trunk@516 788f1e2b-df1e-0410-8736-df70ead52e1b
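
The escaping scheme described in the commit above can be sketched roughly as follows. This is a minimal illustration, not pandoc's actual source: the names escapeSGMLChar and encodeEntities come from the commit message, but the type signatures and bodies here are assumptions.

```haskell
-- Hypothetical sketch; signatures and bodies are assumed, not taken from pandoc.

-- | Escape a single character if it is one of the basic four (<, >, &, ").
escapeSGMLChar :: Char -> String
escapeSGMLChar '<' = "&lt;"
escapeSGMLChar '>' = "&gt;"
escapeSGMLChar '&' = "&amp;"
escapeSGMLChar '"' = "&quot;"
escapeSGMLChar c   = [c]

-- | Escape a whole string. Characters above 128 pass through unchanged,
-- on the assumption that the output is declared as UTF-8.
encodeEntities :: String -> String
encodeEntities = concatMap escapeSGMLChar
```

With these definitions, encodeEntities "a < b" yields "a &lt; b", while a string such as "caf\233" is returned as-is.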
|
|
+ Entities are parsed (and unicode characters returned) in both
Markdown and HTML readers.
+ Parsers characterEntity, namedEntity, decimalEntity, hexEntity added
to Entities.hs; these parse a string and return a unicode character.
+ Changed 'entity' parser in HTML reader to use the 'characterEntity'
parser from Entities.hs.
+ Added new 'entity' parser to Markdown reader, and added '&' as a
special character. Adjusted test suite accordingly since now we
get 'Str "AT",Str "&",Str "T"' instead of 'Str "AT&T"'.
+ stringToSGML moved to Entities.hs. escapeSGML removed as redundant,
given encodeEntities.
+ stringToSGML, encodeEntities, and specialCharToEntity are given a
boolean parameter that causes only numerical entities to be used.
This is used in the docbook writer. The HTML writer uses named
entities where possible, but not all docbook-consumers know about
the named entities without special instructions, so it seems safer
to use numerical entities there.
+ decodeEntities is rewritten in a way that avoids Text.Regex, using
the new parsers.
+ charToEntity and charToNumericalEntity added to Entities.hs.
+ Moved specialCharToEntity from Shared.hs to Entities.hs.
+ Removed unneeded 'decodeEntities' from 'str' parser in HTML and
Markdown readers.
+ Removed sgmlHexEntity, sgmlDecimalEntity, sgmlNamedEntity, and
sgmlCharacterEntity from Shared.hs.
+ Modified Docbook writer so that it doesn't rely on Text.Regex for
detecting "mailto" links.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@515 788f1e2b-df1e-0410-8736-df70ead52e1b
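
To illustrate regex-free entity decoding of the kind the commit above describes, here is a small sketch that uses only the base library. It is not the parser-based code from Entities.hs: the helpers entityToChar and safeChr and the tiny namedEntities table are hypothetical, and only the name decodeEntities is taken from the commit message.

```haskell
import Data.Char (chr, isDigit, isHexDigit)
import Numeric (readHex)

-- | Tiny stand-in for a full named-entity table (an assumption).
namedEntities :: [(String, Char)]
namedEntities =
  [("amp", '&'), ("lt", '<'), ("gt", '>'), ("quot", '"'), ("copy", '\169')]

-- | Convert a code point to a character if it is in the valid range.
safeChr :: Integer -> Maybe Char
safeChr n
  | n >= 0 && n <= 0x10FFFF = Just (chr (fromInteger n))
  | otherwise               = Nothing

-- | Interpret the text between '&' and ';' as a hex, decimal, or named entity.
entityToChar :: String -> Maybe Char
entityToChar ('#' : x : hex)
  | x `elem` "xX", not (null hex), all isHexDigit hex =
      safeChr (fst (head (readHex hex)))
entityToChar ('#' : dec)
  | not (null dec), all isDigit dec = safeChr (read dec)
entityToChar name = lookup name namedEntities

-- | Replace every recognized entity with its character; leave the rest alone.
decodeEntities :: String -> String
decodeEntities [] = []
decodeEntities ('&' : rest) =
  case break (== ';') rest of
    (body, ';' : after)
      | Just c <- entityToChar body -> c : decodeEntities after
    _ -> '&' : decodeEntities rest
decodeEntities (c : cs) = c : decodeEntities cs
```

In this sketch a bare ampersand, as in "AT&T", is left untouched because no terminating ';' with a recognized entity body follows it.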
|
|
So, instead of [site.com](site.com) we get <site.com>.
Changed test suite accordingly.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@508 788f1e2b-df1e-0410-8736-df70ead52e1b
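
The autolink behavior in the commit above might be sketched like this, assuming the rule is "use angle brackets when the link text equals the target". The helper renderLink is hypothetical and works on plain strings rather than pandoc's document types.

```haskell
-- | Render a Markdown link (hypothetical helper, not pandoc's writer code).
-- When the label is identical to the URL, emit the shorter autolink form.
renderLink :: String -> String -> String
renderLink label url
  | label == url = "<" ++ url ++ ">"
  | otherwise    = "[" ++ label ++ "](" ++ url ++ ")"
```

Here renderLink "site.com" "site.com" gives "<site.com>", while a link whose text differs from its target keeps the bracketed form.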
|
|
+ LaTeX writer now handles consecutive quotes properly:
for example, ``\,`hello'\,''
+ LaTeX reader now parses '\,' as empty Str
+ normalizeSpaces function in Shared now removes empty Str elements
+ Modified tests accordingly
git-svn-id: https://pandoc.googlecode.com/svn/trunk@506 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
git-svn-id: https://pandoc.googlecode.com/svn/trunk@501 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
list function that can be used to substitute one substring
for another in a string, like 'gsub' except without regular
expressions.
+ Use 'substitute' instead of 'gsub' in the LaTeX writer. This
avoids what appears to be a bug in Text.Regex, whereby "\\^"
matches "\350". There seems to be a slight speed improvement
as well. (Note: If this works, it would be good to replace
other uses of gsub that don't employ regexes with 'substitute'.)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@500 788f1e2b-df1e-0410-8736-df70ead52e1b
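
A regex-free substring replacement like the 'substitute' function mentioned above could look roughly like the following. The signature and the handling of an empty target are assumptions; the actual definition in pandoc may differ.

```haskell
import Data.List (isPrefixOf)

-- | Replace every occurrence of the sublist 'target' with 'replacement'.
-- Assumed behavior: an empty target leaves the input unchanged.
substitute :: Eq a => [a] -> [a] -> [a] -> [a]
substitute _ _ [] = []
substitute [] _ xs = xs
substitute target replacement xs@(y : ys)
  | target `isPrefixOf` xs =
      replacement ++ substitute target replacement (drop (length target) xs)
  | otherwise = y : substitute target replacement ys
```

Because the match is a literal prefix comparison, characters such as '^' or '\\' need no escaping, which is the property the commit relies on to sidestep the Text.Regex problem.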
|
|
DocBook, and HTML writers. The syntax is documented in
README. Tests have been added to the test suite.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@493 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
numerical entities, for portability across stylesheets.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@473 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
Markdown.pl:
+ title attribute comes after alt attribute
+ title is included even if null
git-svn-id: https://pandoc.googlecode.com/svn/trunk@445 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
is now handled in the Markdown and LaTeX readers, rather than in
the writers. The HTML writer has been rewritten to use the
prettyprinting library.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@436 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
style of implicit reference link. It now uses [this style][],
not [this style]. Reason: only newer, beta versions of Markdown
allow the single-bracket style.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@419 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
Text/Shared/Pandoc. (escapeSGML, stringToSGML, inTag,
inTagSimple, inTagIndented, selfClosingTag) These can be
used by both the HTML and Docbook writers.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@417 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
in Docbook writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@413 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
LineBreaks no longer cause ugly wrapping in Markdown output.
+ Replaced splitBySpace with the more general, polymorphic function
splitBy (in Text/Pandoc/Shared).
git-svn-id: https://pandoc.googlecode.com/svn/trunk@411 788f1e2b-df1e-0410-8736-df70ead52e1b
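
As an illustration of what a polymorphic splitBy might look like, here is a minimal sketch. The signature and the treatment of consecutive separators (which produce empty segments here) are assumptions, not taken from Text/Pandoc/Shared.

```haskell
-- | Split a list on every occurrence of a separator element.
-- Hypothetical sketch: consecutive separators yield empty segments,
-- and a trailing separator does not produce a final empty segment.
splitBy :: Eq a => a -> [a] -> [[a]]
splitBy _ [] = []
splitBy sep xs =
  let (first, rest) = break (== sep) xs
  in first : splitBy sep (drop 1 rest)
```

Splitting on a space then becomes the special case splitBy ' ', and the same function works on lists of any element type that supports equality.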
|
|
code accordingly.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@395 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
Though XML tools should support unicode, some people will be
using SGML tools, and these do not. Using entities makes the
docbook files more portable.
Also refactored encodeEntities and charToHtmlEntity in
HtmlEntities.hs
git-svn-id: https://pandoc.googlecode.com/svn/trunk@394 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
escaped characters rather than <programlisting> and CDATA.
Reason: XML source more easily editable and readable.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@393 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
writer. The line isn't necessary, since we have a case for
every kind of block element.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@388 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
git-svn-id: https://pandoc.googlecode.com/svn/trunk@386 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
option to pandoc, which forces it to stay as close as possible
to official Markdown syntax.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@347 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
+ Removed invisible anchors in front of header tags in HTML output.
Reason: no way to prevent duplicate ID attributes (which is invalid
HTML), since there might be duplicate header titles. See
http://six.pairlist.net/pipermail/markdown-discuss/2005-January/000975.html.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@306 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
+ This uncovered an existing bug in the RTF writer, which got indentation
wrong on footnotes occurring in indented blocks like lists. Fixed
this bug.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@263 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
git-svn-id: https://pandoc.googlecode.com/svn/trunk@260 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
git-svn-id: https://pandoc.googlecode.com/svn/trunk@258 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
git-svn-id: https://pandoc.googlecode.com/svn/trunk@257 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
(which seems to be the term that is used in this context).
git-svn-id: https://pandoc.googlecode.com/svn/trunk@255 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
by combining it with entity obfuscation.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@254 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
+ Reformatted code consistently.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@252 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
heading. The anchors are derived from the text of the section
heading as described in README. This makes it easy to insert
links that jump from one part of a document to another:
for example, '[back to the Introduction](#Introduction)'.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@246 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
'comparing' is from Data.Ord, which is not available in GHC 6.4.
+ Added line break after </li> in HTML footnote output, for easier
inspection of the source.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@245 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
git-svn-id: https://pandoc.googlecode.com/svn/trunk@241 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
from templates in src/templates, and so should not be in the
repository.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@234 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
for markdown footnotes. References are now like this[^1]
rather than like this^(1). There are corresponding changes
in the footnotes themselves. See the updated README for
more details.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@230 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
and symbols accordingly.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@224 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
cases where latex commands or HTML entity references appear
after quotes.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@202 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
of the readers to make spacing at end of output more consistent.
Modified tests accordingly.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@201 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
git-svn-id: https://pandoc.googlecode.com/svn/trunk@171 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
+ LaTeX reader did not parse metadata correctly. Now the title,
author, and date are parsed correctly, and everything else in
the preamble is skipped.
+ Simplified parsing of LaTeX command arguments and options.
The function commandArgs now returns a list of arguments OR
options (in whatever order they appear). The brackets are
included, and a new stripFirstAndLast function is provided
to strip them off when needed. This fixes a problem in dealing
with \newcommand, etc.
+ Added a "try" before "parser" in definition of notFollowedBy'
combinator. Adjusted the code using this combinator accordingly.
+ Changed handling of code blocks. Previously, some readers allowed
trailing newlines, while others stripped them. Now, all readers
strip trailing newlines in code blocks; writers insert a newline
at the end of code blocks as needed.
+ Changed test suite to reflect these changes.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@137 788f1e2b-df1e-0410-8736-df70ead52e1b
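
Two of the helpers implied by the commit above can be sketched in a few lines. The name stripFirstAndLast comes from the commit message, but its body here is an assumption; stripTrailingNewlines is a hypothetical name for the code-block normalization the readers are said to perform.

```haskell
-- | Drop the first and last characters, e.g. the brackets or braces
-- around a LaTeX option or argument. Strings shorter than two
-- characters become empty.
stripFirstAndLast :: String -> String
stripFirstAndLast str = drop 1 (take (length str - 1) str)

-- | Remove all newlines at the end of a code block (hypothetical helper).
stripTrailingNewlines :: String -> String
stripTrailingNewlines = reverse . dropWhile (== '\n') . reverse
```

For instance, stripFirstAndLast "[opt]" gives "opt", and stripTrailingNewlines "code\n\n" gives "code"; a writer can then append a single newline where its output format needs one.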
|
|
+ Recognize a double hyphen as an Em-dash, even when it occurs next
to punctuation (e.g. a quotation mark).
+ Collapse space around Em-dashes.
+ Process quotes before dashes. This way (foo -- 'bar') will turn into
(foo---`bar') instead of (foo---'bar').
git-svn-id: https://pandoc.googlecode.com/svn/trunk@49 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
+ use Helvetica instead of Times New Roman as default font
+ specify \f0 in every \pard; otherwise font sizes are not registered properly
+ modify test of RTF writer accordingly
git-svn-id: https://pandoc.googlecode.com/svn/trunk@32 788f1e2b-df1e-0410-8736-df70ead52e1b
|
|
git-svn-id: https://pandoc.googlecode.com/svn/trunk@2 788f1e2b-df1e-0410-8736-df70ead52e1b
|