Age | Commit message (Collapse) | Author | Files | Lines |
|
Only if --parse-raw.
|
|
|
|
Closes #893.
|
|
Column widths are divided equally.
TODO: Get column widths from col tags if present.
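
A minimal sketch of the equal-width rule described above, assuming widths are expressed as fractions of the full table width. `equalWidths` is a hypothetical helper written for illustration; it is not taken from the pandoc source.

```haskell
-- Hypothetical helper: give each of n columns the same fraction of the
-- total width. Returns an empty list when there are no columns.
equalWidths :: Int -> [Double]
equalWidths n
  | n <= 0    = []
  | otherwise = replicate n (1 / fromIntegral n)
```

For a four-column table this yields 0.25 for every column. Reading widths from col tags, as the TODO suggests, would replace this default only when those tags are present.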
|
|
This commit doesn't change the present behavior at all, but
it will make it easier to support non-simple tables in the future.
|
|
* Depend on pandoc 1.12.
* Added yaml dependency.
* `Text.Pandoc.XML`: Removed `stripTags`. (API change.)
* `Text.Pandoc.Shared`: Added `metaToJSON`.
This will be used in writers to create a JSON object for use
in the templates from the pandoc metadata.
* Revised readers and writers to use the new Meta type.
* `Text.Pandoc.Options`: Added `Ext_yaml_title_block`.
* Markdown reader: Added support for YAML metadata block.
Note that it must come at the beginning of the document.
* `Text.Pandoc.Parsing.ParserState`: Replace `stateTitle`,
`stateAuthors`, `stateDate` with `stateMeta`.
* RST reader: Improved metadata.
Treat initial field list as metadata when standalone specified.
Previously ALL fields "title", "author", "date" in field lists
were treated as metadata, even if not at the beginning.
Use `subtitle` metadata field for subtitle.
* `Text.Pandoc.Templates`: Export `renderTemplate'` that takes a string
instead of a compiled template.
* OPML template: Use 'for' loop for authors.
* Org template: '#+TITLE:' is inserted before the title.
Previously the writer did this.
|
|
- Specialize readWith to String input.
- On error have it print the line in which the error occurred,
with a caret pointing to the column.
- This should help diagnose parsing problems in LaTeX especially.
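
The error report described above can be sketched as follows. This is an illustration only, not the code from this commit: `showErrorContext` is a hypothetical name, and the 1-based line and column are assumed to match what Parsec's `SourcePos` reports.

```haskell
-- Hypothetical helper: show the line on which a parse error occurred,
-- followed by a caret under the offending column.
-- Both line and column are assumed to be 1-based.
showErrorContext :: String -> Int -> Int -> String
showErrorContext input line col =
  case drop (line - 1) (lines input) of
    (l:_) -> l ++ "\n" ++ replicate (col - 1) ' ' ++ "^"
    []    -> ""  -- the line number is past the end of the input
```

Calling `showErrorContext "ab\ncdef\n" 2 3` returns the line `cdef` followed by a second line with a caret under the `e`.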
|
|
|
|
|
|
|
|
Previously header ids were autogenerated by the writers.
Now they are generated (unless supplied explicitly) in the
markdown parser, if the `header_identifiers` extension is
selected.
In addition, the textile reader now supports id attributes on
headers.
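
As a rough illustration of what generating an identifier from header text involves, the sketch below follows a simplified form of the rules in pandoc's manual: drop most punctuation, join words with hyphens, lowercase, and strip everything before the first letter. `headerId` is a hypothetical function, it works on plain strings rather than inline elements, and it may differ from the code in this commit.

```haskell
import Data.Char (isAlpha, isAlphaNum, isSpace, toLower)
import Data.List (intercalate)

-- Hypothetical, simplified identifier generation for a header's text.
headerId :: String -> String
headerId =
    dropWhile (not . isAlpha)      -- identifiers start with a letter
  . map toLower
  . intercalate "-"                -- spaces become hyphens
  . words
  . filter keep                    -- drop punctuation except _ - .
  where
    keep c = isAlphaNum c || isSpace c || c `elem` "_-."
```

Under these assumptions `headerId "3. Applications"` gives `applications`. Uniqueness of identifiers within a document is not handled here.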
|
|
A tag must start with `<` followed by `!`, `?`, `/`, or a letter.
This makes it more useful in the wikimedia and markdown parsers.
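
The rule above can be expressed as a small predicate. `looksLikeTagStart` is a hypothetical name used for illustration; the actual parser in the HTML reader is not shown in this log.

```haskell
import Data.Char (isLetter)

-- Hypothetical check: does the input begin like an HTML tag, that is,
-- '<' immediately followed by '!', '?', '/', or a letter?
looksLikeTagStart :: String -> Bool
looksLikeTagStart ('<':c:_) = c `elem` "!?/" || isLetter c
looksLikeTagStart _         = False
```

With this check, text such as `a < b` or `<3` is not mistaken for markup, which matters in formats like markdown and MediaWiki where a bare `<` is common.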
|
|
|
|
|
|
|
|
Improved removal of markdown="1" attribute in Markdown reader.
|
|
|
|
Test individually for the extensions.
|
|
|
|
|
|
Better to keep reader and writer options separate.
|
|
This is the beginning of a larger transition that will make
Options, not ParserState, the parameter of the read functions.
(Options will also be used in writers, in place of WriterOptions.)
Next step is to remove strict, replacing it with granular
tests for different extensions.
|
|
This caused hangs in parsing certain markdown input using --strict.
|
|
|
|
No other module directly imports Parsec. This will make it easier
to change the parsing backend in the future, if we want to.
|
|
|
|
Closes #486.
|
|
Previously a paragraph containing just ` ` would be rendered
as an empty paragraph. Thanks to Paul Vorbach for pointing out the bug.
|
|
Closes #422: highlighting lost using `--self-contained`.
|
|
|
|
* Added stateLastStrPos to ParserState. This lets us keep track
of whether we're parsing the position immediately after a 'str'.
If we encounter a ' in such a location, it must be an apostrophe,
and can't be a single quote start.
* Set this in the markdown, textile, html, and rst str parsers.
* Closes #360.
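
The idea behind `stateLastStrPos` can be sketched outside of Parsec. The code below is an assumption-laden illustration, not the commit's implementation: a "str" is approximated as a run of alphanumeric characters, positions are plain character offsets, and `quoteStartCandidates` is a hypothetical name.

```haskell
import Data.Char (isAlphaNum)

-- For each ' in the input, report whether it could open a single-quoted
-- span. A ' that sits immediately after a str cannot.
quoteStartCandidates :: String -> [Bool]
quoteStartCandidates = go Nothing 0
  where
    -- lastStr holds the offset just past the most recent str character,
    -- playing the role of stateLastStrPos.
    go :: Maybe Int -> Int -> String -> [Bool]
    go _ _ [] = []
    go lastStr pos (c:cs)
      | isAlphaNum c = go (Just (pos + 1)) (pos + 1) cs
      | c == '\''    = (lastStr /= Just pos) : go lastStr (pos + 1) cs
      | otherwise    = go lastStr (pos + 1) cs
```

For the input `don't 'quote'` the three `'` characters give `[False, True, False]`: only the one after the space can start a quotation.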
|
|
It was always possible to include raw DocBook tags in a markdown
document, but now pandoc will be able to distinguish block from
inline tags and behave accordingly. Thus, for example,
<sidebar>
hello
</sidebar>
will not be wrapped in `<para>` tags.
|
|
See bug #274, which was not completely fixed by the last patch.
|
|
These aren't valid in HTML, but many HTML files produced by
Windows tools contain them. We substitute the correct Unicode
characters.
|
|
For example, in
Just a few glitches remaining.
<ul><li> In this situation, one loses the list.
</ul>
And in this, the preformatting.
<pre>Preformatted text not starting with its own blank line.
</pre>
Thanks to Dirk Laurie for noticing the issue.
|
|
Closes #274.
|
|
* Skip spaces after <b>, <emph>, etc.
* Convert Plain elements into Para when they're in a list
item with Para, Pre, BlockQuote, CodeBlock.
An example of HTML that pandoc handles better now:
~~~~
<h4> Testing html to markdown </h4>
<ul>
<li>
<b> An item in a list </b>
<p> An introductory sentence.
<pre>
Some preformatted text
at this stage comes next.
But alas! much havoc
is wrought by Pandoc.
</pre>
</ul>
~~~~
Thanks to Dirk Laurie for reporting the issues.
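
The Plain-to-Para conversion listed above can be sketched with a cut-down block type. The `Block` type here is a simplified stand-in defined only for this example (pandoc's real type has more constructors and richer contents), and `fixListItem` is a hypothetical name.

```haskell
-- Simplified stand-in for pandoc's Block type, for illustration only.
data Block
  = Plain String
  | Para String
  | BlockQuote [Block]
  | CodeBlock String
  deriving (Eq, Show)

-- If a list item contains any paragraph-level block, promote its Plain
-- elements to Para so the whole item is rendered consistently.
fixListItem :: [Block] -> [Block]
fixListItem bs
  | any paraLike bs = map promote bs
  | otherwise       = bs
  where
    paraLike (Plain _) = False
    paraLike _         = True
    promote (Plain s)  = Para s
    promote b          = b
```

An item holding only `Plain` elements is left untouched, so tight lists keep their compact form.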
|
|
Additional related changes:
* URLs in Code in autolinks now use class "url".
* Require highlighting-kate 0.2.8.2, which omits the final <br/> tag,
essential for inline code.
|
|
The old TeX, HtmlInline and RawHtml elements have been removed
and replaced by generic RawInline and RawBlock elements.
All modules updated to use the new raw elements.
|
|
Resolves Issue #106. Thanks to Rodja Trappe for the idea
and some sample code.
|
|
This avoids the need for manual parsing all over the place.
|
|
|
|
|
|
* The new reader is faster and more accurate.
* API changes for Text.Pandoc.Readers.HTML:
- removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag,
anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType,
htmlBlockElement, htmlComment
- added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag
* tagsoup is a new dependency.
* Text.Pandoc.Parsing: Generalized type on readWith.
* Benchmark.hs: Added length calculation to force full evaluation.
* Updated HTML reader tests.
* Updated markdown and textile readers to use the functions from
the HTML reader.
* Note: The markdown reader now correctly handles some cases it did not
before. For example:
<hr/>
is reproduced without adding a space.
<script>
a = '<b>';
</script>
is parsed correctly.
|
|
I had previously assumed that we needed to ignore
</script> occurring in a string literal or JavaScript
comment. It turns out, though, that browsers aren't
that smart.
|
|
It did not work before, because - and quotes were gobbled
up by the str parser.
|
|
Resolves Issue #274.
|
|
This is better done on the resulting HTML; use the xss-sanitize library
for this. xss-sanitize is based on pandoc's sanitization, but improves
it.
- Removed stateSanitize from ParserState.
- Removed --sanitize-html option.
|
|
|
|
|