Age | Commit message (Collapse) | Author | Files | Lines |
|
This originated with @dubiousjim's observation in #1419
that there was a typo in the definition of enDash.
It returned an em dash character instead of an en dash.
I thought about why this had not been noticed before, and
realized that en dashes were just being parsed as regular
symbols.
That made me realize that, now that we no longer have
dedicated EnDash, EmDash, and Ellipses inline elements, as
we used to in pandoc, we no longer need to parse the
Unicode characters specially. This allowed a considerable
simplification of the code.
Partially resolves #1419.
|
|
|
|
Replaced all inline occurrences of fmap with the more idiomatic (<$>).
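
As a minimal illustration of the substitution (the function names below are hypothetical and not taken from the pandoc source), the two definitions are interchangeable, since `(<$>)` is the infix form of `fmap`:

```haskell
import Data.Char (toUpper)

-- Hypothetical helper, not from pandoc: upper-case a String held in any
-- Functor, written with prefix fmap.
shoutPrefix :: Functor f => f String -> f String
shoutPrefix x = fmap (map toUpper) x

-- The same function written with the infix operator (<$>).
shoutInfix :: Functor f => f String -> f String
shoutInfix x = map toUpper <$> x
```

The infix form reads like ordinary function application, which is presumably why it is considered more idiomatic inside applicative parser code.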
|
|
This function is equivalent to the more general (<*) which is defined in
Control.Applicative. This change makes pandoc code easier to understand for
those not familiar with the codebase.
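
A small sketch of what `(<*)` does in a parser. To stay self-contained it uses `ReadP` from base rather than Parsec, and the parser and runner names are hypothetical:

```haskell
import Data.Char (isAlpha)
import Text.ParserCombinators.ReadP (ReadP, char, eof, munch1, readP_to_S)

-- Parse a word, then require (and discard) a trailing semicolon.
-- (<*) runs both parsers in order and keeps only the left result.
wordThenSemi :: ReadP String
wordThenSemi = munch1 isAlpha <* char ';'

-- Hypothetical runner: succeeds only if the whole input is consumed.
runWord :: String -> Maybe String
runWord s = case readP_to_S (wordThenSemi <* eof) s of
  [(r, "")] -> Just r
  _         -> Nothing
```

Here `runWord "abc;"` yields `Just "abc"`, while `runWord "abc"` yields `Nothing` because the right-hand parser fails.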
|
|
Previously, these general combinators could not be used with the ParsecT
transformer; with the more general types, this is now possible.
|
|
This is used to keep track of the ending tag we're waiting
for when we're parsing inside HTML block tags.
|
|
Closes #1313.
|
|
The function can be used by other readers, so it is made accessible for
all parsers.
|
|
Both `ParserState` and `OrgParserState` keep track of the parser position at
which the last string ended. This patch introduces a new class
`HasLastStrPosition` and makes the above types instances of that class. This
enables the generalization of functions updating the state or checking if one
is right after a string.
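
The idea can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: `Pos` stands in for Parsec's `SourcePos`, the two state types are stripped-down placeholders, and the method and function names are illustrative.

```haskell
-- Simplified stand-in for a source position: (line, column).
type Pos = (Int, Int)

-- Sketch of the class; method names are illustrative.
class HasLastStrPosition st where
  setLastStrPos :: Pos -> st -> st
  getLastStrPos :: st -> Maybe Pos

-- Two hypothetical, stripped-down parser states.
newtype MdState  = MdState  { mdLastStr  :: Maybe Pos }
newtype OrgState = OrgState { orgLastStr :: Maybe Pos }

instance HasLastStrPosition MdState where
  setLastStrPos p st = st { mdLastStr = Just p }
  getLastStrPos = mdLastStr

instance HasLastStrPosition OrgState where
  setLastStrPos p st = st { orgLastStr = Just p }
  getLastStrPos = orgLastStr

-- One generic check that works for every instance: is the current
-- position exactly where the last string ended?
afterString :: HasLastStrPosition st => Pos -> st -> Bool
afterString here st = getLastStrPos st == Just here
```

With the class in place, `afterString` is written once and shared by both readers instead of being duplicated per state type.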
|
|
|
|
Closes #1274.
Rewrote handleIncludes.
We now report the actual source file and position where the error
occurs, even if it is included. We do this by inserting special
commands, `\PandocStartInclude` and `\PandocEndInclude`, that encode
this information in the preprocessing phase.
Also generalized the types of a couple of functions from
`Text.Pandoc.Parsing`.
|
|
element and updated files accordingly
|
|
This is primarily for use in the LaTeX reader, so far.
|
|
Removed updateHeaderMap, setHeaderMap, getHeaderMap,
updateIdentifierList, setIdentifierList, getIdentifierList.
|
|
Contrary to the previous commit message, there was no API
change, since Text.Pandoc.Parsing is not an exposed module.
|
|
Previously these were typeclasses of monads. They've been changed
to be typeclasses of states. This simplifies the instance definitions
and provides more flexibility.
This is an API change! However, it should be backwards compatible
unless you're defining instances of HasReaderOptions, HasHeaderMap,
or HasIdentifierList. The old getOption function should work as
before (albeit with a more general type).
The function askReaderOption has been removed.
extractReaderOptions has been added.
getOption has been given a default definition.
In HasHeaderMap, extractHeaderMap and updateHeaderMap have been added.
Default definitions have been given for getHeaderMap, putHeaderMap,
and modifyHeaderMap.
In HasIdentifierList, extractIdentifierList and updateIdentifierList
have been added. Default definitions have been given for
getIdentifierList, putIdentifierList, and modifyIdentifierList.
The ultimate goal here is to allow different parsers to use their
own, tailored parser states (instead of ParserState) while still
using shared functions.
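
A reduced sketch of the "typeclass of states" design, using `HasReaderOptions` as the example. The record fields and the `WikiState` type are hypothetical, and `getOption` is modelled here as a pure function of the state to avoid dependencies; the real function presumably runs in the parser monad.

```haskell
-- Hypothetical, heavily reduced options record.
data ReaderOptions = ReaderOptions
  { readerSmart   :: Bool
  , readerColumns :: Int
  }

-- A class of *states*: an instance only says where the options live.
class HasReaderOptions st where
  extractReaderOptions :: st -> ReaderOptions
  -- Default definition: project one field out of the extracted options.
  getOption :: (ReaderOptions -> b) -> st -> b
  getOption f = f . extractReaderOptions

-- A hypothetical custom parser state for a reader that does not use
-- ParserState.
data WikiState = WikiState
  { wikiOptions :: ReaderOptions
  , wikiDepth   :: Int
  }

-- The instance supplies only the extractor; getOption comes for free.
instance HasReaderOptions WikiState where
  extractReaderOptions = wikiOptions
```

For example, `getOption readerColumns (WikiState (ReaderOptions True 80) 0)` evaluates to `80` without `WikiState` defining `getOption` itself.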
|
|
|
|
|
|
rST parser now supports:
- All built-in rST roles
- New role definition
- Role inheritance
Issues/TODO:
- Silently ignores illegal fields on roles
- Silently drops class annotations for roles
- Only supports :format: fields with a single format for :raw: roles;
supporting multiple formats requires a change to
Text.Pandoc.Definition.Format.
- Allows direct use of the :raw: role; rST only allows indirect (i.e.,
inherited) use of :raw:.
|
|
Replaces long conditional chains with calls to `elem` and `notElem`.
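
The kind of rewrite described, shown on a hypothetical predicate (not taken from the pandoc source):

```haskell
-- Before: a chain of equality tests joined with (||).
isListMarkerChain :: Char -> Bool
isListMarkerChain c = c == '*' || c == '+' || c == '-'

-- After: the same predicate expressed with elem.
isListMarker :: Char -> Bool
isListMarker c = c `elem` "*+-"

-- The negated form: a chain of (/=) joined with (&&) becomes notElem.
isPlainChar :: Char -> Bool
isPlainChar c = c `notElem` "*+-"
```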
|
|
* Moved inlineMath, displayMath from Markdown reader to Parsing.
* Export them from Parsing. (API change.)
* Generalize their types.
|
|
New type classes HasReaderOptions, HasIdentifierList, HasHeaderMap.
These allow certain common functions to be reused even in parsers
that use custom state (instead of ParserState), such as the MediaWiki
reader.
Minor API bump.
|
|
Text.Pandoc.Parsing now exports registerHeader, which can be
used in other readers.
|
|
|
|
This allows pandoc to compile with tagsoup 0.13.x.
Thanks to Dirk Ullrich for the patch.
|
|
Closes #933.
|
|
|
|
* Depend on pandoc 1.12.
* Added yaml dependency.
* `Text.Pandoc.XML`: Removed `stripTags`. (API change.)
* `Text.Pandoc.Shared`: Added `metaToJSON`.
This will be used in writers to create a JSON object for use
in the templates from the pandoc metadata.
* Revised readers and writers to use the new Meta type.
* `Text.Pandoc.Options`: Added `Ext_yaml_title_block`.
* Markdown reader: Added support for YAML metadata block.
Note that it must come at the beginning of the document.
* `Text.Pandoc.Parsing.ParserState`: Replace `stateTitle`,
`stateAuthors`, `stateDate` with `stateMeta`.
* RST reader: Improved metadata.
Treat initial field list as metadata when standalone specified.
Previously ALL fields "title", "author", "date" in field lists
were treated as metadata, even if not at the beginning.
Use `subtitle` metadata field for subtitle.
* `Text.Pandoc.Templates`: Export `renderTemplate'` that takes a string
instead of a compiled template.
* OPML template: Use 'for' loop for authors.
* Org template: '#+TITLE:' is inserted before the title.
Previously the writer did this.
|
|
|
|
- Specialize readWith to String input.
- On error have it print the line in which the error occurred,
with a caret pointing to the column.
- This should help diagnose parsing problems in LaTeX especially.
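
A minimal sketch of the error display described above, assuming 1-based line and column numbers. The helper name is hypothetical and this is not the code in `readWith`:

```haskell
-- Hypothetical helper: render the offending line of the input with a
-- caret under the given column. Line and column are 1-based.
showErrorContext :: String -> Int -> Int -> String
showErrorContext input line col =
  case drop (line - 1) (lines input) of
    (l : _) -> l ++ "\n" ++ replicate (col - 1) ' ' ++ "^"
    []      -> ""  -- the position is past the end of the input
```

For the input `"ab\ncdef\ngh"` with line 2 and column 3, this produces the line `cdef` followed by a line with two spaces and a caret.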
|
|
Don't treat punctuation before percent-encoding as final punctuation.
Don't treat '+' as final punctuation.
|
|
`<` is no longer allowed in URLs, according to the uri parser
in Text.Pandoc.Parsing.
Added a test case.
|
|
(Markdown reader.)
|
|
Added tests for entities in titles and links.
Closes #723.
|
|
A markdown link `<http://göogle.com>` should
be a link to http://göogle.com.
|
|
The call to toLower in ciMatch was very expensive (and very often
used), because toLower from Data.Char calls a fully Unicode-aware
function. This optimization avoids the call to toLower
for the most common, ASCII cases. This dramatically reduces the
speed penalty that comes from enabling the `autolink_bare_uris`
extension. The penalty is still substantial (in one test, from 0.33s
to 0.44s), but nowhere near what it used to be.
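
One way such a fast path might look; this is a sketch of the technique, not the code from the commit, and both function names are hypothetical:

```haskell
import Data.Char (chr, ord, toLower)

-- Lower-case a character, taking a cheap arithmetic path for ASCII and
-- falling back to the Unicode-aware toLower only for other characters.
toLowerFast :: Char -> Char
toLowerFast c
  | c >= 'A' && c <= 'Z' = chr (ord c + 32)  -- ASCII upper case
  | c < '\128'           = c                 -- other ASCII: unchanged
  | otherwise            = toLower c         -- non-ASCII: full lookup

-- Hypothetical case-insensitive comparison built on the fast path.
ciEqual :: String -> String -> Bool
ciEqual xs ys = map toLowerFast xs == map toLowerFast ys
```

The result agrees with `toLower` on every character; only the cost of the common ASCII case changes.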
|
|
Now LaTeX macro definitions are preserved when the output is LaTeX,
and applied when it is another format, as originally intended.
Partially addresses #730.
\providecommand is still not supported. For this we need changes
to texmath.
|
|
|
|
|
|
|
|
Not only faster but uses less memory.
|
|
The bug prevented an autolink at the end of a string (e.g.
at the end of a line block line) from counting as a link.
Closes #711.
|
|
Added tests.
|
|
|
|
oneOfStrings will now take the longest match it can in a
list of strings, so if 'foo' and 'foobar' are both included,
'foobar' will match even if 'foo' is first in the list.
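
The longest-match behaviour can be sketched on plain strings. This is an illustration only: the real `oneOfStrings` is a parser combinator, and the function name below is hypothetical.

```haskell
import Data.List (isPrefixOf, maximumBy)
import Data.Ord (comparing)

-- Return the longest candidate that is a prefix of the input, if any.
-- The order of the candidate list does not affect which string wins.
longestMatch :: [String] -> String -> Maybe String
longestMatch candidates input =
  case filter (`isPrefixOf` input) candidates of
    []      -> Nothing
    matches -> Just (maximumBy (comparing length) matches)
```

With candidates `["foo", "foobar"]`, the input `"foobarbaz"` matches `"foobar"`, while `"foobaz"` falls back to `"foo"`.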
|
|
* It no longer uses Network.URI's URI parser, which is too restrictive
(not allowing Unicode URIs unless encoded).
* It allows many more schemes.
* It better handles punctuation so as to avoid capturing trailing
punctuation in bare URLs.
|
|
Otherwise Network.URI.parseURI fails on e.g. Chinese
URLs. Changed an incorrect test in markdown-reader-more.
|
|
This makes 's', 'l', etc. parse properly.
Formerly we had some English-centric heuristics, but they
are no longer needed now that we keep track of the last
'Str' position in state.
Closes #698.
|
|
This will be used by both RST and markdown readers.
|
|
|