Age | Commit message (Collapse) | Author | Files | Lines |
|
Closes #1718.
Parsing.ParserState: Make stateNotes' a Map, add stateNoteRefs.
|
|
This is a verison of parseFromString specialied to
ParserState, which resets stateLastStrPos at the end.
This is almost always what we want.
This fixes a bug where `_hi_` wasn't treated as emphasis in
the following, because pandoc got confused about the
position of the last word:
- [o] _hi_
Closes #3690.
|
|
We also export the set of known `schemes`.
The new function replaces the function of the same name
from `Network.URI`, as the latter did not check whether a scheme is
well-known. E.g. MediaWiki wikis frequently feature pages with names
like `User:John`. These links were interpreted as URIs, thus turning
internal links into global links. This is prevented by also checking
whether the scheme of a URI is frequently used (i.e. is IANA registered
or an otherwise well-known scheme).
Fixes: #2713
Update set of well-known URIs from IANA list
All official IANA schemes (as of 2017-05-22) are included in the set of
known schemes. The four non-official schemes doi, isbn, javascript, and
pmid are kept.
|
|
|
|
Move anyLineNewline to Parsing.hs
|
|
|
|
The `insertIncludeFiles` function was generalized and renamed to
`insertIncludedFiles'`; the specialized versions are based on that.
|
|
The `insertIncludeFile` function is generalized to work with all parser
states which are instances of that class.
|
|
Calling `tail` on an empty list raises an exception, while calling the
otherwise equivalent `drop 1` will return the empty list again.
|
|
This follows the suggestions given by the FSF for GPL licensed software.
<https://www.gnu.org/prep/maintain/html_node/Copyright-Notices.html>
|
|
The grid table parsers for markdown and rst was combined into one single
parser, slightly changing parsing behavior of both parsers:
- The markdown parser now compactifies block content cell-wise: pure
text blocks in cells are now treated as paragraphs only if the cell
contains multiple paragraphs, and as plain blocks otherwise. Before,
this was true only for single-column tables.
- The rst parser now accepts newlines and multiple blocks in header
cells.
Closes: #3638
|
|
The parsing functions `tableWith` and `gridTableWith` are generalized to
work with more parsers. The parser state only has to be an instance of
the `HasOptions` class instead of requiring a concrete type. Block
parsers are required to return blocks wrapped into a monad, as this
makes it possible to use parsers returning results wrapped in `Future`s.
|
|
The `F` monads used for delayed evaluation of certain values in the
Markdown and Org readers are based on a shared data type capturing the
common pattern of both `F` types.
|
|
This avoids parsing bare URIs that start with a scheme
+ colon + `*`, `_`, or `]`.
Closes #3570.
|
|
Closes #1905.
Removed stateChapters from ParserState.
Now we parse chapters as level 0 headers, and parts as level -1 headers.
After parsing, we check for the lowest header level, and if it's
less than 1 we bump everything up so that 1 is the lowest header level.
So `\part` will always produce a header; no command-line options
are needed.
|
|
As noted in the previous commit, an autogenerated identifier
may still coincide with an explicit identifier that is given
for a header later in the document, or with an identifier on
a div, span, link, or image. This commit adds a warning
in this case, so users can supply an explicit identifier.
* Added `DuplicateIdentifier` to LogMessage.
* Modified HTML, Org, MediaWiki readers so their custom
state type is an instance of HasLogMessages. This is necessary
for `registerHeader` to issue warnings.
See #1745.
|
|
Previously only autogenerated ids were added to the list
of header identifiers in state, so explicit ids weren't taken
into account when generating unique identifiers. Duplicated
identifiers could result.
This simple fix ensures that explicitly given identifiers are
also taken into account.
Fixes #1745.
Note some limitations, however. An autogenerated identifier
may still coincide with an explicit identifier that is given
for a header later in the document, or with an identifier on
a div, span, link, or image. Fixing this would be much more
difficult, because we need to run `registerHeader` before
we have the complete parse tree (so we can't get a complete
list of identifiers from the document by walking the tree).
However, it might be worth issuing warnings for duplicate
header identifiers; I think we can do that. It is not
common for headers to have the same text, and the issue
can always be worked around by adding explicit identifiers,
if the user is aware of it.
|
|
This is newly exported in texmath 0.9.3.
Note that this means that `macro` will now parse one
macro at a time, rather than parsing a whole group together.
|
|
The citations appear at the end of the document as a definition
list in a special div with id `citations`.
Citations link to the definitions.
Added stateCitations to ParserState.
Closes #853.
|
|
This reverts commit 3c427fc17d53a564305aadde015dd2f048d9ff71.
|
|
in hopes that this will help the ghc 7.8.4 build...
|
|
|
|
We need to do logging by updating parser state, or we'll get
inappropriate and repeated log messages when there is parser
backtracking.
See #3447.
|
|
|
|
* Export readFileFromDirs from Class.
* Export insertIncludedFile from Parsing.
Simplified code in LaTeX/RST readers.
|
|
This can be used in several different modules, not just
LaTeX reader.
|
|
|
|
Changed signatures on Parsing.tableWith and Parsing.gridTableWith.
|
|
API change. CLI option change.
|
|
Now you will need to do
-f markdown+smart
instead of
-f markdown --smart
This change opens the way for writers, in addition to readers,
to be sensitive to +smart, but this change hasn't yet been made.
API change. Command-line option change.
Updated manual.
|
|
The type is implemented in terms of an underlying bitset
which should be more efficient.
API change: from Text.Pandoc.Extensions export Extensions,
emptyExtensions, extensionsFromList, enableExtension, disableExtension,
extensionEnabled.
|
|
* Removed handleIncludes from LaTeX reader [API change].
* Now the ordinary LaTeX reader handles includes in a way
that is appropriate to the monad it is run in.
|
|
Removed stateWarnings, addWarning, and readWithWarnings.
|
|
It doesn't help to solve the problem in 7.8.
|
|
It's having trouble figuring out HasQuoteContext.
|
|
|
|
We can remove this if we can figure out a better way to do this.
|
|
Technically `**@user` is a valid email address, but if we
allow things like this, we get bad results in markdown flavors
that autolink raw email addresses. (See #2940.)
So we exclude a few valid email addresses in order to
avoid these more common bad cases.
Closes #2940.
|
|
Line blocks are allowed to contain empty lines and should be parsed as a
single block in that case. Previously an empty (line block) line would
have terminated parsing of the line block element.
|
|
We already lower-bound tagsoup at 0.13.7, which means we were always
running the compatibility layer (it was conditional on min value
0.13). Better to just use `lookupEntity` from the library directly, and
convert a string to a char if need be.
|
|
This was only necessary for GHC versions with base below 4.5
(i.e., ghc < 7.4).
|
|
|
|
|
|
|
|
This avoids performance problems in documents with many identically
named headers.
Closes #2671.
|
|
Issue submitted at tagsoup.
|
|
- Text.Pandoc.XML.fromEntities: handle entities without a
semicolon. Always lookup character references with the
trailing ';', even if it wasn't present. And never add
it when looking up numerical entities. (This is what
tagsoup seems to require.)
- Text.Pandoc.Parsing.characterReference: Always lookup
character references with the trailing ';', and leave off
the ';' when looking up numerical entities.
This fixes a regression for e.g. `⟨`.
|
|
We were capturing final colons as in [@foo: bar];
the citation id was being parsed as "@foo:".
Closes jgm/pandoc-citeproc#201.
|
|
mb21-new-image-attributes
* Bumped version to 1.16.
* Added Attr field to Link and Image.
* Added `common_link_attributes` extension.
* Updated readers for link attributes.
* Updated writers for link attributes.
* Updated tests
* Updated stack.yaml to build against unreleased versions of
pandoc-types and texmath.
* Fixed various compiler warnings.
Closes #261.
TODO:
* Relative (percentage) image widths in docx writer.
* ODT/OpenDocument writer (untested, same issue about percentage widths).
* Update pandoc-citeproc.
|
|
Closes jgm/pandoc-citeproc#166.
|