Age | Commit message (Collapse) | Author | Files | Lines |
|
Preserve all attributes in img tags. If attributes have a `data-`
prefix, it will be stripped. In particular, this preserves a
`data-external` attribute as an `external` attribute in the pandoc AST.
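
The renaming described above can be sketched as follows. This is a minimal illustration, not the reader's actual code: `stripDataPrefix` and `imgAttributes` are hypothetical names, and attributes are modeled as plain `String` pairs rather than pandoc's `Text`-based types.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- | Hypothetical helper: drop a leading "data-" from an attribute key,
-- leaving keys without that prefix unchanged.
stripDataPrefix :: String -> String
stripDataPrefix key = fromMaybe key (stripPrefix "data-" key)

-- | Apply the renaming to every (key, value) pair found on an <img> tag.
imgAttributes :: [(String, String)] -> [(String, String)]
imgAttributes = map (\(k, v) -> (stripDataPrefix k, v))
```

Under these assumptions, a `data-external` key comes out as `external`, while a key such as `width` passes through untouched.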
|
|
If a code block is defined with `<pre><code
class="language-x">…</code></pre>`, where the `<pre>` element has no
attributes, then the attributes from the `<code>` element are used
instead. Any leading `language-` prefix in the code's *class*
attribute is dropped to improve syntax highlighting.
Closes: #7221
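
A rough sketch of the fallback rule, under stated assumptions: `Attr` here is a `String`-based stand-in for pandoc's `Attr` triple, and `codeBlockAttr` and `dropLanguagePrefix` are hypothetical names. Whether the real reader strips the prefix only in the fallback case is also an assumption of this sketch.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- Simplified stand-in for pandoc's Attr: (identifier, classes, key-value pairs).
type Attr = (String, [String], [(String, String)])

emptyAttr :: Attr
emptyAttr = ("", [], [])

-- | Hypothetical helper: "language-haskell" becomes "haskell".
dropLanguagePrefix :: String -> String
dropLanguagePrefix cls = fromMaybe cls (stripPrefix "language-" cls)

-- | Choose the attributes for the code block: use those of <pre> if it
-- has any, otherwise fall back to those of <code>, with the
-- "language-" prefix removed from each class.
codeBlockAttr :: Attr -> Attr -> Attr
codeBlockAttr preAttr (ident, classes, kvs)
  | preAttr /= emptyAttr = preAttr
  | otherwise            = (ident, map dropLanguagePrefix classes, kvs)
```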
|
|
HTML5 `<header>` elements are treated like `<div>` elements.
|
|
The tags `<title>` and `<h1 class="title">` often contain the same
information, so the latter was dropped from the document. However, as
this can lead to loss of information, the heading is now always
retained.
Use `--shift-heading-level-by=-1` to turn the `<h1>` into the document
title, or a filter to restore the previous behavior.
Closes: #2293
|
|
Prevent the reader from crashing if the HTML input contains an unmatched
closing `</script>` tag.
Fixes: #7282
|
|
Previously, when multiple file arguments were provided, pandoc
simply concatenated them and passed the contents to the readers,
which took a Text argument.
As a result, the readers had no way of knowing which file
was the source of any particular bit of text. This meant that
we couldn't report accurate source positions on errors or
include accurate source positions as attributes in the AST.
More seriously, it meant that we couldn't resolve resource
paths relative to the files containing them
(see e.g. #5501, #6632, #6384, #3752).
Add Text.Pandoc.Sources (exported module), with a `Sources` type
and a `ToSources` class. A `Sources` wraps a list of `(SourcePos,
Text)` pairs. [API change] A parsec `Stream` instance is provided for
`Sources`. The module also exports versions of parsec's `satisfy` and
other Char parsers that track source positions accurately from a
`Sources` stream (or any instance of the new `UpdateSourcePos` class).
Text.Pandoc.Parsing now exports these modified Char parsers instead of
the ones parsec provides. Modified parsers to use a `Sources` as stream
[API change].
The readers that previously took a `Text` argument have been
modified to take any instance of `ToSources`. So, they may still
be used with a `Text`, but they can also be used with a `Sources`
object.
In Text.Pandoc.Error, modified the constructor PandocParsecError
to take a `Sources` rather than a `Text` as first argument,
so parse error locations can be accurately reported.
T.P.Error: showPos, do not print "-" as source name.
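
To illustrate why concatenation loses information, here is a simplified model that uses only `base`. It is not the Text.Pandoc.Sources API: `Chunk`, `locate`, and `locateConcatenated` are hypothetical names, and each chunk is tagged with a file name instead of a parsec `SourcePos`.

```haskell
-- Hypothetical, simplified model: each chunk of input remembers its file.
type Chunk = (FilePath, String)

-- | Find the file and 1-based line of the first line satisfying the
-- predicate, searching the chunks in order.
locate :: (String -> Bool) -> [Chunk] -> Maybe (FilePath, Int)
locate p chunks =
  case [ (name, n)
       | (name, body) <- chunks
       , (n, l) <- zip [1 ..] (lines body)
       , p l ] of
    (hit : _) -> Just hit
    []        -> Nothing

-- | The same search after concatenating all inputs: only a line number
-- in the combined text is available, with no file name.
locateConcatenated :: (String -> Bool) -> [Chunk] -> Maybe Int
locateConcatenated p chunks =
  case [ n | (n, l) <- zip [1 ..] (lines (concatMap snd chunks)), p l ] of
    (n : _) -> Just n
    []      -> Nothing
```

With two input files, `locate` can report a file name and a line within that file, whereas `locateConcatenated` can only report a line in the combined text.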
|
|
|
|
|
|
Also, remove exported class NamedTag(..) [API change].
This was just intended to smooth over the transition from String to Text
and is no longer needed.
The functions isInlineTag and isBlockTag are no longer
polymorphic.
|
|
Do a lookahead to find the right parser to use.
Benchmark time drops from 34 ms to 23 ms, with less allocation.
Also speeds up the epub reader.
|
|
- If src is empty, we simply skip the iframe.
- If src is invalid or cannot be fetched, we issue a warning
and skip instead of failing with an error.
- Closes #7099.
|
|
|
|
The `renderTags'` function was duplicated when the reader used `Text` as
its string type. The duplication is no longer necessary.
A side effect of this change is that empty `<col>` elements are written
as self-closing tags in raw HTML blocks.
|
|
These (as well as lang attributes on html) should update
lang in metadata. See #6938.
|
|
Previously we stripped attribute prefixes, reading
`xml:lang` as `lang` for example. This resulted in
two duplicate `lang` attributes when `xml:lang` and
`lang` were both used. This commit causes the prefixes
to be retained, and also avoids invalid duplicate
attributes.
Closes #6938.
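
One way to avoid duplicate keys, sketched with `base` only. `dedupAttributes` is a hypothetical name, and keeping the first occurrence of each key is an assumption of this sketch, not something the commit message states.

```haskell
import Data.Function (on)
import Data.List (nubBy)

-- | Keep the first occurrence of each attribute key. Prefixed keys
-- such as "xml:lang" are left as they are, so they no longer collide
-- with "lang".
dedupAttributes :: [(String, String)] -> [(String, String)]
dedupAttributes = nubBy ((==) `on` fst)
```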
|
|
- `<tfoot>` elements are no longer added to the table body but used as
table footer.
- Separate `<tbody>` elements are no longer combined into one.
- Attributes on `<thead>`, `<tbody>`, `<th>`/`<td>`, and `<tfoot>`
elements are preserved.
|
|
|
|
This removes the `foldOrElse` function from the internal Text.Pandoc.CSS
module.
|
|
|
|
|
|
|
|
Reducing module size should reduce memory use during compilation.
This is preparatory work to tackle support for more table features.
|
|
See #6770.
|
|
...unless `raw_html` is set in the reader (in which case
the svg is passed through as raw HTML).
Closes #6770.
|
|
|
|
* Fix hlint suggestions, update hlint.yaml
Most suggestions were redundant brackets. Some required
LambdaCase.
The .hlint.yaml file had a small typo, and didn't ignore camelCase
suggestions in certain modules.
|
|
Closes #6385. (The summary element needs to be the first
child of details and should not be enclosed by p tags.)
NOTE: you need to include a blank line before the closing
`</details>`, if you want the last part of the content to
be parsed as a paragraph.
|
|
|
|
Deprecate `underlineSpan` in Shared in favor of `Text.Pandoc.Builder.underline`.
|
|
|
|
The Builder.simpleTable now only adds a row to the TableHead when the
given header row is not null. This uncovered an inconsistency in the
readers: some would unconditionally emit a header filled with empty
cells, even if the header was not present. Now every reader has the
conditional behaviour. Only the XWiki writer depended on the header
row being always present; it now pads its head as necessary.
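
The conditional behaviour can be sketched like this. Both functions are hypothetical and use plain `String` cells; they only illustrate the idea of emitting a header row when one is given and padding on the writer side when one is required.

```haskell
-- | Hypothetical helper: produce a header row only when the given row
-- is non-empty, otherwise produce no header rows at all.
headerRows :: [String] -> [[String]]
headerRows cells = [cells | not (null cells)]

-- | Hypothetical padding for a writer that needs a header row in any
-- case: reuse the first header row if present, else emit empty cells.
paddedHeader :: Int -> [[String]] -> [String]
paddedHeader columns rows =
  case rows of
    (r : _) -> r
    []      -> replicate columns ""
```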
|
|
- Writers.Native is now adapted to the new Table type.
- Inline captions should now be conditionally wrapped in a Plain, not
a Para block.
- The toLegacyTable function now lives in Writers.Shared.
|
|
|
|
|
|
See https://developer.mozilla.org/en-US/docs/Web/HTML/Element/bdo
Closes #5794
|
|
Closes #6247.
|
|
This should speed up recompilation after changes in `Text.Pandoc.Class`,
as the number of modules affected by a change will be smaller in
general. It also offers faster insights into the parts of `T.P.Class`
used within a module.
|
|
* Use implicit Prelude
The previous behavior was introduced as a fix for #4464. It seems that
this change alone did not fix the issue, and `stack ghci` and `cabal
repl` only work with GHC 8.4.1 or newer, as no custom Prelude is loaded
for these versions. Given this, it seems cleaner to revert to the
implicit Prelude.
* PandocMonad: remove outdated check for base version
Only base versions 4.9 and later are supported, the check for
`MIN_VERSION_base(4,8,0)` is therefore unnecessary.
* Always use custom prelude
Previously, the custom prelude was used only with older GHC versions, as
a workaround for problems with ghci. The ghci problems are resolved by
replacing package `base` with `base-noprelude`, allowing for consistent
use of the custom prelude across all GHC versions.
|
|
* Update copyright year
* Copyright: add notes for Lua and Jira modules
|
|
* Remove unnecessary fmaps and only do toMilliseconds once
* Share the input tuple instead of making a new one
* Lift return out of if
* Simplify case statements
* Lift DottedNum out of the case statements
* Use st instead of mbs
* Use setState instead of updateState now that we have the whole state around
|
|
And similarly don't parse any `data-X` as `X` when `X`
is a valid HTML attribute.
Reported in comment on #5415.
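
A minimal sketch of this guard, assuming a small illustrative list of attribute names. `knownHtmlAttributes` and `normalizeKey` are hypothetical, and the list is deliberately incomplete.

```haskell
import Data.List (stripPrefix)

-- Tiny illustrative subset of HTML attribute names; a real list would
-- be much longer.
knownHtmlAttributes :: [String]
knownHtmlAttributes = ["class", "id", "href", "src", "title", "width", "height"]

-- | Strip "data-" from a key unless the remainder is empty or would
-- clash with a known HTML attribute name.
normalizeKey :: String -> String
normalizeKey key =
  case stripPrefix "data-" key of
    Just rest
      | not (null rest), rest `notElem` knownHtmlAttributes -> rest
    _ -> key
```

In this sketch `data-width` stays as it is, because `width` is in the list, while a key such as `data-external` loses its prefix.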
|
|
|
|
Anywhere "maybe" is used with "id" as its second argument, using
"fromMaybe" instead will simplify the code. Conversely, anywhere
"fromMaybe" is used with the result of "fmap" or "<$>" as its second
argument, using "maybe" instead will simplify the code.
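
Both rewrites rest on standard identities from `Data.Maybe`. The function names below are made up for illustration; each pair computes the same result.

```haskell
import Data.Maybe (fromMaybe)

-- maybe d id m  is equivalent to  fromMaybe d m
before1, after1 :: Maybe Int -> Int
before1 m = maybe 0 id m
after1  m = fromMaybe 0 m

-- fromMaybe d (f <$> m)  is equivalent to  maybe d f m
before2, after2 :: Maybe Int -> String
before2 m = fromMaybe "none" (show <$> m)
after2  m = maybe "none" show m
```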
|
|
|
|
PR #5884.
+ Use pandoc-types 1.20 and texmath 0.12.
+ Text is now used instead of String, with a few exceptions.
+ In the MediaBag module, some of the types using Strings
were switched to use FilePath instead (not Text).
+ In the Parsing module, new parsers `manyChar`, `many1Char`,
`manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`,
`manyUntilChar` have been added: these are like their
unsuffixed counterparts but pack some or all of their output.
+ `glob` in Text.Pandoc.Class still takes String since it seems
to be intended as an interface to Glob, which uses strings.
It seems to be used only once in the package, in the EPUB writer,
so that is not hard to change.
|
|
(#5882)
* Add HTML Reader support for `<dfn>`, parsing this as a Span with class `dfn`.
* Change `htmlSpanLikeElements` implementation to retain classes,
attributes and inline content.
|
|
|
|
Closes #5799
|
|
* HTML reader: Handle cite attribute for quotes. If a `<q>` tag has a `cite` attribute, we interpret it as a Quoted element with an inner Span. Closes #5798
* Refactor url canonicalization into a helper function
* Modify HTML writer to handle quote with cite.
[0]: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/q
|
|
The `<samp>` element is parsed as a Span with class `sample`.
Closes #5792.
|