Age | Commit message (Collapse) | Author | Files | Lines |
|
Previously, when multiple file arguments were provided, pandoc
simply concatenated them and passed the contents to the readers,
which took a Text argument.
As a result, the readers had no way of knowing which file
was the source of any particular bit of text. This meant that
we couldn't report accurate source positions on errors or
include accurate source positions as attributes in the AST.
More seriously, it meant that we couldn't resolve resource
paths relative to the files containing them
(see e.g. #5501, #6632, #6384, #3752).
Add Text.Pandoc.Sources (exported module), with a `Sources` type
and a `ToSources` class. A `Sources` wraps a list of `(SourcePos,
Text)` pairs. [API change] A parsec `Stream` instance is provided for
`Sources`. The module also exports versions of parsec's `satisfy` and
other Char parsers that track source positions accurately from a
`Sources` stream (or any instance of the new `UpdateSourcePos` class).
Text.Pandoc.Parsing now exports these modified Char parsers instead of
the ones parsec provides. Modified parsers to use a `Sources` as stream
[API change].
The readers that previously took a `Text` argument have been
modified to take any instance of `ToSources`. So, they may still
be used with a `Text`, but they can also be used with a `Sources`
object.
In Text.Pandoc.Error, modified the constructor PandocParsecError
to take a `Sources` rather than a `Text` as first argument,
so parse error locations can be accurately reported.
T.P.Error: showPos, do not print "-" as source name.
|
|
The org-ref syntax allows to list multiple citations separated by comma.
This fixes a bug that accepted commas as part of the citation id, so all
citation lists were parsed as one single citation.
Fixes: #7101
|
|
|
|
* Replace org-mode’s verbatim from code to codeWith.
This adds the `"verbatim"` class so that exporters can apply a specific
style on it. For instance, it will be possible for HTML to add a CSS
rule for code + verbatim class.
* Alter test for org-mode’s verbatim change.
See previous commit for further detail on the new implementation.
|
|
Links with (internal) targets that the reader doesn't know about are
converted into emphasized text. Information on the link target is now
preserved by wrapping the text in a Span of class `spurious-link`, with
an attribute `target` set to the link's original target. This allows to
recover and fix broken or unknown links with filters.
See: #6916
|
|
Footnotes can be removed from the final document with the `#+OPTION:
f:nil` export setting.
|
|
MathML-like entities, e.g., `\alpha`, can be disabled with the
`#+OPTION: e:nil` export setting.
|
|
The `tex` export option can be set with `#+OPTION: tex:nil` and allows
three settings:
- `t` causes LaTeX fragments to be parsed as TeX or added as raw TeX,
- `nil` removes all LaTeX fragments from the document, and
- `verbatim` treats LaTeX as text.
The default is `t`.
Closes: #4070
|
|
Deprecate `underlineSpan` in Shared in favor of `Text.Pandoc.Builder.underline`.
|
|
This should speed-up recompilation after changes in `Text.Pandoc.Class`,
as the number of modules affected by a change will be smaller in
general. It also offers faster insights into the parts of `T.P.Class`
used within a module.
|
|
* Use implicit Prelude
The previous behavior was introduced as a fix for #4464. It seems that
this change alone did not fix the issue, and `stack ghci` and `cabal
repl` only work with GHC 8.4.1 or newer, as no custom Prelude is loaded
for these versions. Given this, it seems cleaner to revert to the
implicit Prelude.
* PandocMonad: remove outdated check for base version
Only base versions 4.9 and later are supported, the check for
`MIN_VERSION_base(4,8,0)` is therefore unnecessary.
* Always use custom prelude
Previously, the custom prelude was used only with older GHC versions, as
a workaround for problems with ghci. The ghci problems are resolved by
replacing package `base` with `base-noprelude`, allowing for consistent
use of the custom prelude across all GHC versions.
|
|
* Update copyright year
* Copyright: add notes for Lua and Jira modules
|
|
Speeds up parsing of single-word, markup-less sub- and superscripts.
Fixes: #6127
|
|
|
|
PR #5884.
+ Use pandoc-types 1.20 and texmath 0.12.
+ Text is now used instead of String, with a few exceptions.
+ In the MediaBag module, some of the types using Strings
were switched to use FilePath instead (not Text).
+ In the Parsing module, new parsers `manyChar`, `many1Char`,
`manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`,
`mantyUntilChar` have been added: these are like their
unsuffixed counterparts but pack some or all of their output.
+ `glob` in Text.Pandoc.Class still takes String since it seems
to be intended as an interface to Glob, which uses strings.
It seems to be used only once in the package, in the EPUB writer,
so that is not hard to change.
|
|
Symbols like `\alpha` are output plain and unemphasized, not as math.
Fixes: #5483
|
|
The haddock module header contains essentially the
same information, so the boilerplate is redundant and
just one more thing to get out of sync.
|
|
Quite a few modules were missing copyright notices.
This commit adds copyright notices everywhere via haddock module
headers. The old license boilerplate comment is redundant with this and has
been removed.
Update copyright years to 2019.
Closes #4592.
|
|
Fixes a regression introduced by the previous commit.
|
|
Links with descriptions which are pointing to images are no longer read
as inline images, but as proper links.
Fixes: #5191
|
|
|
|
|
|
`enclosedByPair` alone does not the handle the empty array properly since it uses `many1Till`.
|
|
`exportsCode` is moved from `Blocks.hs` to `Shared.hs` and exported accordingly.
|
|
|
|
|
|
This seems to be necessary if we are to use our custom Prelude
with ghci.
Closes #4464.
|
|
|
|
And a few tweaks related to the Semigroups/Monoid change.
Closes #4448.
|
|
The characters allowed before and after emphasis can be configured via
`#+pandoc-emphasis-pre` and `#+pandoc-emphasis-post`, respectively. This
allows to change which strings are recognized as emphasized text on a
per-document or even per-paragraph basis. The allowed characters must be
given as (Haskell) string.
#+pandoc-emphasis-pre: "-\t ('\"{"
#+pandoc-emphasis-post: "-\t\n .,:!?;'\")}["
If the argument cannot be read as a string, the default value is
restored.
Closes: #4378
|
|
|
|
* Added underlineSpan builder function. This can be easily updated if needed. The purpose is for Readers to transform underlines consistently.
* Docx Reader: Use underlineSpan and update test
* Org Reader: Use underlineSpan and add test
* Textile Reader: Use underlineSpan and add test case
* Txt2Tags Reader: Use underlineSpan and update test
* HTML Reader: Use underlineSpan and add test case
|
|
The `\n` export option turns all newlines in the text into hard
linebreaks.
Closes #3950
|
|
The org reader was updated to match current org-mode behavior: the set
of characters which are acceptable to occur as the first or last
character in an org emphasis have been changed and now allows all
non-whitespace chars at the inner border of emphasized text (see
`org-emphasis-regexp-components`).
Fixes: #3933
|
|
This rewrite is primarily motivated by the need to
get macros working properly. A side benefit is that the
reader is significantly faster (27s -> 19s in one
benchmark, and there is a lot of room for further
optimization).
We now tokenize the input text, then parse the token stream.
Macros modify the token stream, so they should now be effective
in any context, including math. Thus, we no longer need the clunky
macro processing capacities of texmath.
A custom state LaTeXState is used instead of ParserState.
This, plus the tokenization, will require some rewriting
of the exported functions rawLaTeXInline, inlineCommand,
rawLaTeXBlock.
* Added Text.Pandoc.Readers.LaTeX.Types (new exported module).
Exports Macro, Tok, TokType, Line, Column. [API change]
* Text.Pandoc.Parsing: adjusted type of `insertIncludedFile`
so it can be used with token parser.
* Removed old texmath macro stuff from Parsing.
Use Macro from Text.Pandoc.Readers.LaTeX.Types instead.
* Removed texmath macro material from Markdown reader.
* Changed types for Text.Pandoc.Readers.LaTeX's
rawLaTeXInline and rawLaTeXBlock. (Both now return a String,
and they are polymorphic in state.)
* Added orgMacros field to OrgState. [API change]
* Removed readerApplyMacros from ReaderOptions.
Now we just check the `latex_macros` reader extension.
* Allow `\newcommand\foo{blah}` without braces.
Fixes #1390.
Fixes #2118.
Fixes #3236.
Fixes #3779.
Fixes #934.
Fixes #982.
|
|
|
|
|
|
Copy-pasting had lead to haddock module descriptions containing the
wrong module names.
|
|
Until now, org-ref cite keys included special characters also at the
end. This caused problems when citations occur right before colons or
at the end of a sentence.
With this change, all non alphanumeric characters at the end of a cite
key are ignored.
This also adds `,` to the list of special characters that are legal
in cite keys to better mirror the behaviour of org-export.
|
|
By not checking for the end condition before the first parse, the
parser was applied too often, consuming too much of the input.
This fixes the behaviour of
`testStringWith (many1Till (oneOf "ab") (string "aa")) "aaa"`
which before incorrectly returned `Right "a"`. With this change, it
instead correctly fails with `Left (PandocParsecError ...)` because it
is not able to parse at least one occurence of `oneOf "ab"` that is
not `"aa"`.
Note that this only affects `many1Till p end` where `p` matches on a
prefix of `end`.
|
|
Parsing of smart quotes and special characters can either be enabled via
the `smart` language extension or the `'` and `-` export options. Smart
parsing is active if either the extension or export option is enabled.
Only smart parsing of special characters (like ellipses and en and em
dashes) is enabled by default, while smart quotes are disabled.
This means that all smart parsing features will be enabled by adding the
`smart` language extension. Fine-grained control is possible by leaving
the language extension disabled. In that case, smart parsing is
controlled via the aforementioned export OPTIONS only.
Previously, all smart parsing was disabled unless the language extension
was enabled.
|
|
This follows the suggestions given by the FSF for GPL licensed software.
<https://www.gnu.org/prep/maintain/html_node/Copyright-Notices.html>
|
|
Closes: #3401
|
|
|
|
Source block parameter names are no longer prefixed with *rundoc*. This
was intended to simplify working with the rundoc project, a babel
runner. However, the rundoc project is unmaintained, and adding those
markers is not the reader's job anyway.
The original language that is specified for a source element is now
retained as the `data-org-language` attribute and only added if it
differs from the translated language.
|
|
Closes: #3577
|
|
|
|
These were confusing.
Now we rely on the +raw_tex or +raw_html extension with latex
or html input.
Thus, instead of
--parse-raw -f latex
we use
-f latex+raw_tex
and instead of
--parse-raw -f html
we use
-f html+raw_html
|
|
Now you will need to do
-f markdown+smart
instead of
-f markdown --smart
This change opens the way for writers, in addition to readers,
to be sensitive to +smart, but this change hasn't yet been made.
API change. Command-line option change.
Updated manual.
|
|
|