Age | Commit message (Collapse) | Author | Files | Lines |
|
Org reader: table parsing code refactoring and fixes
|
|
The org-reader was dropping the space after unescaped LaTeX-style symbol
commands: `\ForAll \Auml` resulted in `∀Ä` but should give `∀ Ä`
instead. This seems to be because the LaTeX-reader treats the
command-terminating space as part of the command. Dropping the trailing
space from the symbol-command fixes this issue.
|
|
This fixes Org mode parsing of some corner cases regarding empty cells
and rows. Empty cells weren't parsed correctly, e.g. `|||` should be
two empty cells, but would be parsed as a single cell containing a pipe
character. Empty rows were parsed as alignment rows and dropped from
the output.
This fixes #2616.
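The intended cell splitting can be illustrated with a minimal Python sketch. This is not the reader's Haskell code: `split_row` is a hypothetical helper, and it ignores alignment rows and any escaping inside cells.

```python
def split_row(line):
    """Split one Org table row into its cells (illustrative only)."""
    stripped = line.strip()
    if not stripped.startswith("|"):
        raise ValueError("not a table row")
    # Drop the leading pipe, and the trailing pipe if there is one.
    body = stripped[1:]
    if body.endswith("|"):
        body = body[:-1]
    # Every remaining pipe separates two cells, so `||` yields two empty cells.
    return [cell.strip() for cell in body.split("|")]


print(split_row("|||"))        # two empty cells
print(split_row("| a | b |"))  # two non-empty cells
```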
|
|
This refactors the code that converts a list of table lines to an org table
ADT. The old code was simplified and is now slightly less ugly.
|
|
Emacs Org-mode doesn't add any padding to table rows. The first
row (header or first body row) is used to determine the column count, no
other magic is performed.
The org reader was padding rows to the length of the longest table row.
This was done due to a misunderstanding of how Org handles tables. This
feature reflected how Org-mode handles tables when pressing <TAB>. The
Org exporter however, which is what the reader should implement, doesn't
do any of this. So this was a misfeature that made the reader more
complex and reduced compatibility. It was hence removed.
|
|
Docbook5 write support
|
|
According to http://docutils.sourceforge.net/docs/ref/rst/directives.html#code,
the code directive supports the ":class:" option.
|
|
Commit 91dc3342 made `readDocx` throw PandocError if there was an
unarchiving error. This extends that fix to `readOdt` and `readEPUB`.
|
|
Closes #2892.
|
|
Previously, readDocx would error out if zip-archive failed. We change
the archive extraction step from `toArchive` to `toArchiveOrFail`, which
returns an Either value.
|
|
|
|
|
|
Previously if a document only had math in a footnote,
the MathJax link would not be added.
Closes #2881.
|
|
|
|
This reverts commit 4c684561ee0665b014e887ae559b7020e4e9f2d3.
See
https://groups.google.com/d/msg/pandoc-discuss/u6J-_aCProU/UufN3IYRAgAJ
This should fix uneven spacing issues in multiline tables.
|
|
LaTeX Writer: fix polyglossia to babel env mapping
|
|
LaTeX writer: Add missing languages.
|
|
Ignore leading space in org code blocks
|
|
Closes #2843.
|
|
Fixes #2862
Also fix up tab handling for leading whitespace in code blocks.
|
|
`moveTo` and `moveFrom` are track-changes tags that are used when a
block of text is moved in the document. We now recognize these tags and
treat them the same as `insert` and `delete`, respectively. So,
`--track-changes=accept` will show the moved version, while
`--track-changes=reject` will show the original version.
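The mapping described above can be sketched in Python under a simplified, hypothetical model in which a paragraph is a list of `(kind, text)` pairs. The names `NORMALIZE` and `resolve` are assumptions for illustration and do not come from the docx reader.

```python
# Hypothetical model: kind is "insert", "delete", "moveTo", "moveFrom",
# or None for text that carries no tracked change.
NORMALIZE = {"moveTo": "insert", "moveFrom": "delete"}


def resolve(changes, mode):
    """Return the text kept when changes are accepted or rejected."""
    kept = []
    for kind, text in changes:
        # Treat move tags exactly like their insert/delete counterparts.
        kind = NORMALIZE.get(kind, kind)
        if kind is None:
            kept.append(text)
        elif kind == "insert" and mode == "accept":
            kept.append(text)
        elif kind == "delete" and mode == "reject":
            kept.append(text)
    return "".join(kept)


# "B " was moved from before "A " to after it.
moved = [("moveFrom", "B "), (None, "A "), ("moveTo", "B ")]
print(resolve(moved, "accept"))  # moved version
print(resolve(moved, "reject"))  # original version
```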
|
|
Closes #2799.
Also added -s to markdown-reader-more test.
|
|
We are now more forgiving about parsing invalid HTML with
unescaped `&` as raw HTML. (Previously any unescaped `&`
would cause pandoc not to recognize the string as raw HTML.)
Closes #2410.
|
|
Updates the list from the hyphenation files at <http://mirror.ctan.org/language/hyph-utf8/tex/generic/hyph-utf8/loadhyph/>.
|
|
This allows one to access the hyphenation patterns at <http://mirrors.ctan.org/language/hyph-utf8/tex/generic/hyph-utf8/patterns/tex/hyph-la-x-classic.tex>, using its private language tag.
|
|
This allows templates to treat it differently.
|
|
|
|
Closes #2813.
|
|
Partially addresses #2813.
This isn't perfect, because now the hypertarget is in the
wrong place -- when you link to the figure, the screen
is positioned with the caption at the top, and most of
the figure off screen.
So this needs a bit more tweaking.
|
|
|
|
This was a regression, with the rewrite of `htmlInBalanced`
(from `Text.Pandoc.Readers.HTML`) in 1.17.
It caused newlines to be omitted in raw HTML blocks.
Closes #2804.
|
|
allow for optional argument in square brackets, closes #2728
|
|
LaTeX writer: figure label
|
|
Add a `\strut` after `\crlf` before space.
Closes #2744, #2745. Thanks to @c-foster.
This uses the fix suggested by @c-foster.
Mid-line spaces are still not supported, because of limitations
of the Markdown parser.
|
|
Closes #2742.
|
|
Some word functions -- especially graphics -- give various choices for
content so there can be backwards compatibility. This follows the
largely undocumented feature by working through the choices until we
find one that works.
Note that we had to split out the processing of child elems of runs into
a separate function so we can recurse properly. Any processing of an
element *within* a run (other than a plain run) should go into
`childElemToRun`.
|
|
Word uses list numbering styles to number its headings. We only call
something a numbered list if it does not also have a heading style.
|
|
Traditionally pandoc operates on multiple files by first concatenating
them (with extra line breaks between them) and then processing the joined file. So
it only parses a multi-file document at the document scope. This has the
benefit that footnotes and links can be in different files, but it also
introduces a couple of difficulties:
- it is difficult to join files with footnotes without some sort of
preprocessing, which makes it difficult to write academic documents
in small pieces.
- it makes it impossible to process multiple binary input files, which
can't be catted.
- it makes it impossible to process files from different input
formats.
This commit introduces an alternative method. Instead of catting the files
first, it parses the files first, and then combines the parsed
output. This makes it impossible to have links across multiple files,
and auto-identified headers won't work correctly if headers in multiple
files have the same name. On the other hand, footnotes across multiple
files will work correctly, and there is more freedom in the choice of input formats.
Since ByteStringReaders can currently only read one binary file, and
will ignore subsequent files, we also change the behavior to
automatically parse before combining if using the ByteStringReader. If
we use one file, it will work as normal. If there is more than one file
it will combine them after parsing (assuming that the format is the
same).
Note that this is intended to be an optional method, defaulting to
off. Turn it on with `--file-scope`.
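The difference between the two strategies can be shown with a toy Python model. It is only a sketch: `parse` is a hypothetical stand-in for a reader that inlines footnote text, and the rule that a later definition overrides an earlier one with the same label is an assumption of this model, not a statement about pandoc.

```python
import re


def parse(text):
    """Toy parser: replace each [^label] reference with its note text."""
    defs = dict(re.findall(r"^\[\^(\w+)\]: (.*)$", text, flags=re.M))
    # Remove the definition lines, then resolve the remaining references.
    body = re.sub(r"^\[\^\w+\]: .*$\n?", "", text, flags=re.M)
    resolved = re.sub(r"\[\^(\w+)\]", lambda m: "(" + defs[m.group(1)] + ")", body)
    return resolved.strip()


file1 = "One[^1].\n\n[^1]: first note\n"
file2 = "Two[^1].\n\n[^1]: second note\n"

# Concatenate first: both files share one scope, so the labels collide.
joined = parse(file1 + "\n" + file2)

# Parse first, then combine: each file keeps its own footnotes.
scoped = "\n\n".join(parse(f) for f in (file1, file2))

print(joined)
print(scoped)
```

In this model the concatenated input attaches the second note to both references, while parsing each file separately keeps the notes apart.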
|
|
Have docx reader use it.
|
|
The regular readDocx just becomes a special case.
|
|
In order to be able to collect warnings during parsing, we add a state
monad transformer to the D monad. At the moment, this only includes a
list of warning strings (nothing currently triggers them, however). We
use StateT instead of WriterT to correspond more closely with the
warnings behavior in T.P.Parsing.
|
|
+ If the base path does not end with slash, the last component
will be replaced. E.g. base = `http://example.com/foo`
combines with `bar.html` to give `http://example.com/bar.html`.
+ If the href begins with a slash, the whole path of the base
is replaced. E.g. base = `http://example.com/foo/` combines
with `/bar.html` to give `http://example.com/bar.html`.
Closes #2777.
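Both rules agree with standard relative-reference resolution (RFC 3986). As an independent check, and not as the code this commit changed, Python's `urllib.parse.urljoin` gives the same results for the two examples:

```python
from urllib.parse import urljoin

# Base path without a trailing slash: the last component is replaced.
first = urljoin("http://example.com/foo", "bar.html")

# Href beginning with a slash: the whole path of the base is replaced.
second = urljoin("http://example.com/foo/", "/bar.html")

print(first)
print(second)
```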
|
|
closes #2754
|
|
Fixes #2765.
Added test case.
|
|
|
|
We already allowed them in the header, but not in the body
rows, for some reason. This gives compatibility with org-mode
tables.
|
|
Previously an emph element could be parsed across the newline
at the end of the pipe table row.
I thought this would help with #2765, but it doesn't.
|
|
|
|
The feature checklist in the source code was out of date. Update.
|
|
e.g. `$$\hbox{$i$}$$`.
Partially addresses #2743.
|