path: root/src/Text
Age Commit message Author Files Lines
2021-08-16Fix bug in last commit due to removal of take1WhileP.John MacFarlane1-2/+2
2021-08-15Multimarkdown sub- and superscripts (#5512) (#7188)OCzarnecki2-15/+20
Added an extension `short_subsuperscripts` which modifies the behavior of `subscript` and `superscript`, allowing subscripts or superscripts containing only alphanumerics to end with a space character (e.g. `x^2 = 4` or `H~2 is combustible`). This improves support for multimarkdown. Closes #5512. Add `Ext_short_subsuperscripts` constructor to `Extension` [API change]. This is enabled by default for `markdown_mmd`.
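The rule described above can be illustrated with a small sketch. This is not pandoc's parser; the regular expression, the HTML output, and the function name `render_short_subsup` are assumptions made for illustration only.

```python
import re

# Hypothetical pattern: '^' or '~', then one or more ASCII alphanumerics,
# followed by a space (checked with a lookahead, so the space is kept).
SHORT_SUBSUP = re.compile(r"([\^~])([A-Za-z0-9]+)(?= )")


def render_short_subsup(text: str) -> str:
    """Replace short sub/superscripts with HTML tags (illustration only)."""
    def repl(match: re.Match) -> str:
        tag = "sup" if match.group(1) == "^" else "sub"
        return f"<{tag}>{match.group(2)}</{tag}>"

    return SHORT_SUBSUP.sub(repl, text)
```

With this sketch, the two examples from the commit message become `x<sup>2</sup> = 4` and `H<sub>2</sub> is combustible`, while input with no terminating space is left unchanged.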
2021-08-15Make docx writer sensitive to `native_numbering` extension.John MacFarlane3-12/+22
Figure and table numbers are now only included if `native_numbering` is enabled. (By default it is disabled.) This is a behavior change with respect to 2.14.1, but the behavior is that of previous versions. The change was necessary to avoid incompatibilities between pandoc's native numbering and third-party cross reference filters like pandoc-crossref. Closes #7499.
2021-08-13Convert Quoted in bib entries to special Spans...John MacFarlane1-1/+3
before passing them off to citeproc. This ensures that we get proper localization and flipflopping if, e.g., quotes are used in titles. Closes jgm/citeproc#87.
2021-08-13Citeproc: avoid odd handling of quotes.John MacFarlane1-1/+6
citeproc changes allow us to ignore Quoted elements; citeproc now uses its own method for representing quoted things, and only localizes and flipflops quotes it adds itself. See #87. The one thing left to do is to convert Quoted elements in bibliography databases (esp. titles) to `Span ("",["csl-quoted"],[])` before passing them to citeproc, IF the localized quotes for the quote type match the standard inverted commas.
2021-08-13Removed quote localization from citeproc processing.John MacFarlane1-20/+1
This is now done in citeproc itself.
2021-08-13Fix raw LaTeX injection issue (LaTeX writer).John MacFarlane1-5/+10
Using a code block containing `\end{verbatim}`, one could inject raw TeX into a LaTeX document even when `raw_tex` is disabled. Thanks to Augustin Laville for noticing the bug. Closes #7497.
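A minimal sketch of why such a code block is dangerous, assuming a writer that wraps code in a `verbatim` environment without inspecting it. The helper names `naive_verbatim` and `can_break_out` are hypothetical, and this is not the fix that was applied in the LaTeX writer.

```python
BEGIN = r"\begin{verbatim}"
END = r"\end{verbatim}"


def naive_verbatim(code: str) -> str:
    """Wrap code in a verbatim environment with no checks (the unsafe pattern)."""
    return f"{BEGIN}\n{code}\n{END}"


def can_break_out(code: str) -> bool:
    """True if the code contains the environment's own terminator."""
    return END in code


# A code block that closes the environment early, emits raw TeX,
# then reopens the environment so the writer's own terminator still matches.
payload = END + "\n" + r"\textbf{injected}" + "\n" + BEGIN
```

Any writer following the naive pattern needs to detect this case and fall back to a representation in which the terminator cannot appear literally.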
2021-08-13LaTeX reader: proper implicit grouping around environment macros.John MacFarlane1-1/+2
2021-08-12Use Prelude from base-compat for ghc 8.4 too.John MacFarlane1-5/+1
We were having trouble building on ghc 8.4 because of the lack of a Foldable instance for (Alt Maybe) in base < 4.12. Mystery: for some reason our builds were failing for gitit but not in the pandoc CI.
2021-08-11Try fixing compile error on older ghcs.John MacFarlane1-1/+5
See https://github.com/jgm/gitit/runs/3308381697
2021-08-11Fix some lint issues.John MacFarlane2-6/+5
2021-08-11LaTeX reader: Support `\global` before `\def`, `\let`, etc.John MacFarlane1-2/+10
See #7494.
2021-08-11Fix scope for LaTeX macros.John MacFarlane3-55/+100
They should by default scope over the group in which they are defined (except `\gdef` and `\xdef`, which are global). In addition, environments must be treated as groups. We handle this by making sMacros in the LaTeX parser state a STACK of macro tables. Opening a group adds a table to the stack, closing one removes one. Only the top of the stack is queried. This commit adds a parameter for scope to the Macro constructor (not exported). Closes #7494.
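The stack-of-tables idea can be sketched as follows. This is an illustration, not the code in the LaTeX reader: the class `MacroStack` and its method names are hypothetical, and copying the current top table when a group opens is an assumption made so that querying only the top of the stack still sees definitions from enclosing groups.

```python
class MacroStack:
    """A stack of macro tables; the last element is the innermost group."""

    def __init__(self) -> None:
        self.tables = [{}]  # the bottom table is the outermost (document) scope

    def open_group(self) -> None:
        # Assumption: the new table starts as a copy of the current top.
        self.tables.append(dict(self.tables[-1]))

    def close_group(self) -> None:
        # Never pop the outermost table.
        if len(self.tables) > 1:
            self.tables.pop()

    def define(self, name: str, body: str, is_global: bool = False) -> None:
        # Global definitions (as with \gdef, \xdef) go into every table;
        # local ones only into the top table.
        targets = self.tables if is_global else self.tables[-1:]
        for table in targets:
            table[name] = body

    def lookup(self, name: str):
        # Only the top of the stack is queried.
        return self.tables[-1].get(name)
```

A local redefinition inside a group disappears when the group closes, while a global definition made inside the group survives.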
2021-08-11LaTeX reader: improve handling of plain TeX macro primitives.John MacFarlane2-6/+29
- Fixed semantics for `\let`.
- Implement `\edef`, `\gdef`, and `\xdef`.
- Add comment noting that currently `\def` and `\edef` set global macros (so are equivalent to `\gdef` and `\xdef`). This should be fixed by scoping macro definitions to groups, in a future commit.
Closes #7474.
2021-08-10HTML reader: treat comments as blank when parsing.John MacFarlane1-5/+7
This modifies pBlank. Previously comments could sometimes flummox the parser. Closes #7482.
2021-08-10Fix RTF table parsing bug that created undesired nested tables.John MacFarlane1-1/+1
Closes #7488.
2021-08-10Add RTF reader.John MacFarlane2-0/+1336
- `rtf` is now supported as an input format as well as output.
- New module Text.Pandoc.Readers.RTF (exporting `readRTF`). [API change]
Closes #3982.
2021-08-08Allow `--slide-level=0`.John MacFarlane1-2/+2
When the slide level is set to 0, headings won't be used at all in splitting the document into slides. Horizontal rules must be used to separate slides. Closes #7476.
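As a rough sketch of the behavior at slide level 0, assuming a document is a flat list of blocks and using the string `"HorizontalRule"` as a stand-in for a rule block (both are simplifications, and `split_slides` is a hypothetical name):

```python
RULE = "HorizontalRule"  # stand-in for a horizontal rule block


def split_slides(blocks):
    """Split blocks into slides on horizontal rules only; headings do not split."""
    slides = [[]]
    for block in blocks:
        if block == RULE:
            slides.append([])  # a rule starts a new slide and is dropped
        else:
            slides[-1].append(block)
    return slides
```

Headings of any level stay inside whichever slide they fall in.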
2021-08-04RTF writer: emit \outlinelevel for section headings.John MacFarlane1-1/+2
2021-08-03Stop using the HTTP package. (#7456)mt_caret6-11/+29
We only depend on the urlEncode function in the package, which is also provided by http-types. The HTTP package also depends on the network package, which has difficulty building on ghcjs. Add internal module Text.Pandoc.Network.HTTP, exporting `urlEncode`.
2021-08-03LaTeX table writer: Increase column width precision (#7466)Peter Fabinski1-1/+1
In some cases, the rounding performed by the LaTeX table writer would introduce visible overrun outside the text area. This adds two more decimal places to the width values.
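The effect of rounding can be seen with a small calculation. The digit counts (2 and 4) and the six equal columns are assumptions chosen for illustration; the commit only states that two more decimal places were added.

```python
def total_width(fractions, digits):
    """Sum of column widths after rounding each to the given number of decimals."""
    return sum(round(f, digits) for f in fractions)


# Six equal columns, each one sixth of the text width.
columns = [1 / 6] * 6

overrun_coarse = total_width(columns, 2) - 1.0  # each column rounds up to 0.17
overrun_fine = total_width(columns, 4) - 1.0    # each column rounds up to 0.1667
```

With two decimals the table is about 2% wider than the text area; with four decimals the excess shrinks to about 0.02%.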
2021-08-01RTF writer: omit `\bin` in `\pict`.John MacFarlane1-1/+1
According to the spec, this is not needed or wanted when the data is in hexadecimal format, as it is here.
2021-07-29parseFromString: preserve at least the source directory.John MacFarlane1-1/+1
Previously we just set the source name to "chunk" when parsing from strings, to avoid misleading source positions. This had the side effect that `rebase_relative_paths` would break inside sections that were parsed as strings. So, now we use "ORIGINAL_SOURCE_PATH_chunk" instead of just "chunk". Closes #7464.
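A minimal sketch of why the new name helps, assuming that relative paths are resolved against the directory part of the source name. The helper names are hypothetical and POSIX-style paths are assumed.

```python
import posixpath


def chunk_source_name(original_source_path: str) -> str:
    """Source name for a string chunk, derived from the original path."""
    return original_source_path + "_chunk"


def source_directory(source_name: str) -> str:
    """Directory part of a source name, as used to rebase relative paths."""
    return posixpath.dirname(source_name)
```

For an original path `docs/ch1/intro.md`, the chunk name `docs/ch1/intro.md_chunk` still yields the directory `docs/ch1`, whereas the bare name `chunk` yields an empty directory.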
2021-07-22LaTeX writer: Use ulem for underline.John MacFarlane1-1/+3
ulem is conditionally included already when the `strikeout` variable is set, so we set this when there is underlined text, and use `\uline` instead of `\underline`. This fixes wrapping for underlined text. Closes #7351.
2021-07-22MIME: use image/x-xcf instead of application/x-xcf.John MacFarlane1-1/+1
Closes #7454.
2021-07-17LaTeX reader: avoid trailing hyphen in translating languages.John MacFarlane1-2/+2
Previously `\foreignlanguage{english}` turned into `<span lang="en-">`. The same issue affected Arabic. Closes #7447.
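One plausible way such a trailing hyphen arises is joining a language code and an empty region with an unconditional hyphen. The sketch below shows that pattern and a version that skips empty components; it is an assumption about the general shape of the bug, not the reader's actual code, and `render_lang` is a hypothetical name.

```python
def render_lang_naive(language: str, region: str = "") -> str:
    """Unconditional join: produces a trailing hyphen when region is empty."""
    return language + "-" + region


def render_lang(language: str, region: str = "") -> str:
    """Join only the non-empty components of the language tag."""
    return "-".join(part for part in (language, region) if part)
```

Here `render_lang_naive("en")` gives `en-`, while `render_lang("en")` gives `en` and `render_lang("en", "GB")` gives `en-GB`.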
2021-07-16DocBook reader: handle images with imageobjectco elements.John MacFarlane1-3/+3
Closes #7440.
2021-07-16LaTeX reader: Support `\cline` in LaTeX tables.John MacFarlane1-0/+1
Closes #7442.
2021-07-16PDF: Fix svgIn path error.John MacFarlane1-1/+1
We were duplicating the temp directory; this didn't show up on macOS or linux because there we use absolute paths for the temp directory. Closes #7431.
2021-07-11DocBook reader: add support for citerefentry (#7437)Jan Tojnar1-1/+5
Originally intended for referring to UNIX manual pages, either part of the same DocBook document as a refentry element, or external (hence the manvolnum element). These days, refentry is more general; for example, the element documentation pages linked below are each a refentry. As per the *Processing expectations* section of citerefentry, the element is supposed to be a hyperlink to a refentry (when in the same document), but pandoc does not support the refentry tag at the moment, so that is moot. https://tdg.docbook.org/tdg/5.1/citerefentry.html https://tdg.docbook.org/tdg/5.1/manvolnum.html https://tdg.docbook.org/tdg/5.1/refentry.html This roughly corresponds to a `manpage` role in rST syntax, which produces a `Code` AST node with attributes `.interpreted-text role=manpage`, but that does not fit the DocBook parser. https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#role-manpage
2021-07-11Improved parsing of raw LaTeX from Text streams (rawLaTeXParser).John MacFarlane2-11/+37
We now use source positions from the token stream to tell us how much of the text stream to consume. Getting this to work required a few other changes to make token source positions accurate. Closes #7434.
2021-07-09Always use / when adding directory to image path with extractMedia.John MacFarlane1-1/+1
Even on Windows. May help with #7431.
2021-07-09RST reader: fix regression with code includes.John MacFarlane1-1/+5
With the recent changes to include infrastructure, included code blocks were getting an extra newline. Closes #7436. Added regression test.
2021-07-07Don't incorporate externally linked images in EPUB documents (#7430)Michael Hoffmann1-1/+2
Just like it is possible to avoid incorporating an image in EPUB by passing `data-external="1"` to a raw HTML snippet, this makes the same possible for native Images, by looking for an associated `external` attribute.
2021-07-06Recognize data-external when reading HTML img tags (#7429)Michael Hoffmann1-8/+3
Preserve all attributes in img tags. If attributes have a `data-` prefix, it will be stripped. In particular, this preserves a `data-external` attribute as an `external` attribute in the pandoc AST.
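The attribute handling described above can be sketched as follows, with attributes modeled as a list of key/value pairs. The function name `strip_data_prefix` is hypothetical, and this is an illustration of the rule rather than the HTML reader's code.

```python
PREFIX = "data-"


def strip_data_prefix(attrs):
    """Keep every attribute, dropping a leading 'data-' from its key."""
    return [
        (key[len(PREFIX):] if key.startswith(PREFIX) else key, value)
        for key, value in attrs
    ]
```

An `img` tag carrying `data-external="1"` therefore ends up with an `external` attribute of `1`, and attributes without the prefix pass through unchanged.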
2021-07-06T.P.PDF, convertImage: normalize paths.John MacFarlane1-3/+3
This will avoid paths on Windows with mixed path separators, which may cause problems with SVG conversion. See #7431.
2021-07-06Markdown reader: don't try to read contents in self-closing HTML tag.John MacFarlane1-1/+4
Previously we had problems parsing raw HTML with self-closing tags like `<col/>`. The problem was that pandoc would look for a closing tag to close the markdown contents, but the closing tag had, in effect, already been parsed by `htmlTag`. This fixes the issue described in <https://groups.google.com/d/msgid/pandoc-discuss/297bc662-7841-4423-bcbb-534e99bbba09n%40googlegroups.com>.
2021-07-06HTML reader: add col, colgroup to 'closes' definitionsJohn MacFarlane1-1/+3
2021-07-05Add command test for #7394.John MacFarlane1-0/+1
And fix a small bug in handling of citations in notes, which led to commas at the end of sentences in some cases.
2021-07-05Citeproc: cleanup and efficiency improvement in deNote.John MacFarlane1-15/+21
2021-07-05Revamp note citation handling.John MacFarlane1-14/+30
Use latest citeproc, which uses a Span with a class rather than a Note for notes. This helps us distinguish between user notes and citation notes. Don't put citations at the beginning of a note in parentheses. (Closes #7394.)
2021-07-02HTML5 writer, remove aria-hidden when explicit alt text is provided.Aner Lucero1-4/+7
2021-06-29Docx writer: Add table numbering for captioned tables.John MacFarlane2-3/+30
The numbers are added using fields, so that Word can create a list of tables that will update automatically.
2021-06-29Docx writer: Fixed a couple bugs in Figure numbering.John MacFarlane1-4/+3
2021-06-29Docx writer: support figure numbers.John MacFarlane2-3/+21
These are set up in such a way that they will work with Word's automatic table of figures. Closes #7392.
2021-06-29Remove duplicated alt text in HTML output.Aner Lucero1-2/+3
2021-06-28Improve punctuation moving with `--citeproc`.John MacFarlane1-14/+15
Previously, using `--citeproc` could cause punctuation to move in quotes even when there are no citations. This has been changed; now, punctuation moving is limited to citations. In addition, we only move footnotes around punctuation if the style is a note style, even if `notes-after-punctuation` is `true`.
2021-06-28Allow `$` characters in bibtex keys.John MacFarlane1-1/+1
Closes #7409.
2021-06-28Text.Pandoc.Error: fix line calculations in reporting parsec errors.John MacFarlane1-3/+3
Also remove a spurious initial newline in the error report.
2021-06-28Set proper initial source name in parsing BibTeX.John MacFarlane1-1/+3
(For better error messages.)